The DHS Program User Forum
Discussions regarding The DHS Program data and results
Stratification [message #1276] Mon, 03 February 2014 20:09
hlyons
Hello -- I was wondering whether there are known errors in the "strata" variable in some of the surveys. My example is a random one, Zambia DHS IV:

I'm using the children's recode and R. The example at the bottom is somewhat artificial, in the sense that the outcome (urban/rural split for children) is not something I'm really looking at. Basically, it looks like v022 has too many strata: 153 distinct values, where I expected 18 (9 provinces by urban/rural). A less theoretical example would be national DPT3 coverage for urban and rural areas: I think the standard errors for urban in the final report are more compatible with using v022. It's hard to say for sure, because I get some differences when trying to duplicate the results in the final report. Maybe I'll write another post about that later...

Thoughts, anybody?

Thanks!

Hil

# R example
library(foreign)  # read.dta() comes from foreign, not survey
library(survey)
tmp.data = read.dta("ZMKR42FL.DTA")

# PSUs, checks out: 320
length(unique(tmp.data$v021))

# province, checks out: 9
length(unique(tmp.data$v024))

# province and urban/rural combinations: 18
nrow(unique(tmp.data[,c("v023","v025")]))

# strata: 153 instead of 18
length(unique(tmp.data$v022))

# sampling weight (v005 is stored multiplied by 1,000,000)
tmp.data$wt = tmp.data$v005/1000000
# single factor for the province and u/r combination
tmp.data$prov.ur = interaction(tmp.data$v023, tmp.data$v025, drop = TRUE)

# example of standard errors under two designs
# first, stratify on v022; second, on province and u/r combination
DHSdesign.v022 = svydesign(id = ~v021, strata = ~v022, weights = ~wt, data = tmp.data)
DHSdesign.prov.ur = svydesign(id = ~v021, strata = ~prov.ur, weights = ~wt, data = tmp.data)

# proportion urban amongst children
svymean(~v025, design = DHSdesign.v022) #SE = 0.0099
svymean(~v025, design = DHSdesign.prov.ur) #SE = 0.0227
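
One further check that might help narrow this down: with 320 PSUs and 153 values of v022, there are only about two PSUs per v022 value, so it may be worth tabulating PSUs per stratum and checking whether each v022 value sits inside a single province and urban/rural cell. Below is a minimal sketch of that check in base R. It runs on a small made-up data frame (the `toy` values are hypothetical, not taken from ZMKR42FL), and the idea that v022 might group PSUs in pairs is only a guess on my part, not something I have confirmed.

```r
# Toy stand-in for the recode file: one row per child.
# All values are invented for illustration only.
toy <- data.frame(
  v021 = rep(1:8, each = 2),                 # 8 PSUs, 2 children each
  v022 = rep(1:4, each = 4),                 # 4 coded strata, 2 PSUs each
  v023 = rep("A", 16),                       # a single province
  v025 = rep(c("urban", "rural"), each = 8)  # PSUs 1-4 urban, 5-8 rural
)

# Collapse to one row per PSU so children are not counted as clusters
psu <- unique(toy[, c("v021", "v022", "v023", "v025")])

# Province and urban/rural cell for each PSU
cell <- interaction(psu$v023, psu$v025, drop = TRUE)

# Number of PSUs in each coded stratum and in each cell
psu.per.v022 <- table(psu$v022)
psu.per.cell <- table(cell)

# Number of distinct cells each v022 value falls in (1 means nested)
cells.per.v022 <- tapply(cell, psu$v022, function(x) length(unique(x)))

print(psu.per.v022)
print(psu.per.cell)
print(all(cells.per.v022 == 1))
```

To run it on the real file, replace `toy` with the data frame read from the .DTA file. If nearly every v022 value holds about two PSUs and is nested in one province and u/r cell, that would be consistent with v022 being a finer grouping inside the 18 design strata rather than a coding error, but that interpretation would still need checking against the survey documentation.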