Normalizing weight for region/province [message #2178] |
Sun, 18 May 2014 02:57 |
jcon
Messages: 3 Registered: May 2014 Location: United States
|
Member |
DHS normalizes weights so that the national unweighted n = weighted n.
The province/region sampling error tables show both unweighted and weighted n, with the weighted n normalized at the national level. This means that in some oversampled provinces the weighted n is very small. Are province/region level confidence intervals calculated with the weighted or unweighted n?
I am doing an endline evaluation for a project that covered three provinces in Lao PDR. The baseline is the 2011/12 MICS/DHS (combined). Currently, I'm trying to estimate power/sample size for comparing baseline to a future endline. I need to keep the sample weighted so that it is representative at the provincial level, but with the DHS national normalized weights I have an unweighted n of 2200 and a weighted n of 1300 (all of the provinces were oversampled). Can I re-normalize weights so that unweighted n = weighted n in these three provinces? There is no mention of this in the DHS manuals, which only state that provincial level estimates must use weights.
DHS does not normalize at the provincial level in any of its tables; the provincial tables always show the nationally normalized n. As the sample is representative at the province level, it seems it would make more sense to normalize weights at the provincial level when looking exclusively at specific provinces (something DHS reports are not designed to do).
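To make the re-normalization question concrete, here is a minimal Stata sketch on made-up data. The variable names (province, weight, outcome) and all values are hypothetical, not taken from the Lao survey. It rescales the weights in the retained provinces so that they sum to the number of observations, and checks that the weighted mean does not change. Dividing every weight by the same constant leaves weighted means and proportions (and their standard errors under pweights) unchanged; only totals and the reported weighted n are affected.

```stata
* Toy data: three hypothetical provinces, all values invented for illustration
clear
input province weight outcome
1 0.4 1
1 0.6 0
2 0.5 1
2 0.7 1
3 0.3 0
3 0.5 1
end

* Weighted mean using the original (nationally normalized) weights
quietly mean outcome [pweight = weight]
scalar m_before = _b[outcome]

* Rescale so the weights sum to the number of observations in this subset
quietly summarize weight
gen double weight_renorm = weight / r(mean)

* Check 1: weighted n now equals unweighted n
quietly summarize weight_renorm
assert reldif(r(sum), r(N)) < 1e-6

* Check 2: the weighted mean is unchanged by the rescaling
quietly mean outcome [pweight = weight_renorm]
assert reldif(_b[outcome], m_before) < 1e-6
```

Note that this applies one common scaling factor to all three provinces, so their relative sizes stay as the original weights imply.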
Any suggestions would be greatly appreciated.
Re: Normalizing weight for region/province [message #2179 is a reply to message #2178] |
Mon, 19 May 2014 00:23 |
Reduced-For(u)m
Messages: 292 Registered: March 2013
|
Senior Member |
I hesitate to offer an option, but here is one way to think about it. I'm supposing that those surveys are representative at the regional level - some aren't, and I don't know about this particular one.
Assuming they are regionally representative, you could normalize first within each region so that the region sums to 1. Then you could use outside data on regional populations to overlay population weighting on the within-region probability weights, doing that separately for each survey round.
The thinking is that even if the weights for some region sum to a number you aren't interested in, the relative sizes of the weights within a region contain all of the within-region probability-of-selection information. So by forcing those weights to sum to one, you preserve the probability weighting and avoid the problem of the weights summing to the wrong number. Then you multiply all those weights by the population of the region, so that the weights in each region sum to that region's population. It would look something like this in Stata:
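* region total sum of weights
egen region_tot_weight = total(weight), by(region)
* re-normalize within region so the weights in each region sum to 1
gen region_norm_weight = weight / region_tot_weight
* overlay population weight (region_population must already be in the data,
* e.g. merged in from an outside source, one value per region)
gen final_weight = region_norm_weight * region_population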
I've run this by some people who should know, and they generally seem to think it makes sense, but I wouldn't say that this is guaranteed right. I've never seen it used in a paper. Then again, most papers don't really address the weighting problem or at least don't provide any information on how they actually adjusted the weights. I think the widespread use of multiple-round DHS analysis is pretty recent and demand is a bit ahead of the technical expertise in this area.
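As a sanity check on the steps above, the following sketch runs them on a few made-up observations and confirms that the final weights in each region sum to that region's population. The weights and the population figures (1000 and 3000) are invented for illustration only; in practice region_population would come from an outside source such as a census.

```stata
* Toy data: two hypothetical regions with invented sampling weights
clear
input region weight
1 0.5
1 1.5
2 2.0
2 2.0
2 4.0
end

* Hypothetical regional populations (assumed, not real figures)
gen double region_population = cond(region == 1, 1000, 3000)

* Same three steps as in the post
egen double region_tot_weight = total(weight), by(region)
gen double region_norm_weight = weight / region_tot_weight
gen double final_weight = region_norm_weight * region_population

* Check: within each region, final weights sum to the region's population
egen double check_sum = total(final_weight), by(region)
assert reldif(check_sum, region_population) < 1e-6

list region weight region_norm_weight final_weight, sepby(region)
```

Within each region the ratios between any two observations' weights are the same before and after, which is the sense in which the within-region selection information is preserved.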
Re: Normalizing weight for region/province [message #2415 is a reply to message #2413] |
Sun, 15 June 2014 15:03 |
Reduced-For(u)m
Messages: 292 Registered: March 2013
|
Senior Member |
RE: Comparing different provinces in same survey:
Wouldn't another way to do that be to use the full sample with the regular weights, and then use separate dummy variables for each province and do an F-test that all coefficients are the same? That might be a way to get around the re-weighting/re-normalizing problem, at least for some analyses.
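A rough sketch of that dummy-variable approach in Stata might look like the following. All variable names here (psu, stratum, weight, province, outcome) are placeholders, so substitute the cluster, stratum, weight, and outcome variables from the actual survey file. With one province as the omitted reference category, a joint test that all remaining province coefficients are zero is equivalent to testing that every province has the same mean.

```stata
* Assumed variable names: psu (cluster), stratum, weight, province, outcome
* Declare the survey design once, using the original (un-renormalized) weights
svyset psu [pweight = weight], strata(stratum)

* Regress the outcome on province indicators (first province is the reference)
svy: regress outcome i.province

* Joint adjusted Wald test that all province coefficients equal zero,
* i.e. that the province means do not differ
testparm i.province
```

Because the regression uses the full sample with the regular weights, no re-normalization is needed for this comparison.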