weighting data for a subset of main data file [message #9360]
Sat, 19 March 2016 09:58
HumAta
Messages: 2 Registered: March 2016
Member
Dear experts at DHS,
I am using the PKIR61FLV.SAV file from Pakistan's 2012-13 DHS data. My aim is to determine the quality of antenatal care (ANC) received by mothers whose last pregnancy was in the five years preceding the survey and who had at least one ANC consultation during that pregnancy. I need some guidance on applying sample weights to a subset of DHS data like mine.
I have learned from the DHS final reports that DHS uses "normalized standards weights for both households (HH) and individuals so that the number of weighted cases coincides with that of unweighted cases at the national level". The DHS Sampling Manual also states that all the weights in DHS recode files are therefore relative weights. They can be used to calculate unbiased estimates of means, proportions, rates, ratios, etc. at the national level, because the normalization factor cancels out when it appears in both the numerator and the denominator. The manual therefore recommends that normalization be done at the national level and not at the regional level, because normalizing at the regional level introduces bias in the calculated values (DHS Sampling Manual, page 26).
Page 33 of the DHS Recode Manual says of "V005", the sample weight variable in the women's individual recode file, that "....it is normalized such that the weighted number of cases is identical to the unweighted number of cases when using the full dataset with no selection". This is because the sum of the standardized or normalized weights equals the number of cases over the entire sample (Guide to DHS Statistics, page 14).
I am not using the full women's recode dataset; rather, I am using the subset of women who had at least one ANC consultation in their last pregnancy (5522/13553 women). Am I right in understanding that I must not use sample weights in my calculations?
Looking forward to your response.
Best regards
Humera
Re: weighting data for a subset of main data file [message #9439 is a reply to message #9360]
Mon, 28 March 2016 10:28
Bridgette-DHS
Messages: 3216 Registered: February 2013
Senior Member
Following is a response from Senior DHS Stata Specialist, Tom Pullum:
Some of the recommendations about weighting in the Guide to DHS Statistics are misleading or out of date and will be modified in the next version. You should use the sample weights no matter what subset of the data you are using. Otherwise your estimates will be biased: they will over-represent the subpopulations that were over-sampled and under-represent the subpopulations that were under-sampled.
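To make the over-sampling point concrete, here is a small Python sketch on invented data (not the Pakistan file). The regions, weights and outcomes are made up for illustration; only the name v005 and its factor of 1,000,000 come from the discussion above. Region A is over-sampled (many cases, small weight) and region B is under-sampled, and the analysis is restricted to the subset of women with at least one ANC visit.

```python
# Toy records: (region, v005, had_anc, outcome). All values are invented.
# v005 is stored with a factor of 1,000,000, as in the DHS recode files.
records = (
    [("A", 500_000, True, 1)] * 30      # region A: over-sampled, small weight
    + [("A", 500_000, True, 0)] * 30
    + [("B", 2_000_000, True, 1)] * 18  # region B: under-sampled, large weight
    + [("B", 2_000_000, True, 0)] * 2
    + [("A", 500_000, False, 0)] * 10   # women without ANC, excluded below
    + [("B", 2_000_000, False, 0)] * 10
)

def unweighted_proportion(rows):
    """Share of rows with outcome == 1, ignoring the weights."""
    return sum(y for _, _, _, y in rows) / len(rows)

def weighted_proportion(rows):
    """Share of rows with outcome == 1, each row counted by its v005."""
    total_weight = sum(w for _, w, _, _ in rows)
    return sum(w * y for _, w, _, y in rows) / total_weight

# Restrict to the subset of interest: women with at least one ANC visit.
anc_subset = [r for r in records if r[2]]

p_unweighted = unweighted_proportion(anc_subset)
p_weighted = weighted_proportion(anc_subset)

print(f"unweighted: {p_unweighted:.3f}")
print(f"weighted:   {p_weighted:.3f}")
```

In this toy subset the unweighted proportion is pulled toward region A simply because A contributes more interviews, while the weighted proportion gives region B its larger population share. A real analysis would also account for the survey design (clusters and strata) when computing standard errors, which this sketch does not attempt.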
You do not have to worry about the normalization of the weights. If you are using Stata with the "pweight" option (as in svyset), then the weights are automatically re-normalized so that the total weight equals the number of actual cases in the analysis. (The other three weight options in Stata do NOT re-normalize.) Other packages such as SPSS may also re-normalize the weights, although I cannot say for sure. If you have any doubt whether the weights are automatically re-normalized within the package and procedure you are using, you can run something with v005, which includes a factor of 1,000,000, and see whether the factor of 1,000,000 has been removed from the weighted number of cases. If it has, then the weights were automatically re-normalized.
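The reason normalization does not matter for means and proportions can be checked by hand. The Python sketch below uses five invented cases and computes the same weighted proportion three ways: with raw v005, with v005 divided by 1,000,000, and with weights rescaled to sum to the number of cases in the subset. It does not reproduce what Stata or SPSS do internally; it only shows that a common scale factor cancels between numerator and denominator, while the weighted number of cases does change with the scale, which is what the factor-of-1,000,000 check relies on.

```python
# Five invented cases from some analysis subset.
v005 = [500_000, 500_000, 2_000_000, 2_000_000, 1_250_000]
outcome = [1, 0, 1, 1, 0]

def weighted_mean(values, weights):
    """Weighted mean: sum(w * v) / sum(w)."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)

# 1) Raw v005, factor of 1,000,000 still included.
p_raw = weighted_mean(outcome, v005)

# 2) v005 divided by 1,000,000.
w_scaled = [w / 1_000_000 for w in v005]
p_scaled = weighted_mean(outcome, w_scaled)

# 3) Weights re-normalized so they sum to the number of cases in the subset.
n = len(w_scaled)
total = sum(w_scaled)
w_renorm = [w * n / total for w in w_scaled]
p_renorm = weighted_mean(outcome, w_renorm)

# The proportion is the same under all three scalings...
print(p_raw, p_scaled, p_renorm)

# ...but the weighted number of cases is not, so it reveals the scaling in use.
print(sum(v005), sum(w_scaled), sum(w_renorm))
```

If a package reports a weighted number of cases in the millions when given v005 directly, the factor of 1,000,000 is still present; if it reports a figure close to the unweighted count, the weights have been rescaled.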