The DHS Program User Forum
Discussions regarding The DHS Program data and results
Correct use of weights for subsamples Thu, 16 November 2023 08:06
Christian Bommer (Messages: 13, Registered: June 2015, Member)
Dear DHS team,

I have a fairly general and probably simple question but I couldn't find the answer by browsing the forum (sorry if I missed anything). I am using multiple Women's Recode (IR) datasets from various countries. For each of them I want to calculate aggregate statistics. I know that the surveys include standard weight variables such as V005 that should be used when trying to estimate aggregate nationally representative statistics. However, I was wondering if additional weights are required when I look at certain subpopulations.

For instance, say I create a binary indicator that captures whether a woman below the age of 18 (at interview) has at least one child (using V201, number of children, and V012, age of respondent). The aggregate statistic I want to derive from this variable is the percentage of women below the age of 18 who have at least one child (so all women below the age of 18 are the denominator). Is it sufficient to use the weight V005 for this, or do I need an additional weight that accounts for the fact that I only look at a subset of women (those below the age of 18)?

Another example that is slightly different: I want to know the percentage of women who have at least one child (regardless of women's age at interview) but I want to know this percentage for the following subgroups:
- rural households (V102)
- poorest-quintile households (based on the asset index) (V190)
- women who are literate (V155)
- women who did not complete primary education (V106)

In case you need a specific survey for the answer: One of the surveys I want to use is the 2015/16 DHS from Tanzania (TZIR7B). I will work with Stata.

Best regards,
Christian

Re: Correct use of weights for subsamples [message #28155 is a reply to message #28127] Mon, 20 November 2023 08:08
Bridgette-DHS (Messages: 3115, Registered: February 2013, Senior Member)

Following is a response from Senior DHS staff member, Tom Pullum:

The sampling weight coded in the data files is a characteristic of the cases. It is proportional to the inverse of the sampling fraction. The sampling fraction varies according to the survey design, mainly from one stratum (v023) to another, although it includes a small adjustment for nonresponse. Stratum and nonresponse are the only sources of variation in the weights. Weights do not vary by covariates and do not need to be adjusted if you select subpopulations. (If your selection involved subsampling with different sampling fractions for different subpopulations, then an adjustment would be required, but I don't think that's ever done.)
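
As a rough illustration of the point above, here is a minimal Python sketch (not Stata, and not DHS code) that computes a weighted percentage within a subpopulation using the weight as-is. The records and the helper name `weighted_share` are invented for the example; only the variable names v005, v012, and v201 come from the question.

```python
# Hypothetical records: (v005, v012 = age, v201 = number of children).
# The values are invented for illustration, not taken from any DHS file.
records = [
    (1500000, 16, 0),
    (1500000, 17, 1),
    (500000, 15, 0),
    (500000, 17, 1),
    (500000, 17, 0),
    (1200000, 25, 2),  # outside the subpopulation, so it is ignored below
]


def weighted_share(rows, in_subpop, has_event):
    """Weighted proportion with the event among rows in the subpopulation."""
    # Numerator: summed weights of subpopulation members with the event.
    num = sum(w for w, age, kids in rows
              if in_subpop(age, kids) and has_event(age, kids))
    # Denominator: summed weights of all subpopulation members.
    den = sum(w for w, age, kids in rows if in_subpop(age, kids))
    return num / den


# Women under 18 with at least one child, weighted by v005 unchanged.
share = weighted_share(
    records,
    in_subpop=lambda age, kids: age < 18,
    has_event=lambda age, kids: kids >= 1,
)
print(round(100 * share, 1))
```

With these made-up values the weighted share is 2.0/4.5, about 44.4 percent, versus 2/5 = 40 percent unweighted. No extra weight enters the calculation when the sample is restricted to women under 18.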

Except for the factor of 1,000,000, the weighted and unweighted numbers of cases in a subpopulation are about the same, but never exactly the same, and such differences are not a problem. The purpose of the weights is to adjust the sample so that estimates of means, proportions, etc., are unbiased, and this requires weighting up or weighting down, depending on whether a stratum was under-sampled or over-sampled.
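
The role of the factor of 1,000,000 can be sketched in Python as follows. The weights below are invented and `proportion` is a hypothetical helper; the sketch only assumes that v005 is stored as the weight multiplied by 1,000,000.

```python
# Hypothetical (v005, has_child) records; the values are invented.
records = [
    (1600000, 1),
    (1600000, 0),
    (650000, 1),
    (650000, 1),
    (650000, 0),
    (650000, 0),
]

SCALE = 1_000_000  # assumed scaling factor of the stored weight


def proportion(weights, flags):
    """Weighted proportion of cases whose flag is set."""
    return sum(w for w, f in zip(weights, flags) if f) / sum(weights)


raw = [w for w, _ in records]        # weights as stored
scaled = [w / SCALE for w in raw]    # weights divided by 1,000,000
flags = [f for _, f in records]

unweighted_n = len(records)          # number of cases
weighted_n = sum(scaled)             # weighted number of cases

p_raw = proportion(raw, flags)
p_scaled = proportion(scaled, flags)
print(unweighted_n, round(weighted_n, 2), round(p_raw, 3), round(p_scaled, 3))
```

Here the weighted count (5.8) is close to but not equal to the unweighted count (6), and the proportion is 0.5 whether or not the weights are divided by 1,000,000, since the factor cancels in the ratio.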
Re: Correct use of weights for subsamples [message #28158 is a reply to message #28155] Mon, 20 November 2023 09:07
Christian Bommer (Messages: 13, Registered: June 2015, Member)
Thank you!