The DHS Program User Forum
Discussions regarding The DHS Program data and results
Resampling data to obviate the need for weighting [message #25839] Fri, 16 December 2022 16:58
Kene David Nwosu
I am helping to teach an *intro to data analysis* class, and want students to be able to use DHS data for their work. But students are not yet familiar with weights.

Is it possible for me to resample the datasets (women's recode) such that the resampled data is nationally representative WITHOUT requiring weights?

And could you point me in the right direction on how I might do this (in R or Stata)?

Thank you so much!
Re: Resampling data to obviate the need for weighting [message #25848 is a reply to message #25839] Mon, 19 December 2022 09:31
Bridgette-DHS

Following is a response from Senior DHS staff member, Tom Pullum:

Thanks for including DHS data in your courses! Unfortunately, weights are always required to produce unbiased estimates. You are right that much, maybe most, of the variation in weights is across strata, because large strata tend to be under-sampled and small strata tend to be over-sampled, to increase the efficiency (that is, to reduce the standard errors) of the national estimates. However, the weights also vary considerably across clusters in the same stratum. Variation in weights across clusters arises because (a) the observed number of households within a cluster typically differs from the number in the sampling frame and (b) an adjustment is made for nonresponse.

Within a cluster, the weights (hv005, v005, mv005, or d005) are the same for all cases (households, women, men, or respondents to the DV module). If you wanted to calculate, say, the mean number of children ever born (v201) or the proportion of women with no schooling (v106=0) for a specific cluster, you would not need to use weights. However, cluster-level analysis is almost never done. It IS done if you want to develop cluster-level covariates as contextual variables, but that's not a good example for an introduction to data analysis.
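
A minimal sketch of the within-cluster point, assuming a handful of invented records for one hypothetical cluster (the cluster id, weights, and v201 values below are made up for illustration). Because every case in the cluster carries the same v005, the weighted and unweighted means of v201 coincide. The division by 1,000,000 reflects the six implied decimal places in the stored weight; it does not affect a weighted mean, but it is the usual first step.

```python
# Toy records for one hypothetical cluster: (v001 cluster id, v005 raw weight, v201).
# All values are invented for illustration; v005 is stored with six implied decimals.
cluster = [
    (17, 1250000, 2),
    (17, 1250000, 0),
    (17, 1250000, 5),
    (17, 1250000, 3),
]


def unweighted_mean(rows):
    """Plain average of v201, giving every woman equal weight."""
    return sum(v201 for _, _, v201 in rows) / len(rows)


def weighted_mean(rows):
    """Average of v201 weighted by v005 / 1,000,000."""
    weights = [v005 / 1_000_000 for _, v005, _ in rows]
    total = sum(w * v201 for w, (_, _, v201) in zip(weights, rows))
    return total / sum(weights)


cluster_unweighted = unweighted_mean(cluster)
cluster_weighted = weighted_mean(cluster)
print(cluster_unweighted, cluster_weighted)  # both are 2.5
```

The equality holds only because the weight is constant within the cluster; as soon as records from clusters with different weights are pooled, the two means can diverge.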

I suggest that you go ahead and introduce the students to data analysis just ignoring the weights and tell them that this is often done for exploratory work. Personally, I often start a new analysis without weights or any other adjustments, just to see whether there appears to be something going on in the data. And for data quality analysis it is normal to give equal weight to each observation, which is equivalent to ignoring the weights. Some researchers, mainly economists, prefer not to use weights at all, because it is more important for them to minimize the standard errors than to minimize the bias. For teaching purposes, I'd say it's ok to omit weights, but please do tell the students that their results are approximate.
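
To give students a feel for what "approximate" can mean, here is a small sketch under strong simplifying assumptions: a hypothetical design with two strata, equal sample sizes, and a weight equal to the inverse of the sampling fraction. All numbers are invented, and the sketch ignores the cluster-level variation and nonresponse adjustment that real DHS weights include. It compares the unweighted and weighted proportion of women with no schooling.

```python
# Hypothetical design: two strata with equal sample sizes but very different populations.
# Every number here is invented for illustration.
strata = {
    "large": {"population": 900_000, "sampled": 100, "no_schooling": 40},
    "small": {"population": 100_000, "sampled": 100, "no_schooling": 10},
}

rows = []  # one (weight, indicator) pair per sampled woman
for s in strata.values():
    weight = s["population"] / s["sampled"]  # inverse of the sampling fraction
    for i in range(s["sampled"]):
        rows.append((weight, 1 if i < s["no_schooling"] else 0))

# Unweighted: every sampled woman counts equally, so the small stratum is over-represented.
unweighted = sum(y for _, y in rows) / len(rows)

# Weighted: each woman counts in proportion to the number of women she represents.
weighted = sum(w * y for w, y in rows) / sum(w for w, _ in rows)

print(f"unweighted: {unweighted:.3f}")  # 0.250
print(f"weighted:   {weighted:.3f}")    # 0.370
```

In this toy design the unweighted figure understates the population proportion because the under-sampled large stratum has the higher rate. How large the gap is in a real survey depends on how much the weights vary and how strongly the outcome differs across strata.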

Re: Resampling data to obviate the need for weighting [message #25849 is a reply to message #25848] Mon, 19 December 2022 10:03
Kene David Nwosu
Thank you very much for the detailed answers! I appreciate the explanation of the reasons why weights are required even within strata. I will proceed as you suggest, by having students work without weights, but mentioning that their results are approximate. Thank you again for your world-improving work.