The DHS Program User Forum
Discussions regarding The DHS Program data and results
Weighting in pooled data [message #9772] Mon, 16 May 2016 07:55
owraza
Messages: 36
Registered: December 2013
Location: Karachi, Pakistan
Member
I have read extensively about weighting on this forum (and re-watched the DHS webinar on weighting). While doing so, I found a comment by Tom Pullum (DHS) in message #6672, quoted here:

"If you construct a cluster-level variable using the collapse command, it is not necessary to use weights at all, because everyone in the same cluster has the same weight. To confirm this, you could collapse WITH weights and then collapse WITHOUT weights, and compare the two sets of numbers. They should be exactly the same.

However, if you want to collapse for a larger aggregate, such as a district or region, which includes more than one cluster, you definitely should use weights as part of the collapse."

My question is: does this advice still hold when I pool several countries together (after collapsing at the cluster level) and run regression analyses? If so, at which step should I apply the weights?
Re: Weighting in pooled data [message #9779 is a reply to message #9772] Tue, 17 May 2016 05:59
Bridgette-DHS
Messages: 3172
Registered: February 2013
Senior Member
The following is a response from Senior DHS Stata Specialist Tom Pullum:

My comment in message #6672 may have been incomplete. If you calculate a cluster-level mean, proportion, standard deviation, etc., it will be the same whether or not you use weights. However, for analyses that include the clusters as units, you do need to save the total weight for the cluster.
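
The point about cluster-level means can be checked on a small example. The sketch below uses invented toy values (not DHS data) and plain Python rather than Stata; it assumes only that everyone in a cluster shares one weight, and it also keeps the total weight per cluster for later use.

```python
# Toy data: (cluster id, weight, children ever born). All values are invented.
rows = [
    (1, 1.5, 2), (1, 1.5, 4), (1, 1.5, 3),
    (2, 0.5, 1), (2, 0.5, 5),
]

def cluster_means(rows, use_weights):
    """Return {cluster: mean of y}, either weighted or unweighted."""
    num, den = {}, {}
    for cluster, w, y in rows:
        k = w if use_weights else 1.0
        num[cluster] = num.get(cluster, 0.0) + k * y
        den[cluster] = den.get(cluster, 0.0) + k
    return {c: num[c] / den[c] for c in num}

weighted = cluster_means(rows, use_weights=True)
unweighted = cluster_means(rows, use_weights=False)

# Within a cluster the weight is constant, so it cancels out of the mean.
same = all(abs(weighted[c] - unweighted[c]) < 1e-12 for c in weighted)

# Total weight per cluster, saved for analyses that use clusters as units.
total_weight = {}
for cluster, w, _ in rows:
    total_weight[cluster] = total_weight.get(cluster, 0.0) + w
```

Here `same` is true for the toy data, while `total_weight` differs between the two clusters, which is why it has to be carried along.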

When pooling multiple surveys, I would first re-scale the weights (e.g. hv005) in each survey by a factor. For example, if you have S surveys, Ni total (weighted=unweighted) cases in survey i, and a total of N cases in all S surveys (N=sum Ni) then you could decide to give equal weight to each survey. You then want the weights in survey i to add to N/S, rather than to Ni. To do that, you multiply the weights in survey i by the ratio (N/S) / Ni. (I think of this as the target total divided by the original total.) You can actually do this re-scaling later, not necessarily just at the beginning....
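
As a rough illustration of the arithmetic, the factor for each survey is (N/S) / Ni. The sketch below is plain Python with hypothetical survey names and case counts (not real DHS figures), under the stated choice of giving every survey equal total weight.

```python
# Hypothetical survey sizes N_i (invented, not real DHS counts).
cases = {"survey_A": 8000, "survey_B": 12000, "survey_C": 4000}

S = len(cases)            # number of surveys
N = sum(cases.values())   # total cases across all surveys
target = N / S            # each survey's weights should sum to N/S

# factor_i = target total / original total = (N/S) / N_i
factors = {name: target / n_i for name, n_i in cases.items()}

# Multiplying a survey's weights by its factor changes their sum from N_i
# to the target total, so every survey ends up with the same total.
rescaled_totals = {name: cases[name] * factors[name] for name in cases}
```

Surveys larger than N/S get a factor below one and smaller surveys get a factor above one. Other targets (for example, totals proportional to population) would use the same pattern with a different `target` per survey.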

Then, when you collapse to get cluster-level means, you can ignore the weights in calculating the means themselves, as I said, but you must save the weighted total for each cluster. For example, say you are using the IR file and want the mean CEB (children ever born, which is v201) for each cluster. Part of the within-survey collapse would look like this: "collapse (mean) v201 (sum) v005, by(v001)". Then in your analysis you would treat the collapsed (summed) v005 as the weight. You should also carry along the stratum code and use svyset to adjust for weights and strata, although not for clusters, because the clusters are now your units. Hope this helps.
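
For readers who want to see what that collapse produces, here is a minimal Python sketch of the same operation on invented rows: unweighted mean of v201, summed v005, one record per v001. The `stratum` field and all values are hypothetical stand-ins for illustration, and the final weighted mean is only one simple example of using the summed weight in a cluster-level analysis.

```python
# Toy IR-style rows: v001 = cluster, v005 = weight, v201 = children ever born.
# The stratum codes and all values are invented for illustration.
women = [
    {"v001": 1, "stratum": 10, "v005": 1.5, "v201": 2},
    {"v001": 1, "stratum": 10, "v005": 1.5, "v201": 4},
    {"v001": 1, "stratum": 10, "v005": 1.5, "v201": 3},
    {"v001": 2, "stratum": 20, "v005": 0.5, "v201": 1},
    {"v001": 2, "stratum": 20, "v005": 0.5, "v201": 2},
]

def collapse_by_cluster(rows):
    """Unweighted mean of v201 and summed v005 per cluster, keeping the stratum."""
    acc = {}
    for r in rows:
        c = acc.setdefault(
            r["v001"],
            {"stratum": r["stratum"], "n": 0, "sum_v201": 0.0, "v005": 0.0},
        )
        c["n"] += 1
        c["sum_v201"] += r["v201"]
        c["v005"] += r["v005"]
    return [
        {"v001": k, "stratum": c["stratum"],
         "v201": c["sum_v201"] / c["n"], "v005": c["v005"]}
        for k, c in sorted(acc.items())
    ]

collapsed = collapse_by_cluster(women)

# Cluster-level analysis: weight each cluster mean by its summed v005.
cluster_level_mean = (
    sum(c["v005"] * c["v201"] for c in collapsed)
    / sum(c["v005"] for c in collapsed)
)

# For comparison: the individual-level weighted mean from the original rows.
individual_mean = (
    sum(r["v005"] * r["v201"] for r in women) / sum(r["v005"] for r in women)
)
```

With the summed weights, the cluster-level mean reproduces the individual-level weighted mean in this toy case; without them, the two clusters would count equally despite their different total weights.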
Re: Weighting in pooled data [message #9787 is a reply to message #9779] Wed, 18 May 2016 11:16
owraza
Thanks for your post. Just one more question: will I be able to re-scale the weights after I have merged and collapsed the data (at the cluster level)?
Re: Weighting in pooled data [message #9795 is a reply to message #9787] Thu, 19 May 2016 10:13
Bridgette-DHS
Another response from Tom Pullum:

Yes, you should be able to do the rescaling later. I would go back to the original files and calculate the survey-specific ratios or factors that I described. That's one number for each survey. Some factors will be greater than one, some will be less than one. Then go to the final (merged and collapsed) files and multiply the weights (or weighted frequencies) by those factors, survey by survey. To check, you can confirm that in the final file each survey has the same total weight. There are other ways to get the same result but this should work.
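
A sketch of this late rescaling and the suggested check, in plain Python with an invented merged file: the survey labels and weights are hypothetical. One assumption to flag: the factors here are computed from the summed cluster weights rather than from the original files, which gives the same numbers only because summing v005 within clusters preserves each survey's total.

```python
# Hypothetical merged, cluster-level file: one row per cluster per survey.
# v005 is the summed weight from the within-survey collapse (values invented).
clusters = [
    {"survey": "A", "v001": 1, "v005": 300.0},
    {"survey": "A", "v001": 2, "v005": 500.0},
    {"survey": "B", "v001": 1, "v005": 700.0},
    {"survey": "B", "v001": 2, "v005": 900.0},
]

def survey_totals(rows, key):
    """Sum the given weight column within each survey."""
    totals = {}
    for r in rows:
        totals[r["survey"]] = totals.get(r["survey"], 0.0) + r[key]
    return totals

original = survey_totals(clusters, "v005")   # original total per survey
S = len(original)
N = sum(original.values())

# One factor per survey: target total (N/S) divided by original total.
factors = {s: (N / S) / n_i for s, n_i in original.items()}

# Multiply the weights by the factors, survey by survey.
for r in clusters:
    r["v005_rescaled"] = r["v005"] * factors[r["survey"]]

# Check: each survey should now have the same total weight.
check = survey_totals(clusters, "v005_rescaled")
```

In this toy file survey A gets a factor above one and survey B a factor below one, and both totals in `check` come out equal to N/S.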
Re: Weighting in pooled data [message #9797 is a reply to message #9795] Thu, 19 May 2016 16:52
owraza
I will follow this process and let you know the result.
Thanks