Re: Domestic violence weight, denormalize, pooled crosssection, crosstabulation [message #10057 is a reply to message #10047] 
Mon, 20 June 2016 11:34 
BridgetteDHS
Messages: 3090 Registered: February 2013

Senior Member 


Following is a response from Senior DHS Stata Specialist, Tom Pullum:
Pooling surveys into a single file is convenient for data processing and for calculating differences, but as you imply, the reference population is not well defined. I do not recommend calculating a mean (or something like a mean) for all surveys combined. But sometimes people do this and there's no law against it. If you decide to do this, I would recommend giving equal weight to each survey, which means rescaling v005 or d005 in each survey so that the weighted total is the same in each survey. That is, if there are 10 surveys, and the UNWEIGHTED total number of women with d005<. (that is, with non-missing d005) in all 10 surveys is N, then rescale d005 in each survey so that the WEIGHTED total in each survey is N/10. As I said, however, I would be reluctant to pool the surveys this way.
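To make the arithmetic of that rescaling concrete, here is a minimal sketch. It is written in Python rather than Stata, purely to show the calculation, and it is not DHS code. The function name, the survey labels "A" and "B", and the toy weights are all hypothetical. A weight of None stands in for a missing d005.

```python
from collections import defaultdict


def rescale_equal_survey_weight(records):
    """Rescale weights so every survey has the same weighted total, N/K.

    records: list of (survey_id, weight) pairs; weight is None when the
             weight (e.g. d005) is missing for that woman.
    N is the unweighted count of non-missing weights across all surveys,
    K is the number of surveys with at least one non-missing weight.
    """
    valid = [(s, w) for s, w in records if w is not None]
    n_total = len(valid)                      # N: unweighted pooled count

    survey_totals = defaultdict(float)        # current weighted total per survey
    for s, w in valid:
        survey_totals[s] += w

    target = n_total / len(survey_totals)     # N / K

    # Multiply each weight by target / (its survey's current weighted total).
    return [
        None if w is None else w * target / survey_totals[s]
        for s, w in records
    ]


# Hypothetical toy data: two surveys, one woman in "A" with a missing weight.
records = [
    ("A", 1_500_000), ("A", 500_000), ("A", None),
    ("B", 2_000_000), ("B", 1_000_000), ("B", 1_000_000), ("B", 2_000_000),
]
rescaled = rescale_equal_survey_weight(records)

# Weighted total per survey after rescaling.
new_totals = defaultdict(float)
for (s, _), w in zip(records, rescaled):
    if w is not None:
        new_totals[s] += w
print(dict(new_totals))
```

In this toy example N is 6 and there are 2 surveys, so each survey ends up with a weighted total of 3, even though survey "B" has twice as many valid cases as survey "A". Only the relative weights within each survey are preserved, which is why the original scale of the weights (for example, whether d005 has already been divided by 1,000,000) does not matter here.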
I would prefer to use the pooled data to do regressions that include "survey" as a categorical variable for fixed effects OR, if you have a lot of surveys, a random effect for the intercept. For such regressions, you do not need to rescale the weights, but can leave them as they are in each survey. Then the total weighted number of cases will equal the total unweighted number of cases in each survey and for the combination of all surveys. From a statistical perspective, this is good because, as you say, the actual number of cases is what you need for a valid estimate of sampling error. And you are not producing an estimate of an overall mean (or proportion, etc.).
Yes, you can renormalize d005 in the same way as v005. (I prefer "renormalize" to "denormalize". I don't think the latter term means the same thing for everyone.)
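Stata recommends that you use the subpop() option when you restrict the analysis to a subpopulation. Note that subpop() is an option of the svy prefix, as in svy, subpop(...): command, and not of svyset, which only declares the survey design. I have done some checking and the difference between using subpop and NOT using subpop is always very small, much smaller than sampling error, but there are good theoretical reasons for using it. You refer to it but I don't see that option in the commands you posted.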