Pakistan 2014 - Using correct weights for domestic violence [message #8478]
Sun, 01 November 2015 23:50
gspek
Messages: 4 Registered: November 2015
Member
I am using the Pakistan 2014 women's dataset to analyze associations between women's empowerment (at both the individual and community level) and domestic violence experiences (at the individual level). I am currently struggling with a couple of issues:
1. How do I correctly weight community-level independent variables? I tried to create community-level weighted means for the percentage of women working in a community, the mean decision-making index value, and the percentage in the community accepting any of the domestic violence justification scenarios (with the participant's own values excluded). However, I'm not sure how to handle the weights in regressions that include both community-level and individual-level independent variables.
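foreach var of varlist works dm_sum dvaccept_any {
    * weighted value for each respondent with non-missing data
    gen `var'_w = `var'*wgt if !missing(`var') & nocomm == 0
    * egen total() puts the cluster total on every row; gen ... = sum() would only give a running sum
    bys v001: egen `var'_agg_w = total(`var'_w) if nocomm == 0
    bys v001: egen `var'_wsum = total(wgt) if !missing(`var') & nocomm == 0
    * leave-one-out weighted mean: remove the respondent's own contribution from numerator and denominator
    gen `var'_comm_w = (`var'_agg_w - `var'_w)/(`var'_wsum - wgt) if nocomm == 0
}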
2. For logit analysis (and summary statistics) in Stata, do I use iweights or pweights?
Thank you!
Re: Pakistan 2014 - Using correct weights for domestic violence [message #8546 is a reply to message #8533]
Thu, 12 November 2015 09:46
Bridgette-DHS
Messages: 3203 Registered: February 2013
Senior Member
Following is a response from Senior DHS Stata Specialist, Tom Pullum:
Yes, you should use the strata. Because stratification makes the sample more efficient, the stratum adjustment will usually slightly reduce the standard errors and narrow the confidence intervals. The stratum variable is v022 or v023, usually defined by the combinations of urban/rural residence and region. Sometimes it is ambiguous which variable identifies the strata; DHS is preparing a list covering all the surveys.
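As a rough illustration of the stratum adjustment, a survey-design setup in Stata might look like the sketch below. It assumes the DV weight is stored as d005 (the name used in the standard women's recode), that v023 is the correct stratum variable for this survey, and that v001 identifies the cluster (PSU). The outcome dv_any is a hypothetical name; works and dm_sum are taken from the earlier post.

```stata
* Sketch only: check the stratum variable (v022 vs v023) for your survey.
* DHS weights are stored multiplied by 1,000,000.
gen double dvwt = d005/1000000

* Declare clusters, sampling weights, and strata once
svyset v001 [pweight=dvwt], strata(v023) singleunit(centered)

* Weighted summary statistics and a logit model that respect the design
* dv_any is a hypothetical 0/1 outcome; works and dm_sum come from the question
svy: mean works dm_sum
svy: logit dv_any i.works dm_sum
```

With svyset in place, the weight does not need to be repeated on each estimation command; the svy: prefix applies the declared pweight, clusters, and strata.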
If you merge the male data with the DV data then you are probably talking about a couples file. Usually the recommendation is that you use the male weights (mv005) for couples. In your situation, it could be better to use dv005. The general rule is to use the weights that have been calculated for the subsample with the highest level of nonresponse. Men have more nonresponse than women. Among women, there is a higher level of nonresponse for the DV questions than for non-DV questions. It could be a judgment call, whether you would use the men's weights or the DV weights for these couples. If there is ambiguity about what weights to use, I suggest you do some analysis with one set and then repeat with the other set. I would expect very little difference.
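The suggestion to repeat the analysis under both sets of weights could be sketched as follows. This assumes a couples file that contains both the men's weight mv005 and the DV weight (named d005 in the standard recode files), with v001 as the cluster and v023 as the stratum; dv_any is a hypothetical outcome name.

```stata
* Sketch only: fit the same model under each candidate weight and compare
foreach w in mv005 d005 {
    * DHS weights are stored multiplied by 1,000,000
    gen double wt_`w' = `w'/1000000
    svyset v001 [pweight=wt_`w'], strata(v023)
    * dv_any is a hypothetical 0/1 outcome
    svy: logit dv_any i.works dm_sum
    estimates store m_`w'
}

* Side-by-side coefficients and standard errors
estimates table m_mv005 m_d005, se
```

If the two columns are close, the choice of weight matters little for the conclusions.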
To construct your cluster-level variables, you can do a collapse, within which you can use the weight option, and then attach the cluster-level variables to the individual-level records. You should use the weight option rather than multiplying the variables by the weights, which you did in some earlier code.
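A minimal sketch of the collapse approach, under the assumption that the weight is d005 and the cluster identifier is v001; the _comm variable names are made up for illustration. Note that these cluster means include the respondent's own value, unlike the leave-one-out version in the original question.

```stata
* Sketch only: weighted cluster means via collapse, merged back to individuals
gen double cwt = d005/1000000

preserve
* one record per cluster, holding the weighted mean of each variable
collapse (mean) works_comm=works dm_comm=dm_sum dvacc_comm=dvaccept_any ///
    [pweight=cwt], by(v001)
tempfile comm
save `comm'
restore

* attach the cluster-level variables to every individual record
merge m:1 v001 using `comm', nogenerate
```

Any restriction on which clusters or respondents contribute (such as the nocomm flag in the question) would go in an if qualifier on the collapse line.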