How to control for the year of sample [message #21906]
Thu, 07 January 2021 08:44
olympiaca
Messages: 9 Registered: January 2021
Member
Hello,
I'm using the domestic violence data to look at the likelihood of a woman experiencing violence from her husband. Currently I'm combining three Jordan DHS surveys: 2007, 2012, and 2017. The data are structured in that respondents are grouped into regions and also grouped by cohort.
I'm interested in the difference in the likelihood of experiencing violence between cohorts, as well as the effects of some other variables.
Is it appropriate to fit a two-level model where I use region as my grouping variable and include cohort as a predictor variable? Or should I be fitting a three-level model with individuals nested in cohorts nested in regions?
Statistically, I'm not sure which one is appropriate.
Many thanks
O
|
Re: How to control for the year of sample [message #21932 is a reply to message #21911]
Fri, 08 January 2021 09:45
Bridgette-DHS
Messages: 3195 Registered: February 2013
Senior Member
Following is a response from DHS Research & Data Analysis Director, Tom Pullum:
There have been many posts on this topic (pooling or comparing surveys), but mostly in terms of Stata, because that is what we mostly use.
The strata in successive surveys should have different codes for the purpose of adjusting for the survey design, even when the strata are the same in each survey. You need some mechanism for assigning different ID codes in the different surveys. For example, if you number the surveys 1, 2, 3, you could construct a variable "stratumid" as "V022 + 1000*(survey-1)", which works as long as the V022 codes are all below 1000. In Stata we would use "egen stratumid=group(survey v022)" to construct distinct numbers.
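For anyone not working in Stata, the arithmetic version of this recode can be sketched in plain Python. This is only an illustration: the function name `make_stratum_id` and the toy records are hypothetical, and it assumes every V022 code is below 1000 so that codes from different surveys cannot collide.

```python
def make_stratum_id(survey, v022):
    """Return a stratum ID that is unique across pooled surveys.

    survey: survey index (1, 2, 3, ...), an assumed numbering of the pooled files
    v022:   within-survey stratum code (assumed to be an integer below 1000)
    """
    if survey < 1:
        raise ValueError("survey index must start at 1")
    if not 0 <= v022 < 1000:
        raise ValueError("v022 must be in 0..999 for this offset to be safe")
    return v022 + 1000 * (survey - 1)


# Toy records (hypothetical values): the same stratum code 5 appears in
# surveys 1 and 2 but ends up with different pooled IDs.
records = [
    {"survey": 1, "v022": 5},
    {"survey": 2, "v022": 5},
    {"survey": 3, "v022": 12},
]
for rec in records:
    rec["stratumid"] = make_stratum_id(rec["survey"], rec["v022"])

print([rec["stratumid"] for rec in records])  # [5, 1005, 2012]
```

The offset of 1000 is arbitrary; any multiplier larger than the largest stratum code would keep the IDs distinct.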
There are different possibilities for the weights. Say that the number of cases in each survey is n1, n2, n3. You could construct a new weight "v005rev" that would be proportional to v005 within each survey but would sum to (n1+n2+n3)/3 within each survey, so that each survey contributes equally. This is spelled out in other posts. Alternatively, you could leave the weights alone, but then the estimates would be biased toward the largest survey.
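The equal-contribution reweighting can be sketched as follows in plain Python. The function name `rescale_weights` and the toy data are hypothetical, and the sketch generalizes the target (n1+n2+n3)/3 to "total cases divided by number of surveys". Because the rescaling is proportional within each survey, the factor of 1,000,000 built into v005 cancels out.

```python
from collections import defaultdict


def rescale_weights(records):
    """Add 'v005rev' to each record.

    Within each survey, v005rev is proportional to v005 and sums to
    (total number of cases) / (number of surveys), so every survey
    carries the same total weight in the pooled file.
    """
    weight_sum = defaultdict(float)  # sum of v005 within each survey
    for rec in records:
        weight_sum[rec["survey"]] += rec["v005"]

    target = len(records) / len(weight_sum)  # e.g. (n1 + n2 + n3) / 3

    for rec in records:
        rec["v005rev"] = rec["v005"] * target / weight_sum[rec["survey"]]
    return records


# Toy pooled file (hypothetical): survey 1 has 2 cases, survey 2 has 4.
pooled = [
    {"survey": 1, "v005": 1_000_000},
    {"survey": 1, "v005": 3_000_000},
    {"survey": 2, "v005": 500_000},
    {"survey": 2, "v005": 500_000},
    {"survey": 2, "v005": 500_000},
    {"survey": 2, "v005": 500_000},
]
rescale_weights(pooled)

# Each survey's rescaled weights now sum to 6 / 2 = 3.
totals = defaultdict(float)
for rec in pooled:
    totals[rec["survey"]] += rec["v005rev"]
print(dict(totals))
```

With the original weights left alone, survey 2 would carry twice the total weight of survey 1 here simply because it has twice as many cases.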
You should still include region as a predictor in the regression. Including it (via V022) in the survey adjustments will only adjust for the design. It does not control for region in the analysis.