The DHS Program User Forum: Sampling » Accounting for different sampling areas over different years

Accounting for different sampling areas over different years [message #3783]

Tue, 10 February 2015 23:57

UAB_user
Messages: 21
Registered: September 2014
Location: Alabama

Member

Hello,

I am using the Nepal DHS to look at factors affecting migration across the 2001, 2006, and 2011 survey years.

I have de-normalized the weights for each year according to Ruilin's suggestions, but do I also have to account for the different sampling areas in each year? Would it be OK to merge all three years, use the cluster (V001) and strata (V023) variables in my analysis, and assume the areas are the same for each survey round?

If I do have to adjust them, how do you recommend I go about doing so?

Thank you
Derek

Re: Accounting for different sampling areas over different years [message #3791 is a reply to message #3783]

Wed, 11 February 2015 10:44

Bridgette-DHS
Messages: 3230
Registered: February 2013

Senior Member

Following is a response from Senior DHS Specialists Ruilin Ren & Trevor Croft:

You need to distinguish cluster numbers (V001) that have the same value but actually come from different surveys. Each cluster from each survey should stand alone as a cluster in your merged file.

You can create a new cluster number as follows:
* Convert Nepali calendar years (v007) to unique survey years
gen year = .
replace year = 2011 if v007 == 2067 | v007 == 2068
replace year = 2006 if v007 == 2062 | v007 == 2063
replace year = 2001 if v007 == 2057 | v007 == 2058
* One cluster ID per combination of survey year and original cluster number
egen newcluster = group(year v001)

Re: Accounting for different sampling areas over different years [message #3794 is a reply to message #3791]

Wed, 11 February 2015 18:14

UAB_user
Messages: 21
Registered: September 2014
Location: Alabama

Member

Great!

Thank you
Derek

Re: Accounting for different sampling areas over different years [message #3813 is a reply to message #3791]

Tue, 17 February 2015 15:56

UAB_user
Messages: 21
Registered: September 2014
Location: Alabama

Member

Would I have to do this to V023 as well?

Re: Accounting for different sampling areas over different years [message #3825 is a reply to message #3813]

Wed, 18 February 2015 07:58

Trevor-DHS
Messages: 805
Registered: January 2013

Senior Member

While the strata in v023 are consistent across the 3 surveys and represent the same areas (unlike v001, which identifies different clusters in each survey year), I would recommend following the same procedure as for v001 to create separate strata for each survey year.
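
A minimal sketch of that recommendation, run on a toy data set so it is self-contained. The values are invented for illustration, and `wt` (standing in for the de-normalized weight) and `newstrata` are assumed names, not DHS variables. In a real pooled file the `clear` and `input` steps would be dropped and the actual weight variable used.

```stata
* Toy pooled file: 3 survey years, 2 clusters each, 1 stratum code (values invented)
clear
input year v001 v023 wt
2001 1 1 1.2
2001 2 1 0.8
2006 1 1 1.1
2006 2 1 0.9
2011 1 1 1.0
2011 2 1 1.0
end

* Unique cluster and stratum IDs per survey year
egen newcluster = group(year v001)
egen newstrata  = group(year v023)

* Declare the pooled design; wt is a hypothetical name for the de-normalized weight
svyset newcluster [pweight = wt], strata(newstrata)

* List strata and the number of clusters in each as a sanity check
svydescribe
```

Here the same stratum code in v023 becomes three strata, one per survey year, and the two cluster numbers become six distinct clusters.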

Re: Accounting for different sampling areas over different years [message #4202 is a reply to message #3825]

Thu, 16 April 2015 13:08

mmr-UMICH
Messages: 21
Registered: February 2015
Location: A2, MI

Member

Saying that strata are consistent across surveys for a country means that the codes/values of the strata variable (after combining the region and residence variables) are the same across the survey waves (e.g., 2001, 2006, 2011). If a country has 5 regions and urban/rural residence, there are 10 strata codes (say, 1 to 10) for each survey year. My understanding is that in the pooled data set the number of strata should still be 10. Because the stratification was the same but the sampling of clusters within each stratum was different for each survey year, cluster codes must differ across survey waves within identical strata. If we treat strata codes different across the surveys, the variance estimation is not only affected but also the degrees of freedom, confidence intervals, and p-value calculations.

Re: Accounting for different sampling areas over different years [message #4203 is a reply to message #4202]

Thu, 16 April 2015 16:05

Reduced-For(u)m
Messages: 292
Registered: March 2013

Senior Member

My intuition is that you would want to use different strata too - the idea being that the stratification was done separately by survey round, even if they overlap - but I think this is probably, if not an open question in the survey analysis literature, at least sufficiently esoteric that there is no agreed-upon course of action. That said, I do have two points I'm more sure about:

1 - you say "If we treat strata codes different across the surveys, the variance estimation is not only affected but also the degrees of freedom, confidence intervals, and p-value calculations." But variance estimation will always affect CIs and p-values, and the loss of degrees of freedom should barely affect the critical values, given how many degrees of freedom remain.

2 - depending on your variables of interest and how those are constructed, you might want to use a standard error estimator that accounts for more robust correlations than those you would use if you were just looking at a single, individual-level covariate from one survey. Error terms are likely correlated across time within region (worse if you are using aggregated or constructed variables on the right hand side of your regression) and the standard DHS method won't account for this, but clustering by spatial region across survey rounds would.
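
A rough way to check point 1, assuming the usual design degrees of freedom for survey estimation in Stata (number of clusters minus number of strata). The cluster count of 800 is a hypothetical round number, not the actual Nepal DHS total, and the 10 versus 30 strata follow the 5-region, urban/rural example given earlier in the thread.

```stata
* Hypothetical pooled cluster count; not the real Nepal DHS figure
local nclusters = 800

* Design df = clusters - strata: 10 shared strata versus 30 year-specific strata
local df_shared   = `nclusters' - 10
local df_separate = `nclusters' - 30

* Two-sided 5% critical values from the t distribution for each choice
display "t critical value, 10 strata: " invttail(`df_shared', 0.025)
display "t critical value, 30 strata: " invttail(`df_separate', 0.025)
```

With several hundred degrees of freedom either way, both values sit very close to the normal value of 1.96, so the choice of strata coding matters through the variance estimate rather than through the critical value.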
