The DHS Program User Forum: India » Districts as cluster-level for multi-level model

(Weighting for multi-level modelling)

Districts as cluster-level for multi-level model [message #18796]

Sun, 23 February 2020 05:28

dgodha
Messages: 44
Registered: November 2016
Location: India

Member

Hello,

I would appreciate your expert guidance on my query. We usually use 'psu' as the cluster level in DHS data. In my case, the group sizes are too small if I use 'psu':

Group Variable |     #Groups    Minimum    Average    Maximum
           psu |     25,063          1        3.1         16

Since NFHS-4 is representative at the district level, and we have to create a variable for the cluster-level weight anyway, I am wondering whether it is possible to use district as the cluster level. I tried changing my weighting commands from psu to district but, as you can see in the output, I don't get p-values or CIs.

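* Rescale the DHS sample weight (v005 is stored multiplied by 1,000,000)
gen wt = v005/1000000

* Level 1 weights using scaling method 1: new weights sum to the effective
* district sample size, (sum of wt)^2 / (sum of wt^2)
gen sqw = wt*wt
egen sumsqw = total(sqw), by(sdistri)
egen sumw = total(wt), by(sdistri)
gen pwt11 = wt*sumw/sumsqw

* Survey setting: districts as level 2 units with weight 1, individuals (_n) as level 1
gen wt2 = 1
svyset sdistri, weight(wt2) strata(v023) , singleunit(centered) || _n, weight(pwt11)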
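
A quick way to see what this rescaling does is to run it on a few made-up rows. The sketch below is only an illustration under stated assumptions: `grp` and `w` are hypothetical stand-ins for `sdistri` and `wt`, and the values are invented, not NFHS-4 data. It suggests that the formula `wt*sumw/sumsqw` makes the rescaled weights in each group sum to (sum of w)^2 / (sum of w^2), often called the effective sample size, which equals the actual group size only when all weights in the group are equal.

```stata
* Toy data: grp and w are hypothetical stand-ins for sdistri and wt
clear
input grp w
1 1.0
1 1.0
1 1.0
2 0.5
2 1.5
2 2.0
end

* Same rescaling formula as in the post
gen sqw = w*w
egen sumsqw = total(sqw), by(grp)
egen sumw   = total(w), by(grp)
gen pwt     = w*sumw/sumsqw

* Compare the sum of the rescaled weights with the actual group size
egen sumpwt = total(pwt), by(grp)
egen n      = count(w), by(grp)
egen first  = tag(grp)
list grp n sumpwt if first, noobs
```

With these invented values, group 1 (equal weights) gives a sum of 3, its actual size, while group 2 gives 16/6.5, about 2.46, rather than 3.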

*Output
*******
Number of strata   =     2,509                  Number of obs     =  1,538,126
Number of PSUs     =     2,509                  Population size   =  1,438,715
Subpop. no. obs   =     78,446
Subpop. size      =  73,653.12
Design df         =          0
F(   0,      0)   =          .
Prob > F          =          .


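------------------------------------------------------------------------------
             |             Linearized
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |  -1.585093   .0192937   -82.16       .            .           .
-------------+----------------------------------------------------------------
sdistri      |
  var(_cons) |   .1527032   .0153514                             .           .
------------------------------------------------------------------------------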

Note: 5 strata omitted because they contain no subpopulation members.
Note: Strata with single sampling unit centered at overall mean.

I am not sure what is going wrong and would appreciate any insight.
Thank you
Deepali

Re: Districts as cluster-level for multi-level model [message #18953 is a reply to message #18796]

Tue, 24 March 2020 14:36

Bridgette-DHS
Messages: 3230
Registered: February 2013

Senior Member

Following is a response from DHS Research & Data Analysis Director, Tom Pullum:

The purpose of the svy adjustment is to compensate for the similarities of respondents within clusters, the under- and over-weighting of clusters, and stratification.

I would not recommend that you change to districts as the sampling units. The adjustments for clustering and sampling weights will be seriously thrown off.

The clusters, by definition, are the primary sampling units. If you shift to districts you will capture some of the intra-class correlation that goes into the svy calculation, but not nearly all of it, and the weighting adjustment, no matter how you do it, will be incorrect. The new weights would affect all the estimates and tabulations.
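
A separate point, on why the posted output shows no p-values or confidence intervals: Stata takes the design degrees of freedom to be the number of PSUs minus the number of strata, and the output reports 2,509 of each, hence `Design df = 0` and no t-based inference. That pattern is what one would expect if every stratum contains a single district once districts are declared as the sampling units, which also fits the note about strata with a single sampling unit. The sketch below reproduces the situation on made-up data; `stratum`, `district`, and `y` are hypothetical, and this is an illustration rather than a confirmed diagnosis of the NFHS-4 run.

```stata
* Toy data: one district per stratum (all names and values are hypothetical)
clear
input stratum district y
1 1 0
1 1 1
2 2 1
2 2 0
3 3 1
3 3 1
end

* Districts declared as the sampling units, as in the original post
svyset district, strata(stratum) singleunit(centered)
svy: mean y

* Design df = number of PSUs - number of strata
display "PSUs = " e(N_psu) ", strata = " e(N_strata)
display "design df = " e(N_psu) - e(N_strata)
```

On these invented rows there are 3 PSUs and 3 strata, so the difference is 0. Declaring the original clusters as PSUs normally leaves several PSUs per stratum and therefore a positive design df.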

Re: Districts as cluster-level for multi-level model [message #18959 is a reply to message #18953]

Wed, 25 March 2020 04:57

dgodha

Many thanks for your response.
I do have a follow-up question. If I don't use survey weights, can I then go ahead and use districts as clusters? I need to use districts because 85% of my PSUs have 5 or fewer observations.

Deepali

Re: Districts as cluster-level for multi-level model [message #18982 is a reply to message #18959]

Mon, 30 March 2020 16:05

Bridgette-DHS

Following is another response from DHS Research & Data Analysis Director, Tom Pullum:

If you ignore the weights entirely, your estimates won't mean anything. They will not be corrected for the under- and over-sampling in the survey design. They will not be unbiased estimates of population values.

I often ignore the survey design for a data quality assessment or for initial data exploration or for testing a program. However, if you want to do more than that, you need to use the weights to get unbiased estimates and use the clustering and stratification adjustments to get robust standard errors.

In other words, I recommend that you do not treat districts as clusters.
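
The difference between ignoring and using the survey design, described in the reply above, can be illustrated on made-up data in which one stratum is oversampled. Everything in this sketch is hypothetical (variable names and values alike). In a DHS women's file the analogous declaration is often written with the cluster variable as the PSU, for example `svyset v021 [pweight=wt], strata(v023)`, but which variables identify PSUs and strata in a given survey is an assumption to check against that survey's documentation.

```stata
* Toy data: stratum 1 is oversampled, so its rows carry small weights
clear
input stratum psu wt y
1 1 0.5 1
1 1 0.5 1
1 2 0.5 1
1 2 0.5 0
2 3 2.0 0
2 3 2.0 0
2 4 2.0 1
2 4 2.0 0
end

* Unweighted mean: ignores the design entirely
mean y

* Design-based mean: PSUs as sampling units, sampling weights, strata
svyset psu [pweight=wt], strata(stratum)
svy: mean y
```

With these invented values the unweighted mean is 4/8 = 0.5, while the weighted mean is 3.5/10 = 0.35, because the oversampled stratum no longer dominates the estimate. There are 4 PSUs in 2 strata, so the design df is 2 and confidence intervals can be computed.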
