The DHS Program User Forum
Discussions regarding The DHS Program data and results
Sample weight and use of svy command in regression Tue, 19 December 2017 07:40
rkchettri | Messages: 18 | Registered: November 2017 | Member
Hi DHS experts,
I am using the Nepal DHS 2016 data (IR file). For weighting, I am using the following commands:

gen rweight=v005/1000000
svyset v021 [pweight=rweight], strata(v023) vce(linearized) singleunit(missing)

And for the regression I am using this command:
svy: logit outcome variable (e.g. 4anc) predictor variable (e.g. age of women), or

I want to be sure whether I am doing this correctly.

Regards,
Resham
Re: Sample weight and use of svy command in regression [message #13782 is a reply to message #13752] Thu, 21 December 2017 17:55
Bridgette-DHS | Messages: 2549 | Registered: February 2013 | Senior Member
Following is a response from Senior DHS Stata Specialist, Tom Pullum:

Yes, your command is fine. You could simplify it slightly in two ways and get the same results. First, for pweights it is not necessary to divide v005 by 1000000. Stata will do this automatically, because pweights are automatically re-scaled to have a mean of 1. Second, vce(linearized) is a default and does not need to be specified. Thus your svyset could be simply this: svyset v021 [pweight=v005], strata(v023) singleunit(missing).
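
One way to check this equivalence is to run the same model under both declarations and compare the output. The following is only a minimal sketch under stated assumptions: anc4 is a hypothetical 0/1 outcome for four or more ANC visits (Stata variable names cannot begin with a digit, so a name like 4anc would not be accepted), its construction from m14_1 is an assumption about the coding, and v012 (age in years) stands in for the predictor.

```stata
* Hypothetical outcome: 1 if four or more ANC visits, 0 otherwise.
* Assumes m14_1 holds the number of visits and that 98 codes "don't know".
gen anc4 = (m14_1 >= 4) if m14_1 < 98

* Declaration from the original post, with the weight divided by 1,000,000
gen rweight = v005/1000000
svyset v021 [pweight=rweight], strata(v023) vce(linearized) singleunit(missing)
svy: logit anc4 v012, or
estimates store scaled

* Simplified declaration, using v005 directly and the default vce
svyset v021 [pweight=v005], strata(v023) singleunit(missing)
svy: logit anc4 v012, or
estimates store unscaled

* Odds ratios and standard errors from both runs, side by side
estimates table scaled unscaled, eform se
```

If the two columns of the final table match, the simpler svyset can be used for the regression.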

Re: Sample weight and use of svy command in regression [message #13783 is a reply to message #13782] Thu, 21 December 2017 19:25
rkchettri | Messages: 18 | Registered: November 2017 | Member
Thank you for the response and the new ideas.
I checked and found the same results in the regression analysis. But when I run frequencies without dividing v005 by 1000000, I get different results. For example:

. tab m14ANCcat[iweight=v005]

RECODE of m14_1 (number of antenatal visits during pregnancy)

           |        Freq.    Percent       Cum.
-----------+-----------------------------------
    No ANC |     71978871       3.64       3.64
    1-3ANC |    504713483      25.52      29.15
      4ANC |   1401330967      70.85     100.00
-----------+-----------------------------------
     Total |   1978023321     100.00

This is different when v005 is divided by 1000000

. tab m14ANCcat[iw=v005/1000000]

RECODE of m14_1 (number of antenatal visits during pregnancy)

           |        Freq.    Percent       Cum.
-----------+-----------------------------------
    No ANC |    71.978871       3.64       3.64
    1-3ANC |   504.713483      25.52      29.15
      4ANC |    1,401.331      70.85     100.00
-----------+-----------------------------------
     Total |   1,978.0233     100.00

So, to calculate the frequencies, we need to divide v005 by 1000000. Am I right?

Looking forward to your response.

With kind regards,
Resham
Re: Sample weight and use of svy command in regression [message #13785 is a reply to message #13783] Fri, 22 December 2017 07:31
Bridgette-DHS | Messages: 2549 | Registered: February 2013 | Senior Member
Following is a response from Senior DHS Stata Specialist, Tom Pullum:

Yes, that's right. When using iweight you do need to divide by 1000000. It is only with pweight that Stata will automatically re-scale the weights to have a mean of 1.
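
For frequency tables, the division can be done once and the scaled weight reused. This is a minimal sketch, assuming the recoded variable m14ANCcat from the post above already exists; wt is a name introduced here purely for illustration.

```stata
* Scale the DHS weight once (v005 is stored multiplied by 1,000,000)
gen wt = v005/1000000

* Weighted frequencies on the scale of the sample;
* the percentages are the same as with the unscaled weight
tab m14ANCcat [iweight=wt]
```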
Re: Sample weight and use of svy command in regression [message #13839 is a reply to message #13782] Wed, 10 January 2018 07:55
rkchettri | Messages: 18 | Registered: November 2017 | Member
Dear DHS expert team,
One more question regarding the svyset command and the final regression model. Say we set up the survey design using: svyset v021 [pweight=v005], strata(v024) singleunit(missing). Can I include the strata variable (v024; in my case v023 is provinces, which is an important predictor) as an explanatory variable in the final regression model?
I have conducted multivariate logistic regression analysis using a command like this:
svy: logit outcome variable (e.g. 4ANC) varlist of explanatory variables (e.g. i.ethnicity i.wealth rank ......v024), or

svy: logit 4anc i.v024 i.v025 i.v130 i.v131,or

Is this the right command?

Best

Re: Sample weight and use of svy command in regression [message #13842 is a reply to message #13839] Thu, 11 January 2018 08:16
Bridgette-DHS | Messages: 2549 | Registered: February 2013 | Senior Member

Following is a response from Senior DHS Stata Specialist, Tom Pullum:

Yes, a variable that is in the svyset command can also be used in the analysis. There is no problem with your logit regression command. It's likely that religion and ethnicity are associated with place of residence and they may have different effects in different areas. But those are analytical issues.
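
Put together, the setup from the question might look like the sketch below. It assumes anc4 is a 0/1 outcome variable created beforehand (a hypothetical name, used because Stata variable names cannot start with a digit); v024, v025, v130 and v131 are taken from the command in the question.

```stata
* Survey design as written in the question: cluster v021, strata v024, weight v005
svyset v021 [pweight=v005], strata(v024) singleunit(missing)

* v024 appears both in svyset and as a covariate in the model
svy: logit anc4 i.v024 i.v025 i.v130 i.v131, or
```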
