The DHS Program User Forum
Discussions regarding The DHS Program data and results
Performing a manual backward stepwise logistic regression in Stata [message #10711] Mon, 05 September 2016 02:22
npolle
I am new to Stata and am using the 2011 Uganda Demographic and Health Survey to determine the prevalence of disability and its associated risk factors. I need help performing a manual backward stepwise logistic regression in Stata. From what I have read, I should first run the full model (with all covariates) and then test each variable for statistical significance at p<0.05, starting with the last variable in the model.
What I need help with is how to test the variables for statistical significance. This is my approach so far; however, the test command does not run.

*GENERATING THE SUBPOPULATION
generate overfive=.
replace overfive = 1 if hv105>=5 & hv105<96 & hv103==1
replace overfive = 0 if overfive!=1
*CONDUCTING MULTIVARIATE LOGISTICS REGRESSION
*Where 'onedisability' is the variable for difficulty in at least one functional
area
svy, subpop (overfive): logistic onedisability ///
hv104 ///
hv106 ///
hv025 ///
hv270 ///
hv024
test hv024

Thank you
Re: Performing a manual backward stepwise logistic regression in Stata [message #10728 is a reply to message #10711] Tue, 06 September 2016 12:44
Bridgette-DHS
Following is a response from Senior DHS Stata Specialist, Tom Pullum:

I have revised some of your Stata code below. Rather than doing the stepwise selection manually, I would use the Stata command "stepwise". This command has limitations. It does not work with "i." for categorical variables, so you must form sets of dummies with "xi". If one of the dummies is then dropped from the model, it means that its category has been consolidated with the reference category. It also does not work with svyset, but it DOES work with the pweight and cluster specifications within an estimation command, so all you are missing is the stratum adjustment and subpop, both of which you can add after you have selected the final model. In the example, I set a p-value of .1 as the threshold for retention in the model. This number can be changed; I would repeat the procedure with various values ranging from, say, .5 to .05. I personally prefer logit to logistic, but have used logistic below since that is what you had.
use e:\DHS\DHS_data\PR_files\UGPR60FL.dta

generate overfive=.
replace overfive = 1 if hv105>=5 & hv105<96 & hv103==1
replace overfive = 0 if overfive!=1

* construct onedisability
describe sh24-sh29
tab1 sh24-sh29,m

* recode each item: 1 if the response is 2-4, 0 otherwise, missing if the item is missing
foreach v of varlist sh24-sh29 {
    gen `v'r=0
    replace `v'r=1 if `v'>=2 & `v'<=4
    replace `v'r=. if `v'==.
}

egen onedisability=rowtotal(sh24r-sh29r), missing
replace onedisability=1 if onedisability>1 & onedisability<. 

tab onedisability,m

*CONDUCTING MULTIVARIATE LOGISTIC REGRESSION
*Where 'onedisability' is the variable for difficulty in at least one functional area

* stepwise does not work with svyset. Use pweight and cluster here, and do not
* include strata and subpop until you get to the final model.

* stepwise cannot handle i. for categorical variables; must form sets of dummies

xi, prefix(v_) i.hv024 i.hv025 i.hv104 i.hv106 i.hv270

logistic onedisability v_* [pweight=hv005], cluster(hv021)

stepwise, pr(.1): logistic onedisability v_* [pweight=hv005], cluster(hv021)
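
As a rough sketch of the two follow-up steps described above (repeating the selection at several thresholds, then adding the stratum adjustment and subpop to the final model), something like the following could be used. The strata variable hv022, the singleunit(centered) setting, and the set of predictors kept in the final model are assumptions made for illustration; they are not part of the original code and should be checked against the survey documentation and your own stepwise results.

```stata
* Sketch only: assumes onedisability, overfive and the v_* dummies created above already exist.

* 1. Repeat the backward selection at several removal thresholds and compare the results
foreach p in 0.5 0.2 0.1 0.05 {
    display as text _newline "Removal threshold pr(`p')"
    stepwise, pr(`p'): logistic onedisability v_* [pweight=hv005], cluster(hv021)
}

* 2. Refit the selected model with the full survey design
* ASSUMPTION: hv022 identifies the sampling strata in this file
svyset hv021 [pweight=hv005], strata(hv022) singleunit(centered)

* ASSUMPTION: hypothetical outcome in which every predictor was retained
svy, subpop(overfive): logistic onedisability i.hv024 i.hv025 i.hv104 i.hv106 i.hv270

* Joint design-based Wald test of all categories of one predictor
testparm i.hv024
```

If stepwise drops one of the dummies, the final model would need the retained v_* dummies in place of the corresponding i. term, so that the dropped category stays merged with the reference category.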
Re: Performing a manual backward stepwise logistic regression in Stata [message #10742 is a reply to message #10728] Wed, 07 September 2016 17:20
npolle
Thank you for your continued support; this has solved my problem.
Nicholas