The DHS Program User Forum
Discussions regarding The DHS Program data and results
Variance Inflation Factor [message #11698] Mon, 30 January 2017 19:21
nwegbus
I'm working with the domestic violence module of Nigeria DHS (2013) using Stata 13.

I'm trying to examine my predictor variables for collinearity using the VIF score. Please see the attached document for all the output (you might want to zoom to 125%).

Here's a summary of what I did:

I weighted my data using the following command:

generate wgt5 = d005/1000000
svyset [pweight = wgt5], psu(v021) strata(v022) singleunit(centered)

Then I ran a regression model of my predictor variables using this command:

svy: logit exipv agemarr4 religion weduc say attwb wipv polyg hseek chcomm

Then I issued the VIF command:

display "tolerance = " 1-e(r2) " VIF = " 1/(1-e(r2))

However this is the output I got:

tolerance = . VIF = .

It seems that the tolerance and VIF are missing? I'm wondering what I'm doing wrong, as I don't know what to make of a missing VIF score. Thanks in advance for your help.

SN
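
A possible explanation for the missing values, offered as a sketch rather than a confirmed diagnosis: e(r2) is stored by linear regression (svy: regress) but not by svy: logit, so any expression built on e(r2) evaluates to missing after the logit. The VIF for one predictor is 1/(1 - R-squared), where R-squared comes from regressing that predictor on the remaining predictors. The example reuses the variable names from the post above and assumes the svyset shown there is already in effect; agemarr4 is picked arbitrarily as the predictor being checked.

```stata
* Sketch only: assumes the svyset from the post above is already in effect.
svy: logit exipv agemarr4 religion weduc say attwb wipv polyg hseek chcomm
* logit leaves no e(r2) behind, so this displays a missing value (.)
display e(r2)

* Auxiliary linear model: one predictor regressed on the other predictors.
* agemarr4 is chosen arbitrarily here for illustration.
svy: regress agemarr4 religion weduc say attwb wipv polyg hseek chcomm
display "tolerance = " 1-e(r2) " VIF = " 1/(1-e(r2))
```

The outcome exipv does not appear in the auxiliary model, because tolerance and VIF describe the relationship among the predictors only.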
Re: Variance Inflation Factor [message #11704 is a reply to message #11698] Tue, 31 January 2017 17:16
Bridgette-DHS
Following is a response from Senior DHS Stata Specialist, Tom Pullum:

Sorry, but this question is not about DHS data or even about Stata. I am not familiar with this procedure and don't know what it does.

I do notice that your logit regression seems to treat all variables as interval-level, whereas most of them are categorical and should be preceded by "i.". And if they are all categorical, you will have at least 2^9=512 combinations of predictor categories. Many of those combinations will be empty for one or both categories of the outcome, and the regression will be unstable. The solution for this is to use fewer variables on the right-hand side of the equation.
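
As an illustration of the "i." prefix mentioned above, a reduced model might look like the following. Which predictors are categorical is an assumption made for this sketch (the thread does not say), and the choice of three predictors is arbitrary.

```stata
* Hypothetical: religion, weduc and polyg are assumed to be categorical.
* Cross-tabulate each predictor with the outcome to look for empty cells.
tabulate religion exipv
tabulate weduc exipv
tabulate polyg exipv

* Reduced model with factor-variable notation for the categorical predictors.
svy: logit exipv i.religion i.weduc i.polyg
```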

Re: Variance Inflation Factor [message #11715 is a reply to message #11704] Fri, 03 February 2017 03:24
nwegbus
Thank you for your reply.

According to the UCLA Stata webpage here:

http://www.ats.ucla.edu/stat/stata/faq/svycollin.htm

the VIF and tolerance values are the way to check for multicollinearity (i.e., redundancy) among predictors for survey data.

You did point out that I had too many predictors in my model. I agree. I'm trying to use the VIF and tolerance values to see which ones are best to drop from my model.

The problem I'm having is that, even though I followed the directions on the UCLA webpage step by step, I end up with an output of:

Tolerance=. and VIF=.

Could this have anything to do with the way I weighted the data? i.e.,
generate wgt = d005/1000000
svyset [pweight = wgt], psu(v021) strata(v022) singleunit(centered)

Thank you once again for your help and patience.
Som
Re: Variance Inflation Factor [message #11717 is a reply to message #11715] Fri, 03 February 2017 09:27
Bridgette-DHS
Another response from Tom Pullum:

This question or problem goes beyond what we would normally give advice on. If I were in your situation, I would try to do the backward selection with successive reductions of the svy adjustments, until (hopefully!) it worked. That is, first remove the stratum adjustment, and try it. If it still doesn't work, remove the cluster adjustment, and try it. If it still doesn't work, remove the weight adjustment, and try it. I HOPE that at one stage of this process, at least the final one (with no svy adjustments at all) the procedure will work.

When you include all of the svy adjustments, the models are not fitted with maximum-likelihood methods. The measures of fit and the optimization of fit are not as solid as with ML (at least that's my understanding). However, I think that model selection procedures are fairly robust with respect to inclusion/omission of the svy adjustments. Plus, my general strategy when running into a complex problem is to simplify, simplify, simplify, until I get a strategy or a solution! That's all I can suggest.
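
The successive reductions described above could be sketched as follows. This is one reading of the suggestion, not a tested recipe: the weight variable wgt and the design variables come from the earlier post, and the final step assumes that an ordinary regress followed by estat vif is an acceptable stand-in once no svy adjustments remain.

```stata
* Starting point (from the earlier post): strata, clusters and weights.
svyset [pweight = wgt], psu(v021) strata(v022) singleunit(centered)

* Step 1: remove the stratum adjustment, keep clusters and weights.
svyset v021 [pweight = wgt]
* ... rerun the svy: command that failed and check the result ...

* Step 2: also remove the cluster adjustment, keep weights only.
svyset [pweight = wgt]
* ... rerun and check again ...

* Step 3: no svy adjustments at all. After an unweighted linear regression,
* estat vif reports a VIF for every predictor in the model.
regress exipv agemarr4 religion weduc say attwb wipv polyg hseek chcomm
estat vif
```

The VIFs from the last step depend only on the predictors, so using the binary outcome exipv in a linear regression does not affect them.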
Re: Variance Inflation Factor [message #11719 is a reply to message #11717] Fri, 03 February 2017 10:23
nwegbus
Thank you Tom. I will try this and see where it gets me.

Som
Re: Variance Inflation Factor [message #12233 is a reply to message #11719] Thu, 13 April 2017 15:25
bakerchowdhury
Hi,
I see that the VIF procedure for the SVY command gives us an overall tolerance for the model. However, I am wondering if there is a way of calculating a VIF for each predictor variable under the SVY command (i.e., more like the VIF command under normal regression)?
Thank you
Baker
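
One way to approach the per-predictor question, sketched under assumptions: loop over the predictors, regress each one on the others with svy: regress, and compute 1/(1 - R-squared) from e(r2) each time. The variable list is borrowed from the first post in this thread, the local macro names (preds, others) are made up for the example, and every predictor is treated as continuous, which may not suit categorical variables.

```stata
* Sketch only: assumes svyset has already been run on the data.
local preds agemarr4 religion weduc say attwb wipv polyg hseek chcomm

foreach x of local preds {
    * All predictors except the current one.
    local others : list preds - x
    * Auxiliary regression of the current predictor on the rest.
    quietly svy: regress `x' `others'
    display "`x'" _col(15) "tolerance = " %6.3f (1-e(r2)) "  VIF = " %8.2f (1/(1-e(r2)))
}
```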
Re: Variance Inflation Factor [message #12236 is a reply to message #12233] Fri, 14 April 2017 07:46
Bridgette-DHS
Response from Senior DHS Stata Specialist, Tom Pullum:


This is not a question that DHS staff can answer. Perhaps other users can help you.

Re: Variance Inflation Factor [message #12654 is a reply to message #12233] Wed, 28 June 2017 14:03
AnvitaDixit
Hi Baker, did you find a solution? I'm having the same problem and would really appreciate it if you could share how you got the individual VIFs for the independent variables in your model. Thanks!