Re: Convert DHS (SPSS?) missing value codes to Stata codes in Stata dataset [message #6968 is a reply to message #6874]
Fri, 07 August 2015 09:23
Bridgette-DHS
Messages: 3230 Registered: February 2013
Senior Member
 
	
Following is a response from Senior DHS Stata Specialist, Tom Pullum:
 
 
I can suggest three different ways to deal with these kinds of missing value codes.  I use them all the time. 
 
As an example, take hw70, the height-for-age z-score.  The anthropometry z-scores have several special codes in the vicinity of 9999.  Sometimes you will find values in that vicinity that do not even have a label, but all such values must be excluded.   
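Before recoding, it can help to look at which special codes are actually present. The following is only a sketch, assuming a DHS file containing hw70 is already loaded in Stata; the 9000 cutoff is the one used in this reply.

```stata
* List every value of hw70 above 9000, showing value labels where they exist.
* The "missing" option also reports any cases already stored as "."
tabulate hw70 if hw70 > 9000, missing

* Same table with the numeric codes instead of labels,
* which makes unlabeled values in that range easy to spot
tabulate hw70 if hw70 > 9000, missing nolabel
```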
 
One approach would be simply to have a line such as "replace hw70=. if hw70>9000".  Values of "." are always considered by Stata to be missing and will be excluded from calculations.  The problem with this is that you have now lost the original hw70.
 
A second approach would be "gen hw70r=hw70" and "replace hw70r=. if hw70>9000".  I add "r" for this kind of simple recode.  Then any analysis would use hw70r in place of hw70, and you still have the original hw70.
 
A third approach, when you have several related variables, could be something like the following: "gen hw7x_missing=0", then "replace hw7x_missing=1 if hw70>9000 | hw71>9000 | hw72>9000".  Then in your analysis, you could limit yourself to the cases that are non-missing on all variables by including "if hw7x_missing==0".  I use this third approach if, say, I want to run a series of regressions on exactly the same cases.
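Put together as a do-file fragment, the three approaches might look like the sketch below. It assumes a DHS file with hw70, hw71 and hw72 is already loaded. The first approach is left commented out so that the original hw70 stays available for the other two. Note that Stata treats "." as larger than any number, so a condition such as hw70>9000 is also true for cases that are already ".".

```stata
* Approach 1: overwrite in place (the original hw70 is lost)
* replace hw70 = . if hw70 > 9000

* Approach 2: recode a copy and keep the original
gen hw70r = hw70
replace hw70r = . if hw70 > 9000

* Approach 3: one flag for a set of related variables
gen hw7x_missing = 0
replace hw7x_missing = 1 if hw70 > 9000 | hw71 > 9000 | hw72 > 9000

* Usage: analyze the recoded copy, or restrict to the same cases throughout
summarize hw70r
summarize hw70 hw71 hw72 if hw7x_missing == 0
```

Because of how Stata orders missing values, the flag in the third approach is also set to 1 for any case where one of the three variables is ".".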
 
One more thing: in the DHS data files, the code "." always means "not applicable".  You should not confuse that meaning with what I have implied above, which is "please ignore in any calculations"!
 
		
		
		