Household crowding [message #19808] | Mon, 17 August 2020 08:28
j_lewis98 (Member; Messages: 7; Registered: July 2020)
I am using the SADHS 2016 dataset and want to create a household crowding variable from the PR file. Which existing variables could I use for this, and what would the Stata code be?
Many thanks

Re: Household crowding [message #19812 is a reply to message #19811] | Mon, 17 August 2020 14:03
j_lewis98 (Member; Messages: 7; Registered: July 2020)
Many thanks for your reply; this was very helpful!
I have another question, if you are able to assist. Using the SADHS 2016 dataset, I want to run a multilevel logistic regression model with self-reported presence of TB as the outcome (generated from the variables s1410 and sm1105 in the female and male adult health recode files, respectively). I dropped the 'I don't know' observations because there were very few of them, so the resulting TB variable is binary and appropriate for a logistic regression model.
However, for the three levels I intend to use individuals (adult health respondents) as level 1, households as level 2, and MUNICIPALITIES as level 3. Given that cluster will not be level 3, how do I specify this in the svyset command? Could someone help me with the Stata code I would need?
 
NB: I have matched clusters to municipalities using GADM and merged this with the PR file, so I know which households are in which municipalities, and I have h_munic as a municipality variable.
 
 Best,
 
 Jadene

Re: Household crowding [message #19854 is a reply to message #19812] | Thu, 20 August 2020 13:53
Bridgette-DHS (Senior Member; Messages: 3230; Registered: February 2013)
Following is a response from DHS Research & Data Analysis Director, Tom Pullum:
 
I don't believe the distinction between level 1 and level 2 will be useful. The only leverage you will get for estimating an intra-class correlation between individuals in the same household will come from households in which more than one person has TB. If the number of persons with TB in the same household is always 0 or 1, you have no information. If there are very few exceptions to 0 or 1, the estimate will be very uncertain. The number of persons tested in the household is also relevant.
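
As an illustration of this point (a sketch for readers, not part of Tom Pullum's reply), one rough check is to tabulate how many households have 0, 1, or 2 or more members reporting TB. The Python snippet below uses made-up records; the household identifiers and the 0/1 TB flag are hypothetical stand-ins for the household ID and the binary outcome built from s1410 and sm1105.

```python
from collections import Counter


def tb_cases_per_household(records):
    """Return {household_id: number of respondents reporting TB}.

    `records` is an iterable of (household_id, has_tb) pairs, where
    has_tb is 0 or 1. Households with no cases are kept with a count of 0.
    """
    cases = Counter()
    for household_id, has_tb in records:
        cases[household_id] += int(has_tb)
    return dict(cases)


def distribution_of_case_counts(records):
    """Return {cases_in_household: number_of_households}, sorted by case count."""
    per_household = tb_cases_per_household(records)
    return dict(sorted(Counter(per_household.values()).items()))


# Hypothetical toy data: (household id, self-reported TB as 0/1).
records = [
    ("hh1", 0), ("hh1", 0),
    ("hh2", 1), ("hh2", 0),
    ("hh3", 1), ("hh3", 1),
    ("hh4", 0),
]

distribution = distribution_of_case_counts(records)

# Only households with two or more cases carry information about
# within-household correlation of the outcome.
multi_case_households = sum(
    n_households for n_cases, n_households in distribution.items() if n_cases >= 2
)

print(distribution)
print(multi_case_households)
```

If nearly all households fall in the 0 or 1 rows of such a table, a separate household level is unlikely to be estimable with any precision.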
 
 To use municipality, rather than cluster, at level 3 (or 2) requires constant weights and I don't think you have them.  Clusters, not municipalities, were the PSUs in the sample design.
 
 We just issued a methodological report related to multilevel models, and I recommend that you look at it: https://www.dhsprogram.com/pubs/pdf/MR27/MR27.pdf. I hope other users have suggestions.
 