Weights and Strata for Pooled Samples [message #19761] | Mon, 10 August 2020 12:27
Yawo | Messages: 45 | Registered: February 2019 | Member

Hello,
 I've used IPUMS to create a combined dataset for men and women for 23 African countries.
 
 As the listing below shows, the weights are not constant within each PSU, as they should be. In fact, there are two different weights within each PSU: one for males and one for females.
 
 list sample sex weight weight2 psupool stratapool in 115431/115435
        +----------------------------------------------------------------+
        |       sample      sex    weight   weight2   psupool   strata~l |
        |----------------------------------------------------------------|
115431. | Lesotho 2014     male   .515554         1      4753        336 |
115432. | Lesotho 2014   female    .48581         1      4753        350 |
115433. | Lesotho 2014   female    .48581         1      4753        350 |
115434. | Lesotho 2014     male   .515554         1      4753        336 |
115435. | Lesotho 2014   female    .48581         1      4753        350 |
        +----------------------------------------------------------------+
This issue has been the source of the "weights not constant within PSU" error I have been receiving when running svy: melogit models.
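
A five-row listing shows the problem for one PSU only. A minimal sketch of a check across the whole pooled file follows, assuming idhspsu identifies the original PSU and weight is the sample weight, as in the listings in this post. The names wt_varies and one_per_psu are hypothetical.

```stata
* Sort each PSU by weight; if the smallest and largest weights differ,
* the weight is not constant within that PSU.
* Missing weights sort last, so a PSU with any missing weight is also flagged.
bysort idhspsu (weight): generate byte wt_varies = weight[1] != weight[_N]

* Tag one observation per PSU so that PSUs, not respondents, are counted.
egen byte one_per_psu = tag(idhspsu)

* Number of PSUs with constant (0) and varying (1) weights, by sample.
tabulate sample wt_varies if one_per_psu
```

If the weights differ only between men and women, every PSU that contains both sexes will be flagged.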
 
 However, I thought I could adjust the strata and the PSU by taking gender into account. Doing so results in constant weights within each PSU, as the code and listing below show.
 
egen psupool= group(idhspsu sex)
egen stratapool= group(strata sex)
list sample sex idhspid idhspsu idhsstrata psupool stratapool weight in 10/15
     +----------------------------------------------------------------------------------------------------------+
     |      sample      sex                   idhspid      idhspsu   idhsstrata   psupool   strata~l     weight |
     |----------------------------------------------------------------------------------------------------------|
 10. | Angola 2015     male       2401    00010001  3   2401000001   2401000018         1        321    .979475 |
 11. | Angola 2015     male       2401    00010012  3   2401000001   2401000018         1        321    .979475 |
 12. | Angola 2015     male       2401    00010026  1   2401000001   2401000018         1        321    .979475 |
 13. | Angola 2015   female       2401    00010001 02   2401000001    240100018         2         18   1.001989 |
 14. | Angola 2015   female       2401    00010002 03   2401000001    240100018         2         18   1.001989 |
     |----------------------------------------------------------------------------------------------------------|
 15. | Angola 2015   female       2401    00010002 02   2401000001    240100018         2         18   1.001989 |
     +----------------------------------------------------------------------------------------------------------+
My questions:
 
 1. Is it necessary to adjust the strata, PSU, and even the weights when we append male and female data into one file?
 
 2. If so, is my egen code the right way to readjust the strata? If not, is there another way?
 
 3. I was able to run my melogit models successfully after applying the adjustment to the strata and the PSU, but I want to be sure the adjustment is valid; otherwise I will have to revert to running separate models for men and women.
 
 thanks - cY
 
 
Re: Weights and Strata for Pooled Samples [message #19762 is a reply to message #19761] | Mon, 10 August 2020 14:57
Trevor-DHS | Messages: 808 | Registered: January 2013 | Senior Member

Unfortunately, you can't simply combine the datasets without adjusting the sample weights. The sample weights are relative weights, normalized separately to the total number of women and the total number of men in the sample. For example, in the Lesotho DHS 2014, 6621 women were interviewed but only 2931 men (and that includes men age 15-59, not just 15-49). Appendix A states that "In addition, in a subsample of households (every second household), all men age 15-59 who were usual residents of the households or stayed in the households on the night before the interview were eligible for interview". If you used the combined data without adjusting the weights, you would be assuming that there were more than twice as many women as men in Lesotho!

 Typically this adjustment is made by applying one constant factor to the women's weights and another to the men's weights. Each factor is calculated by taking an estimate of the female or male population age 15-49 in Lesotho from an external source, such as the UN's World Population Prospects or census data, and dividing it by the survey's total sample size for women or for men age 15-49.
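
As a rough illustration of that calculation, here is a minimal Stata sketch for a single survey. It rests on several assumptions: the data in memory contain only one sample, sex is coded 1 = male and 2 = female, and a numeric age variable exists. The population totals are placeholders, not real figures, and the name adjwt is hypothetical.

```stata
* Placeholder population totals for ages 15-49. These are NOT real figures;
* replace them with estimates from World Population Prospects or a census.
local pop_female = 500000
local pop_male   = 480000

* Survey sample sizes for women and men age 15-49
* (assumes sex is coded 1 = male, 2 = female and age is numeric).
count if sex == 2 & inrange(age, 15, 49)
local n_female = r(N)
count if sex == 1 & inrange(age, 15, 49)
local n_male = r(N)

* Factor for each sex = external population estimate / survey sample size.
* The factor is applied to every respondent of that sex.
generate double adjwt = .
replace adjwt = weight * (`pop_female' / `n_female') if sex == 2
replace adjwt = weight * (`pop_male' / `n_male') if sex == 1
```

With pooled data the same steps would be repeated for each sample, each with its own population estimates.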
 
 This same issue applies when you pool data from multiple surveys or multiple countries, each of which has its own weights that will need adjusting.
 
 As for the code adjusting the PSU and strata, I don't know melogit, but I cannot think of another solution if you need to include both women and men in the same model. Can you run separate models for women and for men? Otherwise I don't have a better option.