The DHS Program User Forum
Discussions regarding The DHS Program data and results
Dealing with merged data from different countries [message #29794] Tue, 06 August 2024 12:00
Ashlesha Pal
I have appended the IR files of India, Pakistan, and Bangladesh and kept only currently married women age 40-49 (a sub-population) for my analysis. Do I need to make any adjustments to the PSUs and strata, or re-normalize the weights? I am also unsure about tabulating the strata.
Re: Dealing with merged data from different countries [message #29801 is a reply to message #29794] Wed, 07 August 2024 08:30
Bridgette-DHS

Following is a response from Senior DHS staff member, Tom Pullum:

You have done an impressive amount of preparation for this study.

You appended the data from India, Pakistan, and Bangladesh into a single file, but I think mostly you will be analyzing the three countries separately. The main reason I can see for appending the countries is that you can do statistical tests of whether there are differences. I would recommend against making pooled estimates for the three countries combined. India would be such a large part of the total that the pooled estimates would basically be the estimates for India.

You do need to construct unique ID codes for clusters and strata, because cluster and stratum numbers are only unique within a survey. It would be sufficient to use the following Stata lines: "egen clusterID=group(v000 v001)" and "egen stratumID=group(v000 v023)" and then use clusterID and stratumID appropriately in the svyset command. You do not need to change the weights. You would only need to change the weights if you were going to make pooled estimates, and I have advised against that. For making tests, for example a test of the null hypothesis that the mean difference between v201 and the ideal number of children is the same in the three countries, you need to put clusterID and stratumID into svyset, but you would use the survey weights unchanged.
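
A minimal sketch of those steps, run on simulated data so that it can be executed as written. Everything in the first half is made up for illustration: the v000 values "IA7", "PK7" and "BD7", the cluster and stratum layout, and all weights and outcomes. The use of v613 for the ideal number of children is an assumption about the standard IR recode; please confirm it against the recode manual, and in real data set any non-numeric responses on that variable to missing first. The regression followed by testparm is one possible way to carry out the test, not the only one.

```stata
* Simulated stand-in for the appended IR file: 3 surveys, each with
* clusters numbered 1-8 and strata numbered 1-2 (all values are made up).
clear
set seed 20240807
set obs 240
gen byte c = ceil(_n/80)
gen str3 v000 = cond(c==1, "IA7", cond(c==2, "PK7", "BD7"))
gen int  v001 = ceil((_n - 80*(c-1))/10)
gen byte v023 = ceil(v001/4)
gen long v005 = 500000 + floor(1000000*runiform())
gen byte v201 = floor(7*runiform())
gen byte v613 = floor(5*runiform())   // assumed name for ideal number of children

* Unique cluster and stratum IDs across the appended surveys
egen clusterID = group(v000 v001)
egen stratumID = group(v000 v023)

* Survey weights are used unchanged
svyset clusterID [pweight=v005], strata(stratumID)

* Test whether the mean of (v201 - ideal number) is the same in the three surveys
egen country = group(v000)
gen diff = v201 - v613
svy: regress diff i.country
testparm i.country
```

Because cluster numbers 1-8 repeat in every simulated survey, v001 alone would merge clusters from different countries; clusterID separates them into 24 distinct clusters, and stratumID gives 6 distinct strata.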

When comparing actual and ideal number of children for women age 45-49, you may want to take child mortality into account. The data files include the number of living sons and the number of living daughters. You could use the sum of those two numbers, rather than v201, which includes children who died. Just a suggestion.
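
A minimal sketch of that sum, using three made-up records. The variable names are assumptions about the standard IR recode (v202 and v203 for sons and daughters at home, v204 and v205 for sons and daughters living elsewhere, v206 and v207 for sons and daughters who have died); please check them against the recode manual for each survey.

```stata
* Three made-up records, for illustration only
clear
input v201 v202 v203 v204 v205 v206 v207
4 1 1 1 0 1 0
2 1 1 0 0 0 0
5 0 2 1 1 0 1
end

* Living children = living sons + living daughters (at home or elsewhere)
gen living = v202 + v203 + v204 + v205

* Consistency check: children ever born minus children who died
assert living == v201 - v206 - v207
list v201 living
```

The variable living can then replace v201 when computing the difference from the ideal number of children.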
