The DHS Program User Forum
Discussions regarding The DHS Program data and results
Appending datasets of one country for several years [message #15442] Mon, 23 July 2018 01:14
Kamola
Messages: 5
Registered: April 2018

I have a problem appending datasets for one country from several survey years, for example the children's recode files for 2002 and 2012.
Since I'm using survey data analysis in Stata (svyset), I need to define the strata and PSUs. Defining strata for one country is not difficult, but identifying PSUs remains a challenge because the PSU numbers differ between the datasets.

Could you please let me know how to deal with an issue like this?

Thank you in advance.

Best regards,
Re: Appending datasets of one country for several years [message #15756 is a reply to message #15442] Mon, 10 September 2018 20:45
Bridgette-DHS
Messages: 2411
Registered: February 2013
Senior Member
The following is a response from Senior DHS Specialist Tom Pullum:

Yes, you definitely need to construct unique ID codes for the clusters (PSUs) in the separate rounds, because the same v001 value can refer to different clusters in the 2002 and 2012 surveys. I suggest the following. First, when appending, give the 2002 and 2012 rounds separate survey numbers, for example survey=1 and survey=2. Then construct a unique PSU ID with "egen clusterid=group(survey v001)". Similarly, if v023 is the stratum variable, construct a unique stratum ID with "egen stratumid=group(survey v023)". Finally, put clusterid and stratumid in the appropriate places in svyset. There have been other posts on this topic.
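
The steps in this reply can be put together roughly as follows. This is only a sketch under stated assumptions: two tiny made-up datasets stand in for the 2002 and 2012 children's recode files so the code runs on its own, the tempfile name round1 and the weight variable wt are illustrative names, and using v005/1000000 as the probability weight is an assumption that the reply above does not address. With real data, you would load and append the actual recode files instead of the toy input blocks.

```stata
* Toy stand-in for the 2002 round (hypothetical values, not real DHS data)
clear
input v001 v023 v005
1 1 1100000
2 1 1100000
3 2 900000
4 2 900000
end
gen survey = 1
tempfile round1
save `round1'

* Toy stand-in for the 2012 round; note the cluster numbers repeat
clear
input v001 v023 v005
1 1 1050000
2 1 1050000
3 2 950000
4 2 950000
end
gen survey = 2

* Stack the two rounds into one file
append using `round1'

* Unique PSU and stratum IDs across rounds, as suggested in the reply
egen clusterid = group(survey v001)
egen stratumid = group(survey v023)

* Assumption: v005 is the sample weight, stored multiplied by 1,000,000
gen wt = v005/1000000

* Declare the survey design using the new IDs
svyset clusterid [pweight=wt], strata(stratumid)
svydes
```

Because group() numbers each distinct combination of survey and v001, cluster 1 in the first round and cluster 1 in the second round receive different clusterid values, which is what svyset needs in order to treat them as separate PSUs.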
Re: Appending datasets of one country for several years [message #15765 is a reply to message #15756] Tue, 11 September 2018 12:56
boyle014
Messages: 76
Registered: December 2015
Location: Minneapolis
Senior Member
Dear Kamola,

Just a quick note that IPUMS DHS includes unique PSU and Strata variables so users don't have to construct them. They're called idhspsu and idhsstrata, respectively. They are "preselected," meaning they are automatically added to any dataset a user creates at the IPUMS DHS website.

IPUMS DHS is a tool that makes using DHS data files easier. You just select the samples, years, and variables you want. Then you click on View Data Cart to create a single, integrated dataset, with all variables fully harmonized. Use the website itself as a reference--every variable name takes you to important documentation about the variable, such as the skip pattern that produced it (this is under the Universe tab). To use IPUMS DHS, you have to be registered to use DHS data.

Liz Boyle

Professor Elizabeth Boyle
Sociology & Law, University of Minnesota, USA
Principal Investigator, IPUMS-DHS