The DHS Program User Forum      
Discussions regarding The DHS Program data and results
Merging datasets in SPSS [message #11997] Thu, 16 March 2017 21:45
rufus.benaud11
Hi, I've been trying to understand and work with the DHS dataset in SPSS. My research question explores predictors of breastfeeding initiation, exclusive breastfeeding, and bottle feeding in Ethiopia.

I have a question about merging datasets, specifically about bringing a variable from one dataset into another. For example, I want to bring the "Antenatal care" variable from the household dataset into the children's dataset. Since the IDs are different in the two datasets, I am not sure how to do this. Can you please advise?

Re: Merging datasets in SPSS [message #12021 is a reply to message #11997] Mon, 20 March 2017 17:16
Bridgette-DHS
Following is a response from DHS Senior Research Associate, Cameron Taylor:

Thanks for the question! First, you should check out the Merging Datasets page on the DHS Program website.
This page gives an overview of the unique case identifiers in each data file, the matching variables, and the steps for merging datasets. However, before merging two datasets I would first make sure that a merge is necessary. The KR (kids) file already includes antenatal care information from the woman's file. Additionally, I am not sure why antenatal care would be in the household dataset (HR file) or the Household Member Recode (PR) file, but if you are sure that you need to merge two datasets, here is some guidance.
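
Before merging, a quick check like the one below can confirm whether the KR file already holds the antenatal care variable you need. This is only a sketch: the file name is a placeholder, and m14 is my assumption for the antenatal visits variable, so please confirm the variable name against the recode documentation for your survey.

```spss
* KR_file.sav is a placeholder name for the Kids Recode file of your survey.
GET FILE='KR_file.sav'.
* m14 is assumed to be the antenatal visits variable; confirm this in the recode documentation.
DISPLAY DICTIONARY /VARIABLES=m14.
FREQUENCIES VARIABLES=m14.
```

If the variable is there with sensible values, no merge is needed for it.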

For example, to merge the Household Member Recode (PR) and Kids Recode (KR) files in SPSS, you could proceed as follows:

Using syntax
1) Open the PR file
2) Rename the unique identifiers (cluster, household, and line numbers) in the PR file to match the names used in the KR file
• rename variables (hv001=v001).
• rename variables (hv002=v002).
• rename variables (hvidx=b16).
3) Sort PR file on these unique identifiers
• sort cases by v001(a) v002(a) b16(a).
4) Save PR file under a temporary name
5) Open KR file
6) Sort the KR file on cluster, household, and line numbers
• sort cases by v001(a) v002(a) b16(a).
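
Put together, steps 1 to 6 could look roughly like this in a syntax file. The three file names (PR_file.sav, pr_temp.sav, KR_file.sav) are placeholders I made up for illustration, so substitute the actual paths for your survey.

```spss
* Steps 1 to 4, open the PR file, rename the identifiers, sort, and save a temporary copy.
GET FILE='PR_file.sav'.
RENAME VARIABLES (hv001 hv002 hvidx = v001 v002 b16).
SORT CASES BY v001 (A) v002 (A) b16 (A).
SAVE OUTFILE='pr_temp.sav'.

* Steps 5 and 6, open the KR file and sort it on the same identifiers.
GET FILE='KR_file.sav'.
SORT CASES BY v001 (A) v002 (A) b16 (A).
```

The single RENAME VARIABLES command does the same thing as the three separate rename commands in step 2.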

Then using the drop-down menus
Data > Merge Files > Add Variables. Select the temporary PR file as the "external" data file.
Move v001, v002, and b16 from the "excluded" box to the "key variables" box
Check "Match cases on key variables"
Check "Indicate case source as variable"

Click Paste to send the command to a syntax file, then highlight it and run it.
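
For reference, the pasted command should resemble the sketch below, though the exact form can differ by SPSS version. It assumes the KR file is the active dataset, and pr_temp.sav is a placeholder for whatever name you gave the temporary PR file.

```spss
* The asterisk refers to the active dataset, here the sorted KR file.
* IN=source01 flags the cases that have a record in the external PR file.
MATCH FILES /FILE=*
  /FILE='pr_temp.sav'
  /IN=source01
  /BY v001 v002 b16.
EXECUTE.
```

Both files must already be sorted on v001, v002, and b16 for the /BY match to work.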

You will then see the variable source01. This variable, created during the merge, has a value of 0 for cases from the active dataset and a value of 1 for cases from the external data file. In our merge example, KR is the active dataset and PR is the external dataset.

Question: So which children could be in source01=0?
Answer: A child who is in the KR file but not the PR file does not live in the household with their mother; perhaps the child has died (check b5), or they are older than 59 months.

Question: Which children could be in source01=1?
Answer: A child who is in the PR file but not the KR file is one whose mother wasn't interviewed.
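
One way to look at these groups after the merge is sketched below. It assumes the merged file is the active dataset and that source01 was created as described above. I added b16 to the second table on the assumption that it helps show which children have no line number in the household.

```spss
* Count the cases in each source group.
FREQUENCIES VARIABLES=source01.

* Among KR children with no PR record, look at survival status and household line number.
TEMPORARY.
SELECT IF (source01 = 0).
FREQUENCIES VARIABLES=b5 b16.
```

TEMPORARY limits the selection to the next procedure only, so the full merged file is left intact afterwards.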

As always, please carefully review the questionnaires at the back of the report to fully understand who is being asked which questions. This will help you better understand who is in which data file and whether your merge is necessary.

Let us know if you have additional questions!