The DHS Program User Forum
Discussions regarding The DHS Program data and results
An Urgent Request: how to merge data files (Intergenerational Education Mobility via DHS 2016 for Uganda)
Re: An Urgent Request: how to merge data files [message #28922 is a reply to message #28903] Thu, 28 March 2024 15:15
Janet-DHS
Following is a response from DHS staff member, Tom Pullum:

You have to use hv112 (mother's line number) and hv114 (father's line number) to match children with their mothers and fathers. You can't do it with hv101 (relationship to the head of the household).

Below I will paste a Stata program to do this. I don't use R but hope you can convert it to R. It could be more efficient, but it runs very quickly. It constructs a new file in which the variables are renamed to have the suffix "_child", "_mother", or "_father". The cases are children age 0-17. The education variables are hv106-hv109 and hv121-hv124. A 2x2 table describing the matches is produced. Hope this helps.

* specify a workspace
cd e:\DHS\DHS_data\scratch

* The education variables are hv106-hv109, hv121-hv124

* Make a file of potential mothers
use "C:\Users\26216\ICF\Analysis - Shared Resources\Data\DHSdata\UGPR7BFL.DTA", clear
keep if hv104==2
drop if hv105<=17
keep hv*
rename hv001 cluster
rename hv002 hh
rename hvidx mo_line
* hv114 (the line number of the woman's own father) is not a merge key here; it becomes hv114_mother below
rename hv* hv*_mother
save mother.dta, replace

* Make a file of potential fathers
use "C:\Users\26216\ICF\Analysis - Shared Resources\Data\DHSdata\UGPR7BFL.DTA", clear
keep if hv104==1
drop if hv105<=17
keep hv*
rename hv001 cluster
rename hv002 hh
rename hvidx fa_line
rename hv* hv*_father
save father.dta, replace

* Make a file of children age 0-17
use "C:\Users\26216\ICF\Analysis - Shared Resources\Data\DHSdata\UGPR7BFL.DTA", clear
keep if hv105<=17
keep hv*
rename hv001 cluster
rename hv002 hh
rename hv112 mo_line
rename hv114 fa_line
rename hv* hv*_child

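* Flag the child records so that unmatched parents can be dropped after each merge
gen child=1
tab child
* After the drops below, _merge is 1 for a child with no match in the parent file and 3 for a match
label define noyes 1 "No" 3 "Yes"
* Match each child to the mother; hv112 is 0 if the mother is not in the household
quietly merge m:1 cluster hh mo_line using mother.dta
rename _merge mother_matched
label values mother_matched noyes
* Drop women who were not matched to any child (_merge==2)
keep if child==1
* Match each child to the father; hv114 is 0 if the father is not in the household
quietly merge m:1 cluster hh fa_line using father.dta
rename _merge father_matched
label values father_matched noyes
keep if child==1

* Cross-tabulate the two match indicators
tab *matched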
 
. tab *matched

mother_mat |    father_matched
      ched |        No        Yes |     Total
-----------+----------------------+----------
        No |    10,095      3,095 |    13,190
       Yes |    10,828     26,454 |    37,282
-----------+----------------------+----------
     Total |    20,923     29,549 |    50,472

* save this file
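
As a possible next step, the merged file could be saved and the education of children and their matched parents compared. The lines below are only a sketch that continues from the program above: the file name child_parents.dta is hypothetical, hv106 (highest education level) is used as one example from the education variables listed earlier, and the weight is assumed to be the household sample weight hv005, which carries the suffix "_child" after the renaming. Many children age 0-17 are still in school, so these tables describe education so far, not completed education.

```stata
* Save the merged file (hypothetical file name)
save child_parents.dta, replace

* Show the 2x2 match table again, with cell percentages added
tab mother_matched father_matched, cell

* Education level of the child by education level of the matched mother
* (weighted by the household sample weight, hv005/1000000)
tab hv106_mother hv106_child if mother_matched==3 [iweight=hv005_child/1000000], row

* The same comparison for matched fathers
tab hv106_father hv106_child if father_matched==3 [iweight=hv005_child/1000000], row
```

The conditions on mother_matched and father_matched are there for clarity; an unmatched child has missing values on the parent's variables and would be left out of the table anyway.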
 