The DHS Program User Forum
Discussions regarding The DHS Program data and results
Merging DHS Pakistan 2017-2018 (merging the women's individual file (IR) and the household member file (PR))
Merging DHS Pakistan 2017-2018 [message #30370] Thu, 14 November 2024 17:08
shabina1129
Hi,

My current research focuses on mothers-in-law's educational attainment and how it affects their daughters-in-law's autonomy within the household. I have two questions.

1) I am trying to merge PR and IR files together. This is what I have done so far:
use "/Users/shabina/Desktop/Fall 2024 /PK_2017-18_DHS_08072024_1755_200211/Household Member/household 1.dta"

rename hv001 v001

rename hv002 v002

rename hvidx v003

sort v001 v002 v003

save "/Users/shabina/Desktop/Fall 2024 /PK_2017-18_DHS_08072024_1755_200211/Household Member/household 1.dta", replace

*open womens data
save "/Users/shabina/Desktop/kristinbietsch-MIL-Analysis-c4d7973/women1.dta"

use "/Users/shabina/Desktop/kristinbietsch-MIL-Analysis-c4d7973/women1.dta"

sort v001 v002 v003

keep if _merge==3


*Attempted to merge the household file onto the womens only file.
/*
    Result                      Number of obs
    -----------------------------------------
    Not matched                        85,801
        from master                         0  (_merge==1)
        from using                     85,801  (_merge==2)

    Matched                            15,068  (_merge==3)
    -----------------------------------------
*/
This resulted in only 15,068 matches, which is the number of observations in the original women-only file, leaving out around 30,000 of the sample. Is there another way to merge the datasets while keeping my sample of 51,044?

I keep seeing that this is the only way to merge PR and IR but I wanted to know if there is another way to merge these datasets while still keeping the number of obs.

2) My second question: if I try to run a regression of standardized autonomy on mother-in-law's education, there would either be no cases if I run the regression only on daughters-in-law, or the regression would run solely on mothers-in-law, since theirs are the only rows that have values for both the educational level obtained and the standardized autonomy scale.
- Is there a way to construct the mother-in-law education variable so that it is available on the same row as the daughter-in-law respondents, so that regression analyses can be run? If you know the code for this, it would be much appreciated. Thanks.
Re: Merging DHS Pakistan 2017-2018 [message #30372 is a reply to message #30370] Fri, 15 November 2024 17:27
Bridgette-DHS
Following is a response from Senior DHS staff member, Tom Pullum:

Your merge worked correctly. I did it a little differently and got the same result (see below).

The PR file includes all household members, including males of all ages and females who are not eligible for the women's interview. When I do this merge I first reduce the PR file to household members who are eligible for the women's interview. These are the cases with hv117=1.

By the way, the lines you gave were incomplete. You did not include the "merge" command. Also your first "save" line has the same filename as the file in the first "use" line. That command would over-write the first file, something to be avoided at all costs and something I'm pretty sure you did not do. Anyway, your merge did work correctly.

* Specify workspace
cd e:\DHS\DHS_data\scratch

use "...PKPR71FL.DTA", clear

* Reduce to women who are eligible for the IR file
keep if hv117==1
gen cluster=hv001
gen hh=hv002
gen line=hvidx
save temp.dta, replace

use "...PKIR71FL.DTA", clear
gen cluster=v001
gen hh=v002
gen line=v003
merge 1:1 cluster hh line using temp.dta
tab _merge

* women with _merge=2 were eligible but not interviewed; they count as nonresponse cases
keep if _merge==3
drop _merge

* Save the merged file with a different name 
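
A minimal sketch of the same merge that never writes over the downloaded files, offered as one possible way to follow the advice above rather than as the method actually used. The "..." paths are left exactly as in the post and must be completed by the reader; the output name PK_IR_PR_merged.dta and the tempfile name pr_eligible are hypothetical. Because tempfile creates a local macro, the lines need to be run together in one do-file.

```stata
* Sketch only: complete the "..." paths before running
use "...PKPR71FL.DTA", clear

* Keep household members eligible for the women's interview
keep if hv117==1

* Give the PR identifiers the same names as the IR identifiers
rename (hv001 hv002 hvidx) (v001 v002 v003)

* A tempfile is erased when the do-file ends, so no original file is touched
tempfile pr_eligible
save `pr_eligible'

use "...PKIR71FL.DTA", clear
merge 1:1 v001 v002 v003 using `pr_eligible'
tab _merge

* Keep interviewed women only, then save under a new (hypothetical) name
keep if _merge==3
drop _merge
save PK_IR_PR_merged.dta, replace
```

Renaming the PR identifiers, as in the original question, and generating new cluster/hh/line variables, as in the code above, give the same matches; the choice is a matter of taste.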

[Updated on: Fri, 15 November 2024 17:28]

Re: Merging DHS Pakistan 2017-2018 [message #30373 is a reply to message #30372] Fri, 15 November 2024 17:31
shabina1129
Thank you for your response. I did run keep if _merge==3; I just forgot to include that. If you know the answer to this question too, that would be really helpful!

2) My second question: if I try to run a regression of standardized autonomy on mother-in-law's education, there would either be no cases if I run the regression only on daughters-in-law, or the regression would run solely on mothers-in-law, since theirs are the only rows that have values for both the educational level obtained and the standardized autonomy scale.
- Is there a way to construct the mother-in-law education variable so that it is available on the same row as the daughter-in-law respondents, so that regression analyses can be run? If you know the code for this, it would be much appreciated. Thanks.
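
One possible approach to this second question, sketched under several assumptions and not confirmed anywhere in this thread: take each household's candidate mother-in-law from the PR file, keep her education, and merge it onto the daughters-in-law in the IR file by cluster and household. The sketch assumes the standard DHS recode variables hv101/v150 (relationship to head, with 4 = son/daughter-in-law), hv104 (sex, 2 = female) and hv106 (highest education level). It treats the female head, or the wife of a male head, as the mother-in-law of any woman listed as a daughter-in-law, and drops households with more than one candidate. Other arrangements, such as the head's mother living with the head's wife, are not covered. The names mil, n_mil and mil_educ are hypothetical, and the "..." paths are left as in the posts above.

```stata
* Sketch only: complete the "..." paths before running
use "...PKPR71FL.DTA", clear

* Candidate mothers-in-law: a female head or the wife of the head
gen byte mil = hv104==2 & inlist(hv101, 1, 2)

* Keep households with exactly one candidate, to avoid ambiguous matches
bysort hv001 hv002: egen n_mil = total(mil)
keep if mil==1 & n_mil==1

* Her education level; codes of 8 or above are assumed to be "don't know"/missing
* (check this against the value label in your file)
gen mil_educ = hv106
replace mil_educ = . if mil_educ >= 8

* One row per household, with identifiers named as in the IR file
keep hv001 hv002 mil_educ
rename (hv001 hv002) (v001 v002)
tempfile mil
save `mil'

* Daughters-in-law: IR respondents listed as son/daughter-in-law of the head
use "...PKIR71FL.DTA", clear
keep if v150==4

* Attach the mother-in-law's education to each daughter-in-law's row
merge m:1 v001 v002 using `mil', keep(match) nogenerate
```

After this merge, mil_educ sits on the same row as each daughter-in-law's own variables, so a regression of your autonomy scale on mil_educ can be run on the daughter-in-law rows. Daughters-in-law whose mother-in-law is not listed in the same household drop out, which restricts the sample to co-resident pairs.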