Merging DHS Pakistan 2017-2018 [message #30370] |
Thu, 14 November 2024 17:08 |
shabina1129
Messages: 4 Registered: November 2024
|
Member |
Hi,
My current research focuses on mothers-in-law's educational attainment and how it affects their daughters-in-law's autonomy within the household. I have two questions.
1) I am trying to merge PR and IR files together. This is what I have done so far:
use "/Users/shabina/Desktop/Fall 2024 /PK_2017-18_DHS_08072024_1755_200211/Household Member/household 1.dta"
rename hv001 v001
rename hv002 v002
rename hvidx v003
sort v001 v002 v003
save "/Users/shabina/Desktop/Fall 2024 /PK_2017-18_DHS_08072024_1755_200211/Household Member/household 1.dta", replace
*open womens data
save "/Users/shabina/Desktop/kristinbietsch-MIL-Analysis-c4d7973/women1.dta"
use "/Users/shabina/Desktop/kristinbietsch-MIL-Analysis-c4d7973/women1.dta"
sort v001 v002 v003
keep if _merge==3
*Attempted to merge the household file onto the womens only file.
/*
    Result                      Number of obs
    -----------------------------------------
    Not matched                        85,801
        from master                         0  (_merge==1)
        from using                     85,801  (_merge==2)

    Matched                            15,068  (_merge==3)
    -----------------------------------------
*/
This produced only 15,068 matches, which is the number of observations in the original women-only file, leaving out around 30,000 of the sample. Is there another way to merge the datasets while keeping my sample of 51,044?
I keep reading that this is the only way to merge PR and IR, but I would like to know whether there is another way to merge these datasets that preserves the number of observations.
2) My second question: if I try to run a regression of standardized autonomy on mother-in-law's education, there are either no cases (if I restrict the regression to daughters-in-law), or the regression runs solely on mothers-in-law, since theirs are the only rows that have values for both the educational level attained and the standardized autonomy scale.
- Is there a way to construct the mother-in-law education variable so that it is available on the same row as the daughter-in-law respondent, so that the regression can be run? If you know the code for this, it would be much appreciated. Thanks.
Re: Merging DHS Pakistan 2017-2018 [message #30372 is a reply to message #30370] |
Fri, 15 November 2024 17:27 |
Bridgette-DHS
Messages: 3203 Registered: February 2013
|
Senior Member |
|
|
Following is a response from Senior DHS staff member, Tom Pullum:
Your merge worked correctly. I did it a little differently and got the same result (see below).
The PR file includes all household members, including males of all ages and females who are not eligible for the women's interview. When I do this merge I first reduce the PR file to household members who are eligible for the women's interview. These are the cases with hv117=1.
By the way, the lines you gave were incomplete. You did not include the "merge" command. Also your first "save" line has the same filename as the file in the first "use" line. That command would over-write the first file, something to be avoided at all costs and something I'm pretty sure you did not do. Anyway, your merge did work correctly.
* Specify workspace
cd e:\DHS\DHS_data\scratch
use "...PKPR71FL.DTA", clear
* Reduce to women who are eligible for the IR file
keep if hv117==1
gen cluster=hv001
gen hh=hv002
gen line=hvidx
save temp.dta, replace
use "...PKIR71FL.DTA", clear
gen cluster=v001
gen hh=v002
gen line=v003
merge 1:1 cluster hh line using temp.dta
tab _merge
* women with _merge=2 were eligible but not interviewed; they count as nonresponse cases
keep if _merge==3
drop _merge
* Save the merged file with a different name
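If the household members who were not interviewed are needed later (for example, to look up a relative's characteristics within the same household), one possible variant is to start from the PR file and keep every PR row instead of only the matches. This is a minimal sketch and not part of the reply above; the scratch filename ir_temp.dta and the variable name interviewed are hypothetical, and the file paths are left elided as in the thread.

```stata
* Sketch: keep all household members and attach IR variables where they exist
use "...PKIR71FL.DTA", clear
* Rename the IR identifiers so they match the PR identifiers
rename (v001 v002 v003) (hv001 hv002 hvidx)
save ir_temp.dta, replace            // hypothetical scratch file

use "...PKPR71FL.DTA", clear
merge 1:1 hv001 hv002 hvidx using ir_temp.dta
* _merge==1: household member with no woman's interview
* _merge==3: interviewed woman
* _merge==2 is expected to be empty, since interviewed women are listed in the PR file
tab _merge
gen byte interviewed = (_merge==3)   // hypothetical flag for interviewed women
drop _merge
```

With this layout the number of rows equals the number of household members in the PR file, and the IR variables are missing for everyone who was not interviewed.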
[Updated on: Fri, 15 November 2024 17:28]
Re: Merging DHS Pakistan 2017-2018 [message #30373 is a reply to message #30372] |
Fri, 15 November 2024 17:31 |
shabina1129
Messages: 4 Registered: November 2024
|
Member |
|
|
Thank you for your response. I did run keep if _merge==3; I just forgot to include it. If you also know the answer to this question, that would be really helpful!
2) My second question: if I try to run a regression of standardized autonomy on mother-in-law's education, there are either no cases (if I restrict the regression to daughters-in-law), or the regression runs solely on mothers-in-law, since theirs are the only rows that have values for both the educational level attained and the standardized autonomy scale.
- Is there a way to construct the mother-in-law education variable so that it is available on the same row as the daughter-in-law respondent, so that the regression can be run? If you know the code for this, it would be much appreciated. Thanks.
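One possible way to approach this, offered only as a rough sketch, is to build the variable in the PR file and then merge it onto the IR file. It rests on assumptions that should be checked against the recode manual and with label list: that hv101 (relationship to head) uses code 1 for head, 2 for wife or husband, and 4 for son/daughter-in-law; that hv104==2 identifies women; and that hv106 is the highest educational level, with codes 8 and above meaning "don't know" or missing. It covers only daughters-in-law of the household head, treats the female head or the head's wife as the mother-in-law (who could in fact be a stepmother), and drops households with more than one candidate. The names mil_cand, mil_educ_tmp, n_mil, mil_educ, mil_temp.dta, and autonomy_std are hypothetical.

```stata
* Sketch only: approximate mother-in-law (MIL) education for daughters-in-law of the head
use "...PKPR71FL.DTA", clear          // path elided as in the thread

* Assumed MIL candidate: a woman who is the head (hv101==1) or the head's wife (hv101==2)
gen byte mil_cand = (hv104==2) & inlist(hv101, 1, 2)

* Candidate's education; codes 8 and above (don't know/missing) are left missing
gen mil_educ_tmp = hv106 if mil_cand==1 & hv106 < 8

* Spread the candidate's education to every member of the household,
* then blank it out where the household does not have exactly one candidate
bysort hv001 hv002: egen n_mil = total(mil_cand)
bysort hv001 hv002: egen mil_educ = max(mil_educ_tmp)
replace mil_educ = . if n_mil != 1

* Keep daughters-in-law (female, hv101==4) who are eligible for the woman's interview
keep if hv104==2 & hv101==4 & hv117==1
keep hv001 hv002 hvidx mil_educ
rename (hv001 hv002 hvidx) (v001 v002 v003)
save mil_temp.dta, replace            // hypothetical scratch filename

* Attach MIL education to the interviewed women; unmatched IR rows keep mil_educ missing
use "...PKIR71FL.DTA", clear
merge 1:1 v001 v002 v003 using mil_temp.dta, keep(master match) nogenerate

* Example model; autonomy_std stands in for your own standardized autonomy scale
* regress autonomy_std i.mil_educ
```

After the merge, each interviewed daughter-in-law with an identifiable mother-in-law has mil_educ on her own row, so the regression can be restricted to daughters-in-law without losing the predictor. Women who are not coded as daughters-in-law of the head, or who live in households without a single unambiguous candidate, will have mil_educ missing and drop out of the model.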