Zambia 2018 -- duplicates in cluster number and household number [message #24978] |
Fri, 12 August 2022 11:47 |
nchesterman
Messages: 1 Registered: August 2022
|
Member |
Hi all,
I am attempting to merge variables from the male and female datasets onto the household dataset. The DHS help site for this (https://dhsprogram.com/data/Merging-Datasets.cfm) indicates that v001 and v002 (cluster and household numbers) should be unique for the datasets. However, I am finding that these two variables do not uniquely identify all observations; there are many duplicates of these two variables in the dataset.
I also checked my variables of interest and found that the data differ between the duplicated cases, so these records are not true duplicates.
Has anyone else encountered this issue for the Zambia dataset, and have suggestions for how to successfully merge on cluster and household numbers with the household dataset?
Thanks,
Nathan
Re: Zambia 2018 -- duplicates in cluster number and household number [message #25125 is a reply to message #24978] |
Thu, 01 September 2022 16:30 |
Janet-DHS
Messages: 899 Registered: April 2022
|
Senior Member |
Following is a response from DHS Research & Data Analysis Director, Tom Pullum:
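In the PR file, individuals are identified by hv001 hv002 hvidx. In the IR file, they are identified by v001 v002 v003, and in the MR file by mv001 mv002 mv003. Cluster and household number alone (v001 v002) are not unique in the IR file, because a household can contain more than one interviewed woman; the third variable in each set is what distinguishes individuals within a household. For this kind of merge, use just those three files, not the HR file, which perhaps you have been using.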
We apologize for the delay in this response and hope it will still be useful.
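For anyone landing on this thread later, here is a minimal sketch of the merge described above, written in plain Python. The rows are invented toy data, and woman_var, man_var and the helper functions are hypothetical names used only for illustration; only the identifier variables (hv001 hv002 hvidx, v001 v002 v003, mv001 mv002 mv003) come from the reply. It assumes that the third identifier in the IR and MR files corresponds to hvidx in the PR file, so please check that against your own data before relying on it.

```python
from collections import Counter

# Toy stand-ins for the PR (household members), IR (women) and MR (men) files.
# Every value here is invented; woman_var and man_var are hypothetical variables.
pr_rows = [
    {"hv001": 1, "hv002": 5, "hvidx": 1},
    {"hv001": 1, "hv002": 5, "hvidx": 2},
    {"hv001": 1, "hv002": 5, "hvidx": 3},
    {"hv001": 2, "hv002": 9, "hvidx": 1},
]
ir_rows = [
    {"v001": 1, "v002": 5, "v003": 2, "woman_var": "a"},
    {"v001": 1, "v002": 5, "v003": 3, "woman_var": "b"},
]
mr_rows = [
    {"mv001": 1, "mv002": 5, "mv003": 1, "man_var": "x"},
    {"mv001": 2, "mv002": 9, "mv003": 1, "man_var": "y"},
]


def duplicate_keys(rows, key_vars):
    """Return the key tuples that occur more than once in rows."""
    counts = Counter(tuple(row[k] for k in key_vars) for row in rows)
    return [key for key, n in counts.items() if n > 1]


def index_by(rows, key_vars):
    """Build a dict keyed by the identifier tuple; fail if the key is not unique."""
    indexed = {}
    for row in rows:
        key = tuple(row[k] for k in key_vars)
        if key in indexed:
            raise ValueError(f"key {key} is not unique")
        indexed[key] = row
    return indexed


# Cluster + household alone repeats when a household has two interviewed women.
two_key_dups = duplicate_keys(ir_rows, ["v001", "v002"])
# Adding the third identifier makes each woman's record unique.
three_key_dups = duplicate_keys(ir_rows, ["v001", "v002", "v003"])

women = index_by(ir_rows, ["v001", "v002", "v003"])
men = index_by(mr_rows, ["mv001", "mv002", "mv003"])

# Left-merge onto the PR rows: every household member is kept, and the
# woman's or man's variables are attached only where a match exists.
merged = []
for person in pr_rows:
    key = (person["hv001"], person["hv002"], person["hvidx"])
    row = dict(person)
    for source, id_vars in ((women, ("v001", "v002", "v003")),
                            (men, ("mv001", "mv002", "mv003"))):
        match = source.get(key, {})
        row.update({k: v for k, v in match.items() if k not in id_vars})
    merged.append(row)

print(two_key_dups, three_key_dups)
for row in merged:
    print(row)
```

The same pattern applies in Stata, R or pandas: check uniqueness on all three identifier variables first, then merge the IR and MR records onto the PR file on those three keys, keeping unmatched household members.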