The DHS Program User Forum
Discussions regarding The DHS Program data and results
Zambia 2018 -- duplicates in cluster number and household number (Trouble merging datasets due to duplicates) [message #24978] Fri, 12 August 2022 11:47
nchesterman
Hi all,

I am attempting to merge variables from the male and female datasets onto the household dataset. The DHS help site for this indicates that v001 and v002 (cluster and household numbers) should uniquely identify observations in the datasets. However, I am finding that these two variables do not uniquely identify all observations; many combinations of cluster and household number appear more than once.

I also checked my variables of interest and found that the data differ between the duplicated cases, so the households are not true duplicates.

Has anyone else encountered this issue with the Zambia dataset, and does anyone have suggestions for how to merge successfully on cluster and household numbers with the household dataset?

Re: Zambia 2018 -- duplicates in cluster number and household number [message #25125 is a reply to message #24978] Thu, 01 September 2022 16:30
Janet-DHS
Following is a response from DHS Research & Data Analysis Director, Tom Pullum:

In the PR file, individuals are identified by hv001 hv002 hvidx; in the IR file, by v001 v002 v003; in the MR file, by mv001 mv002 mv003. For this kind of merge, use just those three files, NOT the HR file, which perhaps you have been using.

We apologize for the delay in this response and hope it will still be useful.
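
For readers who want to see the logic of the reply spelled out, here is a minimal sketch in plain Python. It is not DHS code. It uses three invented rows per file rather than the real Zambia 2018 data, and the helper names (duplicate_keys, index_by, merge_onto_pr) and the non-identifier columns (hv104, v012, mv012) are chosen only for illustration. The point it tries to show is that the IR and MR files have one record per interviewed woman or man, so cluster plus household repeats whenever a household has more than one respondent, while cluster plus household plus line number does not.

```python
from collections import Counter

# Toy stand-ins for the PR, IR and MR files. All values are invented.
pr = [  # one row per household member
    {"hv001": 1, "hv002": 5, "hvidx": 1, "hv104": "male"},
    {"hv001": 1, "hv002": 5, "hvidx": 2, "hv104": "female"},
    {"hv001": 1, "hv002": 5, "hvidx": 3, "hv104": "female"},
]
ir = [  # one row per interviewed woman
    {"v001": 1, "v002": 5, "v003": 2, "v012": 34},
    {"v001": 1, "v002": 5, "v003": 3, "v012": 17},
]
mr = [  # one row per interviewed man
    {"mv001": 1, "mv002": 5, "mv003": 1, "mv012": 40},
]


def duplicate_keys(rows, key_vars):
    """Return the key combinations that occur more than once, with counts."""
    counts = Counter(tuple(row[k] for k in key_vars) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


def index_by(rows, key_vars):
    """Build a lookup from key tuple to row, refusing non-unique keys."""
    index = {}
    for row in rows:
        key = tuple(row[k] for k in key_vars)
        if key in index:
            raise ValueError(f"key {key} is not unique for {key_vars}")
        index[key] = row
    return index


def merge_onto_pr(pr_rows, ir_rows, mr_rows):
    """Attach IR and MR variables to PR rows by cluster, household, line number."""
    women = index_by(ir_rows, ("v001", "v002", "v003"))
    men = index_by(mr_rows, ("mv001", "mv002", "mv003"))
    merged = []
    for person in pr_rows:
        key = (person["hv001"], person["hv002"], person["hvidx"])
        row = dict(person)
        row.update(women.get(key, {}))  # empty if this member has no IR record
        row.update(men.get(key, {}))    # empty if this member has no MR record
        merged.append(row)
    return merged


# Cluster + household alone repeats in the IR file; adding v003 makes it unique.
print(duplicate_keys(ir, ("v001", "v002")))
print(duplicate_keys(ir, ("v001", "v002", "v003")))

merged = merge_onto_pr(pr, ir, mr)
for row in merged:
    print(row)
```

Household members without a matching IR or MR record simply keep their PR variables, which mirrors a left join onto the PR file. In Stata or another package the same idea applies: rename the IR and MR identifiers to match hv001 hv002 hvidx and merge on all three.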