The DHS Program User Forum
Discussions regarding The DHS Program data and results
Cluster sampling [message #22596] Thu, 08 April 2021 17:46
okunloladavid
Dear everyone,

Is it possible for a cluster or household that was randomly selected in one DHS to be randomly selected again in a later survey? If so, how could one detect this, for instance between the 2013 and 2018 Nigeria DHS?

I ask because this could have implications for appending DHS datasets. The same cluster ID may refer to different clusters in different datasets, so appending the datasets would treat those different clusters as a single cluster.

Conversely, the same cluster could carry different IDs in different datasets, and it would then be treated as different clusters when the datasets are merged.
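The first concern, identical cluster IDs colliding across surveys, can be illustrated with a minimal sketch. It assumes the cluster number is stored in `v001`, as in the standard DHS recode files. The toy records, the survey labels `NG2013` and `NG2018`, and the helper `append_with_survey_id` are hypothetical, made up for illustration only. The sketch tags each row with its survey before appending, so every survey's clusters are treated as distinct. It does not detect whether the same physical cluster was selected twice.

```python
# Hypothetical toy records standing in for two DHS files.
# "v001" is assumed to hold the cluster number, as in DHS recode files.
survey_2013 = [{"v001": 1, "outcome": 0}, {"v001": 2, "outcome": 1}]
survey_2018 = [{"v001": 1, "outcome": 1}, {"v001": 3, "outcome": 0}]


def append_with_survey_id(datasets):
    """Append datasets, tagging each row with its survey and a pooled cluster key."""
    pooled = []
    for survey, rows in datasets.items():
        for row in rows:
            tagged = dict(row)  # copy so the input records are not modified
            tagged["survey"] = survey
            # Combine survey label and cluster number into one pooled cluster key.
            tagged["pooled_cluster"] = f"{survey}_{row['v001']}"
            pooled.append(tagged)
    return pooled


pooled = append_with_survey_id({"NG2013": survey_2013, "NG2018": survey_2018})

# Using v001 alone merges cluster 1 of 2013 with cluster 1 of 2018.
naive_clusters = {row["v001"] for row in pooled}
# The pooled key keeps them apart.
pooled_clusters = {row["pooled_cluster"] for row in pooled}

print(len(naive_clusters), len(pooled_clusters))
```

In a multilevel model fitted to the pooled data, the pooled key rather than `v001` alone would then serve as the cluster-level grouping variable.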

This also has implications for estimating cluster effects in multilevel models fitted to pooled data: we cannot tell which cluster affects the outcome, because a respondent might appear in different clusters in the pooled data, especially if the same respondent was sampled in more than one of the pooled surveys.

I will be pleased to read anyone's take on this. Thank you.

Re: Cluster sampling [message #22600 is a reply to message #22596] Fri, 09 April 2021 07:34
Bridgette-DHS

Following is a response from DHS Research & Data Analysis Director, Tom Pullum:

It is very unlikely that the same cluster will be selected in successive surveys, and especially unlikely that the same households or individuals will be sampled. However, there is no way to tell whether that has happened. The identifiers for the selected clusters and households are not retained. There is therefore no way to eliminate the possibility, for example by removing previously selected clusters from the sampling frame. I believe that if you did a simulation study that allowed for repeated selection, you would find that any adjustments, for example a finite population correction, would have a negligible effect.
