The DHS Program User Forum
Discussions regarding The DHS Program data and results
Combining individuals DHS datasets in Python [message #17951] Thu, 25 July 2019 08:22
Triphon

I am merging the MR and IR datasets with HIV status (AR) for multiple countries. From what I've read, the weight is hiv05 (preferred over v005/mv005), the cluster (id) is v021/mv021, and the stratum varies by survey, but let's say v023/mv023.
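A minimal pandas sketch of the merge described above, under assumptions that are not stated in this post and should be checked against the recode documentation for each survey: the AR file is keyed by hivclust/hivnumb/hivline, the IR and MR files by v001/v002/v003 and mv001/mv002/mv003, the test result is hiv03, and hiv05 is stored scaled by 1,000,000. The function name `attach_hiv` and the output column names are hypothetical.

```python
import pandas as pd

def attach_hiv(recode: pd.DataFrame, ar: pd.DataFrame, prefix: str) -> pd.DataFrame:
    """Join an individual recode (prefix "v" for IR, "mv" for MR) to the AR file."""
    # Assumed identifiers: cluster, household and line number of the respondent.
    keys = [f"{prefix}001", f"{prefix}002", f"{prefix}003"]
    merged = recode.merge(
        ar,
        left_on=keys,
        right_on=["hivclust", "hivnumb", "hivline"],
        how="inner",               # keep only respondents with an HIV record
        validate="one_to_one",     # fail loudly if a key is duplicated
    )
    # Harmonise the design columns so IR and MR rows can be stacked.
    return pd.DataFrame({
        "psu": merged[f"{prefix}021"],
        "stratum": merged[f"{prefix}023"],
        "weight": merged["hiv05"] / 1_000_000,  # assumed scaling of the raw weight
        "hiv03": merged["hiv03"],
        "sex": "female" if prefix == "v" else "male",
    })

# Tiny made-up tables, only to show the mechanics.
ir = pd.DataFrame({"v001": [1, 1, 2], "v002": [1, 2, 1], "v003": [2, 2, 3],
                   "v021": [1, 1, 2], "v023": [10, 10, 20]})
mr = pd.DataFrame({"mv001": [1, 2], "mv002": [1, 1], "mv003": [1, 1],
                   "mv021": [1, 2], "mv023": [10, 20]})
ar = pd.DataFrame({"hivclust": [1, 1, 2, 2], "hivnumb": [1, 1, 1, 1],
                   "hivline": [2, 1, 3, 1], "hiv03": [0, 1, 0, 0],
                   "hiv05": [1500000, 800000, 1000000, 1200000]})

combined = pd.concat(
    [attach_hiv(ir, ar, "v"), attach_hiv(mr, ar, "mv")],
    ignore_index=True,
)
```

When several countries are stacked, the cluster and stratum codes would also need to be made unique per survey (for example by prefixing them with a survey identifier), since the same numeric codes recur across surveys.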
I am coding in Python (for various reasons) and, as far as I know, it has no package or library equivalent to svydesign in R. I could do this step in R and then go back to Python, but converting a pandas categorical to an R factor is not straightforward (the data type is not string). In Python I can handle the weights but, to my knowledge, not the cluster and the stratum. Does anyone know what the best option would be here? For prediction and variable selection purposes, would it be problematic not to take the cluster and stratum into account?
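For context on what the cluster and stratum change: for a weighted mean they leave the point estimate untouched and only affect its standard error. The sketch below is one common way to compute that standard error by hand, Taylor linearization with primary sampling units treated as sampled with replacement within strata. It is an illustration in plain Python, not the method of any particular package; the function name `design_based_mean` is hypothetical, and it accepts any equal-length sequences, including pandas columns.

```python
from collections import defaultdict
from math import sqrt

def design_based_mean(y, weight, stratum, psu):
    """Weighted mean and its linearized standard error for a stratified cluster sample."""
    rows = list(zip(y, weight, stratum, psu))
    total_w = sum(w for _, w, _, _ in rows)
    mean = sum(w * v for v, w, _, _ in rows) / total_w

    # Sum the linearized scores w * (y - mean) / sum(w) within each PSU of each stratum.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for v, w, h, c in rows:
        psu_totals[h][c] += w * (v - mean) / total_w

    variance = 0.0
    for h, clusters in psu_totals.items():
        n_h = len(clusters)
        if n_h < 2:
            # A single-PSU stratum has no between-PSU variation to estimate from.
            raise ValueError(f"stratum {h!r} has fewer than two PSUs")
        z = list(clusters.values())
        z_bar = sum(z) / n_h
        variance += n_h / (n_h - 1) * sum((t - z_bar) ** 2 for t in z)
    return mean, sqrt(variance)

# Made-up data: one stratum, two clusters whose members all share the same outcome.
mean, se = design_based_mean(
    y=[1, 1, 0, 0],
    weight=[1, 1, 1, 1],
    stratum=["s", "s", "s", "s"],
    psu=["a", "a", "b", "b"],
)
```

In this toy case the mean is 0.5 and the design-based standard error is 0.5, noticeably larger than a calculation that treats the four respondents as independent would give, because respondents within a cluster are perfectly alike. Raising an error for single-PSU strata is a design choice of this sketch; survey software typically offers other ways to handle them.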
Re: Combining individuals DHS datasets in Python [message #21856 is a reply to message #17951] Thu, 31 December 2020 16:43
Mamadou S DIallo
Dear Triphon

I came across this post. I know it is old, but in case you are still interested in using Python to analyze survey samples, samplics may be useful to you.

samplics is a Python package that I have been developing. It is not yet as feature-complete as the survey package in R, but the pieces included so far have been extensively tested, and I am working on adding the features still needed. Feedback from users will be extremely valuable in deciding what to add next, so if you are still analyzing survey data, I would be very interested in yours. You will find samplics' documentation at [link missing]. If you are interested, do not hesitate to send me an email at [address missing].

Happy new year and all the best for 2021.