The DHS Program User Forum
Discussions regarding The DHS Program data and results
Calculating Under-5 Mortality Rate for Regression Analysis Using DHS 2022 Data in R-Studio Thu, 01 August 2024 05:35
n.borgmann Messages: 10 Registered: August 2024 Member
Hi everyone,

I am currently writing my bachelor's thesis, which investigates the correlation between the under-5 mortality rate (U5MR) and maternal education in Kenya using DHS 2022 data. I am using R-Studio for my analysis.

However, I am having difficulties obtaining the U5MR at an individual level. As I have read in this forum, it seems to be quite complex to calculate. I would like to know the best approach to take for this.

My goal is to perform a linear regression with fixed effects, where the dependent variable is the U5MR and the independent variable is the mother's years of schooling. Has anyone had experience with calculating the U5MR for use in regression analysis? So far, none of my calculations has matched the rates published in the DHS reports.

I would greatly appreciate any help.

Thank you,

Nils Borgmann
Re: Calculating Under-5 Mortality Rate for Regression Analysis Using DHS 2022 Data in R-Studio [message #29784 is a reply to message #29773] Mon, 05 August 2024 04:51
n.borgmann Messages: 10 Registered: August 2024 Member
Hi everyone

I've made some initial progress and have performed a linear regression with fixed effects, where the dependent variable is the U5MR and the independent variable is the mother's years of schooling. However, I am seeking feedback on my methodology and results to ensure the robustness and accuracy of my analysis.

I have compiled my preliminary findings and the regression results in an HTML document, which I am attaching here. I would greatly appreciate it if anyone with experience in this area could take a look and provide some feedback.

I am aware that this is not a particularly precise calculation, but for a bachelor's thesis, a simple calculation is sufficient. I am curious if I have made any significant errors or if the calculations are somewhat correct. I would be very grateful for any tips and help.

Here is the link: file:///Users/nilsborgmannprivate/Desktop/Bachelor%20Arbeit%20R%20Auswertung/analysis3_KEBR8BFL_CM.html.

Some specific questions I have:

Are my assumptions and steps in the data preparation and analysis correct?
Is the fixed effects model I used appropriate for this type of analysis?
Any suggestions for improving the accuracy and interpretability of my results?

Thank you very much for your time and assistance.

Best regards,
Nils Borgmann

I have written a description of my approach here, as some parts are in German. It is sufficient to just read through this to check if my approach is correct.

Description of the Linear Regression

Goal of the Analysis:
The goal of this analysis is to investigate the relationship between the number of years of schooling a mother has and the under-5 mortality rate (U5MR) in Kenya. Additionally, the analysis accounts for regional differences by using fixed effects for the various regions.

Data Preparation:

The data was loaded from a DHS database containing relevant information on births, child mortality, educational levels, and regions.
Only the relevant variables were selected: CASEID (Case Identification), B3 (Date of Birth), B7 (Age at Death in Months), V008 (Date of Interview), V106 (Educational Level), V107 (Number of Years of Schooling), and V024 (Region).

Calculating Under-5 Mortality (death_before_5):

A new binary variable death_before_5 was created, which is set to 1 if a child died before their fifth birthday (B7 < 60), and to 0 if the child reached age five or is still alive.

Ensuring Region is Treated as a Factor:

The variable V024 (Region) was converted into a factor to be used as a fixed effect in the model.
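The data preparation steps above could look roughly like this in base R. This is only a sketch on a made-up data frame, not the code from the attached file: the toy values and the extra column OTHER are hypothetical, and the restriction to children born at least 60 months before the interview is an added assumption that is not part of the description. B3 and V008 are century month codes, so their difference is an age in months. B7 is missing for living children, so the comparison is wrapped in an explicit NA check.

```r
# Toy stand-in for the births recode (hypothetical values, not DHS data).
births <- data.frame(
  CASEID = c("a", "a", "b", "c", "d"),
  B3     = c(1380, 1420, 1400, 1460, 1390),  # child's date of birth (CMC)
  B7     = c(NA, 3, 72, NA, 0),              # age at death in months, NA if alive
  V008   = rep(1470, 5),                     # date of interview (CMC)
  V106   = c(1, 1, 2, 1, 3),                 # educational level
  V107   = c(6, 6, 4, 8, 2),                 # years of schooling, as described above
  V024   = c(1, 1, 2, 3, 2),                 # region
  OTHER  = 1:5                               # hypothetical column that is dropped
)

# Keep only the variables named in the description.
keep <- c("CASEID", "B3", "B7", "V008", "V106", "V107", "V024")
births <- births[, keep]

# 1 if the child died before 60 months, 0 otherwise (including living children).
births$death_before_5 <- as.integer(!is.na(births$B7) & births$B7 < 60)

# Region as a factor, so it can serve as a fixed effect later.
births$V024 <- factor(births$V024)

# Assumption (not in the description): keep only children born at least
# 60 months before the interview, so survivors were observed up to age five.
births$months_since_birth <- births$V008 - births$B3
full_exposure <- births[births$months_since_birth >= 60, ]

print(births[, c("CASEID", "B7", "death_before_5", "months_since_birth")])
print(nrow(full_exposure))
```

Without some restriction of this kind, a child born a few months before the interview is coded 0 even though it has not yet lived through the full five years of risk.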

Performing the Linear Regression:

Creating the Model Formula:

The formula for the model is: death_before_5 ~ V107 | V024
death_before_5: Dependent variable indicating whether a child died before the age of five.
V107: Independent variable indicating the number of years of schooling of the mother.
V024: Fixed effect for the region to control for regional differences.

Executing the Fixed Effects Model:

The model was estimated using the felm function from the lfe package.
The model controls for regional differences by including regional fixed effects (V024).
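For readers who want to try the model described above, here is a minimal sketch on simulated data. It swaps felm for base R's lm with region dummies, so that it runs without the lfe package. Both approaches should give the same coefficient on V107, since absorbing a fixed effect and including its dummies are equivalent for the slope. The simulated data frame sim, the sample size, and the coefficients used to generate it are all made up. V107 is kept as the regressor because the description uses it. In the DHS recode it is documented as the highest year within the level given by V106, while V133 holds education in single years, so that choice may be worth double-checking.

```r
set.seed(1)
n <- 2000

# Simulated stand-in for the prepared data (hypothetical, not DHS data).
sim <- data.frame(
  V024 = factor(sample(1:5, n, replace = TRUE)),  # region
  V107 = sample(0:8, n, replace = TRUE)           # schooling variable
)

# Made-up data-generating process: risk falls with schooling, shifts by region.
p <- 0.10 - 0.005 * sim$V107 + 0.01 * as.integer(sim$V024)
sim$death_before_5 <- rbinom(n, 1, p)

# Linear probability model with region dummies (least-squares dummy variables).
# With lfe installed, the formula from the description would be passed as
#   felm(death_before_5 ~ V107 | V024, data = sim)
lpm <- lm(death_before_5 ~ V107 + V024, data = sim)

# Change in the probability of death per additional year of schooling.
print(coef(lpm)["V107"])
```

Because the outcome is binary, this is a linear probability model: the V107 coefficient is a change in the probability of dying before age five for an individual child, not a change in a population-level U5MR.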

[Updated on: Mon, 05 August 2024 05:08]


Re: Calculating Under-5 Mortality Rate for Regression Analysis Using DHS 2022 Data in R-Studio [message #29802 is a reply to message #29773] Wed, 07 August 2024 10:27
Janet-DHS Messages: 841 Registered: April 2022 Senior Member
Following is a response from DHS staff member, Tom Pullum:

For an individual child, you know whether the child died or survived. If it died, you know the age at death. If it survived, you know the age at the time of the mother's interview. There is not enough information from one child to calculate the U5MR, which by definition is calculated for a group of children--a large group.
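As a rough illustration of why the rate belongs to a group: the published DHS rates are built with a synthetic cohort approach, in which a death probability is estimated for each of several age segments and the U5MR is one minus the product of the segment survival probabilities. The sketch below uses made-up segment probabilities; only the segment boundaries and the combining formula follow the DHS approach.

```r
# Hypothetical death probabilities for the eight age segments (in months).
q <- c("0" = 0.020, "1-2" = 0.004, "3-5" = 0.004, "6-11" = 0.006,
       "12-23" = 0.006, "24-35" = 0.003, "36-47" = 0.002, "48-59" = 0.002)

# Probability of surviving every segment, then of dying before age five.
survive_to_5 <- prod(1 - q)
u5mr_per_1000 <- 1000 * (1 - survive_to_5)

print(round(u5mr_per_1000, 1))
```

Each segment probability is itself a ratio of deaths to children exposed in that age range, which is why a single child cannot supply a rate.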

There have been other posts and responses on this topic on the forum. One suggestion is that you focus on neonatal mortality. For an individual child, you calculate a binary outcome that is 1 if the child died in the first month (b7=0) or the first 28 days (b6<=127), and 0 otherwise. Then use logit regression.
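The suggestion above could be sketched as follows in base R. The data are simulated, and the data frame kids, the sample size, and the coefficients used to generate it are made up. Only the outcome definition (b7 equal to 0) and the use of logit regression come from the suggestion. The schooling variable v107 is borrowed from the original post.

```r
set.seed(2)
n <- 3000

# Simulated stand-in for the births recode (hypothetical, not DHS data).
kids <- data.frame(v107 = sample(0:8, n, replace = TRUE))

# Made-up process: neonatal risk falls with schooling; some later deaths too.
neo   <- rbinom(n, 1, plogis(-3 - 0.1 * kids$v107)) == 1
later <- !neo & rbinom(n, 1, 0.03) == 1

kids$b7 <- NA_integer_                                   # NA for living children
kids$b7[neo]   <- 0L                                     # died in the first month
kids$b7[later] <- sample(1:59, sum(later), replace = TRUE)

# Binary outcome: 1 if the child died in the first month, 0 otherwise.
# With day-level data, the alternative test would be !is.na(b6) & b6 <= 127.
kids$neonatal_death <- as.integer(!is.na(kids$b7) & kids$b7 == 0)

# Logit regression of neonatal death on schooling.
fit <- glm(neonatal_death ~ v107, family = binomial(link = "logit"), data = kids)

print(summary(fit)$coefficients)
print(exp(coef(fit)["v107"]))  # odds ratio per additional year of schooling
```

A real analysis would probably also need the survey weights and the cluster and strata design, which this sketch leaves out.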

This recent article includes some methodological discussion that may help: Subramanian, S.V., Akhil Kumar, Thomas W. Pullum, Mayanka Ambade, Sunil Rajpal, and Rockli Kim. 2024. Early-Neonatal, Late-Neonatal, Postneonatal, and Child Mortality Rates Across India, 1993-2021. JAMA Network Open 7(5):e2410046. doi:10.1001/jamanetworkopen.2024.10046. https://jamanetwork.com/journals/jamanetworkopen/fullarticle/2818561
Re: Calculating Under-5 Mortality Rate for Regression Analysis Using DHS 2022 Data in R-Studio [message #29808 is a reply to message #29784] Wed, 07 August 2024 12:55
Janet-DHS Messages: 841 Registered: April 2022 Senior Member
Following is a response from DHS staff member, Tom Pullum:

Very sorry, but DHS staff cannot help. Perhaps other users can offer suggestions.
Re: Calculating Under-5 Mortality Rate for Regression Analysis Using DHS 2022 Data in R-Studio [message #29834 is a reply to message #29808] Sat, 10 August 2024 05:25
n.borgmann Messages: 10 Registered: August 2024 Member