The DHS Program User Forum
Inquiry regarding DHS 2018 analysis using R [message #25968] Wed, 18 January 2023 06:27
woojae1995
I am currently doing a secondary analysis project using the 2018 DHS dataset of Nigeria.

I am working in R and have several technical/coding questions. I am using the 2018 DHS individual (IR) and children (KR) datasets.

1) How do I get a list of the column labels in R?
- I want to know the labels for the columns (e.g. b19 = current age of child in months) and the labels for the answer choices (e.g. for b4, the sex of the child: 1 = male, 2 = female).
- I did find the 'Standard Recode Manual for DHS-7' published by USAID, but it does not list the full response labels.
- Is this a problem inherent to R? I have heard that labels are easily visible in Stata. Since I have been using R so far, I would like to know whether there is a way to list all the questions and labels for the dataset I am using.
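
On question 1, a minimal sketch, assuming the files are read with haven's read_dta as in the code further down: haven keeps the Stata variable label in each column's "label" attribute and the value labels in its "labels" attribute, so both can be collected with base R. The toy data frame and its labels are typed in by hand for illustration; they are not read from the DHS file.

```r
# Toy stand-in for a data frame returned by haven::read_dta().
# Assumption: labels are attached by hand here to mimic the attributes
# that haven sets; nothing below is read from the real DHS file.
kr <- data.frame(b4 = c(1, 2, 2), b19 = c(10, 34, 7))
attr(kr$b4, "label")   <- "sex of child"
attr(kr$b4, "labels")  <- c(male = 1, female = 2)
attr(kr$b19, "label")  <- "current age of child in months"

# Variable labels: one row per column (NA where a column has no label).
# exact = TRUE stops "label" from partially matching "labels".
var_labels <- data.frame(
  variable = names(kr),
  label = vapply(kr, function(x) {
    lab <- attr(x, "label", exact = TRUE)
    if (is.null(lab)) NA_character_ else as.character(lab)
  }, character(1)),
  row.names = NULL,
  stringsAsFactors = FALSE
)

# Value labels: a named list with one element per column that has any.
val_labels <- Filter(
  Negate(is.null),
  lapply(kr, attr, which = "labels", exact = TRUE)
)

print(var_labels)
print(val_labels)
```

With the real data, the same two steps should work on NigeriaChildrenKR in place of the toy kr. haven::as_factor() can also turn a labelled column into a factor that shows the response text instead of the codes.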

2) How do I merge two datasets in R?
Following the guidance at 'https://dhsprogram.com/data/Merging-datasets.cfm', I merged the children dataset and the individual dataset using the following code in R:

NigeriaIR <- read_dta('NGIR7BFL.DTA')
NigeriaChildrenKR <- read_dta('NGKR7BFL.DTA')
NigeriaKRIR <- merge(NigeriaChildrenKR, NigeriaIR, by = c('v001','v002'))
*IR = individual dataset, KR = children dataset

Is this the correct way to merge them? I am concerned because the children dataset has 33924 observations and the individual dataset has 41821 observations, but when I merge them by v001 (cluster number) and v002 (household number), I get 52982 observations.
I cannot see how the merged dataset can have more observations than the individual dataset. Could anyone explain why this is happening or what I am doing wrong?
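
One likely explanation for the row count: v001 and v002 together identify a household, not a woman. When a household contains more than one interviewed woman, merge() pairs every child row in that household with every woman row, so rows multiply. Adding v003 (the respondent's line number) to the key identifies the mother uniquely. The sketch below uses made-up toy rows, not the Nigeria files, to show the effect.

```r
# Toy data: one household (cluster 1, household 1) with two women.
# Woman on line 2 has two children, woman on line 3 has one.
kr <- data.frame(
  v001 = c(1, 1, 1),
  v002 = c(1, 1, 1),
  v003 = c(2, 2, 3),   # mother's line number
  bidx = c(1, 2, 1)    # birth index within mother
)
ir <- data.frame(
  v001 = c(1, 1),
  v002 = c(1, 1),
  v003 = c(2, 3),      # woman's line number
  v012 = c(30, 25)     # woman's age
)

# Household-level key: each child is paired with BOTH women (3 x 2 rows),
# and the shared non-key column v003 comes back as v003.x and v003.y.
by_household <- merge(kr, ir, by = c("v001", "v002"))

# Woman-level key: each child is matched to its own mother only.
by_woman <- merge(kr, ir, by = c("v001", "v002", "v003"))

print(nrow(by_household))
print(nrow(by_woman))
print(names(by_household))
```

The .x and .y suffixes appear because merge() renames any non-key column that exists in both inputs. Keeping only the key and the extra columns needed from the IR file before merging is one way to avoid them.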

3) Is this the correct way to account for the survey weights and design?
NigeriaKRIRsvy <- svydesign(id = NigeriaKRIR$v021.x, strata=NigeriaKRIR$v022.x, weights = NigeriaKRIR$v005.x/1000000, data=NigeriaKRIR)
*NigeriaKRIR is the merged dataset name
*for some reason, after I merged the datasets, the vXXX variables (e.g. v001, v002) changed to vXXX.x (e.g. v001.x, v002.x)
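
For comparison, a hedged sketch of the same design written with svydesign's formula interface (ids = ~..., strata = ~..., weights = ~...), where columns are looked up in data. The toy values are invented for illustration. The column names v021, v022 and v005 assume a data frame without the .x suffixes, so with the merged data above the suffixed names would be used instead. The survey call is guarded so the snippet still runs where the survey package is not installed.

```r
# Toy stand-in for the analysis data; all values are made up.
dat <- data.frame(
  v021 = c(1, 1, 2, 2, 3, 3, 4, 4),   # primary sampling unit
  v022 = c(1, 1, 1, 1, 2, 2, 2, 2),   # sampling stratum
  v005 = c(1200000, 1200000, 800000, 800000,
           1500000, 1500000, 500000, 500000),  # weight, 6 implied decimals
  b4   = c(1, 2, 1, 2, 2, 2, 1, 1)    # 1 = male, 2 = female
)

# Rescale the DHS weight and build a 0/1 indicator to summarise.
dat$wt   <- dat$v005 / 1e6
dat$male <- as.numeric(dat$b4 == 1)

# Base R point estimate, for cross-checking the survey estimate.
weighted_share_male <- weighted.mean(dat$male, dat$wt)
print(weighted_share_male)

# Design-based estimate with standard errors, if survey is available.
if (requireNamespace("survey", quietly = TRUE)) {
  des <- survey::svydesign(ids = ~v021, strata = ~v022,
                           weights = ~wt, data = dat)
  print(survey::svymean(~male, des))
}
```

The point estimate from svymean should agree with the weighted.mean value; what the design object adds is standard errors that reflect the clustering and stratification.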

Thank you all in advance