The DHS Program User Forum
Discussions regarding The DHS Program data and results
Forum: Nigeria. Topic: NGHR7BFL (NDHS 2018)
NGHR7BFL [message #24711] Mon, 27 June 2022 12:33
Mayo
Messages: 9
Registered: February 2019
Member
Hello,

I am using NGHR7BFL to do an analysis and in the data file, I noticed that there are multiple variables for "Highest educational level attained" (HV106$01 onward). Which one should I use?

Re: NGHR7BFL [message #24721 is a reply to message #24711] Tue, 28 June 2022 09:40
Janet-DHS
Messages: 258
Registered: April 2022
Senior Member
Following is a response from DHS Research & Data Analysis Director, Tom Pullum:

You are using the HR file, which has one very wide record per household. The subscripts 01, 02, ... refer to the line number of the person in the household. Life will be easier if you use the PR file, which has one record per person. In the PR file, the line number is given as hvidx, and hv106 does not have a subscript.
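To make the HR/PR relationship concrete, here is a minimal Python sketch of how subscripted HR columns map onto one row per person. It is an illustration only: the records are made up, the column spelling `hv106_01` is an assumed rendering of the subscripted names, and the helper `hr_to_person_rows` is hypothetical. In practice, using the PR file directly is simpler.

```python
# Toy stand-in for two HR records; real HR files have many more columns.
# The spelling "hv106_01" is an assumption about how the subscript appears.
hr_rows = [
    {"hhid": "1 1", "hv106_01": 2, "hv106_02": 1, "hv106_03": None},
    {"hhid": "1 2", "hv106_01": 0, "hv106_02": None, "hv106_03": None},
]

def hr_to_person_rows(rows, stem="hv106", max_lines=3):
    """Turn one-row-per-household records into one-row-per-person records."""
    person_rows = []
    for row in rows:
        for line in range(1, max_lines + 1):
            value = row.get(f"{stem}_{line:02d}")
            if value is None:  # no household member on this line
                continue
            # hvidx carries the line number, as in the PR file
            person_rows.append({"hhid": row["hhid"], "hvidx": line, stem: value})
    return person_rows

pr_like = hr_to_person_rows(hr_rows)
for person in pr_like:
    print(person)
```

Each output row carries the household identifier, the line number as hvidx, and an unsubscripted hv106, which mirrors the PR layout described above.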
Re: NGHR7BFL [message #24723 is a reply to message #24721] Tue, 28 June 2022 11:02
Mayo
Thanks so much!

Dr. M.
Re: NGHR7BFL [message #24967 is a reply to message #24721] Wed, 10 August 2022 16:51
Mayo
Hi everyone,

I have another question. I am aware that there are different weights for different sample selections/units of analysis. Households will be my unit of analysis, and I would like to know how to weight this survey data properly in SPSS.

Thanks,


Dr. M.
Re: NGHR7BFL [message #24976 is a reply to message #24967] Fri, 12 August 2022 08:30
Shireen-DHS
Messages: 131
Registered: August 2020
Location: USA
Senior Member
Hello,

When using household data (HR file) or population data (PR file), use the hv005 weight, but divide it by 1 million before applying it.
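As a rough illustration of the rescaling step (in Python rather than SPSS), the sketch below divides hv005 by 1,000,000 and uses the result to compute a weighted proportion. The records, the indicator name `has_electricity`, and the helper `weighted_proportion` are all made up for the example. It does not handle clusters or strata, which proper standard errors would require.

```python
# Toy household records; the hv005 values are invented for illustration.
households = [
    {"hv005": 1250000, "has_electricity": 1},
    {"hv005": 750000, "has_electricity": 0},
    {"hv005": 1000000, "has_electricity": 1},
]

def weighted_proportion(rows, flag):
    """Weighted share of rows where row[flag] == 1, using hv005 / 1,000,000."""
    total = 0.0
    hits = 0.0
    for row in rows:
        wt = row["hv005"] / 1_000_000  # rescale the stored integer weight
        total += wt
        hits += wt * row[flag]
    return hits / total

unweighted = sum(r["has_electricity"] for r in households) / len(households)
weighted = weighted_proportion(households, "has_electricity")
print(f"unweighted: {unweighted:.3f}, weighted: {weighted:.3f}")
```

A proportion comes out the same whether or not the weight is divided by 1,000,000, because the factor cancels. The rescaling matters when weighted counts are reported.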

This YouTube video explains the different weights in DHS data and how to apply them in SPSS: https://www.youtube.com/watch?v=NNg8HD_lKow&t=88s

I also wanted to share another resource. We have standardized code to produce all DHS indicators in SPSS, organized into chapters by topic. In the chapter of interest you will find "tables" syntax files that tabulate these indicators while applying weights. If you use this code, please be sure to read the notes in the readme file and the main file.
https://github.com/DHSProgram/DHS-Indicators-SPSS

Thank you.

Best,

Shireen Assaf
The DHS Program
Re: NGHR7BFL [message #24993 is a reply to message #24711] Tue, 16 August 2022 09:32
Mayo
Thanks again.


Another question: how do I know whether data are missing at random (MAR) or missing completely at random (MCAR)?


Dr. M.
Re: NGHR7BFL [message #25127 is a reply to message #24993] Thu, 01 September 2022 16:34
Janet-DHS
Following is a response from DHS Research & Data Analysis Director, Tom Pullum:

You are asking questions that go beyond DHS data and therefore beyond the scope of the forum.

DHS has very low levels of "missing" data. A blank or dot in a DHS data file should be interpreted as Not Applicable (NA). If you are thinking of "missing" as "don't know" or "refused" or something like that, we use special codes such as 8, 9, 9994, etc., depending on the variable. The frequencies of those codes are usually very low.

In general, to test whether "missing" is random with respect to some potential covariate, you construct a binary variable that is 1 if "missing" and 0 if "not missing" and do a logit regression of that variable on the covariate, to see whether there is a statistically significant relationship.
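The test described above can be sketched in Python for the simplest case, a single 0/1 covariate. With one binary covariate, the logit slope equals the log odds ratio from the 2x2 table of "missing" by covariate, and its Wald standard error is the square root of the summed reciprocal cell counts, so no regression library is needed. Everything here is an assumption for illustration: the data are made up, the codes 8 and 9 are treated as "missing", the covariate is a hypothetical urban flag, and weights and survey design are ignored.

```python
import math

MISSING_CODES = {8, 9}  # assumed special codes; check the labels of your variable

# (value of the variable of interest, urban flag) for 20 made-up people
records = (
    [(8, 1)] * 2 + [(1, 1)] * 8 +  # urban: 2 "missing", 8 not missing
    [(9, 0)] * 4 + [(2, 0)] * 6    # rural: 4 "missing", 6 not missing
)

def logit_slope_binary(rows):
    """Logit slope of the missing indicator on a 0/1 covariate, via the 2x2 table."""
    counts = {(m, x): 0 for m in (0, 1) for x in (0, 1)}
    for value, x in rows:
        is_missing = 1 if value in MISSING_CODES else 0  # the binary indicator
        counts[(is_missing, x)] += 1
    a, b = counts[(1, 1)], counts[(0, 1)]  # covariate = 1: missing, not missing
    c, d = counts[(1, 0)], counts[(0, 0)]  # covariate = 0: missing, not missing
    slope = math.log((a * d) / (b * c))            # log odds ratio
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # Wald standard error
    return slope, se, slope / se

slope, se, z = logit_slope_binary(records)
print(f"slope={slope:.3f}, se={se:.3f}, z={z:.2f}")
```

A z statistic well inside plus or minus 1.96 gives no evidence at the 5% level that "missing" depends on the covariate. For continuous or multiple covariates, a full logistic regression in a statistics package is the natural tool. The shortcut also fails if any cell of the table is empty.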
Re: NGHR7BFL [message #25350 is a reply to message #25127] Mon, 10 October 2022 14:59
Mayo
When using the PR file, what is the difference between the HV270 and HV270a variables? Which is best to use for a binary logistic regression?
Re: NGHR7BFL [message #25360 is a reply to message #25350] Wed, 12 October 2022 09:37
Janet-DHS
Following is a response from DHS staff member Tom Pullum:

Beginning with DHS-7, most surveys include hv270a in the PR file, v190a in the IR, KR, and BR files, and mv190a in the MR file. The "a" indicates that the wealth quintiles are residence-adjusted, i.e. calculated separately for urban and rural areas. A problem with the original, unadjusted wealth quintiles is that, in most surveys, there are very few households in the top quintile in rural areas and very few households in the bottom quintile in urban areas. If you use the unadjusted wealth quintiles in a model, much of the information is actually an urban/rural distinction. If you use the unadjusted wealth quintiles in a model AND include urban/rural (hv025, etc.), then you have a better separation of wealth and residence, but the model may run into estimation issues because there are (typically) so few cases in the two combinations I mentioned.

Bottom line: if your model includes urban/rural, which it probably should, then you may want to use the adjusted wealth quintiles rather than the unadjusted. But there's no law saying you have to do that. It would be good to tell the reader which version you are using.
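One way to see the sparse-cell problem in a given survey is to cross-tabulate residence by unadjusted wealth quintile and flag the thin cells. The Python sketch below does this on made-up counts, assuming hv025 is coded 1 = urban and 2 = rural and hv270 is coded 1 (poorest) to 5 (richest). The threshold `min_count=5` and the helper `sparse_cells` are arbitrary choices for illustration, not a DHS rule.

```python
from collections import Counter

# Made-up household counts per unadjusted wealth quintile (hv270), by residence.
urban_counts = {1: 1, 2: 6, 3: 10, 4: 20, 5: 30}
rural_counts = {1: 35, 2: 25, 3: 12, 4: 6, 5: 2}

# Expand to one (hv025, hv270) pair per household: hv025 1 = urban, 2 = rural.
pairs = [(1, q) for q, n in urban_counts.items() for _ in range(n)]
pairs += [(2, q) for q, n in rural_counts.items() for _ in range(n)]

def sparse_cells(rows, min_count=5):
    """Return the (residence, quintile) cells seen fewer than min_count times."""
    counts = Counter(rows)
    cells = [(res, q) for res in (1, 2) for q in range(1, 6)]
    return {cell: counts[cell] for cell in cells if counts[cell] < min_count}

thin = sparse_cells(pairs)
print(thin)
```

In this toy data the thin cells are the urban bottom quintile and the rural top quintile, the two combinations mentioned above.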
Re: NGHR7BFL [message #25585 is a reply to message #24711] Sun, 13 November 2022 10:43
Mayo
Hi Janet,

When using the NGPR7BSV file, I would like to determine the unweighted sample size. Would I use HV002 for this?
Re: NGHR7BFL [message #25602 is a reply to message #25585] Wed, 16 November 2022 09:10
Janet-DHS
Following is a response from DHS staff member Tom Pullum:

The best variable for this purpose is hv000 (lowercase in Stata, HV000 in SPSS). That variable takes only one value, the string "NG7". If you tabulate that variable without weights, you get 188,010 cases in the PR file, i.e. individuals in the household survey. If you do the same thing in the HR file, which has households as units, you get 40,427 households.