The DHS Program User Forum
Discussions regarding The DHS Program data and results
How to treat missing data in analysis [message #24953] Tue, 09 August 2022 03:14
India2222
Messages: 1
Registered: July 2022

I am using data from the India 2019-20 NFHS survey for my thesis, concentrating on variables related to HIV knowledge, attitude and behaviour. Some of these variables have a large share of missing values; for example, V769 ('can get a condom') is missing for over 30% of cases.

I am relatively new to dealing with missing data. Should I run my analysis as normal (chi-square, binary logistic regression) and report the missing values in my write-up, perhaps discussing the reasons for them?

Do you know whether the missing data for the various HIV knowledge, attitude and behaviour variables is random or a result of the survey design?

EDIT: I believe the values are often missing because respondents who answered 'no' to 'ever heard of HIV/AIDS' were not asked any further questions. Should I therefore include only those who answered 'yes' in my study?

Many thanks

[Updated on: Tue, 09 August 2022 04:06]


Re: How to treat missing data in analysis [message #24981 is a reply to message #24953] Fri, 12 August 2022 16:51
Janet-DHS
Messages: 685
Registered: April 2022
Senior Member
The following is a response from DHS Research & Data Analysis Director, Tom Pullum:

By "missing" you are probably referring to a dot in Stata ("."), which is equivalent to a blank, and is the Not Applicable (NA) code. It means the question or variable did not apply to the case. If you go back to the questionnaire and find the question for "can get a condom" I expect you will find that it is preceded by filters and skips such that the question is only asked of some subpopulation of women. For other women, the question is not asked and is coded with a dot for NA.
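One way to check whether the NA values come from a skip pattern is to tabulate missingness in the target variable against the suspected filter question. The sketch below is only an illustration in plain Python on made-up records, not real NFHS data: `None` stands in for Stata's ".", the variable name `v751` for 'ever heard of HIV/AIDS' is an assumption, and the helper functions `missing_by_filter` and `is_structural` are hypothetical. The actual filters for V769 should be confirmed in the questionnaire.

```python
from collections import Counter

# Toy records, NOT real survey data. None plays the role of Stata's "." (NA).
# "v751" is an assumed name for the filter question 'ever heard of HIV/AIDS'.
records = [
    {"v751": 1, "v769": 1},
    {"v751": 1, "v769": 0},
    {"v751": 1, "v769": 1},
    {"v751": 0, "v769": None},
    {"v751": 0, "v769": None},
    {"v751": 1, "v769": 0},
]


def missing_by_filter(rows, filter_var, target_var):
    """Count missing vs. observed values of target_var within each level of filter_var."""
    counts = Counter()
    for row in rows:
        status = "missing" if row[target_var] is None else "observed"
        counts[(row[filter_var], status)] += 1
    return counts


def is_structural(rows, filter_var, target_var, asked_value=1):
    """Return True if target_var is missing exactly for the rows that were filtered out."""
    return all(
        (row[target_var] is None) == (row[filter_var] != asked_value)
        for row in rows
    )


counts = missing_by_filter(records, "v751", "v769")
for (filter_value, status), n in sorted(counts.items()):
    print(f"v751={filter_value}: {status} = {n}")

print("Missingness fully explained by the filter:",
      is_structural(records, "v751", "v769"))
```

If every missing value falls among respondents who were skipped, the NA codes are structural rather than random, and the analysis population is simply the subgroup that was asked the question.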

Sometimes in DHS data files you will find a code "9" or "99" etc. that is not included in the label for the variable. When you find these codes, you will probably want to drop the case from a tabulation or calculation of a mean, etc., but they are actually very rare.
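As a rough illustration of dropping such cases before a calculation, the sketch below removes both NA values and special codes from a yes/no variable before computing a proportion. The data are made up, the set `SPECIAL_CODES` and the helper `valid_values` are hypothetical, and the codes themselves are an assumption that must be checked variable by variable, since 9 or 99 can be a legitimate value for some variables (a count or an age, for example).

```python
# Codes assumed to mean "missing" for this particular variable (check the recode manual).
SPECIAL_CODES = {9, 99}


def valid_values(values, special_codes=SPECIAL_CODES):
    """Keep only substantive answers: drop NA (None) and the special codes."""
    return [v for v in values if v is not None and v not in special_codes]


# Made-up responses to a 0/1 question; None stands in for Stata's ".".
responses = [1, 0, 1, 9, None, 0, 1]

kept = valid_values(responses)
share_yes = sum(kept) / len(kept)

print(f"Cases kept: {len(kept)} of {len(responses)}")
print(f"Share answering yes among valid cases: {share_yes:.2f}")
```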