The DHS Program User Forum: Dataset use (other programs) » filtering hhid in the HR file using R after importing in haven and using dplyr

filtering hhid in the HR file using R after importing in haven and using dplyr [message #13235]

Sun, 08 October 2017 11:04

newtoDHS2017
Messages: 6
Registered: September 2017
Location: Europe

Member

Dear forum,

I have a basic question about how to correctly select cases based on the hhid field in the HR file. I imported the Rwanda 2015-16 DHS HR (household) file, in Stata format, into R using the haven package, and the import seems to have worked well.

However, I have been trying, without success, to select certain household records using the case identifier hhid (the first field in the HR file). I noticed that in R, hhid is a character field with values such as "1 1", "1 2", "1 10", etc., because it is a combination of the cluster number (hv001) and the household number (hv002).

I used the filter command of the dplyr package to get just the first case, but it does not work:

t <- select(HR_dataset, hhid == "11")

It does not return the record.

Since the way some hhid values are displayed suggests that the field contains spaces, I also tried putting a space between the two 1s, but this makes no difference:

t <- select(HR_dataset, hhid == "1 1")

Can you let me know what I did wrong, or perhaps I should filter the cases using the cluster number and the household number (hv001,hv002) instead?

Thanks a lot for your help,

newtoDHS2017

Re: filtering hhid in the HR file using R after importing in haven and using dplyr [message #13252 is a reply to message #13235]

Mon, 09 October 2017 07:54

Bridgette-DHS
Messages: 3230
Registered: February 2013

Senior Member

Following is a response from Senior DHS Stata Specialist, Tom Pullum:

The easiest solution will be to use hv001, hv002, and hvidx (cluster id, household id, and line number, respectively), all of which are numeric rather than strings.
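
As a rough illustration of this suggestion, the sketch below subsets one household by its numeric keys. It uses a small made-up data frame in place of the imported HR file, and the space-padded hhid values are an assumption about how the string is stored, not something confirmed in this thread. Note that dplyr's select() picks columns, whereas filter() keeps rows, so filter() is the function to use for selecting cases.

```r
library(dplyr)

# Toy stand-in for the imported HR file (hypothetical values; the real data
# frame would come from haven::read_dta()). The padding in hhid is assumed.
HR_dataset <- data.frame(
  hhid  = c("       1   1", "       1   2", "       1  10"),
  hv001 = c(1, 1, 1),
  hv002 = c(1, 2, 10),
  stringsAsFactors = FALSE
)

# Select the household in cluster 1 with household number 1 by numeric keys.
by_keys <- filter(HR_dataset, hv001 == 1, hv002 == 1)

# If hhid has to be used instead, collapse runs of whitespace and trim the
# ends before comparing, so that padded values reduce to "1 1".
by_hhid <- filter(HR_dataset, trimws(gsub("\\s+", " ", hhid)) == "1 1")
```

In the HR file each row is one household, so hv001 and hv002 together should be enough to identify a record. The numeric route avoids having to guess how many spaces the string identifier contains.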

Re: filtering hhid in the HR file using R after importing in haven and using dplyr [message #13253 is a reply to message #13252]

Mon, 09 October 2017 10:06

newtoDHS2017
Messages: 6
Registered: September 2017
Location: Europe

Member

Dear Bridgette and Tom,

Many thanks for this. I used the cluster and household numbers, and the selection works fine now.

Thanks a lot for your response.
