The DHS Program User Forum
Discussions regarding The DHS Program data and results
R DHS aggregation by Indigenous Identity in Guatemala [message #24063] Tue, 15 February 2022 14:52
hmwoods02
Messages: 2
Registered: February 2022

I am trying to use DHS data in R to produce health statistics broken down by tribal group in Guatemala. I have identified the variables SETID (self-identification), s114 (language learned to speak), and s117 (languages spoken at home) as relevant for identifying indigenous identity; however, I am not sure how to construct such a complex variable in R. There are 25 indigenous groups identified in the data.

Re: R DHS aggregation by Indigenous Identity in Guatemala [message #24084 is a reply to message #24063] Mon, 21 February 2022 08:37
Bridgette-DHS
Messages: 3064
Registered: February 2013
Senior Member
Following is a response from DHS Senior Sampling Specialist, Mahmoud Elkasabi:

You can try the following code to combine the variables into a single grouping variable (tribalg).

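library(dplyr)  # provides %>%, mutate() and group_indices()
# Assign one integer ID (tribalg) to each unique combination of setid and s114
GUBR71 <- GUBR71 %>%
  mutate(tribalg = group_indices(., setid, s114))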
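The grouping step above can also be sketched with group_by() and cur_group_id(), which may be preferable on recent dplyr versions (1.0.0 and later), where passing columns directly to group_indices() is, as far as I know, deprecated. This is only an illustration: the data frame `toy` and its values are made up and stand in for GUBR71, and only the column names setid and s114 come from the thread.

```r
library(dplyr)

# Hypothetical stand-in for GUBR71; the values are invented for illustration.
toy <- tibble(
  setid = c(1, 1, 2, 2, 1),
  s114  = c(3, 3, 3, 4, 4)
)

# Give every unique (setid, s114) combination its own integer ID.
toy <- toy %>%
  group_by(setid, s114) %>%
  mutate(tribalg = cur_group_id()) %>%
  ungroup()

# Count the rows that fall into each combined group.
group_sizes <- toy %>%
  count(tribalg, setid, s114)
```

The IDs in tribalg are arbitrary integers assigned in sorted order of the grouping columns, so they carry no labels of their own; keeping setid and s114 next to tribalg, as in group_sizes, makes each group identifiable.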