| 
		
			| Survey Design Issue  [message #22488] | Fri, 19 March 2021 17:44  |  
			| 
				
				
					|  hamid Messages: 9
 Registered: November 2018
 | Member |  |  |  
	| I have recently run into a problem setting up the DHS survey design in R for the Afghanistan 2015 Standard DHS (DHS-7). 
 After loading the Stata format individual recode dataset (AFIR71FL) into R:
 
 library(foreign)
 library(survey)
 
 data <- read.dta("AFIR71FL.DTA")
 data$wt <- data$v005 /1000000
 
 I simply run the following command (as indicated in the DHS documentation):
 
 DHSdesign<-svydesign(id = data$v021, strata = data$v023, weights = wt,  data=data)
 
 and I get the following error:
 
 Error in svydesign.default(id = data$v021, strata = data$v023, weights = wt, data = data) :
 Clusters not nested in strata at top level; you may want nest=TRUE.
 
 which is basically telling me that PSU ids are not unique across Strata (Province +rural/urban).
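 
 As a rough way to see which PSUs trigger this message, one could count the distinct strata observed within each cluster. The sketch below uses a small made-up data frame (`toy`, with invented values) in place of the real recode file; only the variable names v021 and v023 come from the code above.
 
```r
# Toy data standing in for the recode file; all values are invented.
toy <- data.frame(
  v021 = c(1, 1, 2, 2, 3, 3),             # cluster (PSU) id
  v023 = c("A", "A", "B", "B", "C", "D"), # stratum; cluster 3 spans two strata
  stringsAsFactors = FALSE
)

# Count the distinct strata observed within each cluster.
strata_per_cluster <- tapply(toy$v023, toy$v021, function(s) length(unique(s)))

# Any cluster that appears in more than one stratum breaks the nesting check.
bad_clusters <- names(strata_per_cluster)[strata_per_cluster > 1]
print(bad_clusters)
```
 
 On the real data, the same two lines applied to data$v021 and data$v023 should list the offending cluster ids, if any.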
 
 Some additional notes:
 - The same code worked just fine over the past month (I was working on it every day);
 - I tried a fresh install of R + packages without any success;
 - R & packages are up to date;
 - I get the same error using different machines;
 - The AfDHS file in use is the most recent;
 - Using the option "nest=T", as suggested by R, is not a working solution as it creates problems when running "svyglm" regressions.
 
 I am wondering whether this is a bug in the R survey package or something else.
 
 Thanks,
 Hamid
 [Updated on: Sat, 20 March 2021 05:33] |  
	|  |  | 
	| 
		
			| Re: Survey Design Issue  [message #22510 is a reply to message #22488] | Mon, 22 March 2021 16:32   |  
			| 
				
				
					|  Bridgette-DHS Messages: 3230
 Registered: February 2013
 | Senior Member |  |  |  
	| Following is a response from DHS Senior Sampling Specialist, Mahmoud Elkasabi: 
 
 Apparently this is due to an error in the IR dataset (I assume it exists in the other ones as well; I haven't checked). One woman in cluster 476 is coded as rural in v025 although the cluster is urban. This causes the error message.
 
 Here are two possible solutions:
 
 1-	Use the nest option after you re-construct v021 and v023:
 
 IRdata$STRAT <- as.integer(factor(with(IRdata, paste(v024, v025))))
 IRdata$CLUST <- as.integer(factor(with(IRdata, paste(v021))))
 DHSdesign<-svydesign(id = IRdata$CLUST, strata = IRdata$STRAT, weights = IRdata$v005, data=IRdata, nest = TRUE)
 
 2-	Recode a new v025 variable (say v025r) where you assign all cases with v021=476 to v025r=1 (otherwise v025r=v025) and then proceed with the code below:
 
 IRdata$STRAT <- as.integer(factor(with(IRdata, paste(v024, v025r))))
 IRdata$CLUST <- as.integer(factor(with(IRdata, paste(v021))))
 DHSdesign<-svydesign(id = IRdata$CLUST, strata = IRdata$STRAT, weights = IRdata$v005, data=IRdata)
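 
 The v025r recode in option 2 is described in words only, so here is a minimal sketch of how it might look. It runs on a toy data frame with invented values, and it assumes v025 is stored numerically with 1 = urban and 2 = rural. If the file was loaded with read.dta and its default convert.factors = TRUE, v025 would be a factor and the recode would need to use the labels instead.
 
```r
# Toy stand-in for the IR file; all values are invented for illustration.
IRdata <- data.frame(
  v021 = c(475, 476, 476, 476, 477),  # cluster id
  v024 = c(1, 1, 1, 1, 2),            # region
  v025 = c(2, 1, 1, 2, 2)             # assumed coding: 1 = urban, 2 = rural
)

# Copy v025, forcing every case in cluster 476 to urban (1).
IRdata$v025r <- ifelse(IRdata$v021 == 476, 1, IRdata$v025)

# Rebuild the stratum id from region and the corrected urban/rural flag.
IRdata$STRAT <- as.integer(factor(paste(IRdata$v024, IRdata$v025r)))

# Check: each cluster should now fall in exactly one stratum.
n_strata <- tapply(IRdata$STRAT, IRdata$v021, function(s) length(unique(s)))
print(all(n_strata == 1))
```
 
 If the final check is TRUE on the real data, svydesign should no longer need nest = TRUE.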
 |  
	|  |  | 
	| 
		
			| Re: Survey Design Issue  [message #22511 is a reply to message #22510] | Tue, 23 March 2021 05:12  |  
			| 
				
				
					|  hamid Messages: 9
 Registered: November 2018
 | Member |  |  |  
	| Thanks a lot. This actually works! 
 I prefer the second approach. The issue does not affect the couples recode (probably because this woman is not listed there).
 As I wrote earlier, the nest=T option creates problems when running some R functions (e.g., lmtest::waldtest fails with an error about unequal samples between models).
 
 Kind regards,
 Hamid
 
 |  
	|  |  |