Survey design in R [message #29599] |
Mon, 08 July 2024 10:18 |
RobertB
Messages: 4 Registered: June 2024
|
Member |
|
|
Hello all,
Has anyone encountered the warning below when creating tabulations/statistical summaries of an outcome across various socio-economic characteristics? This is after accounting for the survey design. The function used for the statistical summaries is tbl_svysummary() from the gtsummary package.
Warning: There were 48 warnings in `mutate()`.
The first warning was:
ℹ In argument: `df_stats = pmap(...)`.
Caused by warning in `svymean.survey.design2()`:
! Sample size greater than population size: are weights correctly scaled?
ℹ Run dplyr::last_dplyr_warnings() to see the 47 remaining warnings.
Code used for the survey design:
svydesign(id=mydata$hv021,data=mydata, strata=mydata$hv023,
weight=mydata$wt,nest=T)
options(survey.lonely.psu="adjust")
Notably, when I run a similar analysis in Stata I do not get the warning, and despite the warning the proportions produced in R are similar to those produced in Stata.
Should I be concerned about the warning in R, or can I ignore it?
Thanks in advance.
Best wishes,
RobertB
|
|
|
Re: Survey design in R [message #29601 is a reply to message #29599] |
Tue, 09 July 2024 10:15 |
Bridgette-DHS
Messages: 3151 Registered: February 2013
|
Senior Member |
|
|
Following is a response from Senior DHS staff member, Ali Roghani:
The warning you are encountering when using tbl_svysummary() from the gtsummary package is likely related to how the weights are scaled in your survey design. In R, the svymean.survey.design2() function is quite strict about weight scaling. To eliminate the warning, you can inflate the weights column so that it sums to the number of individuals it actually represents, rather than the number of survey respondents. Here's how we can do that:
# Adjust weights to sum to the actual population size
total_population <- 25000000 # Replace with your actual population size
mydata$wt_scaled <- mydata$wt * total_population / sum(mydata$wt)
svy_design <- svydesign(id = ~hv021, data = mydata, strata = ~hv023, weights = ~wt_scaled, nest = TRUE)
Using the adjusted weights in your svydesign() may eliminate the warning.
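As a rough illustration of why the proportions can match Stata even while the warning appears, the sketch below uses made-up toy data (the data frame `toy`, its columns, and the population figure are all hypothetical placeholders, not values from any real survey). It compares the sum of the weights with the number of observations, applies the same rescaling as above, and checks that a weighted proportion is unchanged when every weight is multiplied by the same constant. It uses base R only and does not reproduce the survey design itself.

```r
# Toy data: hypothetical values for illustration only
set.seed(1)
toy <- data.frame(
  outcome = rbinom(200, 1, 0.3),   # binary outcome
  wt      = runif(200, 0.5, 1.5)   # stand-in for normalized sample weights
)

# Diagnostic: do the weights sum to roughly the sample size
# or to a population total?
weight_sum <- sum(toy$wt)
n_obs      <- nrow(toy)

# Rescale so the weights sum to an assumed population total
total_population <- 25000000       # placeholder; replace with your own figure
toy$wt_scaled <- toy$wt * total_population / sum(toy$wt)

# A weighted proportion does not change when all weights are
# multiplied by the same constant
p_original <- weighted.mean(toy$outcome, toy$wt)
p_scaled   <- weighted.mean(toy$outcome, toy$wt_scaled)

cat("sum of weights:", weight_sum, " observations:", n_obs, "\n")
cat("proportion (original weights):", p_original, "\n")
cat("proportion (scaled weights):  ", p_scaled, "\n")
```

If the two proportions agree, as they should, the rescaling has only changed the total the weights add up to, not the point estimates.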