weighting data in regression analysis [message #30297] |
Wed, 30 October 2024 11:15 |
Hejie Wang
Messages: 15 Registered: July 2024
|
Member |
|
|
I want to explore the main determinants of childhood anemia, using variables from the KR (Children's Recode) file. I do most of my analysis in R, and my code is as follows:
DHS_data$wt <- DHS_data$v005/1000000  # v005 is stored with six implied decimal places
model <- glm(formula, data = DHS_data, family = binomial, weights = wt)
Warning message:
In eval(family$initialize, rho) : non-integer #successes in a binomial glm!
As you can see, the weighted model always produces this warning, but the warning goes away when I drop the weights. How should I set my weights correctly?
My second question is whether it is reasonable to treat the cluster and the country as random effects in a multilevel logistic regression.
Finally, I use the lme4 package for the multilevel analysis, but each model takes a very long time to run because the data include about 400,000 observations. Is there any way to make my code run faster?
|
|
|
Re: weighting data in regression analysis [message #30306 is a reply to message #30297] |
Thu, 31 October 2024 08:13 |
Bridgette-DHS
Messages: 3199 Registered: February 2013
|
Senior Member |
|
|
Following is a response from Senior DHS staff member, Tom Pullum:
You have combined too many questions, and these are mainly questions about R syntax.
It appears that you have pooled many surveys into a single file and you are treating "survey" or "country" as a random effect. There are profound differences in anemia between countries, and changes within countries over time. I recommend that you analyze each survey or country separately.
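As a rough illustration of the "one model per survey" suggestion, the sketch below splits a pooled data frame by survey and fits a separate logistic regression to each piece. The data are simulated, and the column names `survey`, `anemic`, and `age_months` are hypothetical stand-ins, not actual DHS variable names. Weights and the survey design are left out here only to keep the example short.

```r
set.seed(2)

# Simulated stand-in for a pooled file; all column names are hypothetical
pooled <- data.frame(
  survey     = rep(c("A", "B", "C"), each = 300),      # survey identifier
  anemic     = rbinom(900, 1, 0.4),                    # 1 = anemic, 0 = not
  age_months = sample(6:59, 900, replace = TRUE)       # child's age in months
)

# Fit one logistic regression per survey instead of one pooled model
fits <- lapply(split(pooled, pooled$survey), function(d) {
  glm(anemic ~ age_months, data = d, family = binomial)
})

# One row of coefficients per survey, for side-by-side comparison
coefs <- t(sapply(fits, coef))
print(coefs)
```

Comparing the rows of `coefs` shows whether an association looks similar across surveys before any decision to pool is made.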
Perhaps other users can help.
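On the warning itself: `family = binomial` checks that the weighted number of successes is an integer, so any non-integer weight triggers it, while `quasibinomial` skips that check and gives the same point estimates. One possible approach, sketched here and not an official DHS recommendation, is to declare the sample design with the `survey` package and fit the model with `svyglm()`. The data below are simulated. Using `v021` as the cluster and `v022` as the stratum is an assumption that should be checked against the documentation for each survey, and `anemic` and `age_months` are hypothetical variable names.

```r
library(survey)  # assumes the 'survey' package is installed

set.seed(1)
n <- 2000

# Simulated stand-in for a single-survey KR file
toy <- data.frame(
  v021       = sample(1:100, n, replace = TRUE),          # cluster (PSU)
  v005       = sample(500000:1500000, n, replace = TRUE), # raw sample weight
  anemic     = rbinom(n, 1, 0.4),                         # hypothetical outcome
  age_months = sample(6:59, n, replace = TRUE)            # hypothetical covariate
)
toy$v022 <- (toy$v021 - 1) %/% 10 + 1  # 10 clusters per stratum (assumed layout)
toy$wt   <- toy$v005 / 1e6             # v005 has six implied decimal places

# Declare clusters, strata and weights so standard errors reflect the design
des <- svydesign(ids = ~v021, strata = ~v022, weights = ~wt, data = toy)

# quasibinomial avoids the "non-integer #successes" warning
fit <- svyglm(anemic ~ age_months, design = des, family = quasibinomial())
summary(fit)
```

Simply passing `weights = wt` to `glm()` with `family = quasibinomial` would also silence the warning, but the standard errors would then ignore the clustering and stratification of the sample.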
|
|
|