k fold cross validation for logistic regression in R [message #26199]
Thu, 16 February 2023 03:53
dhivvyajp@am.amrita.edu
Dear Experts,
I am working on the DHS7 dataset. I was able to fit a logistic regression using a 70% training / 30% testing split. When I tried to do k-fold cross-validation instead of the 70:30 split, I came across the surveyCV package, but I am getting the following error. Kindly let me know how I can fix this issue.
> set.seed(2023)
> svylogistic <- svyglm(formula = InternetUsage~RuralOrUrban+AgeGroup+WealthIndex+SchoolingCompleted+Religion+Caste+MaritalStatus+Occupation+Gender+Literacy+OwnsMobile, design=my_design, family=quasibinomial())
> cv.svyglm(svylogistic, nfolds=3, na.rm = FALSE)
Error in if (clusterID %in% c("0", "1")) { : the condition has length > 1
Re: k fold cross validation for logistic regression in R [message #26409 is a reply to message #26199]
Fri, 17 March 2023 03:45
dhivvyajp@am.amrita.edu
I also tried the following, but I was not able to fix the error.
> cv.svy(train, formulae = " InternetUsage~RuralOrUrban+AgeGroup+WealthIndex+SchoolingCompleted+Religion+Caste+MaritalStatus+Occupation+Gender+Literacy+OwnsMobile ", method = "logistic", nfolds=3, strataID = train$strata, clusterID = train$Cluster, nest = T, weightsID = train$samplewt)
Error in .subset2(x, i, exact = exact) : no such index at level 1
> cv.svy(train, formulae = " InternetUsage~RuralOrUrban+AgeGroup+WealthIndex+SchoolingCompleted+Religion+Caste+MaritalStatus+Occupation+Gender+Literacy+OwnsMobile ", method = "logistic", nfolds=3, strataID = train$strata, clusterID = train$Cluster, nest = F, weightsID = train$samplewt)
Error in .subset2(x, i, exact = exact) : no such index at level 1
> cv.svyglm(svylogistic, nfolds=3,na.rm=FALSE)
Error in if (clusterID %in% c("0", "1")) { : the condition has length > 1
Can anyone help me fix this error?
Re: k fold cross validation for logistic regression in R [message #26502 is a reply to message #26409]
Mon, 27 March 2023 08:05
Bridgette-DHS
The following is a response from DHS Senior Analysis & Research Manager Shireen Assaf:
# Install (only needed once) and load the package you need
install.packages("survey")
library(survey)
# Set up your survey design.
# To identify the survey design, you need three variables: weight, PSU, and strata.
# Create the sampling weight variable (v005 is stored multiplied by 1,000,000).
IRdata$wt <- IRdata$v005/1000000
# Specify the design variables as formulas that refer to columns of IRdata
mysurvey <- svydesign(id=~v021, data=IRdata, strata=~v022, weights=~wt, nest=TRUE)
options(survey.lonely.psu="adjust")
# Now you can use the svy commands in the survey package with the "mysurvey" design object. Check the survey package documentation for the available commands.
# For example, a weighted table of variable v313 (FP use), once IRdata has been loaded. You can also use svyby, svyglm, etc.
svytable(~v313, mysurvey)
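Regarding the cross-validation errors reported earlier in this thread, here is a minimal sketch of how surveyCV could be called. It assumes that the "no such index at level 1" error comes from passing column vectors (train$strata, etc.) where cv.svy expects the column names as strings, and that cv.svyglm needs a design whose variables were given as formulas; both are likely causes, not confirmed ones. The data frame below is simulated purely for illustration (the column names follow the posts above, the values are made up), and only two predictors are used to keep it short.

```r
# Sketch only: simulated data stand in for the DHS file; all values are made up.
library(survey)
library(surveyCV)

set.seed(2023)

# Hypothetical design: 4 strata x 6 clusters per stratum x 25 respondents per cluster
n_strata <- 4
n_clusters <- 6
n_per <- 25
train <- data.frame(
  strata  = rep(seq_len(n_strata), each = n_clusters * n_per),
  Cluster = rep(seq_len(n_strata * n_clusters), each = n_per)
)
n <- nrow(train)
train$samplewt    <- runif(n, 0.5, 2)
train$WealthIndex <- factor(sample(c("Poor", "Middle", "Rich"), n, replace = TRUE))
train$OwnsMobile  <- rbinom(n, 1, 0.6)
# Binary outcome loosely related to one predictor
train$InternetUsage <- rbinom(n, 1, plogis(-1 + 1.2 * train$OwnsMobile))

# (a) cv.svy(): give the NAMES of the design columns as strings,
#     not the columns themselves (i.e. "strata", not train$strata)
cv_result <- cv.svy(
  train,
  formulae  = "InternetUsage ~ WealthIndex + OwnsMobile",
  method    = "logistic",
  nfolds    = 3,
  strataID  = "strata",
  clusterID = "Cluster",
  nest      = TRUE,
  weightsID = "samplewt"
)
print(cv_result)

# (b) cv.svyglm(): build the design with formulas (~Cluster), then fit and cross-validate
my_design <- svydesign(id = ~Cluster, strata = ~strata, weights = ~samplewt,
                       nest = TRUE, data = train)
svylogistic <- svyglm(InternetUsage ~ WealthIndex + OwnsMobile,
                      design = my_design, family = quasibinomial())
cv_result_glm <- cv.svyglm(svylogistic, nfolds = 3)
print(cv_result_glm)
```

With real DHS data, the same pattern would use your own column names (for example "v022", "v021" and "wt" from the design above) in place of the simulated ones. Each stratum should contain at least as many clusters as the number of folds.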
[Updated on: Mon, 27 March 2023 08:05]