The DHS Program User Forum
Discussions regarding The DHS Program data and results
Missing Region for 2015 SPSS India dataset [message #25914] Tue, 03 January 2023 08:31
olympiaca
Hi,

I am loading the 2015 India DHS individual file (.SAV) into RStudio. Once it loads, I see 32757 respondents. When I check which regions these respondents are from, they come only from Andaman and Nicobar Islands, Andhra Pradesh, Arunachal Pradesh, and Assam; there are no respondents from any other region. Is this because RStudio is not loading the full dataset, or is there an issue with the data file?

Thanks


Re: Missing Region for 2015 SPSS India dataset [message #25917 is a reply to message #25914] Tue, 03 January 2023 09:45
Trevor-DHS
Please provide the filename to let us know exactly which file you are referring to. We are not seeing a problem with the files, so we suspect that it has been truncated when reading the file into R (possibly R is only reading the first part of the file). Letting us know the filename will permit us to check the file properly.

Also provide the specific commands in R that you are using to read the data into R, and we can check whether the problem is with those commands.

[Updated on: Tue, 03 January 2023 09:48]

Re: Missing Region for 2015 SPSS India dataset [message #25926 is a reply to message #25917] Sat, 07 January 2023 01:33
olympiaca
Hi

I'm using the function read.spss from the haven package:

I2015 <- read.spss("IAIR74FL.SAV", use.value.label = TRUE, to.data.frame = TRUE)

The file name is IAIR74FL.

When I read it, I get a sample size of 32757.

Thanks
Re: Missing Region for 2015 SPSS India dataset [message #25928 is a reply to message #25926] Mon, 09 January 2023 16:10
Trevor-DHS
I downloaded the SPSS file and tested this, first in SPSS itself and then in R. In SPSS I get 699686 cases, and I see data for all states.
I then tested the code you gave in R. First, read.spss is not in the haven package but in the foreign package. The haven package uses read_spss or read_sav, not read.spss. The other parameters also differ between the two packages.
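
If it is unclear which package a function comes from, a quick check like the sketch below can confirm it. It assumes both foreign and haven are installed; the variable names in_foreign and in_haven are only for illustration.

```r
# Check which installed package actually exports read.spss.
# Assumes both 'foreign' and 'haven' are installed.
in_foreign <- "read.spss" %in% getNamespaceExports("foreign")
in_haven   <- "read.spss" %in% getNamespaceExports("haven")

print(c(foreign = in_foreign, haven = in_haven))

# Calling functions with an explicit namespace avoids this kind of mix-up:
#   foreign::read.spss(...)   versus   haven::read_sav(...)
```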

I was unable to load the data with read.spss from the foreign package. I got the message "Error: cannot allocate vector of size 5.3 Mb".

I then tried with read_spss from the haven package:
I2015 <- read_spss("C:/Users/21180/OneDrive - ICF/Data/DHS_SPSS/IAIR74FL.SAV", user_na=TRUE)
and it successfully loaded the data showing "699686 obs. of 4796 variables".

I would first try reading the data with read_spss. If you are still only getting 32757 cases, then I think the data file you have is somehow truncated, so I would suggest downloading the dataset from our website again and reading it once more with read_spss.
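
A minimal sketch of that check, for anyone following along. The file location, the helper name check_regions, and the use of v024 as the region/state variable are assumptions here, not something confirmed in this thread; adjust them to your own setup.

```r
library(haven)

# Hypothetical helper: returns the number of rows and the count of
# respondents per region. 'v024' as the region variable is an assumption.
check_regions <- function(data, region_var = "v024") {
  # Variable names may be upper or lower case, so match case-insensitively
  col <- grep(paste0("^", region_var, "$"), names(data),
              ignore.case = TRUE, value = TRUE)[1]
  list(
    n_rows    = nrow(data),
    by_region = table(as_factor(data[[col]]))
  )
}

# Assumed location of the downloaded file; change as needed.
path <- "IAIR74FL.SAV"

if (file.exists(path)) {
  I2015 <- read_sav(path, user_na = TRUE)
  # A complete file should report 699686 rows and show every state
  print(check_regions(I2015))
}
```

If n_rows is still 32757 after reading with haven, the downloaded file itself is the likely culprit rather than the R code.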