Read SPSS file into R

Posted 2019-01-21 02:44

I am trying to learn R and want to bring in an SPSS file, which I can open in SPSS.

I have tried using read.spss from foreign and spss.get from Hmisc. Both produce the same error message.

Here is my code:

## install.packages("Hmisc")
library(foreign)

## change the working directory
getwd()
setwd('C:/Documents and Settings/BTIBERT/Desktop/')

## load in the file
## ?read.spss
asq <- read.spss('ASQ2010.sav', to.data.frame=T)

And the resulting error:

Error in read.spss("ASQ2010.sav", to.data.frame = T) :
  error reading system-file header
In addition: Warning message:
In read.spss("ASQ2010.sav", to.data.frame = T) :
  ASQ2010.sav: position 0: character `\000' (

I also tried re-saving the file as an SPSS 7 .sav file (I had previously been using SPSS 18), which gave these warnings:

Warning messages:
1: In read.spss("ASQ2010_test.sav", to.data.frame = T) :
  ASQ2010_test.sav: Unrecognized record type 7, subtype 14 encountered in system file
2: In read.spss("ASQ2010_test.sav", to.data.frame = T) :
  ASQ2010_test.sav: Unrecognized record type 7, subtype 18 encountered in system file

Tags: r spss
14 answers
#2 · 2019-01-21 03:01

The packages you are using are not the problem. To read the file, save it from SPSS in portable format: instead of the usual *.sav file, export a portable file, which uses the *.por extension.

There is more information at http://www.statmethods.net/input/importingdata.html
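As a minimal sketch of this route, assuming the data has already been exported from SPSS as a portable file, read.spss from foreign can be pointed at the .por file directly. The helper name read_por_if_present and the file name ASQ2010.por are hypothetical, chosen only to mirror the question.

```r
# Hypothetical helper: read a portable (.por) file only if it exists.
read_por_if_present <- function(path) {
  if (!file.exists(path)) {
    message("File not found: ", path)
    return(NULL)
  }
  # read.spss() from the foreign package reads both .sav and .por files
  foreign::read.spss(path, to.data.frame = TRUE)
}

# "ASQ2010.por" is an assumed file name, not one given in the question
asq <- read_por_if_present("ASQ2010.por")
```

If the file is missing, the helper returns NULL with a message instead of stopping with an error, which makes it easier to tell a path problem from a format problem.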

神经病院院长
#3 · 2019-01-21 03:05

It looks like R's read.spss implementation is incomplete or broken. R 2.10.1 does better than R 2.8.1, however. Even with 2.10.1 (the latest I have), R appears to get upset about custom attributes in a .sav file. R also may not understand the character encoding field in the file; in particular, it probably does not work with SPSS Unicode files.

You might try opening the file in SPSS, deleting any custom attributes, and re-saving the file. You can see whether there are custom attributes with the SPSS command:

display attributes.

If so, delete them (see VARIABLE ATTRIBUTE and DATAFILE ATTRIBUTE commands), and try again.

HTH, Jon Peck

啃猪蹄的小仙女
#4 · 2019-01-21 03:06

You can read an SPSS file from R using the solutions above or the one you are currently using. Just make sure the command is given a file that it can read properly. I had the same error, and the problem was that SPSS could not access the file. Make sure that the file path is correct, that the file is accessible, and that it is in the correct format.

library(foreign)
asq <- read.spss('ASQ2010.sav', to.data.frame=TRUE)
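The checks mentioned above (path, accessibility, format) can be sketched in base R before calling read.spss. The helper check_sav_file is hypothetical and not part of foreign. The header test assumes that SPSS system files normally start with the bytes "$FL2" ("$FL3" for compressed .zsav files), so treat a mismatch as a hint rather than proof.

```r
# Hypothetical helper (not part of foreign): sanity-check a file before read.spss().
check_sav_file <- function(path) {
  if (!file.exists(path)) {
    return("file not found")
  }
  # file.access() returns 0 when the requested permission (mode 4 = read) is granted
  if (file.access(path, mode = 4) != 0) {
    return("file not readable")
  }
  # Compare the first four bytes as raw values, so embedded NUL bytes cause no error
  magic <- readBin(path, what = "raw", n = 4)
  if (identical(magic, charToRaw("$FL2")) || identical(magic, charToRaw("$FL3"))) {
    return("looks like an SPSS system file")
  }
  "header is not $FL2/$FL3; possibly not a .sav file"
}

check_sav_file("ASQ2010.sav")
```

A "file not found" result usually points to the working directory, so comparing getwd() with the folder that holds the file is a reasonable first step.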

As for the warning message, it does not affect the data. Record type 7 is used by newer SPSS versions to store additional features in a way that still lets older SPSS versions read the file. I have read files this way numerous times without losing any data.

You can also read about this at http://r.789695.n4.nabble.com/read-spss-warning-message-Unrecognized-record-type-7-subtype-18-encountered-in-system-file-td3000775.html#a3007945
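If the record type 7 warnings are just noise for your file, one option is to silence only those warnings and let every other warning through. The wrapper below is a sketch under that assumption; read_spss_quietly and its reader argument are hypothetical names, and the match is done on the warning text shown in the question.

```r
# Hypothetical wrapper: muffle only the "Unrecognized record type 7" warnings.
# `reader` defaults to foreign::read.spss; it is a parameter only so that the
# wrapper can be tried with a stand-in function.
read_spss_quietly <- function(path, reader = foreign::read.spss, ...) {
  withCallingHandlers(
    reader(path, to.data.frame = TRUE, ...),
    warning = function(w) {
      if (grepl("Unrecognized record type 7", conditionMessage(w), fixed = TRUE)) {
        invokeRestart("muffleWarning")  # drop this warning and keep reading
      }
    }
  )
}

# Usage (file name taken from the question):
# asq <- read_spss_quietly("ASQ2010_test.sav")
```

Unlike suppressWarnings(), this leaves unrelated warnings visible, so a genuine problem with the file is less likely to go unnoticed.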

男人必须洒脱
#5 · 2019-01-21 03:06

In my case this warning came with other problems: a new variable appeared before the first column of my data (with values -100, 2, 2, 2, ...), the labels were shifted relative to the values, and the last variable was deleted. A solution that worked was to create (in SPSS) a new dummy variable in the last column of the file, fill it with random values, and run the following code. Here filename is the path to the .sav file; in my case the original SPSS file had 62 columns, thus 63 with the additional dummy variable.

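library(memisc)
# Import the .sav file as a memisc data.set
data <- as.data.set(spss.system.file(filename))

# Give each column the name of the column to its left,
# then drop the spurious first column
copyofdata <- data
for(i in 2:63){
  names(data)[i] <- names(copyofdata)[i-1]
}
data[[1]] <- NULL

# Shift the labels of the remaining 62 columns in the same way
newcopyofdata <- data
for(i in 2:62){
  labels(data[[i]]) <- labels(newcopyofdata[[i-1]])
}
labels(data[[1]]) <- NULL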

Hope the above code will help someone else.

疯言疯语
#6 · 2019-01-21 03:09

Turn Unicode mode off in SPSS.

Open SPSS without any data set open and run the following in the syntax editor:

SET UNICODE OFF.

Then open the data set and re-save it to remove the Unicode encoding.

After that, read.spss('yourdata.sav', to.data.frame=TRUE) works correctly.

ら.Afraid
#7 · 2019-01-21 03:10

I agree with @SDahm that the haven package would be the way to go. I myself have struggled a bit with string values when starting to use it, so I thought I'd share my approach on that here, too.

The "semantics" vignette has some useful information on this topic.

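library(tidyverse)
library(haven)

# Some interesting information in here
vignette('semantics')

# Get data from spss file (path_to_file is the path to the .sav file)
spss_file <- read_sav(path_to_file)

# get value labels: convert labelled columns to factors
df <- map_df(.x = spss_file, .f = function(x) {
  if (is.labelled(x)) as_factor(x)
  else x})
# get column names from the variable labels,
# keeping the original name where a column has no label
colnames(df) <- imap_chr(.x = spss_file, .f = function(x, name) {
  label <- attr(x, 'label', exact = TRUE)
  if (is.null(label)) name else label})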