How to import Qualtrics data (in csv format) into

Published 2019-02-24 07:01

Question:

I am trying to import data downloaded from Qualtrics into R. It is a csv file.

However, I have run into two problems.

  1. R cannot work out the type of each column by itself, probably because rows 2 and 3 (highlighted above) contain only descriptive text rather than data. R treats every column as character, even though some are clearly dates, some are factors, and some are integers. How can I get R to detect the class of each column correctly?
library(tidyverse)
filename <- "mydata.csv"
df = read_csv(filename, col_names = TRUE)

Parsed with column specification:
cols(
  .default = col_character()
)
See spec(...) for full column specifications.
  2. I also tried to load the variable names (header) and the data matrix separately. Unfortunately, the skip = 3 argument does not work: R reports that my data has only 1 observation. Why?
filename <- "mydata.csv"
headers = read_csv(filename, col_names = FALSE, n_max = 1)
df = read_csv(filename, skip = 3, col_names = FALSE)
colnames(df) = headers
Error in names(x) <- value : 
'names' attribute [273] must be the same length as the vector [1]

What is a good way to import my csv file into R?

Answer 1:

I use the following code to import data from Qualtrics into R:

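library(tidyverse)
filename <- "mydata.csv"
# Row 1 of the file holds the variable names
headers <- read_csv(filename, col_names = FALSE, n_max = 1)
# Skip the header row and the two rows of descriptive text
df <- read_csv(filename, skip = 3, col_names = FALSE)
# headers is a one-row tibble, so flatten it to a character vector first
colnames(df) <- unlist(headers, use.names = FALSE)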

However, there is one caveat: this method only works if you removed all line breaks when you downloaded your data. (Please see the figure below for how to do so.) My skip = 3 argument works because I removed all line breaks when I downloaded the data from Qualtrics. The questions you asked in Qualtrics very likely contain multiple lines. Because skip counts lines in the file, a question text that spans several lines throws the count off and R can no longer parse the file this way. I recommend removing all line breaks when you download the data from the website.
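To check whether an export still contains embedded line breaks, one rough approach is to compare the number of physical lines in the file with the number of records that read_csv parses. The sketch below uses a small made-up file in place of a real Qualtrics export. The contents of csv_text and the variable names are assumptions for illustration only. Blank lines in a file would also make the two counts differ, so treat the result as a hint rather than proof.

```r
library(readr)

# Hypothetical stand-in for a Qualtrics export in which the question text
# in row 2 contains a line break inside a quoted field.
csv_text <- c(
  "StartDate,Q1",
  "Start Date,\"How old",
  "are you?\"",
  "ImportId startDate,ImportId QID1",
  "2019-02-20,34"
)
path <- tempfile(fileext = ".csv")
writeLines(csv_text, path)

# Physical lines in the file, counted without any CSV parsing
n_lines <- length(readLines(path))

# Records as seen by the CSV parser, which keeps a quoted multi-line
# field together as a single value
raw <- read_csv(path, col_names = FALSE,
                col_types = cols(.default = col_character()))
n_records <- nrow(raw)

# More lines than records suggests that some fields span several lines,
# in which case skip = 3 would not land on the first data row
has_embedded_breaks <- n_lines > n_records
```

If has_embedded_breaks is TRUE, downloading the data again with line breaks removed is the simplest way to make the skip = 3 approach work.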

With the method above, R usually recognises the type of most columns correctly, which saves you a great deal of manual recoding.
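An alternative that does not depend on counting lines is to read every column as character, drop the two descriptive rows, and then ask readr to guess the types again with type_convert(). This is only a sketch under assumptions: the file contents in csv_text are invented for illustration, and it assumes the export has exactly two descriptive rows below the header, as described in the question.

```r
library(readr)

# Hypothetical stand-in for a Qualtrics export: row 1 holds variable names,
# rows 2 and 3 hold descriptive text, and the data starts on row 4.
csv_text <- c(
  "StartDate,Q1,Q2",
  "Start Date,How old are you?,Favourite colour",
  "ImportId startDate,ImportId QID1,ImportId QID2",
  "2019-02-20,34,blue",
  "2019-02-21,27,green"
)
path <- tempfile(fileext = ".csv")
writeLines(csv_text, path)

# Read everything as character so the text rows cannot mislead type guessing
raw <- read_csv(path, col_types = cols(.default = col_character()))

# Drop the two descriptive rows (file rows 2 and 3), then let readr
# re-guess each column's type from the values that remain
survey <- type_convert(raw[-(1:2), ])

str(survey)
```

Because the rows are dropped after parsing, this sketch removes records rather than physical lines. Columns that should be factors still need an explicit conversion, for example with factor(), since type guessing returns them as character.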