Split a string vector at whitespace

Posted 2019-02-01 05:18

Question:

I have the following vector:

tmp3 <- c("1500 2", "1500 1", "1510 2", "1510 1", "1520 2", "1520 1", "1530 2", 
"1530 1", "1540 2", "1540 1")

I would like to keep only the second number in each element of this vector, so that the result reads:

c(2,1,2,1,2,1,2,1,2,1)

Answer 1:

There's probably a better way, but here are two approaches with strsplit():

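# Build a 2-row data frame (one column per string) and take row 2
as.numeric(data.frame(strsplit(tmp3, " "), stringsAsFactors = FALSE)[2, ])
# Or pull the second piece out of each split result
as.numeric(sapply(strsplit(tmp3, " "), function(x) x[2]))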

The as.numeric() call is not needed if a character vector is good enough for your purposes.



Answer 2:

One could use read.table on textConnection:

X <- read.table(textConnection(tmp3))

then

> str(X)
'data.frame':   10 obs. of  2 variables:
 $ V1: int  1500 1500 1510 1510 1520 1520 1530 1530 1540 1540
 $ V2: int  2 1 2 1 2 1 2 1 2 1

so X$V2 is what you need.
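
As a variation on the same idea, read.table also accepts the strings directly through its text argument, so the connection does not have to be opened by hand. A minimal sketch, assuming the tmp3 vector from the question (the name second is only for illustration):

```r
tmp3 <- c("1500 2", "1500 1", "1510 2", "1510 1", "1520 2",
          "1520 1", "1530 2", "1530 1", "1540 2", "1540 1")

# Each element of tmp3 is parsed as one whitespace-separated row
X <- read.table(text = tmp3)

# V2 holds the second field; read.table has already converted it to integer
second <- X$V2
print(second)
```

This is convenient when the type conversion is wanted anyway, but it is probably heavier than strsplit() for very long vectors.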



Answer 3:

It depends a little on how closely your actual data matches the example data you've given. If you're just trying to get everything after the space, you can use gsub:

gsub(".+\\s+", "", tmp3)
[1] "2" "1" "2" "1" "2" "1" "2" "1" "2" "1"

If you're trying to implement a rule more complicated than "take everything after the space", you'll need a more complicated regular expression.
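
To illustrate where the simple pattern stops being enough, here is a small sketch with a made-up input that has a third field. The vector x and the result names are hypothetical; the point is only that the greedy pattern keeps whatever follows the last run of whitespace, while an anchored pattern with a capture group keeps the second field.

```r
x <- c("1500 2", "1510 1 extra")  # hypothetical input, not from the question

# The greedy pattern removes everything up to the LAST run of whitespace
last_field <- gsub(".+\\s+", "", x)

# Anchor at the start, skip the first field, capture the second one
second_field <- sub("^\\S+\\s+(\\S+).*$", "\\1", x)

print(last_field)
print(second_field)
```

For the original tmp3 both patterns return the same values, since each string has exactly two fields.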



Answer 4:

I think this is the most elegant way to do it:

res <- sapply(strsplit(tmp3, " "), "[[", 2)

If you need the result to be an integer vector:

storage.mode(res) <- "integer"
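
If you want the result type to be checked, the same extraction can be written with vapply, which fails loudly when a piece is not a single string. This is only a sketch of that variant, assuming the tmp3 vector from the question; parts and res_int are illustrative names.

```r
tmp3 <- c("1500 2", "1500 1", "1510 2", "1510 1", "1520 2",
          "1520 1", "1530 2", "1530 1", "1540 2", "1540 1")

# Split on a literal space; fixed = TRUE skips regex interpretation
parts <- strsplit(tmp3, " ", fixed = TRUE)

# Take element 2 of every piece and require each result to be one string
res <- vapply(parts, `[[`, character(1), 2)

# Convert to integer in a separate step
res_int <- as.integer(res)
print(res_int)
```

Note that `[[` raises a "subscript out of bounds" error for any string that contains no space, which may or may not be what you want.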


Answer 5:

substr(x = tmp3, start = 6, stop = 6)

This works as long as every string has the same layout as the example, i.e. the value you want is always the single character at position 6.

(And, of course, you don't have to specify the argument names - substr(tmp3, 6, 6) works fine, too)
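
If the widths vary, one possible adjustment is to look up the position of the space first and cut from there. The sketch below uses a made-up vector x with fields of different lengths; it assumes every string contains exactly one space.

```r
x <- c("1500 2", "15 10", "7 1")  # hypothetical strings of varying width

# Position of the first space in each string
pos <- as.integer(regexpr(" ", x, fixed = TRUE))

# Keep everything after that position
second <- substring(x, pos + 1L)
print(second)
```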



Answer 6:

This should do it:

library(plyr)
ldply(strsplit(tmp3, split = " "))[[2]]

If you need a numeric vector, use

as.numeric(ldply(strsplit(tmp3, split = " "))[[2]])


Answer 7:

Another option is scan(), which reads all the values into one flat vector. The logical index c(FALSE, TRUE) is recycled along that vector, so it keeps every second value.

scan(text = tmp3)[c(FALSE, TRUE)]
#  [1] 2 1 2 1 2 1 2 1 2 1
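
To make the recycling explicit, the sketch below splits the call into two steps and compares the logical index with an equivalent positional one. It assumes the tmp3 vector from the question; vals, second and same are illustrative names.

```r
tmp3 <- c("1500 2", "1500 1", "1510 2", "1510 1", "1520 2",
          "1520 1", "1530 2", "1530 1", "1540 2", "1540 1")

# scan() flattens the input: 10 strings with 2 fields each give 20 numbers
vals <- scan(text = tmp3, quiet = TRUE)

# c(FALSE, TRUE) is recycled, dropping odd positions and keeping even ones
second <- vals[c(FALSE, TRUE)]

# The same selection written with explicit positions
same <- vals[seq(2, length(vals), by = 2)]
print(second)
```

This only lines up when every string has exactly two fields; a string with a missing or extra field would shift all later values.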


Answer 8:

An easier way to split one column into two columns with data.table:

library(data.table)
# Example data: strings such as "2-separate"
data_ex <- data.table(a = paste(sample(1:3, size = 10, replace = TRUE), "-separate", sep = ""))
# Within each group of identical strings, split once and keep one piece
data_ex[, number := unlist(strsplit(x = a, split = "-"))[[1]], by = a]
data_ex[, word := unlist(strsplit(x = a, split = "-"))[[2]], by = a]