Append value to empty vector in R?

Posted 2019-01-07 02:54

Question:

I'm trying to learn R and I can't figure out how to append to a list.

If this were Python I would . . .

#Python
vector = []
values = ['a','b','c','d','e','f','g']

for i in range(0,len(values)):
    vector.append(values[i])

How do you do this in R?

#R Programming
> vector = c()
> values = c('a','b','c','d','e','f','g')
> for (i in 1:length(values))
+ #append value[i] to empty vector

Answer 1:

Here are several ways to do it. All of them are discouraged. Appending to an object in a for loop causes the entire object to be copied on every iteration, which causes a lot of people to say "R is slow", or "R loops should be avoided".

# one way
for (i in 1:length(values))
  vector[i] <- values[i]
# another way
for (i in 1:length(values))
  vector <- c(vector, values[i])
# yet another way?!?
for (v in values)
  vector <- c(vector, v)
# ... more ways

help("append") would have answered your question and saved the time it took you to write this question (but would have caused you to develop bad habits). ;-)

Note that vector <- c() isn't an empty vector; it's NULL. If you want an empty character vector, use vector <- character().
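To make the difference concrete, here is a small sketch comparing the two starting points. The names v_null and v_chr are made up for this illustration.

```r
# c() with no arguments returns NULL, not a typed vector
v_null <- c()
is.null(v_null)   # TRUE
length(v_null)    # 0

# character() returns a zero-length character vector
v_chr <- character()
is.null(v_chr)    # FALSE
class(v_chr)      # "character"
length(v_chr)     # 0

# Appending a string gives the same result from either starting point
identical(c(v_null, "a"), c(v_chr, "a"))  # TRUE
```

Starting from character() mainly matters when later code checks the type or class of the vector before anything has been appended.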

Also note, as BrodieG pointed out in the comments: if you absolutely must use a for loop, then at least pre-allocate the entire vector before the loop. This will be much faster than appending for larger vectors.

set.seed(21)
values <- sample(letters, 1e4, TRUE)
vector <- character(0)
# slow
system.time( for (i in 1:length(values)) vector[i] <- values[i] )
#   user  system elapsed 
#  0.340   0.000   0.343 
vector <- character(length(values))
# fast(er)
system.time( for (i in 1:length(values)) vector[i] <- values[i] )
#   user  system elapsed 
#  0.024   0.000   0.023 


Answer 2:

FWIW: analogous to python's append():

b <- 1
b <- c(b, 2)


Answer 3:

You have a few options:

c(vector, values)
append(vector, values)
vector[(length(vector) + 1):(length(vector) + length(values))] <- values

The first one is the standard approach. The second one gives you the option to append someplace other than the end. The last one is a bit contorted but has the advantage of modifying vector in place (though really, you could just as easily do vector <- c(vector, values)).
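As a quick illustration of appending somewhere other than the end, append() takes an after argument giving the position after which the new values go. The sample data and the names at_end, in_front and in_mid are made up for this sketch.

```r
vector <- c("a", "b", "c")
values <- c("x", "y")

at_end   <- append(vector, values)             # same result as c(vector, values)
in_front <- append(vector, values, after = 0)  # insert before the first element
in_mid   <- append(vector, values, after = 1)  # insert after the first element

at_end    # "a" "b" "c" "x" "y"
in_front  # "x" "y" "a" "b" "c"
in_mid    # "a" "x" "y" "b" "c"
```

Like c(), append() returns a new vector, so the result has to be assigned back if you want to keep it.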

Notice that in R you don't need to cycle through vectors. You can just operate on them in whole.
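A minimal sketch of what operating on the whole vector looks like, using the data from the question. The names by_loop and at_once are made up for this comparison.

```r
vector <- character()
values <- c("a", "b", "c", "d", "e", "f", "g")

# Loop version: appends one element at a time
by_loop <- vector
for (v in values) by_loop <- c(by_loop, v)

# Whole-vector version: a single call does the same job
at_once <- c(vector, values)

identical(by_loop, at_once)  # TRUE

# Most functions accept a whole vector too
upper <- toupper(values)     # "A" "B" "C" "D" "E" "F" "G"
```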

Also, this is fairly basic stuff, so you should go through some of the references.

Some more options based on OP feedback:

for(i in values) vector <- c(vector, i)


Answer 4:

Just for the sake of completeness, appending values to a vector in a for loop is not really the philosophy in R. R works better by operating on vectors as a whole, as @BrodieG pointed out. See if your code can't be rewritten as:

output <- sapply(values, function(v) return(2*v))

Output will be a vector of return values. You can also use lapply if values is a list instead of a vector.
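Here is a small runnable sketch of that idea. It assumes values is numeric, since 2*v fails on the character values from the question. The vapply() variant is an addition not mentioned in the answer; it behaves like sapply() but checks that every result is a single number.

```r
values <- c(1, 2, 3)

# sapply() collects the results into a vector, so no manual appending is needed
out_sapply <- sapply(values, function(v) 2 * v)

# vapply() does the same, but errors if a result is not a length-1 numeric
out_vapply <- vapply(values, function(v) 2 * v, numeric(1))

# lapply() always returns a list, which suits list input
out_list <- lapply(list(1, 2, 3), function(v) 2 * v)

out_sapply        # 2 4 6
unlist(out_list)  # 2 4 6
```

For something as simple as doubling, 2 * values gives the same result directly; the apply functions are useful when the per-element work is not already vectorized.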



Answer 5:

Sometimes we have to use loops, for example, when we don't know how many iterations we need to get the result. Take while loops as an example. Below are methods you absolutely should avoid:

a=numeric(0)
b=1
system.time(
  {
    while(b<=1e5){
      b=b+1
      a<-c(a,pi)
    }
  }
)
# user  system elapsed 
# 13.2     0.0    13.2 

a=numeric(0)
b=1
system.time(
  {
    while(b<=1e5){
      b=b+1
      a<-append(a,pi)
    }
  }
)
# user  system elapsed 
# 11.06    5.72   16.84 

These are very inefficient because R copies the vector every time it appends.

The most efficient way to append is to assign by index. Note that this time I let it iterate 1e7 times, but it's still much faster than c().

a=numeric(0)
system.time(
  {
    while(length(a)<1e7){
      a[length(a)+1]=pi
    }
  }
)
# user  system elapsed 
# 5.71    0.39    6.12  

This is acceptable. And we can make it a bit faster by replacing [ with [[.

a=numeric(0)
system.time(
  {
    while(length(a)<1e7){
      a[[length(a)+1]]=pi
    }
  }
)
# user  system elapsed 
# 5.29    0.38    5.69   

You may have noticed that calling length() on every iteration can be time consuming. If we replace length() with a counter:

a=numeric(0)
b=1
system.time(
  {
    while(b<=1e7){
      a[[b]]=pi
      b=b+1
    }
  }
)
# user  system elapsed 
# 3.35    0.41    3.76

As other users mentioned, pre-allocating the vector is very helpful. But this is a trade-off between speed and memory usage if you don't know how many loops you need to get the result.

a=rep(NaN,2*1e7)
b=1
system.time(
  {
    while(b<=1e7){
      a[[b]]=pi
      b=b+1
    }
    a=a[!is.na(a)]
  }
)
# user  system elapsed 
# 1.57    0.06    1.63 

An intermediate method is to gradually add blocks of results.

a=numeric(0)
b=0
step_count=0
step=1e6
system.time(
  {
    repeat{
      a_step=rep(NaN,step)
      for(i in seq_len(step)){
        b=b+1
        a_step[[i]]=pi
        if(b>=1e7){
          a_step=a_step[1:i]
          break
        }
      }
      a[(step_count*step+1):b]=a_step
      if(b>=1e7) break
      step_count=step_count+1
    }
  }
)
# user  system elapsed 
# 1.71    0.17    1.89
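A related variant, not from the answer above, is to double the buffer whenever it fills up rather than adding fixed-size blocks. The sketch below is only an illustration under stated assumptions: collect_until, next_value, done and initial_capacity are hypothetical names, and no timings are claimed for it.

```r
# Hypothetical helper: collect values from a loop whose length is unknown in advance.
# next_value(i) produces the i-th value; done(n) says whether to stop after n values.
collect_until <- function(next_value, done, initial_capacity = 16L) {
  buf <- numeric(initial_capacity)  # pre-allocated buffer
  n <- 0L                           # number of slots actually used
  while (!done(n)) {
    if (n == length(buf)) {
      # Buffer is full: double its capacity (new slots are filled with NA)
      length(buf) <- 2L * length(buf)
    }
    n <- n + 1L
    buf[[n]] <- next_value(n)
  }
  buf[seq_len(n)]  # drop the unused capacity
}

res <- collect_until(
  next_value = function(i) i^2,
  done = function(n) n >= 100L
)
length(res)  # 100
res[100]     # 10000
```

Tracking the used length in n, instead of filtering with !is.na() at the end, also keeps any genuine NA results that the loop itself produces.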


Answer 6:

In R, you can try it this way:

X = NULL
X
# NULL
values = letters[1:10]
values
# [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j"
X = append(X,values)
X
# [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j"
X = append(X,letters[23:26])
X
# [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "w" "x" "y" "z"


Answer 7:

> vec <- c(letters[1:3]) # vec <- c("a","b","c") ; or just empty vector: vec <- c()

> values<- c(1,2,3)

> for (i in 1:length(values)){
      print(paste("length of vec", length(vec))); 
      vec[length(vec)+1] <- values[i]  #Appends value at the end of vector
  }

[1] "length of vec 3"
[1] "length of vec 4"
[1] "length of vec 5"

> vec
[1] "a" "b" "c" "1" "2" "3"
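
Note that the numbers in the output above come back as strings. An atomic vector holds a single type, so appending numbers to a character vector converts them to character. A short sketch of that behavior, with a list shown as one way to keep mixed types (lst is a made-up name for this illustration):

```r
vec <- c("a", "b", "c")
vec[length(vec) + 1] <- 1    # the number is stored as the string "1"
class(vec)                   # "character"

# A list can hold mixed types if the original types must be kept
lst <- list("a", "b", "c")
lst[[length(lst) + 1]] <- 1
class(lst[[4]])              # "numeric"
```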