I'm programming in R. I've got a vector containing, let's say, 1000 values. Now let's say I want to partition these 1000 values randomly into two new sets, one containing 400 values and the other containing 600. How could I do this? I've thought about doing something like this...
firstset <- sample(mydata, size=400)
...but this doesn't partition the data (in other words, I still don't know which 600 values to put in the other set). I also thought about looping from 1 to 400, randomly removing 1 value at a time and placing it in firstset. This would partition the data correctly, but how to implement this is not clear to me. Plus I've been told to avoid for loops in R whenever possible.
Any ideas?
Just shuffle mydata into a random order, then take the first 400 values as one set and the remaining 600 as the other.
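A minimal sketch of that idea, assuming mydata is a plain vector of length 1000. The rnorm() data, the set.seed() call, and the name secondset are placeholders for illustration, not part of the question:

```r
# Stand-in data; the real mydata from the question is not shown.
set.seed(42)                 # only so this sketch is reproducible
mydata <- rnorm(1000)

# sample() with no size argument returns a random permutation of its input
# (for a vector longer than one element, as here).
shuffled <- sample(mydata)

firstset  <- shuffled[1:400]      # first 400 values of the shuffled vector
secondset <- shuffled[401:1000]   # remaining 600 values
```

Since every element of the shuffled vector lands in exactly one of the two slices, the two sets together make up the original data.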
Instead of sampling the values, you could sample their positions.
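Something along these lines should work, again assuming mydata is a plain vector. The rnorm() data and the names idx and secondset are made up for the example:

```r
# Stand-in data; the real mydata from the question is not shown.
set.seed(42)                 # only so this sketch is reproducible
mydata <- rnorm(1000)

# Draw 400 distinct positions from 1..length(mydata), without replacement.
idx <- sample(length(mydata), size = 400)

firstset  <- mydata[idx]     # values at the sampled positions
secondset <- mydata[-idx]    # negative indexing drops those positions
```

Keeping idx around also lets you apply the same split to any other vector that lines up with mydata.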
EDIT: ucfagls' suggestion will be more efficient (especially for larger vectors), since it avoids allocating a vector of positions in R.
If mydata is truly a vector, one option would be: