Assigning random rows from a dataframe into 2 othe

Posted 2019-08-12 01:32

Question:

I have a dataframe (a) as mentioned below:

   V1 V2
1   a  b
2   a  e
3   a  f
4   b  c
5   b  e
6   b  f
7   c  d
8   c  g
9   c  h
10  d  g
11  d  h
12  e  f
13  f  g
14  g  h

What I want is to randomly assign the rows of dataframe (a) to 2 other, initially empty, dataframes (b and c) such that no row is repeated. That means b has no repeated rows and c has no repeated rows. In addition, b and c must not share any row, i.e. a row in b shouldn't appear in c and vice versa.

One way is to sample 7 rows from (a) without replacement, assign them to (b), and then assign the remaining rows to (c). But with this approach all rows are assigned to (b) at once and then to (c), whereas what I want is to assign rows one by one: a random row to (b), then a random row to (c), then again a random row to (b), and so on until all rows of dataframe (a) are used.

Thanks

Answer 1:

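Shuffling all of the row numbers and then partitioning the shuffled sequence by the parity of each position (1st, 3rd, 5th, ... draw to one dataframe; 2nd, 4th, 6th, ... draw to the other) should achieve what you are after. This is the same as randomly assigning rows of the original dataframe one at a time, alternating between the two targets.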

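# df is the dataframe to split (a in the question)
n <- nrow(df)
s <- sample.int(n, n)                # random permutation of the row numbers
odd.idxs <- seq_along(s) %% 2 != 0   # TRUE at positions 1, 3, 5, ...

s1 <- s[odd.idxs]    # draws 1, 3, 5, ...
s2 <- s[!odd.idxs]   # draws 2, 4, 6, ... (logical negation, since odd.idxs is a logical vector)

d1 <- df[s1, ]
d2 <- df[s2, ]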
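
To check the idea end to end, here is a minimal self-contained sketch that rebuilds the example dataframe from the question and deals the shuffled rows out alternately. The names df.b, df.c and to.b are illustrative only (df.b and df.c stand in for the question's b and c, renamed so that c does not mask base R's c()), and the fixed seed is an assumption made purely for reproducibility.

```r
# Rebuild the example dataframe from the question (named `a` there).
a <- data.frame(
  V1 = c("a", "a", "a", "b", "b", "b", "c", "c", "c", "d", "d", "e", "f", "g"),
  V2 = c("b", "e", "f", "c", "e", "f", "d", "g", "h", "g", "h", "f", "g", "h"),
  stringsAsFactors = FALSE
)

set.seed(42)  # assumption: fixed seed only so that reruns give the same split

# Shuffle the row numbers once; position k in `s` is the k-th random draw.
s <- sample.int(nrow(a))

# Deal the draws alternately: 1st to df.b, 2nd to df.c, 3rd to df.b, ...
to.b <- seq_along(s) %% 2 == 1

df.b <- a[s[to.b], ]
df.c <- a[s[!to.b], ]

# Sanity check: the two dataframes share no row of `a`.
stopifnot(length(intersect(rownames(df.b), rownames(df.c))) == 0)
```

Because the original row names are carried over by the indexing, comparing rownames(df.b) and rownames(df.c) is enough to confirm that no row ends up in both dataframes.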


Tags: r random