Question:
I'm trying to divide each number in a data frame with 16 columns by a number specific to its column. The divisors are stored in a second data frame whose rows 1-16 correspond to columns 1-16 of the larger data frame. Each column has a single divisor: every value in that column should be divided by it, and the output printed to a final spreadsheet.
Here's an example of what I'm starting with. The spreadsheet to be divided:
            X131.478.1 X131.478.2 X131.NSC.1 X131.NSC.2 X166.478.1 X166.478.2
1/2-SBSRNA4          4          2          2          6          7          6
A1BG                93         73         88         86         58         65
A1BG-AS1           123        103         96        128         46         57
The numbers to divide the spreadsheet by:
X131.478.1 1.0660880
X131.478.2 0.9104053
X131.NSC.1 0.8642545
X131.NSC.2 0.9611866
X166.478.1 0.9711406
X166.478.2 1.0560121
And the expected results, not necessarily rounded as I did here.
            X131.478.1 X131.478.2 X131.NSC.1 X131.NSC.2 X166.478.1 X166.478.2
1/2-SBSRNA4       3.75       2.19       2.31       6.24       7.20       5.68
A1BG             87.23      80.17     101.82      89.47      59.72      61.55
A1BG-AS1        115.37     113.13     111.07     133.16      47.36      53.97
I tried simply dividing the data frames, mx2 = mx/sf, with mx being the large data set and sf being the data frame of divisors. That seemed to divide everything by the first number in sf.
The divisors were generated by estimateSizeFactors, part of the DESeq package, if that helps.
Any help would be great. Thanks!
Answer 1:
sweep is useful for these sorts of operations. For example, with some dummy data, we divide each element in the respective columns of matrix mat by the corresponding value in the vector vec:
mat <- matrix(1:25, ncol = 5)
vec <- seq(2, by = 2, length = 5)
sweep(mat, 2, vec, `/`)
In use we have:
> mat
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    6   11   16   21
[2,]    2    7   12   17   22
[3,]    3    8   13   18   23
[4,]    4    9   14   19   24
[5,]    5   10   15   20   25
> vec
[1]  2  4  6  8 10
> sweep(mat, 2, vec, `/`)
     [,1] [,2]     [,3]  [,4] [,5]
[1,]  0.5 1.50 1.833333 2.000  2.1
[2,]  1.0 1.75 2.000000 2.125  2.2
[3,]  1.5 2.00 2.166667 2.250  2.3
[4,]  2.0 2.25 2.333333 2.375  2.4
[5,]  2.5 2.50 2.500000 2.500  2.5
> mat[,1] / vec[1]
[1] 0.5 1.0 1.5 2.0 2.5
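Applied to the data in the question, the same call might look like the sketch below. It rebuilds only the six columns shown in the question, and it assumes the divisors sit in a one-column data frame whose column is named V1 and whose row names match the column names of mx. Both are assumptions about how sf is stored. sweep expects its STATS argument to be a vector, so the column is pulled out of sf first.

```r
# Toy data copied from the question (only the six columns shown there).
mx <- data.frame(
  X131.478.1 = c(4, 93, 123),
  X131.478.2 = c(2, 73, 103),
  X131.NSC.1 = c(2, 88, 96),
  X131.NSC.2 = c(6, 86, 128),
  X166.478.1 = c(7, 58, 46),
  X166.478.2 = c(6, 65, 57),
  row.names = c("1/2-SBSRNA4", "A1BG", "A1BG-AS1")
)
# Assumed layout of sf: one column (here called V1), one row per sample.
sf <- data.frame(
  V1 = c(1.0660880, 0.9104053, 0.8642545, 0.9611866, 0.9711406, 1.0560121),
  row.names = names(mx)
)

# Plain numeric vector of divisors, ordered by mx's column names.
factors <- sf[names(mx), "V1"]

# MARGIN = 2 means "per column": column j is divided by factors[j].
mx2 <- sweep(as.matrix(mx), 2, factors, `/`)
round(mx2, 2)
```

Indexing sf by names(mx) rather than by position guards against the two objects listing the samples in a different order.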
Answer 2:
Just for variety, you could also use mapply:
mx <- structure(list(X131.478.1 = c(4L, 93L, 123L), X131.478.2 = c(2L,
73L, 103L), X131.NSC.1 = c(2L, 88L, 96L), X131.NSC.2 = c(6L,
86L, 128L), X166.478.1 = c(7L, 58L, 46L), X166.478.2 = c(6L,
65L, 57L)), .Names = c("X131.478.1", "X131.478.2", "X131.NSC.1",
"X131.NSC.2", "X166.478.1", "X166.478.2"), class = "data.frame", row.names = c("1/2-SBSRNA4",
"A1BG", "A1BG-AS1"))
sf <- structure(list(V1 = c(1.066088, 0.9104053, 0.8642545, 0.9611866,
0.9711406, 1.0560121)), .Names = "V1", row.names = c("X131.478.1",
"X131.478.2", "X131.NSC.1", "X131.NSC.2", "X166.478.1", "X166.478.2"
), class = "data.frame")
mapply(function(x, y) x / y, mx, t(sf))
     X131.478.1 X131.478.2 X131.NSC.1 X131.NSC.2 X166.478.1 X166.478.2
[1,]   3.752035   2.196824   2.314133   6.242284   7.208019   5.681753
[2,]  87.234825  80.184067 101.821859  89.472741  59.723587  61.552325
[3,] 115.375091 113.136424 111.078392 133.168731  47.366983  53.976654
But for this I think Josh's answer is better... and Gavin's is even better!
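mapply returns a plain matrix, so the gene names in the row names are lost. If the result should stay a data frame, one possible variation is Map with an empty-bracket assignment, sketched below. The data are rebuilt from the question, and the column name V1 and the row-name lookup into sf are assumptions about how the divisors are stored.

```r
# Toy data from the question (six columns only).
mx <- data.frame(
  X131.478.1 = c(4, 93, 123),
  X131.478.2 = c(2, 73, 103),
  X131.NSC.1 = c(2, 88, 96),
  X131.NSC.2 = c(6, 86, 128),
  X166.478.1 = c(7, 58, 46),
  X166.478.2 = c(6, 65, 57),
  row.names = c("1/2-SBSRNA4", "A1BG", "A1BG-AS1")
)
sf <- data.frame(
  V1 = c(1.0660880, 0.9104053, 0.8642545, 0.9611866, 0.9711406, 1.0560121),
  row.names = names(mx)
)

# Map pairs each column of mx with its divisor and returns a list of columns.
# Assigning into mx2[] replaces the columns but keeps row and column names.
mx2 <- mx
mx2[] <- Map(`/`, mx, sf[names(mx), "V1"])
mx2
```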
Answer 3:
Dividing each column by its own factor is nothing but matrix multiplication by a diagonal matrix of reciprocals:
mat <- matrix(c(4,2,2,6,7,6, 93,73,88,86,58,65, 123,103,96,128,46,57), nrow=3, byrow=TRUE)
vec <- c(1.0660880,0.9104053,0.8642545,0.9611866,0.9711406,1.0560121)
mat %*% diag(1/vec)
           [,1]       [,2]       [,3]       [,4]      [,5]      [,6]
[1,]   3.752035   2.196824   2.314133   6.242284  7.208019  5.681753
[2,]  87.234825  80.184067 101.821859  89.472741 59.723587 61.552325
[3,] 115.375091 113.136424 111.078392 133.168731 47.366983 53.976654
The diagonal matrix is needed because directly trying mat %*% 1/vec parses as (mat %*% 1)/vec and gives an error on non-conformable arguments, since the operands have different shapes. The outer-product form mat %o% 1/vec (like plain mat/vec) does run, but it recycles vec down the columns instead of across them, so most entries end up divided by the wrong factor.
Or look at the many posts on https://stackoverflow.com/search?q=%5Br%5D+multiply+matrix+by+vector
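Because R recycles a vector down the columns of a matrix, any one-liner of this kind is worth checking against an explicit per-column division. The sketch below does that for the data above. The names expected, naive and by_col are illustrative only.

```r
mat <- matrix(c(4, 2, 2, 6, 7, 6,
                93, 73, 88, 86, 58, 65,
                123, 103, 96, 128, 46, 57), nrow = 3, byrow = TRUE)
vec <- c(1.0660880, 0.9104053, 0.8642545, 0.9611866, 0.9711406, 1.0560121)

# Reference result: divide column j by vec[j], one column at a time.
expected <- sapply(seq_len(ncol(mat)), function(j) mat[, j] / vec[j])

# mat / vec walks down the columns while recycling vec, so element [2, 1]
# is divided by vec[2] rather than vec[1].
naive <- mat / vec
isTRUE(all.equal(naive, expected))   # FALSE for this data

# Transposing first lines vec up with the columns of the original matrix.
by_col <- t(t(mat) / vec)
isTRUE(all.equal(by_col, expected))  # TRUE
```

The double transpose avoids building a diagonal matrix, which may matter when there are many columns.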
Answer 4:
You could use transform:
mx2 <- transform(mx,
  X131.478.1 = X131.478.1 / sf["X131.478.1", 1],
  X131.478.2 = X131.478.2 / sf["X131.478.2", 1]
  # ... and so on, one comma-separated line per remaining column
)
Quite a bit to type with 16 columns, but it should work.
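To avoid typing one line per column, the same name-based lookup can be written as a loop over the column names. This is only a sketch on the six columns from the question, and it assumes sf has one row per sample, with row names equal to the column names of mx and the divisor in a column called V1.

```r
# Toy data from the question (six columns only).
mx <- data.frame(
  X131.478.1 = c(4, 93, 123),
  X131.478.2 = c(2, 73, 103),
  X131.NSC.1 = c(2, 88, 96),
  X131.NSC.2 = c(6, 86, 128),
  X166.478.1 = c(7, 58, 46),
  X166.478.2 = c(6, 65, 57),
  row.names = c("1/2-SBSRNA4", "A1BG", "A1BG-AS1")
)
sf <- data.frame(
  V1 = c(1.0660880, 0.9104053, 0.8642545, 0.9611866, 0.9711406, 1.0560121),
  row.names = names(mx)
)

# Fail early if any column of mx has no matching divisor in sf.
stopifnot(all(names(mx) %in% rownames(sf)))

# Same lookup as the transform() call, repeated for every column name.
mx2 <- mx
for (nm in names(mx)) {
  mx2[[nm]] <- mx[[nm]] / sf[nm, "V1"]
}
mx2
```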