Calculating slopes of fluorescence over time

Posted 2019-08-03 12:28

I am new to R and want to use it to simplify the analysis of my fluorescence assay data. Until now I did the analysis manually in Excel, but I would like to set up an R script for it.

One example of my data would be this:

> df <- data.frame(time=1:10, sample1=4*1:10, sample2=3*1:10, sample3=2*1:10)
> df
   time sample1 sample2 sample3
1     1       4       3       2
2     2       8       6       4
3     3      12       9       6
4     4      16      12       8
5     5      20      15      10
6     6      24      18      12
7     7      28      21      14
8     8      32      24      16
9     9      36      27      18
10   10      40      30      20

So the first column is always the time of my assay, and the following columns hold the fluorescence signal of each sample at a given time point.

I then want to calculate the slope of fluorescence over time for every sample (e.g. sample1 over time, sample2 over time, sample3 over time,...). As a result I should get one value of slope per column.

In Excel I used: slope(B2:BX;$A$2:$A$X). Since I usually have 96 samples, this is tedious to do by hand in Excel.

The solution offered by @missuse

apply(df[,2:ncol(df)], 2, function(x){
  model = lm(x ~ df$time - 1)
  return(coef(model))
})

worked for the sample df I showed above but not for my real data.

As requested, here is a part of my real data:

> df1
      Time       1       2       3       4       5       6       7       8       9      10      11      12      13
1        0 24315.5 21446.5 46748.5   36008   15501 16799.5   24847 25354.5 16617.5   10576 43422.5   40036 15988.5
2     26.1 25592.5 22667.5 47310.0 36284.0 15790.5 16815.5 25108.0 25535.0 16702.5 10418.0 43602.0 40301.5 16227.0
3     52.1 26493.0 22839.5 47356.5 36549.0 15773.5 16804.0 25307.5 25538.5 16697.5 10390.0 43682.0 40332.0 16271.0
4     78.2   26889   23585 47496.5   36525 15942.5   16903   25498   25565 16796.5 10369.5 43768.5 40253.5   16584
5    104.3 27320.5   23914 47331.5 36680.5 16033.5   16912   25717 25798.5 16903.5 10356.5   43960   40299 16604.5
6    130.4 27823.0 24208.0 47815.0 36591.0 16132.0 17052.5 25669.5 25614.5 16958.0 10306.0 44104.5 40266.0 16682.5
7    156.4 28335.0 24647.0 47838.0 36718.0 16269.5 17001.0 25945.0 25754.5 16995.0 10397.5 43998.5 40256.5 16838.5
8    182.5 28859.5   25128   48056   36887   16385 17032.5 25832.5   25710 16980.5 10306.5 44282.5   40461   16995
9    208.6 29324.0 25369.0 48094.5 36889.5 16360.5 16931.0 25961.0 25918.5 17259.0 10271.0 44297.0 40511.0 17033.0
10   234.6 29920.5 25803.5 48314.5 36755.5 16566.0 17106.5 26210.0 25595.5 17293.5 10268.5 44355.5 40503.5 17047.5
11   260.7 30412.5 26314.5 48709.5 36848.5 16630.5 17208.0 26065.0 25702.0 17448.5 10296.0 44805.0 40383.5 17029.5
12   286.8 30883.0 26624.0 48570.5 36845.0 16804.0 17116.5 26237.0 25817.0 17523.0 10274.0 44743.5 40727.5 17167.5  


> dput(df1[1:5, 1:3])
structure(list(Time = structure(c(44L, 1L, 2L, 47L, 45L), .Names = c("X__2", 
"X__3", "X__4", "X__5", "X__6"), .Label = c("   26.1", "   52.1", 
"  130.4", "  156.4", "  208.6", "  234.6", "  260.7", "  286.8", 
"  312.8", "  338.9", "  365.0", "  391.0", "  417.1", "  443.2", 
"  469.2", "  495.3", "  521.4", "  547.4", "  573.5", "  599.6", 
"  625.6", "  651.7", "  677.8", "  703.9", "  729.9", "  756.0", 
"  782.1", "  808.1", "  834.2", "  860.3", "  886.3", "  912.4", 
"  938.5", "  964.5", "  990.6", " 1016.7", " 1042.7", " 1068.8", 
" 1094.9", " 1120.9", " 1147.0", " 1173.1", " 1199.1", "0", "104.3", 
"182.5", "78.2"), class = "factor"), `1` = structure(1:5, .Names = c("X__2", 
"X__3", "X__4", "X__5", "X__6"), .Label = c("24315.5", "25592.5", 
"26493.0", "26889", "27320.5", "27823.0", "28335.0", "28859.5", 
"29324.0", "29920.5", "30412.5", "30883.0", "31599.5", "31958.0", 
"32744.0", "33065.5", "33432.0", "34269.0", "34603.5", "35214.5", 
"35570.5", "36149.0", "36596.5", "37087.5", "37520.0", "38254.5", 
"38540.5", "39200.5", "39718.0", "40126.0", "40808.0", "41235.0", 
"41537.5", "42316.5", "42755.5", "42927.0", "43772.0", "44095.0", 
"44669.0", "45027.0", "45607.0", "45976.5", "46624.5", "46961.0", 
"47338.0", "48147.5", "48499.0"), class = "factor"), `2` = structure(1:5, .Names = c("X__2", 
"X__3", "X__4", "X__5", "X__6"), .Label = c("21446.5", "22667.5", 
"22839.5", "23585", "23914", "24208.0", "24647.0", "25128", "25369.0", 
"25803.5", "26314.5", "26624.0", "27103.5", "27366.5", "27656.5", 
"28195.0", "28655.0", "28912.5", "29316.5", "29530.0", "29931.0", 
"30401.5", "30899.0", "30973.5", "31643.0", "31740.5", "32313.0", 
"32597.5", "32967.0", "33331.5", "33825.0", "34051.5", "34438.0", 
"34646.0", "35299.0", "35365.5", "35980.0", "36217.0", "36634.0", 
"37005.0", "37338.5", "37842.0", "38039.0", "38501.5", "38694.0", 
"39057.5", "39330.5"), class = "factor")), .Names = c("Time", 
"1", "2"), row.names = c(NA, 5L), class = "data.frame")

When I use the solution offered by @missuse I get the following instead of a single slope value for each column:

> df1
                        1       2       3       4       5       6       7       8       9      10      11      12      13
data8$Time   26.1 25592.5 22667.5 47310.0 36284.0 15790.5 16815.5 25108.0 25535.0 16702.5 10418.0 43602.0 40301.5 16227.0
data8$Time   52.1 26493.0 22839.5 47356.5 36549.0 15773.5 16804.0 25307.5 25538.5 16697.5 10390.0 43682.0 40332.0 16271.0
data8$Time  130.4 27823.0 24208.0 47815.0 36591.0 16132.0 17052.5 25669.5 25614.5 16958.0 10306.0 44104.5 40266.0 16682.5
data8$Time  156.4 28335.0 24647.0 47838.0 36718.0 16269.5 17001.0 25945.0 25754.5 16995.0 10397.5 43998.5 40256.5 16838.5
data8$Time  208.6 29324.0 25369.0 48094.5 36889.5 16360.5 16931.0 25961.0 25918.5 17259.0 10271.0 44297.0 40511.0 17033.0
data8$Time  234.6 29920.5 25803.5 48314.5 36755.5 16566.0 17106.5 26210.0 25595.5 17293.5 10268.5 44355.5 40503.5 17047.5
data8$Time  260.7 30412.5 26314.5 48709.5 36848.5 16630.5 17208.0 26065.0 25702.0 17448.5 10296.0 44805.0 40383.5 17029.5
data8$Time  286.8 30883.0 26624.0 48570.5 36845.0 16804.0 17116.5 26237.0 25817.0 17523.0 10274.0 44743.5 40727.5 17167.5
data8$Time  312.8 31599.5 27103.5 48943.5 36966.0 16807.0 17150.0 26306.5 25697.0 17566.0 10375.0 44674.0 40740.5 17309.0

Tags: r excel
1 answer
仙女界的扛把子
#2 · 2019-08-03 13:11

Here is a possible solution:

apply(df[,2:ncol(df)], 2, function(x){
  model = lm(x ~ df$time - 1)
  return(coef(model))
})
#output
sample1 sample2 sample3 
      4       3       2 

It assumes the intercept is 0. If you would like the intercept estimated as well (coef(model) then returns both the intercept and the slope for each column), use:

model = lm(x ~ df$time)
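
As a hedged illustration of the intercept variant on the sample data from the question: with an intercept in the model, coef(model) has two entries per column, so apply returns a two-row matrix and the slope sits in its second row. A fit with intercept is also what Excel's SLOPE computes. The names fits and slopes are chosen here only for illustration.

```r
# Sample data from the question
df <- data.frame(time = 1:10, sample1 = 4 * 1:10, sample2 = 3 * 1:10, sample3 = 2 * 1:10)

# Fit y = a + b * time for every sample column
fits <- apply(df[, 2:ncol(df)], 2, function(x) {
  coef(lm(x ~ df$time))
})

# fits has one column per sample: row 1 = intercept, row 2 = slope
slopes <- fits[2, ]
slopes
```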

This code applies a function over the columns (indicated by the 2) of the data frame df. It takes all columns from the second to the last (df[,2:ncol(df)]), fits a linear regression of each column on time (model = lm(x ~ df$time - 1)), and returns the coefficient(s) (return(coef(model))).
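
A quick cross-check of the regression described above, as a sketch: for a straight-line fit with intercept, the least-squares slope equals cov(x, y) / var(x), so the same values can be computed without lm. sapply is used here instead of apply only because it iterates over data frame columns directly; the variable names are illustrative.

```r
# Sample data from the question
df <- data.frame(time = 1:10, sample1 = 4 * 1:10, sample2 = 3 * 1:10, sample3 = 2 * 1:10)

# Closed-form least-squares slope (model with intercept) for each sample column
slopes <- sapply(df[-1], function(y) cov(df$time, y) / var(df$time))

# The same slopes from lm, for comparison
lm_slopes <- sapply(df[-1], function(y) unname(coef(lm(y ~ df$time))[2]))

slopes
```

Both vectors should agree up to floating-point error; a mismatch would usually point to non-numeric columns or missing values.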

If the first column is not always named time, then use:

model = lm(x ~ df[,1] - 1)

to indicate that the first column is the x variable.
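
The position-based idea can be wrapped in a small function, sketched here under assumptions: column_slopes is a hypothetical helper name, not part of any package, and it fits with an intercept (like Excel's SLOPE) rather than forcing the line through the origin.

```r
# Hypothetical helper: treats column 1 as x, whatever it is called,
# and returns one slope per remaining column.
column_slopes <- function(data) {
  x <- data[[1]]
  sapply(data[-1], function(y) unname(coef(lm(y ~ x))[2]))
}

# Made-up example data with a differently named time column
assay <- data.frame(seconds = 1:10, wellA = 4 * 1:10, wellB = 3 * 1:10)
slopes <- column_slopes(assay)
slopes
```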

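EDIT: The problem is that the variables were not coded as numeric; the dput output shows they are factors. Here is a solution:

df$Time = as.numeric(as.character(df$Time)) #convert time to numeric (via character, since as.numeric on a factor returns its level codes)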

Without intercept:

apply(df[,2:ncol(df)], 2, function(x){
  x = as.numeric(x) #convert x to numeric
  model = lm(x ~ df$Time - 1)
  return(coef(model))
})

With intercept:

apply(df[,2:ncol(df)], 2, function(x){
  x = as.numeric(x) #convert x to numeric
  model = lm(x ~ df$Time)
  return(coef(model)[2]) #keep only the slope, drop the intercept
})
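
To make the factor issue concrete, here is a minimal sketch that rebuilds the first five rows of two wells from the question as factors, as an assumption about how the import went wrong, and converts every column before fitting. The object names df1, num and slopes are illustrative.

```r
# Mimic the factor-coded import from the question (first five rows, wells 1 and 2)
df1 <- data.frame(
  Time = factor(c("0", "   26.1", "   52.1", "78.2", "104.3")),
  `1`  = factor(c("24315.5", "25592.5", "26493.0", "26889", "27320.5")),
  `2`  = factor(c("21446.5", "22667.5", "22839.5", "23585", "23914")),
  check.names = FALSE
)

as.numeric(df1$Time)                # level codes, not the measured times
as.numeric(as.character(df1$Time))  # the actual times

# Convert every column via character so the real values are kept
num <- df1
num[] <- lapply(df1, function(col) as.numeric(as.character(col)))

# One slope (with intercept) per well
slopes <- sapply(num[-1], function(y) unname(coef(lm(y ~ num$Time))[2]))
slopes
```

Running str(df1) before fitting is a cheap way to spot factor or character columns early.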