Converting NMR ascii file to peak list

I have some Bruker NMR spectra that i am using to create a program as part of a project. My program needs to work on the actual spectrum. So i converted the 1r files of the Bruker NMR spectra to ASCII. For Carnitine this is what the ascii file looks like(this is not the complete list. The complete list runs into thousands of lines. This is only a snapshot):

-0.807434   -23644  
-0.807067   -22980  
-0.806701   -22967  
-0.806334   -24513  
-0.805967   -27609  
-0.805601   -31145  
-0.805234   -33951  
-0.804867   -35553  
-0.804501   -35880  
-0.804134   -35240  
-0.803767   -34626  
-0.8034  -34613 
-0.803034   -34312  
-0.802667   -32411  
-0.8023  -28925 
-0.801934   -25177  
-0.801567   -22132  
-0.8012  -19395

and this is what the spectrum is: alt text http://www.bmrb.wisc.edu/metabolomics/standards/L_carnitine/nmr/bmse000211/spectra_png/1H.png

My program has to identify the peaks from this data. So i need to know how to interpret these numbers. And how exactly they are converted into their appropriate values in the spectrum. So far this is what i have learnt:

1.) The first column represents the spectral point position (ppm)

2.) The second column represents the intensity of each peak.

3.) notice that in the second column there are some numbers which are not perfectly aligned but are closer to the first column. Eg:-34613, -28925, -19395. I think this is significant.

For the sake of full disclosure- I am doing the programming in R.

NOTE: I have also asked this in Biostar but i think i have a better chance of getting an answer here than there because not many people seem to be answering questions there.

EDIT: This is one solution that i have found is plausible:

A friend gave me the idea to use an awk script to check where exactly the intensities in the file change from positive to negative to find the local maxima. Here is a working script:

awk 'BEGIN{dydx = 0;}
{ 
  if(NR > 1)
     { dydx = ($2 - y0)/($1 - x0); } 
  if(NR > 2 && last * dydx < 0)
     { printf( "%.4f  %.4f\n", (x0 + $1)/2, log((dydx<0)?-dydx:dydx)); } ;
  last=dydx; x0=$1; y0=$2
}' /home/chaitanya/Work/nmr_spectra/caffeine/pdata/1/spectrumtext.txt  | awk '$2 > 17'

Tell me if you dont understand it. I will improve the explanation.

Also, there is this related question i asked.

标签： r bioinformatics spectrum

2条回答

何必那么认真

2楼-- · 2019-06-14 17:50

Package PRocess has a function to find peaks in spectra. There are many more, if you search for something like "peak finding R"

0人赞添加讨论(0) 举报

甜甜的少女心

3楼-- · 2019-06-14 17:55

Here's a worked example with a reproducible piece of code. I don't claim it's any good with regard to the strategy or coding, but it could get you started.

find_peaks <- function (x, y, n.fine = length(x), interval = range(x), ...) {
  maxdif <- max(diff(x)) # longest distance between successive points

  ## selected interval for the search
  range.ind <- seq(which.min(abs(x - interval[1])),
                   which.min(abs(x - interval[2])))
  x <- x[range.ind]
  y <- y[range.ind]

  ## smooth the data
  spl <- smooth.spline(x, y, ...)
  ## finer x positions
  x.fine <- seq(range(x)[1], range(x)[2], length = n.fine)
  ## predicted y positions
  y.spl <- predict(spl, x.fine, der = 0)$y
  ## testing numerically the second derivative
  test <- diff(diff((y.spl), 1) > 0, 1)
  maxima <- which(test == -1) + 1

  ## according to this criterion, we found rough positions
  guess <- data.frame(x=x.fine[maxima], y=y.spl[maxima])

  ## cost function to maximize 
  obj <- function(x) predict(spl, x)$y

  ## optimize the peak position around each guess
  fit <- data.frame(do.call(rbind,
          lapply(guess$x, function(g) {
            fit <- optimize(obj, interval = g + c(-1,1) * maxdif, maximum=TRUE)
            data.frame(x=fit$maximum,y=fit$objective)
          })))

  ## return both guesses and fits
  invisible(list(guess=guess, fit=fit))
}

set.seed(123)
x <- seq(1, 15, length=100)
y <- jitter(cos(x), a=0.2)

plot(x,y)
res <- find_peaks(x,y)
points(res$guess,col="blue")
points(res$fit,col="red")

test

0人赞添加讨论(0) 举报

Converting NMR ascii file to peak list

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间