Reading a .tps morphometrics file into R

Posted 2019-07-10 10:49

Question:

I am looking to read a .tps file into R.

An example file is now available at:

example file

The actual files I am trying to read into R have many more individuals/IDs (>1000).

The .tps file format is produced by TPSDIG.

http://life.bio.sunysb.edu/morph/

The file is an ANSI plain text file.

The file contains X and Y coordinates and specimen information as follows.

The main difficulty is that specimens vary in their number of attributes: some have 4 LM landmarks and some have 6, and some have 2 curves while others have none (and therefore no associated points).

I have tried working with a for loop and read.table, but cannot find a way to account for the varying number of attributes.

Example of the start of a file:

LM=3
1  1
2  2
3  3
CURVES=2
POINTS=2
1 1
2 2
POINTS=2
1 1
2 2
IMAGE=COMPLETE/FILE/PATH/IMAGE
ID=1
SCALE=1
LM=3
1  1
2  2
3  3
CURVES=2
...

Example dummy code that works only if all specimens have the same number of attributes:

i<-1
landmarks<-NULL
while(i < 4321){

  print(i)

  landmarks.temp<-read.table(file="filepath", sep=" ", header=F, skip=i, nrows=12, col.names=c("X", "Y"))
  i<-i+13
  landmarks.temp$ID<-read.table(file="filepath", sep=c(" "), header=F, skip=i, nrows=1, as.is=T)[1,1]
  i<-i+1
  landmarks.temp$scale<-read.table(file="filepath", sep=c(" "), header=F, skip=i, nrows=1, as.is=T)[1,1]
  i<-i+2

  landmarks<-rbind(landmarks, landmarks.temp)

  print(unique(landmarks.temp$ID))
}

Answer 1:

I'm not exactly clear about what you are looking for in your output. I assumed a standard data frame with X, Y, ID, and Scale as the variables.

Try this function that I threw together and see if it gives you the type of output that you're looking for:

    read.tps = function(data) {
      a = readLines(data)
      # Anchor the patterns so "LM" or "ID" inside an IMAGE path is not matched
      LM = grep("^LM=", a)
      ID.ind = grep("^ID=", a)
      # Assumes IMAGE= is the line before ID= and SCALE= the line after it
      images = basename(sub("^IMAGE=", "", a[ID.ind - 1]))
      ids = sub("^ID=", "", a[ID.ind])
      scales = as.numeric(sub("^SCALE=", "", a[ID.ind + 1]))

      # Number of landmark rows declared by each LM= line (assumed to be >= 1)
      nrows = as.numeric(sub("^LM=", "", a[LM]))
      l = length(LM)

      landmarks = vector("list", l)

      for (i in seq_len(l)) {
        # Parse the coordinate rows from the lines already in memory
        landmarks[[i]] = data.frame(
            read.table(text=a[LM[i] + seq_len(nrows[i])], header=FALSE,
                       col.names=c("X", "Y")),
            IMAGE = images[i],
            ID = ids[i],
            Scale = scales[i])
      }
      do.call(rbind, landmarks)
    }

After you've loaded the function, you can use it by typing:

read.tps("example.tps")

where "example.tps" is the name of your .tps file in your working directory.

If you want to assign your output to a new object, you can use the standard:

landmarks <- read.tps("example.tps")
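
Note that the function above only returns the LM landmarks and skips the curve points. If you need those as well, here is a rough sketch of a single-pass parser that tags every coordinate row as either a landmark or a curve point. The function name read_tps_blocks and the type and curve columns are my own invention, not part of any package. It assumes that every specimen block starts with an LM= line and that coordinate rows are the only lines without an "=" sign. The second specimen in the demo file is made up for illustration.

```r
# Hypothetical helper (not from any package): parse a .tps file in one pass.
# Assumes each specimen block starts with LM= and that only coordinate
# rows lack an "=" sign.
read_tps_blocks <- function(path) {
  lines <- trimws(readLines(path))
  lines <- lines[nzchar(lines)]                      # drop blank lines

  is_key <- grepl("=", lines, fixed = TRUE)          # KEY=value lines
  key <- ifelse(is_key, toupper(sub("=.*$", "", lines)), NA_character_)
  val <- ifelse(is_key, sub("^[^=]*=", "", lines), NA_character_)

  specimen <- cumsum(key %in% "LM")                  # block index of every line
  starts   <- key %in% c("LM", "POINTS")             # lines that open a run of coordinates
  run_type <- key[starts][cumsum(starts)[!is_key]]   # "LM" or "POINTS" per coordinate row
  curve    <- ave(as.integer(key %in% "POINTS"), specimen, FUN = cumsum)

  # Per-specimen metadata; NA where a key is missing from a block
  meta <- function(k) {
    out <- rep(NA_character_, max(specimen))
    out[specimen[key %in% k]] <- val[key %in% k]
    out
  }

  sp <- specimen[!is_key]
  xy <- read.table(text = lines[!is_key], col.names = c("X", "Y"))
  data.frame(
    ID    = meta("ID")[sp],
    IMAGE = meta("IMAGE")[sp],
    Scale = as.numeric(meta("SCALE"))[sp],
    type  = ifelse(run_type == "LM", "landmark", "curve"),
    curve = curve[!is_key],                          # curve number, 0 for landmark rows
    xy
  )
}

# Demo on a temporary file: the first specimen follows the example in the
# question, the second one (2 landmarks, no curves) is made up.
tps <- tempfile(fileext = ".tps")
writeLines(c(
  "LM=3", "1  1", "2  2", "3  3",
  "CURVES=2", "POINTS=2", "1 1", "2 2", "POINTS=2", "1 1", "2 2",
  "IMAGE=COMPLETE/FILE/PATH/IMAGE", "ID=1", "SCALE=1",
  "LM=2", "4 4", "5 5",
  "IMAGE=COMPLETE/FILE/PATH/IMAGE2", "ID=2", "SCALE=1"
), tps)
out <- read_tps_blocks(tps)
table(out$ID, out$type)
```

Because the file is read once with readLines and everything else is vectorised, this should scale to files with more than 1000 specimens, but I have only tried it on small files like the one above.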


Answer 2:

It is perhaps worth mentioning that there is now an R package, geomorph, which provides the function readland.tps() for reading these files.
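
A minimal usage sketch, assuming geomorph is installed (install.packages("geomorph")). The demo file and its contents are made up for illustration. As far as I know, readland.tps() returns a 3D array (landmarks x dimensions x specimens), which implies that every specimen must have the same number of landmarks, so it may not fit files where that number varies.

```r
# Write a small made-up .tps file with two specimens of 3 landmarks each
tps <- tempfile(fileext = ".tps")
writeLines(c(
  "LM=3", "1 1", "2 2", "3 3", "IMAGE=a.jpg", "ID=1", "SCALE=1",
  "LM=3", "2 1", "3 2", "4 3", "IMAGE=b.jpg", "ID=2", "SCALE=1"
), tps)

# Only runs if geomorph is available on this machine
if (requireNamespace("geomorph", quietly = TRUE)) {
  # specID = "ID" names the specimens after their ID= lines
  coords <- geomorph::readland.tps(tps, specID = "ID")
  print(dim(coords))   # landmarks x dimensions x specimens
}
```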



Tags: r text-files tps