How to read output from Linux process status (ps)

Posted 2019-04-30 00:05

Question:

Here is data.txt:

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND  
root         1  0.0  0.0   2280   728 ?        Ss   20:44   0:00 init [2]    
root         2  0.0  0.0      0     0 ?        S    20:44   0:00 [kthreadd]  
root       202  0.0  0.0      0     0 ?        S<   20:44   0:00 [ext4-dio-unwri  
root       334  0.0  0.1   2916  1452 ?        Ss   20:44   0:00 udevd --daemon  

How can I read this data into a data.frame?
1. I cannot decide on a separator.
The last field is the problem: a space cannot be the separator,
because init [2] and udevd --daemon are each one field and must not be split on spaces.
2. There is no fixed width.
Each line has a different width.

So how can I read data.txt into a data.frame?

Answer 1:

I would do it like this:

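library(stringr) # has a convenient function for splitting into a fixed number of pieces

# capture the ps output, one element per line
raw          <- system("ps aux", intern = TRUE)
# the header line gives the field names and the number of fields
fields       <- strsplit(raw[1], " +")[[1]]
# split each data line into that many pieces; the last piece keeps its spaces
ps           <- str_split_fixed(raw[-1], " +", n = length(fields))
colnames(ps) <- fields
# str_split_fixed returns a character matrix; convert it to a data.frame
ps           <- as.data.frame(ps, stringsAsFactors = FALSE)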
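
If stringr is not available, the same idea can be sketched in base R. This is only an illustration, not part of the answer above: it swaps str_split_fixed for a regular expression with one capture group per field, and it uses the sample lines from the question instead of live ps output. The names n_fields, pattern and pieces are made up for the example.

```r
# Sample lines copied from the question (trailing spaces removed).
raw <- c(
  "USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND",
  "root         1  0.0  0.0   2280   728 ?        Ss   20:44   0:00 init [2]",
  "root         2  0.0  0.0      0     0 ?        S    20:44   0:00 [kthreadd]",
  "root       202  0.0  0.0      0     0 ?        S<   20:44   0:00 [ext4-dio-unwri",
  "root       334  0.0  0.1   2916  1452 ?        Ss   20:44   0:00 udevd --daemon"
)
raw <- trimws(raw)

# The header line tells us how many fields to expect (11 here).
n_fields <- length(strsplit(raw[1], " +")[[1]])

# One "(\\S+)\\s+" group for each of the first n - 1 fields,
# then "(.*)" to capture the rest of the line as the last field.
pattern <- paste0("^", strrep("(\\S+)\\s+", n_fields - 1), "(.*)$")
pieces  <- regmatches(raw, regexec(pattern, raw))

# Drop the full-match element from each result and bind the rows into a matrix.
ps <- do.call(rbind, lapply(pieces[-1], `[`, -1))
colnames(ps) <- pieces[[1]][-1]

ps[, "COMMAND"]
```

A line with fewer fields than the header would not match the pattern and would be dropped silently, so a real script should check nrow(ps) against length(raw) - 1.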


Answer 2:

Here is a one-liner that should do the trick:

do.call(rbind, lapply(strsplit(readLines("data.txt"), "\\s+"), function(fields) c(fields[1:10], paste(fields[-(1:10)], collapse = " "))))

This is what it does in detail:

  1. read all lines of the file via readLines (results in a character vector where each vector element is one line of the file)

  2. use strsplit to split each line into strings separated by white space (\\s+)

  3. for each line (lapply), merge all fields that come after the 10th field into one (via paste(..., collapse = " ")); this creates a list where each list element represents one line of the file and is a character vector of length 11 (one for each field)

  4. finally, call rbind (via do.call) to merge the list into a character matrix, which as.data.frame can convert to a data frame; note that the header line ends up as the first row
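
The steps above can be sketched as a longer version that also turns the header row into column names and converts the numeric columns. This is a sketch under assumptions, not the answer's own code: the sample lines are copied from the question and written to a temporary file so the example runs on its own, and merge_tail, ps_df and num_cols are names made up for the illustration.

```r
# Sample lines from the question, written to a temporary stand-in for data.txt.
lines_in <- c(
  "USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND",
  "root         1  0.0  0.0   2280   728 ?        Ss   20:44   0:00 init [2]",
  "root         2  0.0  0.0      0     0 ?        S    20:44   0:00 [kthreadd]",
  "root       202  0.0  0.0      0     0 ?        S<   20:44   0:00 [ext4-dio-unwri",
  "root       334  0.0  0.1   2916  1452 ?        Ss   20:44   0:00 udevd --daemon"
)
path <- tempfile(fileext = ".txt")
writeLines(lines_in, path)

# Step 1: read the file, one vector element per line.
raw <- readLines(path)

# Step 2: split each line on runs of white space.
parts <- strsplit(trimws(raw), "\\s+")

# Step 3: keep the first 10 fields and glue the rest back together as COMMAND.
merge_tail <- function(fields) {
  c(fields[1:10], paste(fields[-(1:10)], collapse = " "))
}
rows <- lapply(parts, merge_tail)

# Step 4: bind the rows into a character matrix; row 1 is the header.
m <- do.call(rbind, rows)
ps_df <- as.data.frame(m[-1, , drop = FALSE], stringsAsFactors = FALSE)
names(ps_df) <- m[1, ]

# Everything is still character at this point; convert the numeric columns.
num_cols <- c("PID", "%CPU", "%MEM", "VSZ", "RSS")
ps_df[num_cols] <- lapply(ps_df[num_cols], as.numeric)

str(ps_df)
```

Column names such as %CPU are not syntactic R names, so they need backticks or string indexing, for example ps_df[["%CPU"]].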



Answer 3:

What format is your data in? If you can open it in Excel, saving it as a tab-delimited file is most likely the best way forward.

Saving as a tab-delimited file is one of the more common ways to prepare data for import into R. In Excel this can be done with 'Save As', choosing '.txt (tab delimited)'. Once this is done:

my_data <- read.table("path/to/file/", header = TRUE, sep = "\t")

sep = "\t" tells R that your file is tab-delimited.
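
A minimal sketch of that call, assuming the data has already been re-saved with tab separators. The three columns are a subset of the question's sample and the temporary file stands in for an Excel export; tab_lines and path are names made up for the example. Raw ps output is padded with spaces, not tabs, so this only applies after such a conversion.

```r
# A small tab-delimited file standing in for the Excel export.
tab_lines <- c(
  "USER\tPID\tCOMMAND",
  "root\t1\tinit [2]",
  "root\t334\tudevd --daemon"
)
path <- tempfile(fileext = ".txt")
writeLines(tab_lines, path)

# With a tab as the separator, spaces inside COMMAND no longer split the field.
my_data <- read.table(path, header = TRUE, sep = "\t", stringsAsFactors = FALSE)

my_data$COMMAND
```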