Here is data.txt:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 2280 728 ? Ss 20:44 0:00 init [2]
root 2 0.0 0.0 0 0 ? S 20:44 0:00 [kthreadd]
root 202 0.0 0.0 0 0 ? S< 20:44 0:00 [ext4-dio-unwri
root 334 0.0 0.1 2916 1452 ? Ss 20:44 0:00 udevd --daemon
How can I read this data into a data.frame?
1. I cannot choose a separator. The last field is the problem: a space cannot be the separator, because init [2] and udevd --daemon are each a single field and must not be split on spaces.
2. The fields are not fixed width; every line has a different width.
So how can I read data.txt into a data.frame?
I would do it like this:
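library(stringr) # str_split_fixed() splits each string into a fixed number of pieces
raw <- system("ps aux", intern = TRUE) # one character string per output line
fields <- strsplit(raw[1], " +")[[1]] # column names from the header line
# Split into length(fields) pieces; the last piece keeps the whole command, spaces included
ps <- str_split_fixed(raw[-1], " +", n = length(fields))
colnames(ps) <- fields
ps <- as.data.frame(ps, stringsAsFactors = FALSE) # character matrix -> data.frame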
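If stringr is not available, roughly the same idea can be sketched in base R. This is only an illustration, not part of the answer above: it assumes the lines have already been read into a character vector (here the sample lines from the question are typed in by hand), and the names lines, pattern and ps_df are made up for the example. The regular expression captures the first ten whitespace-free fields and leaves everything else for the last one.

```r
# Sample lines copied from the question; in practice use readLines("data.txt")
lines <- c(
  "USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND",
  "root 1 0.0 0.0 2280 728 ? Ss 20:44 0:00 init [2]",
  "root 2 0.0 0.0 0 0 ? S 20:44 0:00 [kthreadd]",
  "root 334 0.0 0.1 2916 1452 ? Ss 20:44 0:00 udevd --daemon"
)

header <- strsplit(lines[1], " +")[[1]]
n <- length(header)

# (n - 1) groups of non-space text, then one group that takes the rest of the line
pattern <- paste0("^", strrep("(\\S+)\\s+", n - 1), "(.*)$")
matches <- regmatches(lines[-1], regexec(pattern, lines[-1]))

# Drop the full match (first element) and keep only the captured fields
ps_df <- as.data.frame(do.call(rbind, lapply(matches, `[`, -1)),
                       stringsAsFactors = FALSE)
names(ps_df) <- header

# Optional: let R guess column types (PID becomes integer, %CPU numeric, ...)
ps_df[] <- lapply(ps_df, type.convert, as.is = TRUE)
```

Columns such as TTY, START and TIME stay character, since their values are not numeric.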
Here is a one-liner that should do the trick:
do.call(rbind, lapply(strsplit(readLines("data.txt"), "\\s+"), function(fields) c(fields[1:10], paste(fields[-(1:10)], collapse = " "))))
This is what it does in detail:
1. Read all lines of the file via readLines (this results in a character vector where each element is one line of the file).
2. Use strsplit to split each line into strings separated by white space (\\s+).
3. For each line (lapply), merge all fields that come after the 10th field into one (via paste(..., collapse = " ")). This creates a list where each element represents one line of the file and is a character vector of length 11 (one for each field).
4. Finally, call rbind (via do.call) to merge the list into a character matrix, which as.data.frame can then turn into a data frame. Note that the header line ends up as the first row of the matrix.
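The steps above can be wrapped into a small function that also moves the header line into the column names and returns a data.frame. This is a sketch under assumptions: parse_ps_lines and its n_fixed argument are hypothetical names, and the input is the sample from the question typed in as a character vector rather than read from data.txt.

```r
# Hypothetical helper: turn `ps aux`-style lines (header first) into a data.frame.
# n_fixed is the number of leading fields that never contain spaces.
parse_ps_lines <- function(lines, n_fixed = 10) {
  parts <- strsplit(trimws(lines), "\\s+")
  rows <- lapply(parts, function(fields) {
    # Keep the first n_fixed fields, glue the remainder back together
    c(fields[seq_len(n_fixed)],
      paste(fields[-seq_len(n_fixed)], collapse = " "))
  })
  mat <- do.call(rbind, rows)                      # character matrix
  df <- as.data.frame(mat[-1, , drop = FALSE],     # data rows only
                      stringsAsFactors = FALSE)
  names(df) <- mat[1, ]                            # header row -> column names
  df
}

sample_lines <- c(
  "USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND",
  "root 1 0.0 0.0 2280 728 ? Ss 20:44 0:00 init [2]",
  "root 2 0.0 0.0 0 0 ? S 20:44 0:00 [kthreadd]",
  "root 334 0.0 0.1 2916 1452 ? Ss 20:44 0:00 udevd --daemon"
)
ps_df <- parse_ps_lines(sample_lines)
```

With a real file, parse_ps_lines(readLines("data.txt")) should behave the same way. All columns come back as character; convert the numeric ones afterwards if needed.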
What format is your data in? If you can open it in Excel, saving it as a tab-delimited file is most likely the best way forward.
Saving data as a tab-delimited file is one of the more common ways to prepare it for import into R. This can be done in Excel via 'Save As' and choosing '.txt (tab delimited)'. Once this is done:
my_data <- read.table("path/to/file/", header = TRUE, sep = "\t")
sep = "\t" tells R that your file is tab-delimited.
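To see why a tab separator sidesteps the space problem, here is a small self-contained sketch. It assumes the data has already been converted to tab-delimited form; the three columns and the tab_text variable are invented for illustration, and the text argument stands in for a real file path.

```r
# A tiny tab-delimited sample (illustrative subset of columns)
tab_text <- paste(
  "USER\tPID\tCOMMAND",
  "root\t1\tinit [2]",
  "root\t334\tudevd --daemon",
  sep = "\n"
)

# quote = "" and comment.char = "" keep quotes and '#' in commands from being interpreted
my_data <- read.table(text = tab_text, header = TRUE, sep = "\t",
                      stringsAsFactors = FALSE, quote = "", comment.char = "")
```

Because only tabs separate fields, the spaces inside init [2] and udevd --daemon stay within the COMMAND column.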