
Question:

I have very large genotype files that are basically impossible to open in R, so I am trying to extract the rows and columns of interest using the Linux command line. Rows are straightforward enough using head/tail, but I'm having difficulty figuring out how to handle the columns.

If I attempt to extract (say) the 100th through 105th tab- or space-delimited columns using

 cut -c100-105 myfile >outfile

this obviously won't work, because -c selects character positions and each column can hold a string of several characters. Is there some way to modify cut with appropriate arguments so that it extracts the entire string within a column, where columns are defined as space or tab (or any other character) delimited?

Answer 1:

If the command should work with both tabs and spaces as the delimiter, I would use awk:

awk '{print $100,$101,$102,$103,$104,$105}' myfile > outfile

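As long as you only need a handful of fields (six here), it is IMO fine to just type them out. For longer ranges you can use a for loop. Use printf rather than a bare print $i, which would put every field on its own line:

awk '{for(i=100;i<=105;i++) printf "%s%s", $i, (i<105 ? OFS : ORS)}' myfile > outfile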
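For ranges that change from file to file, the awk loop can be wrapped so the bounds are passed in rather than hard-coded. This is only a sketch: the function name extract_fields and its argument order are made up for illustration, and it assumes columns are separated by any run of spaces or tabs (awk's default splitting). It writes the selected fields tab-separated, one output line per input line.

```shell
# extract_fields FIRST LAST FILE
# Hypothetical helper: prints whitespace-delimited fields FIRST..LAST
# of every line in FILE, joined by tabs, one output line per input line.
extract_fields() {
    awk -v first="$1" -v last="$2" '
        BEGIN { OFS = "\t" }
        {
            # Print each field followed by OFS, or by ORS after the last one.
            for (i = first; i <= last; i++)
                printf "%s%s", $i, (i < last ? OFS : ORS)
        }
    ' "$3"
}
```

With this in place, something like extract_fields 100 105 myfile > outfile should cover the case from the question.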

If you want to use cut, you need to use the -f option:

cut -f100-105 myfile > outfile

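If the field delimiter is not TAB, you need to specify it with -d. Note that cut treats every single occurrence of the delimiter as a field boundary, so a run of several spaces produces empty fields: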

cut -d' ' -f100-105 myfile > outfile

Check the man page for more info on the cut command.
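One thing to watch with cut on space-delimited data is that every single delimiter character starts a new field, so two spaces in a row yield an empty field. If the file mixes tabs and runs of spaces, one possible workaround is to normalize the whitespace with tr before calling cut. The sketch below uses a hypothetical helper name, squeeze_then_cut, and assumes that no column is legitimately empty, since squeezing would silently merge such a gap. Unlike awk, leading whitespace on a line still produces an empty first field here.

```shell
# squeeze_then_cut LIST FILE
# Hypothetical helper: turns every run of spaces and/or tabs in FILE into
# a single tab, then lets cut select the fields named in LIST (e.g. 100-105).
squeeze_then_cut() {
    tr -s ' \t' '\t\t' < "$2" | cut -f"$1"
}
```

For the question's case this would be called as squeeze_then_cut 100-105 myfile > outfile.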

Answer 2:

You can use cut with a delimiter like this:

With a space delimiter:

cut -d " " -f1-100,1000-1005 infile.csv > outfile.csv

With a tab delimiter (the $'\t' quoting is bash syntax; TAB is also cut's default delimiter):

cut -d$'\t' -f1-100,1000-1005 infile.csv > outfile.csv

These examples use the form of -f that takes a comma-separated list of ranges, so you can extract several intervals in one pass.
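A detail worth knowing when listing several intervals: cut always writes the selected fields in the order they occur in the file, whatever order the list is given in. If the columns need to come out rearranged, awk can do that. The small comparison below runs on a made-up five-field sample line, not on real genotype data.

```shell
# One hypothetical sample line with five tab-separated fields: a b c d e
sample=$(printf 'a\tb\tc\td\te')

# cut ignores the order of the list and emits fields in file order.
in_file_order=$(printf '%s\n' "$sample" | cut -f4-5,1-2)

# awk prints the fields in whatever order they are named.
reordered=$(printf '%s\n' "$sample" | awk -F'\t' -v OFS='\t' '{print $4, $5, $1, $2}')

printf '%s\n%s\n' "$in_file_order" "$reordered"
```

The first line printed holds fields 1, 2, 4, 5 in that order, the second holds 4, 5, 1, 2.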

Hope it helps!

Extracting columns from text file with different d
