How to list only the file names in HDFS

Posted 2019-03-17 08:45

I would like to know whether there is a command or expression that returns only the file names in Hadoop. I need just the name of each file, but hadoop fs -ls prints the whole path.

I tried the command below, but I am wondering whether there is a better way to do it.

hadoop fs -ls <HDFS_DIR>|cut -d ' ' -f17 

Tags: shell hadoop
6 answers
我只想做你的唯一
#2 · 2019-03-17 09:15

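Use the basename command, which strips any prefix ending in '/' from a string. It expects a single path, so extract the path column from the listing and pass the paths to it one at a time:

hadoop fs -ls <HDFS_DIR> | sed 1d | awk '{print $NF}' | xargs -n 1 basename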
Viruses.
#3 · 2019-03-17 09:18

The command below returns only the file names in the directory. awk splits each line on '/' and prints the last field, which is the file name. The "Found N items" header line contains no '/', so it is printed unchanged.

hdfs dfs -ls /<folder> | awk -F'/' '{print $NF}'
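
As a minimal sketch of the same idea, the awk step can be wrapped in a small function that prints only lines containing a '/', which skips the "Found N items" header. The function name hdfs_names is hypothetical, and the input is assumed to look like ordinary hdfs dfs -ls output with the path as the last column.

```shell
# Hypothetical helper (not part of Hadoop): reads `hdfs dfs -ls` style
# output on stdin and prints the last '/'-separated field of each entry.
hdfs_names() {
  # The /\// pattern keeps only lines that contain a slash, so the
  # "Found N items" header is dropped instead of being echoed back.
  awk -F'/' '/\// {print $NF}'
}

# Assumed usage against a real cluster:
#   hdfs dfs -ls /<folder> | hdfs_names
```

Because the split is on '/', a name that contains spaces comes through intact.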

孤傲高冷的网名
#4 · 2019-03-17 09:28

The following command will return filenames only:

hdfs dfs -stat "%n" my/path/*
We Are One
#5 · 2019-03-17 09:30

hadoop fs -ls -C /path/* | xargs -n 1 basename
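
Note that xargs splits its input on whitespace, so a name containing a space is broken into several pieces before basename sees it. A possible alternative, sketched below, strips the directory part with sed instead. The name strip_dirs is hypothetical, and the sketch assumes one path per line, which is what -C prints.

```shell
# Hypothetical helper: reads one path per line on stdin and removes
# everything up to and including the last '/'.
strip_dirs() {
  sed 's|.*/||'
}

# Assumed usage:
#   hadoop fs -ls -C /path/* | strip_dirs
```

This also avoids starting one basename process per file.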
Melony?
#6 · 2019-03-17 09:31

I hope this helps someone. With version 2.8.x and later (available in 3 as well), the -C option makes ls print only the paths, without the other columns:

hadoop fs -ls -C /paths/
成全新的幸福
#7 · 2019-03-17 09:33

It seems hadoop ls does not support any options to output just the filenames, or even just the last column.

If you want to get the last column reliably, first squeeze each run of whitespace into a single space so that you can then address the last column by number:

hadoop fs -ls | sed '1d;s/  */ /g' | cut -d\  -f8

This gives you just the last column, but each file still carries its whole path. If you want just the file names, you can use basename, as @rojomoke suggests:

hadoop fs -ls | sed '1d;s/  */ /g' | cut -d\  -f8 | xargs -n 1 basename

The 1d in the sed expression also filters out the first line, which says "Found x items".

Note: as @felix-frank points out in the comments, the commands above do not correctly preserve file names that contain multiple consecutive spaces. Hence a more correct solution, proposed by Felix:

hadoop fs -ls /tmp | sed 1d | perl -wlne'print +(split " ",$_,8)[7]'
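
The same split-into-eight-fields idea can also be sketched in plain shell, since read assigns the remainder of the line, inner spaces included, to its last variable. The function name ls_basenames and the variable names are hypothetical, and the eight-column layout is assumed from the -f8 and split calls above.

```shell
# Hypothetical helper: reads `hadoop fs -ls` style output on stdin and
# prints only the final path component of each entry.
ls_basenames() {
  while read -r perms repl owner group size date time path; do
    # Lines with fewer than eight fields (such as "Found N items")
    # leave $path empty and are skipped.
    [ -n "$path" ] || continue
    # Remove everything up to the last '/' to keep only the file name.
    printf '%s\n' "${path##*/}"
  done
}

# Assumed usage:
#   hadoop fs -ls /tmp | ls_basenames
```

One limitation of this sketch: read trims trailing whitespace, so a name that ends in spaces would not be preserved.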
