Opening a file stored in HDFS to edit in VI

Posted 2019-04-05 07:21

I would like to edit a text file directly in HDFS using VI, without having to copy it to the local filesystem, edit it, and then copy it back. Is this possible?

Edit: This used to be possible in Cloudera's Hue UI but is no longer the case.

5 answers
【Aperson】
#2 · 2019-04-05 08:01

Other answers here are correct: you can't edit files in HDFS in place, because it is not a POSIX-compliant filesystem. Only appends are possible.

That said, I recently had to fix a header in an HDFS file, and this is the best I came up with:

sc.textFile(orig_file).map(fix_header).coalesce(1).saveAsTextFile(orig_file + '_fixed')

This is Spark (PySpark) code. Note the coalesce(1): the job does not run in parallel, but the benefit is that you get only one output file. Then just move/rename the file from the orig_file + '_fixed' directory to overwrite the original file.

P.S. You could omit the .coalesce(1) part, and the conversion will run in parallel (assuming a big file with multiple splits) and be much faster, but then you'll have to merge the output HDFS files into one.

P.P.S. The map call in the pipeline fixes the header through the fix_header function (not shown here for clarity).
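Since fix_header is not shown, here is one hypothetical shape it could take, as a minimal sketch only. The header strings are invented for illustration, and the function assumes the header can be recognized by an exact match on the line. Note that any data line identical to the old header would be rewritten too.

```python
# Hypothetical header values; substitute the real ones for your file.
OLD_HEADER = "id,nmae,amount"
NEW_HEADER = "id,name,amount"


def fix_header(line):
    """Return the corrected header if `line` is the old header, else `line` unchanged."""
    return NEW_HEADER if line == OLD_HEADER else line


# Plain-Python demonstration of what the map step does to each line.
sample = ["id,nmae,amount", "1,alice,10", "2,bob,20"]
fixed = [fix_header(line) for line in sample]
print(fixed)

# In the Spark job the function is passed to map() as-is:
#   sc.textFile(orig_file).map(fix_header).coalesce(1).saveAsTextFile(orig_file + '_fixed')
```

Because the function looks at one line at a time, it works the same way whether or not coalesce(1) is kept in the pipeline.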

别忘想泡老子
#3 · 2019-04-05 08:06

There are a couple of options you could try that let you mount HDFS on your local machine, so that you can use local system commands like cp, rm, cat, mv, mkdir, rmdir, more, etc. However, neither of them supports random writes; they support append operations only.

NFS Gateway uses NFSv3 and supports appending to a file, but it cannot perform random writes.

And regarding your comment on Hue: maybe Hue downloads the file to a local buffer and, after editing, replaces the original file in HDFS.

乱世女痞
#4 · 2019-04-05 08:13

A simple way is to copy the file from HDFS, edit it locally, and copy it back (see here):

hvim <filename>

Source code of hvim

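#!/bin/bash
# Dump the HDFS file into a local scratch copy.
hadoop fs -text "$1" > hvim.txt
vim hvim.txt
# Replace the original: remove it, then upload the edited copy.
hadoop fs -rm -skipTrash "$1"
hadoop fs -copyFromLocal hvim.txt "$1"
rm hvim.txt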
别忘想泡老子
#5 · 2019-04-05 08:13

A file in HDFS can be replaced using the -f option: hadoop fs -put -f. This eliminates the need to delete and then copy.
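The download, edit, and overwrite cycle built around -put -f could be wrapped roughly as follows. This is only a sketch: the function name edit_hdfs_file and its editor and run parameters are made up for illustration, and it assumes the hadoop CLI is on the PATH. The run parameter exists only so the command sequence can be exercised without a cluster.

```python
import os
import subprocess
import tempfile


def edit_hdfs_file(hdfs_path, editor="vi", run=subprocess.run):
    """Download hdfs_path, open it in `editor`, then overwrite the original."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        local_path = os.path.join(tmp_dir, os.path.basename(hdfs_path))
        # Download a local working copy.
        run(["hadoop", "fs", "-get", hdfs_path, local_path], check=True)
        # Blocks until the editor exits.
        run([editor, local_path], check=True)
        # -f overwrites the destination if it already exists.
        run(["hadoop", "fs", "-put", "-f", local_path, hdfs_path], check=True)
```

With check=True, a failed download or an editor that exits with a non-zero status raises subprocess.CalledProcessError before the upload step, so the file in HDFS is left as it was.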

【Aperson】
#6 · 2019-04-05 08:21

A file in HDFS can't be edited directly. You can't even replace a file in HDFS; the only way is to delete the file and upload a new one in its place.

Edit the file locally and copy it to HDFS again. Don't forget to delete the old file if you want to keep the same name.
