I'm having trouble to append data to an existing file in HDFS. I want that if the file exists then append a line, if not, create a new file with the name given.
Here's my method to write into HDFS.
if (!file.exists(path)){
file.createNewFile(path);
}
FSDataOutputStream fileOutputStream = file.append(path);
BufferedWriter br = new BufferedWriter(new OutputStreamWriter(fileOutputStream));
br.append("Content: " + content + "\n");
br.close();
Actually this method writes into HDFS and create a file but as I mention is not appending.
This is how I test my method:
RunTimeCalculationHdfsWrite.hdfsWriteFile("RunTimeParserLoaderMapperTest2", "Error message test 2.2", context, null);
The first param is the name of the file, the second the message and the other two params are not important.
So anyone have an idea what I'm missing or doing wrong?
Solved..!!
Append is supported in HDFS.
You just have to do some configurations and simple code as shown below :
Step 1: set dfs.support.append as true in hdfs-site.xml :
Stop all your daemon services using stop-all.sh and restart it again using start-all.sh
Step 2 (Optional): Only If you have a singlenode cluster , so you have to set replication factor to 1 as below :
Through command line :
Or you can do the same at run time through java code:
Step 3 : Code for Creating/appending data into the file :
Kindly let me know for any other help.
Cheers.!!
Actually, you can append to a HDFS file:
I checked HDFS sources, there is
DistributedFileSystem#append
method:For details, see presentation.
Also you can append through command line:
Add lines directly from stdin:
HDFS does not allow
append
operations. One way to implement the same functionality as appending is: