Writing to HDFS: File is overwritten

Posted 2020-07-30 03:38

Question:

I am writing to the Hadoop file system, but every time I append something, it overwrites the existing data instead of adding to the existing file. The code that does this is below. It is called again and again with different data. Is opening a new SequenceFile.Writer every time the problem?

Each time, I create the path as new Path("someDir").

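  public void writeToHDFS(Path path, long uniqueId, String data) throws IOException {
      FileSystem fs = path.getFileSystem(conf);
      // Called once per record: a new writer is opened and closed on every call.
      SequenceFile.Writer inputWriter = new SequenceFile.Writer(fs, conf,
          path, LongWritable.class, MyWritable.class);
      // uniqueId is passed by value, so a post-increment here would never reach the caller.
      inputWriter.append(new LongWritable(uniqueId), new MyWritable(data));
      inputWriter.close();
  }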

Answer 1:

There is currently no way to append to an existing SequenceFile through the API. When you create a new SequenceFile.Writer object, it does not append to an existing file at that Path; it overwrites it. See my earlier question.

As Thomas points out, if you keep the same SequenceFile.Writer object, you will be able to append to the file until you call close().
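
To make the difference concrete without needing a cluster, here is a minimal sketch in plain java.io/java.nio. It is only an analogy, not the Hadoop API: DataOutputStream stands in for SequenceFile.Writer, and the class and method names (WriterReuseSketch, reopenPerRecord, openOnce) are made up for illustration. The point it shows is the same one as above: opening a fresh writer per record replaces the file each time, while one writer kept open accumulates every record until close().

```java
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in for the HDFS case; a local file replaces the SequenceFile.
final class WriterReuseSketch {

    // Mirrors the pattern in the question: a new stream is opened for every record.
    // Files.newOutputStream with default options truncates an existing file,
    // so only the last record survives.
    static long reopenPerRecord(Path file, String[] records) throws IOException {
        for (String record : records) {
            try (DataOutputStream out = new DataOutputStream(Files.newOutputStream(file))) {
                out.writeUTF(record);
            }
        }
        return Files.size(file);
    }

    // Mirrors the suggested fix: open once, append every record, close once.
    static long openOnce(Path file, String[] records) throws IOException {
        try (DataOutputStream out = new DataOutputStream(Files.newOutputStream(file))) {
            for (String record : records) {
                out.writeUTF(record);
            }
        }
        return Files.size(file);
    }

    public static void main(String[] args) throws IOException {
        String[] records = {"a", "bb", "ccc"};
        Path reopened = Files.createTempFile("reopen", ".bin");
        Path single = Files.createTempFile("once", ".bin");

        // writeUTF stores a 2-byte length followed by the encoded characters.
        System.out.println(reopenPerRecord(reopened, records)); // 5: only "ccc" is left
        System.out.println(openOnce(single, records));          // 12: all three records

        Files.delete(reopened);
        Files.delete(single);
    }
}
```

Applied to the code in the question, this would mean creating the SequenceFile.Writer once (for example as a field), calling append() for each record, and calling close() only when all the data has been written.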



Tags: hadoop hdfs