Spring Batch file writer to write directly to Amazon S3

Published: 2020-03-30 07:30

Question:

I'm trying to upload a file to Amazon S3. Instead of uploading an existing file, I want to read the data from a database using Spring Batch and write it directly into S3 storage. Is there any way to do that?

Answer 1:

Spring Cloud AWS adds support for the Amazon S3 service to load and write resources with the resource loader and the s3 protocol. Once you have configured the AWS resource loader, you can write a custom Spring Batch writer like:

import java.io.OutputStream;
import java.util.List;

import org.springframework.batch.item.ItemWriter;
import org.springframework.core.io.ResourceLoader;
import org.springframework.core.io.WritableResource;

public class AwsS3ItemWriter implements ItemWriter<String> {

    private ResourceLoader resourceLoader;

    private WritableResource resource;

    public AwsS3ItemWriter(ResourceLoader resourceLoader, String resource) {
        this.resourceLoader = resourceLoader;
        // An "s3://bucket/key" location resolves to a writable resource
        // once Spring Cloud AWS is configured
        this.resource = (WritableResource) this.resourceLoader.getResource(resource);
    }

    @Override
    public void write(List<? extends String> items) throws Exception {
        // Open the resource's output stream and write each item to it
        try (OutputStream outputStream = resource.getOutputStream()) {
            for (String item : items) {
                outputStream.write(item.getBytes());
            }
        }
    }
}

Then you should be able to use this writer with an S3 resource like s3://myBucket/myFile.log.
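As a rough wiring sketch (not part of the original answer): the writer can be plugged into a chunk-oriented step. This assumes Spring Cloud AWS's resource loader is active (e.g. via @EnableContextResourceLoader) and a hypothetical databaseReader bean pulling rows from the database; the bucket and file names are placeholders.

import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.item.ItemReader;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.ResourceLoader;

@Configuration
public class S3WriterConfig {

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    // Resolves "s3://..." locations once Spring Cloud AWS is configured
    @Autowired
    private ResourceLoader resourceLoader;

    @Bean
    public AwsS3ItemWriter itemWriter() {
        // "myBucket" and "myFile.log" are placeholders
        return new AwsS3ItemWriter(resourceLoader, "s3://myBucket/myFile.log");
    }

    @Bean
    public Step uploadStep(ItemReader<String> databaseReader) {
        // databaseReader is a hypothetical reader pulling rows from the database
        return stepBuilderFactory.get("uploadStep")
                .<String, String>chunk(100)
                .reader(databaseReader)
                .writer(itemWriter())
                .build();
    }
}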


Please note that I did not compile/test the previous code. I just wanted to give you a starting point for how to do it.

Hope this helps.



Answer 2:

I had the same thing to do. Since Spring has no class for writing to a stream alone, I made one myself, along the lines of the example above.

You need two classes for this. First, a resource class which implements WritableResource and extends AbstractResource:

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import org.springframework.core.io.AbstractResource;
import org.springframework.core.io.WritableResource;

public class S3Resource extends AbstractResource implements WritableResource {

    // Buffers everything in memory until it is uploaded
    private final ByteArrayOutputStream resource = new ByteArrayOutputStream();

    @Override
    public String getDescription() {
        return "In-memory buffer to be uploaded to S3";
    }

    @Override
    public InputStream getInputStream() throws IOException {
        return new ByteArrayInputStream(resource.toByteArray());
    }

    @Override
    public OutputStream getOutputStream() throws IOException {
        return resource;
    }
}

And second, your writer, which implements ItemWriter:

import java.io.OutputStream;
import java.util.List;

import org.springframework.batch.item.ItemWriter;
import org.springframework.batch.item.file.transform.LineAggregator;
import org.springframework.core.io.WritableResource;

public class AmazonStreamWriter<T> implements ItemWriter<T> {

    private WritableResource resource;
    private LineAggregator<T> lineAggregator;
    private String lineSeparator;

    AmazonStreamWriter(WritableResource resource) {
        this.resource = resource;
    }

    public String getLineSeparator() {
        return lineSeparator;
    }

    public void setLineSeparator(String lineSeparator) {
        this.lineSeparator = lineSeparator;
    }

    public WritableResource getResource() {
        return resource;
    }

    public void setResource(WritableResource resource) {
        this.resource = resource;
    }

    public LineAggregator<T> getLineAggregator() {
        return lineAggregator;
    }

    public void setLineAggregator(LineAggregator<T> lineAggregator) {
        this.lineAggregator = lineAggregator;
    }

    @Override
    public void write(List<? extends T> items) throws Exception {
        try (OutputStream outputStream = resource.getOutputStream()) {
            // Aggregate each item into a line and append the separator
            StringBuilder lines = new StringBuilder();
            for (T item : items) {
                lines.append(lineAggregator.aggregate(item)).append(lineSeparator);
            }
            outputStream.write(lines.toString().getBytes());
        }
    }
}
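A hypothetical wiring of the two classes (not from the original answer); Person is a placeholder domain type, and PassThroughLineAggregator simply calls toString() on each item:

import org.springframework.batch.item.file.transform.PassThroughLineAggregator;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class StreamWriterConfig {

    @Bean
    public S3Resource s3Resource() {
        return new S3Resource();
    }

    @Bean
    public AmazonStreamWriter<Person> amazonStreamWriter() {
        // "Person" is a placeholder domain type
        AmazonStreamWriter<Person> writer = new AmazonStreamWriter<>(s3Resource());
        writer.setLineAggregator(new PassThroughLineAggregator<>());
        writer.setLineSeparator(System.lineSeparator());
        return writer;
    }
}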

With this setup you write the item information you receive from your database to your custom resource via an OutputStream. The filled resource can then be used in one of your steps to open an InputStream and upload to S3 via the client. I did it with: amazonS3.putObject(awsBucketName, awsBucketKey, resource.getInputStream(), new ObjectMetadata());
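As a sketch of that upload step (assuming the AWS SDK v1 AmazonS3 client; the bucket name and key fields are placeholders):

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.ObjectMetadata;
import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.repeat.RepeatStatus;

public class S3UploadTasklet implements Tasklet {

    private final AmazonS3 amazonS3;
    private final S3Resource resource;
    private final String awsBucketName; // placeholder
    private final String awsBucketKey;  // placeholder

    public S3UploadTasklet(AmazonS3 amazonS3, S3Resource resource,
                           String awsBucketName, String awsBucketKey) {
        this.amazonS3 = amazonS3;
        this.resource = resource;
        this.awsBucketName = awsBucketName;
        this.awsBucketKey = awsBucketKey;
    }

    @Override
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
        // Upload the in-memory buffer that the writer step filled
        amazonS3.putObject(awsBucketName, awsBucketKey, resource.getInputStream(), new ObjectMetadata());
        return RepeatStatus.FINISHED;
    }
}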

My solution may not be the perfect approach, but from here you can optimize it.



Answer 3:

The problem is that the OutputStream will only contain the last list of items sent by the step, since the resource's output stream is reopened on every chunk... I think you might need to write a temporary file on the file system and then send the whole file in a separate tasklet.

See this example : https://github.com/TerrenceMiao/AWS/blob/master/dynamodb-java/src/main/java/org/paradise/microservice/userpreference/service/writer/CSVFileWriter.java
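For reference, a minimal sketch of that two-step approach (not from the linked example): a standard FlatFileItemWriter fills a local temporary file in the chunk step, and a separate tasklet uploads the complete file in one call. The bucket name and key are placeholders.

import java.io.File;

import com.amazonaws.services.s3.AmazonS3;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.item.file.FlatFileItemWriter;
import org.springframework.batch.item.file.transform.PassThroughLineAggregator;
import org.springframework.batch.repeat.RepeatStatus;
import org.springframework.core.io.FileSystemResource;

public class TempFileUploadConfig {

    // Chunk step: write all items to a local temporary file
    public FlatFileItemWriter<String> tempFileWriter(File tempFile) {
        FlatFileItemWriter<String> writer = new FlatFileItemWriter<>();
        writer.setName("tempFileWriter"); // required for state saving
        writer.setResource(new FileSystemResource(tempFile));
        writer.setLineAggregator(new PassThroughLineAggregator<>());
        return writer;
    }

    // Separate tasklet step: upload the whole file to S3 at once
    public Tasklet uploadTasklet(AmazonS3 amazonS3, File tempFile) {
        return (contribution, chunkContext) -> {
            amazonS3.putObject("myBucket", "myFile.log", tempFile); // placeholders
            return RepeatStatus.FINISHED;
        };
    }
}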