Upload of huge file using a web application

Published 2019-06-02 01:03

Question:

The environment for this task is not currently available, so I can't try things out and have to rely on analysis alone.

My objective can be broken into the following distinct steps:

  1. Uploading huge files (up to 100GB) using a simple 'Upload File' page. There is no way around this, as the users want a (dumb) front end and are not willing to FTP the file, etc.
  2. The web application that provides this front end will be hosted on a low-end machine (2GB RAM, 40GB HDD). It WILL NOT STORE any part of the huge file on the local machine but must 'quickly' write it to a high-end remote Linux machine.

For each step, here are my approach, concerns and questions:

  • I referred to this thread, which confused me, as I was planning to build a simple web application using Spring MVC with an upload page. Do I need to get into HTML5 etc., or will a simple web application suffice?

  • Given the 2GB of RAM, the web application will get less than 1GB of it. I'm afraid an 'OutOfMemoryError' is likely unless the code is written carefully: I have to ensure that only a small chunk, say 10MB, is read from the stream at a time and written to the file on the remote Linux machine. Assuming that I am in the controller servlet's doPost(...), I did some reading about how to proceed and got confused:

          /**
             * @see HttpServlet#doPost(HttpServletRequest request, HttpServletResponse
             *      response)
             */
            protected void doPost(HttpServletRequest request,
                    HttpServletResponse response) throws ServletException, IOException {
                // TODO Auto-generated method stub
    
                InputStream fis = request.getInputStream();
                int i = 0;
    
                /* Approach - 1 : Plain old byte-by-byte method */
                Socket socket = new Socket("192.168.90.20", 22);
                OutputStream remoteOpStream = socket.getOutputStream();
    
                while ((i = fis.read()) != -1) {
                    remoteOpStream.write(i);
                }
    
                /* clean-up */
    
                /* Approach - 2 : NIO */
                ByteBuffer byteBuff = ByteBuffer.allocate(10 * 1024 * 1024); /* 10MB buffer */
    
                ReadableByteChannel rdbyc = Channels.newChannel(request
                        .getInputStream());
    
                /* Dunno how to create a File on a remote Linux machine */
                File remoteFile = new File("192.168.90.20/Remote_Linux_Folder");
                FileOutputStream remoteFos = new FileOutputStream(remoteFile);
                FileChannel writableChannel = remoteFos.getChannel();
    
                while (true/* dunno how to loop till all the data is read! */) {
                    rdbyc.read(byteBuff);
                    writableChannel.write(byteBuff);
                }
    
                /* clean-up */
    
            }
    

I need some way to keep data storage on the local machine minimal: the code should simply read n bytes from the input stream and write the same bytes to the remote machine.
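The "read n bytes, write n bytes" idea can be sketched with plain streams and a single reusable buffer. This is only an illustration under a loud assumption: that the remote Linux folder is reachable as an ordinary path on the web host (for example through a network mount), which the question does not say. The class name `UploadRelay`, the method `streamTo`, and the temporary file standing in for the mounted path are all hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

final class UploadRelay {

    /**
     * Copies everything from in to target through one reusable buffer,
     * so heap usage stays near bufferSize regardless of the upload size.
     * Returns the number of bytes written.
     */
    static long streamTo(InputStream in, Path target, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        try (OutputStream out = Files.newOutputStream(target)) {
            int n;
            // read() returns -1 only at end of stream
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);   // write only the bytes actually read
                total += n;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // ASSUMPTION: a temp file stands in for a path on a mounted remote folder.
        Path target = Files.createTempFile("upload", ".bin");
        byte[] payload = new byte[1_000_000];
        long written = streamTo(new ByteArrayInputStream(payload), target, 64 * 1024);
        System.out.println("wrote " + written + " bytes to " + target);
        Files.delete(target);
    }
}
```

In a servlet, `in` would be `request.getInputStream()`. Nothing is buffered beyond the one array, so the 10MB chunk size from the question is an upper bound on what this loop holds, not a requirement.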

I believe NIO is the way to go, but I can't work out how to proceed. Please advise.
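For the NIO variant, the missing piece in Approach 2 is the loop condition and the buffer bookkeeping: read until the channel reports end of stream (-1), flip the buffer before writing, drain it fully, then clear it. A minimal sketch follows; `ChannelCopy` and `copy` are illustrative names, and in-memory streams stand in for the request body and the remote destination.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;

final class ChannelCopy {

    /** Copies src to dst through one fixed-size buffer; returns bytes written. */
    static long copy(ReadableByteChannel src, WritableByteChannel dst, int bufferSize)
            throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(bufferSize);
        long total = 0;
        // read() returns -1 at end of stream; that is the loop's exit condition
        while (src.read(buf) != -1) {
            buf.flip();                    // switch the buffer from filling to draining
            while (buf.hasRemaining()) {   // a single write() may be partial
                total += dst.write(buf);
            }
            buf.clear();                   // make the whole buffer available for the next read
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = new byte[1_000_000];
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(Channels.newChannel(new ByteArrayInputStream(payload)),
                           Channels.newChannel(sink), 64 * 1024);
        System.out.println("copied " + copied + " bytes");
    }
}
```

Note that wrapping a blocking servlet input stream in a channel does not by itself reduce memory use compared with a plain `byte[]` loop; both keep only one buffer on the heap.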

Answer 1:

I would implement a FixedSizeQueue and popAll() the data out of the QueueStream to the other computer. I would probably make it double-buffered, to cushion against network/bandwidth problems.
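One way to read this suggestion is a bounded producer/consumer pipeline: one thread reads chunks from the upload while another writes them out, with a small fixed-capacity queue in between. The sketch below swaps in the JDK's `ArrayBlockingQueue` for the `FixedSizeQueue` named above (which is not a standard class), and the names `BoundedRelay` and `relay` are hypothetical. With `slots = 2` it behaves like double buffering.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicReference;

final class BoundedRelay {
    private static final byte[] EOF = new byte[0];   // sentinel, compared by identity

    /** Relays in to out via a bounded queue; chunkSize must be positive. */
    static long relay(InputStream in, OutputStream out, int chunkSize, int slots)
            throws IOException, InterruptedException {
        BlockingQueue<byte[]> queue = new ArrayBlockingQueue<>(slots);
        AtomicReference<IOException> readError = new AtomicReference<>();

        Thread reader = new Thread(() -> {
            try {
                try {
                    while (true) {
                        byte[] chunk = new byte[chunkSize];
                        int n = in.read(chunk);
                        if (n == -1) break;
                        // put() blocks while the queue is full: this is the memory bound
                        queue.put(n == chunkSize ? chunk : Arrays.copyOf(chunk, n));
                    }
                } catch (IOException e) {
                    readError.set(e);
                }
                queue.put(EOF);                       // tell the writer we are done
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();   // writer gave up; stop reading
            }
        });
        reader.start();

        long total = 0;
        try {
            for (byte[] chunk = queue.take(); chunk != EOF; chunk = queue.take()) {
                out.write(chunk);
                total += chunk.length;
            }
            out.flush();
        } finally {
            reader.interrupt();   // unblocks a reader stuck in put() if the write failed
            reader.join();
        }
        if (readError.get() != null) throw readError.get();
        return total;
    }

    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long n = relay(new ByteArrayInputStream(new byte[1_000_000]), sink, 64 * 1024, 2);
        System.out.println("relayed " + n + " bytes");
    }
}
```

At most `slots + 2` chunks exist at once (the queued ones, plus one being read and one being written), so with 10MB chunks and two slots the pipeline stays around 40MB. The queue lets reading continue during short stalls on the remote side; a long stall simply blocks the reader, which in turn slows the client through normal TCP backpressure.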