What is the most efficient way of sending files between two Node.js servers?

Published 2019-02-07 11:28

Question:

Introduction

Say that on the same local network we have two Node.js servers set up with Express: Server A for the API and Server F for the form.

  • Server A is an API server that takes requests and saves them to a MongoDB database (files are stored as a Buffer and their details as other fields)
  • Server F serves up a form, handles the form post, and sends the form's data to Server A.

What is the most efficient way to send files between two Node.js servers where the receiving server is an Express API? And at what point does file size start to matter?

1. HTTP Way

If the files I'm sending are PDF files (that won't exceed 50 MB), is it efficient to send the whole contents as a string over HTTP?

The algorithm is as follows:

  • Server F handles the file upload using https://www.npmjs.com/package/multer and saves the file
  • then Server F reads this file and makes an HTTP request via https://github.com/request/request along with some details about the file
  • Server A receives this request, turns the file contents from string to Buffer, and saves a record in MongoDB along with the file details.

In this algorithm, both Server A (when storing into MongoDB) and Server F (when sending it over to Server A) have to read the file into memory, and the request between the two servers is about the same size as the file. (Are 50 MB requests alright?)

However, one thing to consider is that, with this method, I would be using the Express style of API for the whole process, and it would be consistent with the rest of the app, where the /list and /details requests are also defined in the routes. I like consistency.

2. Socket.IO Way

In contrast to this algorithm, I've explored the https://github.com/nkzawa/socket.io-stream way, which breaks away from the consistency of the HTTP API on Server A (the handlers for Socket.IO events are defined not in the routes but in the file that has var server = http.createServer(app);).

Server F handles the form data as such in routes/some_route.js:

// requires hoisted to the top of routes/some_route.js
var fs = require('fs');
var io = require('socket.io-client');
var ss = require('socket.io-stream');

router.post('/', multer({dest: './uploads/'}).single('file'), function (req, res) {
    var api_request = {};
    api_request.name = req.body.name;
    //add other fields to api_request ...

    var has_file = req.hasOwnProperty('file');

    var transaction_sent = false;
    var socket = io.connect('http://localhost:3000');
    socket.on('connect', function () {
        console.log('socket connected to 3000');

        if (transaction_sent === false) {
            var stream = ss.createStream();

            ss(socket).emit('transaction new', stream, api_request);

            if (has_file) {
                // req.file.path already holds the full path multer saved to
                var filename = req.file.path;
                console.log('sending with file: ', filename);
                fs.createReadStream(filename).pipe(stream);
            } else {
                console.log('sending without file.');
            }
            transaction_sent = true;

            //get the response via socket
            socket.on('transaction new sent', function (data) {
                console.log('response from 3000:', data);
                //there might be a better way to close the socket, but this works
                socket.close();
                console.log('closed socket to 3000');
            });
        }
    });
});

I said I'd be dealing with PDF files that are < 50 MB. However, if I use this program to send larger files in the future, is Socket.IO a better way to handle, say, 1 GB files, since it uses streams?

This method does send the file and the details across, but I'm new to this library and don't know whether it should be used for this purpose or whether there is a better way of utilizing it.

Final thoughts

What alternative methods should I explore?

  • Should I send the file over SCP and make an HTTP request with the file details, including where I've sent it, thus separating the protocol for files from the one for API requests?
  • Should I always use streams, since they don't store the whole file in memory? (That's how they work, right?)
  • What about https://github.com/liamks/Delivery.js ?

References:

  • File/Data transfer between two node.js servers (this got me to try the socket.io-stream way)
  • transfer files between two node.js servers over http (for the HTTP way)

Answer 1:

There are plenty of ways to achieve this, but not so many to do it right!

Socket.IO and WebSockets are efficient when you use them with a browser, but since you don't, there is no need for them.

The first method you can try is the built-in net module of Node.js: basically it will make a TCP connection between the servers and pass the data over it.

You should also keep in mind that you need to send chunks of data, not the entire file at once; the socket.write method of the net module seems to be a good fit for your case. Check it out: https://nodejs.org/api/net.html

But depending on the size of your files and the concurrency, memory consumption can be quite large.

If you are running Linux on both servers, you could even send the files at the OS level with a simple command called scp:

nohup scp -rpC /var/www/httpdocs/* remote_user@remote_domain.com:/var/www/httpdocs &

You can even do this from Windows to Linux or the other way around.

The scp client for Windows is pscp.exe, available from the PuTTY site:

http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html

Hope this helps!