Break HTTP file uploading from server side by PHP

Posted 2019-01-23 16:28

Question:

When uploading a big file (>100 MB) to the server, PHP always accepts the entire POST data from the browser first. We cannot hook into the upload process.

For example, checking the value of "token" before the entire data is sent to the server is IMPOSSIBLE in my PHP code:

<form enctype="multipart/form-data" action="upload.php?token=XXXXXX" method="POST">
    <input type="hidden" name="MAX_FILE_SIZE" value="3000000" />
    Send this file: <input name="userfile" type="file" />
    <input type="submit" value="Send File" />
</form>

So I tried to use mod_rewrite like this:

RewriteEngine On
RewriteMap mymap prg:/tmp/map.php
RewriteCond %{QUERY_STRING} ^token=(.*)$ [NC]
RewriteRule ^/upload/fake.php$ ${mymap:%1} [L]

map.php

#!/usr/bin/php
<?php
define("REAL_TARGET", "/upload/real.php\n");
define("FORBIDDEN", "/upload/forbidden.html\n");

$handle = fopen ("php://stdin","r");
while($token = trim(fgets($handle))) {
    file_put_contents("/tmp/map.log", $token."\n", FILE_APPEND);
    if (check_token($token)) {
        echo REAL_TARGET;
    } else {
        echo FORBIDDEN;
    }
}

function check_token ($token) {//do your own security check
    return substr($token,0,4) === 'alix';
}

But ... it fails again. mod_rewrite seems to kick in too late in this situation. The data is still transferred entirely.

Then I tried Node.js, like this (code snippet):

var stream = new multipart.Stream(req);
stream.addListener('part', function(part) {
    sys.print(req.uri.params.token+"\n");
    if (req.uri.params.token != "xxxx") {//check token
      res.sendHeader(200, {'Content-Type': 'text/plain'});
      res.sendBody('Incorrect token!');
      res.finish();
      sys.puts("\n=> Block");
      return false;
    }

The result is ... failure again.

So please help me find the correct way to resolve this issue, or tell me that there is no way.

Related questions:

Can PHP (with Apache or Nginx) check HTTP header before POST request finished?

Can some tell me how to make this script check for the password before it starts the upload process instead of after the file is uploaded?

Answer 1:

First of all, you can try this code yourself using the GitHub repo I created for this. Just clone the repository and run node header.

(Spoiler, if you're reading this and are under time pressure to get something to work and not in the mood to learn ( :( ), there is a simpler solution at the end)

The general idea

This is a great question. What you are asking for is very possible and needs no client-side code, just a deeper understanding of how the HTTP protocol works, while showing how node.js rocks :)

This can be made easy if we go one level deeper to the underlying TCP protocol and process the HTTP requests ourselves for this specific case. Node.js lets you do this easily using the built in net module.

The HTTP Protocol

First, let's look at how HTTP requests work.

An HTTP request starts with a request line, followed by a headers section in the general format of key: value pairs separated by CRLF (\r\n). We know that the header section has ended when we reach a double CRLF (that is, \r\n\r\n).

A typical HTTP GET request might look something like this:

GET /resource HTTP/1.1  
Cache-Control: no-cache  
User-Agent: Mozilla/5.0 

Hello=World&stuff=other

The top part before the 'empty line' is the headers section and the bottom part is the body of the request. The body of your request will look a bit different, since it is encoded as multipart/form-data, but the headers will remain similar. Let's explore how this applies to us.
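To make the split between headers and body concrete, here is a minimal sketch that cuts a raw request at the first double CRLF. The helper name splitRequest and the sample request (host, boundary, size) are made up for illustration; only the \r\n\r\n rule comes from the protocol.

```javascript
// Hypothetical helper: split a raw HTTP request into request line,
// header lines and body at the first double CRLF.
function splitRequest(raw) {
    var end = raw.indexOf("\r\n\r\n");
    if (end === -1) {
        return null; // header section not complete yet
    }
    var lines = raw.substring(0, end).split("\r\n");
    return {
        requestLine: lines[0],
        headerLines: lines.slice(1),
        body: raw.substring(end + 4)
    };
}

// Made-up upload request, similar to what the form in the question sends.
var sample = [
    "POST /upload.php?token=XXXXXX HTTP/1.1",
    "Host: example.com",
    "Content-Type: multipart/form-data; boundary=----abc",
    "Content-Length: 104857600",
    "",
    "------abc\r\n(first bytes of the file part)"
].join("\r\n");

var parts = splitRequest(sample);
console.log(parts.requestLine);        // the token is already visible here
console.log(parts.headerLines.length); // 3
```

Everything needed for the token check sits in the first few hundred bytes, long before the 100 MB body has arrived.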

TCP in nodejs

We can listen to the raw request in TCP and read the packets we get until we read that double CRLF we talked about. Then we will check the short header section which we already have for whatever validation we need. After we do that, we can either end the request if validation did not pass (for example, by simply ending the TCP connection), or pass it through. This allows us to not receive or read the request body, but just the headers, which are much smaller.
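The accumulation step could be sketched as follows. The helper name createHeaderCollector and the 16 KB limit are assumptions for illustration, not part of any library; the point is that the search runs over everything received so far, so a double CRLF split across two packets is still found.

```javascript
// Hypothetical helper: collect chunks until the header section is complete.
// maxBytes is an assumed safety limit so a client cannot make us buffer forever.
function createHeaderCollector(maxBytes) {
    var received = Buffer.alloc(0);
    return function push(chunk) {
        received = Buffer.concat([received, chunk]);
        var end = received.indexOf("\r\n\r\n");
        if (end !== -1) {
            return {
                status: "complete",
                head: received.subarray(0, end).toString("latin1"),
                rest: received.subarray(end + 4) // body bytes that arrived early
            };
        }
        if (received.length > maxBytes) {
            return { status: "too-large" };
        }
        return { status: "incomplete" };
    };
}

var push = createHeaderCollector(16 * 1024);
// The double CRLF is split across two packets here on purpose.
console.log(push(Buffer.from("POST /upload HTTP/1.1\r\nHost: a\r\n")).status); // incomplete
console.log(push(Buffer.from("\r\nBODY")).status);                             // complete
```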

One easy way to embed this into an already existing application is to proxy requests from it to the actual HTTP server for the specific use case.
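A rough sketch of that proxying idea, using only the net module: once the headers have passed validation, replay the bytes already buffered to the real HTTP server and relay the rest in both directions. The function name forwardToUpstream and the upstream address are assumptions (for example, Apache moved to a local port).

```javascript
var net = require("net");

// Hypothetical helper: hand a validated connection over to the real server.
// bufferedChunks are the buffers read so far (headers and any early body bytes).
function forwardToUpstream(clientSocket, bufferedChunks, upstreamPort, upstreamHost) {
    clientSocket.pause(); // hold further packets until the upstream is connected
    var upstream = net.connect(upstreamPort, upstreamHost || "127.0.0.1", function () {
        bufferedChunks.forEach(function (chunk) {
            upstream.write(chunk);
        });
        clientSocket.pipe(upstream); // rest of the request
        upstream.pipe(clientSocket); // response on the way back
    });
    upstream.on("error", function () {
        clientSocket.destroy();
    });
    clientSocket.on("error", function () {
        upstream.destroy();
    });
    return upstream;
}
```

If the validating code listens with socket.on("data", ...), remove that listener before calling this, otherwise it keeps receiving the body chunks as well.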

Implementation details

This solution is as bare bones as it gets. It is just a suggestion.

Here is the work flow:

  1. Require the net module, which lets us create TCP servers in node.js.

  2. Create a TCP server with the net module that will listen for data: var tcpServer = net.createServer(function (socket) {... . Don't forget to tell it to listen on the correct port.

    • Inside that callback, listen for data events with socket.on("data",function(data){ , which fires whenever a packet arrives.
    • Read the buffer passed to the 'data' event and store it in a variable.
    • Check for a double CRLF, which marks the end of the request's header section according to the HTTP protocol.
    • Assuming the value to validate is a header (the token, in your words), check it once the headers have been parsed (that is, once we have seen the double CRLF). The same approach works for checking the Content-Length header.
    • If the headers don't check out, call socket.end(), which closes the connection.

Here are some things we'll use

A method for reading the headers:

function readHeaders(headers) {
    var parsedHeaders = {};
    var previous = "";    
    headers.forEach(function (val) {
        // check if the next line is actually continuing a header from previous line
        if (isContinuation(val)) {
            if (previous !== "") {
                parsedHeaders[previous] += decodeURIComponent(val.trimLeft());
                return;
            } else {
                throw new Error("continuation, but no previous header"); // JavaScript has no built-in Exception
            }
        }

        // parse a header that looks like : "name: SP value".
        var index = val.indexOf(":");

        if (index === -1) {
            throw new Error("bad header structure: " + val);
        }

        var head = val.substr(0, index).toLowerCase();
        var value = val.substr(index + 1).trimLeft();

        previous = head;
        if (value !== "") {
            parsedHeaders[head] = decodeURIComponent(value);
        } else {
            parsedHeaders[head] = null;
        }
    });
    return parsedHeaders;
};

A method that checks the buffer you get on a data event for a double CRLF and, if it exists, returns its location in an object:

function checkForCRLF(data) {
    if (!Buffer.isBuffer(data)) {
        data = Buffer.from(data, "utf-8"); // new Buffer() is deprecated
    }
    for (var i = 0; i < data.length - 1; i++) {
        if (data[i] === 13) { //\r
            if (data[i + 1] === 10) { //\n
                if (i + 3 < data.length && data[i + 2] === 13 && data[i + 3] === 10) {
                    return { loc: i, after: i + 4 };
                }
            }
        } else if (data[i] === 10) { //\n

            if (data[i + 1] === 10) { //\n
                return { loc: i, after: i + 2 };
            }
        }
    }    
    return { loc: -1, after: -1337 };
};

And this small utility method:

function isContinuation(str) {
    return str.charAt(0) === " " || str.charAt(0) === "\t";
}

Implementation

var net = require("net"); // To use the node net module for TCP server. Node has equivalent modules for secure communication if you'd like to use HTTPS

//Create the server
var server = net.createServer(function(socket){ // Create a TCP server
    var req = []; //buffers so far, to save the data in case the headers don't arrive in a single packet
    socket.on("data",function(data){
        req.push(data); // add the new buffer
        // Search everything received so far: the double CRLF may be split
        // across packets, and the offsets must refer to the combined data.
        var all = Buffer.concat(req);
        var check = checkForCRLF(all);
        if(check.loc !== -1){ // This means we got to the end of the headers!
            //get data up to and including the double CRLF
            var dataUpToHeaders = all.toString("utf-8", 0, check.after);
            //split by line
            var headerList = dataUpToHeaders.trim().split("\r\n");
            headerList.shift(); // remove the request line itself, e.g. GET / HTTP/1.1
            console.log("Got headers!");
            //Read the headers
            var headerObject = readHeaders(headerList);
            //Get the header with your token
            console.log(headerObject["your-header-name"]);

            // Now perform all checks you need for it
            /*
            if(!yourHeaderValueValid){
                socket.end();
            }else{
                         //continue reading request body, and pass control to whatever logic you want!
            }
            */


        }
    });
}).listen(8080); // listen to port 8080 for the sake of the example
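In the question the token travels in the query string rather than in a header, so the interesting value is in the request line that headerList.shift() returns. A minimal sketch for extracting it follows; tokenFromRequestLine is a hypothetical helper and the base URL is only a placeholder needed to parse a relative target.

```javascript
// Hypothetical helper: pull a query parameter out of an HTTP request line
// such as "POST /upload.php?token=XXXXXX HTTP/1.1".
function tokenFromRequestLine(requestLine, paramName) {
    var parts = requestLine.split(" ");
    if (parts.length !== 3) {
        return null; // not a well-formed request line
    }
    var url;
    try {
        // The base is a placeholder; only the path and query matter here.
        url = new URL(parts[1], "http://placeholder.invalid");
    } catch (e) {
        return null;
    }
    return url.searchParams.get(paramName); // null when the parameter is missing
}

console.log(tokenFromRequestLine("POST /upload.php?token=XXXXXX HTTP/1.1", "token")); // XXXXXX
```

In the implementation above this would mean keeping the result of headerList.shift() and passing it to the helper before deciding whether to call socket.end().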

If you have any questions feel free to ask :)

Ok, I lied, there is a simpler way!

But what's the fun in that? If you skipped here initially, you wouldn't learn how HTTP works :)

Node.js has a built in http module. Since requests are chunked by nature in node.js, especially long requests, you can implement the same thing without the more advanced understanding of the protocol.

This time, let's use the http module to create an http server

var http = require("http");

var server = http.createServer( function(req, res) { //create an HTTP server
    // The parameters are request/response objects
    // check if method is post, and the headers contain your value.
    // The headers have been parsed but the body has not been read yet,
    // More information on how this works is in the above solution
    // Note: node lowercases header names, so look the header up in lowercase.
    var specialRequest = (req.method == "POST") && req.headers["yourheader"] === "YourTokenValue";
    if(specialRequest ){ // detect requests for special treatment
      // same as TCP direct solution add chunks
      req.on('data',function(chunkOfBody){
              //handle a chunk of the message body
      });
    }else{
        // Reject early. "Connection: close" should make node close the socket once
        // the response is sent; a bare res.end() would leave the connection open.
        res.writeHead(403, {"Connection": "close"});
        res.end();
        //req.destroy() // destroy the request in a non-clean manner, probably not what you want.
    }
}).listen(8080);

This works because, by default, the request handler in the node.js http module is invoked as soon as the headers have been parsed, before any of the body has been processed. (this in the server module, this in the parser module)
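Because the handler runs with only the headers available, the same hook can also reject by declared size. The sketch below is one possible policy, not the answer's code: rejectionStatus, createGate and the choice to refuse requests without a Content-Length (which also refuses chunked uploads) are all assumptions.

```javascript
var http = require("http");

// Hypothetical policy: decide from the headers alone whether to read the body.
// Returns 0 to accept, or the HTTP status code to reject with.
function rejectionStatus(method, headers, maxBytes) {
    if (method !== "POST") {
        return 405; // method not allowed
    }
    var length = Number(headers["content-length"]);
    if (!Number.isInteger(length) || length < 0) {
        return 411; // length required
    }
    if (length > maxBytes) {
        return 413; // payload too large
    }
    return 0;
}

// Not started here; call createGate(limit).listen(port) to try it.
function createGate(maxBytes) {
    return http.createServer(function (req, res) {
        var status = rejectionStatus(req.method, req.headers, maxBytes);
        if (status !== 0) {
            res.writeHead(status, { "Connection": "close" });
            res.end();
            return;
        }
        req.on("data", function (chunk) { /* handle a chunk of the body */ });
        req.on("end", function () { res.end("received\n"); });
    });
}
```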

User igorw suggested a somewhat cleaner solution using the 100 Continue status, assuming the clients you're targeting support it. 100 Continue is a status code designed to do exactly what you're attempting:

The purpose of the 100 (Continue) status (see section 10.1.1) is to allow a client that is sending a request message with a request body to determine if the origin server is willing to accept the request (based on the request headers) before the client sends the request body. In some cases, it might either be inappropriate or highly inefficient for the client to send the body if the server will reject the message without looking at the body.

Here it is:

var http = require('http');

function handle(req, rep) {
    req.pipe(process.stdout); // pipe the request to the output stream for further handling
    req.on('end', function () {
        rep.end();
        console.log('');
    });
}

var server = new http.Server();

server.on('checkContinue', function (req, rep) {
    if (!req.headers['x-foo']) {
        console.log('did not have foo');
        rep.writeHead(400);
        rep.end();
        return;
    }

    rep.writeContinue();
    handle(req, rep);
});

server.listen(8080);

You can see sample input/output here. This requires the client to send the request with the appropriate Expect: 100-continue header.
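For completeness, here is a sketch of a node.js client that cooperates with such a server: it sends the headers with Expect: 100-continue and only writes the body once the 'continue' event fires. The function name postWithExpect is made up; the http.request options are the usual ones (host, port, path, headers).

```javascript
var http = require("http");

// Sketch of a client that waits for "100 Continue" before sending the body.
function postWithExpect(options, body, callback) {
    var done = false;
    function finish(err, status) {
        if (!done) {
            done = true;
            callback(err, status);
        }
    }
    var headers = Object.assign({}, options.headers, {
        "Expect": "100-continue",
        "Content-Length": Buffer.byteLength(body)
    });
    // Sending an Expect header makes node send the request headers immediately.
    var req = http.request(Object.assign({}, options, { method: "POST", headers: headers }));
    req.on("continue", function () {
        req.end(body); // the server agreed, now send the body
    });
    req.on("response", function (res) {
        res.resume(); // discard the response body
        res.on("end", function () {
            if (!req.writableEnded) {
                req.destroy(); // rejected before the body was sent: give up
            }
            finish(null, res.statusCode);
        });
    });
    req.on("error", finish);
    return req;
}
```

Against the server above, a request without the x-foo header would be answered with 400 before any of the body leaves the client.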



Answer 2:

Use JavaScript. When the user clicks submit, send a pre-check request via Ajax and wait for the response; submit the actual form only if it comes back successful. You can also fall back to the method you don't want, which is better than nothing.

<script type="text/javascript">
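function doAjaxTokenCheck() {
    // Synchronous request so the result can be returned to onclick.
    // Assumption: tokencheck.php answers 200 only when the token is good.
    var xhr = new XMLHttpRequest();
    xhr.open("GET", "tokencheck.php?token=asdlkjflgkjs", false);
    xhr.send();
    if (xhr.status === 200) return true;  // token is good
    alert("Invalid token");               // display error
    return false;
}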
</script>

<form enctype="multipart/form-data" action="upload.php?token=XXXXXX" method="POST">
    <input type="hidden" name="MAX_FILE_SIZE" value="3000000" />
    Send this file: <input name="userfile" type="file" />
    <input type="submit" value="Send File" onclick="return doAjaxTokenCheck()"/>
</form>


Answer 3:

It sounds like you're trying to stream the upload and need to validate before processing: Does this help? http://debuggable.com/posts/streaming-file-uploads-with-node-js:4ac094b2-b6c8-4a7f-bd07-28accbdd56cb

http://www.componentix.com/blog/13/file-uploads-using-nodejs-once-again



Answer 4:

I suggest using a client-side plugin to upload files. You could use

http://www.plupload.com/

or

https://github.com/blueimp/jQuery-File-Upload/

Both plugins provide a way to check the file size before uploading.

If you want to use your own script, check this. It may help you:

function readfile()
{
    var files = document.getElementById("fileForUpload").files;
    var output = [];
    for (var i = 0, f; f = files[i]; i++)
    {
        if (f.size < 100000) // Check file size of file
        {
            // Your code for upload
        }
        else
        {
            alert('File size exceeds upload size limit');
        }
    }
}


Answer 5:

The previous version was somewhat vague, so I've rewritten the code to show the difference between route handling and middleware. Middleware is executed for every request, in the order it is registered. express.bodyParser() is the middleware that handles file uploads, and it is what you want to skip for incorrect tokens. mymiddleware simply checks the token and terminates invalid requests, so it must be registered before express.bodyParser().

var express = require('express'),
app = express();

app.use(express.logger('dev'));
app.use(mymiddleware);                                 //This will work for you.
app.use(express.bodyParser());                         //You want to avoid this
app.use(express.methodOverride());
app.use(app.router);

app.use(express.static(__dirname+'/public'));
app.listen(8080, "127.0.0.1");

app.post('/upload',uploadhandler);                     //Too late. File already uploaded

function mymiddleware(req,res,next){                   //Middleware
    //console.log(req.method);
    //console.log(req.query.token);
    if (req.method === 'GET')
        next();
    else if (req.method === 'POST' && req.query.token === 'XXXXXX')
        next();
    else
        req.destroy();
}

function uploadhandler(req,res){                       //Route handler
    if (req.query.token === 'XXXXXX')
        res.end('Done');
    else
        req.destroy();
}

uploadhandler, on the other hand, cannot interrupt the upload, because the request has already been processed by express.bodyParser(). It just handles the finished POST request. Hope this helps.
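The ordering argument can be shown without Express at all. The following framework-free sketch uses made-up names (runChain, tokenCheck, bodyParser) and a stand-in for the expensive upload handling; it only illustrates why a check registered first can stop the chain.

```javascript
// Framework-free sketch of a middleware chain: each function either calls
// next() or stops the chain. All names here are made up for illustration.
function runChain(middlewares, req, res) {
    var index = 0;
    function next() {
        var mw = middlewares[index++];
        if (mw) {
            mw(req, res, next);
        }
    }
    next();
}

var log = [];
function tokenCheck(req, res, next) {
    log.push("tokenCheck");
    if (req.query.token === "XXXXXX") {
        next();
    } else {
        res.rejected = true; // a real server would destroy the request here
    }
}
function bodyParser(req, res, next) {
    log.push("bodyParser"); // stands in for the expensive upload handling
    next();
}

runChain([tokenCheck, bodyParser], { query: { token: "wrong" } }, {});
console.log(log); // [ 'tokenCheck' ]: the body parser never ran
```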



Answer 6:

One way to bypass PHP's post handling is to route the request through PHP-CLI. Create the following CGI script and try uploading a large file to it. The web server should respond by killing the connection. If it does, then it's just a matter of opening an internal socket connection and sending the data to the actual location--provided that conditions are met, of course.

#!/usr/bin/php
<?php

echo "Status: 500 Internal Server Error\r\n";
echo "\r\n";
die();

?>


Answer 7:

Why not use the APC file upload progress feature and set your token as the progress key? The form is submitted and the upload starts, but at the first progress check you verify the key and, if it is not correct, interrupt everything:

http://www.johnboy.com/blog/a-useful-php-file-upload-progress-meter
http://www.ultramegatech.com/2008/12/creating-upload-progress-bar-php/

The following is a more native way of doing it. It is roughly the same: change the key of the hidden input to your token, validate it, and interrupt the connection in case of an error. Maybe that's even better. http://php.net/manual/en/session.upload-progress.php