Best file system for serving 1GB files using nginx

Posted 2019-03-11 11:34

Question:

I'm going to build a large-file server, and I need the Stack Overflow community's advice on the choice of file system (Linux).

The file server will serve 1-2GB static files (mostly a different file with every request) via Nginx, under a constant moderate write load to the disks (a RAID5 array of SATA 7200rpm disks). The write-to-read ratio is about 1:5-10: for every byte written per second, 5-10 are read. Read performance matters most to me; I can live with slower writes.

What Linux file system would be the best solution for this task? And why :) Thanks!

Answer 1:

To get the best results when serving heavy content, there is more to tune than the file system. Take a look at an Nginx core developer's recommendations below (a consolidated config sketch follows the list):

  1. Switch off sendfile; it works badly on such workloads under Linux because there is no way to control readahead (and hence which blocks are read from disk).

    sendfile off;

  2. Use large output buffers

    output_buffers 1 512k;

  3. Try using aio for better disk concurrency (and note that under Linux it needs directio as well), i.e. something like this:

    aio on; directio 512;
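Putting those three directives together, a minimal sketch of a download location might look like this (the location path is a placeholder; the sizes are the ones quoted above):

    location /downloads/ {
        sendfile       off;       # readahead cannot be controlled with sendfile on Linux
        aio            on;        # asynchronous reads for better disk concurrency
        directio       512;       # needed for aio on Linux; direct I/O for files of 512 bytes and up
        output_buffers 1 512k;    # one large output buffer per connection
    }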

Other recommendations:

  1. Check that swap is not being used.

  2. Filesystem: ext4 or XFS. It is good to enable the data=writeback (ext4-only) and noatime mount options; see the sketch below.
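As a sketch of those mount options, an /etc/fstab entry might look like the following (the device and mount point are placeholders, and data=writeback applies to ext4 only):

    # hypothetical fstab entry for the data volume
    /dev/md0  /srv/files  ext4  defaults,noatime,data=writeback  0  2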



Answer 2:

I achieved 80MB/s of "random read" performance per "real" disk (spindle). Here are my findings.

So, first decide how much traffic you need to push down to users and how much storage you need per server.

You may skip the disk setup advice given below since you already have a RAID5 setup.

Let's take the example of a dedicated server with 1Gbps of bandwidth and 3 * 2TB disks. Keep the first disk dedicated to the OS and tmp storage. For the other two disks you can create a software RAID (for me, it worked better than on-board hardware RAID); otherwise, you need to divide your files equally across independent disks. The idea is to have both disks share the read/write load equally. Software RAID-0 is the best option; a creation sketch follows.
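For reference, creating such a two-disk software RAID-0 array typically looks like the sketch below (the device names are assumptions; adjust them to your hardware):

    # build a RAID-0 array from the two data disks (hypothetical /dev/sdb and /dev/sdc)
    mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sdb /dev/sdc
    # create a filesystem and mount it where the static files will live
    mkfs.ext4 /dev/md0
    mkdir -p /raidmount
    mount -o noatime /dev/md0 /raidmount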

Nginx conf: there are two ways to achieve a high level of performance using Nginx.

  1. use directio

    aio on;
    directio 512;
    output_buffers 1 8m;

    "This option will require you to have good amount of ram" Around 12-16GB of ram is needed.

  2. userland io

    output_buffers 1 2m;

    "make sure you have set readahead to 4-6MB for software raid mount" blockdev --setra 4096 /dev/md0 (or independent disk mount)

    This setting makes optimal use of the system file cache and requires much less RAM: around 8GB is needed. (A readahead sketch follows this list.)
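Note that blockdev readahead values are in 512-byte sectors, so a 4-6MB readahead corresponds to roughly 8192-12288 sectors. A small sketch for checking and setting it (the device name is an assumption, and the value does not survive a reboot, so re-apply it at boot):

    # show the current readahead, in 512-byte sectors
    blockdev --getra /dev/md0
    # set readahead to ~4MB (8192 * 512 bytes)
    blockdev --setra 8192 /dev/md0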

Common Notes:

  • keep "sendfile off;"

You may also want to use bandwidth throttling to allow hundreds of connections over the available bandwidth; each download connection will use 4MB of active RAM. A combined location sketch follows the snippet below.

        limit_rate_after 2m;
        limit_rate 100k;
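Putting the throttling together with the userland I/O settings above, a download location might look roughly like this (the path is a placeholder; the sizes are the ones quoted in this answer):

    location /files/ {
        sendfile         off;
        output_buffers   1 2m;
        limit_rate_after 2m;      # serve the first 2MB at full speed
        limit_rate       100k;    # then throttle each connection
    }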

Both of the above solutions will scale easily to 1k+ simultaneous users on a 3-disk server, assuming you have 1Gbps of bandwidth and each connection is throttled at about 1Mbps (1Gbps / 1Mbps ≈ 1000 connections). There is additional setup needed to optimize disk writes without affecting reads much.

Make all uploads go to the main OS disk, on a mount such as /tmpuploads. This will ensure there is no intermittent disturbance while heavy reads are going on. Then move the file from /tmpuploads using the "dd" command with oflag=direct, something like:

    dd if=/tmpuploads/<myfile> of=/raidmount/uploads/<myfile> oflag=direct bs=8192k
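A minimal wrapper for that move step, assuming the same hypothetical paths (the staged copy is removed only if dd succeeds):

    #!/bin/sh
    # move an uploaded file onto the RAID mount using direct I/O
    src="/tmpuploads/$1"
    dst="/raidmount/uploads/$1"
    dd if="$src" of="$dst" oflag=direct bs=8192k && rm -f "$src"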


Answer 3:

Very large files tend not to be very dependent on which filesystem you use; modern filesystems (i.e. not FAT!) do a very good job of allocating them in large contiguous chunks of storage, thus minimizing seek latency. Where you tend to see differences between them is in small-file performance, fragmentation resistance in out-of-space situations, concurrency, etc. Storing big files is a comparatively easy problem, and I doubt you'll see measurable differences.

But as always: if you really care, benchmark. There are no simple answers about filesystem performance.