I have a very large file, approx. 200 million rows of data.
I would like to compress it with the Zlib library, specifically using the Writer.
Reading through each line one at a time seems like it would take quite a bit of time. Is there a better way to accomplish this?
Here is what I have right now:
require 'zlib'

Zlib::GzipWriter.open('compressed_file.gz') do |gz|
  File.open(large_data_file).each do |line|
    gz.write line
  end
  gz.close
end
You can use IO#read to read a chunk of arbitrary length from the file.
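Something along these lines should work (assuming large_data_file holds the path to your source file, as in your question):

require 'zlib'

# 16 KB per read; tune this to your environment.
BLOCK_SIZE = 1024 * 16

Zlib::GzipWriter.open('compressed_file.gz') do |gz|
  File.open(large_data_file, 'rb') do |file|
    # IO#read with a length argument returns nil at end of file,
    # so the loop ends once the whole file has been consumed.
    while chunk = file.read(BLOCK_SIZE)
      gz.write chunk
    end
  end
end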
This reads the source file in 16 KB chunks and writes each chunk to the gzip stream, which compresses it as it goes. Adjust the block size to whatever suits your environment.