I have a job running in production which processes XML files. There are around 4,000 XML files, 8 to 9 GB in total.
After processing we get CSV files as output. I have a cat command which merges all the CSV files into a single file, and I'm getting:
Errno::ENOMEM: Cannot allocate memory
on the cat (backtick) command.
Below are a few details:
- System Memory - 4 GB
- Swap - 2 GB
- Ruby : 1.9.3p286
Files are processed using nokogiri and saxbuilder-0.0.8.
There is a block of code which processes the 4,000 XML files and saves the output as CSV (one per XML file). Sorry, I'm not supposed to share it because of company policy.
Below is the code which merges the output files into a single file:
Dir["#{processing_directory}/*.csv"].sort_by {|file| [file.count("/"), file]}.each {|file|
`cat #{file} >> #{final_output_file}`
}
I've taken memory consumption snapshots during processing. It consumes almost all of the memory, but it doesn't fail.
It always fails on the cat command.
My guess is that the backtick call tries to fork a new process, which doesn't get enough memory, so it fails.
Please let me know your opinion and any alternative to this.
So it seems that your system is running pretty low on memory, and spawning a shell plus calling cat is too much for the little memory that is left.
If you don't mind losing some speed, you can merge the files in Ruby, with small buffers. This avoids spawning a shell, and you can control the buffer size.
This is untested, but you get the idea:
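
A minimal sketch of that idea, not a definitive implementation. The helper name merge_files and the 64 KB default buffer are assumptions made for illustration; processing_directory and final_output_file are the names used in the question.

```ruby
# Hypothetical helper: append every source file to the destination
# in fixed-size chunks, so only one small buffer is held in memory
# and no shell or child process is spawned.
def merge_files(sources, destination, buffer_size = 64 * 1024)
  File.open(destination, "ab") do |out|
    sources.each do |path|
      File.open(path, "rb") do |input|
        # IO#read(n) returns nil at end of file, which ends the loop.
        while (chunk = input.read(buffer_size))
          out.write(chunk)
        end
      end
    end
  end
end

# Usage with the names from the question:
# csv_files = Dir["#{processing_directory}/*.csv"].sort_by { |file| [file.count("/"), file] }
# merge_files(csv_files, final_output_file)
```

A smaller buffer_size lowers peak memory at the cost of more read and write calls.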
I had the same problem, but instead of cat it was sendmail (gem mail).
I found the problem and the solution here, by installing the posix-spawn gem, e.g.
And here is the example:
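
A hedged sketch of how the question's cat call might look with posix-spawn, assuming the gem is installed (gem install posix-spawn). The helper names spawner and append_with_cat are hypothetical, and the fallback to Process.spawn is an addition that only lets the sketch run when the gem is missing.

```ruby
# Assumption: the posix-spawn gem is available. If it is not, the
# sketch falls back to Ruby's built-in Process.spawn so it still runs.
begin
  require "posix/spawn"
rescue LoadError
  # gem not installed; spawner below returns Process instead
end

def spawner
  defined?(POSIX::Spawn) ? POSIX::Spawn : Process
end

# Hypothetical helper: run `cat file` with its stdout appended to
# destination. Passing the command and its argument separately
# avoids starting a shell.
def append_with_cat(file, destination)
  File.open(destination, "a") do |out|
    pid = spawner.spawn("cat", file, :out => out)
    Process.waitpid(pid)
    raise "cat failed for #{file}" unless $?.success?
  end
end
```

Opening the output file in Ruby and handing the IO object to the child keeps the redirection out of the command string.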
See also: Minimizing Memory Usage for Creating Application Subprocesses at Oracle.
You are probably out of physical memory, so double check that and verify your swap (free -m). In case you don't have swap space, create one.
Otherwise, if your memory is fine, the error is most likely caused by shell resource limits. You can check them with ulimit -a.
They can be changed with ulimit, which modifies shell resource limits (see: help ulimit), e.g.
To make these limits persistent, you can create a ulimit settings file with the following shell command:
Or use /etc/sysctl.conf to change the limit globally (man sysctl.conf), e.g.
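
To see which limits the Ruby job itself is running under (the values ulimit -a would report for its shell), they can be read from inside the process with Process.getrlimit. This is only a sketch: the helper names and the choice of limits (address space, data segment, process count) are assumptions, and not every platform defines all of them.

```ruby
# Render a limit value, mapping the "no limit" constant to a readable word.
def describe_limit(value)
  value == Process::RLIM_INFINITY ? "unlimited" : value.to_s
end

# Hypothetical helper: collect soft and hard values for a few limits
# that can make fork or spawn fail. Limits the platform does not
# define are skipped.
def limits_report(names = [:AS, :DATA, :NPROC])
  names.each_with_object({}) do |name, report|
    const = "RLIMIT_#{name}"
    next unless Process.const_defined?(const)
    soft, hard = Process.getrlimit(Process.const_get(const))
    report[name] = { :soft => describe_limit(soft), :hard => describe_limit(hard) }
  end
end

limits_report.each do |name, limit|
  puts "#{name}: soft=#{limit[:soft]} hard=#{limit[:hard]}"
end
```

If every value prints as unlimited, the resource limits are unlikely to be the cause and plain memory exhaustion is the more probable explanation.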