Running Nokogiri in JRuby vs. plain Ruby

Posted 2020-08-04 04:37

Question:

I found a startling difference in CPU and memory usage. It seems garbage collection is not happening when I run the following Nokogiri script under JRuby:

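require 'rubygems'
require 'nokogiri'
require 'open-uri'

# Fetch the listing page and print its first <h2> element.
def get_header
  # URI.open is the open-uri entry point on current Rubies; older versions used a bare open(url).
  doc = Nokogiri::HTML(URI.open('http://losangeles.craigslist.org/wst/reb/1484772751.html'))
  puts doc.xpath('html[1]/body[1]/h2[1]')
end

# Repeat the fetch-and-parse 10,000 times.
10_000.times do
  get_header
end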

When run in JRuby, CPU usage is over 10 and memory usage (%) rises over time, from 2 to 20, until the process eventually reports "not enough memory".

When run in Ruby, CPU usage never exceeds 2 and memory usage stays constant at a very low 0.2%.

Why is the difference so large, and why does memory consumption increase steadily until the process crashes?

Answer 1:

Am I reading your script right? Are you really hitting the poor craigslist site with 10K HTTP GET requests? :)

At any rate, what is your system, and which versions of the Nokogiri gem and JRuby are you using? With a small modification to the script (opening the HTTP request only once and then rewinding the same data), MRI and JRuby behave about the same; JRuby is even about 2 seconds faster out of 20 total. No memory problems.
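
For reference, the "fetch once, then rewind" change could look roughly like the sketch below. This is not the exact script I ran. The helper name parse_repeatedly is made up for illustration, and a StringIO with a tiny inline page stands in for the IO that open-uri would return, so the sketch runs without touching the network or needing any gem. The regex in the block is only a placeholder for the real Nokogiri call, which is shown in the comment.

```ruby
require 'stringio'

# Hypothetical helper: rewinds the same in-memory page before every pass,
# so the loop exercises parsing only and makes no further HTTP requests.
# Returns the result of the last pass.
def parse_repeatedly(page, iterations)
  last = nil
  iterations.times do
    page.rewind            # go back to the start of the already-fetched data
    last = yield(page)     # hand the rewound IO to the parser block
  end
  last
end

# Stand-in for the page (assumption). With open-uri this would be the IO
# returned by a single URI.open(url) call made outside the loop.
page = StringIO.new('<html><body><h2>Sample header</h2></body></html>')

# With Nokogiri loaded, the block body would instead be:
#   Nokogiri::HTML(io).xpath('html[1]/body[1]/h2[1]').to_s
header = parse_repeatedly(page, 10_000) { |io| io.read[%r{<h2>(.*?)</h2>}m, 1] }
puts header
```

Running the same file under both MRI and JRuby while watching CPU and memory then compares the parsers themselves, without 10K network round trips in the way.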



Answer 2:

Ruby has better control of memory than JRuby. In my opinion, you should use JRuby only if you need Java libraries, or if several instances of the same program will run on the same machine at the same time; in that case JVM caching will do amazing things.