I have a gzip file and currently I read it like this:
require 'zlib'
infile = open("file.log.gz")
gz = Zlib::GzipReader.new(infile)
output = gz.read
puts output
I think this converts the file to a string, but I would like to read it line by line.
The file contains some warning messages mixed in with garbage. I want to grep those warning messages and write them to another file. Some warning messages are repeated, though, so I have to make sure that I only capture each one once. That is why reading line by line would help me.
According to the docs, you should be able to simply loop over the gzip reader as you would with a regular stream:
require 'zlib'
infile = open("file.log.gz")
gz = Zlib::GzipReader.new(infile)
# each_line yields one decompressed line at a time
gz.each_line do |line|
  puts line
end
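If you also want the file closed for you, the block form of Zlib::GzipReader.open is one way to do it. This is only a minimal sketch: the temporary path and the sample contents are made up so the snippet can run on its own, and you would point it at your real file.log.gz instead.

```ruby
require 'zlib'
require 'tmpdir'

# Hypothetical sample data so the sketch is self-contained.
path = File.join(Dir.mktmpdir, "file.log.gz")
Zlib::GzipWriter.open(path) { |gz| gz.write("first line\nsecond line\n") }

lines = []
# The block form closes the reader (and the underlying file) when the block exits.
Zlib::GzipReader.open(path) do |gz|
  gz.each_line { |line| lines << line.chomp }
end

puts lines
```

Here the lines are collected into an array only to make the result easy to inspect; for a large log you would process each line inside the block instead.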
Try this:
infile = open("file.log.gz")
gz = Zlib::GzipReader.new(infile)
while output = gz.gets
  puts output
end
Other answers show how to read the file line by line, but not how to only capture the errors once. Building on @Tigraine's answer:
require 'set'
require 'zlib'
infile = open("file.log.gz")
gz = Zlib::GzipReader.new(infile)
errors = Set.new
# or ...
# errors = [].to_set
gz.each_line do |line|
  # keep only lines that start with "Error:"
  errors << line if (line[/^Error:/])
  # or, to match "Error:" anywhere in the line ...
  # errors << line if (line['Error:'])
end
puts errors
Set acts like Array, but is built using Hash, so it's like a Hash where we're only concerned with the keys, i.e. only unique values are stored. If you try to add a duplicate it is thrown away, leaving you with only the unique values. You could use an Array and call uniq on it afterwards, but a Set will manage it for you up-front.
>> require 'set'
=> true
>> errors = Set.new
=> #<Set: {}>
>> errors << 'a'
=> #<Set: {"a"}>
>> errors << 'b'
=> #<Set: {"a", "b"}>
>> errors << 'a'
=> #<Set: {"a", "b"}>
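The question also asks for the messages to end up in another file, which none of the snippets above do. Here is a rough sketch of the whole thing. It assumes the lines of interest start with "Warning:" and that the output file is called warnings.txt; both are guesses, as are the sample log contents, so adjust them to match your data.

```ruby
require 'set'
require 'zlib'
require 'tmpdir'

# Hypothetical sample log so the sketch is self-contained.
dir = Dir.mktmpdir
log_path = File.join(dir, "file.log.gz")
out_path = File.join(dir, "warnings.txt") # assumed output name
Zlib::GzipWriter.open(log_path) do |gz|
  gz.write("Warning: disk low\ngarbage\nWarning: disk low\nWarning: cpu hot\n")
end

warnings = Set.new
Zlib::GzipReader.open(log_path) do |gz|
  gz.each_line do |line|
    # Adding a line the Set already holds leaves the Set unchanged.
    warnings << line if line.start_with?("Warning:") # assumed prefix
  end
end

# Each stored line still ends in "\n", so join without a separator.
File.write(out_path, warnings.to_a.join)
```

A Set keeps its elements in insertion order, so the warnings come out in the order they first appeared in the log.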