I got weird behavior from Ruby (in irb):
irb(main):002:0> pp " LS 600"
"\302\240\302\240\302\240\302\240LS 600"
irb(main):003:0> pp " LS 600".strip
"\302\240\302\240\302\240\302\240LS 600"
That means (for those who don't understand) that the strip
method does not affect this string at all; the same goes for gsub('/\s+/', '').
How can I strip that string? (I got it while parsing an Internet page.)
The string "\302\240" is the UTF-8 encoding (bytes C2 A0) of Unicode code point U+00A0, which represents a non-breaking space character. There are many other Unicode space characters. Unfortunately, the String#strip method removes none of these.
If you use Ruby 1.9.2, then you can solve this in the following way:
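A minimal sketch of that approach, assuming Ruby 1.9 or later and a UTF-8 string. The helper name strip_unicode_space is made up for illustration, and the sample string uses the "\u00A0" escape, which is the same character as the bytes "\302\240". It relies on the \p{Space} character property, which matches Unicode whitespace including the non-breaking space:

```ruby
# Hypothetical helper (the name is illustrative): removes leading and
# trailing Unicode whitespace, including the non-breaking space U+00A0.
def strip_unicode_space(str)
  # \A and \z anchor to the start and end of the whole string;
  # \p{Space} matches any character with the Unicode White_Space property.
  str.gsub(/\A\p{Space}+|\p{Space}+\z/, "")
end

raw = "\u00A0\u00A0\u00A0\u00A0LS 600" # four non-breaking spaces, then text

p raw.strip == raw          # true: String#strip leaves U+00A0 in place
p strip_unicode_space(raw)  # "LS 600"
```

The \A and \z anchors are used instead of ^ and $ because the latter match at every line boundary in Ruby, which would also strip whitespace around embedded newlines.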
In Ruby 1.8.7, support for Unicode is not as good. You might be successful if you can depend on Rails's ActiveSupport::Multibyte. This has the advantage of getting a working strip method for free. Install ActiveSupport with gem install activesupport and then try this: