Split on different newlines

Posted 2019-03-10 12:55

Right now I'm doing a split on a string and assuming that the newline from the user is \r\n like so:

string.split(/\r\n/)

What I'd like to do is split on either \r\n or just \n.

So what would the regex be to split on either of those?

8 answers
Evening l夕情丶
#2 · 2019-03-10 13:06

Another option is to use String#chomp, which also handles newlines intelligently by itself.

You can accomplish what you are after with something like:

lines = string.lines.map(&:chomp)
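
As a quick illustration of the line above (the sample strings are made up), here is how it behaves on input with mixed line endings. It also shows one difference from split worth knowing about: trailing blank lines are kept, whereas split drops trailing empty strings.

```ruby
# Made-up sample input that mixes \r\n and \n terminators.
string = "first\r\nsecond\nthird\n"

# String#lines keeps each terminator; chomp then strips \n, \r\n, or \r.
lines = string.lines.map(&:chomp)
# lines is ["first", "second", "third"]

# Trailing blank lines survive here, while split discards them.
with_blank    = "a\n\n".lines.map(&:chomp)  # ["a", ""]
without_blank = "a\n\n".split(/\r?\n/)      # ["a"]
```

Which of the two behaviours you want depends on whether empty lines at the end of the input matter to you.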

Or if you are dealing with something large enough that memory use is a concern:

# string may be a String or any IO object (such as an open File)
string.each_line do |line|
  line.chomp!
  # do work...
end
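
A minimal sketch of the streaming variant, using a StringIO as a stand-in for a real file handle (the sample data and the `seen` array are only there for illustration). Each line is handled as it is read, so the whole input never has to be split into an array first.

```ruby
require "stringio"

# StringIO stands in for an open File here; both respond to each_line.
io = StringIO.new("first\r\nsecond\nthird")

seen = []
io.each_line do |line|
  # chomp! edits the line in place. It returns nil when nothing was
  # removed, so use `line` afterwards rather than the return value.
  line.chomp!
  seen << line
end
# seen is ["first", "second", "third"]
```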

Performance isn't always the most important thing when solving this kind of problem, but it is worth noting the chomp solution is also a bit faster than using a regex.

On my machine (i7, ruby 2.1.9):

Warming up --------------------------------------
           map/chomp    14.715k i/100ms
  split custom regex    12.383k i/100ms
Calculating -------------------------------------
           map/chomp    158.590k (± 4.4%) i/s -    794.610k in   5.020908s
  split custom regex    128.722k (± 5.1%) i/s -    643.916k in   5.016150s
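
The script behind those numbers isn't shown, so the following is only a rough sketch of how a similar comparison could be set up. It uses the standard library's Benchmark module rather than whatever produced the output above, and the sample input, the iteration count, and the /\r?\n/ pattern are all assumptions.

```ruby
require "benchmark"

# Assumed sample input: 1,000 short lines with mixed terminators.
sample = (1..1_000).map { |i| i.even? ? "line #{i}\r\n" : "line #{i}\n" }.join

iterations = 200 # arbitrary; raise it for steadier numbers

chomp_time = Benchmark.realtime do
  iterations.times { sample.lines.map(&:chomp) }
end

regex_time = Benchmark.realtime do
  iterations.times { sample.split(/\r?\n/) }
end

puts format("map/chomp:   %.4fs", chomp_time)
puts format("split regex: %.4fs", regex_time)
```

Timings will vary with the Ruby version and the shape of the input, so treat any single run as a hint rather than a verdict.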
戒情不戒烟
#3 · 2019-03-10 13:09
# Split on \r\n or just \n
string.split( /\r?\n/ )

Although it doesn't help with this question (where you do need a regex), note that String#split does not require a regex argument. Your original code could also have been string.split( "\r\n" ).
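
To make the difference concrete, here is a small comparison on a made-up string that mixes both kinds of newline:

```ruby
# Made-up input: one \r\n terminator and one bare \n.
mixed = "first\r\nsecond\nthird"

# The regex matches either terminator.
by_regex = mixed.split(/\r?\n/)
# by_regex is ["first", "second", "third"]

# A plain string separator only matches the exact sequence \r\n,
# so the bare \n is left inside the second element.
by_string = mixed.split("\r\n")
# by_string is ["first", "second\nthird"]
```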

地球回转人心会变
#4 · 2019-03-10 13:11

Did you try /\r?\n/ ? The ? makes the \r optional.

Example usage: http://rubular.com/r/1ZuihD0YfF

我欲成王,谁敢阻挡
#5 · 2019-03-10 13:17

The alternation operator in Ruby Regexp is the same as in standard regular expressions: |

So, the obvious solution would be

/\r\n|\n/

which is the same as

/\r?\n/

i.e. an optional \r followed by a mandatory \n.
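
If you want to convince yourself that the two patterns split the same way, a quick check along these lines should do (the sample strings are arbitrary):

```ruby
alternation = /\r\n|\n/
optional_cr = /\r?\n/

# Arbitrary samples: mixed endings, adjacent newlines, no newline at all,
# and a string made only of \r\n pairs.
samples = ["a\r\nb\nc", "a\n\r\nb", "no newline", "\r\n\r\n"]

same = samples.all? { |s| s.split(alternation) == s.split(optional_cr) }
# same is true for these samples
```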

放荡不羁爱自由
#6 · 2019-03-10 13:20

Are you reading from a file, or from standard input?

If you're reading from a file opened in text mode (rather than binary mode), or from standard input, you won't have to deal with \r\n on Windows: it'll just look like \n.

C:\Documents and Settings\username>irb
irb(main):001:0> gets
foo
=> "foo\n"
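
For files, the mode can also be spelled out explicitly. The sketch below writes raw bytes to a temp file and reads them back with the "rb" (binary) and "rt" (text) mode strings; as far as I know, "rt" applies the newline conversion regardless of platform. The helper name read_both_ways is made up for this example.

```ruby
require "tempfile"

# Hypothetical helper: writes raw bytes to a temp file, then reads them
# back once in binary mode and once in text mode.
def read_both_ways(raw)
  Tempfile.create("newlines") do |file|
    file.binmode # write the bytes exactly as given
    file.write(raw)
    file.flush
    {
      binary: File.read(file.path, mode: "rb"), # \r\n left untouched
      text:   File.read(file.path, mode: "rt")  # \r\n converted to \n
    }
  end
end

result = read_both_ways("foo\r\nbar\n")
```

If the input might arrive in binary mode, or as a string from somewhere other than a file, splitting on /\r?\n/ or chomping each line is still the safer bet.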
Rolldiameter
#7 · 2019-03-10 13:24

Ruby has the methods String#each_line and String#lines.

String#each_line returns an enumerator when called without a block: http://www.ruby-doc.org/core-1.9.3/String.html#method-i-each_line

String#lines returns an array: http://www.ruby-doc.org/core-2.1.2/String.html#method-i-lines

I didn't test it against your scenario but I bet it will work better than manually choosing the newline chars.
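
A small sketch of the two methods on a made-up string. Note that both leave the line terminators attached, so a chomp is still needed to get rid of \r\n or \n.

```ruby
# Made-up input with mixed terminators.
text = "one\r\ntwo\nthree"

enum  = text.each_line # no block given, so an Enumerator comes back
array = text.lines     # an Array, terminators still attached
# array is ["one\r\n", "two\n", "three"]

# Either one can be chomped line by line.
stripped = enum.map(&:chomp)
# stripped is ["one", "two", "three"]
```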
