I want to change different characters/substrings to a single character or nil. I want to change "How to chop an onion?" to "how-chop-onion".
string
.gsub(/'s/,'')
.gsub(/[?&]/,'')
.gsub('to|an|a|the','')
.split(' ')
.map { |s| s.downcase}
.join '-'
Using the pipe character | does not work. How can I do this with gsub?
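to|an|a|the is a pattern, but you are passing it as a String, so gsub looks for that literal text instead of treating | as alternation. Write it as a Regexp literal: .gsub(/to|an|a|the/,'').

Start by making a list of what you want to do: remove certain words, remove certain punctuation, remove the extra spaces left behind, and convert to lower case (see footnote 1).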
Now think about the order in which these operations should be performed. The conversion to lower case can be done at any time, but it's convenient to do it first, in which case the regex need not be case-insensitive. Punctuation should be removed before the words, to more easily identify words as opposed to substrings. Removing the extra spaces obviously must be done after the words are removed. We therefore want the order to be: convert to lower case, remove punctuation, remove words, remove extra spaces.
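A small sketch of why the order matters. The constant STOP_WORDS and the two helper methods are names made up for this illustration, and String#squeeze is only an assumed stand-in for the space clean-up step.

```ruby
# Hypothetical stop-word pattern; \b keeps it from matching inside longer words.
STOP_WORDS = /\b(?:to|an|a|the)\b/

# Space clean-up first, word removal second: the gaps opened by the
# removed words are never closed.
def spaces_then_words(str)
  str.squeeze(' ').gsub(STOP_WORDS, '')
end

# Word removal first, space clean-up second: the gaps are closed.
def words_then_spaces(str)
  str.gsub(STOP_WORDS, '').squeeze(' ')
end

p spaces_then_words("how to chop an onion")   # "how  chop  onion"
p words_then_spaces("how to chop an onion")   # "how chop onion"

# Without down-casing first, a capitalised "To" slips past the pattern.
p "To chop".gsub(STOP_WORDS, '')              # "To chop"
p "To chop".downcase.gsub(STOP_WORDS, '')     # " chop"
```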
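After down-casing, this could be done with three chained gsub calls, one for each remaining step: the first removes the punctuation, the second removes the words using a regex (call it r2) that wraps the word list in word breaks (\b), and the third deals with the leftover spaces.

Note that without the word breaks (\b) in r2, the alternation would also match inside longer words, so letters would be stripped out of words that merely contain "to", "an", "a" or "the".

Also, the first gsub could be replaced by a method that deletes characters directly and needs no regex, for example String#delete or String#tr.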
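The original code for this step did not survive, so the following is only a sketch of what three chained gsub calls might look like. The constants R1, R2, R3 and R2_LOOSE and the method slugify are assumptions for illustration (R2 plays the role of r2 above). Turning the remaining spaces into hyphens and the extra strip are also assumptions, added to reach the "how-chop-onion" form asked for in the question.

```ruby
R1 = /[?&]/                   # punctuation to drop
R2 = /\b(?:to|an|a|the)\b/    # whole words to drop
R3 = /\s+/                    # runs of whitespace

def slugify(str)
  str.downcase
     .gsub(R1, '')
     .gsub(R2, '')
     .strip            # drop spaces left at either end by a removed word
     .gsub(R3, '-')    # each remaining run of spaces becomes one hyphen
end

puts slugify("How to chop an onion?")   # how-chop-onion

# The same word list without \b also eats letters inside longer words.
R2_LOOSE = /to|an|a|the/
p "chop a tomato".gsub(R2_LOOSE, '')    # "chop  m"
p "chop a tomato".gsub(R2, '')          # "chop  tomato"

# The punctuation step does not need a regex at all.
p "how to chop an onion?".delete('?&')  # "how to chop an onion"
```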
These gsub calls can be combined into one (how I'd write it) by joining the individual patterns into a single regex, which uses a lookaround.

"Lookarounds" (here a positive lookahead) are often referred to as "zero-width", meaning that, while the match is required, they do not form part of the match that is returned.
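The combined regex itself is not shown above, so the pattern below is a guess at the general shape rather than the original. COMBINED and slugify_single_pass are hypothetical names, and the closing split and join is an assumption made to produce the hyphenated form.

```ruby
# Hypothetical single-pass pattern with three alternatives:
#   [?&]                        punctuation
#   \b(?:to|an|a|the)\b         a whole stop word
#   \s+(?=(?:to|an|a|the)\b)    spaces directly before a stop word
COMBINED = /[?&]|\b(?:to|an|a|the)\b|\s+(?=(?:to|an|a|the)\b)/

def slugify_single_pass(str)
  # split with no argument also discards any stray whitespace that is left.
  str.downcase.gsub(COMBINED, '').split.join('-')
end

puts slugify_single_pass("How to chop an onion?")   # how-chop-onion

# Zero-width in practice: "an" must follow the space for the match to
# succeed, yet only the space is returned as the match.
p "chop an onion"[/\s+(?=an\b)/]                    # " "
```

Because split would tidy leftover spaces in any case, the lookahead alternative is kept here mainly to show the construct in context.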
1 Have you ever wondered where the terms "lower case" and "upper case" came from? In the early days of printing, typesetters kept the metal movable type in two cases, one located above the other. Those for the taller letters, used to begin sentences and proper nouns, were in the upper case; the remaining ones were in the lower case.