Regex - How to search for singular or plural versi

Posted 2020-06-04 14:31

I'm trying to write what should be a simple regular expression: all I want is to match the singular portion of a word, whether or not it has an s on the end. So if I have the following words:

test
tests

EDIT: Further examples. I need this to be possible for many words, not just those two:

movie
movies
page
pages
time
times

For all of them I need to get the word without the s on the end, but I can't find a regular expression that always grabs the part without the s and works for both cases.

I've tried the following:

([a-zA-Z]+)([s\b]{0,}) - This returns the full word as the first match in both cases
([a-zA-Z]+?)([s\b]{0,}) - This returns 3 different matching groups for both words
([a-zA-Z]+)([s]?) - This returns the full word as the first match in both cases
([a-zA-Z]+)(s\b) - This works for tests but doesn't match test at all
([a-zA-Z]+)(s\b)? - This returns the full word as the first match in both cases
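
For readers who want to reproduce these results outside RegExr, here is a minimal Python sketch. It assumes Python's re module treats these patterns the same way the RegExr engine does, which may not hold for every flavor; the helper name first_group is made up for illustration. It prints the first capture group each attempt produces for "test" and "tests". Two things are worth watching for: a greedy + consumes the trailing s before the second group gets a chance, and inside a character class \b means a backspace character rather than a word boundary.

```python
import re

# The attempted patterns, copied from the list above.
ATTEMPTS = [
    r"([a-zA-Z]+)([s\b]{0,})",
    r"([a-zA-Z]+?)([s\b]{0,})",
    r"([a-zA-Z]+)([s]?)",
    r"([a-zA-Z]+)(s\b)",
    r"([a-zA-Z]+)(s\b)?",
]


def first_group(pattern: str, word: str):
    """Return group 1 of a match anchored at the start of word, or None."""
    m = re.match(pattern, word)
    return m.group(1) if m else None


for pattern in ATTEMPTS:
    # Show what each attempt captures for the singular and the plural form.
    print(pattern, [first_group(pattern, w) for w in ("test", "tests")])
```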

I've been using http://gskinner.com/RegExr/ to try out the different regexes.

EDIT: This is for a Sublime Text snippet. For those who don't know, a snippet in Sublime Text is a shortcut: I can type, say, the name of my database table, hit "run snippet", and it will turn it into something like:

$movies = $this->ci->db->get_where("movies", "");
if ($movies->num_rows()) {
    foreach ($movies->result() AS $movie) {

    }
}

All I need is to turn "movies" into "movie" and auto-insert it into the foreach loop.

That means I can't just do a find and replace on the text. I also only need to take 60 - 70 words into account (it's only running against my own tables, not every word in the English language).

Thanks! - Tim

3 answers
Rolldiameter
#2 · 2020-06-04 15:11

Here's how you can do it with vi or sed:

s/\([A-Za-z]*\)[sS]$/\1/

That replaces a run of letters ending in s or S with everything but the last letter.

NOTE:

The escape chars (backslashes before the parens) might be different in different contexts.

ALSO:

The \1 (which refers to the first captured group) may also vary depending on context.

ALSO:

This will only work if your word is the only word on the line.

If your table name is one of many words on the line, you could probably replace the $ (which stands for the end of the line) with a wildcard that represents whitespace or a word boundary (these differ based on context).
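
As a rough illustration of that last point, here is a minimal Python sketch of the same substitution with the end-of-line anchor swapped for a word boundary. The function name strip_trailing_s is made up for this example, and Python's re syntax is an assumption; vi, sed and Sublime Text each write groups and boundaries a little differently.

```python
import re


def strip_trailing_s(text: str) -> str:
    """Drop a final s/S from every run of letters in text."""
    # \b stands in for the "$" anchor, so the word need not end the line.
    return re.sub(r"([A-Za-z]+)[sS]\b", r"\1", text)


print(strip_trailing_s("movies"))
print(strip_trailing_s("get pages and times"))
```

Note that this is blunt: any word ending in s is trimmed, so a name such as "status" would lose its last letter too. That may be acceptable for a short, known list of table names.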

走好不送
#3 · 2020-06-04 15:22

For simple plurals, use this:

test(?=s| |$)

For more complex plurals, you're in trouble using regex. For example, this regex

part(y|i)(?=es | )

will return "party" or "parti", but I'm not sure what you would do with that.
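
A minimal Python sketch of the simple-plural lookahead, generalized to any known singular word. The helper name find_singular is hypothetical, and the sketch assumes a regex flavor with lookahead support, as Python's re module has.

```python
import re


def find_singular(singular: str, text: str):
    """Return the singular form if it occurs before an s, a space, or the end of text."""
    # The lookahead checks what follows without consuming it,
    # so the trailing s never becomes part of the match.
    pattern = re.escape(singular) + r"(?=s| |$)"
    m = re.search(pattern, text)
    return m.group(0) if m else None


print(find_singular("test", "tests"))
print(find_singular("movie", "my movies table"))
print(find_singular("test", "testing"))
```

The drawback is that the singular form has to be known up front, so this suits checking for a specific word more than deriving the singular from arbitrary input.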

三岁会撩人
#4 · 2020-06-04 15:25

OK, I've found a solution:

([a-zA-Z]+?)(s\b|\b)

This works as desired: the lazy +? stops as early as it can, so you can simply use the first capture group as the unpluralized version of the word.
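
Here is a quick way to check the pattern against the words from the question, as a minimal Python sketch. The function name singularize is made up for illustration, and the sketch assumes Python's re module handles the lazy quantifier and \b the same way the snippet engine does.

```python
import re

# Lazy letters, followed by either a word-final s or just a word boundary.
SINGULAR = re.compile(r"([a-zA-Z]+?)(s\b|\b)")


def singularize(word: str) -> str:
    """Return group 1 of the pattern, or the word unchanged if nothing matches."""
    m = SINGULAR.match(word)
    return m.group(1) if m else word


for word in ("test", "tests", "movie", "movies", "page", "pages", "time", "times"):
    print(word, "->", singularize(word))
```

Like any rule that only strips an s, this would also shorten singular names that happen to end in s, so it is worth checking against the actual list of table names.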

Thanks @Jahroy for helping me find it. I added this as an answer for future surfers who just want a solution, but please check out Jahroy's comment for more in-depth information.
