Matching Roman Numbers

Posted 2019-05-06 15:39

I have this regular expression:

(IX|IV|V?I{0,3}|M{1,4}|CM|CD|D?C{1,3}|XC|XL|L?X{1,3})

I use it to detect whether a text contains a Roman numeral:

eregi("( IX|IV|V?I{0,3}[\.]| M{1,4}[\.]| CM|CD|D?C{1,3}[\.]| XC|XL|L?X{1,3}[\.])", $title, $regs)

However, the Roman numeral in my text always has the form " IV.", with a space before it and a period after it. In the eregi example I added a space before the numeral and "." after it, but I still get the same result. If the text is something like "somethinvianyyhing", the result is "vi", taken from the middle of the word.

What am I doing wrong?

Tags: php regex roman-numerals

1 answer

欢心

#2 · 2019-05-06 15:52

There is no space before the alternative that matches VI. Inside an alternation, the space belongs only to the alternative in which it is written, not to all of them. The same goes for \.: it belongs only to the alternative where it is written.
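To make the scoping visible, here is a minimal sketch with two deliberately shortened patterns. The patterns and sample strings are illustrative assumptions, not taken from the question. It uses `preg_match` with the `i` flag instead of `eregi`, since `eregi` was deprecated in PHP 5.3 and removed in PHP 7.

```php
<?php
// Ungrouped: the space belongs only to "IX" and the dot only to "V?I{1,3}".
// The middle alternative "IV" therefore matches anywhere, even inside a word.
$ungrouped = '/ IX|IV|V?I{1,3}\./i';

// Grouped: the space and the dot sit outside the group, so they apply
// to every alternative.
$grouped = '/ (IX|IV|V?I{1,3})\./i';

$inWord  = 'a private talk';      // contains "iv" inside "private"
$numeral = 'chapter IV. begins';  // contains a real numeral

$ungroupedHit = preg_match($ungrouped, $inWord, $m1);  // 1, $m1[0] is "iv"
$groupedHit   = preg_match($grouped, $inWord, $m2);    // 0, no match
$numeralHit   = preg_match($grouped, $numeral, $m3);   // 1, $m3[1] is "IV"
```

Only the grouped pattern ignores the "iv" buried in "private" while still finding " IV." in the second string.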

Try this

" (IX|IV|V?I{0,3}|M{1,4}|CM|CD|D?C{1,3}|XC|XL|L?X{1,3})\."

See it here on Regexr

This will match

I.
II.
III.
IV.
V.
VI.
VII.
VIII.
IX.
X.

But not

XI. MMI. MMXI.
somethinvianyyhing
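The two lists above can be checked with a short script. This is a sketch under a few assumptions: the helper name `findNumeral` and the sample sentences are made up for illustration, and the pattern is run through `preg_match` without the `i` flag, so only upper-case numerals are considered (`eregi` was case-insensitive and no longer exists in PHP 7 and later).

```php
<?php
// The suggested pattern, wrapped in PCRE delimiters.
$pattern = '/ (IX|IV|V?I{0,3}|M{1,4}|CM|CD|D?C{1,3}|XC|XL|L?X{1,3})\./';

// Hypothetical helper: returns the captured numeral, or null if nothing matches.
function findNumeral(string $pattern, string $text): ?string
{
    return preg_match($pattern, $text, $m) === 1 ? $m[1] : null;
}

$matched = [];
foreach (['I', 'II', 'III', 'IV', 'V', 'VI', 'VII', 'VIII', 'IX', 'X'] as $n) {
    $matched[$n] = findNumeral($pattern, "Chapter {$n}. Introduction");
}

$rejected = [];
foreach (['Chapter XI. Intro', 'Chapter MMI. Intro', 'Chapter MMXI. Intro', 'somethinvianyyhing'] as $text) {
    $rejected[$text] = findNumeral($pattern, $text);
}

// Caveat: V?I{0,3} can match the empty string, so a space directly
// followed by a dot also counts as a match with an empty capture.
$emptyCapture = findNumeral($pattern, 'end .');
```

Every entry in `$matched` holds its own numeral and every entry in `$rejected` is null. The last line shows one side effect worth knowing about: because `V?I{0,3}` may match nothing, " ." is accepted as well.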

Your approach to matching Roman numerals is far from correct. A more accurate expression, restricted to the symbols up to L (50), is this:

^(?:XL|L|L?(?:IX|X{1,3}|X{0,3}(?:IX|IV|V|V?I{1,3})))$

See it here on Regexr

I have tested this only superficially, but you can see that it gets complex quickly, and C, D and M are still missing from this expression.

Not to mention special cases, for example 4 = IV = IIII, and there are more of them.
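For completeness, a widely used way to cover C, D and M as well is to validate thousands, hundreds, tens and units as four consecutive groups. The sketch below is not from the answer above: the function name `isRomanNumeral` is hypothetical, the range is assumed to be 1 to 3999, and additive forms such as IIII are deliberately rejected.

```php
<?php
// Hypothetical helper: true if $s is a standard (subtractive) Roman numeral
// in the range 1..3999, written in upper case.
function isRomanNumeral(string $s): bool
{
    $pattern = '/^(?=[MDCLXVI])'         // reject the empty string
             . 'M{0,3}'                  // thousands: 0..3000
             . '(?:CM|CD|D?C{0,3})'      // hundreds: 900, 400, 0..300, 500..800
             . '(?:XC|XL|L?X{0,3})'      // tens: 90, 40, 0..30, 50..80
             . '(?:IX|IV|V?I{0,3})'      // units: 9, 4, 0..3, 5..8
             . '\z/';                    // true end of string, no trailing newline
    return preg_match($pattern, $s) === 1;
}

$valid   = array_map('isRomanNumeral', ['XI', 'XLIX', 'MMXI', 'MCMXCIV']);
$invalid = array_map('isRomanNumeral', ['', 'IIII', 'VX', 'IC', 'somethinvianyyhing']);
```

All entries of `$valid` are true and all entries of `$invalid` are false. To search inside running text in the " IV." format from the question, the same four groups can be placed between a leading space and `\.` instead of the `^` and `\z` anchors. This version also accepts 41 to 49 (for example XLIX), which the I-to-L expression above appears to miss.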

Wikipedia about Roman numbers
