Finding the last occurrence of a word

Question:

I have the following string:

<SEM>electric</SEM> cu <SEM>hello</SEM> rent <SEM>is<I>love</I>, <PARTITION />mind

I want to find the last "SEM" start tag before the "PARTITION" tag: the SEM start tag, not the SEM end tag. The result should be:

<SEM>is<I>love</I>, <PARTITION />

I have tried this regular expression:

<SEM>[^<]*<PARTITION[ ]/>

but it only works if the final "SEM" and "PARTITION" tags do not have any other tag between them. Any ideas?

Answer 1:

And here's your goofy Regex!!!

(?=[\s\S]*?\<PARTITION)(?![\s\S]+?\<SEM\>)\<SEM\>

What that says is "While ahead somewhere is a PARTITION tag... but while ahead is NOT another SEM tag... match a SEM tag."

Enjoy!

Here's that regex broken down:

(?=[\s\S]*?\<PARTITION) means "While ahead somewhere is a PARTITION tag"
(?![\s\S]+?\<SEM\>) means "While no further SEM start tag appears anywhere ahead"
\<SEM\> means "Match a SEM tag"
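
For anyone who wants to try it, here is a minimal C# sketch of how the pattern above might be used. The class and method names are made up for illustration. Two things to keep in mind: the regex matches only the tag itself, so the sketch locates the end of the PARTITION tag separately, and the negative lookahead scans the whole rest of the string, so a SEM start tag that appears after PARTITION would prevent any match.

```csharp
using System;
using System.Text.RegularExpressions;

static class LastSemLookahead // hypothetical name, not from the answer
{
    // Pattern from the answer: a <SEM> with a PARTITION tag somewhere ahead
    // and no further <SEM> start tag anywhere ahead.
    const string Pattern = @"(?=[\s\S]*?\<PARTITION)(?![\s\S]+?\<SEM\>)\<SEM\>";

    // Returns the text from the matched <SEM> through the end of the
    // PARTITION tag, or null when nothing matches.
    public static string Extract(string text)
    {
        Match sem = Regex.Match(text, Pattern);
        if (!sem.Success) return null;

        // The positive lookahead guarantees "<PARTITION" exists after sem.Index.
        int partitionStart = text.IndexOf("<PARTITION", sem.Index, StringComparison.Ordinal);
        int partitionEnd = text.IndexOf("/>", partitionStart, StringComparison.Ordinal);
        if (partitionEnd < 0) return null; // PARTITION tag is never closed

        return text.Substring(sem.Index, partitionEnd + 2 - sem.Index);
    }

    static void Main()
    {
        string text = "<SEM>electric</SEM> cu <SEM>hello</SEM> rent <SEM>is<I>love</I>, <PARTITION />mind";
        Console.WriteLine(Extract(text)); // <SEM>is<I>love</I>, <PARTITION />
    }
}
```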

Answer 2:

Use String.IndexOf to find PARTITION and String.LastIndexOf to find SEM?

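// Assumes the text contains a PARTITION tag (partitionIndex >= 0).
int partitionIndex = text.IndexOf("<PARTITION");
// Search backward from the PARTITION tag for the nearest SEM start tag.
int semIndex = text.LastIndexOf("<SEM>", partitionIndex);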
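
A slightly fuller sketch of the same idea, wrapped in a method with guards for missing tags. The class and method names are hypothetical, and the ordinal comparison is my own choice rather than something the answer specifies.

```csharp
using System;

static class LastSemIndexOf // hypothetical name, not from the answer
{
    // Returns the index of the last "<SEM>" that lies entirely before the
    // first "<PARTITION", or -1 when either tag is missing.
    public static int FindLastSemBeforePartition(string text)
    {
        int partitionIndex = text.IndexOf("<PARTITION", StringComparison.Ordinal);
        if (partitionIndex <= 0) return -1; // no PARTITION, or nothing before it

        // LastIndexOf scans backward, starting at partitionIndex.
        return text.LastIndexOf("<SEM>", partitionIndex, StringComparison.Ordinal);
    }

    static void Main()
    {
        string text = "<SEM>electric</SEM> cu <SEM>hello</SEM> rent <SEM>is<I>love</I>, <PARTITION />mind";
        int semIndex = FindLastSemBeforePartition(text);
        if (semIndex >= 0)
        {
            // Print everything from the last SEM start tag up to the PARTITION tag.
            int partitionIndex = text.IndexOf("<PARTITION", StringComparison.Ordinal);
            Console.WriteLine(text.Substring(semIndex, partitionIndex - semIndex));
        }
    }
}
```

No regex is involved, so there is no backtracking cost. The trade-off is that the tags must be spelled exactly as searched for.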

Answer 3:

If you are going to use a regex to find the last occurrence of something then you might also want to use the right-to-left parsing regex option:

new Regex("...", RegexOptions.RightToLeft);
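
As a rough sketch of how that option could apply here: cut the text at the PARTITION tag, then let a right-to-left regex find the SEM start tag nearest to the end of that prefix. The names and the `<\s*SEM\s*>` pattern are my assumptions, since the answer leaves the pattern open.

```csharp
using System;
using System.Text.RegularExpressions;

static class LastSemRightToLeft // hypothetical name, not from the answer
{
    // Returns the index of the last SEM start tag before the first
    // "<PARTITION", or -1 when there is none.
    public static int FindLastSemBeforePartition(string text)
    {
        int partitionIndex = text.IndexOf("<PARTITION", StringComparison.Ordinal);
        if (partitionIndex < 0) return -1;

        // Only the part before PARTITION is searched.
        string prefix = text.Substring(0, partitionIndex);

        // With RightToLeft the first match found is the right-most one.
        Match m = Regex.Match(prefix, @"<\s*SEM\s*>", RegexOptions.RightToLeft);
        return m.Success ? m.Index : -1;
    }

    static void Main()
    {
        string text = "<SEM>electric</SEM> cu <SEM>hello</SEM> rent <SEM>is<I>love</I>, <PARTITION />mind";
        Console.WriteLine(FindLastSemBeforePartition(text));
    }
}
```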

Answer 4:

The solution is this; I have tested it at http://regexlib.com/RETester.aspx

<\s*SEM\s*>(?!.*</SEM>.*).*<\s*PARTITION\s*/>

As you want the last one, one way to identify it is to require that no </SEM> appears anywhere after the matched <SEM>.

I have included "\s*" in case there are some spaces inside <SEM> or <PARTITION/>.

Basically, the negative lookahead below rejects any <SEM> that is followed by </SEM>:

(?!.*</SEM>.*)
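
Note that this lookahead also sees any </SEM> that comes after the PARTITION tag, so it relies on the last SEM element not being closed later on the same line. A possible variant, and this is a different technique from the one in the answer, checks character by character that no new SEM start tag begins between the match and PARTITION. Here is a sketch with made-up names:

```csharp
using System;
using System.Text.RegularExpressions;

static class TemperedSem // hypothetical name, not from the answer
{
    // <SEM>, then any characters as long as no new SEM start tag begins,
    // then the PARTITION tag.
    const string Pattern =
        @"<\s*SEM\s*>(?:(?!<\s*SEM\s*>)[\s\S])*?<\s*PARTITION\s*/>";

    // Returns the matched text, or null when nothing matches.
    public static string Extract(string text)
    {
        Match m = Regex.Match(text, Pattern);
        return m.Success ? m.Value : null;
    }

    static void Main()
    {
        // Unlike the original sample, the last SEM is closed after PARTITION.
        string text = "<SEM>electric</SEM> cu <SEM>hello</SEM> rent <SEM>is<I>love</I>, <PARTITION />mind</SEM>";
        Console.WriteLine(Extract(text)); // <SEM>is<I>love</I>, <PARTITION />
    }
}
```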

Answer 5:

Bit quick-and-dirty, but try this:

(<SEM>.*?</SEM>.*?)*(<SEM>.*?<PARTITION)

and take a look at what's in the C#/.NET equivalent of $2, which is match.Groups[2].Value.

The secret lies in the lazy-matching construct (.*?) --- I assume/hope C# supports this.

Clearly, Jon Skeet's solution will perform better, but you may want to use a regex (to simplify breaking up the bits that interest you, for example).

(Disclaimer: I'm a Perl/Python/Ruby person myself...)
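
For the C# side, a minimal sketch of reading that second group might look like this. The class and method names are hypothetical. Also note that "." does not match newlines unless RegexOptions.Singleline is set, so this assumes the text sits on one line.

```csharp
using System;
using System.Text.RegularExpressions;

static class LastSemGroup2 // hypothetical name, not from the answer
{
    // Group 1 swallows complete <SEM>...</SEM> pairs; group 2 is the final
    // <SEM> up to the start of the PARTITION tag.
    const string Pattern = @"(<SEM>.*?</SEM>.*?)*(<SEM>.*?<PARTITION)";

    // Returns the contents of group 2, or null when nothing matches.
    public static string Extract(string text)
    {
        Match m = Regex.Match(text, Pattern);
        return m.Success ? m.Groups[2].Value : null;
    }

    static void Main()
    {
        string text = "<SEM>electric</SEM> cu <SEM>hello</SEM> rent <SEM>is<I>love</I>, <PARTITION />mind";
        Console.WriteLine(Extract(text)); // <SEM>is<I>love</I>, <PARTITION
    }
}
```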

Answer 6:

Have you tried this:

<SEM>.*<PARTITION\s*/>

Your regular expression was matching anything but "<" after the "SEM" tag. Therefore it would stop matching as soon as it hit any other tag, such as the closing "SEM" tag.
