Pattern matching in VBS

Posted 2019-09-07 20:33

Question:

Using this as a reference: https://msdn.microsoft.com/en-us/library/ms974570.aspx#scripting05_topic2

I've been trying to figure out how to create a pattern to pull this:

LicenseDetail.asp?SID=&id=F1A32D21A83C2BB2BBF227E5443A6023

Out of this:

height='40'><td colspan='1' width='20%' align='center'bgcolor='#e9edf2'><font face=verdana color=#000000 size=-1>Real Estate Broker or Sales</font></td><td colspan='1' align='center' bgcolor='#e9edf2'><font face=verdana color=#000000 size=-1><a href='LicenseDetail.asp?SID=&id=F5A76372AAA358B9CD869630255FA424'>ALMEIDA, JOHN SOBRAL</a></font></td

I've tried a number of different combos, but I'm not even close...

For example, based on what I'm reading, it seems like the () should grab the literal text and the \alphanumeric should grab the trailing numbers and letters, stopping before the ' (since it's not a number or letter). This fails: "(LicenseDetail.asp?SID=&id=)\alphanumeric"

Thanks in advance.

Answer 1:

(1) Re-read the syntax details (e.g. "\alphanumeric" is not a valid class; the shorthand for a letter, digit, or underscore is \w)

(2) Search for "LicenseDetail" followed by "everything that is not a '"

In code:

  Dim s : s     = "height='40'><td colspan='1' width='20%' align='center'bgcolor='#e9edf2'><font face=verdana color=#000000 size=-1>Real Estate Broker or Sales</font></td><td colspan='1' align='center' bgcolor='#e9edf2'><font face=verdana color=#000000 size=-1><a href='LicenseDetail.asp?SID=&id=F5A76372AAA358B9CD869630255FA424'>ALMEIDA, JOHN SOBRAL</a></font></td"
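  Dim r : Set r = New RegExp
  ' Match the literal "LicenseDetail", then everything up to (not including) the next '
  r.Pattern = "LicenseDetail[^']+"
  Dim m : Set m = r.Execute(s)
  ' Global is False by default, so Count is either 0 or 1
  If 1 = m.Count Then
     WScript.Echo m(0).Value
  Else
     WScript.Echo "No match"
  End If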

Output:

LicenseDetail.asp?SID=&id=F5A76372AAA358B9CD869630255FA424
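
As for why the pattern in the question fails: an unescaped . matches any character, an unescaped ? makes the preceding p optional (so "asp?SID" can never match the literal text "asp?SID"), and \alphanumeric is not a character class. A minimal sketch of a stricter variant follows, assuming you want the escaped literal prefix plus a capture group around the id. The shortened input string is a stand-in for illustration, not the full HTML from the question.

```vbscript
Option Explicit

' Shortened, hypothetical input; only the href matters here
Dim s : s = "<a href='LicenseDetail.asp?SID=&id=F5A76372AAA358B9CD869630255FA424'>ALMEIDA, JOHN SOBRAL</a>"

Dim r : Set r = New RegExp
' \. and \? match a literal dot and question mark;
' (\w+) captures the run of letters, digits and underscores after "id="
r.Pattern = "LicenseDetail\.asp\?SID=&id=(\w+)"

Dim m : Set m = r.Execute(s)
If m.Count > 0 Then
   WScript.Echo m(0).Value          ' whole match, prefix included
   WScript.Echo m(0).SubMatches(0)  ' captured id only
Else
   WScript.Echo "No match"
End If
```

The [^']+ version above is more forgiving if the query string changes; the escaped version is the better fit when you need the id on its own.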

Update regarding the comment:

I have no idea why quotes become double quotes when they hit a file, but I know why [^"] 'does not work': in a VBScript string literal, a " has to be escaped by doubling it (""). In code:

>> s = "name=""escapedquote"""
>> Set r = New RegExp
>> r.Pattern = """"
>> WScript.Echo s, r.Replace(s, "'")
>>
name="escapedquote" name='escapedquote"
>>

(Go here to see negated (double) quotes in a regular expression pattern in action.)
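
A minimal sketch of a negated double quote inside a pattern, assuming the input delimits the href with " instead of '. The input string is a hypothetical variant of the HTML from the question.

```vbscript
Option Explicit

' Hypothetical input that uses double quotes around the attribute value
Dim s : s = "<a href=""LicenseDetail.asp?SID=&id=F5A76372AAA358B9CD869630255FA424"">"

Dim r : Set r = New RegExp
' Inside the string literal, "" stands for one ", so the regex engine
' receives the pattern: LicenseDetail[^"]+
r.Pattern = "LicenseDetail[^""]+"

Dim m : Set m = r.Execute(s)
If m.Count > 0 Then
   WScript.Echo m(0).Value
Else
   WScript.Echo "No match"
End If
```

Writing [^"] with a single quote character ends the string literal early, which is why that form is rejected before the regular expression is ever evaluated.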