Selenium C# Webdriver FindElements(By.LinkText) Re

Posted 2020-03-03 06:10

Question:

Is it possible to find links on a webpage by searching their text using a pattern like A-ZNN:NN:NN:NN, where N is a single digit (0-9)?

I've used Regex in PHP to turn text into links, so I was wondering if it's possible to use this sort of filter in Selenium with C# to find links that will all look the same, following a certain format.

I tried:

driver.FindElements(By.LinkText("[A-Z][0-9]{2}):([0-9]{2}):([0-9]{2}):([0-9]{2}")).ToList();

But this didn't work. Any advice?

Answer 1:

In a word, no. None of the FindElement() strategies support regular expressions for finding elements. The simplest way to do this is to use FindElements() to find all of the links on the page, then match each link's .Text property against your regular expression.

Note though that if clicking on the link navigates to a new page in the same browser window (i.e., does not open a new browser window when clicking on the link), you'll need to capture the exact text of all of the links you'd like to click on for later use. I mention this because if you try to hold onto the references to the elements found during your initial FindElements() call, they will be stale after you click on the first one. If this is your scenario, the code might look something like this:

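// WARNING: Untested code written from memory.
// Not guaranteed to be exactly correct.
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Text.RegularExpressions;
using OpenQA.Selenium;

List<string> matchingLinks = new List<string>();

// Assume "driver" is a valid IWebDriver.
ReadOnlyCollection<IWebElement> links = driver.FindElements(By.TagName("a"));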

// You could probably use LINQ to simplify this, but here is
// the foreach solution
foreach(IWebElement link in links)
{
    string text = link.Text;
    if (Regex.IsMatch(text, "your Regex here"))
    {
        matchingLinks.Add(text);
    }
}

foreach(string linkText in matchingLinks)
{
    IWebElement element = driver.FindElement(By.LinkText(linkText));
    element.Click();
    // do stuff on the page navigated to
    driver.Navigate().Back();
}
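
The filtering loop above could be condensed with LINQ, as its comment suggests. What follows is only a sketch: it assumes the pattern in the question means one uppercase letter followed by four two-digit groups separated by colons (for example "A12:34:56:78"). The names LinkTextFilter and Matching are hypothetical helpers, not part of Selenium.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class LinkTextFilter
{
    // Assumed format: one uppercase letter, then four two-digit groups
    // separated by colons. [0-9] is used instead of \d because \d in .NET
    // also matches non-ASCII digits by default.
    public static readonly Regex Pattern =
        new Regex(@"^[A-Z][0-9]{2}(:[0-9]{2}){3}$", RegexOptions.CultureInvariant);

    // Returns the distinct link texts that match the pattern as a whole.
    public static List<string> Matching(IEnumerable<string> linkTexts)
    {
        return linkTexts
            .Select(t => (t ?? string.Empty).Trim())
            .Where(t => Pattern.IsMatch(t))
            .Distinct()
            .ToList();
    }
}
```

With a live driver, the input would come from something like driver.FindElements(By.TagName("a")).Select(e => e.Text), and the returned strings can be passed to By.LinkText() one at a time, as in the second loop above.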


Answer 2:

Don't use regex to parse HTML.

Use HtmlAgilityPack instead.

You can follow these steps:

Step 1: Use an HTML parser to extract all the links from the page and store them in a list.

HtmlWeb hw = new HtmlWeb();
HtmlDocument doc = hw.Load(/* url */);
// Note: SelectNodes returns null when no node matches the XPath.
foreach (HtmlNode link in doc.DocumentNode.SelectNodes("//a[@href]"))
{
    // collect all links here
}

Step 2: Use this regex to match the links in the list:

.*?[A-Z]\d{2}:\d{2}:\d{2}:\d{2}.*?

Step 3: The matches are the links you want.
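
Put together, the three steps might look roughly like the sketch below. It assumes the HtmlAgilityPack NuGet package is installed and that the pattern should be tested against each link's visible text. AgilityLinkFinder and FindMatchingHrefs are hypothetical names, and the helper takes an HTML string rather than a URL so that it can be tried without network access.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;
using HtmlAgilityPack; // NuGet package "HtmlAgilityPack" (assumed to be installed)

public static class AgilityLinkFinder
{
    // Same shape as the regex in Step 2; the leading and trailing ".*?" are
    // dropped because IsMatch already searches anywhere in the input.
    private static readonly Regex Pattern =
        new Regex(@"[A-Z]\d{2}:\d{2}:\d{2}:\d{2}");

    // Returns the href of every <a href="..."> whose text contains the pattern.
    public static List<string> FindMatchingHrefs(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var hrefs = new List<string>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
        {
            return hrefs;
        }

        foreach (HtmlNode anchor in anchors)
        {
            // Decode entities such as &amp; before testing the text.
            string text = HtmlEntity.DeEntitize(anchor.InnerText).Trim();
            if (Pattern.IsMatch(text))
            {
                hrefs.Add(anchor.GetAttributeValue("href", string.Empty));
            }
        }
        return hrefs;
    }
}
```

One design note: HtmlWeb.Load() fetches its own copy of the page over HTTP, so it will not see content rendered by JavaScript or tied to the Selenium browser's session. If that matters, passing driver.PageSource to a helper like this one is a possible workaround.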