cssSelector vs XPath for Selenium

Posted 2019-02-16 00:24

Question:

As per my understanding, a CSS selector also traverses the DOM. CSS files carry no information about element positions, so why is cssSelector (theoretically) faster than XPath?

Theoretically, cssSelector should take less time than XPath, since XPath needs to traverse the HTML DOM. With XPath we can search for elements backward or forward in the DOM hierarchy, while CSS works only in a forward direction.

But if cssSelector also traverses the HTML DOM, what makes it faster?

In other words, how does cssSelector actually work internally, and why is it generally preferred and recommended over XPath?

Please also share other benefits of using cssSelector over XPath.

And vice versa: in which areas is XPath better than cssSelector?

Answer 1:

I've read a lot of articles and I've seen some like this and this that have data that show that CSS selectors are faster and I've done a little testing and have come to the same conclusion. I talked to Dave Haeffner, author of elementalselenium.com, in Dec 2016 and asked him about the perf numbers on his site (in the post I linked above) since they were pretty old. He linked me a presentation (see pp18-23) where he updated the tests and CSS selectors are still faster but XPath is catching up in a few configs.

So we can see evidence that it's true but I've never seen anyone talk about the technical details of why. If I were to guess, it would be because a lot of work has gone into the different browsers to optimize the speed of page rendering. Having CSS selectors work quickly makes the page render faster and since the browser drivers take advantage of the browser's ability to locate elements, that means CSS selectors generally win. I've read that some browsers have improved their XPath locator speed but I think it will likely always lag behind CSS selectors because it's just much less common than CSS selectors.

Both CSS selectors and XPath have to traverse the DOM, so there's no real difference there other than the speed of the engine that does the traversing. The CSS selector engine is likely a fine-tuned machine by this point compared to the XPath engine, because of the widespread use of CSS selectors.

My general locator strategy is ID first, CSS selector for everything else. When nothing else works I use XPath. It will vary from site to site, but in my experience IDs are maybe ~10% of my locators, CSS selectors are probably ~80%, and the last 10% is XPath. I generally use XPath when I need to locate an element by its contained text, and very rarely for DOM traversal. An example of my XPath usage might be needing to find an element in a TABLE relative to a row label, e.g. the price of cheese in a table row where the first cell contains "cheese" and the third cell contains the price.
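A minimal sketch of that kind of lookup with the Ruby Selenium bindings (the URL, table id, and column positions below are invented for illustration):

require 'selenium-webdriver'

driver = Selenium::WebDriver.for :chrome
driver.get 'https://example.com/prices'   # hypothetical page with a #prices table

# XPath can anchor on the text of one cell and then move to a sibling cell,
# something a CSS selector can't express.
cheese_price = driver.find_element(
  xpath: "//table[@id='prices']//tr[td[1][normalize-space()='cheese']]/td[3]"
)
puts cheese_price.text

# For everything else, an ID or CSS selector usually does the job:
driver.find_element(id: 'checkout')            # ID first
driver.find_element(css: '#prices td.total')   # CSS selector for the rest

driver.quit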

I think XPath is seen a lot on sites like SO and many blogs because of its easy access. All I have to do is right-click an element in the devtools and Copy XPath. The problem is that many times this generates a bad, brittle XPath. A handcrafted XPath is better, but it takes time and experience to handcraft a good XPath or CSS selector... time that many aren't willing to put in. A badly crafted CSS selector or XPath will also make things slower. Many times there are any number of ways that an element could be located, and some are way more efficient than others... it comes down to the efficiency of the locator and how you use it. A badly formed CSS selector isn't automatically going to be faster than a well-formed XPath.
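To illustrate the difference (the markup, ids, and selectors here are invented, not taken from any real page):

require 'selenium-webdriver'

driver = Selenium::WebDriver.for :chrome
driver.get 'https://example.com/login'   # hypothetical page

# A devtools "Copy XPath" locator is tied to the exact position of the node,
# so any layout change breaks it:
driver.find_element(xpath: '/html/body/div[2]/div[1]/form/div[3]/button')

# A handcrafted locator keyed to stable attributes survives layout changes,
# and the equivalent CSS selector is just as robust:
driver.find_element(xpath: "//form[@id='login']//button[@type='submit']")
driver.find_element(css: '#login button[type="submit"]')

driver.quit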



Answer 2:

The debate between cssSelector and XPath remains one of the most heated and subjective conversations in the Selenium community. A quick recap of what we already know so far:

  • People in favor of cssSelector say that it is more readable and faster (specifically when running against Internet Explorer).
  • Those in favor of XPath tout its ability to traverse the page (which cssSelector cannot).
  • Traversing the DOM in older browsers like IE8 does not work with cssSelector but is fine with XPath.
  • XPath can walk up the DOM (e.g. from child to parent), whereas cssSelector can only traverse down the DOM (e.g. from parent to child); see the sketch after this list.
  • However not being able to traverse the DOM with cssSelector in older browsers isn't necessarily a bad thing as it is more of an indicator that your page has poor design and could benefit from some helpful markup.
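A minimal sketch of that up-vs-down difference with the Ruby bindings (the URL, form id, and input id are invented):

require 'selenium-webdriver'

driver = Selenium::WebDriver.for :chrome
driver.get 'https://example.com/signup'   # hypothetical form page

# XPath can step back up the tree: start at the input and select the
# <form> that contains it.
form = driver.find_element(xpath: "//input[@id='email']/ancestor::form")

# A CSS selector only moves down the tree: you have to start from the form
# (or another ancestor) and descend to the input.
email = driver.find_element(css: 'form#signup input#email')

driver.quit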

Dave Haeffner carried out a test on a page with two HTML data tables, one written without helpful attributes (ID and Class) and the other with them. I have analyzed the test procedure and the outcome of this experiment in detail in the discussion Why should I ever use CSS selectors as opposed to XPath for automated testing? While this experiment demonstrated that each Locator Strategy is reasonably equivalent across browsers, it didn't adequately paint the whole picture for us.


cssSelector vs XPath, Under a Microscope

Dave Haeffner, in the discussion Css Vs. X Path, Under a Microscope, mentioned that in an end-to-end test there are a lot of other variables at play: Sauce startup, browser startup, and latency to and from the application under test. The unfortunate takeaway from that experiment could be that one driver is faster than the other (e.g. IE vs Firefox), when in fact that wasn't the case at all. To get a real taste of the performance difference between cssSelector and XPath, we need to dig deeper. This can be achieved by running everything from a local machine, using a performance benchmarking utility, focusing on a specific Selenium action rather than the entire test run, and running things numerous times.

To demonstrate this in detail, a Windows XP virtual machine was set up and Ruby (1.9.3) was installed, along with all the available browsers and their equivalent Selenium browser drivers. For benchmarking, Ruby's standard-library benchmark was used.


The Test

In order to get an adequate sample set of data, the same test was run against each browser 100 times. To weed out anomalies in the data, the rehearsal feature of benchmark (bmbm) was used, so that it would run the full test sequence, perform garbage collection, and then run it again. To make things comparable, a few of the locators were updated to make for better matches in comparison to each other. The specific action measured is find_element.


Test Code

require_relative 'base'
require 'benchmark'

class SmallDOM < Base

  LOCATORS = {
    :id => {
      id: 'table2'
    },
    :table_header_class => {
      class: 'dues'
    },
    :table_header_id_and_class => {
      :css => "#table2 thead .dues",
      :xpath => "//table[@id='table2']//thead//*[@class='dues']"
    },
    :table_header_id_class_and_direct_desc => {
      :css => "#table2 > thead .dues",
      :xpath => "//table[@id='table2']/thead//*[@class='dues']"
    },
    :table_header_traversing => {
      :css => "#table2 thead tr th:nth-of-type(4)",
      :xpath => "//table[@id='table2']//thead//tr//th[4]"
    },
    :table_header_traversing_and_direct_desc => {
      :css => "#table2 > thead > tr > th:nth-of-type(4)",
      :xpath => "//table[@id='table2']/thead/tr/th[4]"
    },
    :table_cell_id_and_class => {
      :css => "#table2 tbody .dues",
      :xpath => "//table[@id='table2']//tbody//*[@class='dues']"
    },
    :table_cell_id_class_and_direct_desc => {
      :css => "#table2 > tbody .dues",
      :xpath => "//table[@id='table2']/tbody//*[@class='dues']"
    },
    :table_cell_traversing => {
      :css => "#table2 tbody tr td:nth-of-type(4)",
      :xpath => "//table[@id='table2']//tbody//tr//td[4]"
    },
    :table_cell_traversing_and_direct_desc => {
      :css => "#table2 > tbody > tr > td:nth-of-type(4)",
      :xpath => "//table[@id='table2']/tbody/tr/td[4]"
    }
  }

  attr_reader :driver

  def initialize(driver)
    @driver = driver
    visit '/tables'
    super
  end

  # The benchmarking approach was borrowed from
  # http://rubylearning.com/blog/2013/06/19/how-do-i-benchmark-ruby-code/
  def benchmark
    Benchmark.bmbm(27) do |bm|
      LOCATORS.each do |example, data|
        data.each do |strategy, locator|
          bm.report(example.to_s + " using " + strategy.to_s) do
            begin
              ENV['iterations'].to_i.times do
                find(strategy => locator)
              end
            rescue Selenium::WebDriver::Error::NoSuchElementError
              puts "( 0.0 )"
            end
          end
        end
      end
    end
  end

end
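For context, a runner for this class might look roughly like the sketch below. The original post's base.rb (which supplies the visit and find helpers used above) and its harness script aren't reproduced here, so the file name, driver choice, and iteration count are assumptions.

require 'selenium-webdriver'
require_relative 'small_dom'   # hypothetical file name for the class above

ENV['iterations'] ||= '100'    # 100 find_element calls per locator, as in the post

driver = Selenium::WebDriver.for :firefox
begin
  SmallDOM.new(driver).benchmark
ensure
  driver.quit
end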

Results

NOTE: The output is in seconds, and the results are for the total run time of 100 executions.


Analyzing the Results

  • On the whole, Internet Explorer is slower than the other drivers, but between CSS and XPath it looks like XPath is actually faster than CSS.
  • Chrome and Opera have some differences, albeit much smaller, but they sway in both directions.
  • In some cases CSS is faster, and in others, XPath.
  • Firefox looks to be a bit more optimized for CSS since it's mostly faster across the board.

Outro

Even with these speed differences, they are only a few seconds (or fractions of seconds) apart, and that's for 100 executions. When you consider that a test run takes 30 seconds or more to complete, this kind of difference is negligible. So the choice between css-selectors and xpath can be a tough one to make, but now you are armed with more than enough data to make the choice for yourself. It's really just a matter of finding what works for you and your team, and not getting weighed down by the hype and opinions around which one is better.



Answer 3:

I always use xpath. With xpath I've got the ability to use any of the attributes of an object, including its ID or Name. I can use combinations and subsets of them and invoke xpath's powerful language if I need it.
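A small sketch of combining attributes in a single xpath with the Ruby bindings (the page, attribute names, and values are invented):

require 'selenium-webdriver'

driver = Selenium::WebDriver.for :chrome
driver.get 'https://example.com'   # hypothetical page

# Match on several attributes at once:
driver.find_element(xpath: "//input[@name='q' and @type='search']")

# Partial attribute matches and text content are available too:
driver.find_element(xpath: "//a[contains(@class, 'nav') and contains(., 'Pricing')]")

driver.quit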

From my experience, navigating the DOM tree using xpath is a better way of locating an object than using CSS, which is, after all, meant to describe only how an element is displayed.

As for speed comparison, I'd reckon it would depend on the complexity of the expression. Do find_element_by_id('myID'), find_element_by_xpath("//*[@id='myID']") and find_element_by_css_selector('#myID') invoke different engines and produce different response times?
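For reference, the three look-ups would be roughly equivalent to the following (shown with the Ruby bindings used in the benchmark above; the page and id are hypothetical):

require 'selenium-webdriver'

driver = Selenium::WebDriver.for :chrome
driver.get 'https://example.com'   # hypothetical page

# Three ways to find the same element; whether they hit different engines,
# and how fast each engine is, is exactly what the benchmarks above measure.
driver.find_element(id:    'myID')
driver.find_element(css:   '#myID')
driver.find_element(xpath: "//*[@id='myID']")

driver.quit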

Much of the bad press I've heard and read about xpath comes from people using hardcoded paths like //div[1]/div[2]/a[3], which are brittle. A more robust xpath expression, such as //div[@id='main']//a[@href='xpath_nodes.asp'], could be written instead.