What's the difference between getting text and innerHTML when using Selenium? Even though there is text under a particular element, calling .text gives us empty values, while .get_attribute("innerHTML") works fine.
Can someone point out the difference between the two? When should someone use .get_attribute("innerHTML") over .text?
.text will return an empty string if the text is not present in the viewport, so you can scroll the element into the viewport and try .text again; it should then retrieve the value.
In contrast, innerHTML can get the value even if the element is outside the viewport.
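A minimal sketch of that fallback idea, assuming the element behaves like a Selenium WebElement (it has a .text property and a .get_attribute(name) method). The read_text helper and the FakeElement stand-in are hypothetical names used only so the example runs without a browser.

```python
def read_text(element):
    """Return visible text, falling back to innerHTML when .text is empty.

    `element` is assumed to behave like a Selenium WebElement, i.e. it has
    a `.text` property and a `.get_attribute(name)` method.
    """
    visible = element.text
    if visible:
        return visible
    # .text came back empty; read the markup inside the element instead.
    return element.get_attribute("innerHTML") or ""


class FakeElement:
    """Hypothetical stand-in for a WebElement, used only for this demo."""

    def __init__(self, text, inner_html):
        self.text = text
        self._attrs = {"innerHTML": inner_html}

    def get_attribute(self, name):
        return self._attrs.get(name)


hidden = FakeElement(text="", inner_html="<span>Example Text</span>")
shown = FakeElement(text="Example Text", inner_html="<span>Example Text</span>")

print(read_text(hidden))  # falls back to the innerHTML string
print(read_text(shown))   # uses .text directly
```

Note that the fallback returns markup, not plain text, so the caller may still need to strip tags.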
Chrome (I'm not sure about other browsers) ignores the extra spaces within the HTML code and displays them as a single space. Calling .get_attribute('innerHTML') will return the double-spaced text (which is what you would see when you inspect the element), while .text will return the string with only one space. This difference is not trivial, as the following will result in a NoSuchElementException.
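One plausible way this bites, offered as an assumption: a locator built from the single-spaced .text value is compared against the raw double-spaced markup, so nothing matches. The plain-Python sketch below only models the whitespace mismatch. collapse_whitespace and the sample string are hypothetical, and the real .text algorithm is implemented by the WebDriver, so treat the collapsing as an approximation.

```python
import re


def collapse_whitespace(raw: str) -> str:
    """Approximate how rendered text collapses runs of whitespace."""
    return re.sub(r"\s+", " ", raw).strip()


raw_inner_html = "Example  Text"  # hypothetical innerHTML with two spaces
rendered_text = collapse_whitespace(raw_inner_html)

print(repr(raw_inner_html))
print(repr(rendered_text))
# An exact comparison between the two forms fails, which is why a lookup
# built from one form may not find text stored in the other form.
print(rendered_text == raw_inner_html)
```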
Similarly, .get_attribute('innerHTML') for the following returns Example Text, while .text returns Example Text.
For instance,
<div><span>Example Text</span></div>
.get_attribute("innerHTML") gives you the actual HTML inside the current element, so theDivElement.get_attribute("innerHTML") returns "<span>Example Text</span>".
.text gives you only the text, not including the HTML nodes, so theDivElement.text returns "Example Text".
Please note that the algorithm for .text depends on each browser's webdriver. In some cases, such as when the element is hidden, you might get different text from different webdrivers.
I usually get the text from .get_attribute("innerText") instead of .text so I can handle all of these cases.
I have just selected the CSS selector and used the code below:
and it prints:
The problem is that the h1[itemprop='name'] selector returns 2 matching nodes on Chrome or Firefox, while .product-h1-container.visible-xl-block>h1 returns only one matching node; that's why it prints what is expected.
To prove my point, run the code below:
It will print
This is because find_element_by_css_selector selects the first element matching the selector, and that element does not contain any text, so nothing is printed. Hope that makes sense now.
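One way to work around an empty first match, sketched under the assumption that you fetch all matches with a plural lookup (for example Selenium's find_elements) and then pick the first one that actually has text. first_with_text is a hypothetical helper, and SimpleNamespace objects stand in for WebElements so the example runs without a browser.

```python
from types import SimpleNamespace


def first_with_text(elements):
    """Return the first element whose .text is non-empty, or None."""
    for element in elements:
        if element.text.strip():
            return element
    return None


# Hypothetical stand-ins for two nodes matched by the same selector:
# the first has no visible text, the second does.
matches = [
    SimpleNamespace(text=""),
    SimpleNamespace(text="Product title"),
]

chosen = first_with_text(matches)
print(chosen.text if chosen else None)
```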
To start with, text is a property, whereas innerHTML is an attribute. Fundamentally, there are some differences between a property and an attribute.
get_attribute("innerHTML")
get_attribute("innerHTML") gets the innerHTML of the element.
This method will first try to return the value of a property with the given name. If a property with that name doesn’t exist, it returns the value of the attribute with the same name. If there’s no attribute with that name, None is returned.
Values which are considered truthy, that is, equal to "true" or "false", are returned as booleans. All other non-None values are returned as strings. For attributes or properties which do not exist, None is returned.
Args:
Example:
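As a rough model of the lookup order described above (property first, then attribute, then None), here is a sketch in plain Python. ModelElement and its two dictionaries are hypothetical and are not part of Selenium; the point is only the order of the lookups, not how the WebDriver implements them.

```python
class ModelElement:
    """Hypothetical model of an element with DOM properties and HTML attributes."""

    def __init__(self, properties, attributes):
        self.properties = properties
        self.attributes = attributes

    def get_attribute(self, name):
        # 1. Prefer a property with the given name.
        if name in self.properties:
            return self.properties[name]
        # 2. Otherwise fall back to an attribute; None if that is missing too.
        return self.attributes.get(name)


element = ModelElement(
    properties={"innerHTML": "<span>Example Text</span>"},
    attributes={"data-id": "42"},
)

print(element.get_attribute("innerHTML"))  # found among the properties
print(element.get_attribute("data-id"))    # found among the attributes
print(element.get_attribute("missing"))    # neither exists
```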
text
text gets the text of the element.
Definition:
Example:
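To make the contrast with innerHTML concrete without a browser, the sketch below uses Python's standard html.parser on the markup from the <div><span>Example Text</span></div> example. TextOnly is a hypothetical helper; it only drops the tags and does not reproduce the visibility or whitespace rules that a real .text call applies.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect only the text nodes, ignoring every tag."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


inner_html = "<span>Example Text</span>"  # what innerHTML of the div holds

parser = TextOnly()
parser.feed(inner_html)
parser.close()
text_only = "".join(parser.chunks)

print(inner_html)  # markup, tags included
print(text_only)   # text without the span tags
```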
Still sounds similar? Read below ...
Attributes and properties
When the browser loads the page, it parses the HTML and generates DOM objects from it. For element nodes, most standard HTML attributes automatically become properties of DOM objects.
For instance, if the tag is <body id="page">, then the DOM object has body.id="page".
HTML attributes
In HTML, tags may have attributes. When the browser parses the HTML to create DOM objects for tags, it recognizes standard attributes and creates DOM properties from them.
So when an element has id or another standard attribute, the corresponding property gets created. But that doesn’t happen if the attribute is non-standard.
So, if an attribute is non-standard, there won’t be a DOM-property for it. In that case all attributes are accessible by using the following methods:
elem.hasAttribute(name): checks for existence.
elem.getAttribute(name): gets the value.
elem.setAttribute(name, value): sets the value.
elem.removeAttribute(name): removes the attribute.
An example of reading a non-standard property:
Property-attribute synchronization
When a standard attribute changes, the corresponding property is auto-updated, and (with some exceptions) vice versa. But there are exclusions: for instance, input.value synchronizes only from attribute -> to property, but not back. This feature actually comes in handy, because the user may modify value, and then after it, if we want to recover the "original" value from HTML, it’s in the attribute.
As per Attributes and Properties in Python, when we reference an attribute of an object with something like someObject.someAttr, Python uses several special methods to get the someAttr attribute of the object. In the simplest case, attributes are simply instance variables.
Python Attributes
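In a broader perspective:
An attribute is a name that appears after an object name, for example someObj.name.
Typically, an attribute is stored as an instance variable in the internal __dict__ of an object.
When we write someObj.name, the default behavior is effectively someObj.__dict__['name']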
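A small sketch of that default behavior for an ordinary class. Page and some_obj are hypothetical names; classes that define __slots__ or custom attribute hooks behave differently, so this covers only the simplest case.

```python
class Page:
    """Ordinary class whose attributes are plain instance variables."""

    def __init__(self, name):
        self.name = name


some_obj = Page("home")

# Dotted access and a direct __dict__ lookup return the same object here.
print(some_obj.name)
print(some_obj.__dict__["name"])
print(some_obj.__dict__)
```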
Python Properties
In Python we can bind getter, setter (and deleter) functions with an attribute name, using the built-in property() function or @property decorator. When we do this, each reference to an attribute has the syntax of direct access to an instance variable, but it invokes the given method function.
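A minimal sketch of a property in action. The Element class here is hypothetical and is not Selenium's WebElement; it only shows that reading el.text looks like plain attribute access while actually running the getter each time.

```python
class Element:
    """Hypothetical class exposing `text` through a getter and a setter."""

    def __init__(self, raw):
        self._raw = raw
        self.reads = 0  # counts how many times the getter has run

    @property
    def text(self):
        # Runs on every `el.text` read, even though no call syntax is used.
        self.reads += 1
        return " ".join(self._raw.split())

    @text.setter
    def text(self, value):
        # Runs on every `el.text = ...` assignment.
        self._raw = value


el = Element("Example  Text")
print(el.text)         # getter collapses the double space
el.text = "New   value"
print(el.text)         # getter runs again on the new raw value
print(el.reads)        # number of getter calls so far
```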