I want to extract the reference from a URL.
For example, my URL looks like:
"https://www.amazon.es/Lenovo-YOGA-520-14IKB-Ordenador-convertible/dp/B071WBF4PZ/"
I want to get only the reference part, that is, B071WBF4PZ.
I also want to extract the price from this HTML element:
"<div id="cerberus-data-metrics" style="display: none;" data-asin="B078ZYX4R5" data-asin-price="1479.00" data-asin-shipping="0" data-asin-currency-code="EUR" data-substitute-count="0" data-device-type="WEB" data-display-code="Asin is not eligible because it has a retail offer" ></div>"
I need to get only the value of the attribute data-asin-price.
It could be done with `indexOf`, `substring` or `split`, but I don't get how to do it.
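For the URL, you can `split` the string on `/`; the reference is then the element at index 5 of the resulting array.
However, I would recommend using regular expressions (patterns) instead.
If you do use `split`, also check that `parts.length >= 6` before you access `parts[5]`.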
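As a rough sketch of both suggestions (not necessarily the exact code from the answer above): the class and method names below are made up for illustration, and the regex assumes the reference always follows a `/dp/` path segment, which is only inferred from the example URL.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AsinFromUrl {
    // Assumption: the reference is the path segment right after "/dp/".
    private static final Pattern DP_SEGMENT = Pattern.compile("/dp/([^/?]+)");

    // split approach: "https:", "", host, product name, "dp", reference -> index 5
    static Optional<String> bySplit(String url) {
        String[] parts = url.split("/");
        // Guard against shorter URLs before touching parts[5]
        return parts.length >= 6 ? Optional.of(parts[5]) : Optional.empty();
    }

    // regex approach: does not depend on the number of segments before "/dp/"
    static Optional<String> byRegex(String url) {
        Matcher m = DP_SEGMENT.matcher(url);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String url = "https://www.amazon.es/Lenovo-YOGA-520-14IKB-Ordenador-convertible/dp/B071WBF4PZ/";
        System.out.println(bySplit(url).orElse("not found"));
        System.out.println(byRegex(url).orElse("not found"));
    }
}
```

The `split` version breaks if the URL has a different number of segments before the reference, which is why the pattern-based version is probably the safer choice.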
Using Jsoup, you can easily parse the HTML and extract attributes like `data-asin-price`. In this case I would not use regular expressions. However, regular expressions don't need extra libraries. A regex that captures the number following `data-asin-price="` in match group 1 would give you `1479.00` for this element.
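A minimal sketch of the regex route using only the standard library. The pattern here is an assumption (digits with an optional decimal part, enclosed in double quotes), and `PriceFromHtml` and `extractPrice` are hypothetical names; the HTML string is a shortened copy of the element from the question.

```java
import java.math.BigDecimal;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PriceFromHtml {
    // Assumed pattern: attribute name, opening quote, then the number captured as group 1.
    private static final Pattern PRICE =
            Pattern.compile("data-asin-price=\"(\\d+(?:\\.\\d+)?)\"");

    static Optional<BigDecimal> extractPrice(String html) {
        Matcher m = PRICE.matcher(html);
        // BigDecimal keeps the value exactly as written, including trailing zeros
        return m.find() ? Optional.of(new BigDecimal(m.group(1))) : Optional.empty();
    }

    public static void main(String[] args) {
        String html = "<div id=\"cerberus-data-metrics\" data-asin=\"B078ZYX4R5\" "
                + "data-asin-price=\"1479.00\" data-asin-currency-code=\"EUR\"></div>";
        System.out.println(extractPrice(html)
                .map(BigDecimal::toPlainString)
                .orElse("not found"));
    }
}
```

This only works while the attribute is written exactly this way; if the markup may vary (single quotes, extra whitespace), an HTML parser such as Jsoup is likely the more robust option.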