I have this string:
<p><span><b>C10373 - FIAT GROUP AUTOMOBILES/RAMO DI AZIENDA DI KUEHNE + NAGEL</b></span>
<p>la somma pari a € 400+IVA per l’attività</p>
<p>TELE+ A 20.000 LIRE AL MESE </p>
<li>a mano o via fax al numero +39.00.0.0.0.00.</li>
<p>Il punteggio base sarà incrementato di un <strong>+ </strong>al ricorrere di ciascuna delle seguenti condizioni:</p>
<li><a href="/aaa/gare/CIGZB81E5568D+RDO1560277+Obblighi+contattuali-signed.pdf" title="">Obblighi contrattuali</a></li>
<li><a href="/aaa/gare/CIGZB81E5568D+RDO1560277+Obblighi+contattuali-signed.pdf" title="">Obblighi contrattuali</a></li>
<li><a href="/aaa/gare/CIGZB81E5568D+RDO_1560277Lettera_Invito.pdf" title="">Lettera di invito</a></li>
<li><a href="/aaa/gare/CIGZB81E5568D+RDO1560277+Obblighi+contattuali-signed.pdf" title="">Obblighi contrattuali</a></li>
<li><a href="/aaa/gare/CIGZB81E5568D+RDO1560277+disciplinare+di+gara-signed.pdf" title="">Disciplinare di gara</a></li>
<li><a href="/aaa/gare/CIGZB81E5568D+determina+di+aggiudicazione+58+2017.pdf" title="">Determina di aggiudicazione</a></li>
that is a part of the html content of my record.
Now, my problem is that I tried to replace ONLY the '+' in the href tags, so I create in MySQL this script:
set @xxx = replace(
@xxx,
substr(@xxx,locate('<a href',@xxx)+3,locate('</a>',@xxx)-locate('<a href',@xxx)),
replace(
substr(
@xxx,
locate('<a href',@xxx)+3,
locate('</a>',@xxx)-locate('<a href',@xxx)
),
'+',
' ')
);
where @xxx is the string in question.
The head is <a href
and the tail is </a>
, and this never change.
But I obtain this result, with a changing only in the substrings that are the same of the first, because the script search only other substrings with the same central part :
<p><span><b>C10373 - FIAT GROUP AUTOMOBILES/RAMO DI AZIENDA DI KUEHNE + NAGEL</b></span>
<p>la somma pari a € 400+IVA per l’attività</p>
<p>TELE+ A 20.000 LIRE AL MESE </p>
<li>a mano o via fax al numero +39.06.85.82.12.56.</li>
<p>Il punteggio base sarà incrementato di un <strong>+ </strong>al ricorrere di ciascuna delle seguenti condizioni:</p>
<li><a href="/aaa/gare/CIGZB81E5568D RDO1560277 Obblighi contattuali-signed.pdf" title="">Obblighi contrattuali</a></li>
<li><a href="/aaa/gare/CIGZB81E5568D RDO1560277 Obblighi contattuali-signed.pdf" title="">Obblighi contrattuali</a></li>
<li><a href="/aaa/gare/CIGZB81E5568D+RDO_1560277Lettera_Invito.pdf" title="">Lettera di invito</a></li>
<li><a href="/aaa/gare/CIGZB81E5568D RDO1560277 Obblighi contattuali-signed.pdf" title="">Obblighi contrattuali</a></li>
<li><a href="/aaa/gare/CIGZB81E5568D+RDO1560277+disciplinare+di+gara-signed.pdf" title="">Disciplinare di gara</a></li>
<li><a href="/aaa/gare/CIGZB81E5568D+determina+di+aggiudicazione+58+2017.pdf" title="">Determina di aggiudicazione</a></li>
I must replace all the '+' in the href tags WITHOUT replacing the '+' in the other places, like in '€ 400+IVA' or in 'al numero +39.00.0.0.0.00.' .
Thank you for the help in advance, I hope the question is not so cryptic ^^ .
it's possible in pure MySQL only without the use off a full blown HTML parser or a UDF that would import REGEX_REPLACE function into MySQL.
The trick is by using a number generator and by using a nested SUBSTRING_INDEX to split the large string on the html tags
This number generator will generate number from 1 to 10.000
So the "parser" will support up to 10.000 tags if you need more you need to add more
Query
this SQL statement will be used to split on html tag
SQL statement
The trick now is to combine the number generator and the html string splittig. So the [tag_position] is filled with a number from the number generator.
This is done with a CROSS JOIN.
Query
see demo http://www.sqlfiddle.com/#!9/de2ed/32
Now we have the tags as separated records it really eazy to replace only with in records that contain "href".
SQL statement
see demo http://www.sqlfiddle.com/#!9/de2ed/38
Now that we know how to replace now we are going to merge the records back to one string. We can use GROUP_CONCAT for that.
Query
see demo for the complete query
http://www.sqlfiddle.com/#!9/de2ed/46