I'm new to LibreOffice Basic. I'm trying to write a macro in LibreOffice Calc that will read the name of a noble House of Westeros from a cell (e.g. Stark), and output the Words of that House by looking it up on the relevant page on A Wiki of Ice and Fire. It should work like this:
Here is the pseudocode:
Read HouseName from column A
Open HtmlFile at "http://www.awoiaf.westeros.org/index.php/House_" & HouseName
Iterate through HtmlFile to find line which begins "<table class="infobox infobox-body"" // Finds the info box for the page.
Read Each Row in the table until Row begins Words
Read the contents of the next <td> tag, and return this as a string.
My problem is with the second line, I don't know how to read a HTML file. How should I do this in LibreOffice Basic?
There are two mainly issues with this. 1. Performance Your UDF will need get the HTTP resource in every cell, in which it is stored. 2. HTML Unfortunately there is no HTML parser in OpenOffice or LibreOffice. There is only a XML parser. Thats why we cannot parse HTML directly with the UDF.
This will work, but slow and not very universal:
The better way would be, if you could get the needed informations from a Web API that would offered from the Wiki. You know the people behind the Wiki? If so, then you could place this there as a suggestion.
Greetings
Axel