For some time I have been trying to get the data out of this HTML table. I have tried both paid and free components, and I have also tried writing some code myself, with no results. I have a class that loads HTML tables directly into a ClientDataSet, but it does not work with this table. Does anyone have any tips on how to get the data out of this HTML table? Or a way to convert it to txt / xls / csv or xml? The code I use to get the table follows:
WebBrowser1.Navigate('http://site2.aesa.pb.gov.br/aesa/monitoramentoPluviometria.do?metodo=listarMesesChuvasMensais');
WebBrowser1.OleObject.Document.All.Tags('select').Item(0).Value:= '2013';
WebBrowser1.OleObject.Document.All.Tags('select').Item(1).Value:= '7';
WebBrowser1.OleObject.Document.All.Tags('input').Item(1).click;
Memo1.Text:= WebBrowser1.OleObject.Document.All.Tags('table').Item(10).InnerHTML;
Memo1.Lines.SaveToFile('table.html');
The following will extract the data from the HTML table on your target page and load it into a ClientDataSet.
It's fairly long-winded, which perhaps demonstrates that, as David said, Delphi is maybe not the best tool for the job.
On my Form1, I have a TEdit, edValue, into which I key the value shown in the first data row of the HTML table. I use this as a way to find the table in the HTML document. I dare say there are better methods, but mine should at least be more robust than hard-coding assumptions about the layout of the document in which the table is embedded, which might not survive a change by the page's author.
Broadly, the code works by first finding the HTML table cell using the contents of my edValue.Text, then finding the table to which the cell belongs, and then populating the CDS's Fields and data from the table.
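To make the lookup step concrete, here is a minimal sketch of finding the table from a known cell value. It is not the code from this answer; the function name FindTableByCellText is hypothetical, and it assumes the page has finished loading and that the value you type matches the trimmed text of exactly one cell.

```delphi
uses
  SysUtils, Variants, MSHTML, SHDocVw;

// Hypothetical helper: returns the table that contains the first <td>
// whose trimmed text equals AValue, or nil if no such cell exists.
function FindTableByCellText(const ADoc: IHTMLDocument2;
  const AValue: string): IHTMLTable;
var
  Doc3: IHTMLDocument3;
  Cells: IHTMLElementCollection;
  Elem: IHTMLElement;
  i: Integer;
begin
  Result := nil;
  if not Supports(ADoc, IHTMLDocument3, Doc3) then
    Exit;
  Cells := Doc3.getElementsByTagName('td');
  for i := 0 to Cells.length - 1 do
  begin
    if not Supports(Cells.item(i, 0), IHTMLElement, Elem) then
      Continue;
    if Trim(Elem.innerText) = AValue then
    begin
      // Walk up the parent chain until an element that is a table is found.
      while Assigned(Elem) and not Supports(Elem, IHTMLTable, Result) do
        Elem := Elem.parentElement;
      Exit;
    end;
  end;
end;
```

From a button handler it could be called as FindTableByCellText(WebBrowser1.Document as IHTMLDocument2, edValue.Text). Because an outer cell that wraps a nested table contains the text of every inner cell, comparing the whole trimmed cell text against a single value tends to match only the innermost cell.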
The CDS fields are set to 255 characters by default; maybe there's a specification for the data published on the web page that would allow you to use a smaller value for some, if not all, fields. They're all assumed to be of type ftString, to avoid the code choking on unexpected cell contents.
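As a rough illustration of that field setup, the sketch below builds one 255-character ftString field per column and copies each cell across. The procedure name LoadTableIntoCDS and the Col0, Col1, ... field names are assumptions made for the example, as is the idea that the first row determines the column count and that the ClientDataSet has no persistent fields.

```delphi
uses
  SysUtils, Variants, DB, DBClient, MSHTML;

// Hypothetical helper: copies every row of ATable into ACDS as strings.
procedure LoadTableIntoCDS(const ATable: IHTMLTable; ACDS: TClientDataSet);
var
  Row: IHTMLTableRow;
  Cell: IHTMLElement;
  r, c: Integer;
begin
  if ATable.rows.length = 0 then
    Exit;
  if not Supports(ATable.rows.item(0, 0), IHTMLTableRow, Row) then
    Exit;

  // One string field per column of the first row, 255 characters each.
  ACDS.Close;
  ACDS.FieldDefs.Clear;
  for c := 0 to Row.cells.length - 1 do
    ACDS.FieldDefs.Add('Col' + IntToStr(c), ftString, 255);
  ACDS.CreateDataSet;

  for r := 0 to ATable.rows.length - 1 do
  begin
    if not Supports(ATable.rows.item(r, 0), IHTMLTableRow, Row) then
      Continue;
    ACDS.Append;
    for c := 0 to Row.cells.length - 1 do
      // Cells beyond the first row's column count are ignored.
      if (c < ACDS.FieldCount) and
         Supports(Row.cells.item(c, 0), IHTMLElement, Cell) then
        ACDS.Fields[c].AsString := Trim(Cell.innerText);
    ACDS.Post;
  end;
end;
```

Treating everything as text means a header row, a blank cell, or a stray footnote marker is stored as-is instead of raising a conversion error; numeric columns can be converted later once the contents are known.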
Btw, at the bottom is a utility function for saving the HTML page locally, to save having to keep clicking the button for selecting a year + month. To reload the WebBrowser from the saved file, just use the file's name as the URL to load.
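A utility of that kind might look something like the sketch below. This is an assumption about its shape, not the original function: the name SaveDocumentHtml is hypothetical, and it simply writes the outer HTML of the document's root element to disk without any special handling of character encoding.

```delphi
uses
  SysUtils, Classes, MSHTML, SHDocVw;

// Hypothetical helper: saves the currently loaded page to AFileName.
procedure SaveDocumentHtml(AWebBrowser: TWebBrowser; const AFileName: string);
var
  Doc3: IHTMLDocument3;
  Lines: TStringList;
begin
  // Document is nil (or not HTML) until a page has been loaded.
  if not Supports(AWebBrowser.Document, IHTMLDocument3, Doc3) then
    Exit;
  Lines := TStringList.Create;
  try
    Lines.Text := Doc3.documentElement.outerHTML;
    Lines.SaveToFile(AFileName);
  finally
    Lines.Free;
  end;
end;
```

Call it only after the result page has been displayed, for example SaveDocumentHtml(WebBrowser1, 'page.html'), and then pass the full path of the saved file to WebBrowser1.Navigate to reload it.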
After some time studying, I finally managed to extract the data from the HTML table. To simplify: I can extract the data from the HTML table directly, without having to 'parse' it. The element I needed was the 'table' tag at 'item' 11; 'item' 10 had the same data, but in a single cell. So what I did was take each element of the HTML table and fill a StringGrid with it, and then I found a way to populate the DBGrid directly through a ClientDataSet. I'll post the code (unit) to stand as an example for anyone who needs it. I want to thank everyone who helped me in the comments. With more study, I'm seeing that the best way to do this is with MSHTML.
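For anyone following along, the StringGrid step could be sketched with MSHTML roughly as follows. This is not the unit mentioned above; the procedure name FillGridFromTable is hypothetical, and it assumes that getElementsByTagName('table') returns the tables in the same document order as All.Tags('table'), so that index 11 refers to the same table.

```delphi
uses
  SysUtils, Variants, Grids, MSHTML, SHDocVw;

// Hypothetical helper: copies the table at ATableIndex into AGrid.
procedure FillGridFromTable(AWebBrowser: TWebBrowser; ATableIndex: Integer;
  AGrid: TStringGrid);
var
  Doc3: IHTMLDocument3;
  Tables: IHTMLElementCollection;
  Table: IHTMLTable;
  Row: IHTMLTableRow;
  Cell: IHTMLElement;
  r, c: Integer;
begin
  if not Supports(AWebBrowser.Document, IHTMLDocument3, Doc3) then
    Exit;
  Tables := Doc3.getElementsByTagName('table');
  if (ATableIndex < 0) or (ATableIndex >= Tables.length) then
    Exit;
  if not Supports(Tables.item(ATableIndex, 0), IHTMLTable, Table) then
    Exit;
  if Table.rows.length = 0 then
    Exit;

  // No fixed header cells; one grid row per table row.
  AGrid.FixedRows := 0;
  AGrid.FixedCols := 0;
  AGrid.RowCount := Table.rows.length;

  for r := 0 to Table.rows.length - 1 do
  begin
    if not Supports(Table.rows.item(r, 0), IHTMLTableRow, Row) then
      Continue;
    // Widen the grid if this row has more cells than any row so far.
    if Row.cells.length > AGrid.ColCount then
      AGrid.ColCount := Row.cells.length;
    for c := 0 to Row.cells.length - 1 do
      if Supports(Row.cells.item(c, 0), IHTMLElement, Cell) then
        AGrid.Cells[c, r] := Trim(Cell.innerText);
  end;
end;
```

A call such as FillGridFromTable(WebBrowser1, 11, StringGrid1) would then fill the grid once the page has loaded. Relying on a fixed table index is fragile if the page layout changes, so it is worth checking the index again whenever the site is updated.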