Jsoup: get data from a table inside another table

It is not simple. I am parsing a page ( http://www.catedralaltapatagonia.com/invierno/partediario.php?default_tab=0 ) and I need the data contained in a table nested inside another table, but I cannot access it: every attempt fails with an "Invalid index" error.

I need these values:

[screenshot: the cells I need]

These cells are inside a td, inside a tr, inside a table, and that table is itself inside another table. Each column of cells is inside a div with the id "meteo_info", and every td contains a div with that same id.

I have tried it this way, with no success:

      Elements base1 = document.select("div#pd_foto_fondo");
      Elements base2 = base1.select("table");
      Elements base3 = base2.select("tr");
      Elements base4 = base3.select("table");
      Elements base5 = base4.select("tr");
      Elements base6 = base5.select("td");
      Element base7 = base6.get(0);
      Element div1 = base7.getElementById("meteo_info");
      Elements tables1 = div1.getElementsByTag("table");
      Element table1 = tables1.get(0);

      String text2 = table1.getElementsByTag("tr").get(3).getElementsByTag("td").get(2).text();

      

I am running this code inside AsyncTask.doInBackground.



1 answer


First, when loading the web page in your application, set the USER AGENT field to match the browser you use on your computer. That way your application should receive the same page, with the same tags, as the browser does.

I'm using Firefox (FF), but other browsers should work in much the same way: open the developer tools (F12 in FF), select the Inspector, and pick the element selector (the leftmost tool in FF). Then select one of the items you want to retrieve, say Sensación Térmica for SECTOR BASE. The browser will highlight the code containing this element. Hover over the highlighted code, right-click it and select Copy Unique Selector.

Then you can use this code to get the element:

Elements e = doc.select("#pd_foto_fondo > table:nth-child(5) > tbody:nth-child(1) > tr:nth-child(2) > td:nth-child(1) > table:nth-child(1) > tbody:nth-child(1) > tr:nth-child(1) > td:nth-child(1) > div:nth-child(1) > div:nth-child(3) > table:nth-child(1) > tbody:nth-child(1) > tr:nth-child(4) > td:nth-child(3)"); 

      

And you can read its value with:

e.text();

      

Now do this for all the elements you need and you will find a pattern. There are three tables (SECTOR BASE, SECTOR INTERMEDIO, SECTOR SUPERIOR), and the index that tells them apart is in the seventh selector step from the end, td:nth-child(1), td:nth-child(2) and td:nth-child(3) (not easy to see, because the string is so long):

#pd_foto_fondo > table:nth-child(5) > tbody:nth-child(1) > tr:nth-child(2) > td:nth-child(1) > table:nth-child(1) > tbody:nth-child(1) > tr:nth-child(1) > td:nth-child(1) > div:nth-child(1) > div:nth-child(3) > table:nth-child(1) > tbody:nth-child(1) > tr:nth-child(4) > td:nth-child(3)
#pd_foto_fondo > table:nth-child(5) > tbody:nth-child(1) > tr:nth-child(2) > td:nth-child(1) > table:nth-child(1) > tbody:nth-child(1) > tr:nth-child(1) > td:nth-child(2) > div:nth-child(1) > div:nth-child(3) > table:nth-child(1) > tbody:nth-child(1) > tr:nth-child(4) > td:nth-child(3)
#pd_foto_fondo > table:nth-child(5) > tbody:nth-child(1) > tr:nth-child(2) > td:nth-child(1) > table:nth-child(1) > tbody:nth-child(1) > tr:nth-child(1) > td:nth-child(3) > div:nth-child(1) > div:nth-child(3) > table:nth-child(1) > tbody:nth-child(1) > tr:nth-child(4) > td:nth-child(3)

      



Each row also has its own index, this time in the second step from the end. For Sensación Térmica:

#pd_foto_fondo > table:nth-child(5) > tbody:nth-child(1) > tr:nth-child(2) > td:nth-child(1) > table:nth-child(1) > tbody:nth-child(1) > tr:nth-child(1) > td:nth-child(1) > div:nth-child(1) > div:nth-child(3) > table:nth-child(1) > tbody:nth-child(1) > tr:nth-child(4) > td:nth-child(3)

      

and for Viento:

#pd_foto_fondo > table:nth-child(5) > tbody:nth-child(1) > tr:nth-child(2) > td:nth-child(1) > table:nth-child(1) > tbody:nth-child(1) > tr:nth-child(1) > td:nth-child(1) > div:nth-child(1) > div:nth-child(3) > table:nth-child(1) > tbody:nth-child(1) > tr:nth-child(5) > td:nth-child(3)

      

(note the 4 and the 5 in tr:nth-child of the last two selectors).
You can generate these selectors in two nested for loops and get all the information you need.
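The two nested loops could be sketched as follows. This is only an illustration built from the selectors quoted above: the class name MeteoSelectors and its methods are hypothetical, and the row range 4 to 5 covers just the two rows mentioned here (Sensación Térmica and Viento), so any other rows would need their own indices looked up in the browser.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: builds the CSS selectors shown above from a sector index and a row index.
public class MeteoSelectors {

    // Common prefix of all the selectors above, down to the row holding the three sector cells.
    static final String PREFIX =
        "#pd_foto_fondo > table:nth-child(5) > tbody:nth-child(1) > tr:nth-child(2)"
        + " > td:nth-child(1) > table:nth-child(1) > tbody:nth-child(1) > tr:nth-child(1)";

    // sector: 1 = SECTOR BASE, 2 = SECTOR INTERMEDIO, 3 = SECTOR SUPERIOR
    // row:    4 = Sensación Térmica, 5 = Viento (the only rows identified above)
    static String selector(int sector, int row) {
        return PREFIX
            + " > td:nth-child(" + sector + ")"
            + " > div:nth-child(1) > div:nth-child(3) > table:nth-child(1) > tbody:nth-child(1)"
            + " > tr:nth-child(" + row + ")"
            + " > td:nth-child(3)";
    }

    // Outer loop over the three sector tables, inner loop over the rows of interest.
    static List<String> allSelectors() {
        List<String> result = new ArrayList<>();
        for (int sector = 1; sector <= 3; sector++) {
            for (int row = 4; row <= 5; row++) {
                result.add(selector(sector, row));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (String s : allSelectors()) {
            System.out.println(s);
        }
    }
}
```

Each generated string can then be passed to doc.select(...) and read with text(), exactly as in the single-selector example above. Keep in mind that selectors copied from the browser are tied to the current page layout and will stop matching if the site changes its markup.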
