Jsoup - find element and remove it along with the previous element

I am trying to extract some data from a historical stock market table on Android. The table sometimes contains a row that I need to delete in order to get a clean table. In the snippet below, it is the third tr. I found a way to delete the cell with the dividend using

html = document.select("td[class=\"yfnc_tabledata1\"][align=\"center\"]");
html.remove();

      

But I'm not sure how to also remove the td with the date (May 4, 2015) in the same row. After that, I retrieve the elements with the "yfnc_tabledata1" class and loop through them to find the data I want. Any ideas?

<tr>
  <td class="yfnc_tabledata1" nowrap align="right">May 5, 2015</td>
  <td class="yfnc_tabledata1" align="right">28.69</td>
  <td class="yfnc_tabledata1" align="right">28.96</td>
  <td class="yfnc_tabledata1" align="right">27.64</td>
  <td class="yfnc_tabledata1" align="right">27.71</td>
  <td class="yfnc_tabledata1" align="right">4,595,800</td>
  <td class="yfnc_tabledata1" align="right">27.58</td>
</tr>
<tr>
  <td class="yfnc_tabledata1" nowrap align="right">May 4, 2015</td>
  <td class="yfnc_tabledata1" align="right">28.67</td>
  <td class="yfnc_tabledata1" align="right">28.80</td>
  <td class="yfnc_tabledata1" align="right">28.35</td>
  <td class="yfnc_tabledata1" align="right">28.61</td>
  <td class="yfnc_tabledata1" align="right">33,537,800</td>
  <td class="yfnc_tabledata1" align="right">28.47</td>
</tr>
<tr>
  <td class="yfnc_tabledata1" nowrap align="right">May 4, 2015</td>
  <td class="yfnc_tabledata1" align="center" colspan="6">0.26 Dividend</td>
</tr>
<tr>
  <td class="yfnc_tabledata1" nowrap align="right">May 1, 2015</td>
  <td class="yfnc_tabledata1" align="right">28.68</td>
  <td class="yfnc_tabledata1" align="right">28.68</td>
  <td class="yfnc_tabledata1" align="right">28.68</td>
  <td class="yfnc_tabledata1" align="right">28.68</td>
  <td class="yfnc_tabledata1" align="right">0</td>
  <td class="yfnc_tabledata1" align="right">28.28</td>
</tr>
      



2 answers


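I haven't tried it myself, but you could try selecting the enclosing row instead of the cell:

document.select("tr:has(td[class=\"yfnc_tabledata1\"][align=\"center\"])").remove();

This way you get the enclosing "tr" and can delete the entire row. Calling parents() on the selected cell would not be enough here, because it returns every ancestor of the cell, not just the row.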



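OK, I have found a solution already.

for (Element element : document.select("td[class=\"yfnc_tabledata1\"][align=\"center\"]")) {
    // The parent of the dividend cell is the whole <tr>, including the date cell
    Element row = element.parent();
    row.remove();
}

So I find each td with a dividend, get its parent (the tr) and remove the whole row, date cell included. It seems to work.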
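For anyone who wants to try this outside of an Android project, here is a minimal, self-contained sketch of the same idea. The class name DividendRowFilter, the method removeDividendRows and the shortened HTML are made up for illustration; the only assumption about the page is that dividend rows are marked by a centered "yfnc_tabledata1" cell, as in the question's snippet.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class DividendRowFilter {

    // Hypothetical helper: removes every row that contains a centered
    // "yfnc_tabledata1" cell and returns the number of rows removed.
    static int removeDividendRows(Document document) {
        int removed = 0;
        for (Element cell : document.select("td.yfnc_tabledata1[align=center]")) {
            Element row = cell.parent();
            // Skip rows that were already detached, e.g. if a row
            // ever contained more than one centered cell.
            if (row != null && row.parent() != null) {
                row.remove();
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        // Shortened version of the table from the question (assumed markup).
        String html = "<table>"
                + "<tr><td class=\"yfnc_tabledata1\" nowrap align=\"right\">May 4, 2015</td>"
                + "<td class=\"yfnc_tabledata1\" align=\"right\">28.67</td></tr>"
                + "<tr><td class=\"yfnc_tabledata1\" nowrap align=\"right\">May 4, 2015</td>"
                + "<td class=\"yfnc_tabledata1\" align=\"center\" colspan=\"6\">0.26 Dividend</td></tr>"
                + "<tr><td class=\"yfnc_tabledata1\" nowrap align=\"right\">May 1, 2015</td>"
                + "<td class=\"yfnc_tabledata1\" align=\"right\">28.68</td></tr>"
                + "</table>";

        Document document = Jsoup.parse(html);
        int removed = removeDividendRows(document);
        System.out.println("Rows removed: " + removed);

        // Print the text of the rows that are left.
        for (Element row : document.select("tr")) {
            System.out.println(row.text());
        }
    }
}
```

Because select() returns a separate list of matches, removing rows from the document while looping over that list should be safe. The check on row.parent() is only a precaution and is not needed for the markup shown in the question.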
