Retrieving a <TD> tag from a table on a website using VBA and inserting it into Excel

I am trying to get information from a <TD> tag on a website.

The code runs, but I cannot get the text from the second <TD> tag in a <TR> tag when a condition on the first <TD> matches. Extracting information works in general; I just can't figure out how to access the second <TD> once I have found a match in the first one.

The actual HTML table looks like this:

<html>
<head></head>
<body>
<table id="Table2">
<tr>
  <td class="tSystemRight">System Name: -if this matches</td>
  <td class="tSystemLeft breakword">Windows3756 -I need this</td>
</tr>
<tr>
  <td class="tSystemRight">System Acronym: -if this matches</td>
  <td class="tSystemLeft breakword">WIN37  -I need this</td>
</tr>
</table>
</body>
</html>

      

VBA script I have:

excelRow = 2

For Each tr In msxml.tableRows
cellCount = 1
   For Each TD In tr.getElementsByTagName("TD")
    If ((cellCount = 1) And (TD.innerText = "System Acronym:")) Then
       Worksheets("Data").Cells(excelRow, 2).value = Cells(1, 2)
    ElseIf ((cellCount = 1) And (TD.innerText = "System Name:")) Then
       Worksheets("Data").Cells(excelRow, 3).value = Cells(1, 2)
    cellCount = cellCount + 1
    End If
   Next
Next

      

This just displays "System Name:" and "System Acronym:" on the Excel sheet.



2 answers


If you have a td element and you want the inner text of the next td in the same row, use the nextSibling property, like this:

For Each tr In msxml.tableRows
    cellCount = 1
    For Each td In tr.getElementsByTagName("TD")
        If ((cellCount = 1) And (td.innerText = "System Acronym:")) Then
            Worksheets("Data").Cells(excelRow, 2).Value = td.NextSibling.innerText
        ElseIf ((cellCount = 1) And (td.innerText = "System Name:")) Then
            Worksheets("Data").Cells(excelRow, 3).Value = td.NextSibling.innerText
        End If
        'count every cell, not only the matching ones
        cellCount = cellCount + 1
    Next
Next

      



Note that nothing in this code changes the value of excelRow, so everything will be written to the same row. Also note that the HTML provided has "System Name" first and "System Acronym" second, whereas the code is structured to look for "System Acronym" first and "System Name" second.
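One caveat with nextSibling: depending on the document mode, the whitespace between two <td> tags can be exposed as a text node, in which case td.NextSibling is not the next cell. Below is a minimal sketch of a guard against that, assuming late-bound DOM objects. The helper name NextElementCell is hypothetical and not part of any library.

```vba
' Hypothetical helper: walks forward from node and returns the first
' sibling that is an element (nodeType 1), skipping text nodes.
' Returns Nothing if there is no following element.
Function NextElementCell(ByVal node As Object) As Object
    Dim sib As Object
    Set sib = node.NextSibling
    Do While Not sib Is Nothing
        If sib.NodeType = 1 Then Exit Do   ' 1 = element node
        Set sib = sib.NextSibling
    Loop
    Set NextElementCell = sib
End Function
```

With this in place, td.NextSibling.innerText in the loop above could be replaced by NextElementCell(td).innerText, after checking that the result is not Nothing.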



I developed the following from a public website with almost identical structure. ( https://www.federalreserve.gov/releases/h3/current/ )

This requires references to Microsoft Internet Controls and the Microsoft HTML Object Library.

Option Explicit

Sub Test()

Dim ie As New InternetExplorer
Dim doc As New HTMLDocument

With ie

    .Visible = True
    .Navigate "https://www.federalreserve.gov/releases/h3/current/"

    'wait for IE to finish loading before reading the document
    Do While .Busy Or .readyState <> READYSTATE_COMPLETE
        DoEvents
    Loop

    Set doc = .Document

    Dim t As HTMLTable
    Dim r As HTMLTableRow
    Dim c As HTMLTableCell

    Set t = doc.getElementById("t1tg1")

    'loop through each row
    For Each r In t.Rows

        If r.Cells(0).innerText = "Mar. 2016" Then Debug.Print r.Cells(1).innerText

        'loop through each column in the row
        'For Each c In r.Cells

        '    Debug.Print c.innerText

        'Next

    Next

End With

End Sub

      



All that said, once you have set a reference to your specific table as I did above, I suggest the following edit to your code (I left out cell validation and other details):

For Each r In t.Rows

    'check which cells hold the "System Acronym:" label and its value, and adjust the Cells(n) indexes to match
    If r.Cells(0).innerText = "System Acronym:" Then Worksheets("Data").Cells(excelRow, 2).Value = r.Cells(1).innerText

Next
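The same idea can be wrapped in a small lookup function so that each label is handled in one call. This is only a sketch under a few assumptions: the name ValueForLabel is hypothetical, the objects are late-bound, each row holds the label in its first cell and the value in its second, and Trim$ is added in case the cell text carries surrounding spaces.

```vba
' Hypothetical helper: returns the text of the second cell in the first
' row whose first cell equals label, or "" if no row matches.
Function ValueForLabel(ByVal tbl As Object, ByVal label As String) As String
    Dim r As Object
    For Each r In tbl.Rows
        ' skip rows that do not have both a label cell and a value cell
        If r.Cells.Length >= 2 Then
            If Trim$(r.Cells.Item(0).innerText) = label Then
                ValueForLabel = Trim$(r.Cells.Item(1).innerText)
                Exit Function
            End If
        End If
    Next r
End Function
```

It could then be used as Worksheets("Data").Cells(excelRow, 2).Value = ValueForLabel(t, "System Acronym:"), with excelRow = excelRow + 1 after each table so that later results do not overwrite the same row.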

      
