Retrieving the text of a <TD> tag from a table on a website using VBA and inserting it into Excel
I am trying to get information from a <TD> tag on a website.
The code works for extracting information in general, but I cannot get the text from the second <TD> in a <TR>. The only way I can see this working is with a condition: if the first <TD> matches, take the text of the second one. I just can't figure out how to access that second <TD> once I have found a match in the first.
The actual HTML table looks like this:
<html>
<head></head>
<body>
<table id="Table2">
<tr>
<td class="tSystemRight">System Name: -if this matches</td>
<td class="tSystemLeft breakword">Windows3756 -I need this</td>
</tr>
<tr>
<td class="tSystemRight">System Acronym: -if this matches</td>
<td class="tSystemLeft breakword">WIN37 -I need this</td>
</tr>
</table>
</body>
</html>
VBA script I have:
excelRow = 2
For Each tr In msxml.tableRows
cellCount = 1
For Each TD In tr.getElementsByTagName("TD")
If ((cellCount = 1) And (TD.innerText = "System Acronym:")) Then
Worksheets("Data").Cells(excelRow, 2).value = Cells(1, 2)
ElseIf ((cellCount = 1) And (TD.innerText = "System Name:")) Then
Worksheets("Data").Cells(excelRow, 3).value = Cells(1, 2)
cellCount = cellCount + 1
End If
Next
Next
This just displays "System Name:" and "System Acronym:" on the Excel sheet.
If you have a td element and you want the inner text of the next td in the same row, use the nextSibling property, like this:
For Each tr In msxml.tableRows
    cellCount = 1
    For Each td In tr.getElementsByTagName("TD")
        'NextSibling is assumed to be the following td (no text node in between)
        If ((cellCount = 1) And (td.innerText = "System Acronym:")) Then
            Worksheets("Data").Cells(excelRow, 2).Value = td.NextSibling.innerText
        ElseIf ((cellCount = 1) And (td.innerText = "System Name:")) Then
            Worksheets("Data").Cells(excelRow, 3).Value = td.NextSibling.innerText
        End If
        'count every cell, not only the ones that matched
        cellCount = cellCount + 1
    Next
Next
Note that nothing in this code changes the value of excelRow, so everything will be written to the same row. Also note that the HTML provided has "System Name" first and "System Acronym" second, whereas the code checks for "System Acronym" first and "System Name" second.
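One caveat with nextSibling: depending on the document mode, the parser may keep the whitespace between two cells as a text node, in which case nextSibling returns that text node rather than the next td. A minimal sketch of a helper that skips ahead to the next element node; the name NextElementNode is hypothetical and not part of any library:

```vba
' Hypothetical helper: returns the next sibling that is an element
' (nodeType = 1), or Nothing if there is no such sibling.
Function NextElementNode(ByVal node As Object) As Object
    Dim candidate As Object
    Set candidate = node.NextSibling
    Do While Not candidate Is Nothing
        If candidate.NodeType = 1 Then Exit Do   ' 1 = element node
        Set candidate = candidate.NextSibling
    Loop
    Set NextElementNode = candidate
End Function
```

With this in place, td.NextSibling.innerText could be written as NextElementNode(td).innerText, after checking that the result is not Nothing.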
I developed the following against a public website with an almost identical structure (https://www.federalreserve.gov/releases/h3/current/).
It requires references to Microsoft Internet Controls and Microsoft HTML Object Library.
Option Explicit
Sub Test()
    Dim ie As New InternetExplorer
    Dim doc As HTMLDocument
    Dim t As HTMLTable
    Dim r As HTMLTableRow
    Dim c As HTMLTableCell
    With ie
        .Visible = True
        .Navigate "https://www.federalreserve.gov/releases/h3/current/"
        'wait for IE to finish loading before touching the document
        Do While .Busy Or .readyState <> READYSTATE_COMPLETE
            DoEvents
        Loop
        Set doc = .Document
        Set t = doc.getElementById("t1tg1")
        'loop through each row
        For Each r In t.Rows
            If r.Cells(0).innerText = "Mar. 2016" Then Debug.Print r.Cells(1).innerText
            'loop through each column in the row
            'For Each c In r.Cells
            ' Debug.Print c.innerText
            'Next
        Next
    End With
End Sub
All that said, once you have set t to your specific table as I did above, I suggest the following edit to your code (I have left out cell validation and other checks):
For Each r In t.Rows
    'Cells is zero-based: here Cells(0) is the label and Cells(1) the value; adjust the indexes to match your table
    If r.Cells(0).innerText = "System Acronym:" Then Worksheets("Data").Cells(excelRow, 2).Value = r.Cells(1).innerText
Next
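To show the row-based lookup end to end, including the cell check left out above, here is a rough sketch rather than a definitive implementation. It parses a trimmed copy of the question's table through a late-bound "htmlfile" object so that it needs no library references. The Sub name ExtractSystemInfo is made up for illustration, the labels are assumed to match exactly after trimming, and a worksheet named "Data" is assumed to exist, as in the question.

```vba
Option Explicit

' Sketch only: reads label/value pairs from the table and writes them to one row.
Sub ExtractSystemInfo()
    Dim doc As Object, tbl As Object, r As Object
    Dim label As String
    Dim excelRow As Long

    ' Late-bound MSHTML document; the HTML is a trimmed copy of the question's table
    Set doc = CreateObject("htmlfile")
    doc.body.innerHTML = _
        "<table id=""Table2"">" & _
        "<tr><td>System Name:</td><td>Windows3756</td></tr>" & _
        "<tr><td>System Acronym:</td><td>WIN37</td></tr>" & _
        "</table>"

    Set tbl = doc.getElementById("Table2")
    excelRow = 2

    For Each r In tbl.Rows
        ' Skip rows that do not have both a label cell and a value cell
        If r.Cells.Length >= 2 Then
            label = Trim$(r.Cells(0).innerText)
            Select Case label
                Case "System Acronym:"
                    Worksheets("Data").Cells(excelRow, 2).Value = Trim$(r.Cells(1).innerText)
                Case "System Name:"
                    Worksheets("Data").Cells(excelRow, 3).Value = Trim$(r.Cells(1).innerText)
            End Select
        End If
    Next r
End Sub
```

Matching on the label with Select Case means the order of the rows in the HTML does not matter. If the page lists several systems, excelRow would need to be incremented each time a new system starts.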