Excel VBA - load blog text without loading images
I am trying to get a blog post (text only) using the below code:
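Function extractPostBody(myURL As String) As String
    ' Requires references to Microsoft Internet Controls and Microsoft HTML Object Library
    Dim IE As New InternetExplorer
    Dim Doc As HTMLDocument
    Dim Paragraphs As Object
    Dim PostBody As String
    Dim i As Long
    IE.Visible = True
    IE.navigate myURL
    ' Wait until the whole page, images included, has finished loading
    Do
        DoEvents
    Loop Until IE.readyState = READYSTATE_COMPLETE
    Set Doc = IE.Document
    ' Fetch the paragraph collection once instead of on every iteration
    Set Paragraphs = Doc.getElementsByTagName("p")
    For i = 0 To Paragraphs.Length - 1
        ' Stop at the "Tags: " paragraph that follows the post body
        If InStr(1, Paragraphs(i).innerText, "Tags: ") > 0 Then
            Exit For
        End If
        PostBody = PostBody & vbNewLine & Paragraphs(i).innerText
    Next i
    IE.Quit
    extractPostBody = PostBody
End Function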
After getting the text, I assign it to a cell and then use the Split function to count the number of words in the extracted text. However, when the code runs on websites with a lot of images, it waits until those images are loaded, which slows down execution dramatically.
Is there another way to extract the text from the blog without waiting for the images to load?
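For context, the word-count step mentioned above looks roughly like the sketch below. The helper name CountWords and its parameter are hypothetical and only illustrate the Split approach; it assumes words are separated by spaces, tabs, or line breaks.

```vba
' Hypothetical helper: counts whitespace-separated words in a string.
Function CountWords(ByVal sourceText As String) As Long
    Dim normalized As String

    ' Turn line breaks and tabs into spaces so Split sees one delimiter
    normalized = Replace(sourceText, vbCr, " ")
    normalized = Replace(normalized, vbLf, " ")
    normalized = Replace(normalized, vbTab, " ")

    ' WorksheetFunction.Trim also collapses runs of spaces into one
    normalized = Application.WorksheetFunction.Trim(normalized)

    If Len(normalized) = 0 Then
        CountWords = 0
    Else
        ' Split returns a zero-based array, so add 1 to the upper bound
        CountWords = UBound(Split(normalized, " ")) + 1
    End If
End Function
```

It would be called as CountWords(extractPostBody(myURL)), or on the cell value the text was written to.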
EDIT:
Using Jeeped's suggestion, I am now using the code below, which I took from another StackOverflow post; however, I can't seem to find that post again to give credit to its author:
Function ScrapeWebPage(ByVal URL As String) As String
    ' Requires a reference to the Microsoft HTML Object Library
    Dim HTMLDoc As New HTMLDocument
    Dim XMLHttpRequest As Object
    Dim ListItems As Object
    Dim li As Object
    Dim PostBody As String
    Set XMLHttpRequest = CreateObject("MSXML2.XMLHTTP")
    ' Third argument False makes the request synchronous
    XMLHttpRequest.Open "GET", URL, False
    XMLHttpRequest.send
    ' With a synchronous request this loop exits immediately
    While XMLHttpRequest.readyState <> 4
        DoEvents
    Wend
    With HTMLDoc.body
        ' Load the response text into the in-memory HTML document
        .innerHTML = XMLHttpRequest.responseText
        Set ListItems = .getElementsByTagName("p")
        ' Append the text of each paragraph element
        For Each li In ListItems
            PostBody = PostBody & vbNewLine & li.innerText
        Next
    End With
    ScrapeWebPage = PostBody
End Function
This works; however, the code now returns a captcha message, which I obviously cannot fill in because the page is no longer rendered in IE. Or can I?