An alternative to HTML parsing with regex
I am parsing HTML with regex in node.js to return a string. However, I was told that this is not a good idea in this post: Pull a specific string from an HTTP request in node.js
What are the more stable alternatives?
I am new to programming, so links to tutorials would be very helpful. Some documentation explanations are hard for me to understand.
node-htmlparser handles all the heavy lifting of HTML parsing. In addition, node-soupselect allows you to use a CSS style selector to find which element you are looking for.
However, I looked at your other question, and the question you really should be asking is not "how do I scrape this data from the HTML page?" but rather "is there a better way to get the data I'm looking for?" The USGS has APIs that provide its data in machine-readable form.
Here's a JSON object for the location you are interested in. To get the "most recent instantaneous" value for the elevation of the reservoir water surface, load this file, call JSON.parse on its contents, assign the result to d, and then run:
var result;
for (var i = 0; i < d.value.timeSeries.length; i++) {
    var series = d.value.timeSeries[i];
    // Pick the series by its variable name, then take its last reading.
    if (series.variable.variableName === 'Elevation of reservoir water surface above datum, ft') {
        var readings = series.values[0].value;
        result = readings[readings.length - 1];
    }
}
result will now look like { dateTime: "2012-04-07T17:15:00.000-05:00", value: "1065.91" }.
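If you want to reuse that lookup, the loop above can be wrapped in a small function. This is only a sketch: latestReading is a hypothetical helper name, and the sample object is made-up data that copies the shape the loop expects (d.value.timeSeries[i].variable.variableName and d.value.timeSeries[i].values[0].value), not a real USGS response. It also returns null instead of failing when the variable is not found.

```javascript
// Hypothetical helper (not part of any USGS library): returns the last
// reading for the named variable, or null if there is no such reading.
function latestReading(d, variableName) {
  // Find the first time series whose variable name matches exactly.
  const series = d.value.timeSeries.find(
    (ts) => ts.variable.variableName === variableName
  );
  if (!series) return null;

  // Readings are assumed to be ordered oldest to newest, as in the loop above.
  const readings = series.values[0].value;
  return readings.length > 0 ? readings[readings.length - 1] : null;
}

// Made-up sample text standing in for the downloaded file. Only the last
// reading matches the answer above; the earlier one is a placeholder.
const sampleText = JSON.stringify({
  value: {
    timeSeries: [
      {
        variable: {
          variableName: 'Elevation of reservoir water surface above datum, ft'
        },
        values: [
          {
            value: [
              { dateTime: '2012-04-07T17:00:00.000-05:00', value: '1065.90' },
              { dateTime: '2012-04-07T17:15:00.000-05:00', value: '1065.91' }
            ]
          }
        ]
      }
    ]
  }
});

const d = JSON.parse(sampleText);
const result = latestReading(
  d,
  'Elevation of reservoir water surface above datum, ft'
);
console.log(result);
```

Note that value comes back as a string, so convert it with parseFloat(result.value) if you need a number.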