Problem creating a data.frame from an XML file
This is my first attempt at converting XML to a data.frame. I found questions like "How to convert XML data to data.frame?" very useful, but I still fail to convert the XML to a data.frame.
My goal is to plot the euro to US dollar exchange rate over time. The data is available here in XML format:
http://www.ecb.europa.eu/stats/exchange/eurofxref/html/usd.xml
I can read the data and show which piece of data (node?) I am interested in:
library(XML)
doc <- xmlTreeParse("http://www.ecb.europa.eu/stats/exchange/eurofxref/html/usd.xml")
root <- xmlRoot(doc)
root[[2]][[2]]
I've tried variations of getNodeSet() to select all such lines, but so far to no avail:
getNodeSet(root, "/DataSet/Series/*")
getNodeSet(root, "//obs")
getNodeSet(root, "//obs[@OBS_VALUE = 1.1789]")
How can I extract all variables, or just TIME_PERIOD and OBS_VALUE, from this XML file and put them in an R data.frame? Thanks for any comments or clarifications.
This data is in SDMX format. You can use the R package rsdmx to read it:
library(rsdmx)
appData <- readSDMX("http://www.ecb.europa.eu/stats/exchange/eurofxref/html/usd.xml")
myData <- as.data.frame(appData)
> head(myData)
  FREQ CURRENCY CURRENCY_DENOM EXR_TYPE EXR_SUFFIX TIME_FORMAT COLLECTION TIME_PERIOD OBS_VALUE OBS_STATUS OBS_CONF
1    D      USD            EUR     SP00          A         P1D          A  1999-01-04    1.1789          A        F
2    D      USD            EUR     SP00          A         P1D          A  1999-01-05    1.1790          A        F
3    D      USD            EUR     SP00          A         P1D          A  1999-01-06    1.1743          A        F
4    D      USD            EUR     SP00          A         P1D          A  1999-01-07    1.1632          A        F
5    D      USD            EUR     SP00          A         P1D          A  1999-01-08    1.1659          A        F
6    D      USD            EUR     SP00          A         P1D          A  1999-01-11    1.1569          A        F
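Alternatively, if you only have the XML package, you can query the Obs nodes directly. XPath is case-sensitive and these nodes live in a namespace, so the query needs the exact name Obs together with a namespace prefix:
library(XML)
# Parse the file into an internal document that XPath queries can run against
doc <- xmlParse("http://www.ecb.europa.eu/stats/exchange/eurofxref/html/usd.xml")
# Select every namespace-qualified Obs node and return its attributes
docData <- getNodeSet(doc, "//ns:Obs",
                      namespaces = c(ns = "http://www.ecb.europa.eu/vocabulary/stats/exr/1"),
                      fun = xmlAttrs)
# Stack the per-node attribute vectors into one character matrix
docData <- do.call(rbind, docData)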
> head(docData)
     TIME_PERIOD  OBS_VALUE OBS_STATUS OBS_CONF
[1,] "1999-01-04" "1.1789"  "A"        "F"
[2,] "1999-01-05" "1.1790"  "A"        "F"
[3,] "1999-01-06" "1.1743"  "A"        "F"
[4,] "1999-01-07" "1.1632"  "A"        "F"
[5,] "1999-01-08" "1.1659"  "A"        "F"
[6,] "1999-01-11" "1.1569"  "A"        "F"
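Note that the XML-package route gives you a character matrix, not a data.frame, so the dates and rates are still strings. A minimal sketch of the remaining step, using base R only: the small matrix below is a stand-in built from the six rows shown above so the snippet runs without a network connection, and the name rates is just an illustrative choice.

```r
# Stand-in for head(docData): the six rows printed above, as a character matrix
docData <- rbind(
  c(TIME_PERIOD = "1999-01-04", OBS_VALUE = "1.1789", OBS_STATUS = "A", OBS_CONF = "F"),
  c(TIME_PERIOD = "1999-01-05", OBS_VALUE = "1.1790", OBS_STATUS = "A", OBS_CONF = "F"),
  c(TIME_PERIOD = "1999-01-06", OBS_VALUE = "1.1743", OBS_STATUS = "A", OBS_CONF = "F"),
  c(TIME_PERIOD = "1999-01-07", OBS_VALUE = "1.1632", OBS_STATUS = "A", OBS_CONF = "F"),
  c(TIME_PERIOD = "1999-01-08", OBS_VALUE = "1.1659", OBS_STATUS = "A", OBS_CONF = "F"),
  c(TIME_PERIOD = "1999-01-11", OBS_VALUE = "1.1569", OBS_STATUS = "A", OBS_CONF = "F")
)

# Convert the matrix to a data.frame, keeping the strings as plain characters
rates <- as.data.frame(docData, stringsAsFactors = FALSE)

# Give the two columns of interest proper types
rates$TIME_PERIOD <- as.Date(rates$TIME_PERIOD)   # ISO dates parse without a format string
rates$OBS_VALUE   <- as.numeric(rates$OBS_VALUE)

str(rates)

# Line chart of the exchange rate over time
plot(rates$TIME_PERIOD, rates$OBS_VALUE, type = "l",
     xlab = "Date", ylab = "USD per EUR")
```

With the real docData from the query above, skip the stand-in and start from the as.data.frame() line.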