Extract second attribute of an XML node in R (XML p

Posted 2019-04-13 05:53


I want to extract both the 'lat' and 'lon' attributes from an .xml file like this:

    <px lon="-55.75" lat="-18.5">2.186213</px>
    <px lon="-50.0"  lat="-18.5">0.0</px>
    <px lon="-66.75" lat="-03.0">1.68412</px>

This is what I've done so far, using the XML package for R:

# Load the library for reading, parsing, and extracting XML
library(XML)
#Parse xml file 
a3  <- xmlRoot(xmlTreeParse("my_file.xml"))

#Extract text-value and attributes as lists
precip <- xmlSApply(a3, function(x) xmlSApply(x, xmlValue))
long   <- xmlSApply(a3, function(x) xmlSApply(x, xmlAttrs))
lat    <- xmlSApply(a3, function(x) xmlSApply(x, xmlAttrs)) #???

dt.lat.long.val <- data.frame(as.numeric(as.vector(lat)), 

How do I edit the line ending in #??? so that it returns the lat values?


You can extract the data using something along these lines:

test <- '<asdf>
    <px lon="-55.75" lat="-18.5">2.186213</px>
    <px lon="-50.0"  lat="-18.5">0.0</px>
    <px lon="-66.75" lat="-03.0">1.68412</px>
</asdf>'

a3 <- xmlParse(test)

out <- xpathApply(a3, "//px", function(x){
  coords <- xmlAttrs(x)
  data.frame(precip = xmlValue(x), lon = coords[1], lat = coords[2], stringsAsFactors = FALSE)
})

> do.call(rbind.data.frame, out)
       precip    lon   lat
lon  2.186213 -55.75 -18.5
lon1      0.0  -50.0 -18.5
lon2  1.68412 -66.75 -03.0
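
As a variation on the approach above, here is a minimal sketch that looks each attribute up by name with xmlGetAttr instead of by position, so the result does not depend on the order in which lon and lat appear in the file. It assumes the XML package is installed; the variable name doc is only illustrative, and the root element name asdf is the same placeholder used in the answer.

```r
library(XML)

# Sample document; the root element name "asdf" is an arbitrary placeholder
doc <- xmlParse('<asdf>
    <px lon="-55.75" lat="-18.5">2.186213</px>
    <px lon="-50.0"  lat="-18.5">0.0</px>
    <px lon="-66.75" lat="-03.0">1.68412</px>
</asdf>', asText = TRUE)

# Fetch each attribute by name from every <px> node, then convert to numeric
lon    <- as.numeric(xpathSApply(doc, "//px", xmlGetAttr, "lon"))
lat    <- as.numeric(xpathSApply(doc, "//px", xmlGetAttr, "lat"))

# The text content of each <px> node is the precipitation value
precip <- as.numeric(xpathSApply(doc, "//px", xmlValue))

dt.lat.long.val <- data.frame(lat = lat, lon = lon, precip = precip)
```

Applied to the original question, the line ending in #??? would then be replaced by a name-based lookup of "lat" rather than a second call that returns all attributes at once.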