Parse XML using Google Apps Script

2019-03-14 07:45发布

问题:

I'm having difficulty parsing the XML from a boardgamegeek query so that I may populate a google sheet with the data. Here's an example of the bgg xml:

<boardgames termsofuse="http://boardgamegeek.com/xmlapi/termsofuse">
  <boardgame objectid="423">
    <yearpublished>1995</yearpublished>
    <minplayers>3</minplayers>
    <maxplayers>6</maxplayers>
    <playingtime>300</playingtime>
    <name primary="true" sortindex="1">1856</name>
  </boardgame>
</boardgames>

And here's the Google Apps Script I've written to parse it:

//get the data from boardgamegeek
  var url = 'http://www.boardgamegeek.com/xmlapi/boardgame/' + bggCode;
  var bggXml = UrlFetchApp.fetch(url).getContentText();

  var document = XmlService.parse(bggXml);
  var root = document.getRootElement();     
  var entries = new Array();
  entries = root.getChildren('boardgame');

  for (var x = 0; x < entries.length; i++) {
    var name = entries[x].getAttribute('name').getValue();
    var yearpublished = entries[x].getAttribute('yearpublished').getValue();
    var minplayers = entries[x].getAttribute('minplayers').getValue();
    var maxplayers = entries[x].getAttribute('maxplayers').getValue();
  }
  //SpreadsheetApp.getActiveSheet().getRange(i+1,7).setValue(yearpublished);
  Logger.log(entries);

I'm currently getting an error in the for-loop caused by entries being NULL. If I comment the loop out and log what bggXml looks like, it looks just like the example above. However, logging variables further down I get the following:

document => [Document:  No DOCTYPE declaration, Root is [Element: <boardgames/>]]
root => [Element: <boardgames/>]
entries =>  [[Element: <boardgame/>]]
entries[2] => undefined

Since the bggXml looks exactly how I'd expect but document does not, I assume the problem is in the parsing?

回答1:

After much trial and error and stumbling through the dark, I found the solution I was looking for. This will get the value of an individual xml element and set it to a variable:

var yearpublished = root.getChild('boardgame').getChild('yearpublished').getText();

So my final code looks like this. I hope it helps you in your endeavors.

//get the data from boardgamegeek
  var url = 'http://www.boardgamegeek.com/xmlapi/boardgame/' + bggCode;
  var bggXml = UrlFetchApp.fetch(url).getContentText();

  var document = XmlService.parse(bggXml);
  var root = document.getRootElement();

  //set variables to data from bgg
  var yearpublished = root.getChild('boardgame').getChild('yearpublished').getText();
  var minplayers = root.getChild('boardgame').getChild('minplayers').getText();
  var maxplayers = root.getChild('boardgame').getChild('maxplayers').getText();
  var playingtime = root.getChild('boardgame').getChild('playingtime').getText();
  var name = root.getChild('boardgame').getChild('name').getText();

  //populate sheet with variable data
  SpreadsheetApp.getActiveSheet().getRange(i+1,1).setValue(name);
  SpreadsheetApp.getActiveSheet().getRange(i+1,4).setValue(minplayers);
  SpreadsheetApp.getActiveSheet().getRange(i+1,5).setValue(maxplayers);
  SpreadsheetApp.getActiveSheet().getRange(i+1,5).setValue(playingtime);
  SpreadsheetApp.getActiveSheet().getRange(i+1,7).setValue(yearpublished);

In case you happen to also be querying BGG, there are multiple name elements. I want the one with the primary attribute set to "true". Iterating through those elements to find the correct one will be my next challenge.