I've been playing about with Go's XML package and cannot see what is wrong with the following code.
package main

import (
	"encoding/xml"
	"fmt"
	"net/http"
)

type Channel struct {
	Items Item
}

type Item struct {
	Title       string `xml:"title"`
	Link        string `xml:"link"`
	Description string `xml:"description"`
}

func main() {
	var items = new(Channel)
	res, err := http.Get("http://www.reddit.com/r/google.xml")
	if err != nil {
		fmt.Printf("Error: %v\n", err)
	} else {
		decoded := xml.NewDecoder(res.Body)
		err = decoded.Decode(items)
		if err != nil {
			fmt.Printf("Error: %v\n", err)
		}
		fmt.Printf("Title: %s\n", items.Items.Title)
	}
}
The above code runs without any errors and prints to the terminal:
Title:
The struct seems empty but I can't see why it isn't getting populated with the XML data.
Your program comes close, but needs to specify just a little bit more context to match the XML document. You need to revise your field tags to help guide the XML binding down through your Channel structure to your Item structure:
type Channel struct {
	Items []Item `xml:"channel>item"`
}

type Item struct {
	Title       string `xml:"title"`
	Link        string `xml:"link"`
	Description string `xml:"description"`
}
Per the documentation for encoding/xml.Unmarshal(), the seventh bullet item applies here:
If the XML element contains a sub-element whose name matches
the prefix of a tag formatted as "a" or "a>b>c", unmarshal
will descend into the XML structure looking for elements with the
given names, and will map the innermost elements to that struct
field. A tag starting with ">" is equivalent to one starting
with the field name followed by ">".
In your case, you're looking to descend through the top-level <rss> element's <channel> elements to find each <item> element. Note, though, that we don't need to (and in fact can't) specify that the Channel struct should burrow through the top-level <rss> element by writing the Items field's tag as
`xml:"rss>channel>item"`
That context is implicit; the struct supplied to Unmarshal() already maps to the top-level XML element.
Note too that your Channel struct's Items field should be of type slice-of-Item, not just a single Item.
You mentioned that you're having trouble getting the proposal to work. Here's a complete listing that I find works as one would expect:
package main

import (
	"encoding/xml"
	"fmt"
	"net/http"
	"os"
)

type Channel struct {
	Items []Item `xml:"channel>item"`
}

type Item struct {
	Title       string `xml:"title"`
	Link        string `xml:"link"`
	Description string `xml:"description"`
}

func main() {
	if res, err := http.Get("http://www.reddit.com/r/google.xml"); err != nil {
		fmt.Println("Error retrieving resource:", err)
		os.Exit(1)
	} else {
		// Release the connection once decoding is done.
		defer res.Body.Close()
		channel := Channel{}
		if err := xml.NewDecoder(res.Body).Decode(&channel); err != nil {
			fmt.Println("Error:", err)
			os.Exit(1)
		} else if len(channel.Items) != 0 {
			item := channel.Items[0]
			fmt.Println("First title:", item.Title)
			fmt.Println("First link:", item.Link)
			fmt.Println("First description:", item.Description)
		}
	}
}
I'd be completely explicit and name all the XML parts, like this.
See the playground for a full working example.
type Rss struct {
	Channel Channel `xml:"channel"`
}

type Channel struct {
	Title       string `xml:"title"`
	Link        string `xml:"link"`
	Description string `xml:"description"`
	Items       []Item `xml:"item"`
}

type Item struct {
	Title       string `xml:"title"`
	Link        string `xml:"link"`
	Description string `xml:"description"`
}
Nowadays the Reddit feed seems to have changed to the Atom format. This means that regular RSS parsing will not work anymore. The Atom functionality of go-rss can parse such feeds:
// Feed struct for an Atom feed
type Feed struct {
	Entry []Entry `xml:"entry"`
}

// Entry struct for each Entry in the Feed
type Entry struct {
	ID      string `xml:"id"`
	Title   string `xml:"title"`
	Updated string `xml:"updated"`
}

// Atom parses atom feeds
func Atom(resp *http.Response) (*Feed, error) {
	defer resp.Body.Close()
	xmlDecoder := xml.NewDecoder(resp.Body)
	// charset is a third-party package whose import is not shown here;
	// its NewReader must match func(charset string, input io.Reader) (io.Reader, error).
	xmlDecoder.CharsetReader = charset.NewReader
	feed := Feed{}
	if err := xmlDecoder.Decode(&feed); err != nil {
		return nil, err
	}
	return &feed, nil
}