XML is a commonly used data communication format in web services today, it becomes more and more important role in daily development. In this section, we're going to introduce how to work with XML through standard library.
I'll not teach what is XML or something like that, please read more documentation about XML if you haven't known that. We only focus on how to encode and decode XML files.
Suppose you are a operation staff, and you have following XML configuration file:
<?xml version="1.0" encoding="utf-8"?>
<servers version="1">
<server>
<serverName>Shanghai_VPN</serverName>
<serverIP>127.0.0.1</serverIP>
</server>
<server>
<serverName>Beijing_VPN</serverName>
<serverIP>127.0.0.2</serverIP>
</server>
</servers>
Above XML document contains two kinds of information about your server, which are server name and IP; we will use this document in our following examples.
How to parse this XML document? We can use function Unmarshal
in package xml
to do this.
func Unmarshal(data []byte, v interface{}) error
data receives data stream from XML, v is the structure you want to output, it is a interface, which means you can convert XML to any kind of structures. Here we only talk about how to convert to struct
because they have similar tree structures.
Sample code:
package main
import (
"encoding/xml"
"fmt"
"io/ioutil"
"os"
)
type Recurlyservers struct {
XMLName xml.Name `xml:"servers"`
Version string `xml:"version,attr"`
Svs []server `xml:"server"`
Description string `xml:",innerxml"`
}
type server struct {
XMLName xml.Name `xml:"server"`
ServerName string `xml:"serverName"`
ServerIP string `xml:"serverIP"`
}
func main() {
file, err := os.Open("servers.xml") // For read access.
if err != nil {
fmt.Printf("error: %v", err)
return
}
defer file.Close()
data, err := ioutil.ReadAll(file)
if err != nil {
fmt.Printf("error: %v", err)
return
}
v := Recurlyservers{}
err = xml.Unmarshal(data, &v)
if err != nil {
fmt.Printf("error: %v", err)
return
}
fmt.Println(v)
}
XML actually is a tree data structure, and we can define a almost same struct in Go, then use xml.Unmarshal
to convert from XML to our struct object. The sample code will print following content:
{{ servers} 1 [{{ server} Shanghai_VPN 127.0.0.1} {{ server} Beijing_VPN 127.0.0.2}]
<server>
<serverName>Shanghai_VPN</serverName>
<serverIP>127.0.0.1</serverIP>
</server>
<server>
<serverName>Beijing_VPN</serverName>
<serverIP>127.0.0.2</serverIP>
</server>
}
We used xml.Unmarshal
to parse XML document to corresponding struct object, and you should see that we have something like xml:"serverName"
in our struct. This is a feature of struct which is called struct tag
for helping reflection. Let's see the definition of Unmarshal
again:
func Unmarshal(data []byte, v interface{}) error
The first argument is XML data stream, the second argument is the type of storage, for now it supports struct, slice and string. XML package uses reflection to achieve data mapping, so all fields in v should be exported. But we still have a problem, how can it knows which field is corresponding to another one? Here is a priority level when parse data. It tries to find struct tag first, if it cannot find then get field name. Be aware that all tags, field name and XML element are case sensitive, so you have to make sure that one-one correspondence.
Go reflection mechanism allows you to use these tag information to reflect XML data to struct object. If you want to know more about reflection in Go, please read more about package documentation of struct tag and reflect.
Here are the rules when package xml
parse XML document to struct:
-
If the a field type is string or []byte with tag
",innerxml"
,Unmarshal
assign raw XML data to it, likeDescription
in above example:Shanghai_VPN127.0.0.1Beijing_VPN127.0.0.2
-
If a field called
XMLName
and its type isxml.Name
, then it gets element name, likeservers
in above example. -
If a field's tag contains corresponding element name, then it gets element name as well, like
servername
andserverip
in above example. -
If a field's tag contains
",attr"
, then it gets corresponding element's attribute, likeversion
in above example. -
If a field's tag contains something like
"a>b>c"
, it gets value of element c of node b of node a. -
If a field's tag contains
"="
, then it gets nothing. -
If a field's tag contains
",any"
, then it gets all child elements which do not fit other rules. -
If XML elements have one or more comments, all of these comments will be added to the first field that has the tag that contains
",comments"
, this field type can be string or []byte, if this kind field does not exist, all comments are discard.
These rules tell you how to define tags in struct, once you understand these rules, everything as easy as the sample code. Because tags and XML elements are one-one correspondence, we can also use slice to represent multiple elements in same level.
Note that all fields in struct should be exported(capitalize) in order to parse data correctly.
What if we want to produce XML document instead of parsing it, how can we do it in Go? xml
package provides two functions which are Marshal
and MarshalIndent
where the second function has indents for your XML document. Their definition as follows:
func Marshal(v interface{}) ([]byte, error)
func MarshalIndent(v interface{}, prefix, indent string) ([]byte, error)
The first argument is for storing XML data stream for both functions.
Let's has an example to see how it works:
package main
import (
"encoding/xml"
"fmt"
"os"
)
type Servers struct {
XMLName xml.Name `xml:"servers"`
Version string `xml:"version,attr"`
Svs []server `xml:"server"`
}
type server struct {
ServerName string `xml:"serverName"`
ServerIP string `xml:"serverIP"`
}
func main() {
v := &Servers{Version: "1"}
v.Svs = append(v.Svs, server{"Shanghai_VPN", "127.0.0.1"})
v.Svs = append(v.Svs, server{"Beijing_VPN", "127.0.0.2"})
output, err := xml.MarshalIndent(v, " ", " ")
if err != nil {
fmt.Printf("error: %v\n", err)
}
os.Stdout.Write([]byte(xml.Header))
os.Stdout.Write(output)
}
The above example prints following information:
<?xml version="1.0" encoding="UTF-8"?>
<servers version="1">
<server>
<serverName>Shanghai_VPN</serverName>
<serverIP>127.0.0.1</serverIP>
</server>
<server>
<serverName>Beijing_VPN</serverName>
<serverIP>127.0.0.2</serverIP>
</server>
</servers>
As we defined before, the reason we have os.Stdout.Write([]byte(xml.Header))
is both of function xml.MarshalIndent
and xml.Marshal
do not output XML header by itself, so we have to print it in order to produce XML document correctly.
Here we see Marshal
also receives v in type interface{}
, so what are the rules when it produces XML document?
- If v is a array or slice, it prints all elements like value.
- If v is a pointer, it prints content that v point to, it prints nothing when v is nil.
- If v is a interface, it deal with interface as well.
- If v is one of other types, it prints value of that type.
So how can it decide elements' name? It follows following rules:
- If v is a struct, it defines name in tag of XMLName.
- Field name is XMLName and type is xml.Name.
- Field tag in struct.
- Field name in struct.
- Type name of marshal.
Then we need to figure out how to set tags in order to produce final XML document.
-
XMLName will not be printed.
-
Fields that have tag contains
"-"
will not be printed. -
If tag contains
"name,attr"
, it uses name as attribute name and field value as value, likeversion
in above example. -
If tag contains
",attr"
, it uses field's name as attribute name and field value as value. -
If tag contains
",chardata"
, it prints character data instead of element. -
If tag contains
",innerxml"
, it prints raw value. -
If tag contains
",comment"
, it prints it as comments without escaping, so you cannot have "--" in its value. -
If tag contains
"omitempty"
, it omits this field if its value is zero-value, including false, 0, nil pointer or nil interface, zero length of array, slice, map and string. -
If tag contains
"a>b>c"
, it prints three elements where a contains b, b contains c, like following code:FirstName string
Asta Xiexml:"name>first"
LastName stringxml:"name>last"
You may notice that struct tag is very useful when you deal with XML, as well as other data format in following sections, if you still have problems with working with struct tag, you probably should read more documentation about it before get into next section.
- Directory
- Previous section: Text files
- Next section: JSON