15.3 访问并读取页面

在下边这个程序中，数组中的 url 都将被访问：会发送一个简单的 http.Head() 请求查看返回值；它的声明如下：func Head(url string) (r *Response, err error)

返回的响应 Response 其状态码会被打印出来。

package main

import (
	"fmt"
	"net/http"
)

var urls = []string{
	"http://www.google.com/",
	"http://golang.org/",
	"http://blog.golang.org/",
}

func main() {
	// Execute an HTTP HEAD request for all url's
	// and returns the HTTP status string or an error string.
	for _, url := range urls {
		resp, err := http.Head(url)
		if err != nil {
			fmt.Println("Error:", url, err)
		}
		fmt.Println(url, ": ", resp.Status)
	}
}

输出为：

http://www.google.com/ : 302 Found
http://golang.org/ : 200 OK
http://blog.golang.org/ : 200 OK

译者注 由于国内的网络环境现状，很有可能见到如下超时错误提示： Error: http://www.google.com/ Head http://www.google.com/: dial tcp 216.58.221.100:80: connectex: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.

在下边的程序中我们使用 http.Get() 获取并显示网页内容； Get 返回值中的 Body 属性包含了网页内容，然后我们用 ioutil.ReadAll 把它读出来：

示例 15.8 http_fetch.go：

package main

import (
	"fmt"
	"io/ioutil"
	"log"
	"net/http"
)

func main() {
	res, err := http.Get("http://www.google.com")
	checkError(err)
	data, err := ioutil.ReadAll(res.Body)
	checkError(err)
	fmt.Printf("Got: %q", string(data))
}

func checkError(err error) {
	if err != nil {
		log.Fatalf("Get : %v", err)
	}
}

当访问不存在的网站时，这里有一个CheckError输出错误的例子：

2011/09/30 11:24:15 Get: Get http://www.google.bex: dial tcp www.google.bex:80:GetHostByName: No such host is known.

译者注 和上一个例子相似，你可以把google.com更换为一个国内可以顺畅访问的网址进行测试

在下边的程序中，我们获取一个 twitter 用户的状态，通过 xml 包将这个状态解析成为一个结构：

示例 15.9 twitter_status.go

package main

import (
	"encoding/xml"
	"fmt"
	"net/http"
)

/*这个结构会保存解析后的返回数据。
他们会形成有层级的XML，可以忽略一些无用的数据*/
type Status struct {
	Text string
}

type User struct {
	XMLName xml.Name
	Status  Status
}

func main() {
	// 发起请求查询推特Goodland用户的状态
	response, _ := http.Get("http://twitter.com/users/Googland.xml")
	// 初始化XML返回值的结构
	user := User{xml.Name{"", "user"}, Status{""}}
	// 将XML解析为我们的结构
	xml.Unmarshal(response.Body, &user)
	fmt.Printf("status: %s", user.Status.Text)
}

输出：

status: Robot cars invade California, on orders from Google: Google has been testing self-driving cars ... http://bit.ly/cbtpUN http://retwt.me/97p<exit code="0" msg="process exited normally"/>

译者注 和上边的示例相似，你可能无法获取到xml数据，另外由于go版本的更新，xml.Unmarshal 函数的第一个参数需是[]byte类型，而无法传入 Body。

我们会在 15.4节中用到 http 包中的其他重要的函数：

http.Redirect(w ResponseWriter, r *Request, url string, code int)：这个函数会让浏览器重定向到 url（可以是基于请求 url 的相对路径），同时指定状态码。
http.NotFound(w ResponseWriter, r *Request)：这个函数将返回网页没有找到，HTTP 404错误。
http.Error(w ResponseWriter, error string, code int)：这个函数返回特定的错误信息和 HTTP 代码。
另一个 http.Request 对象 req 的重要属性：req.Method，这是一个包含 GET 或 POST 字符串，用来描述网页是以何种方式被请求的。

go为所有的HTTP状态码定义了常量，比如：

http.StatusContinue		= 100
http.StatusOK			= 200
http.StatusFound		= 302
http.StatusBadRequest		= 400
http.StatusUnauthorized		= 401
http.StatusForbidden		= 403
http.StatusNotFound		= 404
http.StatusInternalServerError	= 500

你可以使用 w.header().Set("Content-Type", "../..") 设置头信息。

比如在网页应用发送 html 字符串的时候，在输出之前执行 w.Header().Set("Content-Type", "text/html")（通常不是必要的）。

练习 15.4：扩展 http_fetch.go 使之可以从控制台读取url，使用 12.1节学到的接收控制台输入的方法（http_fetch2.go）

练习 15.5：获取 json 格式的推特状态，就像示例 15.9（twitter_status_json.go）

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

15.3.md

15.3.md

15.3 访问并读取页面

链接

Files

15.3.md

Latest commit

History

15.3.md

File metadata and controls

15.3 访问并读取页面

链接