../../images/detect-gfw-block/gfw.jpeg

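We often run into this situation: overnight, some overseas website suddenly becomes unreachable. Has it been blocked by the GFW? And if so, which method did the GFW use to block it?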

This article gives a brief introduction to how to systematically determine whether a website is blocked by the GFW.

Because of the operating system I use, all examples are run on Linux.


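A browser loads a web page in roughly the following steps:

  • DNS resolution

  • TCP connection

  • HTTP request

This article walks through them in the same order.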
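
The same ordering can be expressed as a small driver that runs one check per stage and stops at the first failure. This is only a sketch: `first_failing_stage` and the stage names are hypothetical, and the individual checks are passed in by the caller rather than implemented here.

```python
from typing import Callable, List, Optional, Tuple

# A check is any zero-argument callable that returns True on success.
Check = Callable[[], bool]


def first_failing_stage(checks: List[Tuple[str, Check]]) -> Optional[str]:
    """Run the checks in order and return the name of the first one that fails.

    Returns None when every stage succeeds.
    """
    for name, check in checks:
        try:
            ok = check()
        except OSError:
            # Socket-level errors (timeouts, resets, resolution failures)
            # are all subclasses of OSError and count as a failed stage.
            ok = False
        if not ok:
            return name
    return None
```

Passing the checks in the order `dns`, `tcp`, `http` mirrors the structure of the rest of this article: a failure at an earlier stage makes the later ones meaningless.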


DNS resolution

Correct DNS resolution is essential for reaching a web page, which is why DNS poisoning is one of the GFW's commonly used blocking techniques.

../../images/detect-gfw-block/dns_01.png

Diagram of DNS poisoning (image source)

This section uses Pixiv as the example.

When the browser cannot reach Pixiv, first check whether DNS resolution is correct.

$ dig www.pixiv.net

; <<>> DiG 9.14.8 <<>> www.pixiv.net
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 59236
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;www.pixiv.net.                 IN      A

;; ANSWER SECTION:
www.pixiv.net.          32      IN      A       31.13.78.66

;; Query time: 22 msec
;; SERVER: 114.114.114.114#53(114.114.114.114)
;; WHEN: Thu Nov 28 20:00:38 CST 2019
;; MSG SIZE  rcvd: 58

Compare this result with the IP's geolocation and with the answer from an unpolluted overseas DNS resolver.

# Look up the IP's geolocation using data from ipip.net
$ curl http://freeapi.ipip.net/31.13.78.66
["新加坡","新加坡","","",""]%

# Get a clean answer over DoH
$ curl -H 'accept: application/dns-json' "https://cloudflare-dns.com/dns-query?name=www.pixiv.net&type=A" --no-progress-meter | jq
{
"Status": 0,
"TC": false,
"RD": true,
"RA": true,
"AD": false,
"CD": false,
"Question": [
    {
    "name": "www.pixiv.net.",
    "type": 1
    }
],
"Answer": [
    {
    "name": "www.pixiv.net.",
    "type": 5,
    "TTL": 126,
    "data": "pixiv.net."
    },
    {
    "name": "pixiv.net.",
    "type": 1,
    "TTL": 36,
    "data": "210.140.131.219"
    },
    {
    "name": "pixiv.net.",
    "type": 1,
    "TTL": 36,
    "data": "210.140.131.221"
    },
    {
    "name": "pixiv.net.",
    "type": 1,
    "TTL": 36,
    "data": "210.140.131.224"
    }
]
}

$ curl http://freeapi.ipip.net/210.140.131.221
["日本","福岛县","白河","","idcf.jp"]%

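These results show that the domestic and overseas DNS answers differ widely (an address located in Singapore versus addresses located in Japan), so the site may have been DNS-poisoned. We therefore move on to the next test.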
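
The comparison between the local answer and the DoH answer can be automated. Below is a minimal sketch, assuming the DoH reply has the JSON shape shown above; the function names `a_records_from_doh` and `looks_poisoned` are hypothetical.

```python
import json
from typing import List, Set


def a_records_from_doh(doh_json: str) -> Set[str]:
    """Collect the IPv4 addresses (record type 1 = A) from a DoH JSON reply."""
    reply = json.loads(doh_json)
    return {ans["data"] for ans in reply.get("Answer", []) if ans.get("type") == 1}


def looks_poisoned(local_ips: List[str], doh_json: str) -> bool:
    """True when the local answers share no address with the DoH answers.

    This is only a hint: CDNs may legitimately return different addresses
    to different resolvers, so a mismatch calls for the further tests below.
    """
    trusted = a_records_from_doh(doh_json)
    return bool(trusted) and trusted.isdisjoint(local_ips)
```

With the data above, the local answer `31.13.78.66` does not appear among the three `210.140.131.x` addresses, so `looks_poisoned` would flag the domain for closer inspection.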

$ dig www.pixiv.net @example.com

; <<>> DiG 9.14.8 <<>> www.pixiv.net @example.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 19483
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;www.pixiv.net.                 IN      A

;; ANSWER SECTION:
www.pixiv.net.          195     IN      A       69.63.176.15

;; Query time: 4 msec
;; SERVER: 93.184.216.34#53(93.184.216.34)
;; WHEN: Thu Nov 28 20:13:10 CST 2019
;; MSG SIZE  rcvd: 47

$ dig www.pixiv.net txt  @example.com

; <<>> DiG 9.14.8 <<>> www.pixiv.net txt @example.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 49789
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;www.pixiv.net.                 IN      TXT

;; ANSWER SECTION:
www.pixiv.net.          72      IN      A       67.228.74.123

;; Query time: 10 msec
;; SERVER: 93.184.216.34#53(93.184.216.34)
;; WHEN: Thu Nov 28 20:15:31 CST 2019
;; MSG SIZE  rcvd: 47

$ dig www.pixiv.net aaaa  @example.com

; <<>> DiG 9.14.8 <<>> www.pixiv.net aaaa @example.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 26497
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;www.pixiv.net.                 IN      AAAA

;; ANSWER SECTION:
www.pixiv.net.          208     IN      A       69.171.224.12

;; Query time: 13 msec
;; SERVER: 93.184.216.34#53(93.184.216.34)
;; WHEN: Thu Nov 28 20:15:45 CST 2019
;; MSG SIZE  rcvd: 47

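A DNS query sent to an overseas host that does not have port 53 open should receive no answer at all, yet we receive incorrect A records (even for the TXT and AAAA queries). This confirms that the site has been DNS-poisoned.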
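
The same probe can be sent without `dig`. The sketch below hand-encodes a minimal DNS query and sends it over UDP to a host that runs no DNS service; any reply at all is suspicious. `build_query` and `probe_non_resolver` are hypothetical names, and the caller is assumed to pass an IP address for `host`.

```python
import socket
import struct
from typing import Optional

# Numeric codes for the record types queried in this section.
QTYPE = {"A": 1, "TXT": 16, "AAAA": 28}


def build_query(name: str, qtype: str = "A", txid: int = 0x1234) -> bytes:
    """Encode a minimal DNS query: a 12-byte header plus one question."""
    # Flags 0x0100 = standard query, recursion desired; exactly one question.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in name.rstrip(".").split(".")
    )
    # The name ends with a zero-length label, followed by QTYPE and QCLASS=IN.
    question = labels + b"\x00" + struct.pack("!HH", QTYPE[qtype], 1)
    return header + question


def probe_non_resolver(host: str, name: str, qtype: str = "A",
                       timeout: float = 3.0) -> Optional[bytes]:
    """Send the query to a host with no DNS service; return any reply received."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name, qtype), (host, 53))
        try:
            reply, _ = sock.recvfrom(4096)
        except OSError:
            # A timeout (or an ICMP error) is the expected, healthy outcome.
            return None
        return reply
```

A return value of `None` is what an unfiltered path should produce; a non-empty reply means something between the client and the target answered on its behalf.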

In addition, a global ping tool (such as the 站长之家 (Chinaz) ping test or the IPIP ping tool) gives a more intuitive picture.

../../images/detect-gfw-block/dns_02.png

The overseas probes resolve the domain correctly, while the probes inside China get wrong answers.

TCP connection

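For the IPs of some sites (such as Google and Twitter), the GFW blocks traffic directly at layer 3 (the network layer), leaving no chance of access at all.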

$ curl -H 'accept: application/dns-json' "https://cloudflare-dns.com/dns-query?name=www.google.com&type=A" --no-progress-meter | jq
{
"Status": 0,
"TC": false,
"RD": true,
"RA": true,
"AD": false,
"CD": false,
"Question": [
    {
    "name": "www.google.com.",
    "type": 1
    }
],
"Answer": [
    {
    "name": "www.google.com.",
    "type": 1,
    "TTL": 73,
    "data": "172.217.0.36"
    }
]
}

The correct IP address is obtained over DoH.

$ telnet 172.217.0.36 80
Trying 172.217.0.36...
^C
$ telnet 172.217.0.36 443
Trying 172.217.0.36...
^C

Ports 80 and 443 on that IP cannot be reached.
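
The telnet check can also be scripted. The following is a minimal sketch (the function name `tcp_reachable` and its return labels are hypothetical) that attempts a TCP handshake and reports how it ended.

```python
import socket


def tcp_reachable(ip: str, port: int, timeout: float = 5.0) -> str:
    """Attempt a TCP handshake and report 'open', 'refused', 'timeout' or 'error'."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # A RST came back, so something on the path answered.
        return "refused"
    except socket.timeout:
        # The SYN went unanswered, as in the telnet sessions above.
        return "timeout"
    except OSError:
        # Any other socket error, for example "network is unreachable".
        return "error"
```

A result of `timeout` on both port 80 and port 443, for an address that is known to serve web traffic, is consistent with the IP-level blocking described in this section, although a timeout alone cannot distinguish blocking from an ordinary outage.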

$ mtr 172.217.0.36  --report
Start: 2019-11-28T20:32:34+0800
HOST: localhost                   Loss%   Snt   Last   Avg  Best  Wrst StDev
1.|-- _gateway                   0.0%    10    5.3   5.7   3.2  12.7   3.1
2.|-- 10.1.2.21                 50.0%    10   89.6  22.0   3.4  89.6  37.8
3.|-- 10.254.2.2                 0.0%    10   50.8  10.5   2.7  50.8  14.6
4.|-- 202.204.[MASK].[MASK]      0.0%    10   10.8   7.8   4.1  17.3   4.1
5.|-- 101.4.117.49              10.0%    10    4.7  17.8   4.7  85.3  25.5
6.|-- ???                       100.0    10    0.0   0.0   0.0   0.0   0.0
7.|-- 101.4.113.109             90.0%    10    5.7   5.7   5.7   5.7   0.0
8.|-- 101.4.114.194              0.0%    10    5.6   7.9   5.6  14.1   2.6
9.|-- 101.4.117.254              0.0%    10    5.4  19.6   5.4 105.5  30.5
10.|-- ???                       100.0    10    0.0   0.0   0.0   0.0   0.0

The traceroute shows that the packets are lost on the backbone network.

You can also use the traceroute tool provided by IPIP to check the routes from locations across China.

../../images/detect-gfw-block/tcp_01.png

HTTP request

Beyond the blocking methods above, the GFW can also inspect HTTP requests and the SNI field of the TLS handshake to block specific sites.

HTTP blocking

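This section uses 草榴 (t66y.com) as the example.

Note that t66y.com is also DNS-poisoned; the demonstrations below all use a pollution-free DNS resolver.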

$ curl -v http://t66y.com
*   Trying 104.26.10.160:80...
* TCP_NODELAY set
* Connected to t66y.com (104.26.10.160) port 80 (#0)
> GET / HTTP/1.1
> Host: t66y.com
> User-Agent: curl/7.67.0
> Accept: */*
>
* Recv failure: Connection reset by peer
* Closing connection 0
curl: (56) Recv failure: Connection reset by peer

Accessing the HTTP version of the site fails with the error "Connection reset by peer".

$ curl -v https://t66y.com
*   Trying 104.26.11.160:443...
* TCP_NODELAY set
*   Trying 2606:4700:20::681a:aa0:443...
* TCP_NODELAY set
* Connected to t66y.com (104.26.11.160) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
CApath: none
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
* ALPN, server accepted to use h2
* Server certificate:
*  subject: C=US; ST=CA; L=San Francisco; O=Cloudflare, Inc.; CN=sni.cloudflaressl.com
*  start date: Nov 21 00:00:00 2019 GMT
*  expire date: Oct  9 12:00:00 2020 GMT
*  subjectAltName: host "t66y.com" matched cert's "t66y.com"
*  issuer: C=US; ST=CA; L=San Francisco; O=CloudFlare, Inc.; CN=CloudFlare Inc ECC CA-2
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x55702c251810)
> GET / HTTP/2
> Host: t66y.com
> user-agent: curl/7.67.0
> accept: */*
>
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* old SSL session ID is stale, removing
* Connection state changed (MAX_CONCURRENT_STREAMS == 256)!
< HTTP/2 200
< date: Thu, 28 Nov 2019 12:52:19 GMT
< content-type: text/html
< content-length: 1338
< set-cookie: __cfduid=d976a8685fe1c5351a8a9467c0fdb5fb11574945539; expires=Sat, 28-Dec-19 12:52:19 GMT; path=/; domain=.t66y.com; HttpOnly
< x-powered-by: PHP/5.6.40
< cf-cache-status: DYNAMIC
< expect-ct: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
< server: cloudflare
< cf-ray: 53cc7a77bb57eb4d-LAX
<

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" >
<html>
……
……

The HTTPS version of the site, on the other hand, loads normally.


To further verify that the blocking is based on inspection of the HTTP request, we can run the following curl tests.

$ curl -v --connect-to ::example.com:  http://t66y.com
* Connecting to hostname: example.com
*   Trying 93.184.216.34:80...
* TCP_NODELAY set
*   Trying 2606:2800:220:1:248:1893:25c8:1946:80...
* TCP_NODELAY set
* Connected to example.com (93.184.216.34) port 80 (#0)
> GET / HTTP/1.1
> Host: t66y.com
> User-Agent: curl/7.67.0
> Accept: */*
>
* Recv failure: Connection reset by peer
* Closing connection 0
curl: (56) Recv failure: Connection reset by peer

The test above connects to example.com (93.184.216.34) but requests http://t66y.com. The connection is reset in the same way.

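Conversely, as shown below, if we connect to t66y.com (104.26.11.160) and request http://example.com, the request goes through without being reset. The server answers 409 Conflict because example.com is not served from that IP, but the point is that a response arrives at all.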

$ curl -v --resolve 't66y.com:80:104.26.11.160' --connect-to ::t66y.com: http://example.com
* Added t66y.com:80:104.26.11.160 to DNS cache
* Connecting to hostname: t66y.com
* Hostname t66y.com was found in DNS cache
*   Trying 104.26.11.160:80...
* TCP_NODELAY set
* Connected to t66y.com (104.26.11.160) port 80 (#0)
> GET / HTTP/1.1
> Host: example.com
> User-Agent: curl/7.67.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 409 Conflict
< Date: Thu, 28 Nov 2019 13:16:29 GMT
< Content-Type: text/plain; charset=UTF-8
< Transfer-Encoding: chunked
< Connection: close
< Set-Cookie: __cfduid=d23e69c40d4692b519f5d5f9a54307e081574946989; expires=Sat, 28-Dec-19 13:16:29 GMT; path=/; domain=.example.com; HttpOnly
< Cache-Control: max-age=6
< Expires: Thu, 28 Nov 2019 13:16:35 GMT
< Server: cloudflare
< CF-RAY: 53cc9ddc0ebcebad-LAX
<
* Closing connection 0
error code: 1001%
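
The two curl cross-tests can be reproduced with a raw socket, which makes the decoupling of destination IP and `Host` header explicit. This is a sketch under stated assumptions: `http_probe` and `host_based_blocking` are hypothetical names, and the result labels are arbitrary.

```python
import socket


def http_probe(ip: str, host_header: str, port: int = 80,
               timeout: float = 5.0) -> str:
    """Connect to ip, send a GET with the given Host header, report the outcome."""
    request = (
        f"GET / HTTP/1.1\r\nHost: {host_header}\r\n"
        "User-Agent: probe\r\nAccept: */*\r\nConnection: close\r\n\r\n"
    ).encode("ascii")
    try:
        with socket.create_connection((ip, port), timeout=timeout) as sock:
            sock.sendall(request)
            first = sock.recv(4096)
    except ConnectionResetError:
        return "reset"
    except socket.timeout:
        return "timeout"
    if not first:
        return "closed"
    # Return only the status line, e.g. "HTTP/1.1 409 Conflict".
    return first.split(b"\r\n", 1)[0].decode("latin-1")


def host_based_blocking(suspect_host_via_neutral_ip: str,
                        neutral_host_via_suspect_ip: str) -> bool:
    """Interpret the cross-test: the reset follows the Host header, not the IP."""
    return (suspect_host_via_neutral_ip == "reset"
            and neutral_host_via_suspect_ip != "reset")
```

For the transcripts above, the first probe corresponds to `http_probe("93.184.216.34", "t66y.com")` and the second to `http_probe("104.26.11.160", "example.com")`; a reset on the former and any HTTP status line on the latter is the pattern that points at filtering on the request content.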

SNI filtering

This section uses Wikipedia as the example.

First, confirm that Wikipedia's DNS resolution is correct.

$ dig www.wikipedia.org @114.114.114.114

; <<>> DiG 9.14.8 <<>> www.wikipedia.org @114.114.114.114
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 20468
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;www.wikipedia.org.             IN      A

;; ANSWER SECTION:
www.wikipedia.org.      42366   IN      CNAME   dyna.wikimedia.org.
dyna.wikimedia.org.     33      IN      A       198.35.26.96

;; Query time: 28 msec
;; SERVER: 114.114.114.114#53(114.114.114.114)
;; WHEN: Thu Nov 28 21:46:01 CST 2019
;; MSG SIZE  rcvd: 91

$ curl -H 'accept: application/dns-json' "https://cloudflare-dns.com/dns-query?name=www.wikipedia.org&type=A"
{"Status": 0,"TC": false,"RD": true, "RA": true, "AD": false,"CD": false,"Question":[{"name": "www.wikipedia.org.", "type": 1}],"Answer":[{"name": "www.wikipedia.org.", "type": 5, "TTL": 10359, "data": "dyna.wikimedia.org."},{"name": "dyna.wikimedia.org.", "type": 1, "TTL": 77, "data": "198.35.26.96"}]}%

Yet the site still cannot be accessed directly. Why?

$ curl -v https://www.wikipedia.org
*   Trying 198.35.26.96:443...
* TCP_NODELAY set
*   Trying 2620:0:863:ed1a::1:443...
* TCP_NODELAY set
* Connected to www.wikipedia.org (198.35.26.96) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
CApath: none
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to www.wikipedia.org:443
* Closing connection 0
curl: (35) OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to www.wikipedia.org:443

A packet capture shows that the connection is cut right after the TLS handshake Client Hello is sent.

../../images/detect-gfw-block/sni_01.png

Packet capture result

To further test the hypothesis that the blocking is based on SNI filtering, we can run the following curl tests.

$ curl -v --connect-to ::example.com: https://www.wikipedia.org
* Connecting to hostname: example.com
*   Trying 93.184.216.34:443...
* TCP_NODELAY set
*   Trying 2606:2800:220:1:248:1893:25c8:1946:443...
* TCP_NODELAY set
* connect to 2606:2800:220:1:248:1893:25c8:1946 port 443 failed: Network is unreachable
* Connected to example.com (93.184.216.34) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
CApath: none
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to www.wikipedia.org:443
* Closing connection 0
curl: (35) OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to www.wikipedia.org:443

The curl test above connects to example.com (IP 93.184.216.34) but performs the TLS handshake with www.wikipedia.org as the SNI. As the output shows, the connection is cut as soon as the TLS handshake Client Hello is sent.

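Conversely, as shown below, if we use example.com as the SNI when performing the TLS handshake with www.wikipedia.org (the --resolve option pins the IP so that DNS resolution is skipped), the TLS handshake completes. curl then exits with error 60 only because the returned certificate does not match example.com.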

$ curl -v --resolve 'www.wikipedia.org:443:198.35.26.96' --connect-to ::www.wikipedia.org: https://example.com
* Added www.wikipedia.org:443:198.35.26.96 to DNS cache
* Connecting to hostname: www.wikipedia.org
* Hostname www.wikipedia.org was found in DNS cache
*   Trying 198.35.26.96:443...
* TCP_NODELAY set
* Connected to www.wikipedia.org (198.35.26.96) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
CApath: none
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-ECDSA-CHACHA20-POLY1305
* ALPN, server accepted to use h2
* Server certificate:
*  subject: C=US; ST=California; L=San Francisco; O=Wikimedia Foundation, Inc.; CN=*.wikipedia.org
*  start date: Nov  8 10:47:06 2019 GMT
*  expire date: Nov 22 07:59:59 2020 GMT
*  subjectAltName does not match example.com
* SSL: no alternative certificate subject name matches target host name 'example.com'
* Closing connection 0
* TLSv1.2 (OUT), TLS alert, close notify (256):
curl: (60) SSL: no alternative certificate subject name matches target host name 'example.com'
More details here: https://curl.haxx.se/docs/sslcerts.html

curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.
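
The SNI cross-test can be sketched with Python's `ssl` module, where the SNI is simply the `server_hostname` passed to `wrap_socket` and is independent of the IP being connected to. `sni_probe` and `sni_filtered` are hypothetical names. Certificate verification is deliberately disabled here, because the certificate cannot match when the SNI and the IP belong to different sites; the probe is only meant to observe how the handshake ends and should not be used to transfer data.

```python
import socket
import ssl


def sni_probe(ip: str, sni: str, port: int = 443, timeout: float = 5.0) -> str:
    """Open TCP to ip, start TLS offering sni, and report how the handshake ends."""
    ctx = ssl.create_default_context()
    # Only the handshake outcome matters, so skip certificate checks.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    try:
        with socket.create_connection((ip, port), timeout=timeout) as raw:
            with ctx.wrap_socket(raw, server_hostname=sni):
                return "handshake ok"
    except (ConnectionResetError, ssl.SSLEOFError):
        # The connection was torn down during the handshake.
        return "cut"
    except socket.timeout:
        return "timeout"
    except ssl.SSLError:
        return "tls error"


def sni_filtered(suspect_sni_on_neutral_ip: str,
                 neutral_sni_on_suspect_ip: str) -> bool:
    """Interpret the cross-test: the failure follows the SNI, not the IP."""
    return (suspect_sni_on_neutral_ip in ("cut", "timeout")
            and neutral_sni_on_suspect_ip == "handshake ok")
```

For the transcripts above, the two probes correspond to `sni_probe("93.184.216.34", "www.wikipedia.org")` and `sni_probe("198.35.26.96", "example.com")`.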

Summary

This article has outlined how to check whether a website is blocked by the GFW.

Beyond these methods, you can also use a testing site such as greatfire.