TCP

Published: 2015-04-09 19:36:03

1. Background

  1. What is TCP?

TCP is a protocol. Like UDP, it belongs to the transport layer, layer 4 of the seven-layer OSI model.

  1. What does TCP do?

TCP provides a connection-oriented, reliable, byte stream service.

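  1. How should TCP's connection-oriented nature be understood?

  2. How should TCP's reliability be understood?

  3. Active open and passive open

active open: the end that sends the first SYN performs the active open.

passive open: the end that receives that SYN and answers with its own SYN (carried together with the ACK) performs the passive open.

  1. Active close and passive close

The side that sends the first FIN performs the active close; the side that receives that FIN performs the passive close.

Either the client or the server can actively close the connection. In practice the client usually decides when to terminate it.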
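
To make active and passive open (and active close) concrete, here is a minimal Python sketch that runs entirely on the loopback interface. The function names `serve_one` and `demo` and the `b"ping"` payload are illustrative assumptions, not part of the original text. The listening socket performs the passive open, `connect()` performs the active open, and because the server keeps reading until end of stream, the client's `close()` is what sends the first FIN.

```python
import socket
import threading


def serve_one(listener: socket.socket) -> None:
    """Passive side: accept one connection and echo until the peer closes."""
    conn, _addr = listener.accept()  # returns once a handshake has completed
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:             # empty read: the peer has sent its FIN
                break
            conn.sendall(data)
    # leaving the with-block closes conn (passive close on this side)


def demo() -> bytes:
    # Passive open: bind() + listen() put the socket into LISTEN.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS choose a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    worker = threading.Thread(target=serve_one, args=(listener,))
    worker.start()

    # Active open: connect() sends the first SYN.
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(("127.0.0.1", port))
    client.sendall(b"ping")
    reply = client.recv(1024)
    client.close()                   # active close: the first FIN comes from here

    worker.join()
    listener.close()
    return reply


if __name__ == "__main__":
    print(demo())
```

Running `netstat -ant` on the client side right after the script exits would typically show the closed connection in TIME_WAIT, since the client was the active closer.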

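2. Terminology

reordering: packets arriving in a different order from the one they were sent in
acknowledgement: confirmation of receipt
synchronize: to bring into sync
Sequence Number: identifies the position of a segment's data in the byte stream; used to cope with out-of-order delivery (reordering) of network packets.
Acknowledgement Number: the ACK value; it confirms receipt and is what lets TCP detect and recover from packet loss.
SYN: short for Synchronize Sequence Numbers.
ISN: short for Initial Sequence Number.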
socket ?

3. TCP state transitions

  • TCP states

    ESTABLISHED -- The socket has an established connection.
    
    SYN_SENT -- The socket is actively attempting to establish a connection.
    
    SYN_RECV -- A connection request has been received from the network.
    
    FIN_WAIT1 -- The socket is closed, and the connection is shutting down.
    
    FIN_WAIT2 -- Connection is closed, and the socket is waiting for a shutdown from the remote end.
    
    TIME_WAIT -- The socket is waiting after close to handle packets still in the network.
    
    CLOSE -- The socket is not being used.
    
    CLOSE_WAIT -- The remote end has shut down, waiting for the socket to close.
    
    LAST_ACK -- The remote end has shut down, and the socket is closed. Waiting for acknowledgement.
    
    LISTEN -- The socket is listening for incoming connections.  Such sockets are not included in the output unless you specify the --listening (-l) or --all (-a) option.
    
    CLOSING -- Both sockets are shut down but we still don't have all our data sent.
    
    UNKNOWN -- The state of the socket is unknown.
    
  • Normal client-side TCP state transitions:

CLOSED -> SYN_SENT -> ESTABLISHED -> FIN_WAIT_1 -> FIN_WAIT_2 -> TIME_WAIT -> CLOSED
  • Normal server-side TCP state transitions:

CLOSED -> LISTEN -> SYN_RCVD -> ESTABLISHED -> CLOSE_WAIT -> LAST_ACK -> CLOSED
  • Server performs the active close
client

CLOSED -> SYN_SENT -> ESTABLISHED -> CLOSE_WAIT -> LAST_ACK -> CLOSED

server

CLOSED -> LISTEN -> SYN_RCVD -> ESTABLISHED -> FIN_WAIT_1 -> FIN_WAIT_2 -> TIME_WAIT -> CLOSED
  • Client and server close simultaneously

  • TCPIP_State_Transition_Diagram

link http://www.cs.northwestern.edu/~agupta/cs340/project2/TCPIP_State_Transition_Diagram.pdf
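
The transition paths listed in this section can be written down as a small lookup table. The sketch below is a simplification under stated assumptions: the event names (`connect`, `recv_syn_ack`, `timeout_2msl`, and so on) are invented for illustration, and only the normal open and close paths are covered, not simultaneous close or resets.

```python
# (current state, event) -> next state, for the normal paths only.
TRANSITIONS = {
    ("CLOSED", "connect"): "SYN_SENT",            # active open, sends SYN
    ("CLOSED", "listen"): "LISTEN",               # passive open
    ("LISTEN", "recv_syn"): "SYN_RCVD",           # replies with SYN + ACK
    ("SYN_SENT", "recv_syn_ack"): "ESTABLISHED",  # replies with ACK
    ("SYN_RCVD", "recv_ack"): "ESTABLISHED",
    ("ESTABLISHED", "close"): "FIN_WAIT_1",       # active close, sends FIN
    ("ESTABLISHED", "recv_fin"): "CLOSE_WAIT",    # passive close, replies with ACK
    ("FIN_WAIT_1", "recv_ack"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv_fin"): "TIME_WAIT",      # replies with ACK
    ("CLOSE_WAIT", "close"): "LAST_ACK",          # sends FIN
    ("LAST_ACK", "recv_ack"): "CLOSED",
    ("TIME_WAIT", "timeout_2msl"): "CLOSED",
}


def walk(start, events):
    """Return the list of states visited when the events occur in order."""
    path = [start]
    for event in events:
        path.append(TRANSITIONS[(path[-1], event)])
    return path


# Hypothetical event sequences for a client that closes first.
CLIENT_EVENTS = ["connect", "recv_syn_ack", "close",
                 "recv_ack", "recv_fin", "timeout_2msl"]
SERVER_EVENTS = ["listen", "recv_syn", "recv_ack",
                 "recv_fin", "close", "recv_ack"]

if __name__ == "__main__":
    print(" -> ".join(walk("CLOSED", CLIENT_EVENTS)))
    print(" -> ".join(walk("CLOSED", SERVER_EVENTS)))
```

Swapping which side issues `close` first reproduces the "server performs the active close" paths.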

4. TCP three-way handshake

The main job of the three-way handshake that establishes a connection is to initialize the starting Sequence Number (the ISN).

Each side must tell the other its own initial Sequence Number.

  • TCP three-way handshake
client     SYN seq=x --->       server
client  <--- SYN seq=y, ACK=x+1  server
client      ACK=y+1  --->        server
  • TCP state changes during the three-way handshake

    1. When the client starts connecting, the server is still in the LISTEN state.

    2. After sending the first SYN, the client enters the SYN_SENT state.

    3. On receiving the client's SYN, the server enters the SYN_RCVD state and replies with a single segment that carries both the ACK for that SYN and the server's own SYN.

    4. On receiving the server's SYN and ACK, the client enters the ESTABLISHED state and returns an ACK to the server.

    5. On receiving the client's ACK, the server enters the ESTABLISHED state.

client Status                                           server Status

CLOSED                                                  LISTEN
   |        client      SYN seq=x --->       server          |
SYN_SENT                                                     |
   |                                                    SYN_RCVD
   |        client  <--- SYN seq=y, ACK=x+1  server          |
ESTABLISHED                                                  |
   |        client      ACK=y+1  --->        server          |
   |                                                    ESTABLISHED
  • Three-way handshake as seen by tcpdump
$ sudo tcpdump -S -nn -i p33p1 tcp and host www.baidu.com

16:10:57.732489 IP 10.240.155.182.49884 > 115.239.211.110.80: Flags [S], seq 2696447430, win 29200, options [mss 1460,sackOK,TS val 2695718698 ecr 0,nop,wscale 7], length 0
16:10:57.734058 IP 115.239.211.110.80 > 10.240.155.182.49884: Flags [S.], seq 282159576, ack 2696447431, win 29200, options [mss 1440,sackOK,nop,nop,nop,nop,nop,nop,nop,nop,nop,nop,nop,wscale 7], length 0
16:10:57.734099 IP 10.240.155.182.49884 > 115.239.211.110.80: Flags [.], ack 282159577, win 229, length 0
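
The capture above can be checked mechanically: each ACK number should be the peer's sequence number plus one. The following is a rough Python sketch, assuming tcpdump's default text output with absolute sequence numbers (`-S`). The regular expression and helper names are my own, and the capture lines are trimmed copies of the three lines above with the TCP options removed. Sequence wraparound is ignored for brevity.

```python
import re

# Trimmed copies of the three handshake lines (TCP options removed).
CAPTURE = """\
16:10:57.732489 IP 10.240.155.182.49884 > 115.239.211.110.80: Flags [S], seq 2696447430, win 29200, length 0
16:10:57.734058 IP 115.239.211.110.80 > 10.240.155.182.49884: Flags [S.], seq 282159576, ack 2696447431, win 29200, length 0
16:10:57.734099 IP 10.240.155.182.49884 > 115.239.211.110.80: Flags [.], ack 282159577, win 229, length 0
"""

LINE_RE = re.compile(
    r"IP (?P<src>\S+) > (?P<dst>\S+): Flags \[(?P<flags>[^\]]+)\]"
    r"(?:, seq (?P<seq>\d+)(?::\d+)?)?"
    r"(?:, ack (?P<ack>\d+))?"
)


def parse(line):
    """Extract endpoints, flags, seq and ack from one tcpdump text line."""
    m = LINE_RE.search(line)
    if m is None:
        raise ValueError(f"unrecognised tcpdump line: {line!r}")
    fields = m.groupdict()
    for key in ("seq", "ack"):
        fields[key] = int(fields[key]) if fields[key] is not None else None
    return fields


def is_valid_handshake(lines):
    """Check flags and the ack = seq + 1 rule for SYN, SYN+ACK, ACK."""
    syn, syn_ack, ack = (parse(line) for line in lines)
    return (
        syn["flags"] == "S"
        and syn_ack["flags"] == "S."
        and ack["flags"] == "."
        and syn_ack["ack"] == syn["seq"] + 1
        and ack["ack"] == syn_ack["seq"] + 1
    )


if __name__ == "__main__":
    print(is_valid_handshake(CAPTURE.splitlines()))
```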
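  • How sequence numbers are generated

During the three-way handshake the client must pick up the ISN sent by the server. The sequence number TCP uses is a 32-bit counter ranging from 0 to 4294967295. TCP chooses an initial sequence number (ISN) for every connection. To keep delayed or retransmitted segments from disrupting the handshake, the ISN cannot be picked arbitrarily; different systems use different algorithms.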
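
Because the counter is 32 bits wide, sequence arithmetic wraps around at 4294967295. A small Python sketch of that arithmetic follows; the helper names `seq_add` and `seq_lt` are illustrative, and the comparison uses the usual "less than half the number space apart" convention, which is an assumption about how one would compare wrapped values rather than something stated in the text.

```python
SEQ_MOD = 2 ** 32  # sequence numbers live in 0 .. 4294967295


def seq_add(seq, n):
    """Advance a sequence number by n, wrapping at 2**32."""
    return (seq + n) % SEQ_MOD


def seq_lt(a, b):
    """True if a precedes b, treating the 32-bit space as circular."""
    return a != b and (b - a) % SEQ_MOD < 2 ** 31


if __name__ == "__main__":
    # A SYN consumes one sequence number, so the expected ACK is ISN + 1.
    print(seq_add(2696447430, 1))
    # At the top of the range the next value wraps to 0.
    print(seq_add(4294967295, 1))
    # 4294967295 still counts as "before" 0 after the wrap.
    print(seq_lt(4294967295, 0))
```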

5. TCP four-way teardown

  • TCP four-way teardown
client     FIN --->     server
client  <--- ACK         server
client  <--- FIN         server
client      ACK --->     server
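  • State changes during the four-way teardown

    1. To close the connection, the client sends a FIN to the server and enters the FIN_WAIT_1 state, waiting for the server's acknowledgement.

    2. On receiving the client's FIN, the server returns an ACK and enters the CLOSE_WAIT state.

    3. On receiving the server's ACK, the client leaves FIN_WAIT_1 and enters FIN_WAIT_2, waiting for the server's own close request.

    4. Once the server considers its data transfer finished, it sends a FIN to the client and enters the LAST_ACK state.

    5. On receiving the server's FIN, the client leaves FIN_WAIT_2, returns an ACK to the server, and enters the TIME_WAIT state.

    6. On receiving the client's ACK, the server leaves LAST_ACK and enters the CLOSED state.

    7. At this point the server has really closed the connection, but the client is still in TIME_WAIT; it returns to CLOSED after waiting 2MSL.


      Note: 2MSL is twice the MSL (Maximum Segment Lifetime); see section 6.
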
client Status                              server Status

ESTABLISHED                             ESTABLISHED
   |        client      FIN --->     server             |
FIN_WAIT_1                                 |
   |                                       |
   |        client  <--- ACK         server         |
FIN_WAIT_2                              CLOSE_WAIT
   |        client  <--- FIN         server         |
   |                                    LAST_ACK
   |        client      ACK --->     server         |
TIME_WAIT                               CLOSED
   |
CLOSED
  • Four-way teardown as seen by tcpdump

16:10:57.737305 IP 10.240.155.182.49884 > 115.239.211.110.80: Flags [F.], seq 2696447509, ack 282160286, win 240, length 0
16:10:57.738883 IP 115.239.211.110.80 > 10.240.155.182.49884: Flags [.], ack 2696447510, win 193, length 0
16:10:57.738931 IP 115.239.211.110.80 > 10.240.155.182.49884: Flags [F.], seq 282160286, ack 2696447510, win 193, length 0
16:10:57.738964 IP 10.240.155.182.49884 > 115.239.211.110.80: Flags [.], ack 282160287, win 240, length 0
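
Since this capture uses absolute sequence numbers and belongs to the same connection as the handshake capture in section 4 (same addresses and ports), the sequence numbers alone tell how many payload bytes each side sent. The Python sketch below works this out. The helper names are illustrative, and it assumes the only sequence numbers consumed before the FIN are the SYN (one number) and the payload bytes.

```python
SEQ_MOD = 2 ** 32

# Values copied from the handshake and teardown captures.
CLIENT_ISN, CLIENT_FIN_SEQ = 2696447430, 2696447509
SERVER_ISN, SERVER_FIN_SEQ = 282159576, 282160286


def payload_bytes(isn, fin_seq):
    """Bytes sent between SYN and FIN; the SYN itself used up isn."""
    return (fin_seq - (isn + 1)) % SEQ_MOD


def ack_for_fin(fin_seq):
    """A FIN consumes one sequence number, so the peer acks fin_seq + 1."""
    return (fin_seq + 1) % SEQ_MOD


if __name__ == "__main__":
    print("client sent", payload_bytes(CLIENT_ISN, CLIENT_FIN_SEQ), "bytes")
    print("server sent", payload_bytes(SERVER_ISN, SERVER_FIN_SEQ), "bytes")
    print("ack for client FIN:", ack_for_fin(CLIENT_FIN_SEQ))
    print("ack for server FIN:", ack_for_fin(SERVER_FIN_SEQ))
```

The two computed ACK values match the `ack 2696447510` and `ack 282160287` fields in the capture.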

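6. The TIME_WAIT state

  • MSL (Maximum Segment Lifetime)

The maximum time any segment can remain in the network before it is discarded.


Note: for the MSL value, refer to RFC 793.


  • The TIME_WAIT state

The TIME_WAIT state is also called the 2MSL wait state.

  • How the 2MSL wait works

When TCP performs an active close and sends the final ACK, the connection must remain in TIME_WAIT for twice the MSL. This lets TCP resend the final ACK if it is lost (in which case the other end times out and retransmits its final FIN).

6.1. Too many connections in TIME_WAIT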

link http://vincent.bernat.im/en/blog/2014-tcp-time-wait-state-linux.html

  • About tcp_tw_reuse

  • About tcp_tw_recycle

  • About tcp_max_tw_buckets
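
Before tuning any of these settings it helps to know how many sockets are actually in TIME_WAIT. The following Python sketch counts sockets per state from `netstat -ant` style text. The function name `count_tcp_states` is an assumption for illustration, and the sample row is the netstat line shown in section 7.2; column positions are assumed to follow the usual Linux netstat layout.

```python
from collections import Counter

# One row of netstat output, copied from section 7.2 of this article.
SAMPLE = """\
tcp        0      0 123.58.180.188:14514    123.58.180.123:80       TIME_WAIT   -
"""


def count_tcp_states(netstat_output):
    """Count TCP sockets per state in `netstat -ant` style text."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected columns: proto, recv-q, send-q, local, foreign, state, ...
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            counts[fields[5]] += 1
    return counts


if __name__ == "__main__":
    for state, n in count_tcp_states(SAMPLE).most_common():
        print(state, n)
```

On a live host the same function could be fed the text captured from `netstat -ant`, for example through `subprocess.run`.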

7. Packet capture analysis of a web application behind LVS

7.1. Setup

client 123.58.180.188

lvs vip  123.58.180.123

realserver

lvs --> realserver: DR (direct routing) mode

7.2. Packet captures

  • client
curl -I -H Host:photo.163.com 123.58.180.123

sudo netstat -anlp|grep 123.58.180.123

tcp        0      0 123.58.180.188:14514    123.58.180.123:80       TIME_WAIT   -
sudo tcpdump -i eth0 tcp and host 123.58.180.123 -nnn

14:42:51.151998 IP 123.58.180.188.32935 > 123.58.180.123.80: Flags [S], seq 1660001800, win 2920, options [mss 1460,sackOK,TS val 3477472664 ecr 0,nop,wscale 9], length 0
14:42:51.152299 IP 123.58.180.123.80 > 123.58.180.188.32935: Flags [S.], seq 122778619, ack 1660001801, win 2896, options [mss 1460,sackOK,TS val 2328633369 ecr 3477472664,nop,wscale 9], length 0
14:42:51.152346 IP 123.58.180.188.32935 > 123.58.180.123.80: Flags [.], ack 122778620, win 6, options [nop,nop,TS val 3477472664 ecr 2328633369], length 0
14:42:51.152508 IP 123.58.180.188.32935 > 123.58.180.123.80: Flags [P.], seq 1660001801:1660001878, ack 122778620, win 6, options [nop,nop,TS val 3477472664 ecr 2328633369], length 77
14:42:51.152701 IP 123.58.180.123.80 > 123.58.180.188.32935: Flags [.], ack 1660001878, win 6, options [nop,nop,TS val 2328633376 ecr 3477472664], length 0
14:42:51.280587 IP 123.58.180.123.80 > 123.58.180.188.32935: Flags [P.], seq 122778620:122779256, ack 1660001878, win 6, options [nop,nop,TS val 2328633408 ecr 3477472664], length 636
14:42:51.280616 IP 123.58.180.188.32935 > 123.58.180.123.80: Flags [.], ack 122779256, win 9, options [nop,nop,TS val 3477472696 ecr 2328633408], length 0
14:42:51.280773 IP 123.58.180.188.32935 > 123.58.180.123.80: Flags [F.], seq 1660001878, ack 122779256, win 9, options [nop,nop,TS val 3477472696 ecr 2328633408], length 0
14:42:51.280930 IP 123.58.180.123.80 > 123.58.180.188.32935: Flags [F.], seq 122779256, ack 1660001879, win 6, options [nop,nop,TS val 2328633408 ecr 3477472696], length 0
14:42:51.280962 IP 123.58.180.188.32935 > 123.58.180.123.80: Flags [.], ack 122779257, win 9, options [nop,nop,TS val 3477472696 ecr 2328633408], length 0
  • LVS
Capture on eth0
sudo tcpdump -i eth0 tcp and host 123.58.180.188 -nnn -c 50

14:42:51.152382 IP 123.58.180.188.32935 > 123.58.180.123.80: Flags [S], seq 1660001800, win 2920, options [mss 1460,sackOK,TS val 3477472664 ecr 0,nop,wscale 9], length 0
14:42:51.152672 IP 123.58.180.188.32935 > 123.58.180.123.80: Flags [.], ack 122778620, win 6, options [nop,nop,TS val 3477472664 ecr 2328633369], length 0
14:42:51.152848 IP 123.58.180.188.32935 > 123.58.180.123.80: Flags [P.], seq 1660001801:1660001878, ack 122778620, win 6, options [nop,nop,TS val 3477472664 ecr 2328633369], length 77
14:42:51.280959 IP 123.58.180.188.32935 > 123.58.180.123.80: Flags [.], ack 122779256, win 9, options [nop,nop,TS val 3477472696 ecr 2328633408], length 0
14:42:51.281096 IP 123.58.180.188.32935 > 123.58.180.123.80: Flags [F.], seq 1660001878, ack 122779256, win 9, options [nop,nop,TS val 3477472696 ecr 2328633408], length 0
14:42:51.281291 IP 123.58.180.188.32935 > 123.58.180.123.80: Flags [.], ack 122779257, win 9, options [nop,nop,TS val 3477472696 ecr 2328633408], length 0

Capture on eth1
sudo tcpdump -i eth1 tcp and host 123.58.180.188  and port 80 -nnn -c 500 -S

14:42:51.152422 IP 123.58.180.188.32935 > 123.58.180.123.80: Flags [S], seq 1660001800, win 2920, options [mss 1460,sackOK,TS val 3477472664 ecr 0,nop,wscale 9], length 0
14:42:51.152690 IP 123.58.180.188.32935 > 123.58.180.123.80: Flags [.], ack 122778620, win 6, options [nop,nop,TS val 3477472664 ecr 2328633369], length 0
14:42:51.152864 IP 123.58.180.188.32935 > 123.58.180.123.80: Flags [P.], seq 1660001801:1660001878, ack 122778620, win 6, options [nop,nop,TS val 3477472664 ecr 2328633369], length 77
14:42:51.280975 IP 123.58.180.188.32935 > 123.58.180.123.80: Flags [.], ack 122779256, win 9, options [nop,nop,TS val 3477472696 ecr 2328633408], length 0
14:42:51.281112 IP 123.58.180.188.32935 > 123.58.180.123.80: Flags [F.], seq 1660001878, ack 122779256, win 9, options [nop,nop,TS val 3477472696 ecr 2328633408], length 0
14:42:51.281309 IP 123.58.180.188.32935 > 123.58.180.123.80: Flags [.], ack 122779257, win 9, options [nop,nop,TS val 3477472696 ecr 2328633408], length 0
  • realserver
Capture on bond0
sudo tcpdump -i bond0 tcp and host 123.58.180.188 -nnn -c 50

14:42:51.152419 IP 123.58.180.123.80 > 123.58.180.188.32935: Flags [S.], seq 122778619, ack 1660001801, win 2896, options [mss 1460,sackOK,TS val 2328633369 ecr 3477472664,nop,wscale 9], length 0
14:42:51.152853 IP 123.58.180.123.80 > 123.58.180.188.32935: Flags [.], ack 1660001878, win 6, options [nop,nop,TS val 2328633376 ecr 3477472664], length 0
14:42:51.280718 IP 123.58.180.123.80 > 123.58.180.188.32935: Flags [P.], seq 122778620:122779256, ack 1660001878, win 6, options [nop,nop,TS val 2328633408 ecr 3477472664], length 636
14:42:51.281079 IP 123.58.180.123.80 > 123.58.180.188.32935: Flags [F.], seq 122779256, ack 1660001879, win 6, options [nop,nop,TS val 2328633408 ecr 3477472696], length 0

Capture on bond1
sudo tcpdump -i bond1 tcp and host 123.58.180.188  and port 80  -nnn -c 500 -S

14:42:51.152356 IP 123.58.180.188.32935 > 123.58.180.123.80: Flags [S], seq 1660001800, win 2920, options [mss 1460,sackOK,TS val 3477472664 ecr 0,nop,wscale 9], length 0
14:42:51.152619 IP 123.58.180.188.32935 > 123.58.180.123.80: Flags [.], ack 122778620, win 6, options [nop,nop,TS val 3477472664 ecr 2328633369], length 0
14:42:51.152836 IP 123.58.180.188.32935 > 123.58.180.123.80: Flags [P.], seq 1660001801:1660001878, ack 122778620, win 6, options [nop,nop,TS val 3477472664 ecr 2328633369], length 77
14:42:51.280922 IP 123.58.180.188.32935 > 123.58.180.123.80: Flags [.], ack 122779256, win 9, options [nop,nop,TS val 3477472696 ecr 2328633408], length 0
14:42:51.281048 IP 123.58.180.188.32935 > 123.58.180.123.80: Flags [F.], seq 1660001878, ack 122779256, win 9, options [nop,nop,TS val 3477472696 ecr 2328633408], length 0
14:42:51.281240 IP 123.58.180.188.32935 > 123.58.180.123.80: Flags [.], ack 122779257, win 9, options [nop,nop,TS val 3477472696 ecr 2328633408], length 0

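7.3. Three-way handshake analysis

  • Client capture analysis

The client initiates a connection to lvs. The capture shows the complete three-way handshake between the client and the lvs vip, and no exchange at all between the client and the realserver.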

14:42:51.151998 IP 123.58.180.188.32935 > 123.58.180.123.80: Flags [S], seq 1660001800, win 2920, options [mss 1460,sackOK,TS val 3477472664 ecr 0,nop,wscale 9], length 0
14:42:51.152299 IP 123.58.180.123.80 > 123.58.180.188.32935: Flags [S.], seq 122778619, ack 1660001801, win 2896, options [mss 1460,sackOK,TS val 2328633369 ecr 3477472664,nop,wscale 9], length 0
14:42:51.152346 IP 123.58.180.188.32935 > 123.58.180.123.80: Flags [.], ack 122778620, win 6, options [nop,nop,TS val 3477472664 ecr 2328633369], length 0

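So from the client's point of view, lvs is the web server providing the complete service; the client cannot tell whether any realserver exists behind lvs.

  • LVS capture analysis

    1. lvs receives the first SYN sent by the client

      The SYN from the client as seen on lvs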
      
      14:42:51.152382 IP 123.58.180.188.32935 > 123.58.180.123.80: Flags [S], seq 1660001800, win 2920, options [mss 1460,sackOK,TS val 3477472664 ecr 0,nop,wscale 9], length 0
      
      The SYN as sent by the client to lvs
      
      14:42:51.151998 IP 123.58.180.188.32935 > 123.58.180.123.80: Flags [S], seq 1660001800, win 2920, options [mss 1460,sackOK,TS val 3477472664 ecr 0,nop,wscale 9], length 0
      
    2. lvs receives the ACK returned by the client

      The final ACK of the three-way handshake as seen on lvs
      
      14:42:51.152672 IP 123.58.180.188.32935 > 123.58.180.123.80: Flags [.], ack 122778620, win 6, options [nop,nop,TS val 3477472664 ecr 2328633369], length 0
      
      The final ACK of the three-way handshake as sent by the client to lvs
      
      14:42:51.152346 IP 123.58.180.188.32935 > 123.58.180.123.80: Flags [.], ack 122778620, win 6, options [nop,nop,TS val 3477472664 ecr 2328633369], length 0
      
  • realserver capture analysis

    1. The realserver receives the client's SYN on NIC bond1

      The SYN from the client as received on bond1
      
      14:42:51.152356 IP 123.58.180.188.32935 > 123.58.180.123.80: Flags [S], seq 1660001800, win 2920, options [mss 1460,sackOK,TS val 3477472664 ecr 0,nop,wscale 9], length 0
      
      The SYN as sent by the client to lvs
      
      14:42:51.151998 IP 123.58.180.188.32935 > 123.58.180.123.80: Flags [S], seq 1660001800, win 2920, options [mss 1460,sackOK,TS val 3477472664 ecr 0,nop,wscale 9], length 0
      
    2. The realserver sends its SYN to the client through NIC bond0, together with the ACK for the client's SYN

      The SYN + ACK as sent by the realserver to the client
      
      14:42:51.152419 IP 123.58.180.123.80 > 123.58.180.188.32935: Flags [S.], seq 122778619, ack 1660001801, win 2896, options [mss 1460,sackOK,TS val 2328633369 ecr 3477472664,nop,wscale 9], length 0
      
      client
      
      14:42:51.152299 IP 123.58.180.123.80 > 123.58.180.188.32935: Flags [S.], seq 122778619, ack 1660001801, win 2896, options [mss 1460,sackOK,TS val 2328633369 ecr 3477472664,nop,wscale 9], length 0
      
    3. The realserver receives the client's ACK on NIC bond1

      The ACK from the client as received on bond1
      
      14:42:51.152619 IP 123.58.180.188.32935 > 123.58.180.123.80: Flags [.], ack 122778620, win 6, options [nop,nop,TS val 3477472664 ecr 2328633369], length 0
      
      The ACK as returned by the client to lvs
      
      14:42:51.152346 IP 123.58.180.188.32935 > 123.58.180.123.80: Flags [.], ack 122778620, win 6, options [nop,nop,TS val 3477472664 ecr 2328633369], length 0
      
    4. The realserver returns an ACK to the client; this ACK acknowledges the segment carrying the HTTP request headers that the client sent to the web server

      realserver
      
      10:42:29.079043 IP 123.58.180.123.80 > 123.58.180.188.14514: Flags [.], ack 78, win 6, options [nop,nop,TS val 2325027858 ecr 3473867146], length 0
      
      client
      
      10:42:29.079175 IP 123.58.180.123.80 > 123.58.180.188.14514: Flags [.], ack 78, win 6, options [nop,nop,TS val 2325027858 ecr 3473867146], length 0
      

7.4. Packet flow in LVS DR mode

  1. Path of the SYN sent by the client

    client --> lvs-vip-pub-ip-eth0 --> lvs-pri-ip-eth1 --> rs-pri-ip-bond1

  2. The realserver sends its SYN and returns the ACK

    rs-pub-bond0 --> client

  3. The client returns the ACK

    client --> lvs-vip-pub-ip-eth0 --> lvs-pri-ip-eth1 --> rs-pri-ip-bond1
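
The one-way nature of each hop can be read straight off the captures in section 7.2. As an illustration, the Python sketch below collects the set of source/destination address pairs seen on each interface during the handshake. The capture lines are trimmed copies (everything after the flags removed), and the dictionary keys and function name are my own labels, not names used by LVS.

```python
import re

# Handshake packets per interface, trimmed from the captures in section 7.2.
CAPTURES = {
    "lvs eth0": [
        "14:42:51.152382 IP 123.58.180.188.32935 > 123.58.180.123.80: Flags [S]",
        "14:42:51.152672 IP 123.58.180.188.32935 > 123.58.180.123.80: Flags [.]",
    ],
    "lvs eth1": [
        "14:42:51.152422 IP 123.58.180.188.32935 > 123.58.180.123.80: Flags [S]",
        "14:42:51.152690 IP 123.58.180.188.32935 > 123.58.180.123.80: Flags [.]",
    ],
    "realserver bond1": [
        "14:42:51.152356 IP 123.58.180.188.32935 > 123.58.180.123.80: Flags [S]",
        "14:42:51.152619 IP 123.58.180.188.32935 > 123.58.180.123.80: Flags [.]",
    ],
    "realserver bond0": [
        "14:42:51.152419 IP 123.58.180.123.80 > 123.58.180.188.32935: Flags [S.]",
    ],
}

# Captures the two IPv4 addresses and drops the trailing port numbers.
ADDR_RE = re.compile(
    r"IP (\d+\.\d+\.\d+\.\d+)\.\d+ > (\d+\.\d+\.\d+\.\d+)\.\d+:"
)


def directions(lines):
    """Return the set of 'source -> destination' address pairs in lines."""
    seen = set()
    for line in lines:
        m = ADDR_RE.search(line)
        if m:
            seen.add(f"{m.group(1)} -> {m.group(2)}")
    return seen


if __name__ == "__main__":
    for interface, lines in CAPTURES.items():
        print(interface, sorted(directions(lines)))
```

Only `realserver bond0` shows traffic from the vip address back to the client, which is the direct-return leg of DR mode.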

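7.5. NIC traffic analysis

7.5.1. LVS node

  • Public NIC traffic

The public NIC only receives; it sends nothing.

  • Private NIC traffic

The private NIC only sends; it receives nothing.

eth0 traffic = eth1 traffic

In DR mode the traffic through LVS consists mainly of HTTP request headers. It arrives on the public NIC eth0 and is forwarded through the private NIC eth1 to the backend realserver; LVS itself does not process any request. The inbound traffic on eth0 therefore equals the outbound traffic on eth1.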
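
A quick way to see this in the captures from section 7.2 is to add up the TCP payload lengths per interface. The Python sketch below does so on trimmed copies of the capture lines (only timestamp, direction, flags and length are kept). The variable and function names are illustrative, and TCP/IP header bytes are not counted.

```python
import re

CLIENT_TO_VIP = "IP 123.58.180.188.32935 > 123.58.180.123.80:"
VIP_TO_CLIENT = "IP 123.58.180.123.80 > 123.58.180.188.32935:"

# Trimmed from the LVS eth0 capture.
LVS_ETH0 = [
    f"14:42:51.152382 {CLIENT_TO_VIP} Flags [S], length 0",
    f"14:42:51.152672 {CLIENT_TO_VIP} Flags [.], length 0",
    f"14:42:51.152848 {CLIENT_TO_VIP} Flags [P.], length 77",
    f"14:42:51.280959 {CLIENT_TO_VIP} Flags [.], length 0",
    f"14:42:51.281096 {CLIENT_TO_VIP} Flags [F.], length 0",
    f"14:42:51.281291 {CLIENT_TO_VIP} Flags [.], length 0",
]

# Trimmed from the LVS eth1 capture.
LVS_ETH1 = [
    f"14:42:51.152422 {CLIENT_TO_VIP} Flags [S], length 0",
    f"14:42:51.152690 {CLIENT_TO_VIP} Flags [.], length 0",
    f"14:42:51.152864 {CLIENT_TO_VIP} Flags [P.], length 77",
    f"14:42:51.280975 {CLIENT_TO_VIP} Flags [.], length 0",
    f"14:42:51.281112 {CLIENT_TO_VIP} Flags [F.], length 0",
    f"14:42:51.281309 {CLIENT_TO_VIP} Flags [.], length 0",
]

# Trimmed from the realserver bond0 capture.
RS_BOND0 = [
    f"14:42:51.152419 {VIP_TO_CLIENT} Flags [S.], length 0",
    f"14:42:51.152853 {VIP_TO_CLIENT} Flags [.], length 0",
    f"14:42:51.280718 {VIP_TO_CLIENT} Flags [P.], length 636",
    f"14:42:51.281079 {VIP_TO_CLIENT} Flags [F.], length 0",
]

LENGTH_RE = re.compile(r"length (\d+)\s*$")


def payload_total(lines):
    """Sum the TCP payload lengths reported at the end of each line."""
    total = 0
    for line in lines:
        m = LENGTH_RE.search(line)
        if m:
            total += int(m.group(1))
    return total


if __name__ == "__main__":
    print("lvs eth0 in :", payload_total(LVS_ETH0))
    print("lvs eth1 out:", payload_total(LVS_ETH1))
    print("rs bond0 out:", payload_total(RS_BOND0))
```

In this single request the LVS node carries the same 77 payload bytes on both NICs, while the 636-byte response leaves through the realserver's bond0 without touching LVS.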

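NIC selection: the LVS public NIC receives a large number of small SYN and ACK packets. Total bandwidth is not very high, but the packet count is, so the interrupt rate is high and the NIC must handle small packets well. It is therefore best to choose a multi-queue Intel NIC and to be cautious with Broadcom NICs.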

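7.5.2. realserver node

  • Inbound traffic on the private NIC

    1. Traffic from lvs to the realserver

    2. Traffic from the backend node to the upstream server

  • Outbound traffic on the private NIC

    1. Traffic from the upstream server to the upstream node
  • Public NIC traffic

    1. Traffic from the realserver to nodes on the public network

Last edited by root, 2015-03-16 15:25:49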

Original article: http://www.cnblogs.com/276815076/p/4410406.html
