TCP Long-lived Connection

Long live the king! 👑

A TCP connection is kept long-lived by enabling keep-alive on its socket (the SO_KEEPALIVE socket option); there is no keep-alive bit in the TCP header itself^[2]. The mechanism attaches a timer to the connection. When the connection has been idle long enough for the keep-alive timer to reach 0, the TCP stack sends a probe packet that carries no data and has the ACK flag set. When the peer answers with an ACK of its own, the sender knows the peer is still alive.
The application does not need to send these empty packets itself: TCP presents a byte stream to the application rather than packets, so the probes are generated automatically by the TCP stack underneath. The disadvantage is a slightly higher network load.
Keep-alive helps to:
- detect dead peers (a side that receives no ACK in reply to its probes treats the peer as dead)
- prevent disconnection due to network inactivity (the periodic probes keep the connection's entry from being evicted from the state tables of NAT devices and firewalls along the path)
- decrease data transmission delay (no need to re-establish a TCP connection for each data transmission over the same link)^[1].
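
As a minimal sketch of how keep-alive might be switched on from application code, the Python snippet below sets SO_KEEPALIVE on a socket and, where the platform exposes them, tunes the idle time, probe interval and probe count. The helper name `enable_keepalive` and the values 60/10/5 are illustrative assumptions, not something the cited sources prescribe; `TCP_KEEPIDLE`, `TCP_KEEPINTVL` and `TCP_KEEPCNT` are Linux option names and may be missing on other systems, hence the `hasattr` guards.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on TCP keep-alive for `sock` (hypothetical helper, values are examples).

    idle:     seconds of inactivity before the first probe is sent
    interval: seconds between unanswered probes
    count:    unanswered probes before the connection is declared dead
    """
    # Portable part: ask the TCP stack to send keep-alive probes at all.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Platform-specific tuning; skipped silently where the option is absent.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)
    return sock
```

With these example values, a silent peer would be declared dead after roughly idle + interval × count seconds; after that, the next read or write on the socket fails and the application can reconnect.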

Here the load balancer assigns connections at the TCP layer, so the client is connected 'directly' to a server behind the load balancer.
When a server fails, the client detects the failure (thanks to TCP keep-alive) and reconnects through the load balancer.
When the failed server comes back, we would like to shift some load from the current servers to the recovered one. There are three possible solutions^[3]:
- give the connection a limited lifetime (e.g. 10 minutes, which is still not short; once it expires, the client has to re-establish the connection through the load balancer, sacrificing some performance)
- let the load balancer intervene and force a reconnect (which is intrusive)
- let the client act as the load balancer (detecting imbalance or node changes itself, which is also intrusive)

Of these three, the blog recommends the first.
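
The first option, a limited connection lifetime, can be sketched on the client side as below. This is only an illustration under stated assumptions: the class name `AgeLimitedConnection`, the `connect` callable (anything that opens a new connection through the load balancer and returns an object with a `close()` method) and the 600-second default are hypothetical and do not come from the blog.

```python
import time


class AgeLimitedConnection:
    """Hands out one connection and replaces it once it exceeds max_age_seconds.

    Hypothetical sketch: `connect` is any zero-argument callable that opens a
    fresh connection via the load balancer and returns an object with close().
    """

    def __init__(self, connect, max_age_seconds=600.0, clock=time.monotonic):
        self._connect = connect
        self._max_age = max_age_seconds
        self._clock = clock          # injectable so the age check can be tested
        self._conn = None
        self._opened_at = 0.0

    def get(self):
        now = self._clock()
        expired = self._conn is not None and now - self._opened_at >= self._max_age
        if expired:
            # Close the old connection so the next one is routed afresh.
            self._conn.close()
            self._conn = None
        if self._conn is None:
            self._conn = self._connect()
            self._opened_at = now
        return self._conn
```

Each call to `get()` returns the current connection until it has reached the age limit; the next call after that closes it and reconnects, which gives the load balancer a chance to pick a recovered server. The cost is one extra connection setup per lifetime, which is the performance trade-off mentioned in the list.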