An analysis of TCP secure SN generation in Linux and its privacy issues

The standards of the TCP/IP protocol suite recommend that operating systems include a timer when generating the initial sequence number (ISN) of a new TCP connection. The ISN is the first sequence number sent by either party in a TCP connection. The timer helps the operating system detect stale copies of old incarnations of a connection by making ISNs that are generated consecutively and close together in time monotonically increasing. (This holds only for ISNs generated close together: if enough time elapses before a new ISN is created, the new 4-byte ISN can be smaller, since in Linux it is extracted from the rightmost bits of a larger 64-bit number.) The formula proposed in RFC 6528 to create an ISN is as follows:

ISN = M + F(localip, localport, remoteip, remoteport, secretkey)

where M is the timer value in question. The timer must not be so slow that two connections can be initiated between two consecutive ticks. The Linux kernel uses a timer with 64 ns ticks for this purpose, which is fast enough to prevent accepting duplicate instances of the same connection in most cases. The primary goal of using a timer is to improve protocol efficiency and implementation performance.

This feature, however, potentially harms the anonymity of users. The load on the CPU influences its temperature, which in turn affects the crystal oscillator on which the accuracy of the hardware clock depends. This changes the clock skew of the timer, which has been shown to be remotely detectable [1]. Apart from the fact that this issue can be harmful to cryptographic operations in the kernel in the long run, clock skew in ISNs can in theory also be used to deanonymize users [2][3].

To see the chain of ISNs affected by the 64 ns timer in Linux, I wrote a small program that initiates multiple connections from a constant source port to a remote server. This can happen in real-world scenarios, and malicious code may also make the OS do it. (Sometimes the malware itself cannot reveal the identity and location of the victim, for example when the user is running Whonix, unless it mounts a side-channel attack like this one.) The chart below shows the result of running the program: the ISNs generated by the current implementation in the Linux kernel. (The horizontal axis is the connection number.)

[Figure isn_orig: ISNs generated by the current Linux kernel implementation, by connection number]

The sequence numbers are indeed the way they are supposed to be: monotonic. But since the difference between each pair is driven by the system timer, it can be correlated with the real time elapsed between the observations of the two ISNs. The differences between consecutive pairs in this case are as follows.

[Figure dif_isns: differences between consecutive ISNs]

These differences show the number of 64 ns time slices between ISNs. An adversary who has access to the victim's network traffic can fairly easily compare them with the real time differences observed between the packets. By changing the load on the CPU, the adversary can detect a change in the pattern of differences and then verify with high precision whether this is the user initiating the new connections, effectively breaking anonymity. Even when the user is behind a gateway firewall such as the Whonix gateway, it is still possible to change the load on the gateway and observe the clock skew. The differences listed above are well within a reasonable time range to be included in an adversarial calculation. These are the results of only 10 consecutive connections to a specific host; more connections increase the accuracy of verification and significantly reduce the chance of false positives.
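To make the leak concrete, here is a minimal Python sketch of a timer-based generator in the spirit of the RFC 6528 formula, together with the calculation an observer would perform. This is not the kernel's code: the keyed hash (BLAKE2 from hashlib), the function names, the secret, and the sample addresses are all assumptions chosen for illustration. Only the 64 ns tick and the 32-bit truncation come from the discussion above.

```python
import hashlib
import struct

TICK_NS = 64          # timer granularity discussed above
MASK32 = 0xFFFFFFFF   # an ISN is a 32-bit value

def timer_isn(now_ns, local_ip, local_port, remote_ip, remote_port, secret):
    """ISN = M + F(4-tuple, secret), truncated to 32 bits (illustrative only)."""
    m = now_ns // TICK_NS  # M: the timer value, in 64 ns ticks
    data = f"{local_ip}:{local_port}-{remote_ip}:{remote_port}".encode()
    # F: a keyed hash of the connection identifiers (BLAKE2 is a stand-in)
    digest = hashlib.blake2b(data, key=secret, digest_size=4).digest()
    f = struct.unpack(">I", digest)[0]
    return (m + f) & MASK32

def elapsed_ns_from_isns(isn_a, isn_b):
    """What an observer computes: ticks between two ISNs, converted to ns."""
    return ((isn_b - isn_a) & MASK32) * TICK_NS

secret = b"hypothetical-secret"                 # made-up key
conn = ("10.0.0.2", 40000, "203.0.113.5", 80)   # made-up 4-tuple
t0 = 1_000_000_000                              # first connection (ns)
t1 = t0 + 5_000_000                             # second connection, 5 ms later

isn0 = timer_isn(t0, *conn, secret)
isn1 = timer_isn(t1, *conn, secret)
print(elapsed_ns_from_isns(isn0, isn1))  # 5000000: the 5 ms gap is recovered
```

Because F is constant for a fixed 4-tuple and secret, it cancels out in the subtraction and only the timer term remains. The difference is taken modulo 2^32, so it is only meaningful while fewer than 2^32 ticks separate the two connections.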

Solution

To cope with this problem we need to get rid of the timer completely. For our tests we chose Linux and wrote a patch for the kernel. Here I briefly explain how Linux behaves in this respect and the results of patching the kernel to eliminate the problem. Normally Linux checks for old duplicates while a connection is in the TIME_WAIT state, following this execution flow (it may differ slightly between kernel versions):

net/ipv4/af_inet.c: inet_init
  -> net/ipv4/protocol.c: inet_add_protocol
  -> net/ipv4/tcp_ipv4.c: tcp_v4_rcv
  -> net/ipv4/tcp_minisocks.c:
         th->syn && !th->rst && !th->ack && !paws_reject &&
            (after(TCP_SKB_CB(skb)->seq, tcptw->tw_rcv_nxt) ||
                   .
                   .
                   .
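
The sequence-number half of the condition above relies on a wraparound-aware comparison. The following Python sketch is modeled on the kernel's before()/after() helpers and shows how such a comparison decides whether an incoming SYN looks newer than what the TIME_WAIT socket last expected. The function syn_looks_new and the sample values are hypothetical, and the elided remainder of the condition is not modeled.

```python
MASK32 = 0xFFFFFFFF

def before(seq1, seq2):
    """True if seq1 precedes seq2 in 32-bit wraparound arithmetic."""
    diff = (seq1 - seq2) & MASK32
    # Top bit set means the signed 32-bit difference is negative.
    return diff >= 0x80000000

def after(seq2, seq1):
    """True if seq2 follows seq1 (the mirror image of before)."""
    return before(seq1, seq2)

def syn_looks_new(syn_seq, tw_rcv_nxt):
    """Hypothetical helper: only the sequence-number part of the check."""
    return after(syn_seq, tw_rcv_nxt)

print(syn_looks_new(1000, 500))        # True: the SYN is ahead of rcv_nxt
print(syn_looks_new(500, 1000))        # False: looks like an old duplicate
print(syn_looks_new(5, 0xFFFFFFF0))    # True: still ahead across the wrap
```

With timer-driven ISNs a later connection usually passes this comparison, which is the role the timer plays in rejecting old duplicates.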

Once the timer no longer plays a role, an old instance of the same connection that was stuck somewhere along the path and arrives after a later instance may cause the receiver to mistakenly initiate the connection using an old SYN. First, this situation (reinitiating the connection with an older SYN) is unlikely in general. Second, the cost of recovering from such an error is negligible: it depends on the exact error-handling mechanism implemented in the particular kernel version and operating system, but both parties recover from the problematic instance after agreeing on new sequence numbers. So the potential side effects of applying the patch seem well tolerable. When anonymity is a serious concern, we can ignore this potential efficiency cost, which in our experiment was not noticeable during regular network usage.

To test this, I changed the kernel's ISN generator so that it creates secure random ISNs independent of any sort of timer. Running the same program after this change results in the following chart.

[Figure isn_cust-2: ISNs generated by the patched kernel for the same program]

In this case, even for two SYN packets with the same source and destination IP addresses and ports, there is no meaningful pattern in the observed ISNs, and the differences between them cannot be related to the actual transmission times of the packets. This addresses the anonymity flaw that currently exists in the way TCP ISNs are created in different operating systems.
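
As a rough illustration of the idea, the Python sketch below draws each ISN from a cryptographically secure random source and then repeats the observer's calculation. It is not the code of the kernel patch or of the tirdad module: secrets.randbits stands in for whatever random source the kernel uses, and the function names are hypothetical.

```python
import secrets

MASK32 = 0xFFFFFFFF
TICK_NS = 64  # the tick size an observer would assume

def random_isn():
    """A timer-independent ISN: 32 bits from a secure random source."""
    return secrets.randbits(32)

def apparent_elapsed_ns(isn_a, isn_b):
    """The observer's calculation, which no longer measures anything."""
    return ((isn_b - isn_a) & MASK32) * TICK_NS

# Ten consecutive "connections", as in the experiment described above.
isns = [random_isn() for _ in range(10)]
gaps = [apparent_elapsed_ns(a, b) for a, b in zip(isns, isns[1:])]
print(gaps)  # values unrelated to when the connections were made
```

Since no timer value enters the ISN, the computed gaps change on every run and carry no information about clock ticks or clock skew.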

The Cost

This patch adds no time cost to the kernel for generating a secure initial sequence number.

Usage

We decided to implement the code as a Linux kernel module that hot-patches the kernel to add a new algorithm for creating TCP initial sequence numbers, without replacing the original algorithm currently used in the kernel. Once the module is activated, it dynamically hooks into the kernel and installs the algorithm, which the networking subsystem of the OS then uses. The code is available from my tirdad repository [4], and you can also install the module using apt-get [5]. If you install the Debian package, the functionality is activated at system boot. The package in the upstream repository, however, comes with a separate loader that activates or deactivates the module, so you can switch back and forth between tirdad and the regular kernel ISN generator.

[1] https://dl.acm.org/citation.cfm?id=1180410
[2] https://phabricator.whonix.org/T543
[3] https://trac.torproject.org/projects/tor/ticket/16659
[4] https://github.com/0xsirus/tirdad/
[5] https://github.com/Whonix/tirdad

Contact: sirus.shahini@gmail.com
         twitter.com/0xsirus
         https://www.cs.utah.edu/~sirus
         PGP 0xB5333DFC0AFD91AF
University of Utah
School of Computing
