An Empirical Approach to TCP/IP Sequence Number Analysis

(C) 2001 Michal Zalewski <lcamtuf@bos.bindview.com>
(C) 2001 Mark Loveless <loveless@bos.bindview.com>

 

0) Introduction

In TCP/IP stream connections over public IP networks, unpredictable sequence numbers are the only available protection against blind packet spoofing attacks. An attacker who is able to predict the sequence numbers used for TCP/IP connections can insert malicious data into private TCP sessions, potentially compromising network security. This research is an attempt to analyse the PRNGs (pseudo-random number generators) used to choose TCP sequence numbers in different operating systems, and to expose potential flaws in the algorithms used. At the same time, we would like to provide useful sequence number analysis tools to the open-source community.

To fully understand this whitepaper, you should have basic knowledge of TCP/IP protocols [1] and TCP spoofing attacks [2]. Basic knowledge of mathematics is assumed as well.
 

1) Analysis method

For the purpose of this research, we used two different analysis techniques. First of all, we developed a visualisation utility to reconstruct three-dimensional phase-space attractors for each PRNG function used [3]. This method, known as "delayed coordinates", is well known and widely used in the analysis of dynamic systems (especially nonlinear systems and deterministic chaos): assuming we have one-dimensional discrete input, we can reconstruct n-dimensional function attractors using previous (delayed) values of the function as the missing coordinates. This trivial but powerful method can reveal the complex nature of the underlying pseudo-random equation, showing not only data distribution, but subtle dependencies between subsequent results as well. Instead of using the function values themselves, we decided to calculate the first derivative (the difference between consecutive samples) of the input data to generate more suggestive and useful results, showing function dynamics. So, if s stands for the input set, and x, y and z are the point coordinates we are looking for, the equations look this way:

  x[n] = s[n-3] - s[n-2]
  y[n] = s[n-2] - s[n-1]
  z[n] = s[n-1] - s[n]
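
The coordinate equations translate directly into code. The following is a minimal sketch, not the visualisation utility itself: the function names are ours, and treating each difference as a signed 32-bit value (so that a counter wrap-around does not produce a huge jump) is an assumption that the equations above do not spell out.

```python
from typing import List, Sequence, Tuple

MOD = 2 ** 32  # TCP sequence numbers are 32-bit values


def signed_delta(a: int, b: int) -> int:
    """Return a - b as a signed 32-bit difference (assumed wrap-around handling)."""
    d = (a - b) % MOD
    return d - MOD if d >= MOD // 2 else d


def delayed_coordinates(s: Sequence[int]) -> List[Tuple[int, int, int]]:
    """Build one (x, y, z) point per sample from the three preceding differences."""
    points = []
    for n in range(3, len(s)):
        x = signed_delta(s[n - 3], s[n - 2])
        y = signed_delta(s[n - 2], s[n - 1])
        z = signed_delta(s[n - 1], s[n])
        points.append((x, y, z))
    return points


# A generator with a constant increase collapses to a single point.
samples = [1000, 65000, 129000, 193000, 257000]
print(delayed_coordinates(samples))
# [(-64000, -64000, -64000), (-64000, -64000, -64000)]
```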

Our relatively simple, svgalib-based visualisation utility supports interactive 3-D viewing of the PRNG attractor and some supplementary functions (like screenshots). The general rule for this technique is simple - if we cannot see any regularity in the output data, this does not mean there is no regularity - it might be too complex to see (in 3-D representation, at least; in some cases, you might want to use wavelet transformations as well). In fact, it is not enough to build a good PRNG for TCP/IP sequence number generation purposes, because every PRNG not seeded by an external entropy source will have all possible internal states 'exhausted' at some point, return to a previous internal state and start generating exactly the same numbers again (this is a rather basic law of dynamic systems with a finite number of possible internal states). Of course, this might happen relatively fast or might take long years, depending on the implementation, but it is strongly recommended to add external entropy to the PRNG pool, turning it into an RNG.
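
The finite-state argument can be checked on a toy generator. The sketch below is an illustration only: toy_lcg is a hypothetical 16-bit linear congruential generator chosen by us, not a generator taken from any of the tested systems. With no external entropy, it has to revisit a state after at most 65536 steps.

```python
from typing import Callable


def cycle_length(step: Callable[[int], int], seed: int) -> int:
    """Return the length of the cycle that the state sequence eventually enters."""
    seen = {}
    state, n = seed, 0
    while state not in seen:
        seen[state] = n  # remember at which step this state first appeared
        state = step(state)
        n += 1
    return n - seen[state]


def toy_lcg(state: int) -> int:
    """Hypothetical 16-bit generator; the whole internal state fits in 16 bits."""
    return (5 * state + 1) % 65536


print(cycle_length(toy_lcg, 0))  # 65536
```

A real implementation has a much larger state, so the cycle is far longer, but the same bound applies.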

On the other hand, whenever there is any regularity, this means (P)RNG failure because there is a higher probability of specific, predictable behaviour (and, as we are talking about discrete, 32-bit output values, such conditions make sequence number prediction easier). In this case, our task is to estimate how serious this failure is. For example, a PRNG using 24-bit increases (16777216 combinations) in every step instead of the full 32-bit space (4294967296 possibilities) does not - practically speaking - pose any additional risk (risk estimation depends on the approach; we are assuming average Internet connection lifetime and average bandwidth). At the same time, a space reduced to 14 bits (16384 possibilities) means risk, because the attack is now feasible for an average attacker, usually requiring him to send up to 8192 packets to get positive results. Knowing the moment when the connection is established, the attacker can send a rapid burst of several thousand packets to insert malicious data into the communication stream.
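
The packet counts quoted above follow from simple arithmetic: in a uniform k-bit space, with distinct guesses, the chance of success grows linearly with the number of packets sent. A minimal sketch (the function name is ours, and uniformity is an assumption; a generator with dense regions needs fewer packets):

```python
import math


def packets_for_success(bits: int, probability: float = 0.5) -> int:
    """Distinct guesses needed to hit one unknown value in a 2**bits space
    with the given probability, assuming a uniform distribution."""
    return math.ceil(probability * 2 ** bits)


print(packets_for_success(14))  # 8192
print(packets_for_success(24))  # 8388608
```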

The supplementary method we used was a relatively simple utility that generates a potentially useful "spoofing set" - a set of the most common ISN increases that an attacker should use to predict the next sequence number. Set size was limited to 1000 elements. Additionally, input data coverage was calculated (number of deltas covered by the "spoofing set" / overall number of deltas). This method is less powerful, because it is not able to detect correlation between subsequent results, but at the same time it might be really helpful in the most trivial cases, providing a real-world measure to complement the theoretical visualisation results.
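
A spoofing set of this kind can be sketched in a few lines. This is an approximation of the idea, not the original utility: the function name is ours, deltas are taken modulo 2^32, and no minimum frequency is required for a delta to count as "dominating", so the original tool may select elements differently.

```python
from collections import Counter
from typing import List, Sequence, Tuple

MOD = 2 ** 32


def spoofing_set(isns: Sequence[int], max_size: int = 1000) -> Tuple[List[int], float]:
    """Return the most common ISN increases and the share of deltas they cover."""
    deltas = [(b - a) % MOD for a, b in zip(isns, isns[1:])]
    if not deltas:
        return [], 0.0
    common = Counter(deltas).most_common(max_size)
    covered = sum(count for _, count in common)
    return [delta for delta, _ in common], covered / len(deltas)


# Constant increases give a one-element set with full coverage.
print(spoofing_set([0, 64000, 128000, 192000]))  # ([64000], 1.0)
```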

We used a normalized set of 100000 sequence numbers for every test (whenever possible), gathered in short time intervals (that's important for time-dependent algorithms). For time-based algorithms, or algorithms where relatively small ISN changes are introduced on a per-connection or per-timetick basis, this results in relatively small, saturated objects (boxes for time/connection based increments, points and their 'halos' for time-based changes).
 

2) Introduction to PRNG analysis

In this section, we won't focus on actual PRNG implementations. Instead, we would like to discuss some idealized sets of data that can be applied in real-life analysis.

The first picture shows completely random 32-bit data (generated by a relatively good RNG implementation with an external entropy source) - this example might be the perfect TCP sequence number generator:

[random data]

What you see is a uniform, homogeneous cloud of evenly distributed points, filling the whole available 32-bit phase space with white noise. You (hopefully) cannot see any structure, groups of points or any signs of regularity. As a counterexample, here's a pretty complex function that produces values that look random (certainly more random than most PRNGs used as TCP/IP sequence number generators), but in fact there's a simple dependency between subsequent results:

[complex function]

As you can see, it looks completely different. Chaos is replaced with order, and correlation between subsequent results is clearly visible, creating a subtle three-dimensional path. What does it mean for a "wannabe" RNG? Well, obviously, knowing such n-dimensional characteristics of a specific algorithm, even without knowing the exact underlying function, you do not have to search the whole 32-bit space of possible values, but only the insignificant fragment of this space filled by this path.

Another important thing in (P)RNG analysis: what if the RNG uses a good entropy source and adds this entropy to the pool in every cycle (in a way that makes the 1-dimensional distribution look just fine), but because the pool-to-output data representation algorithm is a little bit broken, it causes partial (maybe just statistical) correlation between subsequent results? For example, you can be sure in 60% of cases that 1234 will be followed by 5678, but other than that the output is random? In this case, you would see the following:

[statistical correlation]
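
The "1234 is followed by 5678 in 60% of cases" situation can also be measured numerically. The sketch below is our own illustration (the function name is hypothetical) and only catches exact value repeats, so for real 32-bit data it would have to be applied to deltas or to coarse buckets rather than to raw output.

```python
from collections import Counter, defaultdict
from typing import Dict, Sequence, Tuple


def successor_bias(values: Sequence[int]) -> Dict[int, Tuple[int, float]]:
    """For each value, return its most frequent successor and how often it follows."""
    followers = defaultdict(Counter)
    for current, nxt in zip(values, values[1:]):
        followers[current][nxt] += 1
    result = {}
    for current, counts in followers.items():
        successor, hits = counts.most_common(1)[0]
        result[current] = (successor, hits / sum(counts.values()))
    return result


data = [1234, 5678, 1234, 5678, 1234, 9, 1234, 5678, 1234, 1]
print(successor_bias(data)[1234])  # (5678, 0.6)
```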

What you see above is another example of a flaw that can be easily traced using this method, and which is relatively difficult to trace without visualisation tools or in-depth function analysis. Some regions of phase space are more occupied (in terms of attractor reconstruction, we would say some values are "attracted" by these regions). If a considerably large number of points can be found near any relatively small region (point, curve, plane, etc.), you can in most cases successfully limit your search for new values to this region, and you do not have to search the whole available function space. Patterns are different for every "imperfect" algorithm, but if present, they can mean problems.

Another case we should cover is what happens when random data is mixed with predictable numbers - e.g. what if the top 16 bits of every result are truly random (taken from the first example), and the less significant 16 bits are predictable (taken from the second example)? That reduces the search space 64k times, but is really hard to catch when examining function output. Or what if even results are predictable, and odd results are not (this allows an attacker to successfully guess sequence numbers for 50% of packets, which is certainly more than enough for IP spoofing)? Here is an example - just to let you know it is something you simply cannot overlook:

[mixed data]
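
One way to check for the "random top half, predictable bottom half" case without a picture is to analyse the two 16-bit halves separately. This is a sketch under our own assumptions (hypothetical function name, and "predictable" narrowed down to "advances by a constant step"); it needs at least two samples.

```python
from collections import Counter
from typing import Sequence, Tuple


def half_coverage(isns: Sequence[int]) -> Tuple[float, float]:
    """Share of steps explained by the single most common delta in the high
    and low 16-bit halves (1.0 means that half advances by a constant step)."""
    def top_share(parts):
        deltas = [(b - a) % 65536 for a, b in zip(parts, parts[1:])]
        return Counter(deltas).most_common(1)[0][1] / len(deltas)

    high = [(v >> 16) & 0xFFFF for v in isns]
    low = [v & 0xFFFF for v in isns]
    return top_share(high), top_share(low)


# Arbitrary-looking top halves glued to bottom halves that grow by 7 each step.
highs = [0x1234, 0xABCD, 0x0001, 0xFFFF, 0x8000]
mixed = [(h << 16) | (7 * i) for i, h in enumerate(highs)]
print(half_coverage(mixed))  # (0.25, 1.0)
```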

Other cases that are worth mentioning are PRNGs with trivial time dependency. Pictures of such PRNGs usually contain regular groups of points on three surfaces and one very saturated center point where these surfaces are connected:

[trivial tdep]

Another pattern is generated by random, but limited (relatively small) increments. Such PRNGs are pretty good, except that they do not fill the whole available space, and the high bits are not as random as they should be:

[small increases]
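
For generators with limited increments, the effective width of the increments can be estimated from the samples. The sketch below is one possible measure and not the method used to read widths off our graphs: the function name, the 99% threshold and the assumption that increments are non-negative modulo 2^32 are all ours.

```python
from typing import Sequence

MOD = 2 ** 32


def delta_bit_width(isns: Sequence[int], percent: int = 99) -> int:
    """Smallest k such that at least `percent` % of ISN increases are below 2**k."""
    deltas = [(b - a) % MOD for a, b in zip(isns, isns[1:])]
    if not deltas:
        raise ValueError("need at least two samples")
    for k in range(32):
        inside = sum(1 for d in deltas if d < 2 ** k)
        if inside * 100 >= percent * len(deltas):
            return k
    return 32


# Increases of 100, 200 and 700 all fit below 2**10.
print(delta_bit_width([0, 100, 300, 1000]))  # 10
```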

Since we've covered the mathematical basics and their representations, let's focus on specific implementations.

3) Linux

[linux]

Linux 2.2 TCP/IP sequence numbers are not as good as they might be, but are certainly good enough, forming a 24-bit wide cloud. A wider 32-bit cloud would be better, but 24 bits give over 16 million combinations, making spoofing practically impossible in most real-life scenarios.

The "spoofing set" is in this case empty (there are no dominating deltas). Not shown in this image are six attraction points placed relatively far away from the main cloud. These groups are small (just approx. 10 bits wide) and cover approximately 1% of all data (thanks to Elias Levy for pointing this out).
 

4) Windows

In Windows 2000 and NT4 SP6a (with the latest hotfixes), the TCP/IP PRNG has been significantly improved over previous releases, and it now gets relatively good ranks (note: for NT4 SP6, we had very predictable results, a 23-element spoofing set with 100% coverage). Unfortunately, almost 100% of results are enclosed within a 14-bit wide 3-D cube. Regular cubes are usually a clear sign of time-dependent PRNGs. As this cube has a filled interior, this isn't trivial time dependency, but time- or cycle-based 14-bit increases. This box has three shadows, visible on the screenshot below, but saturation of the shadows is marginal. The 14-bit wide box gives 8192 combinations, although probably a bit fewer because the box seems to have higher density near its center. Sending fewer than 4200 packets to get a 50% chance of connection hijacking isn't impossible or even too difficult. Thus, we believe Windows 2000 and NT4 sequence numbers do not carry enough randomness.

[windows 2000]

The spoofing set generated for this cube (1000 elements) has a coverage of 8%. This means a high risk - or, to be more precise, success in roughly 2 out of 25 attempts when rapidly sending 1000 spoofed packets.

Going back to Windows NT 4.0 SP3, sequence numbers are much easier to predict. The spoofing set has 68 elements and covers 100% of the input data. The graph for this PRNG looks really trivial:

[Windows NT]

Please note that this 8-bit wide graph covers over 99% of the input data - and even then, this data is not randomly distributed, but forms certain patterns. Problems in pre-SP6a Windows NT and Windows 98 / 95 were already addressed in several advisories.

Below is the Windows 95 output (536-element set with 100% coverage), which just confirms that Windows 95 sequence numbers are trivial to predict (the Windows 95 cloud is 8-9 bits wide):

[Windows 95]

What is really difficult to understand is why this algorithm was "weakened" in Windows 98 (SE). In this next release of the operating system, the spoofing set is reduced to 191 elements with 100% coverage, and the graph shows a time-dependency pattern:

[Windows 98]

5) Cisco IOS

Recently, some security advisories addressing a serious TCP/IP sequence number flaw in the Cisco IOS operating system were released. Short tests against the Cisco IOS implementation show the nature of this vulnerability:

[Cisco IOS]

What you see above is a trivial, probably microsecond-clock based time dependency at its finest, with most of the results attracted to one point and "echoes" around it. The spoofing set of 1000 elements gives approx. 10% coverage, but it is even simpler to predict sequence numbers. The attacker would simply have to synchronize his own clock with the IOS clock, taking packet round-trip times into account, and could then perform the attack at any time by sending packets covering a window of e.g. +/- 100 microseconds.

Here, for comparison, are the results for patched IOS. The main attractor is still there, but some additional noise was introduced to make prediction more difficult (yet not difficult enough!):

[Cisco ALT]

6) AIX (4.3)

This is an example of trivial cyclic increments. The output doesn't look too impressive because only a few values are used:

[AIX]

The spoofing set has a size of 5 elements (64000, 128000, 448000, 512000, 1216000) and 100% coverage. This makes this system 100% vulnerable to TCP/IP spoofing attacks.
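
With a spoofing set this small, the list of candidate sequence numbers is trivial to enumerate. A minimal sketch using the five deltas quoted above (the function name is ours, and we assume exactly one ISN increase happens between the attacker's probe and the targeted connection):

```python
from typing import Iterable, List

MOD = 2 ** 32
AIX_DELTAS = (64000, 128000, 448000, 512000, 1216000)  # spoofing set quoted above


def candidate_isns(last_isn: int, deltas: Iterable[int] = AIX_DELTAS) -> List[int]:
    """Return every next ISN reachable from last_isn with one known increase."""
    return [(last_isn + d) % MOD for d in deltas]


print(candidate_isns(1000000))
# [1064000, 1128000, 1448000, 1512000, 2216000]
```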
 

7) FreeBSD

FreeBSD 4.2 seems to have problems more or less similar to Windows 2000. Its cube is a little bit larger, but at the same time its spoofing set has over 5% coverage. An attack is possible, but difficult. The cube is approx. 16 bits wide (that gives approximately 65536 possibilities, not counting the relatively long and saturated 'shadows').

[FreeBSD]

8) OpenBSD

The problem here is pretty similar to the FreeBSD case. With our input data, we were able to generate a spoofing set with approximately 3% coverage. This means attacks are not very feasible, but on the other hand the TCP/IP sequence number generator might be significantly improved to avoid this risk. The cube is 17-18 bits wide, so the problem isn't as dangerous as in Windows 2000. Tested against OpenBSD 2.8:

[OpenBSD]

9) HP/UX

HPUX10 seems to have (practically speaking) NO random sequence numbers. It uses constant increases of 64000, which gives a spoofing set of one element with 100% coverage. This is the default configuration, which fortunately can be modified, but once modified all you get is the well-known cube pattern, 15 bits wide. HPUX11 seems to have a significantly improved random number generator. The function used seems to be really weird, with a shape resembling the Mir orbital station. Unfortunately, this is certainly not enough. There is a 508-element "spoofing set" which covers 100% of the data - in other words, the internal state of the PRNG makes a complete turn in 508 steps.

[HPUX11]

10) Solaris

Solaris has three different settings for the tcp_strong_iss kernel parameter. When it is set to 0, completely predictable numbers are generated (9-element set with 100% coverage). With the default setting of 1, Solaris 7 generates relatively good ISNs, but it has exactly the same problem as Windows 2000. The spoofing set has 3% coverage, which is not much, and does not make an attack very feasible (the cube is approximately 17 bits wide).

[Solaris 7]

In Solaris 8 with tcp_strong_iss=1, the "randomness" source seems to behave a little bit differently, generating a spoofing set with smaller coverage (2%). The data forms an oval, Linux-like cloud (but it is significantly smaller, with an 18-bit radius):

[Solaris 8]

Setting tcp_strong_iss to 2 seems to provide relatively good ISNs. We have found a small set of dominating deltas, but it has less than 0.03% coverage. The output data forms a nice, 32-bit cloud in phase space:

[Solaris ISS2]






11) BSDI 4.0

[BSDI]

Both BSDI 3.0 and 4.0 behave similarly to FreeBSD. They give practically the same results, with slightly more visible shadows (which makes an attack more difficult).
 

12) IRIX

The IRIX operating system, even in versions that are meant to have a relatively good ISN subsystem (recent 6.5 releases), seems to be flawed:

[IRIX]

This graph shows a 15-bit "flower" containing over 98% of all ISN deltas.
 

13) MacOS

The older MacOS9 operating system has a trivial and predictable ISN generator. With approximately 500 deltas and 99% set coverage, sequence numbers are completely predictable (even though it gets good nmap ranks). The output pattern resembles an X-wing fighter:

[old MacOS]

The MacOS X operating system is another candidate for possible TCP sequence number guessing attacks. With set coverage of approx. 10% and 95% of the data concentrated in a 15-bit wide area, this implementation does not seem to be very secure:

[MacOS X]

14) Multiple network devices

Most network devices, like HP printers, many routers (Motorola, Netopia, US Robotics and Intel), numerous 3Com switches (and so on), have completely predictable sequence numbers that use constant increments. These devices have a one-point or few-point representation of their PRNG engines. These problems are widely known and have been discussed on numerous forums (such as BugTraq), thus we do not think they are worth separate discussion in this paper.
 

15) Summary, conclusions

The general conclusions can seem a little bit scary. What comes to our attention is that almost every implementation described above (except, perhaps, reconfigured Solaris) has more or less serious flaws that make short-time TCP sequence number prediction attacks possible. We applied relatively loose measures, classifying attacks as "feasible" if they can be accomplished using relatively low bandwidth and a reasonable amount of time. But, as network speeds are constantly growing, it would not be a problem to search the entire 32-bit ISN space in several hours, assuming a LAN connection to the victim host (and assuming the network doesn't come crashing down, although an attack could be throttled to compensate). While it seems obvious that OS vendors should do their very best to make such attacks as difficult as possible, this clearly isn't happening. What we're observing are sequence number ranges from 1 bit (immediate guess) to 24 bits (15-60 minutes for an attack on fast uplinks), with most results between 15 and 19 bits (a few seconds to several minutes of attack time in most cases). This is true even for server systems that are supposed to have strong ISNs.
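
The attack times quoted above depend on how many spoofed packets per second the attacker can deliver. The sketch below makes that dependency explicit; the function name is ours and the packet rates are hypothetical illustration values, not measurements from this research.

```python
def search_time_seconds(bits: int, packets_per_second: float) -> float:
    """Worst-case time to try every value in a 2**bits space at a fixed rate."""
    return 2 ** bits / packets_per_second


# 24-bit space at an assumed 10000 packets/s: roughly 28 minutes.
print(round(search_time_seconds(24, 10_000) / 60))  # 28
# Full 32-bit space at an assumed 100000 packets/s: roughly 12 hours.
print(round(search_time_seconds(32, 100_000) / 3600))  # 12
```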

For the purpose of an overall comparison of ISN generator implementations, we use the following equation:

  overall ISN security factor = ( estimated bit width of main attractor )^2 / ( 101 - spoof set coverage )
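
The equation can be evaluated directly. The sketch below takes it literally as printed; the function name is ours, and the sample inputs (24 bits with an empty spoofing set for Linux 2.2, 32 bits with 0.03% coverage as an upper bound for Solaris with tcp_strong_iss=2) are our reading of the earlier sections, not values taken from the comparison chart.

```python
def isn_security_factor(bit_width: float, coverage_percent: float) -> float:
    """(estimated bit width of main attractor)^2 / (101 - spoof set coverage)."""
    return bit_width ** 2 / (101 - coverage_percent)


print(round(isn_security_factor(24, 0), 2))     # 5.7
print(round(isn_security_factor(32, 0.03), 2))  # 10.14
```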

Results are presented below (red line - assumed attack feasibility limit):

[comparison]

NOTE: The scale is proportional to log(t), where "t" is the time required to perform an attack against a specific implementation. Cisco IOS results do not take the clearly visible time-dependency patterns into account.

Solaris with tcp_strong_iss set to 2 (non-default configuration) wins the competition. At the same time, standard Solaris configurations do not get really impressive scores. Linux 2.2.1x gets a relatively good score. Other systems received mediocre or very bad scores.

Of course the good news is that a lot of this is dependent upon certain conditions. For an attack to work, you have to have the following:

- An open TCP port to get an initial ISN, or a sniffed ISN. If there are no open ports reachable by the attacker, then the ISN would have to be guessed. Knowing a current ISN would certainly help tailor an attack to increase the odds of its success. This is still very possible, but the level of intelligence required to make the attack in the first place (such as sniffing and probing) would probably point to an alternate attack method that would be far more likely to succeed. For example, if the target is running a vulnerable version of BIND the attacker would probably not try to blind spoof the SMTP port. Or if an attacker could sniff an ISN, they might be able to sniff a needed password just as easily.

- Routing that will allow forged packets to reach the target. Proper firewall and router rules can prevent forged packets from reaching the target. This is not to say spoofing is impossible. For example, traffic into a DMZ may be allowed via ports 21, 22, 25, 53, and 80 from anywhere, with port 80 traffic directed at one particular IIS system coming from a VPN connection. A blind spoof to port 80 to execute the RDS exploit could lead to a NetCat listener on port 53 on that IIS host.

- A service listening on the target (or client / server application, in case the attacker is going to compromise existing connection integrity) that can be properly exploited via a blind spoof. Outside of denial of service and potentially exploiting trust relationships, many attacks against listening services require more than just guessing the ISN: they require the manipulation of several potentially pseudo-random factors. If the target doesn't really have an exploitable service running, then your options (ISN spoofing or otherwise) can be extremely limited.

Finally, we would like to remind the reader that there is no output analysis method that can prove ISN generation is done in a secure and random way. Our technique can detect numerous types of common failures, but this does not mean our survey winners do not have other problems with their ISN generators.
 

16) Links, references, documentation

[1] TCP/IP networking: http://msdn.microsoft.com/library/backgrnd/html/tcpipintro.htm

[2] TCP/IP spoofing, ISN weakness: http://www.sans.org/infosecFAQ/threats/intro_spoofing.htm

   Harris, B. and Hunt, R., "TCP/IP security threats and attack methods",
   Computer Communications, vol. 22, no. 10, 25 June 1999, pp.
   885-897.

   Guha, B. and Mukherjee, B., "Network Security via Reverse Engineering
   of TCP Code: Vulnerability Analysis and Proposed Solutions",
   IEEE Network, vol. 11, no. 4, July/August 1997, pp. 40-48.

[3] Phase-space reconstruction: http://www.mpipks-dresden.mpg.de/~tisean/TISEAN_2.1/docs/chaospaper/node6.html

   Hegger, R., Kantz, H., and Schreiber, T., "Practical implementation of
   nonlinear time series methods: The TISEAN package", Chaos, vol.
   9, no. 2, June 1999, pp. 413-435.

   Schreiber, T. and Schmitz, A., "Surrogate time series", Physica D,
   vol. 142, no. 3-4, 15 August 2000, pp. 346-382.

Tool sources: (not available at the moment)
Used data samples: (not available at the moment)


17) Credits

Some of the data sets (Solaris, AIX, HPUX) contributed by skyper <skyper@segfault.net>
Windows 95/98 data contributed by noah williamsson <noah@hd.se>
Huge portions of data contributed by Mark Loveless of RAZOR <loveless@razor.bindview.com>
Cisco and BSDI data contributed by (anonymous).
Solaris tcp_strong_iss and other data contributed by Elias Levy <aleph1@securityfocus.com>
Router data sets contributed by Piotr Zurawski <szur@ix.pl>
Some of HPUX samples contributed by Solar Designer <solar@openwall.com>

Additional thanks to Matt Power and Dave Mann for review and suggestions, and to other RAZOR Team members.