The Wall Street Journal reported yesterday that "A Russian 'cyber-militia' has effectively knocked the central Asian republic of Kyrgyzstan offline in recent days". See the full article here: http://online.wsj.com/article/SB123310906904622741.html
It's quite hard to find traces of DoS attacks on the internet if they only swamp individual websites, but the article suggests that total connectivity to all of Kyrgyzstan has been severely impaired. This led me to wonder whether signs of the attack are visible in the global routing table. This can happen if the attack is so severe that the edge routers connecting ISPs to one another can no longer exchange routing data, because the data link between them is fully saturated with attack traffic.
To start the investigation, it's necessary to find the two main internet service providers the article talks about. According to RIPE membership information (http://www.ripe.net/membership/indices/KG.html), the two largest Kyrgyzstan-based ISPs are ElCat LTD and Kyrgyztelecom JSC.
The other Kyrgyz ISPs I found along the way are:
AS12764, AKNET, upstream Kyrgyztelecom
AS48271, City Telecom, upstream ElCat
AS39214, Comintech, upstreams DE-INSAT and Transfer
AS41750, Netcom, upstreams Kyrgyztelecom and BMCKG JSC BiMoCom Ltd
AS29061, Saimanet Telecommunications, upstream Kazakhtelecom
AS34639, Totel LTD, upstream ElCat
AS25035, Transfer LTD, upstream Saimanet
AS42210, BMCKG JSC BiMoCom Ltd, upstreams Saimanet and Kyrgyztelecom
AS41329, SkyMobile LTD, upstreams Saimanet and ElCat
As you can see, there are four operators connecting the rest of them to three operators outside Kyrgyzstan. To determine where data is most likely flowing, I conducted a very basic statistical analysis of the situation. Robotex provides information on the number of unique IP addresses allocated to each of the operators, as well as peer and upstream distribution. Naturally this doesn't reveal actual traffic volumes or traffic flows, but rather gives the most probable path for any given traffic source. This helps in determining through which path traffic most likely flows into Kyrgyzstan.
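As a rough illustration of this kind of estimate, here is a minimal Python sketch. It uses only the upstream relations listed above. The even split of an operator's addresses across its upstreams is my simplifying assumption, and the address counts in the example are placeholders, not the Robotex figures.

```python
from collections import defaultdict

# Upstream relations as listed above (operator -> its upstreams).
# "BiMoCom" is shorthand for BMCKG JSC BiMoCom Ltd.
UPSTREAMS = {
    "AKNET": ["Kyrgyztelecom"],
    "City Telecom": ["ElCat"],
    "Comintech": ["DE-INSAT", "Transfer"],
    "Netcom": ["Kyrgyztelecom", "BiMoCom"],
    "Saimanet": ["Kazakhtelecom"],
    "Totel": ["ElCat"],
    "Transfer": ["Saimanet"],
    "BiMoCom": ["Saimanet", "Kyrgyztelecom"],
    "SkyMobile": ["Saimanet", "ElCat"],
}


def cumulative_addresses(own_counts, upstreams):
    """Own address count plus the share inherited from downstream operators.

    Assumption: an operator with several upstreams splits its addresses
    evenly between them. The relation graph must not contain cycles.
    """
    downstreams = defaultdict(list)
    for operator, ups in upstreams.items():
        for up in ups:
            downstreams[up].append(operator)

    operators = set(upstreams) | set(downstreams)
    memo = {}

    def total(operator):
        if operator not in memo:
            amount = float(own_counts.get(operator, 0))
            for child in downstreams.get(operator, []):
                amount += total(child) / len(upstreams[child])
            memo[operator] = amount
        return memo[operator]

    return {operator: total(operator) for operator in operators}


if __name__ == "__main__":
    # Placeholder counts: every listed operator gets 100 addresses.
    placeholder = {name: 100 for name in UPSTREAMS}
    totals = cumulative_addresses(placeholder, UPSTREAMS)
    for name, amount in sorted(totals.items(), key=lambda item: -item[1]):
        print(f"{name}: {amount:.0f}")
```

With real per-operator counts in place of the placeholders, the operators with the largest totals are the ones most traffic would be expected to pass through.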
Here is a visualization of the relations between the operators, with upstream distribution and cumulative IP address counts included:
As the figure shows, the original assumption that ElCat LTD and Kyrgyztelecom JSC are the two main internet service providers seems correct. Saimanet comes quite close, though, so it can be included in the analysis as well.
To limit the number of IP address prefixes to be analyzed, I focused on the prefixes which contain the corporate websites of the three internet service providers. This information can be found by comparing the DNS records of the websites to whois information in the RIPE database. The prefixes are:
Saimanet, 22.214.171.124/20
ElCat, 188.8.131.52/24
Kyrgyztelecom, 220.127.116.11/20
Now, to find out if the BGP advertisements for these prefixes have changed, one has to dig into historical routing data. The method of choice this time was to use BGPlay, a Java-based route update visualizer provided by the RIPE Routing Information Service.
The prefix for Saimanet, 22.214.171.124/20, has been stable for some parts of the internet during this incident, as have the other prefixes. RRC05 in Vienna, RRC10 in Milan and RRC12 in Frankfurt show several path changes which involve AS9002 (RETN) and AS6854 (Synterra), both large Russian ISPs. The AS-paths seen present them as transit ASes, so this could indicate traffic engineering to protect their own networks from the attack traffic, or blackholing requests from downstream.
The most interesting changes for 126.96.36.199/20 took place on 27.1.2009 just after midnight (UTC). RRC10, RRC11 in NYC and RRC14 in Palo Alto show routes disappearing completely for a couple of minutes. Shortly before this happens, most of the paths seen have shifted to use AS20485 (Transtelecom) as the sole upstream for Kazakhtelecom. Several advertisements first disappear from AS6854 and then re-appear from AS20485, before the observed outage. In RRC14 the outage is the most dramatic, lasting up to 7 hours for some operators. RRC12 also supports this, but the path changes are not as dramatic. Another common denominator is that after the outage AS20485 is no longer seen as an upstream of Kazakhtelecom.
The prefix of ElCat, 188.8.131.52/24, shows its first interesting changes in advertisements seen by RRC03 in Amsterdam on 22.1.2009, when the aforementioned AS6854 is no longer seen as an upstream for Kazakhtelecom at 20:35:45 UTC (184.108.40.206/20 showed similar changes around the same time). It then returns within 2 hours.
Just after this, starting at 23:06:41 at RRC12 and 23:42:36 at RRC03, AS3216 (SOVAM) carries no routes to AS8449 (ElCat), leaving AS12997 as the only upstream for AS8449. This doesn't last long, as the paths return starting at 23:11:36 at RRC12 and 23:44:21 at RRC03. The same change in upstreams for AS8449 can be seen again in RRC12 on 22.1.2009 between 23:06:45 and 23:54:29 and on 30.1.2009 between 12:41:55 and 12:47:18. Twice during the last change, AS3216 carries no routes to AS8449 for a couple of minutes. The short duration of these events might indicate link saturation between the operators.
In the data from RRC03, the change can be seen reversed (AS12997 -> AS3216) on 26.1.2009 between 02:19:24 and 02:28:43. The AS-paths seen from RRC03 are longer than the ones from RRC12, which I believe is the reason why the two upstream changes seen from RRC12 are not seen here.
Data from RRC10 supports the observations above. It also shows several changes in paths between the two upstreams, which I believe is more connected with the usual convergence changes in the internet. Also, the number of paths seen from there is more limited than from RRC03 and RRC12, so the changes don't tell as much.
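The upstream changes described above were read by eye from BGPlay, but the same idea can be sketched in a few lines of Python. This is only an illustration: the function names and the update format are my own, the timestamps in the example are labels rather than observed times, and the peer AS 64500 is a documentation-reserved number, not a real collector peer.

```python
def upstream_of(as_path, origin):
    """Return the AS directly before `origin` in an AS path, or None.

    Repeated AS numbers (path prepending) are collapsed first.
    """
    collapsed = []
    for asn in as_path:
        if not collapsed or collapsed[-1] != asn:
            collapsed.append(asn)
    if len(collapsed) < 2 or collapsed[-1] != origin:
        return None
    return collapsed[-2]


def upstream_changes(updates, origin):
    """List (timestamp, old_upstream, new_upstream) for one peer's updates.

    `updates` is a time-ordered sequence of (timestamp, as_path) pairs,
    where as_path is None for a withdrawal.
    """
    changes = []
    current = None
    for timestamp, as_path in updates:
        new = upstream_of(as_path, origin) if as_path else None
        if new != current:
            changes.append((timestamp, current, new))
            current = new
    return changes


if __name__ == "__main__":
    # Illustrative updates for a prefix originated by AS8449.
    updates = [
        ("t0", [64500, 3216, 8449]),
        ("t1", [64500, 12997, 8449]),
        ("t2", None),
        ("t3", [64500, 3216, 3216, 8449]),
    ]
    for change in upstream_changes(updates, origin=8449):
        print(change)
```

A withdrawal shows up as a change to None, so short outages and upstream switches appear in the same list.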
So why are changes between upstream operators important, and why don't all RRCs see them? For starters, the DoS attacks I have seen lately often originate from a limited number of sources. This is because the method of choice today for bringing down a site or an ISP is a UDP flood, i.e. sending as many small UDP packets as possible to the target. UDP is a good protocol for this, as it doesn't place a strain on the sending host and it uses bandwidth more efficiently. This enables the attacker to reach the goal with just a couple of well-connected computers. As said, edge routers connecting ISPs to one another can no longer exchange routing data if the data link between them is fully saturated with attack traffic. In other words, these observed changes between the upstreams might be indications of heavy attack traffic.
Secondly, by design, routing convergence in BGP takes time. The observed changes between upstreams took just a couple of minutes, so it is very likely that the changes just don't propagate very far in such a short period of time. This is true especially when connectivity is lost due to link saturation and not traffic engineering. Traffic engineering often involves sending out advertisements as soon as possible, whereas when a BGP session with a neighboring router drops, dampening often comes into play. Changes in BGP paths consume valuable CPU processing time on the routers, so network engineers don't want to waste it. Flapping links cause flapping routes and generate excessive advertisements, which in turn waste CPU time. Therefore the advertisements about path changes are dampened in case the session returns quickly, especially if the router has an alternate path it can use during downtime. Also, advertisements spread gradually along the AS-path, so the further away you are from the source, the longer it takes for the advertisement to reach your router.
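One common form of dampening is the penalty-based scheme described in RFC 2439: each flap adds a penalty to the route, the penalty decays exponentially, and the route is suppressed while the penalty is too high. The sketch below is a simplified model of that idea, not any router's actual implementation. The parameter values (1000 per flap, suppress at 2000, reuse at 750, 15 minute half-life) are commonly cited defaults that I am assuming here, and the function names are my own.

```python
import math


def dampen(flap_times, per_flap=1000.0, suppress=2000.0,
           reuse=750.0, half_life=15.0):
    """Track the dampening penalty over a sorted list of flap times (minutes).

    Returns a list of (time, penalty_after_flap, suppressed) tuples.
    """
    penalty = 0.0
    last = None
    suppressed = False
    history = []
    for t in flap_times:
        if last is not None:
            # Exponential decay since the previous flap.
            penalty *= 0.5 ** ((t - last) / half_life)
            if suppressed and penalty < reuse:
                suppressed = False
        penalty += per_flap
        if penalty >= suppress:
            suppressed = True
        last = t
        history.append((t, penalty, suppressed))
    return history


def minutes_until_reuse(penalty, reuse=750.0, half_life=15.0):
    """Minutes of quiet needed for `penalty` to decay to the reuse limit."""
    if penalty <= reuse:
        return 0.0
    return half_life * math.log2(penalty / reuse)


if __name__ == "__main__":
    # Three flaps one minute apart: the third pushes the route into suppression.
    for t, penalty, suppressed in dampen([0.0, 1.0, 2.0]):
        print(f"t={t:.0f} min penalty={penalty:.0f} suppressed={suppressed}")
    print(f"quiet time needed from a penalty of 3000: "
          f"{minutes_until_reuse(3000.0):.0f} min")
```

Under these assumed parameters, a link that flaps a few times within minutes can keep a route suppressed for far longer than the flapping itself lasted, which is one way short events fail to show up cleanly at distant collectors.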
Last, but not least, is the prefix of the largest ISP in Kyrgyzstan: Kyrgyztelecom, 220.127.116.11/20. It starts off with a bang by disappearing from the routing table on 18.1.2009 at around 16:50 UTC and returning about 5 minutes later. This can be seen on RRCs 00, 03 and 04, and not so dramatically on RRC13, as it's further away from the source. This event also demonstrates the need for redundancy, as Kyrgyztelecom has only one upstream. A complete outage might have been avoided by using several upstreams.
The disappearance happens again on 21.1.2009 at around 2:35. This time we see the same phenomenon as with Saimanet: paths first change from AS3216 to AS20485, then disappear before returning to AS3216. With some operators it takes until 20:35 the next day before the routes return. These two events suggest that Kazakhtelecom uses AS20485 only as a backup transit provider. It seems the connection between them just can't handle the load of the attack.
The events on 25.1.2009 around 18:00 look like traffic engineering. AS9198 (Kazakhtelecom) distributes traffic between upstreams and AS20485 appears to be in trouble again. Starting from 02:15, the following night is a mess. Note that ElCat was shifting upstream traffic from AS12997 to AS3216 at the same time, most likely as a result of the following events. Paths first move from one upstream to another before disappearing again at around 2:21. The first paths return about 7 minutes later. The prefix goes mostly off the air for a few minutes again at around 3:15, 3:37, 5:05 and 5:22. By 06:00 the paths have pretty much converged to what they were at the end of January 2009.
Disclaimer: This brief look into various public tools is by no means conclusive. I have no real data or timelines on the attacks themselves, so I can only speculate based on what I see in BGPlay. Any feedback especially on conceptual discrepancies is welcome.