Network Outage due to DDoS attack from 1pm to 5pm.
03 May 2011
Today's network outage was caused by a sophisticated attack from systems in China, Russia and Romania, which locked up much of the core network.
We have resolved the issue and taken steps to make sure it will not cause similar problems in the future.
As you are probably aware, hacking activity has escalated in the weeks since the assaults on Libya, and now the Bin Laden incident. Major and minor networks have been affected, some of which you may have read about in the news.
Today's break-in attempt combined Remote Desktop (RDP) attacks against all of our IP addresses with spoofed IP addresses and an attempted packet-flood DDoS from several different countries and systems. This combination made the problem very hard to diagnose.
In the initial phases of the attack, the port scanning and packet flooding made the system generally sluggish. Spoofed MAC addresses then filled up the ARP tables, making switch performance very poor. A previously undetected and otherwise insignificant flaw in one of the network configurations compounded the problem by triggering heavy packet loss. Finally, a fire alarm was set off at the data centre (not in our area, which has automatic fire suppression), and our engineers were evacuated. This happened just as we had started on a resolution, and it prolonged the outage by an hour.
Not a good day.
We are one of the few ISPs to firewall RDP access, which I know has been seen as irksome in the past. Today's activity proved that it is a wise policy. The visible effects of the attack would probably have been smaller if it had succeeded, but all the errors were by-products of its failure, and no security breach occurred.
We are reasonably confident that we have taken effective measures:
1. All the areas from which the attacks originated have been blocked.
2. We have re-configured the network to be more tolerant of many small failed packets.
3. We have reworked the switch topology so that ARP tables filling up has less of a knock-on effect.
4. Our ARP tables need only be 1024 entries long, but we were using 8192. This has been extended further.
5. The connection flaw which was revealed has been eliminated.
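For anyone curious why table size matters in points 3 and 4, here is a rough sketch in Python of a fixed-size address table being flooded with spoofed entries. It is an illustration only, not our switch firmware: the AddressTable class, the simulate function, the oldest-entry-first eviction policy and the host counts are all assumptions made for the example.

```python
from collections import OrderedDict


class AddressTable:
    """Hypothetical fixed-capacity address table.

    Assumption: when the table is full, the least recently learned
    entry is evicted. Real switches and routers may behave differently.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # key -> MAC address

    def learn(self, key, mac):
        # Refresh an existing entry, or add a new one.
        if key in self.entries:
            self.entries.move_to_end(key)
        self.entries[key] = mac
        # Evict the oldest entry once capacity is exceeded.
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)

    def lookup(self, key):
        return self.entries.get(key)


def simulate(capacity, legit_hosts, spoofed_entries):
    """Return how many legitimate hosts are still in the table after a flood."""
    table = AddressTable(capacity)
    # Legitimate hosts are learned first (made-up addresses).
    for i in range(legit_hosts):
        table.learn(f"host-{i}", f"02:00:00:00:00:{i:02x}")
    # Then the attacker injects entries with made-up spoofed addresses.
    for j in range(spoofed_entries):
        mac = f"06:00:00:00:{(j >> 8) & 0xff:02x}:{j & 0xff:02x}"
        table.learn(f"spoofed-{j}", mac)
    return sum(
        1 for i in range(legit_hosts) if table.lookup(f"host-{i}") is not None
    )


if __name__ == "__main__":
    for spoofed in (500, 950, 2000):
        print(spoofed, simulate(1024, 100, spoofed))
```

With a 1024-entry table and 100 legitimate hosts, 500 spoofed entries do no harm, 950 push out 26 real hosts, and 2000 push out all of them. Each evicted host has to be re-learned, which is the kind of churn that drags performance down, and a larger table simply takes longer to fill.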
Apologies for the trouble this has caused; I shall apologise because I don't believe the hackers would care to reveal themselves enough to do so. It would be rather risky for them!