J3di Try not. Do... or do not. There is no try.


8 Feb/10

UK Server Crash

The VMware ESXi bare-metal OS that runs the UK server VM crashed at around 13:00 GMT today.

After establishing that it was the server that had crashed, and not RapidSwitch's network, the server was power cycled and came back up as expected :)

26 Sep/09

Updates from RapidSwitch Re:UK Server Downtime

RapidSwitch sent the following update in the small hours of the morning:

We were provided a solution by Cisco TAC at approximately 01:40 UK time. This was the third solution provided and we are pleased to say that it seems to have been effective. Network capacity and performance is running at normal levels, and over 99% of clients are back online.

And just after midday today:

Please be advised that we are now able to officially provide an all-clear notification to all clients regarding the network disturbances, outages, interruptions to service and emergency maintenance experienced by clients in RSH-North since Midday, Thursday 24th September. To clarify, normal service has been resumed for the large majority of clients since before 3am however we are now able to issue a full all clear.

Hopefully this means that the connectivity problems are over for the UK server :)

We will add the UK server back to the DNS round-robin, and stop redirecting the UK server's DNS records, later this evening (assuming there is no more downtime).

Update @ 27/09 01:00: We have reverted the changes to the DNS records; they are now back to how they were before the downtime started.

25 Sep/09

UK Server Downtime

Because RapidSwitch ran into unforeseen problems with their router software upgrade last night, there have been long stretches over the last 24 hours when the UK server has been unreachable.

We have set up a secondary/backup DNS service which will store a copy of the main DNS server's records, so that you are still able to resolve the US and DE servers' IP addresses during any future downtime.

In addition, we have removed the UK server from the DNS round-robin for irc.J3di.org, and are also redirecting the records for the UK server to point at the US server until this crisis has abated.
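If you would like to check for yourself which servers a round-robin name currently points at, something along these lines should work. This is only a rough sketch: the function names are made up for illustration, port 6667 is assumed as the usual IRC port, and the resolver is passed in as a parameter so the logic can be tried without touching the network.

```python
import socket


def round_robin_addresses(hostname, port=6667, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IP addresses a hostname resolves to.

    port=6667 is an assumption (the conventional IRC port); resolver is
    injectable so the function can be exercised with canned data.
    """
    results = resolver(hostname, port, proto=socket.IPPROTO_TCP)
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address as a string.
    return sorted({sockaddr[0] for *_, sockaddr in results})


def still_in_rotation(hostname, server_ip, **kwargs):
    """Return True if server_ip is among the addresses hostname resolves to."""
    return server_ip in round_robin_addresses(hostname, **kwargs)
```

Calling `round_robin_addresses("irc.J3di.org")` lists whatever addresses the name resolves to at that moment; a server that has been pulled from the rotation should no longer appear once cached records have expired.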

For more frequent (and always available) updates, you might like to follow @J3di_IRC, @Fr3d_org or even @RapidSwitch on Twitter.

We'd also like to pass on RapidSwitch's profuse apologies for the disruption caused by this downtime.

24 Sep/09

Emergency Maintenance @ RS – Will Affect UK Server

We have just been notified by RapidSwitch, whose datacenter Fr3d uses to host the UK server, that it is likely that there will be a loss of connectivity on their network this evening at around 23:00 UK time.

This is due to RapidSwitch having to perform emergency network maintenance:

The maintenance is to perform an emergency upgrade of Cisco software. [...] The Cisco TAC team have diagnosed a fault with the software on the router in the form of a memory leak. Cisco has supplied us with a new version of the software for the router which will fix the memory leak and slow performance.

- RapidSwitch Support

They say that this maintenance should take no longer than 45 minutes, and that they will do all they can to speed everything up and reduce the maintenance time.

To maintain network cohesion (as much as possible), the DE server will automatically fail over its link to the US server if it loses its connection to the UK server and cannot reconnect.
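The failover behaviour described above boils down to "keep trying the preferred uplink, then move on to the next one". The sketch below illustrates that logic only; it is not the actual InspIRCd configuration, and `choose_uplink`, `can_connect` and the retry count of 3 are all hypothetical names and values picked for the example.

```python
def choose_uplink(uplinks, can_connect, attempts_per_uplink=3):
    """Return the first uplink that accepts a connection, or None.

    uplinks is an ordered list (preferred first); can_connect is a
    caller-supplied probe returning True on success. Both are
    illustrative assumptions, not part of any real server's API.
    """
    for uplink in uplinks:
        # Retry the current uplink a few times before giving up on it.
        for _ in range(attempts_per_uplink):
            if can_connect(uplink):
                return uplink
    # Every uplink failed every attempt.
    return None
```

With `uplinks=["uk", "us"]`, the DE server would only fall back to the US link after the UK link has failed every attempt.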

Update @ 01:50 25/09: RapidSwitch seem to be back online now. No word from them yet as to why the maintenance took so long; we will update this post when we know more.

22 Sep/09

Upgrading to InspIRCd 1.2.0 “Stable”

InspIRCd 1.2.0 has been released as "stable", and we will therefore be upgrading to it over the next day or so.

This release should fix the problems that the US server had been having in July and August.

Due to the US server having a netsplit earlier, we upgraded it a short while ago, and it is now back up.

The UK and DE servers will be upgraded in the small hours of tomorrow morning (tomorrow being 23/9) to reduce the impact on users as much as possible.

The upgrade should take less than 5 minutes; however, things don't always go as planned, so please bear with us if it ends up taking a little longer!

Update @ 02:00: The upgrade went as planned with less than 1 minute of downtime :-)