Monday, November 22, 2010

Removal of secondary mail server mail-relay20.c2internet.net

Due to hardware failure the server mail-relay20.c2internet.net is being removed from service.

This server's only role was to operate as a secondary/backup MX for customers who requested this functionality and who operated their own primary mail servers.

While this type of setup was initially the norm, as the war against spam continues these types of backup servers have been targeted as easy routes in. This gives rise to a few problems:

It's not uncommon for these backup servers to be whitelisted/trusted by the primary server, thus totally defeating any anti-spam techniques the primary is utilising.

A backup server will accept all mail for the domains for which it is configured as the secondary. If, when it forwards that email, the primary server rejects a mailbox as unknown, the backup server will want to send a non-delivery report (NDR). If the original email came from a forged address, these NDRs clog the system further, which puts extra load on the server for no good reason. In the worst case the NDRs are sent to a valid email address that had nothing to do with the original email, at which point the server is generating backscatter, which is every bit as bad as spam.
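The backscatter problem can be illustrated with a small simulation. This is only a sketch under stated assumptions: the mailbox list, the example addresses and the function names are hypothetical, and no real mail server works exactly this way.

```python
from dataclasses import dataclass

# Hypothetical mailboxes that only the primary server knows about.
PRIMARY_MAILBOXES = {"alice@example.org", "bob@example.org"}


@dataclass
class Message:
    envelope_from: str  # claimed sender; a spammer can forge this
    rcpt_to: str


def backup_mx_relay(messages):
    """Accept everything for the domain, then bounce what the primary refuses.

    Returns (delivered, ndrs), where ndrs lists the addresses that are
    sent a non-delivery report.
    """
    delivered, ndrs = [], []
    for msg in messages:
        if msg.rcpt_to in PRIMARY_MAILBOXES:
            delivered.append(msg)
        else:
            # The backup has already accepted the mail, so all it can do
            # is send an NDR to whatever sender address was claimed.
            ndrs.append(msg.envelope_from)
    return delivered, ndrs


def primary_direct(messages):
    """Primary refuses unknown recipients during the SMTP session itself.

    Returns (accepted, rejected); no NDR is generated by the receiving side.
    """
    accepted = [m for m in messages if m.rcpt_to in PRIMARY_MAILBOXES]
    rejected = [m for m in messages if m.rcpt_to not in PRIMARY_MAILBOXES]
    return accepted, rejected
```

With a spam message that forges `victim@example.net` as its sender and targets a mailbox that does not exist, `backup_mx_relay` produces an NDR addressed to the uninvolved victim, whereas `primary_direct` simply refuses the message and sends nothing.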

If the primary mail server were to fail, most sending servers will now quite happily queue email, notify the sender of any delays and generally take care of sending the email again a few minutes later, once the server is back up.
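The queue-and-retry behaviour described above can be sketched as a simple backoff schedule. The intervals used here (a first retry after five minutes, doubling up to four hours, giving up after five days) are illustrative assumptions and not the defaults of any particular mail server.

```python
from datetime import timedelta


def retry_schedule(first_delay=timedelta(minutes=5),
                   max_delay=timedelta(hours=4),
                   give_up_after=timedelta(days=5)):
    """Yield the elapsed time at which each delivery retry is attempted.

    The delay doubles after every failed attempt, capped at max_delay.
    No retry is scheduled beyond give_up_after, at which point a real
    server would return the message to the sender.
    """
    elapsed = timedelta(0)
    delay = first_delay
    while elapsed + delay <= give_up_after:
        elapsed += delay
        yield elapsed
        delay = min(delay * 2, max_delay)


def attempts_during_outage(outage, **kwargs):
    """Return the retry times up to and including the first one at or after the outage ends."""
    tried = []
    for t in retry_schedule(**kwargs):
        tried.append(t)
        if t >= outage:
            break
    return tried
```

For example, `attempts_during_outage(timedelta(hours=2))` lists the retries a sending server would make against a primary that is down for two hours, the last entry being the attempt that finally succeeds.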

With all this in mind, we will shortly be removing all DNS entries for mail-relay20.c2internet.net. Unusually, customers who have been using the service may well see a drop in the amount of incoming spam compared with what they had been used to.

This does not affect customers that have their own secondary mail servers.

DSL Connections via BT

-- 15:15

Fault is now cleared - we will continue to monitor.

Apologies for any inconvenience.



-- 14:26

We're seeing a number of lines coming back up, though as yet have had no notification of this fault being cleared. We will continue to monitor.


-- 13:49

There is an issue affecting a number of tail circuits that are provided over the BT wholesale network. This is affecting a number of ISPs and is not related to anything within our network or anything under our direct control. The issue is being investigated and more information will be posted as soon as it is available.

Wednesday, November 10, 2010

Service Outage report for 10th November 2010

17:32-

We've just had confirmation that the outage was caused by two separate faults in two separate geographic locations: one fault was on the provider's Leeds to Sheffield connection, the other on their Warrington to Birmingham connection.



13:42-

At 10:50am this morning we lost both our west- and east-bound connections from Manchester to London, which partitioned our core network into two. This partitioning would have caused routing issues and, due to the location of name and RADIUS servers within the network, name lookup and xDSL authentication would also have failed.
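The effect of losing both links can be shown with a simple reachability check. This is a minimal sketch: the two-site topology and the link names below are a simplified, hypothetical model, not a description of our actual network.

```python
from collections import deque

# Hypothetical topology: two sites joined by two supposedly diverse links.
LINKS = [
    ("Manchester", "London", "west"),
    ("Manchester", "London", "east"),
]


def surviving(links, failed):
    """Return the links whose name is not in the set of failed links."""
    return [link for link in links if link[2] not in failed]


def reachable(start, links):
    """Breadth-first search over the links that are still up."""
    neighbours = {}
    for a, b, _name in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen
```

With only the `west` link failed, London is still reachable from Manchester. With both `west` and `east` failed, each site can reach only itself, so any service hosted on the far side of the split (name servers or RADIUS, for instance) becomes unreachable.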

Our transit feed out of Manchester was also experiencing problems. As this issue cleared at the same time our connections came back up, it was no doubt down to the same root problem.

With the issue affecting multiple providers, it was clear the problem was not within any equipment under our direct control, nor the outcome of any of our actions within the network.

Our main telephone system is also based out of Manchester; however, when the server went offline it failed over onto the backup analogue PSTN system, with the number of incoming calls obviously proving a challenge.

We're currently in discussion with our network provider for the Manchester to London connections, as these routes should be separate and diverse. Initially they also went via separate providers; however, due to consolidation within the market, one provider has ended up owning both networks. If it transpires that our provider has, without our knowledge or authorisation, joined these pathways then of course action will be taken.

At 12:40pm both connections came back up. With the exception of transit out of Manchester, once the network had re-converged, connections and traffic flows returned to normal. Approximately ten minutes after our connections re-established, transit via our transit provider also re-established.

Our apologies for this outage and the inconvenience.