Thursday, 3 June 2010

Hadoop Cluster Problems

Network Instability

The cluster had intermittent network availability, sometimes it would accept ssh connections only sometimes. The connections only seemed to last a period of 10 minutes before they were disconnected.
It was initailly thought that it may have been something to do with the dhcp server but it turned out that this was not the cause.
One of the symptoms was during a ping to the DNS server one would get a few returns then a few Destination Host Unreachable errors.
It was eventually traced to having two gateways setup in the '/etc/network/interfaces' one on the public network and one on the private network, when this was corrected to having only one public gateway this fixed the problem.

No comments:

Post a Comment

BT Infinity Finally!

I have finally received super-fast broadband and it was enabled on Wednesday morning around 10am. I never noticed till a day later when I lo...