Thursday 4 January 2007

Action! Host-Certificate-wrong alarms on Castor-2 diskservers

Carlos C calls about Host-Certificate-wrong alarms that the operators received this afternoon. This is the first time these alarms have ever fired...

This alarm is generated on Castor diskservers when the host certificate will expire in less than 2 weeks. It is a 'last resort', as the service manager has already been contacted by other means well before that point. In this case, the service manager has not acted (it's Christmas time!), and the expiry date of January 18 is getting close :)
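The two-week rule above can be sketched as a simple date comparison. This is only an illustration, not the actual sensor code: the function name `expires_soon`, the constant `WARNING_WINDOW`, and the assumption that the certificate expires at midnight on January 18 are all made up for the example.

```python
from datetime import datetime, timedelta

# Alarm window taken from the text: less than 2 weeks before expiry.
WARNING_WINDOW = timedelta(days=14)


def expires_soon(not_after, now, window=WARNING_WINDOW):
    """Return True when the certificate expires in less than `window`.

    An already expired certificate also returns True, because the
    remaining time is then negative.
    """
    return not_after - now < window


# Hypothetical values: expiry assumed at midnight on January 18,
# check performed on the afternoon of January 4.
expiry = datetime(2007, 1, 18)
checked_at = datetime(2007, 1, 4, 15, 0)
print(expires_soon(expiry, checked_at))
```

With these assumed times the remaining lifetime is just under 14 days, which would explain why the alarm only started firing this afternoon.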

Thanks to our new host-certificate-manager script (integrated in PrepareInstall) it is simple to fix this:


PrepareInstall --noks --noaims --hostcert lxfsra{060{3,4,5},080{1,2,3,6,7,8}}
wash root@lxfsra060\[3,4,5],lxfsra080\[1-3,6-8] ccm-fetch \; ncm-ncd --co sindes
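Both commands address the same nine diskservers, once with shell brace expansion and once with bracket ranges. As a rough check, the bracket notation can be expanded with a few lines of Python. This is a sketch under the assumption that `[1-3,6-8]` means the inclusive ranges 1 to 3 and 6 to 8; `expand_host_pattern` is a hypothetical helper, not part of wash, and it ignores zero-padding since the patterns here use single digits only.

```python
import re

# One bracket group per pattern, e.g. "lxfsra080[1-3,6-8]".
PATTERN_RE = re.compile(r"([^\[\]]*)\[([^\[\]]+)\]([^\[\]]*)")


def expand_host_pattern(pattern):
    """Expand a single bracket group of numbers and ranges into host names."""
    match = PATTERN_RE.fullmatch(pattern)
    if match is None:
        return [pattern]  # nothing to expand
    prefix, body, suffix = match.groups()
    hosts = []
    for part in body.split(","):
        if "-" in part:
            low, high = part.split("-", 1)
            values = range(int(low), int(high) + 1)  # inclusive range
        else:
            values = [int(part)]
        hosts.extend(f"{prefix}{value}{suffix}" for value in values)
    return hosts


hosts = expand_host_pattern("lxfsra060[3,4,5]") + expand_host_pattern("lxfsra080[1-3,6-8]")
print(len(hosts), hosts)
```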


Even though the alarm was merely a warning, it has a nasty side effect: the nodes are disabled by the CastorFilesystemConfiguration.pl cronjob running every ten minutes on the rmmaster nodes. High time to get rid of this cronjob...

Tuesday 2 January 2007

Action! Mail flood coming from lxfsrk421

Frederic Hemmer forwards:

From: SmtpMonitorSink
Sent: Tuesday, January 02, 2007 8:33 AM
To: exchange-service (Exchange service list)
Subject: CERNMX06: Flood blocked !

CERNMX06:
Flood from: 128.142.169.11 in scope InternalIpOutgoing-ByIp blocked !

This is an automatic information email, do not reply.

It seems that the machine is trying to send mail to byniek.zb@wp.pl, and that this started failing at 7:15 this morning. In /var/log/maillog there are plenty of records like this one:


Jan 2 09:31:57 lxfsrk421 sendmail[3830]: l028Vqt6003828: to=, delay=00:00:05, xdelay=00:00:05, mailer=relay, pri=30478, relay=cernmxlb.cern.ch. [137.138.166.163], dsn=5.7.1, stat=User unknown
Jan 2 09:31:57 lxfsrk421 sendmail[3830]: l028Vqt6003828: l028Vvt6003830: DSN: User unknown
Jan 2 09:31:57 lxfsrk421 sendmail[3830]: l028Vvt6003830: to=, delay=00:00:00, xdelay=00:00:00, mailer=relay, pri=31502, relay=cernmxlb.cern.ch., dsn=4.0.0, stat=Deferred: Connection reset by cernmxlb.cern.ch.
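To get an idea of how many deliveries are being rejected versus deferred, records like these can be tallied by their `dsn=` and `stat=` fields. The following is a minimal sketch, assuming every delivery record ends with `dsn=X.Y.Z, stat=...` as in the excerpt above; `summarize_maillog` and `SAMPLE` are names invented for the example, and the sample lines are shortened copies of the records shown.

```python
import re
from collections import Counter

# Matches the trailing "dsn=5.7.1, stat=User unknown" part of a delivery record.
DSN_RE = re.compile(r"dsn=(\d+\.\d+\.\d+), stat=(.*)$")


def summarize_maillog(lines):
    """Count delivery records per (dsn code, status text) pair."""
    counts = Counter()
    for line in lines:
        match = DSN_RE.search(line.rstrip("\n"))
        if match:  # lines without a dsn= field (e.g. "DSN: ...") are skipped
            counts[(match.group(1), match.group(2))] += 1
    return counts


# Shortened copies of the records quoted above.
SAMPLE = [
    "Jan 2 09:31:57 lxfsrk421 sendmail[3830]: l028Vqt6003828: to=, mailer=relay, dsn=5.7.1, stat=User unknown",
    "Jan 2 09:31:57 lxfsrk421 sendmail[3830]: l028Vqt6003828: l028Vvt6003830: DSN: User unknown",
    "Jan 2 09:31:57 lxfsrk421 sendmail[3830]: l028Vvt6003830: to=, mailer=relay, dsn=4.0.0, stat=Deferred: Connection reset by cernmxlb.cern.ch.",
]

for (dsn, stat), count in summarize_maillog(SAMPLE).items():
    print(count, dsn, stat)
```

On the real file one would pass `open("/var/log/maillog")` instead of `SAMPLE`.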


I stop sendmail on lxfsrk421 at 9:30, and notify PDB.Service@cern.ch.