I hate to be surprised when it is too late to replace a failing hard drive. smartmontools (short for S.M.A.R.T. Monitoring Tools) queries your hard drives for their health status. If you run these checks daily and set up email alerts, you will most likely avoid a bad surprise in the future.
I highly recommend installing smartmontools monitoring and email alerts on any server.
Here is how I have deployed it on EACH one of my servers:
THIS CAN BE INSTALLED ON ANY BARE METAL SERVER. FOR PROXMOX VE, THIS MEANS YOUR HARDWARE NODE.
1. install smartmontools:
aptitude update && aptitude -y install smartmontools
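Before going further, it can be worth confirming that a drive actually has SMART switched on. The helper below is only a sketch: smart_enabled is a name I made up, and it assumes the "SMART support is: Enabled" line that smartctl -i prints for ATA drives.

```shell
# Read `smartctl -i` output on stdin; succeed only if SMART is reported as enabled.
# smart_enabled is a hypothetical helper name, not part of smartmontools.
smart_enabled() {
    grep -q 'SMART support is: *Enabled'
}

# Possible usage (needs root and a real drive, so it is left commented out):
# smartctl -i /dev/sda | smart_enabled || smartctl -s on /dev/sda
```

If the check fails, smartctl -s on asks the drive to enable SMART.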
2. edit default daemon start configuration:
nano /etc/default/smartmontools
uncomment the commented lines and set them as follows:
enable_smart="/dev/sda /dev/sdb /dev/sdc"
start_smartd=yes
smartd_opts="--interval=28800"
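The --interval value is in seconds, so 28800 makes smartd poll the drives every 8 hours (the smartd default is 1800 seconds, i.e. 30 minutes). A tiny sketch of the arithmetic, with interval_hours being a helper name of my own:

```shell
# Convert a smartd --interval value (seconds) into whole hours.
# interval_hours is a hypothetical helper, used here only for illustration.
interval_hours() {
    echo $(( $1 / 3600 ))
}

interval_hours 28800   # prints 8
```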
3. edit smartd.conf (in this example I have 3 SATA drives: sda, sdb, sdc)
nano /etc/smartd.conf
/dev/sda -d sat -a -s L/../../7/04 -m john@smith.com,jack@jill.com
/dev/sdb -d sat -a -s L/../../7/05 -m john@smith.com,jack@jill.com
/dev/sdc -d sat -a -s L/../../7/06 -m john@smith.com,jack@jill.com
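The above example will do the following:
1. run a long self-test on sda at 4am Sunday
2. run a long self-test on sdb at 5am Sunday
3. run a long self-test on sdc at 6am Sunday
An email alert will be sent to john@smith.com and jack@jill.com if there is something wrong.
NOTE about the -s parameter:
The format is T/MM/DD/d/HH (test type, month, day of month, day of week, hour). The second from the last field is the DAY parameter:
Monday is day #1
Tuesday is day #2
...
Sunday is day #7
To run the tests on Saturday instead, use 6 in place of 7.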
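To see how the -s schedule gets matched, here is a rough sketch that builds the T/MM/DD/d/HH string for a given moment and tests a regex against it. It assumes GNU date, and it assumes (as I read the smartd.conf manual) that smartd numbers weekdays 1 = Monday to 7 = Sunday and needs the regex to match the whole string. The function names smartd_timestamp and schedule_matches are mine, not part of smartmontools.

```shell
# Build the string that the -s regex is compared against: T/MM/DD/d/HH.
# `date +%u` numbers weekdays 1 (Monday) to 7 (Sunday).
# $1 = test type letter (L = long, S = short), $2 = date string for GNU date.
smartd_timestamp() {
    date -d "$2" "+$1/%m/%d/%u/%H"
}

# Succeed only if the regex matches the entire timestamp (grep -x).
# $1 = regex as written in smartd.conf, $2 = timestamp string.
schedule_matches() {
    printf '%s\n' "$2" | grep -Eqx -- "$1"
}

ts=$(smartd_timestamp L '2024-03-17 04:30')   # 2024-03-17 was a Sunday
echo "$ts"                                    # prints L/03/17/7/04

schedule_matches 'L/../../7/04' "$ts" && echo "long test due"
schedule_matches 'L/../../7/4'  "$ts" || echo "one-digit hour does not match"
```

The last line is why I would write the hour with two digits: under the whole-string assumption, a pattern ending in a single digit can never match a two-digit hour field.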
4. restart smartmontools
/etc/init.d/smartmontools restart
5. check current HEALTH status:
smartctl -H /dev/sda
smartctl -H /dev/sdb
smartctl -H /dev/sdc
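If you want to check every drive in one go, something like the sketch below could work. It only looks for the result lines I know smartctl -H prints ("test result: PASSED" on ATA drives, "SMART Health Status: OK" on SCSI/SAS), and health_ok is a helper name I invented.

```shell
# Read `smartctl -H` output on stdin; succeed only if the drive reports healthy.
# health_ok is a hypothetical helper name, not part of smartmontools.
health_ok() {
    grep -Eq 'test result: PASSED|SMART Health Status: OK'
}

# Possible usage on the three example drives (needs root, left commented out):
# for d in /dev/sda /dev/sdb /dev/sdc; do
#     smartctl -H "$d" | health_ok || echo "WARNING: $d did not report healthy"
# done
```

Anything other than a clear PASSED or OK is treated as a warning, which errs on the side of looking at the drive by hand.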
DONE!