How to Add RAID Monitoring to Zabbix Agent
On Zabbix
In this case, we need to add RAID monitoring so that we are notified if there is an error with the RAID. Note: I'm using MegaCli for this.
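- Log in to the Zabbix server web interface
- Add a new template named Template_LSI_RAID_Active
- Add a new application to it named RAID
- Add an item whose key is raid.lsimegaraid.numoptimallds (it must match the key in the agent configuration below)
- Add a trigger that fires when the item value differs from the macro {$EXPECTED_OPTIMAL_LDS}
Go back to the template list and select your new LSI template.
- Add the macro {$EXPECTED_OPTIMAL_LDS} and set it to the number of logical drives you expect to be Optimal (1 in this example)
*You should now have a new template, and it can be reused for other hosts.
- Link your host to the LSI template; the host should now show new triggers for the RAID status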
- Log in to your Linux agent host
- Edit the agent configuration file
nano /etc/zabbix/zabbix_agentd.conf
- Add this line to the end of the configuration file
UserParameter=raid.lsimegaraid.numoptimallds,sudo /opt/MegaRAID/MegaCli/MegaCli64 -LDInfo -Lall -aALL -NoLog | grep '^State[[:space:]]\+:[[:space:]]\\*Optimal' | wc -l
- Save the file and restart the agent
- That's it, the setup is done.
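Before relying on the trigger, it can help to check that the agent actually resolves the new key. The sketch below uses zabbix_agentd -t, which evaluates a single item locally and prints the result. The user name "zabbix" and the sudoers rule shown in the comment are assumptions about a typical install, not something taken from this setup; adjust them to your system.

```shell
#!/bin/sh
# Item key defined by the UserParameter line in zabbix_agentd.conf.
KEY="raid.lsimegaraid.numoptimallds"

# Assumption: the agent runs as an unprivileged user named "zabbix".
# Because the UserParameter command calls sudo, that user needs a
# passwordless rule, for example in a file under /etc/sudoers.d/:
#   zabbix ALL=(root) NOPASSWD: /opt/MegaRAID/MegaCli/MegaCli64

check_raid_item() {
    if command -v zabbix_agentd >/dev/null 2>&1; then
        # -t makes the agent binary evaluate one item and print its value.
        zabbix_agentd -t "$KEY"
    else
        echo "zabbix_agentd not found in PATH; run this on the monitored host"
        return 1
    fi
}

check_raid_item || true
```

If the printed value is empty or an error instead of a number, the usual suspects are the sudo permission or a typo in the key.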
Let me explain the new line in the .conf file:
- UserParameter=raid.lsimegaraid.numoptimallds = defines a custom item key on the agent. When the Zabbix server requests the key raid.lsimegaraid.numoptimallds (your LSI key), the agent runs the command after the comma and returns its output.
- sudo /opt/MegaRAID/MegaCli/MegaCli64 -LDInfo -Lall -aALL -NoLog = the command that reports the status of the RAID logical drives. You can run it on the console:
[root@linux ~]# /opt/MegaRAID/MegaCli/MegaCli64 -LDInfo -Lall -aALL -NoLog
Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name :
RAID Level : Primary-1, Secondary-0, RAID Level Qualifier-0
Size : 456.175 GB
Mirror Data : 456.175 GB
State : Optimal
Strip Size : 64 KB
Number Of Drives : 2
Span Depth : 1
Default Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy : Disk's Default
Encryption Type : None
Is VD Cached: No
- grep '^State[[:space:]]\+:[[:space:]]\\*Optimal' = keeps only the lines of the above output that start with "State" and report "Optimal". The pattern is a regular expression.
- wc -l = counts the lines that grep found.
If we combine all of those commands, we get this:
[root@linux ~]# /opt/MegaRAID/MegaCli/MegaCli64 -LDInfo -Lall -aALL -NoLog | grep '^State[[:space:]]\+:[[:space:]]\\*Optimal' | wc -l
1
That means the RAID check command produced 1 line that says "Optimal", and wc -l reports that count as a number.
From this we conclude that our RAID has no error. Got it?
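The counting pipeline can be tried without a RAID controller. The sketch below feeds it sample text modelled on the MegaCli output shown earlier. The second virtual drive and its Degraded state are hypothetical, added only to show a non-matching line, and the regular expression is a simplified variant of the one in the UserParameter line.

```shell
#!/bin/sh
# Hypothetical sample: two virtual drives, one Optimal and one Degraded.
sample_output='Virtual Drive: 0 (Target Id: 0)
State               : Optimal
Virtual Drive: 1 (Target Id: 1)
State               : Degraded'

# Count the "State : Optimal" lines read from standard input.
count_optimal() {
    # grep exits non-zero when nothing matches; "|| true" keeps that from
    # being treated as a failure, since only the line count matters here.
    { grep '^State[[:space:]]*:[[:space:]]*Optimal' || true; } | wc -l | tr -d ' '
}

printf '%s\n' "$sample_output" | count_optimal   # prints 1
```

With this sample, an expected value of 2 would not match the result of 1, which is exactly the situation the trigger is meant to catch.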
Here is the process:
- The agent sends the data
- The Zabbix server receives the value "1" and compares it with the macro {$EXPECTED_OPTIMAL_LDS}
*If the RAID has an error, the item returns "0" instead of "1", because wc -l doesn't find any "Optimal" line.
Because "0" doesn't match the macro value, Zabbix raises an alert about it.
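The comparison described above can be sketched in a few lines of shell. This only illustrates the logic; it is not Zabbix trigger syntax, and the function name raid_state is made up for the example.

```shell
#!/bin/sh
# raid_state OBSERVED EXPECTED
# OBSERVED plays the role of the item value sent by the agent,
# EXPECTED plays the role of the macro {$EXPECTED_OPTIMAL_LDS}.
raid_state() {
    if [ "$1" -eq "$2" ]; then
        echo "OK"
    else
        echo "PROBLEM"
    fi
}

raid_state 1 1   # prints OK
raid_state 0 1   # prints PROBLEM
```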
I hope my explanation makes sense.
Cheers!
Are there any graphs, etc. for this?
Hi,
There should be a graph for this, but it will only show the values 0 and 1.