Monitoring your HBase cluster with a Raspberry Pi
Complex
What we want to monitor
- HBase Master
- HBase Region Servers
- Hadoop NameNodes
- Hadoop DataNodes
- Hadoop JobTracker
JMX monitoring
- The HBase Master reports dead region servers
- If you cannot connect to the Master to get that information, you know the Master is dead (account for the stand-by Master)
- The Hadoop NameNode reports dead DataNodes
- If you cannot connect to the NameNode to get DataNode information, the NameNode is dead
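The checks above can be sketched in Python against the JSON view of JMX that Hadoop and HBase daemons serve over HTTP at `/jmx`. This is a stand-in for illustration only: the health check described in this deck is a Java program, and its code is not shown. The bean name, the attribute name, and the `;` separator below are assumptions that differ between versions, so compare them with your own cluster's `/jmx` output. `dead_region_servers` and `fetch_jmx` are hypothetical helper names.

```python
import json
import urllib.request

# Assumed bean and attribute names; check your own cluster's /jmx output.
MASTER_BEAN = "Hadoop:service=HBase,name=Master,sub=Server"
DEAD_SERVERS_ATTR = "tag.deadRegionServers"


def dead_region_servers(jmx_payload):
    """Return the dead region servers listed in a parsed /jmx payload.

    Returns None when the Master bean is missing, which the caller can
    treat the same way as not being able to reach the Master.
    """
    for bean in jmx_payload.get("beans", []):
        if bean.get("name") == MASTER_BEAN:
            raw = bean.get(DEAD_SERVERS_ATTR, "")
            # Assumed format: server names separated by ';'.
            return [name for name in raw.split(";") if name]
    return None


def fetch_jmx(url, timeout=10):
    """Fetch and parse a /jmx page; return None on any failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return json.load(resp)
    except Exception:
        # Unreachable host, HTTP error, or invalid JSON: treat as dead.
        return None
```

A caller would treat `fetch_jmx(...)` returning `None` as "the daemon is dead", and a non-empty list from `dead_region_servers` as "the Master is up but some region servers are not".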
Jenkins + Java
- A Jenkins job runs periodically (e.g. every 15 min)
- The job invokes a Java program that evaluates the health of the cluster through JMX
- The job also applies some other health checks (e.g. it requests the TaskTracker page to make sure it is up)
- Sends additional alerts
- Email alerts to individuals
- IRC notifications in our #hbase-medics channel
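One way a job like this can report health is to run every check and exit non-zero if any of them fails, since Jenkins marks a build step as failed when its command exits with a non-zero status. The sketch below is an assumption about how such a job could be wired, not the job used in this deck; `run_checks` and `main` are hypothetical names, and the checks themselves are passed in as plain callables.

```python
import sys


def run_checks(checks):
    """Run named health checks and return the names of the ones that failed.

    `checks` maps a check name to a zero-argument callable that returns
    True when healthy. A check that raises is counted as failed.
    """
    failed = []
    for name, check in checks.items():
        try:
            healthy = check()
        except Exception:
            healthy = False
        if not healthy:
            failed.append(name)
    return failed


def main(checks):
    """Print each failed check to stderr and return a process exit status."""
    failed = run_checks(checks)
    for name in failed:
        print("UNHEALTHY:", name, file=sys.stderr)
    # A non-zero exit status makes Jenkins mark the build as failed.
    return 1 if failed else 0
```

A script would end with something like `sys.exit(main({"tasktracker page": check_tasktracker}))`, where `check_tasktracker` is whatever callable performs that check.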
Raspberry Pi + Jenkins
- Python program that polls the Jenkins API for the health of the job
- If unhealthy, set a high signal on the pin to turn on the light
- If healthy, set a low signal on the pin to turn off the light (or leave it off)
- Only evaluate health and turn on the light during business hours (when people are typically around)
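The rules above can be sketched as a single pure function, kept apart from the GPIO calls so it can be exercised without a Raspberry Pi. `light_should_be_on` is a hypothetical name, and the default 8 to 20 hour window is an assumption taken from the demo script.

```python
import datetime


def light_should_be_on(color, now, start_hour=8, end_hour=20):
    """Decide whether the alert light should be lit.

    `color` is the "color" field from the Jenkins job JSON and `now` is a
    datetime. Hours outside start_hour..end_hour (inclusive) count as
    after hours, when the light always stays off.
    """
    if now.hour < start_hour or now.hour > end_hour:
        return False
    # Only a settled failure ('red') lights the lamp. 'red_anime' (a failed
    # job that is currently rebuilding) and every other color leave it off.
    return color == "red"
```

The main loop would then map the result onto the pin: `GPIO.HIGH` when the function returns `True`, `GPIO.LOW` otherwise.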
~$26.00
$1.00
~$10.00
demo
#!/usr/bin/env python3
import datetime
import json
import sys
import time
import urllib.request

import RPi.GPIO as GPIO


def main_loop():
    while True:
        now = datetime.datetime.now()
        # If it is after hours, set the signal to low and skip evaluating
        if now.hour < 8 or now.hour > 20:
            GPIO.output(12, GPIO.LOW)
            time.sleep(3600)
            continue
        try:
            resp = urllib.request.urlopen('http://MYJENKINS_HOST/job/MYJENKINS_JOB/api/json')
            if resp.getcode() != 200:
                print('Invalid HTTP response code at', now.strftime("%Y-%m-%d %H:%M"), resp.getcode())
            else:
                # Sometimes we get invalid responses from Jenkins; json.load
                # then raises ValueError, which is caught below.
                jsonResp = json.load(resp)
                color = jsonResp['color']
                # Check that the color is not red, rather than that it is blue
                # (successful), because the color changes while a build is
                # in progress.
                if color != 'red':
                    # Set the signal low
                    GPIO.output(12, GPIO.LOW)
                else:
                    # The job is failed, spin the light
                    print('Jenkins job is failed at', now.strftime("%Y-%m-%d %H:%M"))
                    GPIO.output(12, GPIO.HIGH)
        except Exception as err:
            print('Failed to get JSON from Jenkins at', now.strftime("%Y-%m-%d %H:%M"), 'with', err)
        # Always wait before polling again, including after a bad response
        time.sleep(30)


if __name__ == '__main__':
    try:
        GPIO.setwarnings(False)
        GPIO.setmode(GPIO.BOARD)
        GPIO.setup(12, GPIO.OUT)
        main_loop()
    except KeyboardInterrupt:
        print('\nShutting down...\n', file=sys.stderr)
        GPIO.cleanup()
        sys.exit(0)
Monitor HBase w/Raspberry Pi
By Carl Chesser