Ruby server monitoring script
Tuesday, June 17th, 2008
I wrote this script as a basic way to monitor whether certain sites are up. For best results, the script should run on a different server from the one hosting the sites, with a cron job running it every 5-10 minutes, or more often if needed.
The ruby server monitoring script could also be expanded to ssh into another box and run the command needed to boot the server back up.
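As a rough sketch of that ssh idea, the snippet below shells out to the system ssh client through Ruby's Open3. The helper names, the user, the host and the restart command are all placeholders I made up for illustration, and it assumes key-based ssh login is already set up between the two boxes.

```ruby
require 'open3'

# Hypothetical helper: builds the argument list for an ssh call.
# BatchMode=yes makes ssh fail instead of prompting for a password,
# which matters when this runs unattended from cron.
def ssh_command(user, host, remote_command)
  ['ssh', '-o', 'BatchMode=yes', '-o', 'ConnectTimeout=10',
   "#{user}@#{host}", remote_command]
end

# Hypothetical helper: runs the remote command and returns
# [combined stdout/stderr, true/false for success].
def restart_remote(user, host, remote_command)
  output, status = Open3.capture2e(*ssh_command(user, host, remote_command))
  [output, status.success?]
end

# Example call (placeholder host and command, not from the original script):
# output, ok = restart_remote('deploy', 'web1.example.com', 'sudo /etc/init.d/apache2 restart')
# puts output unless ok
```

Passing the command as an array avoids going through a local shell, so the remote command string is handed to ssh as a single argument.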
First let’s go over what the db looks like…
Sites
- id
- domain (url to curl)
- active (turn monitoring on/off for sites)
Reports
- id
- sent_at (time when report was sent)
- down (count of how many sites were down)
- fixed (report has been marked as fixed)
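One possible way to create these two tables is sketched below. Only the column names and their order come from the lists above; the column types and defaults are my assumptions. The order matters because the script reads columns by position (s[1] for the domain, r[1] for sent_at), and the defaults on sent_at and fixed are assumed because the script's INSERT only supplies the down count.

```ruby
# Assumed DDL: column names and order follow the lists above,
# types and defaults are guesses that fit how the script uses them.
CREATE_SITES = <<~SQL
  CREATE TABLE sites (
    id     INT AUTO_INCREMENT PRIMARY KEY,
    domain VARCHAR(255) NOT NULL,
    active TINYINT(1) NOT NULL DEFAULT 1
  )
SQL

CREATE_REPORTS = <<~SQL
  CREATE TABLE reports (
    id      INT AUTO_INCREMENT PRIMARY KEY,
    sent_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    down    INT NOT NULL DEFAULT 0,
    fixed   TINYINT(1) NOT NULL DEFAULT 0
  )
SQL

# With the mysql gem used in the script, these could be run once like so:
# db = Mysql::new("localhost", "user", "pass", "database")
# [CREATE_SITES, CREATE_REPORTS].each { |ddl| db.query(ddl) }
```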
Ruby monitoring script (mon.rb)
#!/usr/bin/ruby
# change the line above to reference your ruby location
require 'rubygems'
require 'net/smtp'
require 'mysql'
require 'time'
require 'curb'

def check
  @time = nil
  # Connect to the db that stores all of our sites to monitor
  db = Mysql::new("localhost","user","pass","database")
  # Make sure we are only looking at sites that are active
  sites = db.query("SELECT * FROM sites WHERE active = 1")
  # Get the most recent unfixed report, since we only want to send a report every 2 hours
  last_report = db.query("SELECT * FROM reports WHERE fixed = 0 ORDER BY sent_at DESC LIMIT 0,1")
  # r[1] is the sent_at column
  last_report.each {|r| @time = r[1]}
  # If there isn't a report in the database yet, set the difference to more than 2 hours
  # so the script will run and report if needed.
  if @time.nil?
    @diff = 7201
  else
    # Calculate how long ago the last report was sent
    @report = Time.parse(@time)
    @now = Time.now
    @diff = @now - @report
  end
  # Change this number to whatever you would prefer, in seconds. Currently it is 2 hours.
  unless @diff.to_i < 7200
    body = ""
    sites_down = 0
    sites.each do |s|
      # Prints the site name (s[1] is the domain column) to the user
      print s[1]
      # need a begin/rescue in case the domain can't be curled
      begin
        # Uses curb to curl the domain
        c = Curl::Easy.perform(s[1])
        puts " - #{c.response_code}"
        # check http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html for examples of response codes
        unless c.response_code < 399
          # reports to the screen, when run manually, that the site is down
          puts "--DOWN--"
          # increments the number of sites down and writes to the body of the email
          sites_down = sites_down + 1
          body = body + "DOWN - #{s[1]}\n"
        end
      rescue
        # if the script is unable to curl the site we still mark it as down, to be on the safe side
        sites_down = sites_down + 1
        body = body + s[1] + " unable to curl, check record in db and ensure that it is being pointed\n"
        puts "--DOWN--"
      end
    end
    # only send if sites are down
    if sites_down > 0
      # record the report, then send text and email
      db.query("INSERT INTO reports(down) VALUES(#{sites_down})")
      # I use this one for a text message.
      send_email("A server needs help! #{sites_down} down!","destination_address")
      # this is for an inbox since it provides more details to aid in troubleshooting later
      send_email(body,"myemail@blah.com")
    end
  else
    # There is still a report in the db that was recorded less than 2 hours ago,
    # so we just notify the user if the script is being run from the shell.
    puts "Please run 'up.rb' first and then 'mon.rb' again to curl all sites."
    puts "You are seeing this message because there is a report that hasn't been marked as fixed within the past 2 hours."
  end
end

def send_email(body,to)
  # "Monitor" and "from_address" are placeholders, change them to your own sender
  msg = <<END_OF_MESSAGE
From: Monitor <from_address>
To: Admin <#{to}>
Subject: Some sites are down

#{body}
END_OF_MESSAGE
  Net::SMTP.start('localhost') do |smtp|
    smtp.send_message msg, "from_address", to
  end
end

# need to run the function
check
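The two decisions the script makes, whether enough time has passed to send another report and whether a response code counts as down, can be pulled out into small functions that are easy to try in irb without a database or network. This is only a sketch of the same logic: the names report_due?, down? and REPORT_INTERVAL are mine, not part of the script above.

```ruby
require 'time'

REPORT_INTERVAL = 7200 # seconds (2 hours), the same value the script uses

# True when there is no unfixed report yet, or when the latest one
# was sent at least REPORT_INTERVAL seconds ago.
# last_sent_at is a timestamp string as returned from the reports table.
def report_due?(last_sent_at, now = Time.now, interval = REPORT_INTERVAL)
  return true if last_sent_at.nil?
  (now - Time.parse(last_sent_at)).to_i >= interval
end

# Mirrors the script's check: a response code of 399 or above counts as down.
def down?(response_code)
  response_code >= 399
end
```

For example, report_due?(nil) is true because no report exists yet, and down?(200) is false while down?(500) is true.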
Script to update reports table so regular monitoring can continue (up.rb)
#!/usr/bin/ruby
require 'mysql'

db = Mysql::new("localhost","user","pass","database")
up = db.query("UPDATE reports SET fixed = 1 WHERE fixed = 0")
puts "Reports marked as fixed, resuming normal monitoring."
You should also set up a crontab entry such as the following.
This one runs every 5 minutes.
MAILTO=""
*/5 * * * * sh -c $'/usr/bin/ruby /home/username/mon.rb'


