Occasionally an application may crash unexpectedly. Instead of reinventing the wheel, I found "A simple unix/linux daemon in Python" for the daemon functionality and just added the run section logic. Using the bits below I can monitor, or potentially restart, a failed application. The daemon's run section checks for the existence of each process using pgrep, pausing 5 seconds after every check, in an endless loop. It reads any number of process names from a file named observe.list. Each line in observe.list should contain a unique process name, for example puppetmasterd on one line and httpd on the next, or whatever you'd like to watch. I tried using SysLogHandler, but I'm not sure it is available with Python 2.4.3, which ships with RHEL 5 and CentOS 5, so the script logs through the syslog module instead.
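To make the observe.list format concrete, here is a minimal sketch of a slightly more forgiving reader. It is not part of the original script: the helper name parse_watch_list is hypothetical, and skipping blank lines, "#" comments, and duplicate names is my own assumption about what would be convenient, not something the script does.

```python
import io


def parse_watch_list(lines):
    """Return the unique process names found in an iterable of lines.

    Assumption (not in the original script): blank lines and lines
    starting with '#' are ignored, and repeated names are kept once.
    """
    names = []
    for raw in lines:
        name = raw.strip()
        if not name or name.startswith("#"):
            continue
        if name not in names:
            names.append(name)
    return names


# An in-memory stand-in for observe.list, one process name per line.
sample = io.StringIO(u"puppetmasterd\nhttpd\n\n# a comment\nhttpd\n")
watched = parse_watch_list(sample)
```

With a real file, the same helper could be fed an open file object, for example parse_watch_list(open('observe.list')).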
#!/usr/bin/env python
import sys, time, subprocess, syslog
from daemon import Daemon

# File which contains the list of processes to observe, one name per line.
# It is read once when the script starts.
process_file = open('observe.list', 'r')
process_list = []
for line in process_file:
    process_list.append(line.rstrip('\n'))
process_file.close()

def isRunning(process_name):
    # pgrep exits with status 1 when no process matches the name
    ps = subprocess.call("pgrep " + process_name, shell=True, stdout=subprocess.PIPE)
    if ps == 1:
        return False
    else:
        return True

class Observe(Daemon):
    def run(self):
        # Check every listed process, sleeping 5 seconds after each check
        while True:
            for process in process_list:
                if isRunning(process) == False:
                    syslog.syslog(process + " not running!")
                    time.sleep(5)
                else:
                    syslog.syslog(process + ' is running!')
                    time.sleep(5)

if __name__ == "__main__":
    daemon = Observe('/tmp/observe.pid')
    if len(sys.argv) == 2:
        if 'start' == sys.argv[1]:
            daemon.start()
        elif 'stop' == sys.argv[1]:
            daemon.stop()
        elif 'restart' == sys.argv[1]:
            daemon.restart()
        else:
            print("Unknown command")
            sys.exit(2)
        sys.exit(0)
    else:
        print("usage: %s start|stop|restart" % sys.argv[0])
        sys.exit(2)
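The script above only logs; it does not restart anything. As a rough sketch of how the "potentially restart" part might look, the functions below check each process and run a restart command when it is missing. Everything here is an assumption on my part: the names is_running and check_and_restart, the mapping of process name to restart command, and the use of pgrep -x for an exact name match (the script above calls plain pgrep). It is written for a current Python rather than 2.4.3.

```python
import os
import subprocess


def is_running(name, call=subprocess.call):
    """Return True if pgrep finds a process with exactly this name.

    pgrep exits with 0 when at least one process matched.
    The 'call' parameter exists only so the check can be faked in tests.
    """
    devnull = open(os.devnull, "w")
    try:
        return call(["pgrep", "-x", name], stdout=devnull) == 0
    finally:
        devnull.close()


def check_and_restart(restart_commands, call=subprocess.call):
    """Run the restart command for every watched process that is missing.

    restart_commands maps a process name to an argument list, e.g.
    {"httpd": ["/sbin/service", "httpd", "restart"]} (hypothetical).
    Returns the names for which a restart was attempted.
    """
    restarted = []
    for name, command in restart_commands.items():
        if not is_running(name, call):
            call(command)
            restarted.append(name)
    return restarted
```

Inside the daemon, run() could call check_and_restart once per pass and write the returned names to syslog. Passing the commands as argument lists avoids shell=True, so a stray character in observe.list is not interpreted by the shell.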