BASH: A simple script to check if a process is running
Scripting is very, very useful. Don't make the same mistake I did and wait five years into your career to start learning! In the example below, I'm using the 'ps aux' command piped into grep to determine if a process is still running. For my example I'm using Firefox, but I intend to use this with an Asterisk phone system. Every minute cron will launch my script to see if Asterisk is still running. If it is, it will do nothing. If it isn't, then it will attempt to restart Asterisk and notify the administrator. I hope this example helps someone!
------------------------------------------------------------------------
#!/bin/bash
#set -x
#
# Variables Section
#==============================================================
# list process to monitor in the variable below.
PROGRAM1="asterisk"
# The APPCHK variable checks to see if $PROGRAM1
# is running.
APPCHK=$(ps aux | grep -c $PROGRAM1)
#
#
# The $COMPANY and $SITE variables are for populating the alert email
COMPANY="VoiceIP Solutions"
SITE="Seattle"
# $SUPPORTSTAFF is the recipient of our alert email
SUPPORTSTAFF="savelono@gmail.com"
#==================================================================
# The 'if' statement below checks to see if the process is running
# with the 'ps' command. If the value is returned as a '0' then
# an email will be sent and the process will be safely restarted.
#
if [ $APPCHK = '0' ];
then
    echo mail -s "Asterisk PBX at $COMPANY $SITE may be down" $SUPPORTSTAFF < /dev/null
else
    echo "$PROGRAM1 is running $APPCHK processes" >> asterisk-check.log
fi
echo $APPCHK
exit
15 Comments »
Thanks so much for posting this! I've been looking for a good "babysitting" script example. I have one small possible problem with this logic, though. This might depend on your version of Linux, but when I do a
ps aux | grep $process
on my version of Linux, it returns the $process as well as the grep statement. Example:
$ ps aux | grep cupsd
root 11510 0.0 0.0 4024 704 pts/7 R+ 11:52 0:00 grep cupsd
root 22920 0.0 0.0 10592 2264 ? Ss 2009 8:27 cupsd
So if I were to do a:
$ ps aux | grep -c cupsd
It would return:
2
(one count for the process and one count for the grep statement)
Since you are doing:
if [ $APPCHK = '0' ];
Even if the service was NOT running, you’d still get a count for the grep statement itself, so the alert would never be sent….
Might I suggest:
APPCHK=$(ps aux | grep $PROGRAM1 | grep -v "grep $PROGRAM1" | wc -l)
The first grep in this statement grabs every line of the ps aux output that contains $PROGRAM1. This result is piped through the second grep, which omits the line that has the "grep $PROGRAM1" statement itself, and the wc -l returns an accurate count of the remaining lines.
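To see the off-by-one without touching a live process table, the sketch below feeds the two sample lines from the cupsd example into both counting styles. The function names `sample_ps`, `count_raw` and `count_filtered` are made up for illustration; they are not part of the original script.

```shell
#!/bin/sh
# Sample 'ps aux' output taken from the cupsd example above.
sample_ps() {
    cat <<'EOF'
root 11510 0.0 0.0 4024 704 pts/7 R+ 11:52 0:00 grep cupsd
root 22920 0.0 0.0 10592 2264 ? Ss 2009 8:27 cupsd
EOF
}

# Counts every line that mentions the name, including the grep command itself.
count_raw() {
    grep -c -- "$1"
}

# Drops the "grep NAME" line first, then counts what is left.
# tr strips the padding that some wc implementations add.
count_filtered() {
    grep -- "$1" | grep -v -- "grep $1" | wc -l | tr -d ' '
}

sample_ps | count_raw cupsd        # prints 2
sample_ps | count_filtered cupsd   # prints 1
```

With the raw count, a stopped cupsd would still yield 1 rather than 0, which is why the comparison against '0' never fires.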
You're welcome, and thanks for the suggestion. I can't believe I missed that! If you have any more ideas on tightening up this script, please post! Take care.
Thanks for this. One question though…
Where in the above script does it actually restart the program if APPCHK value is 0? It might just be me, but I was expecting to see a line that actually starts the process that has terminated:
echo "/etc/init.d/someprocess start"
Thanks again,
Mike
Init scripts that start programs exit after they complete in most cases that I am aware of.
Okay. But where in the example above is the process (in this case "Asterisk") actually restarted? I get that it will send an email, but I don't understand yet where it's restarted.
For example:
(below echo mail…)
service cups restart
Sorry for the questions, I just don’t get how it will be restarted.
Asterisk is usually started with an init script. The above would likely run as a cron job or be daemonized. In this scenario I might not want to restart Asterisk, as a secondary PBX would pick up the traffic. The script would alert me to a fail-over. Or I could add a line, "service asterisk restart".
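For readers wondering where a restart would go, here is a minimal sketch of the branch with an optional restart step. `handle_check` is a hypothetical helper, not something from the original script, and the restart command is whatever you pass in; `service asterisk restart` is the one mentioned in the reply above.

```shell
#!/bin/sh
# handle_check COUNT NAME [RESTART_COMMAND ...]
# Hypothetical helper: reports the state and, when COUNT is 0,
# runs the restart command if one was supplied.
handle_check() {
    count=$1
    name=$2
    shift 2
    if [ "$count" -eq 0 ]; then
        echo "$name is not running"
        if [ "$#" -gt 0 ]; then
            "$@"    # e.g. service asterisk restart
        fi
    else
        echo "$name is running $count processes"
    fi
}

# Demo with a harmless stand-in for the restart command.
handle_check 0 asterisk echo "restart would run here"
handle_check 3 asterisk echo "restart would run here"
```

In the cron job this might be called as `handle_check "$APPCHK" "$PROGRAM1" service asterisk restart`, or without the trailing command when a secondary PBX is meant to take over.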
Works with the following patch to correctly return the process status:
-PROGRAM1="asterisk"
+PROGRAM1="/usr/sbin/asterisk"
-APPCHK=$(ps aux | grep -c $PROGRAM1)
+APPCHK=$(ps aux | grep $PROGRAM1 | grep -v -c grep)
I tried this on my computer and it didn't work. Finally I found two versions of the script that can do the task. The first one is based on the script above, replacing aux with -ea:
#!/bin/bash
# Variables Section
#==============================================================
# list process to monitor in the variable below.
PROGRAM1=$1
# The APPCHK variable checks to see if $PROGRAM1 is running.
APPCHK=$(ps -ea | grep $PROGRAM1 | grep -v grep | wc -l)
#==================================================================
# The 'if' statement below checks to see if the process is running
# with the 'ps' command. If the value is returned as a '0' then
# a message is printed.
if [ $APPCHK -eq 0 ];
then
    echo "$PROGRAM1 may be down"
else
    echo "$PROGRAM1 is running"
fi
exit
========================================================================
And the next version is, imho, a bit more elegant:
#!/bin/bash
PIDorNAME=$1
ps aux | grep $PIDorNAME > /dev/null
if [ $? -eq 0 ]
then
    echo "Still running"
else
    echo "Not running any more"
fi
The second version always reports "Still running": ps aux | grep $NAME never fails, because it matches the grep process itself.
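One common way around this, sketched below, is to wrap the first character of the name in brackets so that the pattern no longer matches grep's own command line. `bracket_pattern` and `seen_in` are illustrative names, and the sketch assumes the name is non-empty and contains no regular-expression metacharacters.

```shell
#!/bin/sh
# Turn "cupsd" into "[c]upsd". The regex still matches "cupsd",
# but the literal text "[c]upsd" on grep's command line does not match it.
bracket_pattern() {
    name=$1
    rest=${name#?}            # everything after the first character
    first=${name%"$rest"}     # the first character
    printf '[%s]%s\n' "$first" "$rest"
}

# Succeeds if any line on stdin matches the bracketed pattern.
seen_in() {
    grep -q -- "$(bracket_pattern "$1")"
}

# Canned process list that contains only the grep command itself.
if printf '%s\n' 'root 11510 pts/7 grep [c]upsd' | seen_in cupsd; then
    echo "Still running"
else
    echo "Not running any more"
fi
```

Against a live process table the same check would read `ps aux | seen_in "$1"`.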
This should solve substring problems with the service name. "-u root" is optional, but services are usually run by root.
#!/bin/sh
SERVICE='your-service-here'
if ps -o " %c " -u root | grep " $SERVICE " > /dev/null
then
    echo "$SERVICE is running"
else
    echo "$SERVICE is NOT running"
fi
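A variant of the same idea, sketched under the assumption that your `ps` accepts the POSIX `-o comm=` form: print only the command names and require a whole-line, fixed-string match. `is_listed` is a made-up helper name, and the list of names is canned so the result does not depend on your machine.

```shell
#!/bin/sh
# Succeeds only if a line on stdin equals NAME exactly.
# -x: whole-line match, -F: treat NAME as a fixed string, -q: no output.
is_listed() {
    grep -q -x -F -- "$1"
}

# Canned command names, one per line, standing in for 'ps -e -o comm='.
names='asterisk-helper
cupsd'

if printf '%s\n' "$names" | is_listed asterisk; then
    echo "asterisk is running"
else
    echo "asterisk is NOT running"
fi
```

On a live system this might be `ps -e -o comm= | is_listed "$SERVICE"`. Some platforms may print a full path or truncate long names in that column, so check what yours prints first.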
Just use pgrep 😉
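For systems that do ship `pgrep`, a minimal sketch of the check looks like this. The `-x` option asks for an exact match on the process name; `is_running` is an illustrative wrapper, not part of the original post.

```shell
#!/bin/sh
# Succeeds if at least one process has exactly this name.
# pgrep never reports itself, so no 'grep -v grep' step is needed.
is_running() {
    pgrep -x "$1" > /dev/null 2>&1
}

# Takes the service name as the first argument, defaulting to asterisk.
SERVICE=${1:-asterisk}
if is_running "$SERVICE"; then
    echo "$SERVICE is running"
else
    echo "$SERVICE is NOT running"
fi
```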
The script works fine. The only error that I found is that
[ $APPCHK = '0' ] is never true. APPCHK runs a grep command with
the name of the program I'm checking, so grep "finds" itself and
$APPCHK = 1 when my program is not running. Otherwise it is 2 or greater.
Thanks !!!
But what if there is more than one service to check?
Copy the script and change the variables.
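Rather than copying the script once per service, the check can also loop over a list of names. The sketch below is one way to do that; the service names are placeholders, and `count_matches` is a hypothetical helper that reads a process listing from standard input so the demo does not depend on what is running on your machine.

```shell
#!/bin/sh
# Count lines on stdin that mention NAME, ignoring grep's own entry.
# '|| true' keeps the exit status at 0 even when the count is 0.
count_matches() {
    grep -v grep | grep -c -- "$1" || true
}

# Canned listing standing in for the output of 'ps aux'.
listing='root 101 /usr/sbin/asterisk -f
root 102 /usr/sbin/asterisk -f
root 103 /usr/sbin/cupsd'

# Placeholder names; replace with the services you care about.
for prog in asterisk cupsd httpd; do
    count=$(printf '%s\n' "$listing" | count_matches "$prog")
    if [ "$count" -eq 0 ]; then
        echo "$prog may be down"
    else
        echo "$prog is running $count processes"
    fi
done
```

For real use, replace the canned listing with `ps aux | count_matches "$prog"`.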
Thanks a lot!
Some of the changes I needed are mentioned above. On my UNIX system pgrep isn't available. The lines with mail just printed the command and didn't send any email. I made these modifications:
- APPCHK=$(ps aux | grep -c $PROGRAM1)
+ APPCHK=$(ps -ef | grep -v grep | grep -c ${PRG1})
- echo mail -s … < /dev/null
+ SITE=$(uname -n)
+ echo "$MailBody" | mailx -s "$PRG1 is running still with $APPCHK process(es) on $SITE" ${SUPPORTSTAFF}
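The snippet above uses `$MailBody` without showing where it comes from. As a sketch only, the function below builds a body and pipes it to a mailer; `send_alert`, the body and subject wording, the example address, and the `MAILER` override are all assumptions added here so the function can be tried without sending real mail.

```shell
#!/bin/sh
# send_alert PROGRAM RECIPIENT
# Pipes a short body to the mailer named in $MAILER (mailx by default).
send_alert() {
    prog=$1
    recipient=$2
    site=$(uname -n)
    body="$prog was not found in the process list on $site."
    printf '%s\n' "$body" | ${MAILER:-mailx} -s "$prog may be down on $site" "$recipient"
}

# Dry run: a stand-in mailer that prints what it was given instead of sending.
# It receives the same arguments mailx would: -s SUBJECT RECIPIENT.
show_mail() {
    echo "To: $3"
    echo "Subject: $2"
    cat
}

MAILER=show_mail
send_alert asterisk admin@example.com
```

Leaving `MAILER` unset makes the function call `mailx`, as in the modification above.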