Python: restarting a program on error

I am running my Python script in the background on my Ubuntu machine (12.04) like this:

nohup python > test.out &

Now, my Python script might die at some stage for whatever reason.

So I am thinking of having some sort of cron job or bash shell script that restarts the Python script automatically if it is killed.

Is this possible? If so, what is the best way to solve this kind of problem?

After creating the testing.conf file like this:

chdir /tekooz
exec python

I ran the sudo command below to start it, but I cannot see the process running with ps ax:

root@bx13:/bezook# sudo start testing
testing start/running, process 27794
root@bx13:/bezook# ps ax | grep
27806 pts/3    S+     0:00 grep --color=auto

Any idea why ps ax is not showing me anything? And how do I check whether my program is running or not?

This is my Python script:

while True:
    print "Hello World"

asked Jan 5, 2014 at 7:33

On Ubuntu up to 14.04 you can use upstart for this, which is better than a cron job (16.04 and later use systemd). Put a config file in /etc/init and make sure you specify respawn.

It could be a minimal file /etc/init/testing.conf (edit as root):

chdir /your/base/directory
exec python
respawn

And you can test with /your/base/directory/

from __future__ import print_function

import time

with open('/var/tmp/testing.log', 'a') as fp:
    print(time.time(), 'done', file=fp)

and start with:

sudo start testing
tail -f /var/tmp/testing.log

and stop with:

sudo stop testing

answered Jan 5, 2014 at 7:59

You could also take a more shell oriented approach. Have your cron look for your script and relaunch it if it dies.

  1. Create a new crontab by running crontab -e. This will bring up a window of your favorite text editor.

  2. Add this line to the file that just opened

    */5 * * * * pgrep -f || nohup python /home/you/scripts/ > test.out
  3. Save the file and exit the editor.

You just created a crontab entry that runs every 5 minutes and launches your script unless it is already running. See here for a nice little tutorial on cron. The official Ubuntu docs on cron are here.

The actual command being run is pgrep which searches running processes for the string given in the command line. pgrep foo will search for a program named foo and return its process identifier. pgrep -f makes it search the entire command line used to launch the program and not only the program name (useful because this is a python script).

answered Jan 5, 2014 at 9:24

You shouldn’t really use this for production, but you could:


while true; do
  nohup python >> test.out
done &

If the python process exits for any reason, the shell loop continues and restarts it, appending to the .out file as desired. There is almost no overhead, and it takes very little time to set up.

answered Jan 5, 2014 at 12:17

You can have the testing program redirect the output using a commandline option
and then use a simple python script to restart the program indefinitely:

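import subprocess

while True:
    try:
        # Blocks until the script exits, then prints what it wrote to stdout
        print(subprocess.check_output(['python', '']))
    except subprocess.CalledProcessError:
        # The script exited with a non-zero code: restart it
        continue
    except KeyboardInterrupt:
        break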

You can put this program in the background, and when you want to stop it, bring it to the foreground and kill it.

answered Jan 5, 2014 at 8:16

You can also look at alternatives to init, in particular the Python code of the mudur daemon from Pardus.

If you decide to go with a cron job (and PID file handling), consider reading PEP 3143 and perhaps using its reference implementation.

As I alluded to in my other comments, robust PID file handling is tricky. It’s prone to races and corner cases. It gets trickier if there’s any chance that your PID file ends up on an NFS or other networked filesystem (some of the atomicity guarantees you get with the file handling semantics on proper local UNIX/Linux filesystems go away on some versions and implementations of NFS, for example). Also the semantics around file locking under UNIX can be tricky. (Does an flock or fcntl lock get released promptly, in your target OS, when the process holding it is killed with SIGKILL, for example?).
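The locking idea mentioned above can be sketched as follows. This is only an illustration, assuming a local Linux filesystem and the standard fcntl module; the function name acquire_single_instance_lock and the lock path are hypothetical. Instead of trusting a PID number stored in a file, the process holds an exclusive flock for as long as it lives. As far as I know, the kernel drops such a lock when the last descriptor for the open file is closed, which includes the holder being killed, but verify that on your target OS and filesystem.

```python
import fcntl
import os


def acquire_single_instance_lock(lock_path):
    """Return an open file holding an exclusive lock, or None if it is taken.

    The caller must keep the returned file object alive; closing it
    (or exiting the process) releases the lock.
    """
    lock_file = open(lock_path, "a+")
    try:
        # LOCK_NB makes flock fail immediately instead of blocking.
        fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        lock_file.close()
        return None
    # Record our PID for humans; the lock itself is what guards the instance.
    lock_file.seek(0)
    lock_file.truncate()
    lock_file.write(str(os.getpid()))
    lock_file.flush()
    return lock_file
```

A cron-started script could call this at startup and exit right away when it gets None, which avoids parsing ps or pgrep output altogether.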

answered Jan 5, 2014 at 7:58
Jim Dennis

You can also use monit, or process monitoring with ps-watcher.

Monit is an open source utility for managing and monitoring,
processes, programs, files, directories and filesystems on a UNIX
system. Monit conducts automatic maintenance and repair and can
execute meaningful causal actions in error situations.

Here is an example for your scenario:

check process myprocessname
        matching "myprocessname"
        start program = "nohup /usr/bin/python /path/ > /tmp/test.out &"
        stop program = "/usr/bin/killall myprocessname"

Take a look at the monit examples.

answered Jan 5, 2014 at 9:43
Rahul Patil

You need a process supervisor; you can use supervisor. It is written in Python and is therefore easy to modify if you need to.

It is configured through files with .ini syntax.
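To give an idea of what such a file contains, the sketch below renders a supervisor-style program section with Python's configparser. The helper name render_program_section, the program name, and the log directory are placeholders of mine, not taken from the question. command, autostart, autorestart, stdout_logfile and stderr_logfile are standard supervisor options, but check the documentation for your version.

```python
import configparser
import io


def render_program_section(name, command, log_dir="/var/log"):
    """Render a supervisor-style [program:NAME] section as .ini text."""
    config = configparser.ConfigParser()
    config[f"program:{name}"] = {
        "command": command,
        "autostart": "true",      # start when supervisord starts
        "autorestart": "true",    # restart the program when it exits
        "stdout_logfile": f"{log_dir}/{name}.out.log",
        "stderr_logfile": f"{log_dir}/{name}.err.log",
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()
```

Calling render_program_section("testing", "python /path/to/script") returns text that can be saved as a supervisor config file; the script path here is again a placeholder.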

answered Jan 5, 2014 at 10:59

Terdon’s answer did not work for me, because
pgrep -f was never ‘failing’. It would grab the pid of the cron job itself (because of the -f option). However, without the -f option pgrep won’t find because there’s no process called

My solution to this was to change

pgrep -f

to

pgrep -f | pgrep python

This means the full crontab job would be:

*/5 * * * * pgrep -f | pgrep python || nohup python /home/you/scripts/ > test.out

answered Jun 15, 2017 at 20:00

In my case, as a quick fix, I wanted to keep my program running when it exited with an error or was killed.
On the other hand, I wanted to stop when the program terminated correctly (return code = 0).

I have tested it in Bash. It should work fine in any other shell.


echo ""
echo "Use: $0 ./"
echo ""

echo "Executing $1 ..."

# Start with a non-zero code so the loop runs at least once
EXIT_CODE=1
(while [ $EXIT_CODE -gt 0 ]; do
    "$1"
    # loops on error code: greater-than 0
    EXIT_CODE=$?
done) &
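The same policy (restart on a non-zero exit code, stop on 0) can be sketched in Python with the standard subprocess module. The function name run_until_success and its delay and max_restarts parameters are my own additions for illustration, not part of the answer above.

```python
import subprocess
import sys
import time


def run_until_success(command, delay=1.0, max_restarts=None):
    """Run command repeatedly until it exits with code 0.

    Returns the number of restarts that were needed. Raises RuntimeError
    if max_restarts is given and the command still fails after that many
    restarts.
    """
    restarts = 0
    while True:
        # A negative code means the process was killed by a signal,
        # which also counts as a failure here.
        exit_code = subprocess.call(command)
        if exit_code == 0:
            return restarts
        if max_restarts is not None and restarts >= max_restarts:
            raise RuntimeError(f"gave up after exit code {exit_code}")
        restarts += 1
        time.sleep(delay)
```

For example, run_until_success([sys.executable, "/path/to/script"]) keeps relaunching a (placeholder) script with the current interpreter until it finishes cleanly.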

answered Dec 27, 2018 at 12:53

In terdon’s answer, pgrep -f will never return false, according to the comments there:

I think the issue is that cron spawns a shell to run your command, and the arguments of that shell are matched by pgrep since you are using -f

In Matt’s answer, the pgrep -f part is useless because pgrep python matches any running Python script. So if you have two cron jobs for two Python scripts, the second one will never start.

I then found a fix for the pgrep -f problem in a comment:

My cron for running two Python scripts:

* * * * * pgrep -f '^/usr/bin/python36 /home/ec2-user/myscript1\.py' || nohup /usr/bin/python36 /home/ec2-user/

0 * * * * pgrep -f '^/usr/bin/python36 /home/ec2-user/myscript2\.py' || nohup /usr/bin/python36 /home/ec2-user/
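Why the leading ^ helps can be shown with a small simulation. This is a sketch only: Python's re module stands in for pgrep's regular expressions, and the command lines are made up, including the full script path after nohup, which I assumed from the pattern. The shell that cron spawns has the whole job text in its command line, so a loose pattern matches that shell, while an anchored pattern matches only a process whose command line starts with the interpreter path.

```python
import re

# Hypothetical command line of the shell that cron spawns for the job.
CRON_SHELL = (
    r"/bin/sh -c pgrep -f '^/usr/bin/python36 /home/ec2-user/myscript1\.py'"
    r" || nohup /usr/bin/python36 /home/ec2-user/myscript1.py"
)

# Hypothetical process table while the script itself is NOT running.
COMMAND_LINES = [CRON_SHELL, "/usr/sbin/cron -f"]

LOOSE_PATTERN = r"myscript1\.py"
ANCHORED_PATTERN = r"^/usr/bin/python36 /home/ec2-user/myscript1\.py"


def matching(pattern, command_lines):
    """Return the command lines that contain a match for pattern."""
    return [line for line in command_lines if re.search(pattern, line)]


# The loose pattern finds the cron shell, so "pgrep ... ||" never fires.
loose = matching(LOOSE_PATTERN, COMMAND_LINES)

# The anchored pattern finds nothing, so the script would be launched.
anchored = matching(ANCHORED_PATTERN, COMMAND_LINES)
```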

answered Mar 8, 2019 at 19:28

In Ubuntu this works for me thanks to --wait


while :
do
  sleep 5
  gnome-terminal --wait -- sh -c "python3 'myarg1'"
done

answered Nov 17, 2020 at 20:44

There’s a Python module for that, forever.

The advantage being, hopefully, in using the same language for both the code and the watchdog. If it needs to be improved, one can find it in

cd $(python -c "import site; print(site.getusersitepackages())")

I’d install it with

python -mpip install --user --upgrade forever

and later use it with

python -t 9 -i 9 python

answered Feb 12, 2022 at 12:06

When the python script crashes, the program is not running anymore, therefore the script cannot execute more lines of code.

You have 2 options:

  1. Make sure your python script doesn’t crash, which is very much recommended. You can do this by handling the exceptions thrown by your program.

Option 1

I assume you are new to Python, so here is an example of a Python script that handles an exception and calls the same function again.

from time import sleep

def run_forever():
    try:
        # Create infinite loop to simulate whatever is running
        # in your program
        while True:
            sleep(1)  # pause so the simulated crash does not spin

            # Simulate an exception which would crash your program
            # if you don't handle it!
            raise Exception("Error simulated!")
    except Exception:
        print("Something crashed your program. Let's restart it")
        run_forever() # Careful.. recursive behavior
        # Recommended to do this instead
        handle_exception()

def handle_exception():
    # code here
    pass
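Because the recursive call above adds a stack frame on every crash, a loop is usually the safer shape. The sketch below is one way to write it; the name run_with_restarts and the max_restarts and delay parameters are hypothetical and not part of the answer.

```python
import time


def run_with_restarts(task, max_restarts=5, delay=1.0):
    """Call task() in a loop, restarting it after an exception.

    Returns task()'s result on success and re-raises the last exception
    once max_restarts restarts have been used up.
    """
    restarts = 0
    while True:
        try:
            return task()
        except Exception as error:
            if restarts >= max_restarts:
                raise
            restarts += 1
            print(f"Task failed ({error!r}); restart {restarts} of {max_restarts}")
            time.sleep(delay)
```

The cap on restarts is a design choice: a program that fails on every start is better reported than restarted forever.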

  2. If you want to restart the python script you would need another python script (assuming you want to do this with python) that checks if the process is still alive and if not then run it again with python.

Option 2

This is the script that starts another python script called ‘’ via the command python
Make sure you have the right file path. If you put the scripts in the same folder, you usually don’t need the full path, only the script name.

Also make sure that the command ‘python’ is recognized by your system; in some cases it could be ‘python3’.

from subprocess import run, CalledProcessError
from time import sleep

# Path and name to the script you are trying to start
file_path = ""

restart_timer = 2

def start_script():
    try:
        # Make sure 'python' command is available
        # (a list of arguments works without shell=True)
        run(["python", file_path], check=True)
    except CalledProcessError:
        # Script crashed, lets restart it!
        handle_crash()

def handle_crash():
    sleep(restart_timer)  # Restarts the script after 2 seconds
    start_script()

start_script()


In case you are interested in the code I used for the test file: ‘’, I post it here.

from time import sleep
while True:
    raise Exception("Hello")

A decorator is a good approach.

from functools import wraps
import time

class retry:
    def __init__(self, success=lambda r:True, times=3, delay=1, raiseexception=True, echo=True):
        self.success = success
        self.times = times
        self.raiseexception = raiseexception
        self.echo = echo
        self.delay = delay

    @staticmethod
    def retry(fun, *args, success=lambda r:True, times=3, delay=1, raiseexception=True, echo=True, **kwargs):
        ex = Exception(f"{fun} failed.")
        r = None
        for i in range(times):
            if i > 0:
                time.sleep(delay)  # wait before every attempt after the first
            try:
                r = fun(*args, **kwargs)
                s = success(r)
            except Exception as e:
                s = False
                ex = e
                # raise e
            if not s:
                continue
            return r
        else:
            # Every attempt failed.
            if echo:
                print(f"{fun} failed.", "args:", args, kwargs, "\nresult: %s"%r)
            if raiseexception:
                raise ex

    def __call__(self, fun):
        @wraps(fun)
        def wraper(*args, retry=0, **kwargs):
            # A per-call retry=N overrides the decorator's times.
            retry = retry if retry>0 else self.times
            return self.__class__.retry(fun, *args,
                                        success = self.success,
                                        times = retry,
                                        delay = self.delay,
                                        raiseexception = self.raiseexception,
                                        echo = self.echo,
                                        **kwargs)
        return wraper

Some usage examples:

@retry(success=lambda x:x>3, times=4, delay=0.1)
def rf1(x=[]):
    x.append(1)
    print(x)
    return len(x)
> rf1()

[1]
[1, 1]
[1, 1, 1]
[1, 1, 1, 1]

@retry(success=lambda x:x>3, times=4, delay=0.1)
def rf2(l=[], v=1):
    l.append(v)
    print(l)
    assert len(l)>4
    return len(l)
> rf2(v=2, retry=10) #overwrite times=4

[2]
[2, 2]
[2, 2, 2]
[2, 2, 2, 2]
[2, 2, 2, 2, 2]

> retry.retry(lambda a,b:a+b, 1, 2, times=2)

> retry.retry(lambda a,b:a+b, 1, "2", times=2)

TypeError: unsupported operand type(s) for +: 'int' and 'str'

From time to time, it may be necessary to restart the script, for example after you fix a bug in it or change its configuration. One way of doing so is to kill the script and run it again. However, this requires manual intervention, which you may forget. When you fix a vulnerability in the script, you want to be sure that the script really gets restarted; otherwise, someone may still exploit the vulnerability. It would be nice if the script could restart itself after detecting that its sources or a configuration file have changed. In the rest of this post, we will show such a way.

Consider a script with the following structure:

# Parse the arguments and configuration files.

while True:
    # Wait for inputs and act on them.
    # ...

That is, it processes the arguments and loads the configuration from the configuration files. After that, the script waits for inputs and processes them in an infinite loop.

Next, we describe how to watch files for changes. After that, we show how to restart the script.

Checking Watched Files For Changes

First, we define the paths to the files we want to watch in a list named WATCHED_FILES. We watch the global configuration file, the local configuration file, and the script itself, whose path can be obtained from the special global variable __file__. When the script starts, we get and store the time of the last modification of these files by using os.path.getmtime():

from os.path import getmtime

WATCHED_FILES_MTIMES = [(f, getmtime(f)) for f in WATCHED_FILES]

Then, in the main loop, we check whether any of these files has changed:

while True:
    for f, mtime in WATCHED_FILES_MTIMES:
        if getmtime(f) != mtime:
            # Restart the script.

    # Wait for inputs and act on them.
    # ...

If any of the files that we watch has changed, we restart the script. The restarting is described next.
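The check can be packaged into two small helpers, sketched below. The function names snapshot_mtimes and changed_files are mine; only getmtime() and the (path, mtime) pairs come from the text above.

```python
from os.path import getmtime


def snapshot_mtimes(paths):
    """Record the current modification time of each watched file."""
    return [(path, getmtime(path)) for path in paths]


def changed_files(snapshot):
    """Return the watched paths whose modification time differs from the snapshot."""
    return [path for path, mtime in snapshot if getmtime(path) != mtime]
```

At startup the script would call snapshot_mtimes() on its list of watched files, and the main loop would restart as soon as changed_files() returns a non-empty list. Comparing with != rather than > also catches a file that was replaced by an older copy.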

Restarting the Script

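We restart the script with one of the exec*() functions from the os module. Which arguments to pass depends on how you run the script. On a Unix-like system, you can make the script executable by putting the following line at the top of the file

#!/usr/bin/env python

and running

$ chmod a+x

Then, you can run the script via

$ ./

In that case, restart the script with:

os.execv(__file__, sys.argv)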

Otherwise, when you run the script via

$ python

use this code:

os.execv(sys.executable, ['python'] + sys.argv)

Either way, do not forget to import the os and sys modules:

import os
import sys

To explain, the arguments of os.execv() are the program to replace the current process with and arguments to this program. The __file__ variable holds a path to the script, sys.argv are arguments that were passed to the script, and sys.executable is a path to the Python executable that was used to run the script.

The os.execv() function does not return. Instead, it starts executing the current script from its beginning, which is what we want.

Concluding Remarks

If you use the solution above, please bear in mind that the exec*() functions cause the current process to be replaced immediately, without flushing opened file objects. Therefore, if you have any opened files at the time of restarting the script, you should flush them using f.flush() or os.fsync(fd) before calling an exec*() function.
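Putting the flushing advice and the restart call together might look like the sketch below. It assumes the script was started as python followed by the script path, so that sys.argv begins with the script path; the function names build_restart_command and restart are hypothetical. Unlike the snippet above, it passes the interpreter path rather than the literal 'python' as the first argument, which should behave the same way.

```python
import os
import sys


def build_restart_command(argv=None, executable=None):
    """Return the (path, args) pair to pass to os.execv for a restart."""
    argv = list(sys.argv if argv is None else argv)
    executable = executable or sys.executable
    # By convention the first element of args is the program name.
    return executable, [executable] + argv


def restart(open_files=()):
    """Flush the given file objects, then replace the current process."""
    for f in open_files:
        f.flush()
        os.fsync(f.fileno())
    sys.stdout.flush()
    sys.stderr.flush()
    path, args = build_restart_command()
    os.execv(path, args)  # does not return on success
```

Keeping the argument construction in its own function makes it possible to check what would be executed without actually replacing the process.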

Complete Source Code

The complete source code for this post is available on GitHub.

One of the most important tasks when developing server-side API components or bots that process various notifications is keeping them running without interruption. In other words, all the necessary scripts must be started automatically when the system reboots and restarted after errors and crashes.

One of the main tools for this is supervisor.

To install it, run the following command in a terminal (we assume the pip package manager is already installed; for how to do that, see the article Creating a bot for Telegram).

sudo pip install supervisor

Supervisor lets you monitor the state of applications and stop, start, and restart them. First, we need to create a configuration file for your program that describes the basic restart and logging rules. By default, you can use the file:

Delete all of its contents and enter the following text:

In the example, we specify that we want to supervise a script located in the /var/www/ directory and set a couple of error-logging options. If supervisor is not running yet, execute the command:

Now you can ask supervisor to read the data from the config:

To view the status of the running processes, use the command:

The process named “github_server” (the name is set in the config) will be in the “STOPPED” state.

Now the status will show the “RUNNING” state. To stop the process, you can use the command:

Finally, if you forcibly terminate the process:

sudo killall python

Supervisor will restart it anyway:

State before termination: github_server RUNNING pid 21755, uptime 0:03:57
State after termination: github_server RUNNING pid 21929, uptime 0:00:01

You can see that the server was restarted. For more details, see the official documentation.

