
Ticket #581 (defect)

Opened 2 years ago

Last modified 6 months ago

Problem Re-spawning on Mac OS X 10.5.2

Status: closed (worksforme)

Reported by: mailmesuf@gmail.com
Assigned to: fumanchu
Priority: high
Milestone: 3.1
Component: engine
Keywords: osx mac tiger execv
Cc: luke@adpinion.com

This must be a platform problem because it does seem to work on a Debian server.

I'm running:

  • Mac OS X 10.4.7
  • Mac Python 2.4.3
  • cherrypy 3 from svn (2 okt 2006)

The server dies and I get the following error after I modify the return string in the example from the CherryPy.org front page:

[02/Oct/2006:21:20:16] HTTP HTTP Server shut down
[02/Oct/2006:21:20:16] ENGINE CherryPy shut down
[02/Oct/2006:21:20:16] ENGINE Re-spawning root.py
Traceback (most recent call last):
  File "root.py", line 8, in ?
    cherrypy.quickstart(HelloWorld())
  File "/Volumes/storage/Projects/bookr/svn/cherrypy/__init__.py", line 29, in quickstart
    engine.start()
  File "/Volumes/storage/Projects/bookr/svn/cherrypy/_cpengine.py", line 78, in start
    self.block()
  File "/Volumes/storage/Projects/bookr/svn/cherrypy/_cpengine.py", line 88, in block
    self.autoreload()
  File "/Volumes/storage/Projects/bookr/svn/cherrypy/_cpengine.py", line 150, in autoreload
    self.reexec()
  File "/Volumes/storage/Projects/bookr/svn/cherrypy/_cpengine.py", line 115, in reexec
    os.execv(sys.executable, args)
OSError: [Errno 45] Operation not supported

Attachments

macexecv.patch (1.1 kB) - added by wsimmons on 11/08/06 13:31:32.
Possible fix

Change History

10/06/06 12:14:26: Modified by fumanchu

  • keywords changed from osx mac tiger respawning to osx mac tiger execv.
  • priority changed from normal to high.

Bah. Looks like execv wasn't available for Mac until Python 2.4.1/2.5. We'll have to special-case Mac platforms on previous Python versions and use a different call.

Curious why it fails for you on Python 2.4.2, though. The docs say it's supported: http://www.python.org/doc/2.4.2/lib/os-process.html

10/06/06 15:56:44: Modified by mailmesuf@gmail.com

I think it might be a bug in Mac Python, because calling execv crashes the Python interpreter:

mac:~ mailmesuf$ python2.4
Python 2.4.3 (#1, Apr  7 2006, 10:54:33) 
[GCC 4.0.1 (Apple Computer, Inc. build 5250)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.execv('/bin/echo', ['foo', 'bar'])
bar
mac:~ mailmesuf $

Guess I'll take this one to python.org for further troubleshooting.

10/12/06 15:41:15: Modified by lawouach

Well I get the same behavior under Linux

sylvain@9[~]$ python
Python 2.4.3 (#2, Apr 27 2006, 14:43:58) 
[GCC 4.0.3 (Ubuntu 4.0.3-1ubuntu5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.execv('/bin/echo', ['foo', 'bar'])
bar
sylvain@9[~]$

10/12/06 15:42:17: Modified by lawouach

The same behavior on Mac OS X 10.4.8 with Python 2.3

10/13/06 13:17:09: Modified by fumanchu

If we can't figure out a programmatic way to detect this, we should at least provide a config entry which triggers the use of something other than execv to respawn.

11/04/06 17:26:58: Modified by fumanchu

Ahem. execv is supposed to end the current process on all platforms. It's not a "crash" when you request it... ;)
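
To make the point above concrete, here is a small sketch (not taken from CherryPy) showing that os.execv replaces the calling process instead of returning to it. The helper name run_exec_demo and the child script are illustrative assumptions, and the example assumes a POSIX platform and Python 3.7+. It also accounts for the sessions quoted earlier: the first element of the argument list becomes argv[0], which is why /bin/echo called with ['foo', 'bar'] prints only "bar".

```python
import subprocess
import sys

# Source for a throwaway child interpreter. The line after os.execv is never
# executed, because execv replaces the process image on success.
CHILD_SOURCE = '''
import os, sys
print("before exec", flush=True)
os.execv(sys.executable, [sys.executable, "-c", "print('after exec')"])
print("never reached")
'''


def run_exec_demo():
    """Run the child script and return the lines it wrote to stdout."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE],
        capture_output=True,
        text=True,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    for line in run_exec_demo():
        print(line)
```

The interpreter "ending" after the call is therefore the expected behavior, not a crash; only an exception such as the OSError in the original report indicates that execv itself failed.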

11/04/06 17:47:40: Modified by fumanchu

From http://uninformed.org/index.cgi?v=1&a=1&p=16:

Mac OS X has an undocumented behavior concerning the execve() system call
inside a threaded process. If a process tries to call execve() and has more
than one active thread, the kernel returns the error EOPNOTSUPP. After a
closer look at kern_exec.c in the Darwin XNU source code, it becomes
apparent that for shellcode to function properly inside a threaded process,
it will need to call either fork() or vfork() before calling execve().

11/04/06 17:54:00: Modified by fumanchu

  • description changed.

See http://webalizer.rtin.bz/ftp.mrunix.net/pub/webalizer/contrib/log.py for an example of fork used with execve and waitpid.
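
The fork-then-exec approach described in the quoted passage can be sketched as follows. This is a minimal illustration, not the code from the linked log.py and not what CherryPy ended up doing; the function name spawn_and_wait is hypothetical. It relies on the fact that a forked child contains only the thread that called fork, so the exec call in the child no longer runs inside a multi-threaded process.

```python
import os
import sys


def spawn_and_wait(path, argv):
    """Fork, exec `path` in the child, and return the child's exit code."""
    pid = os.fork()
    if pid == 0:
        # Child process: only the forking thread exists here, so exec is
        # not blocked by other threads of the parent.
        try:
            os.execv(path, argv)
        finally:
            # Reached only if execv failed; exit immediately so the child
            # never falls back into the parent's code path.
            os._exit(127)
    # Parent process: wait for the child and decode its wait status.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)


if __name__ == "__main__":
    code = spawn_and_wait(sys.executable, [sys.executable, "-c", "print('child ran')"])
    print("child exit code:", code)
```

Note that, unlike a plain execv, this keeps the parent process alive as a waiter, so it changes the process tree seen by whatever launched the server.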

11/08/06 12:23:46: Modified by wsimmons

This patch (against revision 1427) works for me. This code quits all other active Timer threads since the execv call will not work on darwin with more than one active thread.

Sometimes the first call to execv will fail, but it will succeed on the second call. My guess is that some of the threads are still in the process of exiting.

Tested on Mac OS X 10.4.8 using Python 2.4.3 from DarwinPorts.

Index: cherrypy/_cpengine.py
===================================================================
--- cherrypy/_cpengine.py	(revision 1427)
+++ cherrypy/_cpengine.py	(working copy)
@@ -112,6 +112,23 @@
         
         if sys.platform == "win32":
             args = ['"%s"' % arg for arg in args]
+        elif sys.platform == "darwin":
+            # Can't call "execv" on darwin with more than one active thread
+            for thread in threading.enumerate():
+                if thread == threading.currentThread() or not thread.isAlive():
+                    continue
+                if hasattr(thread, 'cancel'):
+                    cherrypy.log("Quitting thread: %s" % thread, "ENGINE")
+                    thread.cancel()
+                    thread.join()
+                else:
+                    cherrypy.log("Can't quit thread: %s" % thread, "ENGINE")
+
+            try:
+                os.execv(sys.executable, args)
+            except:
+                cherrypy.log("Couldn't restart, retrying...", "ENGINE")
+                os.execv(sys.executable, args)
         os.execv(sys.executable, args)
     
     def autoreload(self):

11/08/06 13:31:32: Modified by wsimmons

  • attachment macexecv.patch added.

Possible fix

11/17/06 12:21:45: Modified by fumanchu

What I don't understand yet is where these "other active Timer threads" are coming from. The intent is that all threads started by CP will have been stopped before the execv call; this should happen in the server.stop and engine.stop calls just before that. If you have started others in your own code or 3rd party code, the recommended way to deal with those is to add a callback to on_stop_engine_list that gracefully shuts down those threads. That's a more appropriate place because not all threads will have a 'cancel' method (and if they haven't been started by Python code, may not even be listed in enumerate). In other words, your application should be taking care of shutting down these threads regardless of what platform you're on.
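
As a rough sketch of the application-side shutdown described above: a thread that the application starts itself can expose a stop method which signals the thread and joins it, and that method can then be used as the stop callback. The class name PollingWorker is hypothetical, and the registration line is shown only as a comment based on the on_stop_engine_list name used in this ticket.

```python
import threading


class PollingWorker(threading.Thread):
    """Background thread that can be asked to shut down gracefully."""

    def __init__(self, interval=0.05):
        super().__init__(daemon=True)
        self.interval = interval
        self.ticks = 0
        self._stop_requested = threading.Event()

    def run(self):
        # Event.wait returns True as soon as stop() sets the event,
        # which ends the loop without waiting out the full interval.
        while not self._stop_requested.wait(self.interval):
            self.ticks += 1

    def stop(self):
        """Signal the thread to finish and block until it has exited."""
        self._stop_requested.set()
        self.join()


if __name__ == "__main__":
    worker = PollingWorker()
    worker.start()
    # Registration as described in the ticket (assumption, not executed here):
    # cherrypy.engine.on_stop_engine_list.append(worker.stop)
    worker.stop()
    print("worker alive:", worker.is_alive())
```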

11/18/06 04:19:49: Modified by wsimmons

There must be a bug in the closing of the threads then, because this error occurs with the simplest of programs (like the hello world example). On my computer there is always at least one Timer thread running, and I have sometimes seen three running with one of my programs (and I do not start any of these threads myself). Unfortunately, I do not think that I understand the CherryPy code well enough to debug where these threads are coming from, and why they aren't being killed before restart is called.

11/18/06 13:12:13: Modified by fumanchu

  • owner changed from rdelon to fumanchu.
  • status changed from new to assigned.

Aha. There is a bug: cherrypy.engine.monitor_thread.cancel() is called but not .join(). Fixed in [1434].

However, that's not enough, as even the join() call returns before the thread is truly terminated. Using enumerate isn't enough either, as there is a minute amount of code to run in the thread even after it is removed from the list of _active threads (see threading.Thread.__delete, and __bootstrap, which calls it).

So we can:

  1. Use enumerate and pray.
  2. Use enumerate and time.sleep and pray a little bit less.
  3. Try to use an OS-specific system call to check the number of threads for our process and wait until it's 1.
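
Option 2 above could look roughly like the sketch below: poll threading.enumerate() with a short sleep until no other Python threads remain or a timeout expires. The function name and its parameters (including ignore) are hypothetical. As the comment above points out, a thread can still be running a little code after it disappears from enumerate(), and threads not started from Python are never listed, so this reduces the race rather than removing it.

```python
import threading
import time


def wait_for_other_threads(timeout=5.0, poll=0.05, ignore=()):
    """Return True once no other listed threads remain, False on timeout.

    `ignore` is an optional collection of threads that should not count,
    for example long-lived helper threads the caller knows about.
    """
    deadline = time.monotonic() + timeout
    current = threading.current_thread()
    while True:
        others = [
            t for t in threading.enumerate()
            if t is not current and t not in ignore
        ]
        if not others:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)


if __name__ == "__main__":
    t = threading.Thread(target=time.sleep, args=(0.2,))
    t.start()
    print("only one thread left:", wait_for_other_threads())
```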

11/18/06 13:21:47: Modified by fumanchu

  • status changed from assigned to closed.
  • resolution set to fixed.

Oh, never mind. We can just keep retrying execv until we don't get the exception. :) See [1436]. I also changed the session timer in [1435].

11/18/06 13:29:05: Modified by fumanchu

Engine.reexec_retry (timeout in seconds for os.execv retry) added in [1437].
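
The retry idea can be sketched like this. It is an illustration under assumptions, not the code committed in [1436] or [1437]: the function name execv_with_retry, the interval parameter, and the injectable execv argument (used so the loop can be exercised without replacing the current process) are all hypothetical. The retry_timeout parameter plays the role that the ticket describes for Engine.reexec_retry.

```python
import os
import time


def execv_with_retry(path, args, retry_timeout=2.0, interval=0.1, execv=os.execv):
    """Call execv repeatedly until it succeeds or retry_timeout elapses.

    With the real os.execv, a successful call never returns. On failure
    (for example errno 45 while other threads are still exiting) the call
    is retried after a short sleep; the last OSError is re-raised once the
    deadline has passed.
    """
    deadline = time.monotonic() + retry_timeout
    while True:
        try:
            execv(path, args)
            # Only reachable with a stand-in execv used for testing.
            return
        except OSError:
            if time.monotonic() >= deadline:
                raise
            time.sleep(interval)
```

A caller would pass sys.executable and the rebuilt argument list, as in the reexec code shown in the tracebacks above. If the error persists for longer than the timeout, the OSError still propagates, which is why raising the timeout is the first thing to try when the problem reappears.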

10/18/07 21:51:32: Modified by dan

Retrying on the child thread didn't work properly because the main thread waits indefinitely for all of the others to exit.

r1757 fixes this by moving the call to execv to the main thread (after the child threads have exited).

03/23/08 10:34:57: Modified by guest

  • status changed from closed to reopened.
  • summary changed from Problem Re-spawning on Mac OS X 10.4.7 to Problem Re-spawning on Mac OS X 10.5.2.
  • resolution deleted.
  • milestone changed from 3.0 to 3.1.

Hi, This is still occurring for me on 10.5.2 (and has been since at least 10.4.10) using CherryPy 3.1.0 beta3.

[23/Mar/2008:08:33:07] ENGINE Re-spawning rundemo.py
Traceback (most recent call last):
  File "rundemo.py", line 474, in <module>
    cherrypy.quickstart(cp_app, config = 'cherrypy.cfg')
  File "/Library/Python/2.5/site-packages/CherryPy-3.1.0beta3-py2.5.egg/cherrypy/__init__.py", line 254, in quickstart
    engine.block()
  File "/Library/Python/2.5/site-packages/CherryPy-3.1.0beta3-py2.5.egg/cherrypy/restsrv/wspbus.py", line 231, in block
    self._do_execv()
  File "/Library/Python/2.5/site-packages/CherryPy-3.1.0beta3-py2.5.egg/cherrypy/restsrv/wspbus.py", line 250, in _do_execv
    os.execv(sys.executable, args)
OSError: [Errno 45] Operation not supported

03/23/08 11:03:37: Modified by guest

  • cc set to luke@adpinion.com.

03/25/08 19:48:33: Modified by fumanchu

luke, have you tried increasing the value of engine.reexec_retry?

04/26/08 18:53:01: Modified by fumanchu

  • status changed from reopened to closed.
  • resolution set to worksforme.
