GAE Python threads not executing in parallel

Posted 2020-02-23 03:36

I am trying to create a simple web app using Python on GAE. The app needs to spawn some threads for each request it receives. For this I am using Python's threading library. I start all the threads and then wait on them:

t1.start()
t2.start()
t3.start()

t1.join()
t2.join()
t3.join()

The application runs fine, except that the threads run serially rather than concurrently (I confirmed this by printing timestamps at the beginning and end of each thread's run() method). I have followed the instructions given in http://code.google.com/appengine/docs/python/python27/using27.html#Multithreading to enable multithreading.
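For reference, the timestamp check described above could be reproduced outside GAE with something like the following. This is only a minimal sketch: the TimedWorker class, the run_workers and ran_concurrently helpers, and the use of time.sleep as a stand-in for the real URL fetch are all assumptions for illustration, not code from the actual app.

```python
import threading
import time


class TimedWorker(threading.Thread):
    """Hypothetical worker that records when its run() starts and ends."""

    def __init__(self, delay):
        threading.Thread.__init__(self)
        self.delay = delay
        self.started_at = None
        self.ended_at = None

    def run(self):
        self.started_at = time.time()
        time.sleep(self.delay)  # stand-in for the URL fetch and processing
        self.ended_at = time.time()


def run_workers(count=3, delay=0.2):
    """Start all workers first, then join them, as in the question."""
    workers = [TimedWorker(delay) for _ in range(count)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return workers


def ran_concurrently(workers):
    """Return True if any two consecutive [start, end] intervals overlap."""
    spans = sorted((w.started_at, w.ended_at) for w in workers)
    return any(nxt[0] < cur[1] for cur, nxt in zip(spans, spans[1:]))
```

If ran_concurrently(run_workers()) is True in a plain Python interpreter but the equivalent check fails inside the app, the difference probably lies in the environment rather than in the start/join pattern itself.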

My app.yaml looks like:

application: myapp
version: 1
runtime: python27
api_version: 1
threadsafe: true

handlers:
- url: /favicon\.ico
  static_files: favicon.ico
  upload: favicon\.ico

- url: /stylesheet
  static_dir: stylesheet

- url: /javascript
  static_dir: javascript

- url: /pages
  static_dir: pages

- url: .*
  script: main.app

I made sure that my local GoogleAppLauncher uses Python 2.7 by setting the path explicitly in the preferences.

My threads have an average run time of 2-3 seconds, during which they open a URL and do some processing on the result.

Am I doing something wrong, or missing some configuration to enable multithreading?

3 answers
Lonely孤独者°
#2 · 2020-02-23 04:16

Are you experiencing this in the dev_appserver or after uploading your app to the production service? From your mention of GoogleAppLauncher it sounds like you may be seeing this in the dev_appserver; the dev_appserver does not emulate the threading behavior of the production servers, and you'd be surprised to find that it works just fine after you deploy your app. (If not, add a comment here.)

Another idea: if you are mostly waiting for the urlfetch, you can run many urlfetch calls in parallel by using the async interface to urlfetch: http://code.google.com/appengine/docs/python/urlfetch/asynchronousrequests.html

This approach does not require threads. (It still doesn't properly parallelize the requests in the dev_appserver; but it does do things properly on the production servers.)
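A rough sketch of what the async approach might look like, assuming the urlfetch.create_rpc, urlfetch.make_fetch_call and rpc.get_result calls described on that page. The fetch_all name, the default deadline, and the urlfetch_module parameter are made up for illustration; the parameter is only there so the function can be tried outside App Engine with a stand-in object.

```python
def fetch_all(urls, deadline=10, urlfetch_module=None):
    """Start every fetch first, then collect the results in order.

    urlfetch_module is a hypothetical hook: leave it as None on App Engine
    to use the real google.appengine.api.urlfetch module.
    """
    if urlfetch_module is None:
        # Only importable inside the App Engine Python runtime or SDK.
        from google.appengine.api import urlfetch as urlfetch_module

    rpcs = []
    for url in urls:
        rpc = urlfetch_module.create_rpc(deadline=deadline)
        urlfetch_module.make_fetch_call(rpc, url)  # returns without waiting
        rpcs.append(rpc)

    # get_result() blocks until that particular fetch has finished; because
    # all fetches were already started, the waits can overlap.
    return [rpc.get_result() for rpc in rpcs]
```

Each result should be the usual urlfetch response object, so the per-URL processing can run in a plain loop over the returned list once the fetches are done.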

放我归山
#3 · 2020-02-23 04:21

If your threads are mostly waiting for datastore operations, you may try the NDB module that's part of 1.6.2. The semantics will be close enough to what you are doing.

IIRC, the multithreading flag enables one server instance to serve multiple requests on separate threads, but won't allow you to start threads yourself. If you didn't need to sync them before returning, you could put them on separate tasks and delegate them to one or more task queues.

Rolldiameter
#4 · 2020-02-23 04:31

The multithreading notes for GAE are only about how requests are handled; they don't fundamentally change how Python threads work. Specifically, the "CPython Implementation Detail" note in the threading module docs still applies: because of the Global Interpreter Lock, only one thread can execute Python code at a time.

It's also worth mentioning the note in the "Sandboxing" section of the GAE docs:

Note that threads will be joined by the runtime when the request ends, so the threads cannot run past the end of the request.
