[python-win32] PyParallel - an experimental "multicore" fork of Python 3 for Windows
Trent Nelson
2015-07-31 20:14:43 UTC
Hi folks,

Bit off-topic, but just wanted to let people know about an experimental
proof-of-concept fork of Python 3 I've been working on for the past
couple of years called PyParallel: http://pyparallel.org. It essentially
gets around the GIL limitations and allows Python code to run simultaneously
in multiple threads from within a single interpreter/process.

It's Windows only -- so I figured it might be of interest to this list.
It exclusively uses the threadpool APIs that got introduced in Vista,
and has been built from the ground up to leverage the "Windows-way" of
achieving high-performance socket I/O (using overlapped I/O *and*
multiple cores to service completions -- not just polling GQCS() in a
single-threaded event loop).
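
To make that parenthetical a bit more concrete, here is a rough, portable sketch in plain CPython. It is not PyParallel's API and makes no Windows calls: a `queue.Queue` stands in for the completion port, and `service_completions` is a hypothetical helper name. The point is only the shape: several threads pull completions from one shared queue, rather than a single event-loop thread draining everything itself. (On stock CPython the GIL still serializes the callbacks, so this shows the structure, not the speedup.)

```python
import queue
import threading


def service_completions(completions, num_workers=4):
    """Drain a queue of (callback, arg) pairs using several threads.

    The queue is a stand-in for a completion port: any worker may pick
    up any completion, so no item is tied to a particular thread.
    """
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            item = completions.get()
            if item is None:           # sentinel: nothing left to service
                break
            callback, arg = item
            value = callback(arg)
            with lock:                 # protect the shared results list
                results.append(value)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for _ in threads:
        completions.put(None)          # one sentinel per worker
    for t in threads:
        t.join()
    return results


# Queue up ten pretend completions, then let four workers service them.
q = queue.Queue()
for n in range(10):
    q.put((lambda x: x * x, n))
squares = sorted(service_completions(q))
```

The sentinels are queued after the real work, so every completion is handled before any worker shuts down.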

The performance is pretty phenomenal and it appears to scale
linearly with CPU cores and I/O bandwidth, which is neat. The installer
includes a PyParallel-compatible version of NumPy and pyodbc, so you can
access large NumPy arrays or connect to databases in parallel callbacks.

It's an experimental project though at heart -- don't go using it in
production yet or anything.

Regards,

Trent.
Bob Hood
2015-07-31 21:05:14 UTC
Post by Trent Nelson
Hi folks,
Bit off-topic, but just wanted to let people know about an experimental
proof-of-concept fork of Python 3 I've been working on for the past
couple of years called PyParallel: http://pyparallel.org. It essentially
gets around the GIL limitations and allows Python code to run simultaneously
in multiple threads from within a single interpreter/process.
It's Windows only -- so I figured it might be of interest to this list.
It exclusively uses the threadpool APIs that got introduced in Vista,
and has been built from the ground up to leverage the "Windows-way" of
achieving high-performance socket I/O (using overlapped I/O *and*
multiple cores to service completions -- not just polling GQCS() in a
single-threaded event loop).
The performance is pretty phenomenal and it's appearing to scale very
linearly with CPU cores and I/O bandwidth, which is neat. The installer
includes a PyParallel-compatible version of NumPy and pyodbc, so you can
access large NumPy arrays or connect to databases in parallel callbacks.
It's an experimental project though at heart -- don't go using it in
production yet or anything.
Outstanding, Trent. Python's horrible multithreading support is something
that has really kept it from being used in serious environments. Your work
may not cover all platforms yet, but this is a great first step!

Thanks for your efforts.
Zachary Ware
2015-08-01 04:01:15 UTC
Post by Bob Hood
Outstanding, Trent. Python's horrible multithreading support is something
that has really kept it from being used in serious environments. Your work
may not cover all platforms yet, but this is a great first step!
I'd like to add a '[citation needed]' to your FUD here :). I think
YouTube and CERN both qualify as 'serious environments' from at least
two different perspectives. But that's a bit off-topic.

Trent, is PyParallel still based on Python 3.3, and do you have any
plans to rebase onto a newer branch at some point?
--
Zach
Trent Nelson
2015-08-03 00:13:22 UTC
Hey Zach!
Post by Zachary Ware
Post by Bob Hood
Outstanding, Trent. Python's horrible multithreading support is
something that has really kept it from being used in serious
environments. Your work may not cover all platforms yet, but this
is a great first step!
I'd like to add a '[citation needed]' to your FUD here :). I think
YouTube and CERN both qualify as 'serious environments' from at least
two different perspectives. But that's a bit off-topic.
Trent, is PyParallel still based on Python 3.3, and do you have any
plans to rebase onto a newer branch at some point?
Yup, still technically based off 3.3.5. Once 3.5 has settled I'll
rebase on that, probably in a few months. I've been avoiding it because
the memory allocation changes in 3.4 are going to make merging a bit
of a pain.

Trent.
Dennis Lee Bieber
2015-07-31 23:17:55 UTC
Post by Trent Nelson
Hi folks,
Bit off-topic, but just wanted to let people know about an experimental
proof-of-concept fork of Python 3 I've been working on for the past
couple of years called PyParallel: http://pyparallel.org. It essentially
gets around the GIL limitations and allows Python code to run simultaneously
in multiple threads from within a single interpreter/process.
The name might be a problem... The old pyserial library also had a
parallel port module pyParallel, and is the first hit on Google for
"pyparallel"
--
Wulfraed Dennis Lee Bieber AF6VN
***@ix.netcom.com HTTP://wlfraed.home.netcom.com/
Trent Nelson
2015-08-01 00:39:18 UTC
Post by Dennis Lee Bieber
Post by Trent Nelson
Hi folks,
Bit off-topic, but just wanted to let people know about an experimental
proof-of-concept fork of Python 3 I've been working on for the past
couple of years called PyParallel: http://pyparallel.org. It essentially
gets around the GIL limitations and allows Python code to run simultaneously
in multiple threads from within a single interpreter/process.
The name might be a problem... The old pyserial library also had a
parallel port module pyParallel, and is the first hit on Google for
"pyparallel"
Yeah that's definitely unfortunate... I found out about the old
parallel port module with the same name after I'd settled on
PyParallel. The website has only been up for a few days so I'm
not surprised it's not ranking very high.
Post by Dennis Lee Bieber
--
Wulfraed Dennis Lee Bieber AF6VN
Trent.
P***@Dell.com
2015-08-01 14:54:16 UTC
Post by Trent Nelson
Hi folks,
Bit off-topic, but just wanted to let people know about an experimental
proof-of-concept fork of Python 3 I've been working on for the past
couple of years called PyParallel: http://pyparallel.org. It essentially
gets around the GIL limitations and allows Python code to run simultaneously
in multiple threads from within a single interpreter/process.
It's Windows only -- so I figured it might be of interest to this list.
Impressive. Is the Windows only aspect because you used Windows proprietary thread APIs? Or because there is something special about Windows thread mechanisms that doesn’t carry over to, say, POSIX pthread? I assume the former, in other words it should be adaptable to standard thread architectures with some effort.

paul
Trent Nelson
2015-08-03 00:10:19 UTC
Post by P***@Dell.com
Post by Trent Nelson
Hi folks,
Bit off-topic, but just wanted to let people know about an
experimental proof-of-concept fork of Python 3 I've been working on
for the past couple of years called PyParallel:
http://pyparallel.org. It essentially gets around the GIL
limitations and allows Python code to run simultaneously in multiple
threads from within a single interpreter/process.
It's Windows only -- so I figured it might be of interest to this list.
Impressive. Is the Windows only aspect because you used Windows
proprietary thread APIs? Or because there is something special about
Windows thread mechanisms that doesn’t carry over to, say, POSIX
pthread? I assume the former, in other words it should be adaptable
to standard thread architectures with some effort.
It's actually closer to the latter. I don't create a single thread with
PyParallel -- I use the threadpool facilities exclusively. POSIX doesn't
have anything like this -- the pthread abstraction is focused around the
thread (the "worker"), not the work. (OS X has GCD, which comes close.)
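
As a loose illustration of "worker" versus "work", here is a small sketch using only the stock CPython standard library (nothing PyParallel-specific, and `handle` is just a made-up unit of work). In the first half the program creates and owns each thread; in the second it only submits work and leaves the choice of thread to the pool.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def handle(request_id):
    # A unit of work; it neither knows nor cares which thread runs it.
    return request_id, threading.current_thread().name


# Worker-oriented: we create, start and join every thread ourselves.
owned = []
threads = [
    threading.Thread(target=lambda i=i: owned.append(handle(i)))
    for i in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Work-oriented: we describe the work and the pool decides who runs it.
with ThreadPoolExecutor(max_workers=2) as pool:
    pooled = list(pool.map(handle, range(4)))
```

In the pooled version four pieces of work are serviced by at most two threads, and the calling code never touches a thread object.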

You *could* port PyParallel to POSIX, but you're going to have to
implement all the stuff Windows provides for free, which is a pretty big
engineering effort. Even then, you're constrained by side-effects of
the readiness-oriented I/O model (versus the completion-oriented model
of Windows), so it becomes a lot harder to saturate all your I/O channels
*and* peg all of your cores at 100% under load. (A typical multithreaded
architecture will have a single thread dedicated to accept(), and then
simply round-robin dispatch new FDs to a manual thread pool. A thread
picks up the FD and adds it to its epoll/kqueue set -- that FD is now
bound to that thread, which can quickly result in lop-sided resource
allocation. The Windows I/O model revolves around the I/O request
packet, which allows the "thread agnostic" separation between the I/O
event (a write or read has completed) and the underlying thread used to
handle the callback.)
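
A toy model of that lop-sidedness, with made-up numbers: each connection gets an assumed "cost", and the two hypothetical functions below compare binding connection i to thread i % N (the round-robin accept() pattern) against letting whichever thread is least busy take the next piece of work. The second is only a crude stand-in for thread-agnostic completion handling, not a model of the Windows scheduler.

```python
def round_robin_load(costs, num_threads):
    """Bind connection i to thread i % num_threads, as an accept() thread would."""
    load = [0] * num_threads
    for i, cost in enumerate(costs):
        load[i % num_threads] += cost
    return load


def any_thread_load(costs, num_threads):
    """Let the currently least-busy thread take each new piece of work."""
    load = [0] * num_threads
    for cost in costs:
        idx = load.index(min(load))
        load[idx] += cost
    return load


# Assumed workload: every fourth connection is 100x more expensive.
costs = [100, 1, 1, 1] * 4
bound = round_robin_load(costs, 4)
shared = any_thread_load(costs, 4)
print("bound to threads:", bound)
print("any thread:      ", shared)
```

With this workload the round-robin version piles every expensive connection onto the same thread while the other three sit nearly idle; the total work is the same in both cases, only the distribution differs.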
Post by P***@Dell.com
paul
Trent.
Glyph
2015-08-03 05:04:09 UTC
Post by Trent Nelson
(OS X has GCD, which comes close.)
For what it's worth, libdispatch has been ported to other POSIX platforms: https://lists.macosforge.org/pipermail/libdispatch-dev/2011-April/000485.html

-glyph
Trent Nelson
2015-08-03 12:15:56 UTC
Post by Glyph
Post by Trent Nelson
(OS X has GCD, which comes close.)
For what it's worth, libdispatch has been ported to other POSIX
platforms:
https://lists.macosforge.org/pipermail/libdispatch-dev/2011-April/000485.html
Hmmm, it looks like pthread_workqueue has made it over too, which is a
good sign. I know when FreeBSD first ported GCD it wasn't able to add
in the "scheduler feedback" aspect that allowed the threadpool library
to create new threads on demand:

https://github.com/tpn/pdfs/blob/master/Grand%20Central%20Dispatch%20-%20FreeBSD%20Dev%20Summit%20(18%20Sep%202009).pdf

(See the page starting with "pthread_workqueue"... although I just
realized that PDF is nearly 6 years old!)

The GCD approach is definitely the best option for getting PyParallel
working on other platforms, so this is interesting.
Post by Glyph
-glyph
Trent.
