Python Generator, Coroutine and Continuation

25 Sep 2016

There are some ‘advanced’ constructs in Python that are very interesting but lesser known. They are generator, coroutine and continuation. They are actually not constructs specifically invented by Python; instead, they are concepts/components already existed long time ago in computing theory and practice. However, I will try to approach these concepts by giving a high level explanations and pointing to Python’s language support to these constructs, so we know how they are done in Python.

Preface - How Subroutine Works

Before we can comprehend those magic three constructs, firstly we need to understand how subroutine works in Python.

When Python makes a call, it allocates a frame object and put it on a call stack (not a traditional call stack/C stack in static memory; but a call stack in Python virtual machine). In the frame object, the following information is kept:

  1. local variables (name -> object bindings)
  2. offset to the current bytecode instruction; the offset is relative to the start of the code object’s immutable bytecode vector
  3. a evalution stack holding temps and dynamic block-nesting information

So basically, a frame stores a subroutine’s state information, and it’s a data structure kept by Python virutal machine (on OS heap memory, not on OS call stacks). When a subroutine returns, Python VM decrefs the frame and then the frame typically goes away.

Generator

Generators add two abstract operations, suspend and resume. When a generator suspends, it is exactly like a return except no decref-ing the frame, and that’s it! Therefore the information kept by the frame is not thrown away after suspension. A resume then restarting the frame at its next bytecode instruction, with the retained frame’s locals and eval stack. ‘Suspend’ is something only a generator can do, and ‘resume’ is something only its caller can do (although the caller can also be a generator).

And in Python, a generator is simply a function that yields.

Coroutine

Coroutines add yet another new abstract operation to generators: transfer. This gives coroutines the ability to also consume data (data that is transferred to them) whereas generators only produce data (data producer)! Transfer names a coroutine to transfer to, and gives a value to deliver to it. When A transfers to B, it acts like a generator suspends with respect to A and like a generator resumes with respect to B. Coroutines are working together as colleagues, no one is a subordinate of another. And unlike subroutines, coroutines have multiple entry and exit points, whereas subroutines have only one.

Coroutine is a very powerful technique to create light-weight, userspace (as opposed to kernel-space), concurrent and cooperative computing agents without using threads. It is so powerful that the core idea and mechanism behind several concurrent prorgramming (e.g. Greenlet) and non-blocking I/O networking (e.g. gevent and eventlet) libraries.

In Python, coroutine is introduced (via enhanced generators) since Python 2.5. The syntax and API of enhanced generator can be found in the Python2 documentation. For Python 3, the support of coroutine is even stronger, there are two ways of creating coroutines, one is via generators like Python 2 (but with an improved yield from syntax), the other is by using async def statement (introduced since Python 3.5).

Some examples and very nice introductory concepts of generator-based coroutine can be found here.

Continuation

Continuations are something different from generators and coroutines. They are much simpler than generators, even than a regular function call. In essence it is a supercharged goto statement (or nicknamed gotos with arguments). In theory a continuation is a function that computes the rest of the program, or its future (the continuation of the program). It is the basis for all control flow, from goto statements to exception handling, recursion, generators, coroutines, backtracking and even loops.

In implementation terms a continuation adds an abstract operation: capture, which captures the program counter, call stack, and local block stack at its point of invocation, and packages all that into a first-class continuation object. In other words, the program resumption state is captured at some point and assigned to a variable.

A continuation can be captured anywhere, and be invoked at will from anywhere else. Continuation is not like a function call though, in practice, it is a call that never returns to its caller (a goto!); it is abandoning the current continuation, replacing it with another one.

In Python there is no direct language support (call-with-current-continuation) to write code in Continuation Passing Style (CPS), however, Python can achieve CPS without call/cc. See examples here. And checkout more detailed explanations from Stackless Python.

Conclusion

A coroutine can be seen as a more general generator; a generator is a semi-coroutine. In the modern implementation of Python they are not difficult to implement because all the ‘state’ information is contained in a single frame object and is managed by Python VM. Continuation is something even more fundamental, a construct that unifies all, you can write continuation passing programs in Python, or in any language that supports some form of closures and automated garbage collection, although it might not be as intuitive and convenient.

P.S. Most parts of this post are extracts and summaries based on Tim Peter’s answer of a discussion in a mailing list.