Bytes by Ying

Concurrency with Python: Functional Programming




Overview

In contrast to the threading/locking concurrency model I described previously, the functional concurrency model abstracts most if not all hardware primitives out of the application picture. No mutable state and no side effects can exist in (pure) functional programming, even though the von Neumann architecture, on which all contemporary, commercially useful computer microarchitectures are based, is built from finite state machines that remain wholly stateful. This is possible because after compilation, all program instructions and variables are stored in memory and all functions become CPU branches, regardless of what design paradigms a high-level programming language specifies. In this sense, a functional language and its corresponding toolchain act as a declarative shim over the imperative reality.

Out of all the concurrency models I have covered or likely will cover in this series, functional programming is my favorite, for a number of reasons:

In a philosophical sense, functional programming allows us to abstract our thinking away from reality and our mortal coil. This great Stack Overflow post describes John Backus’s perspective towards imperative programming and its ties to the von Neumann bottleneck between the CPU and memory (quote from post with emphasis reposted here for ease of access):

Not only is this tube a literal bottleneck for the data traffic of a problem, but, more importantly, it is an intellectual bottleneck that has kept us tied to word-at-a-time thinking instead of encouraging us to think in terms of the larger conceptual units of the task at hand. Thus programming is basically planning and detailing the enormous traffic of words through the von Neumann bottleneck, and much of that traffic concerns not significant data itself but where to find it.

Python and Functional Programming

Python is an impure functional programming language. The Python specification, and all its implementations, are tightly coupled to von Neumann primitives. Guido might say that Python remains a wholly object-oriented language, since functions are merely first-class objects, as I described in a prior blog post about Python functions in practice.
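To make "impure" and "first-class objects" concrete, here is a minimal sketch. The names `audit_log`, `impure_double`, `pure_double`, and `transforms` are made up for illustration and do not come from the earlier post.

```python
audit_log = []

def impure_double(x):
    audit_log.append(x)   # side effect: mutates state outside the function
    return 2 * x

def pure_double(x):
    return 2 * x          # output depends only on the input

# Functions are ordinary objects: they can be stored in a dict,
# passed around, and called later like any other value.
transforms = {"double": pure_double, "negate": lambda x: -x}
applied = [transforms[name](5) for name in ("double", "negate")]
print(applied)            # [10, -5]
```

Nothing in the language stops `impure_double` from touching `audit_log`. Purity in Python is a convention you keep, not a guarantee you get.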

In particular, specific design decisions may make Python a poor fit for serious functional programming:

Functional Things You Can Do In Python

Even with its drawbacks with regard to "doing things the functional way", Python provides enough functional utilities to improve certain code snippets with functional bits. Here are some useful, Pythonic techniques you can use to see functional programming in action:

List Comprehensions and Generator Expressions

List comprehensions expose the data within a list at the top level while the list is being created, and make you apply a single transformation to the entire list. This makes your dataflow extremely visible, which is a good first step when transitioning from an object-oriented paradigm to a functional paradigm. It also makes you think twice before reaching for conditional statements, since transforming all the data at once makes you think more about the similarities in the data than the differences.

While writing this blog post, I was pleasantly surprised to discover that list comprehensions are not just syntactic sugar around loops, but are optimized at the bytecode level to cut out unnecessary internal method calls. Ashwini Chaudhary wrote a great Stack Overflow answer to this very question, using the dis module for a low-level deep dive.

Generator expressions can be thought of as the lazily evaluated analogue to list comprehensions. The two work very well together. Oftentimes when I am debugging a generator expression, I replace it inline with a list comprehension before adding import ipdb; ipdb.set_trace() to examine the generated data, correct my statements, and reinsert the generator.
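Here is a small sketch of the eager/lazy difference and of the debugging swap described above. The `readings` data is invented for illustration.

```python
readings = [3, 8, 15, 4, 23]

# List comprehension: eager, builds the whole list in memory at once.
doubled_list = [2 * r for r in readings]

# Generator expression: lazy, produces one value at a time on demand.
doubled_gen = (2 * r for r in readings)

# A generator is consumed as it is iterated, so it can only be walked once.
first = next(doubled_gen)    # 6
rest = list(doubled_gen)     # [16, 30, 8, 46]

# Debugging trick: temporarily swap () for [] so the full result can be
# inspected in a debugger, then switch back to the lazy form.
debug_view = [2 * r for r in readings if r > 5]
print(doubled_list, first, rest, debug_view)
```

The only syntactic difference is the bracket type, which is why swapping one for the other while debugging is so cheap.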

Guido talks about list comprehensions and generator expressions at length, and how best to use them in this blog post.

Python Functional Libraries

Python ships with a large number of built-in functions, a subset of which cover common functional programming patterns, such as all(), any(), map(), sum(), and zip(). I've found these functions work well and remain idiomatic when combined with list comprehensions, since you can move any transform that alters the shape of the data (i.e. reducers) outside the comprehension.
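As a rough illustration of that split, the sketch below keeps the per-item transform inside the comprehension and the shape-changing reducers outside it. The `orders` data (name, quantity, price in cents) is made up.

```python
orders = [("apple", 3, 50), ("pear", 0, 75), ("plum", 12, 20)]

# The comprehension handles the per-item transform...
line_totals = [qty * cents for _, qty, cents in orders]

# ...and the built-in reducers collapse the result outside of it.
grand_total = sum(line_totals)
any_empty = any(qty == 0 for _, qty, _ in orders)
all_priced = all(cents > 0 for _, _, cents in orders)

# zip() pairs up parallel sequences; map() applies a function lazily.
names = [name for name, _, _ in orders]
shouted = list(map(str.upper, names))
by_name = dict(zip(names, line_totals))
print(grand_total, any_empty, all_priced, shouted, by_name)
```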

I first experienced the wonders of functional programming in JavaScript using lodash. As it turns out, Python has something similar in a library called toolz, which extends the patterns found in standard library modules like itertools and functools, and which can also plug into multi-process, parallelizing libraries in Python.
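To give a flavor of the style these libraries encourage, here is a sketch that uses only itertools and functools. The `pipe` helper is a hypothetical stand-in written for this example; toolz provides helpers in the same spirit, but this is not its implementation.

```python
from functools import reduce
from itertools import chain, islice

def pipe(value, *funcs):
    """Thread value through funcs from left to right (illustrative helper)."""
    return reduce(lambda acc, f: f(acc), funcs, value)

nested = [[1, 2], [3, 4], [5, 6]]

result = pipe(
    nested,
    chain.from_iterable,               # flatten lazily: 1, 2, 3, 4, 5, 6
    lambda xs: (x * x for x in xs),    # square each item lazily
    lambda xs: islice(xs, 4),          # keep only the first four
    list,                              # realize the result
)
print(result)                          # [1, 4, 9, 16]
```

Every stage except the final `list` is lazy, so no intermediate list is built along the way.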

Other Functional Optimizations

Trampolining

Trampolining, in high-level programming, is a way to avoid growing your call stack in an otherwise recursive problem. Instead of making the recursive call directly, each step returns a "thunk": a zero-argument function that wraps the next call. A looping function (the trampoline) then invokes thunks one after another until a final value comes back, so the recursive calls never nest. Jake Tauber wrote a great blog post about defining a trampolined factorial function in Python.

I've never done this and would not recommend doing this in Python except as an academic exercise, both because Python lacks tail call optimization and because of the latent but very real possibility that your fellow developers would need to think quite hard about how and why you are implementing application-side tail recursion elimination.
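Strictly as that kind of academic exercise, trampolining can be sketched as follows. This is my own minimal version, not the code from the linked post, and `trampoline` and `factorial` are illustrative names.

```python
def trampoline(result):
    """Keep calling thunks until a non-callable value comes back."""
    while callable(result):
        result = result()
    return result

def factorial(n, acc=1):
    """Tail-recursive factorial that returns a thunk instead of recursing."""
    if n <= 1:
        return acc
    # Defer the next step; the trampoline loop will invoke it.
    return lambda: factorial(n - 1, acc * n)

print(trampoline(factorial(5)))    # 120

# A depth that would raise RecursionError with plain recursion under the
# default interpreter limit runs fine, since calls never nest.
big = trampoline(factorial(5000))
```

One caveat of this simple version: it assumes the final result is never itself callable, since the loop would then try to invoke it.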

Currying

Currying is a form of incremental binding of function arguments, lazily, iteratively and dynamically defining ever more specialized functions given the n-ary function signature passed in at the top level. I think of currying as the functional analogue to the object-oriented way of method chaining on an object instance (if only the methods were dynamically defined). Trilarion wrote a great Stack Overflow answer as to why you may want to use currying in Python.

Again, I don't know how Pythonic this is. I would prefer to accept default parameters in the function signature instead.
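For comparison, here is a sketch of the three options side by side: a hand-rolled `curry` helper (hypothetical, positional arguments only), the standard library's functools.partial, and plain default parameters.

```python
from functools import partial
from inspect import signature

def curry(func):
    """Illustrative helper: collect positional args until func's arity is met."""
    arity = len(signature(func).parameters)

    def collect(*args):
        if len(args) >= arity:
            return func(*args)
        return lambda *more: collect(*args, *more)

    return collect

def volume(length, width, height):
    return length * width * height

# Currying: bind one argument at a time, getting a new function each step.
slab = curry(volume)(2)(3)       # still waiting on height
print(slab(4))                   # 24

# functools.partial binds a fixed set of arguments in a single step.
base_2x3 = partial(volume, 2, 3)
print(base_2x3(4))               # 24

# Default parameters: the alternative preferred in the text.
def volume_with_defaults(length, width=3, height=4):
    return length * width * height

print(volume_with_defaults(2))   # 24
```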

Monoids, Monads, and Category Theory

I currently have no background in monads and can't explain them as nicely as others can, so I will provide links instead.

Nikolay Grozev wrote a great blog post detailing how monads can reduce glue code required in imperative languages, and he referenced this helpful blog post on multiple monads implemented in Python.
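To hint at what "reducing glue code" can look like, here is a deliberately tiny sketch of a Maybe-style container. It is my own illustrative construction, not code from either linked post, and it glosses over the formal monad laws. The names `Maybe`, `NOTHING`, `parse_int`, and `reciprocal` are all hypothetical.

```python
class Maybe:
    """Minimal Maybe-style container: holds a value, or nothing."""

    def __init__(self, value=None, ok=True):
        self.value = value
        self.ok = ok

    def bind(self, func):
        # Once a previous step has failed, skip every later step.
        return func(self.value) if self.ok else self

NOTHING = Maybe(ok=False)

def parse_int(text):
    try:
        return Maybe(int(text))
    except ValueError:
        return NOTHING

def reciprocal(n):
    return Maybe(1 / n) if n != 0 else NOTHING

good = Maybe("4").bind(parse_int).bind(reciprocal)
bad = Maybe("four").bind(parse_int).bind(reciprocal)
print(good.ok, good.value)   # True 0.25
print(bad.ok)                # False
```

The glue being removed is the chain of "did the last step fail?" checks that would otherwise sit between each call.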

I really don’t think using monads in Python is a good idea, for the same reason you shouldn’t start a company in Lisp: it’s too powerful. This Hacker News post on “The Programming Language Conundrum” explains how this power makes it very hard to scale out development teams, since it tightly couples brains and your (effective) language re-specifications. In addition, I don’t think I fully understand how monads fail, and until then, I’d rather write simple code that fails nicely. They are mysterious and cool to me, though, and I would like to understand them better.


This is merely a small sample of the different kinds of patterns functional programming exposes you to. Plenty more are visible online, such as on Jeremy Gibbons's blog (although the examples may not be written in Python).

So Where’s the Parallelization?

The nice thing about pure functions is you can easily pass them into multiprocessing.Pool.map, where the Python internals handle the task-level execution (and even the chunking of your data if you want). Since pure functions certainly don’t interact with the threading model explicitly, and since Python assumes a single thread of execution otherwise, you can be pretty sure that pure functions mapped over data using multiprocessing.Pool.map will not encounter issues like deadlock or race conditions.
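A minimal sketch of that pattern follows. `square` and `parallel_squares` are illustrative names, and the pool size and `chunksize` values are arbitrary.

```python
from multiprocessing import Pool

def square(n):
    """Pure function: the result depends only on the argument."""
    return n * n

def parallel_squares(values, processes=2, chunksize=4):
    """Illustrative wrapper: map square over values in a process pool."""
    with Pool(processes=processes) as pool:
        # chunksize controls how many items are handed to a worker per task.
        return pool.map(square, values, chunksize=chunksize)

if __name__ == "__main__":
    # The guard keeps worker processes from re-running this block when
    # they import the module.
    print(parallel_squares(range(10)))
```

The function has to be defined at module level so it can be pickled and sent to the workers, which is another nudge toward small, self-contained pure functions.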

One thing to keep in mind with using multiprocessing.Pool is that child processes instantiated within the process pool are always daemonic. Since Python daemonic processes cannot have children, you cannot nest Pool and pooled processes, resulting in a fairly flat and explicit multiprocess model. See this Stack Overflow answer on the same question, and the initialization of a Pool object, to see how worker processes behave in a process pool.

In addition, I don’t think Python has any built-in parallel reducers like Clojure’s reducers/fold. The closest thing may be something like toolz.sandbox.parallel.fold. You could also implement your own parallel reducers, like this Stack Overflow answer, but you will need to manually keep in mind the interactions your method has within the overall call graph of your application, and it may be expensive to maintain in production.
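For a sense of what rolling your own involves, here is a rough sketch of a chunked fold. It is not the code from the linked answer or from toolz. `chunked` and `parallel_fold` are hypothetical helpers, and the approach only gives correct results when the binary operation is associative and the input is non-empty.

```python
from functools import partial, reduce
from multiprocessing import Pool
import operator

def chunked(seq, size):
    """Split a list into consecutive slices of at most size items."""
    return [seq[i:i + size] for i in range(0, len(seq), size)]

def parallel_fold(binop, seq, chunksize=1000, map_fn=map):
    """Reduce each chunk via map_fn, then reduce the partial results.

    binop must be associative, because chunk boundaries regroup the
    operations. seq must be non-empty.
    """
    partials = map_fn(partial(reduce, binop), chunked(seq, chunksize))
    return reduce(binop, partials)

if __name__ == "__main__":
    data = list(range(1, 100_001))
    with Pool(processes=4) as pool:
        # Passing pool.map distributes the per-chunk reductions to workers.
        total = parallel_fold(operator.add, data, chunksize=10_000, map_fn=pool.map)
    print(total == sum(data))
```

Making the map function a parameter keeps the fold itself pure and testable with the built-in `map`. Only the caller decides whether the work is actually distributed.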

I don't think anybody's made any serious libraries around parallelizing functional work in pure Python because of the global interpreter lock. Many of the widely used numeric processing libraries that allow for pure functional transforms, like numpy and pandas, are implemented largely in C and Cython, which allows release of the GIL and effective parallelism.

Conclusion

If you imagine programming languages along an adoption funnel, with the axis being how “functional” they are, Python would remain at the widest portion of that funnel. It’s not very functional, but it gives object-oriented programmers the opportunity to use functional programming and easily demonstrate its utility through provable benefits, which gives organizations the courage to dive deeper into the functional world.

Personally, I’m quite hopeful that the majority of benefits of functional programming can come out even in an object-oriented language with functional primitives like Python, and that organizations are well-positioned to capture that engineering surplus (although I don’t have the requisite experience to make that claim authoritative).

To learn more about effectively parallel functional patterns (implemented in Clojure), check out “Seven Concurrency Models in Seven Weeks”, by Paul Butcher.

