Concurrency with Python: Functional Programming


Overview

In contrast to the threading/locking concurrency model I described previously, the functional concurrency model abstracts most, if not all, hardware primitives out of the application picture. No mutable state and no side effects can exist in (pure) functional programming -- even though the von Neumann architecture, on which all contemporary, commercially useful computer microarchitectures are based, literally composes finite state machines and remains wholly stateful. This is possible because, after compilation, all program instructions and variables are stored in memory and all functions become CPU branches, regardless of what design paradigms a high-level programming language specifies. In this sense, a functional language and its corresponding toolchain act as a declarative shim over an imperative reality.

Out of all the concurrency models I have covered or likely will cover in this series, functional programming is my favorite, for a number of reasons:

  • Fewer lines of code written: The code required for effective declarative programming, where you describe what you want the program to do, generally involves a subset of the code required for imperative programming, where you describe both what you want your program to do and how your program should do it. This loose upper bound implies that, other things equal, you will likely write less code, which leads to benefits like less technical debt, higher development velocity, and correspondingly wider options when it comes to development prioritization and business direction.

  • Independent upstream evolution: Since you don't micromanage your program, the maintainers of your dependencies can make optimizations under the hood without requiring you to change your application code. Hence, you may be able to capture a large swath of performance benefits simply by updating your dependencies, with no code changes required. For example, one reason SQL has a large moat in terms of query language adoption is the amount of work poured into SQL query optimization. This is possible because SQL is a declarative interface, so SQL engines and SQL application code can evolve separately. The fact that SQL is more or less standards-compliant (SQL92 standards document snapshot here) only accelerates this dynamic, as more people and companies can invest in and partake of the benefits, and as the client-side interface evolves with version independence and backwards compatibility in mind.

  • Better testing confidence: Gary Bernhardt gave a great presentation called "Functional Core, Imperative Shell", where your entire testing suite involves writing unit tests for your "functional soup" of business logic, along with a small number of regression tests for third-party connections. If you design your application to formalize the state from your interfaces using a facade, you may be able to parallelize your regression suite from the facade down (in Python, using something like pytest-xdist), resulting in huge performance gains for black-box testing. You could also achieve high alignment between your regression suite and your end-to-end test suite (which you cannot parallelize, due to the unknown concurrency models used by your dependencies). However, such benefits may become brittle if your toolchain cannot enforce state checks.

    The stateful alternative to this in practice is to define a large number of end-to-end tests and pray for correct behavior in production, since unit tests only end up calcifying your source code and slowing you down when fixing the inevitable bugs that surface the first time around (though unit tests are much more useful when refactoring existing code). I detailed one way to make your life easier in this reality by cheaply creating end-to-end tests in my series on data-driven testing, but even so, it doesn't change the nature of the game.

  • An ability to design software towards property satisfaction: Functional languages are referentially transparent, which makes it easy to design large-scale, composable systems.

    Pure functional languages also make it easier to reason about an entire system using principles of compositionality, which underpin denotational semantics, one branch of formal semantics. One nice benefit of such high-level systems abstractions is that you can reason about the safety, security, and correctness of your program using mathematical proofs; this is the domain of formal verification.

    Honestly, I mostly care about things like referential transparency and the ability to define rewrite systems, for tangible benefits like lazy evaluation, ease of refactoring, dependency visibility, and ease of parallelization. The other mathematical properties of functional programming are pretty deep rabbit holes that I don't have too much experience in at the moment. They may also be a little ivory-tower for my taste, and most companies will likely not care nor need to care about them.

    As an aside, I believe imandra.ai is one company built around the ethical governance of algorithms (I'm not paid by them; I just thought it was cool). This capability may grow more important as technology usage reaches saturation in some markets (HFT comes to mind), as companies try ethically riskier strategies in order to pick high-hanging fruit, and as company executives need to justify their behavior in concrete terms to potential regulators and the general public. Having publicly available theoretical proofs of algorithms used in production may be one way to assuage ethical concerns in the future while also masking any competitive advantages in a commoditized, red-ocean market. I think that's pretty neat.

  • An ability to naturally evolve application software towards dataflow-first programming: In data-intensive applications, it's helpful to separate the control flow from the data the application processes. Designing your request layer using functional principles helps ensure your application is implicitly parallelizable: state is scoped and isolated within functions, which allows the underlying execution engine to reorder work without worrying about dependencies. This may help reduce or eliminate CPU-bound restrictions with less mental overhead.

    For more information about designing data-intensive applications, see my book review on "Designing Data-Intensive Applications", by Martin Kleppmann, and buy the book.

In a philosophical sense, functional programming allows us to abstract our thinking away from reality and our mortal coil. This great Stack Overflow post describes John Backus's perspective towards imperative programming and its ties to the von Neumann bottleneck between the CPU and memory (quote from post with emphasis reposted here for ease of access):

Not only is this tube a literal bottleneck for the data traffic of a problem, but, more importantly, it is an intellectual bottleneck that has kept us tied to word-at-a-time thinking instead of encouraging us to think in terms of the larger conceptual units of the task at hand. Thus programming is basically planning and detailing the enormous traffic of words through the von Neumann bottleneck, and much of that traffic concerns not significant data itself but where to find it.

Python and Functional Programming

Python is an impure functional programming language. The Python specification, and all its implementations, are tightly coupled to von Neumann primitives. Guido might say that Python remains a wholly object-oriented language in which functions are merely first-class objects, as I described in a prior blog post about Python functions in practice.

In particular, several design decisions may keep Python from being a great language for serious functional programming:

  • Exclusive usage of heap memory: All memory within a Python process is allocated on a private heap created for that specific Python process, as described in Python's documentation on memory management. This is likely due to Python being an interpreted and dynamically typed language, where variable types are determined and variable references are assigned at runtime. All of this likely precludes statically determining memory allocation for Python variables, which in turn precludes any usage of stack memory.

    This means that functional best practices, like the usage of recursive functions, may not perform as well in Python as they might in some compiled languages, since you are allocating the call stack on heap memory and cannot simply collapse the stack with a pointer restoration. Nikhil Marathe did a great writeup on Python and native code interop with respect to crash reporting, which, if I understood it correctly, mentions how the CPython call stack is a doubly linked list of PyThreadState structs, dynamically allocated on heap memory (C typedef here):

        typedef struct _ts {
            /* See Python/ceval.c for comments explaining most fields */

            struct _ts *prev;
            struct _ts *next;
            PyInterpreterState *interp;

            struct _frame *frame;
            ...
            unsigned long thread_id; /* Thread id where this tstate was created */
            ...
        } PyThreadState;


    Hence, stack overflow errors in Python are not triggered by exhausting actual stack memory, but rather by a configurable variable called _PyRuntime.ceval.recursion_limit. Guido states in his blog post that, by default, Python will tolerate 1000 recursive calls before the interpreter throws a RuntimeError (py2.x) or RecursionError (py3.x). However, this limit can be changed using the function sys.setrecursionlimit().
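
    A minimal sketch of this limit in action (the exact default may vary by version):

        import sys

        sys.getrecursionlimit()        # 1000 by default in CPython

        def recurse(n):
            return recurse(n + 1)      # no base case: guaranteed to hit the limit

        try:
            recurse(0)
        except RecursionError:         # py2.x raises RuntimeError instead
            print("hit the interpreter's recursion limit, not a memory limit")

        sys.setrecursionlimit(5000)    # configurable, at your own risk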

    While recursion limits may not be the same as stack overflows, I'm guessing this feature tamps down on the number of stack overflow errors in production. I think you can still get stack overflow errors, though they look slightly different (I ran into an issue that looked similar to this error in a Django application, reported on Stack Overflow).

    I would expect other implementations of the Python specification to behave the same way as CPython in this regard, due to the interpreted nature of the language. Cython may behave differently, due to the compiled nature of Python C extensions.

  • Lack of tail recursion elimination: Python deliberately does not implement tail recursion elimination, where, if the tail call (the last procedure executed in a function) is recursive, the existing execution context and stack frame are reused rather than a new frame being pushed, saving the memory cost of stacking near-identical frames. From Guido's blog posts on tail recursion elimination, and his follow-up to the heated discussion afterwards, Python does not implement this because destroying the existing stack frame results in incomplete stack traces, and because it would give programmers a reason to use recursion more often in application code, which would tie application code to a specific Python implementation's internals and preclude greater source code portability. It is not "Pythonic".
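
    To illustrate, here is a minimal sketch (not from Guido's posts): the recursive call below is in tail position, yet CPython still pushes a fresh stack frame for every call, so deep inputs hit the recursion limit anyway.

        def count_down(n):
            if n == 0:
                return "done"
            return count_down(n - 1)   # a tail call, but a new frame is pushed anyway

        count_down(10)        # 'done'
        count_down(100_000)   # RecursionError, despite being tail recursive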

  • Pass object reference by value: Robert Heaton wrote a great blog post explaining how Python was neither pass by reference nor pass by value, but rather "pass object reference by value". To clarify this statement, he gave the following code sample (that I punched into an IPython shell here for clarity and inline evaluation):

        (python3.7) host:~ username$ ipython
        # ...

        In [1]: def reassign(thing):
           ...:     thing = [0, 1]
           ...:

        In [2]: thing = [0]

        In [3]: reassign(thing)

        In [4]: thing
        Out[4]: [0]

        In [5]: def append(thing):
           ...:     thing.append(1)
           ...:

        In [6]: thing = [0]

        In [7]: append(thing)

        In [8]: thing
        Out[8]: [0, 1]
    

    If the reference is reassigned within the function, the new binding is scoped to the function and the caller's object is untouched. However, if the object's value is mutated within the function while keeping the reference intact, the change is visible through every reference to that object, in any scope (including global scope).

    I've found this idiosyncratic evaluation strategy to be extremely annoying in practice. Not only does object scoping depend on your application code, but given that Python evaluates default function arguments only once, at definition time, Python will continually mutate a mutable default argument in any function definition for the life of the process, as demonstrated below. This "pass object reference by value" notion naturally leads to tight coupling across functions in your Python application, and reduces your ability to leverage function composition.
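
    A minimal demonstration of the default-argument pitfall:

        def append_one(items=[]):     # the default list is created once, at definition
            items.append(1)
            return items

        append_one()   # [1]
        append_one()   # [1, 1] -- the same list object is mutated across calls

        def append_one_safe(items=None):
            items = [] if items is None else items   # fresh list on every call
            items.append(1)
            return items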

Functional Things You Can Do In Python

Even with its drawbacks in regard to "doing things the functional way", Python provides enough functional utilities to improve certain code paths with functional techniques. Here are some useful, Pythonic code snippets you can use to see functional programming in action:

List Comprehensions and Generator Expressions

List comprehensions expose the data within a list at the top level during struct creation, and have you apply a single transformation to the entire list. They make your dataflow extremely visible, which is a good first step when transitioning from an object-oriented paradigm to a functional one. They also make you think twice before doing things like using conditional statements, since transforming all the data at once pushes you to think more about the similarities in the data than about the differences.

While writing this blog post, I was pleasantly surprised to discover that list comprehensions were not syntactic sugar around loops, but rather optimized at the bytecode level to cut out unnecessary internal method calls. Ashwini Chaudhary wrote a great Stack Overflow answer to this very question, by using the dis module in a low-level deep dive.

Generator expressions can be thought of as the lazily evaluated analogue to list comprehensions. The two work very well together. Oftentimes when I am debugging a generator expression, I replace it inline with a list comprehension before adding import ipdb; ipdb.set_trace() to examine the generated data, correct my statements, and reinsert the generator.
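
A minimal comparison of the two (names are illustrative):

    squares_list = [n * n for n in range(5)]   # eager: fully materialized now
    squares_gen = (n * n for n in range(5))    # lazy: nothing computed yet

    squares_list       # [0, 1, 4, 9, 16]
    next(squares_gen)  # 0 -- values are produced one at a time, on demand
    sum(squares_gen)   # 30 -- consumes the remaining values (1 + 4 + 9 + 16)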

Guido talks about list comprehensions and generator expressions at length, and how best to use them in this blog post.

Python Functional Libraries

The Python standard library has a large number of built-in functions, including a subset of common functional programming primitives like all(), any(), map(), sum(), and zip(). I've found these functions work well and remain idiomatic when combined with list comprehensions, since you can move any shape-altering transform (i.e. reducers) outside the comprehension.
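
For example, a small sketch of keeping the per-element transform inside the comprehension while the reducers stay outside (names are illustrative):

    words = ["alpha", "beta", "gamma"]

    lengths = [len(word) for word in words]          # transform inside the comprehension
    total = sum(len(word) for word in words)         # reducer outside: 14
    any_long = any(len(word) > 4 for word in words)  # True
    pairs = list(zip(words, lengths))                # [('alpha', 5), ('beta', 4), ('gamma', 5)]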

I first experienced the wonders of functional programming in JavaScript using lodash, and as it turns out, Python has something similar in a library called toolz, which extends the paradigms of libraries like itertools and functools, and also interoperates with multi-process, parallelizing libraries in Python.
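
A small, hedged sketch of the style toolz enables, assuming toolz is installed (pip install toolz):

    from toolz import pipe
    from toolz.curried import filter, map  # curried variants take the function first

    result = pipe(
        range(10),
        filter(lambda n: n % 2 == 0),  # keep the even numbers
        map(lambda n: n * n),          # square them
        list,                          # materialize the lazy pipeline
    )
    # result == [0, 4, 16, 36, 64]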

Other Functional Optimizations

Trampolining

Trampolining, in high-level programming, is a way to avoid growing your call stack: instead of recursing directly, a function returns a "thunk" (a zero-argument function capturing the next step of the otherwise recursive computation). A looping driver function (the trampoline) then repeatedly invokes each thunk until a final value appears, so nothing ever actually recurses. Jake Tauber wrote a great blog post about defining a trampolined factorial function in Python.
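
A minimal sketch of a trampolined factorial in that spirit (not his exact code):

    def trampoline(fn, *args):
        result = fn(*args)
        while callable(result):   # keep bouncing until a final (non-thunk) value appears
            result = result()
        return result

    def factorial(n, acc=1):
        if n <= 1:
            return acc
        return lambda: factorial(n - 1, acc * n)   # a thunk deferring the tail call

    trampoline(factorial, 100_000)   # no RecursionError; the call stack never grows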

I've never done this and would not recommend doing this in Python beyond as an academic exercise, both due to the lack of tail call optimization in Python and the latent but very real possibility that your fellow developers would need to think quite hard about how and why you are implementing application-side tail recursion elimination.

Currying

Currying is the incremental binding of function arguments: given an n-ary function signature at the top level, you lazily, iteratively, and dynamically define ever more specialized functions as each argument is fixed. I think of currying as the functional analogue to the object-oriented practice of method chaining on an object instance (if only the methods were dynamically defined). Trilarion wrote a great Stack Overflow answer on why you may want to use currying in Python.
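
In Python, functools.partial is the closest built-in tool; strictly speaking, this is partial application rather than currying, but it captures the same incremental-specialization idea (names here are illustrative):

    from functools import partial

    def power(base, exponent):
        return base ** exponent

    square = partial(power, exponent=2)   # bind one argument, get a unary function
    cube = partial(power, exponent=3)

    square(5)   # 25
    cube(5)     # 125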

Again, I don't know how Pythonic this is. I would prefer to accept default parameters in the function signature instead.

Monoids, Monads, and Category Theory

I don't think I can explain monads as well as others can at the moment, and I currently have little background in them, so I will provide links instead.

Nikolay Grozev wrote a great blog post detailing how monads can reduce glue code required in imperative languages, and he referenced this helpful blog post on multiple monads implemented in Python.

I really don't think using monads in Python is a good idea, for the same reason you shouldn't start a company in Lisp: it's too powerful. This Hacker News post on "The Programming Language Conundrum" explains how this power makes it very hard to scale out development teams, since it tightly couples developers' brains to your (effective) re-specifications of the language. In addition, I don't think I fully understand how monads fail, and until then, I'd rather write simple code that fails nicely. They are mysterious and cool to me, though, and I would like to understand them better.

This is merely a small sample of the different kinds of patterns functional programming exposes you to. Plenty more are visible online, such as on Jeremy Gibbons's blog (although the examples may not be written in Python).

So Where's the Parallelization?

The nice thing about pure functions is you can easily pass them into multiprocessing.Pool.map, where the Python internals handle the task-level execution (and even the chunking of your data if you want). Since pure functions certainly don't interact with the threading model explicitly, and since Python assumes a single thread of execution otherwise, you can be pretty sure that pure functions mapped over data using multiprocessing.Pool.map will not encounter issues like deadlock or race conditions.
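
A minimal sketch of mapping a pure function over a pool (the function must be defined at module level so it can be pickled for the worker processes):

    from multiprocessing import Pool

    def square(n):   # pure: no shared state, no side effects
        return n * n

    if __name__ == "__main__":
        with Pool(processes=4) as pool:
            results = pool.map(square, range(10), chunksize=2)
        print(results)   # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]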

One thing to keep in mind with using multiprocessing.Pool is that child processes instantiated within the process pool are always daemonic. Since Python daemonic processes cannot have children, you cannot nest Pool and pooled processes, resulting in a fairly flat and explicit multiprocess model. See this Stack Overflow answer on the same question, and the initialization of a Pool object, to see how worker threads behave in a process pool.

In addition, I don't think Python has any built-in parallel reducers like Clojure's reducers/fold. The closest thing may be something like toolz.sandbox.parallel.fold. You could also implement your own parallel reducers, as in this Stack Overflow answer, but you will need to keep in mind how your method interacts with the overall call graph of your application, and it may be expensive to maintain in production.
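
A hedged sketch of a hand-rolled parallel fold, assuming the reducing operation is associative so that chunk results can be combined in any grouping (all names here are illustrative):

    import operator
    from functools import reduce
    from multiprocessing import Pool

    def chunked(seq, size):
        return [seq[i:i + size] for i in range(0, len(seq), size)]

    def reduce_chunk(chunk):   # runs in a worker process
        return reduce(operator.add, chunk)

    if __name__ == "__main__":
        data = list(range(1_000_000))
        with Pool() as pool:
            partials = pool.map(reduce_chunk, chunked(data, 100_000))
        total = reduce(operator.add, partials)   # combine the partial results
        print(total)   # 499999500000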

I don't think anybody has made any serious libraries around parallelizing functional work in pure Python, because of the global interpreter lock. Many of the heavily used numeric processing libraries, like pandas and numpy, which allow for pure functional transforms, are written in C or Cython, which allows them to release the GIL and achieve effective parallelism.

Conclusion

If you imagine programming languages along an adoption funnel, with the axis being how "functional" they are, Python would remain at the widest portion of that funnel. It's not very functional, but it gives object-oriented programmers the opportunity to use functional programming and easily demonstrate its utility through provable benefits, which gives organizations the courage to dive deeper into the functional world.

Personally, I'm quite hopeful that the majority of benefits of functional programming can come out even in an object-oriented language with functional primitives like Python, and that organizations are well-positioned to capture that engineering surplus (although I don't have the requisite experience to make that claim authoritative).

To learn more about effectively parallel functional patterns (implemented in Clojure), check out “Seven Concurrency Models in Seven Weeks”, by Paul Butcher.