Type-safe pipe function implementation in Python

🏠 Go home
Functional programming Python Typing 2022-08-10

I'm currently investigating how to implement some pure functional concepts in the Python. One of the very basic things we use in functional programming is composition of functions and a similar concept, piping value through a list of functions. In some languages (like F# or Elixir) there are builtin operators for that. For example F# has something like this.

value |> firstFunction |> secondFunction

In Python (or in Javascript) we can implement the behaviour using function accepting arbitrary number of arguments.

pipe(value, first_function, second_function)

Which we want to be equivalent to this.

second_function(first_function(value))

And actually, creating such a function in Python isn't really hard. We just accept whatever position arguments and then apply one after one. I decided for the recursive implementation, but it should be doable also using for / while loops or reduce function from itertools.

def pipe(init, *fns):
    if len(fns) == 0:
        return init
    return pipe(fns[0](init), *fns[1:])

Now, this works, but we don't have any types to have our back after we screw something up. What we want is the compiler to know, that if f1 :: A -> B and f2 :: B -> C, then pipe(a, f1, f2) has type C. Also we want to make sure that every result of previously evaluated function can be passed to input of the very next function.

For example, if pipe supported only two functions, we want something like this.

from typing import Callable, TypeVar

Fn = Callable[[A], B]

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")

def pipe(init: A, f: Fn[A, B], g: Fn[B, C]) -> C:
    ...

Unfortunately, there doesn't seem to be an easy way to write such a generic function in a type-safe manner. So, we'll employ similar approach functional libraries in Typescript do, we'll create function overloads.

@overload decorator

Python's typing module has the overload decorator. But what is overloading anyway?

Overloading is a technique used to create multiple variants of the same function with different type signatures. Don't confuse with function overriding which is a different technique! Some languages like C# support compile-time dispatch along with overloading. That means, you're allowed to overload a function and when calling the function, compiler will decide which overload to choose. Python doesn't do that by itself but there's actually a singledispatch decorator in the standard library which implements the dispatch behaviour on the first argument of the decorated function. Check the official docs page for more information.

We can't use the dispatch because we'd need to dispatch on all the parameters of the function (although, I can image such a decorator would be possible to implement). Instead, we'll use the overload decorator to communicate to the type-checker which signatures we're aiming to support and then provide a single implementation.

If we wanted to boost our implementation of handling different requests from the post about TypeGuard by returning specific response based on the input model, we could do something like this.

from typing import Any, overload

SearchByIdRequest = ...
SearchByNameRequest = ...

SearchByIdResponse = ...
SearchByNameResponse = ...

SearchRequest = SearchByIdRequest | SearchByNameRequest
SearchResponse = SearchByIdResponse | SearchByNameResponse

@overload
def run_database_query(request: SearchByIdRequest) -> SearchByIdResponse:
    ...

@overload
def run_database_query(request: SearchByNameRequest) -> SearchByNameResponse:
    ...

def run_database_query(request: Any) -> Any:
    # actual implementation
    ...

As the typing docs says, after a series of overloaded definitions there must be exactly one not-decorated definition. During the runtime, all the previous @overload-decorated function will be overwritten by the last definition. The type signature of the last function doesn't really matter, it can be omitted completely so I put there Any. Mypy will check whether the function implementation can handle all the overloads by checking the arguments for us. Otherwise, it is completely in our hands to implement the function correctly with respect to specified overloads.

Final solution

The final implementation looks like this. Of course, it has a big disadvantage that whenever someone calls the pipe with more arguments than the very last overload, the mypy will produce errors. From the experience, I almost never use pipe with more than 10 arguments. Therefore, there will be a small bulk of overloads and we can safely add new ones if needed because the actual implementation can handle an arbitrary number of arguments.

Also, I needed to use the positional-arguments-only syntax in my overloads. Otherwise, mypy would yell it me. In the very last implementation function, I accept arbitrary positional arguments, but overloads say you could invoke the pipe the like this.

pipe(init=1, f=lambda x: x + 1, g=list)

Which is not correct, because the last implementation function accepts only arbitrary positional arguments.

Just for completeness, it is actually not necessary to use the / syntax. If I named all the parameters with double underscore (__init, __f, ...) it would have the same effect for mypy.

from typing import Any, Callable, TypeVar, overload

Fn = Callable[[A], B]

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")
D = TypeVar("D")

@overload
def pipe(init: A, /) -> A:
    ...

@overload
def pipe(init: A, f: Fn[A, B], /) -> B:
    ...

@overload
def pipe(init: A, f: Fn[A, B], g: Fn[B, C], /) -> C:
    ...

@overload
def pipe(init: A, f: Fn[A, B], g: Fn[B, C], h: Fn[C, D], /) -> D:
    ...

# etc...

def pipe(init: Any, *fns: Any) -> Any:
    """Transform value `init` using provided functions one by one."""
    if len(fns) == 0:
        return init
    return pipe(fns[0](init), *fns[1:])