Concurrency, Parallelism, and Distributed Systems
Concurrency refers to running multiple computations more or less simultaneously, whereas parallelism refers to using multiple cores or OS-level threads to coordinate computation. We now know that the former is relatively safe and easy to reason about, whereas the latter is extremely difficult and causes many subtle bugs. OCaml currently supports concurrency elegantly, but parallelism support is not built into the runtime.
Concurrency
- eio: A concurrency library using the OCaml effect system. Unlike the other concurrency libraries, eio doesn’t require the usage of monads. This makes it easier to write ‘plain’ OCaml code.
  - video presentation on eio.
- lwt: a monadic concurrency library. Concurrent code uses monads to express the higher-level abstractions of control flow.
- lwt-pipe: Stream/queue for lwt.
- RWO-lwt: Real World OCaml code examples translated from Async to lwt.
- lwt tutorial part 2 part 3
- Async: another monadic concurrency library developed by Jane Street. This library is covered in Real World OCaml. While the concept is very similar to lwt, small discrepancies make compatibility between the libraries difficult.
- LUV: Bindings to libuv, the event-loop library that powers Node.js. It can also serve as a replacement for the Unix module, allowing for full process control in a system-independent manner.
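To make the "no monads" point concrete, here is a minimal sketch of direct-style cooperative concurrency built on effect handlers. It assumes an OCaml 5 compiler, where the `Effect` module is part of the standard library. It is a toy round-robin scheduler and not eio's actual API: the `Yield` effect and the `yield`, `run_all`, and `fiber` names are made up for illustration.

```ocaml
(* Requires OCaml 5: the Effect module lives in the standard library. *)
open Effect
open Effect.Deep

(* Hypothetical effect: a fiber asks the scheduler to let others run. *)
type _ Effect.t += Yield : unit Effect.t

let yield () = perform Yield

(* Run the given fibers round-robin on a single thread. *)
let run_all (fibers : (unit -> unit) list) : unit =
  let ready : (unit -> unit) Queue.t = Queue.create () in
  let run_next () =
    match Queue.take_opt ready with
    | Some task -> task ()
    | None -> ()
  in
  let spawn f =
    match_with f ()
      { retc = (fun () -> run_next ());   (* fiber finished: run another *)
        exnc = raise;
        effc = (fun (type a) (eff : a Effect.t) ->
          match eff with
          | Yield ->
            Some (fun (k : (a, unit) continuation) ->
              (* Park the suspended fiber and switch to the next one. *)
              Queue.add (fun () -> continue k ()) ready;
              run_next ())
          | _ -> None) }
  in
  List.iter (fun f -> Queue.add (fun () -> spawn f) ready) fibers;
  run_next ()

(* Usage: two fibers written as ordinary loops, with no bind or >>=. *)
let log = Buffer.create 16

let fiber name () =
  for i = 1 to 2 do
    Buffer.add_string log (Printf.sprintf "%s%d " name i);
    yield ()
  done

let () = run_all [ fiber "a"; fiber "b" ]
(* log now holds "a1 b1 a2 b2 ": the fibers interleave at each yield. *)
```

The fibers are plain functions with ordinary `for` loops. In a monadic library such as lwt or Async, the same interleaving would be expressed by chaining promises with a bind operator.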
Articles
- The blog post that introduced Async
- A user gives up on Async
- Cooperative Concurrency in OCaml: Using Async
Parallelism
As mentioned above, OCaml currently doesn’t natively support running multiple OS-level OCaml threads simultaneously: a global runtime lock allows only one OCaml thread to run at a time.
Since we currently don’t have thread-level parallelism, process-level parallelism is used instead.
- Parmap: Provides easy-to-use parallel map and fold functions. The library makes use of forking to create short-lived child processes, and memory mapping to feed the data back to the parent process.
- Parany: Generalized map reduce for multicore computers (unfold, map in parallel, fold). Parany can process in parallel an “infinite” stream of elements (too big to fit in memory). Any Parmap functionality can be reimplemented using parany.
- hack-parallel: Parallel processing library using shared memory. Used by Facebook’s Hack.
- lwt-parallel: Lower-level mechanism to create child processes in lwt and have them communicate with the parent via sockets.
- ForkWork: Similar to Parmap above.
- By interfacing with external C code through the FFI, OCaml can pass off long-running computations to C threads running at the same time as OCaml code. This is made easier nowadays due to CTypes (see ffi)
- Nproc: A process pool implementation for OCaml using lwt. Rather than creating or forking processes as needed, preallocates them and sends them units of work as required.
- Ocamlnet: An enhanced system platform library. It contains the netmulticore library to compute tasks on as many cores of the machine as needed. This is the most powerful implementation of parallelism currently available for OCaml, as it is capable of creating a shared memory region, and running a custom-made garbage collector on said region.
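The process-level approach used by several of these libraries can be sketched with nothing but the standard `Unix` and `Marshal` modules. The `fork_map` helper below is hypothetical and deliberately naive: it forks one short-lived child per element and sends each result back over a pipe, whereas Parmap, for instance, uses memory mapping and a bounded number of workers. It assumes a Unix-like system, linking against the `unix` library, and OCaml 4.12 or later for `Unix._exit`. Error handling (for example, an exception raised by `f` in a child) is omitted.

```ocaml
(* Build with the unix library, e.g. via ocamlfind: -package unix -linkpkg *)

(* Hypothetical helper: apply [f] to each element in its own child process. *)
let fork_map (f : 'a -> 'b) (xs : 'a list) : 'b list =
  (* Start one child per element; each child writes its result to a pipe. *)
  let children =
    List.map
      (fun x ->
        let rd, wr = Unix.pipe () in
        match Unix.fork () with
        | 0 ->
          (* Child: compute, marshal the result to the parent, then exit
             immediately without running the parent's at_exit handlers. *)
          Unix.close rd;
          let oc = Unix.out_channel_of_descr wr in
          Marshal.to_channel oc (f x) [];
          close_out oc;
          Unix._exit 0
        | pid ->
          (* Parent: keep only the read end of the pipe. *)
          Unix.close wr;
          (pid, Unix.in_channel_of_descr rd))
      xs
  in
  (* Collect the results in input order, then reap each child. *)
  List.map
    (fun (pid, ic) ->
      let (y : 'b) = Marshal.from_channel ic in
      close_in ic;
      ignore (Unix.waitpid [] pid);
      y)
    children

let squares = fork_map (fun x -> x * x) [ 1; 2; 3 ]
(* squares = [1; 4; 9] *)
```

Because results travel through `Marshal`, they must be plain data: values containing closures cannot be sent back this way with the default flags.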
Multicore OCaml 5.0
The most promising and powerful way to use multiple cores is the new multicore branch, which has recently been incorporated into OCaml 5.0. OCaml 5.0 will use a parallel garbage collector, which means that it will eventually be able to run on multiple cores in the same process. Note that this branch is not yet ready for real work, but it’s rapidly advancing. For more information, consult the Multicore Wiki.
- Parallel Programming in Multicore OCaml: great article on using the Multicore OCaml branch.
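As a rough illustration of what in-process parallelism looks like, the sketch below splits a sum across several domains using `Domain.spawn` and `Domain.join` from the OCaml 5 standard library. It assumes an OCaml 5 compiler. The `sum_range` and `parallel_sum` helpers are hypothetical names chosen for this example.

```ocaml
(* Requires OCaml 5: Domain is part of the standard library there. *)

(* Sum the integers in [lo, hi) sequentially. *)
let sum_range lo hi =
  let acc = ref 0 in
  for i = lo to hi - 1 do
    acc := !acc + i
  done;
  !acc

(* Hypothetical helper: split [0, n) into [chunks] pieces and sum each
   piece in its own domain, then combine the partial sums. *)
let parallel_sum ~chunks n =
  let step = (n + chunks - 1) / chunks in
  let domains =
    List.init chunks (fun c ->
      let lo = c * step in
      let hi = min n (lo + step) in
      Domain.spawn (fun () -> sum_range lo hi))
  in
  List.fold_left (fun total d -> total + Domain.join d) 0 domains

let total = parallel_sum ~chunks:4 1000
(* total = 499500, the sum of 0 through 999 *)
```

Each domain works on its own range and shares no mutable state with the others, so no locking is needed. Spawning one domain per chunk only makes sense for a small number of chunks, roughly the number of available cores.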
Distributed Computing
Distributed computing is similar to process-based parallelism, except that the child processes may or may not be on remote machines. Therefore, distributed computing libraries generally support parallelism on a single machine as well.
- Rpc.Parallel: a library for spawning processes on a cluster of machines, and passing typed messages between them.
- Functory: a distributed computing library which facilitates distributed execution of parallelizable computations in a seamless fashion.
- MPI: Message Passing Interface bindings for OCaml.
- ocaml-rpc: light library to deal with RPCs in OCaml.
- distributed: Library for distributed computation in OCaml. Similar to Erlang’s model and inspired by Cloud Haskell.
- reactor (alpha): Actor model for OCaml, similar to Erlang and Elixir.