Project author: Lyncs-API

Project description:
Utils for interfacing to MPI libraries using mpi4py and dask
Language: Python
Repository: git://github.com/Lyncs-API/lyncs.mpi.git
Created: 2020-08-25T09:38:12Z
Project community: https://github.com/Lyncs-API/lyncs.mpi

License: BSD 3-Clause "New" or "Revised" License

Utils for interfacing to MPI libraries using mpi4py and dask

This package provides tools for interfacing to MPI libraries based on mpi4py and dask:

  • mpi4py is a complete Python API of the MPI standard.

  • dask is a flexible library for parallel computing in Python.
    In particular, we build on dask.distributed, dask-mpi and Dask arrays, as described in the following.

Installation

NOTE: lyncs_mpi requires a working MPI installation.
This can be installed via apt-get:

  sudo apt-get install libopenmpi-dev openmpi-bin

OR using conda:

  conda install -c anaconda mpi4py

The package can be installed via pip:

  pip install [--user] lyncs_mpi
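
A quick way to verify the installation is to import the package (a minimal sketch; the version attribute is only assumed to be present, hence the fallback):

  # Minimal post-install check: the import succeeds only if lyncs_mpi is installed.
  import lyncs_mpi

  # __version__ is an assumption here; fall back gracefully if it is absent.
  print(getattr(lyncs_mpi, "__version__", "unknown"))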

Documentation

In this package we implement several low-level tools for supporting classes distributed over MPI.
These are described in the guide for developers. In the following we describe the high-level tools
provided by this package.

Client

The Client is a wrapper of dask.distributed.Client made MPI-compatible following the instructions
of the dask-mpi documentation.

  from lyncs_mpi import Client
  client = Client(num_workers=4)

If the above script is run in an interactive shell, the Client will start an MPI server in the background
running over num_workers+1 processes. The workers are the processes that actually carry out the calculation.
The extra process (+1) is the scheduler, which manages the task scheduling.

The client, i.e. the interactive shell in this example, proceeds with processing the script, submitting
tasks to the scheduler, which runs them on the workers (see the sketch after the mpirun example below).

The same script can be run directly via mpirun. In this case one needs to execute

  mpirun -n $((num_workers + 2)) python script.py

which runs on num_workers+2 processes (as above, +1 for the scheduler and +1 for the client that processes the script).
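
For instance, a self-contained script.py could look like the following sketch. The computation is purely illustrative, and it is assumed that the wrapper exposes the usual dask.distributed.Client interface (e.g. submit):

  # script.py: works both interactively and when launched via mpirun as above
  from lyncs_mpi import Client

  num_workers = 4
  client = Client(num_workers=num_workers)

  # Submit one trivial task per worker and collect the results
  futures = [client.submit(pow, i, 2) for i in range(num_workers)]
  print([future.result() for future in futures])  # -> [0, 1, 4, 9]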

Communicators

Another feature that makes lyncs_mpi.Client MPI-compatible is its support for MPI communicators.

  comm = client.comm
  comm1 = client.create_comm(num_workers=2)
  comm2 = client.create_comm(exclude=comm1.workers)

In the example, comm = client.comm is the main MPI communicator involving all the workers.
The other two communicators, comm1 and comm2, span 2 workers each: the workers of comm1 have been
optimally chosen by the client, while comm2 gets the remaining two, since the workers of comm1 are excluded.
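
The resulting assignment can be inspected via the workers attribute used above (a minimal sketch, assuming the main communicator also exposes workers):

  # comm1 and comm2 should span disjoint pairs out of the 4 available workers
  assert len(comm1.workers) == 2 and len(comm2.workers) == 2
  assert set(comm1.workers).isdisjoint(comm2.workers)
  assert set(comm1.workers) | set(comm2.workers) == set(comm.workers)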

Another kind of communicator is the Cartesian MPI communicator. Cartesian communicators can be initialized by doing

  cart = comm.create_cart([2,2])

where [2,2] are the dimensions of the multi-dimensional grid over which the processes are distributed.

Cartesian communicators directly support Dask arrays:
for example, cart.zeros([4,4,3,2,1]) instantiates a distributed Dask array assigned to the workers
of the communicator, with local shape (chunks) (2,2,3,2,1).
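
The chunking can be checked with the usual Dask array interface (a minimal sketch, assuming cart.zeros returns a standard Dask array backed by the four workers of cart):

  field = cart.zeros([4, 4, 3, 2, 1])
  print(field.shape)        # (4, 4, 3, 2, 1): the global shape
  print(field.chunks[:2])   # ((2, 2), (2, 2)): the 2x2 grid splits the first two axes
  print((field + 1).sum().compute())  # standard Dask operations apply -> 96.0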