DOUG 0.2

Distributed operations

Note:
Most of this description refers to the case of assembled matrix input; the case of elemental input may differ.

The initial process regions are non-overlapping and node locations are stored in Mesh_class::Mesh::eptnmap (element partition map).

The data needed in both cases, as well as the distribution itself, is handled by the SpMtx_arrangement::SpMtx_DistributeAssembled subroutine. The preliminary distribution of the data is done in SpMtx_arrangement::SpMtx_distributeWithOverlap and the exact data to be exchanged between processes is calculated in SpMtx_arrangement::SpMtx_build_ghost. Two types of operations require this communication data: parallel matrix-vector multiplication and the exchange of overlap values during the application of the first-level preconditioner.
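
As an illustration of the preliminary distribution step, the sketch below forwards every nonzero of an assembled matrix (in triple format) to the process that owns its row according to a node partition map. It is only a sketch under simplified assumptions: the program, the toy partition and all names (part, indi, sbuf, ...) are hypothetical and do not reproduce the actual SpMtx_DistributeAssembled / SpMtx_distributeWithOverlap interfaces.

    program distribute_sketch
      use mpi
      implicit none
      integer, parameter :: n = 12                ! global number of nodes
      integer :: part(n)                          ! owner rank of each node
      integer :: rank, nproc, ierr, i, k, p, nnz, nrecv
      integer, allocatable :: indi(:)
      integer, allocatable :: scnt(:), rcnt(:), sdsp(:), rdsp(:), pos(:)
      integer, allocatable :: sbuf(:), rbuf(:)

      call MPI_Init(ierr)
      call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
      call MPI_Comm_size(MPI_COMM_WORLD, nproc, ierr)

      part = [( mod(i-1, nproc), i = 1, n )]      ! toy cyclic partition of the nodes

      nnz = n                                     ! toy slice: every process holds
      allocate(indi(nnz))                         ! one nonzero per row (a diagonal)
      indi = [( i, i = 1, n )]

      allocate(scnt(0:nproc-1), rcnt(0:nproc-1), sdsp(0:nproc-1), rdsp(0:nproc-1))
      allocate(pos(0:nproc-1), sbuf(nnz))

      scnt = 0                                    ! count nonzeros per destination
      do k = 1, nnz
         scnt(part(indi(k))) = scnt(part(indi(k))) + 1
      end do
      sdsp(0) = 0
      do p = 1, nproc-1
         sdsp(p) = sdsp(p-1) + scnt(p-1)
      end do

      pos = sdsp                                  ! pack row indices grouped by owner
      do k = 1, nnz
         p = part(indi(k))
         pos(p) = pos(p) + 1
         sbuf(pos(p)) = indi(k)
      end do

      ! Exchange the counts, then the row indices themselves (column indices
      ! and values would travel the same way).
      call MPI_Alltoall(scnt, 1, MPI_INTEGER, rcnt, 1, MPI_INTEGER, MPI_COMM_WORLD, ierr)
      rdsp(0) = 0
      do p = 1, nproc-1
         rdsp(p) = rdsp(p-1) + rcnt(p-1)
      end do
      nrecv = sum(rcnt)
      allocate(rbuf(nrecv))
      call MPI_Alltoallv(sbuf, scnt, sdsp, MPI_INTEGER, &
                         rbuf, rcnt, rdsp, MPI_INTEGER, MPI_COMM_WORLD, ierr)

      print '(a,i0,a,i0,a)', 'rank ', rank, ' now owns ', nrecv, ' nonzeros'
      call MPI_Finalize(ierr)
    end program distribute_sketch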

There are two cases that differ in their approach, non-overlapping and overlapping regions, so even parallel matrix-vector multiplication is performed by different subroutines in the two cases.

Non-overlapping case

With no overlap each mesh node belongs to exactly one process region. The so-called interface (the ghost values) of region 1 is shown in the picture below.

ghosts.png
Non-overlapping regions with ghost nodes outside the region
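
The ghost set of a region can be read off the local matrix rows: every column that a locally owned row references but that belongs to another process is a ghost node. The small sketch below illustrates this; the arrays and the partition are made up and it is not the actual SpMtx_build_ghost code.

    program find_ghosts
      implicit none
      integer, parameter :: n = 6, nnz = 8
      integer :: part(n)                 ! owner rank of every node
      integer :: indi(nnz), indj(nnz)    ! sparsity pattern of the locally held rows
      logical :: is_ghost(n)
      integer :: k, myrank

      myrank = 0                         ! pretend we are process 0
      part = [0, 0, 0, 1, 1, 1]          ! nodes 1..3 on rank 0, nodes 4..6 on rank 1
      indi = [1, 1, 2, 2, 3, 3, 3, 2]
      indj = [1, 2, 2, 3, 3, 4, 5, 4]    ! rows of rank 0 also reference nodes 4 and 5

      is_ghost = .false.
      do k = 1, nnz
         if (part(indi(k)) == myrank .and. part(indj(k)) /= myrank) then
            is_ghost(indj(k)) = .true.   ! referenced by an owned row, owned elsewhere
         end if
      end do

      print *, 'ghost nodes of rank 0:', pack([(k, k = 1, n)], is_ghost)
    end program find_ghosts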

Overlapping case

The overlap is constructed by adding nodes connected in the mesh graph one layer at a time. An overlap of 2 (ol=2) means that 4 layers on the boundary are shared by both processes. However, the ghost values now lie in the last layer inside the region, not outside it, so with ol=1 the ghost values are the same as in the non-overlapping case. This trick reduces the amount of computation in the overlapping case.

ghosts_ol.png
Overlapping regions with ghost nodes in the last layer of the region
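
A minimal sketch of growing the overlap layer by layer is given below. It works on the adjacency defined by the matrix nonzeros and uses made-up data; it is not the DOUG implementation.

    program grow_overlap
      implicit none
      integer, parameter :: n = 8, nnz = 14, ol = 2
      integer :: indi(nnz), indj(nnz)    ! symmetric sparsity pattern (no diagonal)
      logical :: in_region(n), grown(n)
      integer :: layer, k, i

      ! The mesh graph of a 1D chain 1-2-3-...-8.
      indi = [1,2, 2,3, 3,4, 4,5, 5,6, 6,7, 7,8]
      indj = [2,1, 3,2, 4,3, 5,4, 6,5, 7,6, 8,7]

      in_region = .false.
      in_region(1:4) = .true.            ! nodes owned by this process

      do layer = 1, ol                   ! each sweep adds one layer of neighbours
         grown = in_region
         do k = 1, nnz
            if (in_region(indi(k))) grown(indj(k)) = .true.
         end do
         in_region = grown
      end do

      print *, 'region with overlap', ol, ':', pack([(i, i = 1, n)], in_region)
    end program grow_overlap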

The general assumption is that the region values are up-to-date before and after each operation.

Optimizations to matrix-vector multiplication

In the non-overlapping case the ghost values are sent before any calculations. In the overlapping case, the local values that are interface values for some neighbour are computed first and sent out; only then is the remaining part of the region computed. This requires distinguishing the matrix values that are "ghost to local" in the non-overlapping case from the matrix values that are "local to neighbour ghost" in the overlapping case. The latter are shown in the following figure.

ghostsneighs_ol.png
Neighbour ghost nodes of process 1 on process 2

These are the values that process 2 needs to send to process 1 during matrix-vector multiplication. The analogous values on process 1 for process 2 are symmetrical with respect to the boundary.
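
The following sketch shows the resulting ordering in the overlapping case: the interface entries needed by the neighbour are computed and sent first with non-blocking calls, the interior of the region is computed while the messages are in flight, and the ghost values are received at the end. The arithmetic merely stands in for the actual matrix-vector product, the names are hypothetical, and the program assumes exactly two MPI processes.

    program mv_ordering
      use mpi
      implicit none
      integer, parameter :: nloc = 6, nifc = 2     ! region size and interface size
      real(8) :: y(nloc), ghost(nifc)
      integer :: rank, other, ierr, i, req(2), stats(MPI_STATUS_SIZE, 2)

      call MPI_Init(ierr)
      call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
      other = 1 - rank                             ! the single neighbour

      ! 1. compute the entries the neighbour needs as ghosts (here the first nifc)
      do i = 1, nifc
         y(i) = real(rank + i, 8)                  ! stands for the interface rows of A*x
      end do

      ! 2. start sending them and start receiving the neighbour's interface values
      call MPI_Isend(y, nifc, MPI_DOUBLE_PRECISION, other, 0, &
                     MPI_COMM_WORLD, req(1), ierr)
      call MPI_Irecv(ghost, nifc, MPI_DOUBLE_PRECISION, other, 0, &
                     MPI_COMM_WORLD, req(2), ierr)

      ! 3. compute the rest of the region while the messages are in flight
      do i = nifc + 1, nloc
         y(i) = real(rank + i, 8)                  ! stands for the interior rows of A*x
      end do

      ! 4. wait for the exchange and only then use the received ghost values
      call MPI_Waitall(2, req, stats, ierr)
      print *, 'rank', rank, 'received ghost values', ghost

      call MPI_Finalize(ierr)
    end program mv_ordering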

Note:
Neighbour ghost values are actually local values; they are simply ghost values for some neighbouring process. For simplicity, references to 'local values' in the following section exclude them.

Data structures

Because the two cases are quite different, and because of the optimizations described above, the data structures are somewhat complicated. The matrix values are divided into 6 categories:

  1. neighbour ghost to neighbour ghost (1,1)
  2. local to neighbour ghost (1,2)
  3. neighbour ghost to local (2,1)
  4. local to local (2,2)
  5. ghost to local (from outside the region in non-overlapping case)
  6. A_ghost (ghost to ghost inside the region in overlapping case)
matrix_ol.png
Matrix values for overlapping case

The first 4 are marked by the members mtx_bbs and mtx_bbe of the SpMtx_class::SpMtx class. The fifth consists of everything between mtx_bbe(2,2) and nnz. The last one is stored in a separate matrix object, A_ghost.

Note:
The fifth set is empty in the overlapping case and the sixth is empty in the non-overlapping case.
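
The sketch below shows how such block bounds can be used to walk the categories during a multiplication. Local arrays stand in for the SpMtx members mtx_bbs/mtx_bbe and for the triple-format storage; the tiny matrix and the bounds are made up.

    program blockwise_mv
      implicit none
      integer, parameter :: n = 4, nnz = 6
      integer :: bbs(2,2), bbe(2,2)      ! begin/end of the four leading blocks
      integer :: indi(nnz), indj(nnz)
      real(8) :: val(nnz), x(n), y(n)
      integer :: i, j, k

      ! Nonzeros are stored block after block: (1,1), (1,2), (2,1), (2,2),
      ! and everything after bbe(2,2) (up to nnz) is "ghost to local".
      bbs(1,1) = 1; bbe(1,1) = 1
      bbs(1,2) = 2; bbe(1,2) = 2
      bbs(2,1) = 3; bbe(2,1) = 3
      bbs(2,2) = 4; bbe(2,2) = 5
      indi = [1, 1, 3, 3, 4, 4]
      indj = [1, 3, 1, 3, 4, 2]
      val  = [2d0, 1d0, 1d0, 2d0, 2d0, 1d0]
      x = [1d0, 1d0, 1d0, 1d0]

      ! The four leading blocks: the first index selects where the result goes
      ! (1 = neighbour ghost rows, 2 = local rows), the second where the
      ! x values come from.
      y = 0d0
      do i = 1, 2
         do j = 1, 2
            do k = bbs(i,j), bbe(i,j)
               y(indi(k)) = y(indi(k)) + val(k)*x(indj(k))
            end do
         end do
      end do
      ! Category 5, "ghost to local", is the tail of the arrays (empty when
      ! overlap is used); category 6 lives in the separate A_ghost object.
      do k = bbe(2,2)+1, nnz
         y(indi(k)) = y(indi(k)) + val(k)*x(indj(k))
      end do

      print *, 'y =', y
    end program blockwise_mv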

The vector indices that need to be sent and received during matrix-vector multiplication are stored in the ax_sendidx and ax_recvidx members of the Mesh_class::Mesh class. The vector indices that need to be exchanged (sent and received) during the application of the first-level preconditioner are stored in the ol_solve array.
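
A sketch of an exchange driven by such index lists is shown below: the values at the send indices are packed into a buffer, shipped to the neighbour, and scattered into the ghost slots given by the receive indices. The flat arrays sendidx and recvidx only play the role of ax_sendidx/ax_recvidx and are not the actual Mesh members; the program assumes exactly two MPI processes.

    program index_exchange
      use mpi
      implicit none
      integer, parameter :: n = 5, nexch = 2
      integer :: sendidx(nexch), recvidx(nexch)
      real(8) :: x(n), sbuf(nexch), rbuf(nexch)
      integer :: rank, other, ierr, k, req(2), stats(MPI_STATUS_SIZE, 2)

      call MPI_Init(ierr)
      call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
      other = 1 - rank

      x = real(rank, 8)              ! region values (ghost slots still stale)
      sendidx = [1, 2]               ! local values the neighbour needs
      recvidx = [4, 5]               ! ghost slots filled by the neighbour

      do k = 1, nexch                ! gather the outgoing values
         sbuf(k) = x(sendidx(k))
      end do
      call MPI_Isend(sbuf, nexch, MPI_DOUBLE_PRECISION, other, 0, &
                     MPI_COMM_WORLD, req(1), ierr)
      call MPI_Irecv(rbuf, nexch, MPI_DOUBLE_PRECISION, other, 0, &
                     MPI_COMM_WORLD, req(2), ierr)
      call MPI_Waitall(2, req, stats, ierr)
      do k = 1, nexch                ! scatter into the ghost positions
         x(recvidx(k)) = rbuf(k)
      end do

      print *, 'rank', rank, 'x =', x
      call MPI_Finalize(ierr)
    end program index_exchange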