From c88c280a665083b94d907ca556a6f7f64e70bbe9 Mon Sep 17 00:00:00 2001 From: Debolskiy Andrey <and.debol@gmail.com> Date: Thu, 26 Jan 2017 17:26:41 +0300 Subject: [PATCH] Fixed gitignore man in src is now being tracked --- .gitignore | 2 +- ParLib.src/man/P_BExchange.3 | 126 ++++++++++++++++ ParLib.src/man/P_BExchange_init.3 | 144 ++++++++++++++++++ ParLib.src/man/P_Transpose.3 | 219 +++++++++++++++++++++++++++ ParLib.src/man/P_Transpose_init.3 | 237 ++++++++++++++++++++++++++++++ 5 files changed, 727 insertions(+), 1 deletion(-) create mode 100644 ParLib.src/man/P_BExchange.3 create mode 100644 ParLib.src/man/P_BExchange_init.3 create mode 100644 ParLib.src/man/P_Transpose.3 create mode 100644 ParLib.src/man/P_Transpose_init.3 diff --git a/.gitignore b/.gitignore index 505ad92..69f2b7b 100644 --- a/.gitignore +++ b/.gitignore @@ -5,5 +5,5 @@ #linked libraries *.a #copied manual and headers -man/ +./man/ include/ diff --git a/ParLib.src/man/P_BExchange.3 b/ParLib.src/man/P_BExchange.3 new file mode 100644 index 0000000..36ead3f --- /dev/null +++ b/ParLib.src/man/P_BExchange.3 @@ -0,0 +1,126 @@ +.TH P_BExchange 3 "10/19/2001" " " "ParLib 1.1" +.SH NAME +P_BExchange \- Performs blocking boundary exchange +.SH SYNOPSIS +.nf +\fB#include "parlib.h"\fR + +\fBint P_BExchange(\fR void *arr, int ndims, int *stride, int *blklen, + int bdim, int overlap[2], MPI_Datatype datatype, + MPI_Comm comm, int period \fB)\fR +.fi +.SH INPUT PARAMETERS +.PD 0 +.TP +.B arr +- address of the local part of communicated array (choice) +.PD 1 +.PD 0 +.TP +.B ndims +- number of dimensions of the array (positive integer) +.PD 1 +.PD 0 +.TP +.B stride +- dimensions of the array (array of +.I ndims +positive integers) +.PD 1 +.PD 0 +.TP +.B blklen +- sizes of communicated blocks (array of \fIndims\fR positive integers) +.PD 1 +.PD 0 +.TP +.B bdim +- decomposed dimension of the array \fIarr\fR (integer from the range [1,\fIndims\fR]) +.PD 1 +.PD 0 +.TP +.B overlap +- thickness of boundaries 
(array of 2 nonnegative integers) +.PD 1 +.PD 0 +.TP +.B datatype +- datatype of each array element (MPI handle) +.PD 1 +.PD 0 +.TP +.B comm +- communicator (MPI handle) +.PD 1 +.PD 0 +.TP +.B period +- logical variable specifying whether the boundary exchange is periodic (true) or not (false) + +.SH DESCRIPTION +The routine exchanges interprocessor boundaries of array \fIarr\fR, whose dimension \fIbdim\fR is distributed among \fInproc\fR processors of the group defined by communicator \fIcomm\fR. + +We use MATLAB notation for the indices of array \fIarr\fR, specifying only the index corresponding to dimension \fIbdim\fR and implicitly taking the others to be \fI1:blklen(idim)\fR, where \fIidim\fR is the dimension number. + +The exchange is performed as follows. +If the processor \fIiproc\fR is not the first processor of the group, it sends \fIarr(1:overlap(1))\fR to processor \fIiproc-1\fR. +Similarly, if it is not the last one, it sends \fIarr(blklen(bdim)-overlap(2)+1:blklen(bdim))\fR to processor \fIiproc+1\fR. + +If \fIperiod\fR is true, then the first processor also sends \fIarr(1:overlap(1))\fR to the last one and the last sends \fIarr(blklen(bdim)-overlap(2)+1:blklen(bdim))\fR to the first processor. + +.SH NOTES FOR FORTRAN +All INM ParLib routines in Fortran have an additional argument \fIierr\fR +at the end of the argument list. +\fIierr\fR is an integer and has the same meaning as the return value of the routine in C. +In Fortran, INM ParLib routines are subroutines, and are invoked with the \fIcall\fR statement. + +All INM ParLib objects (e.g., \fIBExchange\fR, \fITransposition\fR) are \fIINTEGER\fR arrays of size \fIHANDLE_SIZE\fR in Fortran. +The parameter \fIHANDLE_SIZE\fR is defined in 'parlibf.h', which should be included in Fortran code beforehand. 
+ +.SH CAUTION +Pointer \fIarr\fR should contain the address of the local part of the array that does not include the adjacent halo regions reserved for the boundaries to be received from the neighbors. + +.SH ERRORS +The routine returns the following error codes: +.RS 3 +.TP +.PD 0 +.TP +.B 0 +- No error; the routine completed successfully. +.PD 1 +.PD 0 +.TP +.B 1 +- Nonpositive number of dimensions. +.PD 1 +.PD 0 +.TP +.B 2 +- Dimension +.I bdim +is out of range [1,\fIndims\fR]. +.PD 1 +.PD 0 +.TP +.B 3 +- Negative boundary thickness. +.PD 1 +.PD 0 +.TP +.B 4 +- Nonpositive array dimension. +.PD 1 +.PD 0 +.TP +.B 5 +- Boundary thickness exceeds the block size for dimension +.I +bdim + +.SH SEE ALSO +\fBP_BExchange_init\fR(3), \fBP_BExchange_start\fR(3), \fBP_BExchange_end\fR(3), \fBP_BExchange_free\fR(3) + +.SH AUTHOR +Val N. Gloukhov, gluhoff@inm.ras.ru + diff --git a/ParLib.src/man/P_BExchange_init.3 b/ParLib.src/man/P_BExchange_init.3 new file mode 100644 index 0000000..d1e6a1c --- /dev/null +++ b/ParLib.src/man/P_BExchange_init.3 @@ -0,0 +1,144 @@ +.TH P_BExchange 3 "10/19/2001" " " "ParLib 1.1" +.SH NAME +P_BExchange_init, P_BExchange_start, P_BExchange_end, P_BExchange_free \- Perform nonblocking boundary exchange +.SH SYNOPSIS +.nf +\fB#include "parlib.h"\fR + +\fBint P_BExchange_init(\fR int ndims, int *stride, int *blklen, int bdim, + int overlap[2], MPI_Datatype datatype, MPI_Comm comm, + int period, BExchange *bexchange \fB);\fR +\fBint P_BExchange_start(\fR void *arr, BExchange *bexchange \fB);\fR +\fBint P_BExchange_end(\fR BExchange *bexchange \fB);\fR +\fBint P_BExchange_free(\fR BExchange *bexchange \fB);\fR +.fi +.SH ARGUMENTS +.PD 0 +.TP +.B arr +- address of the local part of communicated array (choice) +.PD 1 +.PD 0 +.TP +.B ndims +- number of dimensions of the array (positive integer) +.PD 1 +.PD 0 +.TP +.B stride +- dimensions of the array (array of \fIndims\fR positive integers) +.PD 1 +.PD 0 +.TP +.B blklen +- sizes of 
communicated blocks (array of \fIndims\fR positive integers) +.PD 1 +.PD 0 +.TP +.B bdim +- decomposed dimension of the array \fIarr\fR (integer from the range [1,\fIndims\fR]) +.PD 1 +.PD 0 +.TP +.B overlap +- thickness of boundaries (array of 2 nonnegative integers) +.PD 1 +.PD 0 +.TP +.B datatype +- datatype of each array element (MPI handle) +.PD 1 +.PD 0 +.TP +.B comm +- communicator (MPI handle) +.PD 1 +.PD 0 +.TP +.B period +- logical variable specifying whether the boundary exchange is periodic (true) or not (false) +.PD 1 +.PD 0 +.TP +.B bexchange +- pointer to a structure \fIBExchange\fR + +.SH DESCRIPTION +This set of routines exchanges, in nonblocking mode, interprocessor boundaries of array \fIarr\fR, whose dimension \fIbdim\fR is distributed among \fInproc\fR processors of the group defined by communicator \fIcomm\fR. + +\fBP_BExchange_init ()\fR builds data types required for the exchange + +\fBP_BExchange_start ()\fR starts the exchange + +\fBP_BExchange_end ()\fR waits for the completion of the exchange + +\fBP_BExchange_free ()\fR deallocates memory allocated by the routine \fBP_BExchange_init ()\fR + +We use MATLAB notation for the indices of array \fIarr\fR, specifying only the index corresponding to dimension \fIbdim\fR and implicitly taking the others to be \fI1:blklen(idim)\fR, where \fIidim\fR is the dimension number. + +The exchange is performed as follows. +If the processor \fIiproc\fR is not the first processor of the group, it sends \fIarr(1:overlap(1))\fR to processor \fIiproc-1\fR. +Similarly, if it is not the last one, it sends \fIarr(blklen(bdim)-overlap(2)+1:blklen(bdim))\fR to processor \fIiproc+1\fR. + +If \fIperiod\fR is true, then the first processor also sends \fIarr(1:overlap(1))\fR to the last one and the last sends \fIarr(blklen(bdim)-overlap(2)+1:blklen(bdim))\fR to the first processor. + +.SH NOTES FOR FORTRAN +All INM ParLib routines in Fortran have an additional argument +.I ierr +at the end of the argument list. 
+.I ierr +is an integer and has the same meaning as the return value of the routine +in C. In Fortran, INM ParLib routines are subroutines, and are invoked +with the +.I call +statement. + +All INM ParLib objects (e.g., \fIBExchange\fR, \fITransposition\fR) are \fIINTEGER\fR arrays of size \fIHANDLE_SIZE\fR in Fortran. +The parameter \fIHANDLE_SIZE\fR is defined in 'parlibf.h', which should be included in Fortran code beforehand. + +.SH CAUTION +Pointer \fIarr\fR should contain the address of the local part of the array that does not include the adjacent halo regions reserved for the boundaries to be received from the neighbors. + +.SH ERRORS +The routine returns the following error codes: +.RS 3 +.TP +.PD 0 +.TP +.B 0 +- No error; the routine completed successfully. +.PD 1 +.PD 0 +.TP +.B 1 +- Nonpositive number of dimensions. +.PD 1 +.PD 0 +.TP +.B 2 +- Dimension +.I bdim +is out of range [1,\fIndims\fR]. +.PD 1 +.PD 0 +.TP +.B 3 +- Negative boundary thickness. +.PD 1 +.PD 0 +.TP +.B 4 +- Nonpositive array dimension. +.PD 1 +.PD 0 +.TP +.B 5 +- Boundary thickness exceeds the block size for dimension +.I +bdim + +.SH SEE ALSO +\fBP_BExchange\fR(3) + +.SH AUTHOR +Val N. 
Gloukhov, gluhoff@inm.ras.ru diff --git a/ParLib.src/man/P_Transpose.3 b/ParLib.src/man/P_Transpose.3 new file mode 100644 index 0000000..9443fef --- /dev/null +++ b/ParLib.src/man/P_Transpose.3 @@ -0,0 +1,219 @@ +.TH P_Transpose 3 "10/22/2001" " " "ParLib 1.1" +.SH NAME +P_Transpose \- Performs blocking transposition +.SH SYNOPSIS +.nf +\fB#include "parlib.h"\fR + +\fBint P_Transpose(\fR int ndims, void *arr_source, int dim_source, + int *lblks_source, void *arr_dest, int dim_dest, int *lblks_dest, + int *stride, int *blklen, int *overlap, MPI_Datatype datatype, + MPI_Comm comm, int period \fB);\fR +.fi + +.SH INPUT PARAMETERS +.PD 0 +.TP +.B ndims +- number of dimensions of the arrays (positive integer) +.PD 1 +.PD 0 +.TP +.B arr_source +- address of the local part of the source array (choice) +.PD 1 +.PD 0 +.TP +.B dim_source +- decomposed dimension of array \fIarr_source\fR +.PD 1 +.PD 0 +.TP +.B lblks_source +- distribution of array \fIarr_source\fR +.PD 1 +.PD 0 +.TP +.B dim_dest +- decomposed dimension of array \fIarr_dest\fR +.PD 1 +.PD 0 +.TP +.B lblks_dest +- distribution of array \fIarr_dest\fR +.PD 1 +.PD 0 +.TP +.B stride, blklen +- parameters specifying dimensions of the arrays \fIarr_source\fR, \fIarr_dest\fR +and location of the communicated data within the arrays +.PD 1 +.PD 0 +.TP +.B overlap +- overlapping of the output array \fIarr_dest\fR +.PD 1 +.PD 0 +.TP +.B datatype +- data type of the arrays +.PD 1 +.PD 0 +.TP +.B comm +- communicator specifying the processor group +.PD 1 +.PD 0 +.TP +.B period +- true, if the overlapping is periodic, false otherwise (logical) + +.SH OUTPUT PARAMETER +.PD 0 +.TP +.B arr_dest +- address of the local part of the destination array (choice) + +.SH DESCRIPTION +.PD 0 +The routine performs a transposition of array \fIarr_source\fR, storing the result in array \fIarr_dest\fR. + +Both arrays should have the same number of dimensions \fIndims\fR. 
+Array \fIarr_source\fR must be distributed in dimension \fIdim_source\fR, and array \fIarr_dest\fR in dimension \fIdim_dest\fR. + +The dimensions and the location of the communicated data within the arrays are specified by parameters \fIstride\fR, \fIblklen\fR, \fIlblks_source\fR and \fIlblks_dest\fR. +All indices in this manual start from 1. + +The arrays have the following dimensions: + +.nf + | regular dim | \fIdim_source\fR | \fIdim_dest\fR +-----------|-------------|--------------------|----------------- +\fIarr_source\fR | \fIstride\fR(dim) | \fIblklen\fR(\fIdim_source\fR) | \fIstride\fR(\fIdim_dest\fR) +\fIarr_dest\fR | \fIstride\fR(dim) | \fIstride\fR(\fIdim_source\fR) | \fIblklen\fR(\fIdim_dest\fR) +.fi + +The following blocks of the arrays are communicated: + +.nf + | regular dim | \fIdim_source\fR | \fIdim_dest\fR +-----------|-------------|---------------------|------------------- +\fIarr_source\fR |1:\fIblklen\fR(dim)|1:\fIlblks_source\fR(\fIiproc\fR)|1:\fBSUM\fR(\fIlblks_dest\fR) +\fIarr_dest\fR |1:\fIblklen\fR(dim)|1:\fBSUM\fR(\fIlblks_source\fR) |1:\fIlblks_dest\fR(\fIiproc\fR) +.fi + +where the regular dimension is any dimension but \fIdim_source\fR and \fIdim_dest\fR, \fIiproc\fR is the processor number within the given group; \fBSUM\fR denotes the summation over all processors of the group (e.g., + +.nf + \fBSUM\fR(\fIlblks_source\fR)=SUM(\fIlblks_source\fR(i),i=1,\fInproc\fR)), +.fi + +\fInproc\fR is the number of processors in the group. +These blocks should be augmented by overlapped regions, if overlapping is used. + +In what follows, we assume that the first index corresponds to dimension \fIdim_source\fR and the second to dimension \fIdim_dest\fR. The transposition algorithm can then be outlined as follows. 
+ +Processor \fIiproc\fR sends + +.nf + \fIarr_source\fR(1:\fIlblks_source\fR(\fIiproc\fR), + \fIjb\fR(\fIjproc\fR)-overlap(1):\fIje\fR(\fIjproc\fR)+overlap(2)) +.fi + +to every processor \fIjproc\fR of the group, including itself, and receives + +.nf + \fIarr_dest\fR(\fIib\fR(\fIjproc\fR):\fIie\fR(\fIjproc\fR), + 1-overlap(1):\fIlblks_dest\fR(\fIiproc\fR)+overlap(2)) +.fi + +In the nonperiodic case (\fIperiod\fR is false), \fIoverlap\fR(1) is taken to be 0 if \fIiproc\fR is equal to 1, and \fIoverlap\fR(2) is taken to be 0 if \fIiproc\fR is equal to \fInproc\fR. +Starting and ending indices \fIib\fR, \fIie\fR, \fIjb\fR, \fIje\fR are calculated as follows + +.nf + \fIib\fR(1)=1; \fIib\fR(i+1)=\fIib\fR(i)+\fIlblks_dest\fR(i),i=1,...,\fInproc\fR-1; + \fIie\fR(\fInproc\fR)=\fBSUM\fR(\fIlblks_dest\fR); + \fIie\fR(i-1)=\fIie\fR(i)-\fIlblks_dest\fR(i),i=\fInproc\fR,...,2; +.fi + +.nf + \fIjb\fR(1)=1; \fIjb\fR(j+1)=\fIjb\fR(j)+\fIlblks_source\fR(j),j=1,...,\fInproc\fR-1; + \fIje\fR(\fInproc\fR)=\fBSUM\fR(\fIlblks_source\fR); + \fIje\fR(j-1)=\fIje\fR(j)-\fIlblks_source\fR(j),j=\fInproc\fR,...,2. +.fi + +.SH NOTES +Transposition with overlapping is equivalent to transposition without overlapping followed by an interprocessor boundary exchange applied to the resultant array \fIarr_dest\fR. + +.SH NOTES FOR FORTRAN +All INM ParLib routines in Fortran have an additional argument +.I ierr +at the end of the argument list. +.I ierr +is an integer and has the same meaning as the return value of the routine +in C. In Fortran, INM ParLib routines are subroutines, and are invoked with the +.I call +statement. + +All INM ParLib objects (e.g., \fIBExchange\fR, \fITransposition\fR) are \fIINTEGER\fR arrays of size \fIHANDLE_SIZE\fR in Fortran. +The parameter \fIHANDLE_SIZE\fR is defined in 'parlibf.h', which should be included in Fortran code beforehand. 
+ +.SH ERRORS +The routine returns the following error codes: +.RS 3 +.TP +.PD 0 +.TP 3 +.B 0 +- No error; the routine completed successfully. +.TP +.B 1 +- Number of dimensions \fIndims\fR is less than 2. +.TP +.B 2 +- \fIdim_source\fR is out of range [1,\fIndims\fR]. +.TP +.B 3 +- \fIdim_dest\fR is out of range [1,\fIndims\fR]. +.TP +.B 4 +- \fIdim_source\fR and \fIdim_dest\fR coincide. +.TP +.B 5 +- Nonpositive array dimension. +.TP +.B 6 +- Negative overlapping. +.TP +.B 7 +- \fIblklen\fR exceeds \fIstride\fR. +.TP +.B 8 +- \fIlblks_source\fR(\fIiproc\fR) exceeds \fIblklen\fR(\fIdim_source\fR). +.TP +.B 9 +- \fIlblks_dest\fR(\fIiproc\fR) exceeds \fIblklen\fR(\fIdim_dest\fR). +.TP +.B 10 +- \fBSUM\fR(\fIlblks_source\fR) exceeds \fIstride\fR(\fIdim_source\fR). +.TP +.B 11 +- \fBSUM\fR(\fIlblks_dest\fR) exceeds \fIstride\fR(\fIdim_dest\fR). +.TP +.B 12 +- \fIoverlap\fR(1) exceeds \fIlblks_dest\fR(1). +.TP +.B 13 +- \fIoverlap\fR(2) exceeds \fIlblks_dest\fR(\fInproc\fR). +.TP +.B 14 +- \fIlblks_source\fR is nonpositive. +.TP +.B 15 +- \fIlblks_dest\fR is nonpositive. + +.SH SEE ALSO +\fBP_Transpose_init\fR(3), \fBP_Transpose_start\fR(3), \fBP_Transpose_end\fR(3), \fBP_Transpose_free\fR(3) + +.SH AUTHOR +Val N. 
Gloukhov, gluhoff@inm.ras.ru diff --git a/ParLib.src/man/P_Transpose_init.3 b/ParLib.src/man/P_Transpose_init.3 new file mode 100644 index 0000000..bd582d2 --- /dev/null +++ b/ParLib.src/man/P_Transpose_init.3 @@ -0,0 +1,237 @@ +.TH P_Transpose 3 "10/23/2001" " " "ParLib 1.1" +.SH NAME +P_Transpose_init, P_Transpose_start, P_Transpose_end, P_Transpose_free \- Perform nonblocking transposition +.SH SYNOPSIS +.nf +\fB#include "parlib.h"\fR + +\fBint P_Transpose_init(\fR int ndims, int dim_source, int *lblks_source, + int dim_dest, int *lblks_dest, int *stride, int *blklen, + int *overlap, MPI_Datatype datatype, MPI_Comm comm, int period, + Transposition *transp \fB);\fR +\fBint P_Transpose_start(\fR void *arr_source, void *arr_dest, + Transposition *transp \fB);\fR +\fBint P_Transpose_end(\fR Transposition *transp \fB);\fR +\fBint P_Transpose_free(\fR Transposition *transp \fB);\fR +.fi + +.SH ARGUMENTS +.PD 0 +.TP +.B ndims +- number of dimensions of the arrays (positive integer) +.PD 1 +.PD 0 +.TP +.B arr_source +- address of the local part of the source array (choice) +.PD 1 +.PD 0 +.TP +.B dim_source +- decomposed dimension of array \fIarr_source\fR +.PD 1 +.PD 0 +.TP +.B lblks_source +- distribution of array \fIarr_source\fR +.PD 0 +.TP +.B arr_dest +- address of the local part of the destination array (choice) +.PD 1 +.PD 0 +.TP +.B dim_dest +- decomposed dimension of array \fIarr_dest\fR +.PD 1 +.PD 0 +.TP +.B lblks_dest +- distribution of array \fIarr_dest\fR +.PD 1 +.PD 0 +.TP +.B stride, blklen +- parameters specifying dimensions of the arrays \fIarr_source\fR, \fIarr_dest\fR +and location of the communicated data within the arrays +.PD 1 +.PD 0 +.TP +.B overlap +- overlapping of the output array \fIarr_dest\fR +.PD 1 +.PD 0 +.TP +.B datatype +- data type of the arrays +.PD 1 +.PD 0 +.TP +.B comm +- communicator specifying the processor group +.PD 1 +.PD 0 +.TP +.B period +- true, if the overlapping is periodic, false otherwise (logical) +.PD 1 +.PD 0 +.TP 
+.B transp +- pointer to a \fITransposition\fR structure + +.SH DESCRIPTION +.PD 0 +This set of routines performs a nonblocking transposition of array \fIarr_source\fR, storing the result in array \fIarr_dest\fR. + +.PD 0 +\fBP_Transpose_init ()\fR builds data types required by the transposition. + +Both arrays \fIarr_source\fR and \fIarr_dest\fR should have the same number of dimensions \fIndims\fR. +Array \fIarr_source\fR must be distributed in dimension \fIdim_source\fR, and array \fIarr_dest\fR in dimension \fIdim_dest\fR. + +The dimensions and the location of the communicated data within the arrays are specified by parameters \fIstride\fR, \fIblklen\fR, \fIlblks_source\fR and \fIlblks_dest\fR. +All indices in this manual start from 1. + +The arrays have the following dimensions: + +.nf + | regular dim | \fIdim_source\fR | \fIdim_dest\fR +-----------|-------------|--------------------|----------------- +\fIarr_source\fR | \fIstride\fR(dim) | \fIblklen\fR(\fIdim_source\fR) | \fIstride\fR(\fIdim_dest\fR) +\fIarr_dest\fR | \fIstride\fR(dim) | \fIstride\fR(\fIdim_source\fR) | \fIblklen\fR(\fIdim_dest\fR) +.fi + +The following blocks of the arrays are communicated: + +.nf + | regular dim | \fIdim_source\fR | \fIdim_dest\fR +-----------|-------------|---------------------|------------------- +\fIarr_source\fR |1:\fIblklen\fR(dim)|1:\fIlblks_source\fR(\fIiproc\fR)|1:\fBSUM\fR(\fIlblks_dest\fR) +\fIarr_dest\fR |1:\fIblklen\fR(dim)|1:\fBSUM\fR(\fIlblks_source\fR) |1:\fIlblks_dest\fR(\fIiproc\fR) +.fi + +where the regular dimension is any dimension but \fIdim_source\fR and \fIdim_dest\fR, \fIiproc\fR is the processor number within the given group; \fBSUM\fR denotes the summation over all processors of the group (e.g., + +.nf + \fBSUM\fR(\fIlblks_source\fR)=SUM(\fIlblks_source\fR(i),i=1,\fInproc\fR)), +.fi + +\fInproc\fR is the number of processors in the group. +These blocks should be augmented by overlapped regions, if overlapping is used. 
+ +.PD 1 +.PD 0 +\fBP_Transpose_start ()\fR starts the transposition. + +In what follows, we assume that the first index of the arrays \fIarr_source\fR and \fIarr_dest\fR corresponds to dimension \fIdim_source\fR and the second to dimension \fIdim_dest\fR. The transposition algorithm can then be outlined as follows. + +Processor \fIiproc\fR sends + +.nf + \fIarr_source\fR(1:\fIlblks_source\fR(\fIiproc\fR), + \fIjb\fR(\fIjproc\fR)-overlap(1):\fIje\fR(\fIjproc\fR)+overlap(2)) +.fi + +to every processor \fIjproc\fR of the group (including itself) and receives + +.nf + \fIarr_dest\fR(\fIib\fR(\fIjproc\fR):\fIie\fR(\fIjproc\fR), + 1-overlap(1):\fIlblks_dest\fR(\fIiproc\fR)+overlap(2)) +.fi + +In the nonperiodic case (\fIperiod\fR is false), \fIoverlap\fR(1) is taken to be 0 if \fIiproc\fR is equal to 1, and \fIoverlap\fR(2) is taken to be 0 if \fIiproc\fR is equal to \fInproc\fR. +Starting and ending indices \fIib\fR, \fIie\fR, \fIjb\fR, \fIje\fR are calculated as follows + +.nf + \fIib\fR(1)=1; \fIib\fR(i+1)=\fIib\fR(i)+\fIlblks_dest\fR(i),i=1,...,\fInproc\fR-1; + \fIie\fR(\fInproc\fR)=\fBSUM\fR(\fIlblks_dest\fR); + \fIie\fR(i-1)=\fIie\fR(i)-\fIlblks_dest\fR(i),i=\fInproc\fR,...,2; +.fi + +.nf + \fIjb\fR(1)=1; \fIjb\fR(j+1)=\fIjb\fR(j)+\fIlblks_source\fR(j),j=1,...,\fInproc\fR-1; + \fIje\fR(\fInproc\fR)=\fBSUM\fR(\fIlblks_source\fR); + \fIje\fR(j-1)=\fIje\fR(j)-\fIlblks_source\fR(j),j=\fInproc\fR,...,2. +.fi + +\fBP_Transpose_end ()\fR waits until the transposition is complete. + +\fBP_Transpose_free ()\fR deallocates data types allocated by \fBP_Transpose_init\fR. + +.SH NOTES +Transposition with overlapping is equivalent to transposition without overlapping followed by an interprocessor boundary exchange applied to the resultant array \fIarr_dest\fR. + +.SH NOTES FOR FORTRAN +All INM ParLib routines in Fortran have an additional argument +.I ierr +at the end of the argument list. 
+.I ierr +is an integer and has the same meaning as the return value of the routine +in C. In Fortran, INM ParLib routines are subroutines, and are invoked with the +.I call +statement. + +All INM ParLib objects (e.g., \fIBExchange\fR, \fITransposition\fR) are \fIINTEGER\fR arrays of size \fIHANDLE_SIZE\fR in Fortran. +The parameter \fIHANDLE_SIZE\fR is defined in 'parlibf.h' which is recommended to be included in Fortran code beforehand. + +.SH ERRORS +\fBP_Transpose_init\fR returns the following error codes: +.RS 3 +.TP +.PD 0 +.TP 3 +.B 0 +- No error; the routine completed successfully. +.TP +.B 1 +- Number of dimensions \fIndims\fR is less than 2. +.TP +.B 2 +- \fIdim_source\fR is out of range [1,\fIndims\fR]. +.TP +.B 3 +- \fIdim_dest\fR is out of range [1,\fIndims\fR]. +.TP +.B 4 +- \fIdim_source\fR and \fIdim_dest\fR coincide. +.TP +.B 5 +- Nonpositive array dimension. +.TP +.B 6 +- Negative overlapping. +.TP +.B 7 +- \fIblklen\fR exceeds \fIstride\fR. +.TP +.B 8 +- \fIlblks_source\fR(\fIiproc\fR) exceeds \fIblklen\fR(\fIdim_source\fR). +.TP +.B 9 +- \fIlblks_dest\fR(\fIiproc\fR) exceeds \fIblklen\fR(\fIdim_dest\fR). +.TP +.B 10 +- \fBSUM\fR(\fIlblks_source\fR) exceeds \fIstride\fR(\fIdim_source\fR). +.TP +.B 11 +- \fBSUM\fR(\fIlblks_dest\fR) exceeds \fIstride\fR(\fIdim_dest\fR). +.TP +.B 12 +- \fIoverlap\fR(1) exceeds \fIlblks_dest\fR(1). +.TP +.B 13 +- \fIoverlap\fR(2) exceeds \fIlblks_dest\fR(\fInproc\fR). +.TP +.B 14 +- \fIlblks_source\fR is nonpositive. +.TP +.B 15 +- \fIlblks_dest\fR is nonpositive. + +.SH SEE ALSO +\fBP_Transpose\fR(3) + +.SH AUTHOR +Val N. Gloukhov, gluhoff@inm.ras.ru -- GitLab
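The starting and ending index recurrences given in the DESCRIPTION sections of P_Transpose(3) and P_Transpose_init(3) can be sketched in C as below. This is only an illustration, not code from ParLib: the helper name block_ranges and its argument layout are assumptions, and the arrays hold the 1-based index values used in the manual at 0-based C positions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of ParLib).
 * Given the block sizes lblks[0..nproc-1], fill ib[] and ie[] with the
 * 1-based starting and ending indices of each processor's block:
 *   ib(1) = 1;            ib(i+1) = ib(i) + lblks(i),  i = 1..nproc-1
 *   ie(nproc) = SUM(lblks); ie(i-1) = ie(i) - lblks(i), i = nproc..2
 */
void block_ranges(const int *lblks, int nproc, int *ib, int *ie)
{
    assert(lblks != NULL && ib != NULL && ie != NULL && nproc > 0);

    /* SUM(lblks): total extent of the distributed dimension. */
    int sum = 0;
    for (int p = 0; p < nproc; ++p)
        sum += lblks[p];

    /* Forward recurrence for the starting indices. */
    ib[0] = 1;
    for (int p = 0; p + 1 < nproc; ++p)
        ib[p + 1] = ib[p] + lblks[p];

    /* Backward recurrence for the ending indices. */
    ie[nproc - 1] = sum;
    for (int p = nproc - 1; p > 0; --p)
        ie[p - 1] = ie[p] - lblks[p];
}
```

For block sizes {3, 2, 4} on three processors, this gives the ranges 1:3, 4:5 and 6:9. The same helper would serve for both the ib/ie pair (from lblks_dest) and the jb/je pair (from lblks_source), since the two recurrences have the same form.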