HPL documentation

http://www.netlib.org/benchmark/hpl/

HPL_all_reduce

HPL_all_reduce performs a global reduce operation across all processes of a group leaving the results on all processes.

HPL_barrier

HPL_barrier blocks the caller until all process members have called it. The call returns at any process only after all group members have entered the call.

HPL_bcast

HPL_bcast broadcasts the current panel. Successful completion is indicated by IFLAG set to HPL_SUCCESS on return. IFLAG will be set to HPL_FAILURE on failure...

HPL_binit

HPL_binit initializes a row broadcast. Successful completion is indicated by the returned error code HPL_SUCCESS.

HPL_broadcast

HPL_broadcast broadcasts a message from the process with rank ROOT to all processes in the group.

HPL_bwait

HPL_bwait waits for the row broadcast of the current panel to terminate. Successful completion is indicated by the returned error code HPL_SUCCESS.

HPL_copyL

HPL_copyL copies the panel of columns, the L1 replicated submatrix, the pivot array and the info scalar into a contiguous workspace for later broadcast. The...

HPL_dgemm

HPL_dgemm performs one of the matrix-matrix operations C := alpha * op( A ) * op( B ) + beta * C where op( X ) is one of op( X ) = X or op( X ) = X^T. Alpha and...

HPL_dgemv

HPL_dgemv performs one of the matrix-vector operations y := alpha * op( A ) * x + beta * y, where op( X ) is one of op( X ) = X or op( X ) = X^T, where alpha...

HPL_dger

HPL_dger performs the rank 1 operation A := alpha * x * y^T + A, where alpha is a scalar, x is an m-element vector, y is an n-element vector and A is an m by n...
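
As an illustration of the rank 1 operation described above, the following is a minimal sketch. The name dger_sketch, its argument list, the column-major storage with leading dimension lda, and the unit-stride vectors are all assumptions of this example and not the HPL_dger interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not the HPL_dger signature).
 * Computes A := alpha * x * y^T + A, where A is m by n, stored in
 * column-major order with leading dimension lda (assumed here),
 * x has m elements and y has n elements, both with unit stride. */
void dger_sketch(size_t m, size_t n, double alpha,
                 const double *x, const double *y,
                 double *a, size_t lda)
{
    assert(lda >= m);
    for (size_t j = 0; j < n; j++) {
        /* Scale once per column, then update that column of A. */
        double t = alpha * y[j];
        for (size_t i = 0; i < m; i++)
            a[i + j * lda] += x[i] * t;
    }
}
```

Looping over columns in the outer loop keeps the inner loop contiguous in memory under the assumed column-major layout.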

HPL_dlamch

HPL_dlamch determines machine-specific arithmetic constants such as the relative machine precision (eps), the safe minimum (sfmin) such that 1 / sfmin does not...

HPL_dlange

HPL_dlange returns the value of the one norm, or the infinity norm, or the element of largest absolute value of a matrix A: max(abs(A(i,j))) when NORM =...

HPL_dlaswp00N

HPL_dlaswp00N performs a series of local row interchanges on a matrix A. One row interchange is initiated for rows 0 through M-1 of A.

HPL_dlaswp01N

HPL_dlaswp01N copies scattered rows of A into itself and into an array U. The row offsets in A of the source rows are specified by LINDXA. The destination of...

HPL_dlaswp01T

HPL_dlaswp01T copies scattered rows of A into itself and into an array U. The row offsets in A of the source rows are specified by LINDXA. The destination of...

HPL_dlaswp02N

HPL_dlaswp02N packs scattered rows of an array A into workspace W. The row offsets in A are specified by LINDXA.

HPL_dlaswp03N

HPL_dlaswp03N copies columns of W into rows of an array U. The destination in U of these columns contained in W is stored within W0.

HPL_dlaswp03T

HPL_dlaswp03T copies columns of W into an array U. The destination in U of these columns contained in W is stored within W0.

HPL_dlaswp04N

HPL_dlaswp04N copies M0 rows of U into A and replaces those rows of U with columns of W. In addition M1 - M0 columns of W are copied into rows of U.

HPL_dlaswp04T

HPL_dlaswp04T copies M0 columns of U into rows of A and replaces those columns of U with columns of W. In addition M1 - M0 columns of W are copied into U.

HPL_dlaswp05N

HPL_dlaswp05N copies rows of U of global offset LINDXAU into rows of A at positions indicated by LINDXA.

HPL_dlaswp05T

HPL_dlaswp05T copies columns of U of global offset LINDXAU into rows of A at positions indicated by LINDXA.

HPL_dlaswp10N

HPL_dlaswp10N performs a sequence of local column interchanges on a matrix A. One column interchange is initiated for columns 0 through N-1 of A.

HPL_dlocmax

HPL_dlocmax finds the maximum entry in the current column and packs the useful information in WORK[0:3]. On exit, WORK[0] contains the local maximum absolute...

HPL_dlocswpN

HPL_dlocswpN performs the local swapping operations within a panel. The lower triangular N0-by-N0 upper block of the panel is stored in no-transpose form (i.e...

HPL_dlocswpT

HPL_dlocswpT performs the local swapping operations within a panel. The lower triangular N0-by-N0 upper block of the panel is stored in transpose form.

HPL_dmatgen

HPL_dmatgen generates (or regenerates) a random matrix A. The pseudo-random generator uses the linear congruential algorithm: X(n+1) = (a * X(n) + c) mod m as...
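
The recurrence X(n+1) = (a * X(n) + c) mod m can be sketched as a single step function. The name lcg_next is hypothetical, the constants a, c and m are supplied by the caller rather than taken from HPL, and the restriction m <= 2^32 is an assumption made only so that the product fits in 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: one step of X(n+1) = (a * X(n) + c) mod m.
 * Assumes m <= 2^32 and x, a, c < m, so a * x + c cannot overflow
 * a 64-bit unsigned integer. The constants are not HPL's. */
uint64_t lcg_next(uint64_t x, uint64_t a, uint64_t c, uint64_t m)
{
    assert(m > 0 && m <= (UINT64_C(1) << 32));
    assert(x < m && a < m && c < m);
    return (a * x + c) % m;
}
```

For example, with the toy constants a = 5, c = 3, m = 16 and a starting value of 1, two successive calls yield 8 and then 11.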

HPL_dtrsm

HPL_dtrsm solves one of the matrix equations op( A ) * X = alpha * B, or X * op( A ) = alpha * B, where alpha is a scalar, X and B are m by n matrices, A is a...

HPL_dtrsv

HPL_dtrsv solves one of the systems of equations A * x = b, or A^T * x = b, where b and x are n-element vectors and A is an n by n non-unit, or unit, upper or...

HPL_equil

HPL_equil equilibrates the local pieces of U, so that on exit from this function, pieces of U contained in every process row are of the same size. This phase...

HPL_grid_exit

HPL_grid_exit marks the process grid object for deallocation. The returned error code MPI_SUCCESS indicates successful completion. Other error codes are (MPI)...

HPL_grid_info

HPL_grid_info returns the grid shape and the coordinates in the grid of the calling process. Successful completion is indicated by the returned error code...

HPL_grid_init

HPL_grid_init creates an NPROW x NPCOL process grid using column- or row-major ordering from an initial collection of processes identified by an MPI...

HPL_idamax

HPL_idamax returns the index in an n-vector x of the first element having maximum absolute value.
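
A minimal sketch of this search follows. The name idamax_sketch, the zero-based return value, and the unit-stride vector are assumptions of this example; the actual HPL_idamax argument list is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the zero-based index of the first
 * element of x (n elements, unit stride) with maximum absolute value. */
size_t idamax_sketch(size_t n, const double *x)
{
    assert(n > 0);
    size_t imax = 0;
    double vmax = x[0] < 0.0 ? -x[0] : x[0];
    for (size_t i = 1; i < n; i++) {
        double v = x[i] < 0.0 ? -x[i] : x[i];
        /* Strict comparison keeps the first of several equal maxima. */
        if (v > vmax) {
            vmax = v;
            imax = i;
        }
    }
    return imax;
}
```

With x = {1.0, -3.0, 3.0, 2.0}, the sketch returns 1, since -3.0 is the first entry whose absolute value is the maximum.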

HPL_indxg2l

HPL_indxg2l computes the local index of a matrix entry pointed to by the global index IG. This local returned index is the same in all processes.

HPL_indxg2lp

HPL_indxg2lp computes the local index of a matrix entry pointed to by the global index IG as well as the process coordinate which possesses this entry. The local...

HPL_indxg2p

HPL_indxg2p computes the process coordinate which possesses the entry of a matrix specified by a global index IG.
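
The mapping from a global index to its owning process can be sketched as below, assuming a one-dimensional block-cyclic layout with a uniform block size nb and with the first block held by process srcproc. The name indxg2p_sketch and its parameters are hypothetical and simpler than the real routine's argument list.

```c
#include <assert.h>

/* Hypothetical helper: owner of global index ig under a 1-D
 * block-cyclic layout with uniform block size nb over nprocs
 * processes, where block 0 lives on process srcproc (assumed). */
int indxg2p_sketch(int ig, int nb, int srcproc, int nprocs)
{
    assert(ig >= 0 && nb > 0 && nprocs > 0);
    assert(srcproc >= 0 && srcproc < nprocs);
    /* ig / nb is the global block number; blocks are dealt round-robin. */
    return (srcproc + ig / nb) % nprocs;
}
```

For instance, with nb = 2 and three processes starting at process 0, global index 5 lies in block 2 and is therefore owned by process 2.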

HPL_indxl2g

HPL_indxl2g computes the global index of a matrix entry pointed to by the local index IL of the process indicated by PROC.
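
The reverse mapping, from a local index to a global one, can be sketched under the same kind of assumptions: a one-dimensional block-cyclic layout, a uniform block size nb, and the first block held by process srcproc. The name indxl2g_sketch and its parameters are hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: global index of local index il on process proc,
 * for a 1-D block-cyclic layout with uniform block size nb over nprocs
 * processes, where block 0 lives on process srcproc (assumed). */
int indxl2g_sketch(int il, int nb, int proc, int srcproc, int nprocs)
{
    assert(il >= 0 && nb > 0 && nprocs > 0);
    assert(proc >= 0 && proc < nprocs);
    assert(srcproc >= 0 && srcproc < nprocs);
    /* Distance of proc from the process holding the first block. */
    int mydist = (nprocs + proc - srcproc) % nprocs;
    /* Full rounds of blocks before this one, then the offset of this
     * process within the round, then the offset inside the block. */
    return (il / nb) * nprocs * nb + mydist * nb + il % nb;
}
```

With nb = 2 and two processes starting at process 0, local index 2 on process 1 corresponds to global index 6.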

HPL_infog2l

HPL_infog2l computes the starting local index II, JJ corresponding to the submatrix starting globally at the entry pointed to by I, J. This routine returns the...

HPL_jumpit

HPL_jumpit jumps in the random sequence from the number X(n) encoded in IRANN to the number X(m) encoded in IRANM using the constants A and C encoded in MULT...

HPL_ladd

HPL_ladd adds without carry two long positive integers K and J and puts the result into I. The long integers I, J, K are encoded on 64 bits using an array of 2...

HPL_lmul

HPL_lmul multiplies without carry two long positive integers K and J and puts the result into I. The long integers I, J, K are encoded on 64 bits using an array...

HPL_logsort

HPL_logsort computes an array IPMAP and its inverse IPMAPM1 that contain the logarithmically sorted process IDs with respect to the local number of rows of U that...

HPL_numroc

HPL_numroc returns the local number of matrix rows/columns process PROC will get if we give out N rows/columns starting from global index 0.
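
The count described above can be sketched as follows, assuming a one-dimensional block-cyclic distribution with a uniform block size nb in which the first block goes to process srcproc. The name numroc_sketch and its parameter list are hypothetical and are not the HPL_numroc interface.

```c
#include <assert.h>

/* Hypothetical helper: number of the n rows/columns that process proc
 * receives when they are dealt out in blocks of nb over nprocs
 * processes, starting from global index 0 on process srcproc. */
int numroc_sketch(int n, int nb, int proc, int srcproc, int nprocs)
{
    assert(n >= 0 && nb > 0 && nprocs > 0);
    assert(proc >= 0 && proc < nprocs);
    assert(srcproc >= 0 && srcproc < nprocs);
    int mydist  = (nprocs + proc - srcproc) % nprocs;
    int nblocks = n / nb;                    /* number of full blocks   */
    int count   = (nblocks / nprocs) * nb;   /* every process gets these */
    int extra   = nblocks % nprocs;          /* leftover full blocks     */
    if (mydist < extra)
        count += nb;                         /* one more full block      */
    else if (mydist == extra)
        count += n % nb;                     /* the trailing partial one */
    return count;
}
```

As a check on the arithmetic, distributing n = 7 items with nb = 2 over two processes gives 4 items to process 0 and 3 items to process 1.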

HPL_numrocI

HPL_numrocI returns the local number of matrix rows/columns process PROC will get if we give out N rows/columns starting from global index I.

HPL_packL

HPL_packL forms the MPI data type for the panel to be broadcast. Successful completion is indicated by the returned error code MPI_SUCCESS.

HPL_pddriver

main is the main driver program for testing the HPL routines. This program is driven by a short data file named "HPL.dat".

HPL_pdfact

HPL_pdfact recursively factorizes a 1-dimensional panel of columns. The RPFACT function pointer specifies the recursive algorithm to be used, either Crout...

HPL_pdgesv

HPL_pdgesv factors an N-by-N+1 matrix using LU factorization with row partial pivoting. The main algorithm is the "right looking" variant with or without...

HPL_pdgesv0

HPL_pdgesv0 factors an N-by-N+1 matrix using LU factorization with row partial pivoting. The main algorithm is the "right looking" variant without look-ahead...

HPL_pdgesvK1

HPL_pdgesvK1 factors an N-by-N+1 matrix using LU factorization with row partial pivoting. The main algorithm is the "right looking" variant with look-ahead. The...

HPL_pdgesvK2

HPL_pdgesvK2 factors an N-by-N+1 matrix using LU factorization with row partial pivoting. The main algorithm is the "right looking" variant with look-ahead. The...

HPL_pdinfo

HPL_pdinfo reads the startup information for the various tests and transmits it to all processes.

HPL_pdlamch

HPL_pdlamch determines machine-specific arithmetic constants such as the relative machine precision (eps), the safe minimum (sfmin) such that 1/sfmin does not...

HPL_pdlange

HPL_pdlange returns the value of the one norm, or the infinity norm, or the element of largest absolute value of a distributed matrix A: max(abs(A(i,j))) when...

HPL_pdlaprnt

HPL_pdlaprnt prints to standard error a distributed matrix A. The local pieces of A are sent to the process of coordinates (0,0) in the grid and then printed.

HPL_pdlaswp00N

HPL_pdlaswp00N applies the NB row interchanges to NN columns of the trailing submatrix and broadcasts a column panel. Bi-directional exchange is used to perform...

HPL_pdlaswp00T

HPL_pdlaswp00T applies the NB row interchanges to NN columns of the trailing submatrix and broadcasts a column panel. Bi-directional exchange is used to perform...

HPL_pdlaswp01N

HPL_pdlaswp01N applies the NB row interchanges to NN columns of the trailing submatrix and broadcasts a column panel. A "Spread then roll" algorithm performs the...

HPL_pdlaswp01T

HPL_pdlaswp01T applies the NB row interchanges to NN columns of the trailing submatrix and broadcasts a column panel. A "Spread then roll" algorithm performs the...

HPL_pdmatgen

HPL_pdmatgen generates (or regenerates) a parallel random matrix A. The pseudo-random generator uses the linear congruential algorithm: X(n+1) = (a * X(n) + c)...

HPL_pdmxswp

HPL_pdmxswp swaps and broadcasts the absolute value max row using bi-directional exchange. The buffer is partially set by HPL_dlocmax. Bi-directional exchange...

HPL_pdpancrN

HPL_pdpancrN factorizes a panel of columns that is a sub-array of a larger one-dimensional panel A using the Crout variant of the usual one-dimensional...

HPL_pdpancrT

HPL_pdpancrT factorizes a panel of columns that is a sub-array of a larger one-dimensional panel A using the Crout variant of the usual one-dimensional...

HPL_pdpanel_disp

HPL_pdpanel_disp deallocates the panel structure and resources and stores the error code returned by the panel factorization.

HPL_pdpanel_free

HPL_pdpanel_free deallocates the panel resources and stores the error code returned by the panel factorization.

HPL_pdpanllN

HPL_pdpanllN factorizes a panel of columns that is a sub-array of a larger one-dimensional panel A using the Left-looking variant of the usual one-dimensional...

HPL_pdpanllT

HPL_pdpanllT factorizes a panel of columns that is a sub-array of a larger one-dimensional panel A using the Left-looking variant of the usual one-dimensional...

HPL_pdpanrlN

HPL_pdpanrlN factorizes a panel of columns that is a sub-array of a larger one-dimensional panel A using the Right-looking variant of the usual one-dimensional...

HPL_pdpanrlT

HPL_pdpanrlT factorizes a panel of columns that is a sub-array of a larger one-dimensional panel A using the Right-looking variant of the usual one-dimensional...

HPL_pdrpancrN

HPL_pdrpancrN recursively factorizes a panel of columns using the recursive Crout variant of the usual one-dimensional algorithm. The lower triangular N0-by-N0...

HPL_pdrpancrT

HPL_pdrpancrT recursively factorizes a panel of columns using the recursive Crout variant of the usual one-dimensional algorithm. The lower triangular N0-by-N0...

HPL_pdrpanllN

HPL_pdrpanllN recursively factorizes a panel of columns using the recursive Left-looking variant of the one-dimensional algorithm. The lower triangular N0-by-N0...

HPL_pdrpanllT

HPL_pdrpanllT recursively factorizes a panel of columns using the recursive Left-looking variant of the one-dimensional algorithm. The lower triangular N0-by-N0...

HPL_pdrpanrlN

HPL_pdrpanrlN recursively factorizes a panel of columns using the recursive Right-looking variant of the one-dimensional algorithm. The lower triangular...

HPL_pdrpanrlT

HPL_pdrpanrlT recursively factorizes a panel of columns using the recursive Right-looking variant of the one-dimensional algorithm. The lower triangular...

HPL_pdtest

HPL_pdtest performs one test given a set of parameters such as the process grid, the problem size, the distribution blocking factor ... This function generates...

HPL_pdtrsv

HPL_pdtrsv solves an upper triangular system of linear equations. The rhs is the last column of the N by N+1 matrix A. The solve starts in the process column...

HPL_pdupdateNN

HPL_pdupdateNN broadcasts (forwards) the panel PBCST and simultaneously applies the row interchanges and updates part of the trailing (using the panel PANEL)...

HPL_pdupdateNT

HPL_pdupdateNT broadcasts (forwards) the panel PBCST and simultaneously applies the row interchanges and updates part of the trailing (using the panel PANEL)...

HPL_pdupdateTN

HPL_pdupdateTN broadcasts (forwards) the panel PBCST and simultaneously applies the row interchanges and updates part of the trailing (using the panel PANEL)...

HPL_pdupdateTT

HPL_pdupdateTT broadcasts (forwards) the panel PBCST and simultaneously applies the row interchanges and updates part of the trailing (using the panel PANEL)...

HPL_perm

HPL_perm combines two index arrays and generates the corresponding permutation. First, this function computes the inverse of LINDXA, and then combines it with...

HPL_pipid

HPL_pipid computes an array IPID that contains the source and final destination of matrix rows resulting from the application of N interchanges as computed by...

HPL_plindx0

HPL_plindx0 computes two local arrays LINDXA and LINDXAU containing the local source and final destination position resulting from the application of row...

HPL_plindx1

HPL_plindx1 computes two local arrays LINDXA and LINDXAU containing the local source and final destination position resulting from the application of row...

HPL_plindx10

HPL_plindx10 computes three arrays IPLEN, IPMAP and IPMAPM1 that contain the logarithmic mapping information for the spreading phase.

HPL_ptimer

HPL_ptimer provides "stopwatch" functionality as a CPU/wall timer in seconds. Up to 64 separate timers can be functioning at once. The first call starts the timer...

HPL_ptimer_cputime

HPL_ptimer_cputime returns the cpu time. If HPL_USE_CLOCK is defined, the clock() function is used to return an approximation of processor time used by the...

HPL_rand

HPL_rand generates the next number in the random sequence. This function ensures that this number lies in the interval (-0.5, 0.5]. The static array irand...

HPL_recv

HPL_recv is a simple wrapper around MPI_Recv. Its main purpose is to allow for some experimentation / tuning of this simple routine. Successful completion is...

HPL_reduce

HPL_reduce performs a global reduce operation across all processes of a group. Note that the input buffer is used as workarray and in all processes but the...

HPL_rollN

HPL_rollN rolls the local arrays containing the local pieces of U, so that on exit from this function U is replicated in every process row. In addition, this...

HPL_rollT

HPL_rollT rolls the local arrays containing the local pieces of U, so that on exit from this function U is replicated in every process row. In addition, this...

HPL_sdrv

HPL_sdrv is a simple wrapper around MPI_Sendrecv. Its main purpose is to allow for some experimentation and tuning of this simple function. Messages of length...

HPL_send

HPL_send is a simple wrapper around MPI_Send. Its main purpose is to allow for some experimentation / tuning of this simple routine. Successful completion is...

HPL_setran

HPL_setran initializes the random generator with the encoding of the first number X(0) in the sequence, and the constants a and c used to compute the next...

HPL_spreadN

HPL_spreadN spreads the local array containing local pieces of U, so that on exit from this function, a piece of U is contained in every process row. The array...

HPL_spreadT

HPL_spreadT spreads the local array containing local pieces of U, so that on exit from this function, a piece of U is contained in every process row. The array...

HPL_timer

HPL_timer provides "stopwatch" functionality as a CPU/wall timer in seconds. Up to 64 separate timers can be functioning at once. The first call starts the timer...

HPL_timer_cputime

HPL_timer_cputime returns the cpu time. If HPL_USE_CLOCK is defined, the clock() function is used to return an approximation of processor time used by the...

HPL_xjumpm

HPL_xjumpm computes the constants A and C to jump JUMPM numbers in the random sequence: X(n+JUMPM) = A*X(n)+C. The constants encoded in MULT and IADD specify...
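
The relation X(n+JUMPM) = A*X(n)+C follows from composing the single-step map x -> a*x + c with itself JUMPM times. The sketch below does this composition one step at a time. The name lcg_jump_constants, the explicit modulus m, and the bound m <= 2^32 are assumptions of this example, and whether HPL derives the constants this way is not stated in the summary above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: finds A and C such that applying
 * x -> (a * x + c) mod m exactly `jump` times equals
 * x -> (A * x + C) mod m. Assumes m <= 2^32 and a, c < m so that
 * the 64-bit products cannot overflow. */
void lcg_jump_constants(uint64_t a, uint64_t c, uint64_t m, uint64_t jump,
                        uint64_t *A, uint64_t *C)
{
    assert(m > 0 && m <= (UINT64_C(1) << 32));
    assert(a < m && c < m);
    uint64_t ja = 1 % m;   /* identity map: x -> 1 * x + 0 */
    uint64_t jc = 0;
    for (uint64_t k = 0; k < jump; k++) {
        /* a * (ja * x + jc) + c = (a * ja) * x + (a * jc + c) */
        jc = (a * jc + c) % m;
        ja = (a * ja) % m;
    }
    *A = ja;
    *C = jc;
}
```

With the toy constants a = 5, c = 3, m = 16 and a jump of 2, the sketch yields A = 9 and C = 2, and 9*1 + 2 = 11 matches two single steps starting from 1 (1 -> 8 -> 11). The loop costs one iteration per jumped number; a repeated-squaring variant would need only a logarithmic number of steps.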
