HPL_pdfact - Man Page

recursive panel factorization.

Synopsis

#include "hpl.h"
 void HPL_pdfact( HPL_T_panel * PANEL );

Description

HPL_pdfact recursively factorizes a  1-dimensional  panel of columns. The  RPFACT  function pointer specifies the recursive algorithm to be used, either Crout, Left- or Right looking.  NBMIN allows to vary the recursive stopping criterium in terms of the number of columns in the panel, and  NDIV  allow to specify the number of subpanels each panel should be divided into. Usuallly a value of 2 will be chosen. Finally PFACT is a function pointer specifying the non-recursive algorithm to to be used on at most NBMIN columns. One can also choose here between Crout, Left- or Right looking.  Empirical tests seem to indicate that values of 4 or 8 for NBMIN give the best results.
 Bi-directional  exchange  is  used  to  perform  the  swap::broadcast operations  at once  for one column in the panel.  This  results in a lower number of slightly larger  messages than usual.  On P processes and assuming bi-directional links,  the running time of this function can be approximated by (when N is equal to N0):                      

  N0 * log_2( P ) * ( lat + ( 2*N0 + 4 ) / bdwth ) +
  N0^2 * ( M - N0/3 ) * gam2-3
 where M is the local number of rows of  the panel, lat and bdwth  are the latency and bandwidth of the network for  double  precision  real words, and  gam2-3  is  an estimate of the  Level 2 and Level 3  BLAS rate of execution. The  recursive  algorithm  allows indeed to almost achieve  Level 3 BLAS  performance  in the panel factorization.  On a large  number of modern machines,  this  operation is however latency bound,  meaning  that its cost can  be estimated  by only the latency portion N0 * log_2(P) * lat.  Mono-directional links will double this communication cost.

Arguments

PANEL   (local input/output)    HPL_T_panel *

On entry,  PANEL  points to the data structure containing the panel information.

See Also

HPL_dlocmax (3), HPL_dlocswpN (3), HPL_dlocswpT (3), HPL_pdmxswp (3), HPL_pdpancrN (3), HPL_pdpancrT (3), HPL_pdpanllN (3), HPL_pdpanllT (3), HPL_pdpanrlN (3), HPL_pdpanrlT (3), HPL_pdrpancrN (3), HPL_pdrpancrT (3), HPL_pdrpanllN (3), HPL_pdrpanllT (3), HPL_pdrpanrlN (3), HPL_pdrpanrlT (3).

Referenced By

HPL_dlocmax(3), HPL_dlocswpN(3), HPL_dlocswpT(3), HPL_pdgesv0(3), HPL_pdgesvK1(3), HPL_pdgesvK2(3), HPL_pdmxswp(3), HPL_pdrpancrN(3), HPL_pdrpancrT(3), HPL_pdrpanllN(3), HPL_pdrpanllT(3), HPL_pdrpanrlN(3), HPL_pdrpanrlT(3).

February 24, 2016 HPL 2.2 HPL Library Functions