Matrices from Andre Garon, Univ. of Montreal. 2D Navier-Stokes.

(andreg :at the domain: CERCA.UMontreal.CA).
2D finite-element discretisation of the Navier-Stokes
equations. The geometry is simply a square, with inlet and outlet on
opposing sides.

The "matrix_big" file got truncated in transmission...

Performance of various solvers:

--------------------------------------------------------------------------------
To: andreg :at the domain: CERCA.UMontreal.CA
cc: bramley :at the domain: cs.indiana.edu, davis :at the domain: cise.ufl.edu
Date: Mon, 13 May 1996 13:48:24 -0400
From: Tim Davis <davis :at the domain: cise.ufl.edu>

Andre,

Here are some initial results with your matrices.
Run time is the complete factorization time (ordering,
symbolic factorization, and numerical factorization).

This is on a Sun UltraSparc, with 128MB of memory,
2GB of swap space.  Peak performance in the BLAS is about
80 mflops in double precision, 160 mflops in single
(I'm using double prec.).

The methods used:

	UMFPACK2: default parameters, version 2.1 (at my ftp site,
		but very similar to version 2.0 in netlib).
	SuperLU-mmd:  SuperLU with MMD preordering on A'*A
	SuperLU-camd: SuperLU with COLAMD preordering on A'*A
		(COLAMD is a code I'm working on).
	MA48-def: from Harwell Subr. Library.  default parameters,
		successor to MA28.
	MA48-sym: MA48 but strictly with symmetric pivoting only.
	MA42: from Harwell Subr. Library.  A unifrontal code.

Both matrix_small and matrix_medium are easily factorizable.
This is perhaps not as fast as an iterative method would be, but
they can be factorized.

"matrix_small" n=3175, nz=88927.

		time(sec)	nz in LU	flop count
UMFPACK2	3.95		754158		0.1318D+09
SuperLU-mmd	4.62		700626		0.1061D+09
SuperLU-camd	3.94		625623		0.0788D+09
MA48-def       17.8		646982		0.1371D+09
MA48-sym       11.9		485733		0.0693D+09
MA42	       10.9	       1044187		0.0772D+09

"matrix_medium" n=13535, nz=390607

		time(sec)	nz in LU	flop count
UMFPACK2	116.9		8298907		0.5235D+10
SuperLU-mmd	 48.1		4725883		0.1251D+10
SuperLU-camd	 54.5		4673004		0.1280D+10
MA48-def	472.1		6297647		0.4531D+10
MA48-sym	293.9		3826925		0.1520D+10
MA42		128.1		9020565		1.3587D+10

It looks like UMFPACK2 is getting unacceptable fill-in.  The
diagonal is good - MA48-sym works better than MA48-def.
SuperLU seems to work the best ... HOWEVER ... you can't use
just SuperLU alone.  It needs a column preordering.  I shudder to
think what would happen if you didn't preorder the columns
(maybe I should try it).

This is from a fluid flow problem, right?  UMFPACK seems to have
trouble with those.  Can you email me the details?
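
The fill-in and flop figures in the two tables above are easier to compare
once they are divided by nz of A and by the run time. A minimal sketch in
Python follows. The numbers are copied from the tables; the names (NZ_A,
RESULTS, fill_ratio, mflops) are introduced here for illustration only.
Because the reported time covers ordering, symbolic and numerical
factorization, the computed rate is only a lower bound on the rate of the
numerical factorization itself.

```python
# Figures copied from the matrix_small and matrix_medium tables above.
# NZ_A holds nz of the original matrix A.
NZ_A = {"matrix_small": 88927, "matrix_medium": 390607}

# Each entry: method -> (time in seconds, nz in LU, flop count).
RESULTS = {
    "matrix_small": {
        "UMFPACK2": (3.95, 754158, 0.1318e9),
        "SuperLU-mmd": (4.62, 700626, 0.1061e9),
        "SuperLU-camd": (3.94, 625623, 0.0788e9),
        "MA48-def": (17.8, 646982, 0.1371e9),
        "MA48-sym": (11.9, 485733, 0.0693e9),
        "MA42": (10.9, 1044187, 0.0772e9),
    },
    "matrix_medium": {
        "UMFPACK2": (116.9, 8298907, 0.5235e10),
        "SuperLU-mmd": (48.1, 4725883, 0.1251e10),
        "SuperLU-camd": (54.5, 4673004, 0.1280e10),
        "MA48-def": (472.1, 6297647, 0.4531e10),
        "MA48-sym": (293.9, 3826925, 0.1520e10),
        "MA42": (128.1, 9020565, 1.3587e10),
    },
}


def fill_ratio(matrix, method):
    """Return nz in LU divided by nz in A."""
    _, nz_lu, _ = RESULTS[matrix][method]
    return nz_lu / NZ_A[matrix]


def mflops(matrix, method):
    """Return flop count over total run time, in millions of flops per second."""
    seconds, _, flops = RESULTS[matrix][method]
    return flops / seconds / 1e6


if __name__ == "__main__":
    for matrix, methods in RESULTS.items():
        print(matrix)
        for method in methods:
            print(f"  {method:<13} fill {fill_ratio(matrix, method):6.2f}x"
                  f"  {mflops(matrix, method):7.1f} Mflops")
```

On matrix_medium the fill ratio for UMFPACK2 comes out at a little over 21,
against roughly 12 for either SuperLU variant. The rate computed for MA42 on
matrix_medium comes out above the quoted BLAS peak of about 80 mflops, so
that flop count entry is worth checking.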


--------------------------------------------------------------------------------
Subject: more results
Date: Fri, 17 May 1996 16:23:22 -0400
From: Tim Davis <davis :at the domain: cise.ufl.edu>


Andre,

I ran MA41, the "symmetric-pattern" multifrontal method (a new version,
to appear in the next release of the Harwell Subroutine Library), on
your matrices.

Here are the results.  Basically, MA41 is quite a bit faster for these
matrices than UMFPACK (=MA38).  I doubt there's much I can do to beat
these run times.  It would be possible to improve UMFPACK, I think,
so it wouldn't be as slow as it is.  MA41 is also faster than
SuperLU, MA42, and MA48 for these matrices.  MA41 is also more
accurate than UMFPACK2, probably because of the smaller flop count.

These matrices have symmetric nonzero pattern.

Do you have problems that lead to matrices with unsymmetric nonzero
pattern?
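
For reference, "symmetric nonzero pattern" means that entry (i,j) is stored
whenever entry (j,i) is, whatever the numerical values are. A minimal sketch
of that check in Python, run on two small made-up coordinate lists (not the
Garon matrices); has_symmetric_pattern is a hypothetical helper name:

```python
def has_symmetric_pattern(entries):
    """Return True if position (j, i) is present whenever (i, j) is.

    `entries` is an iterable of (row, col, value) triples. Only the
    positions matter; values may differ across the diagonal.
    """
    pattern = {(i, j) for i, j, _ in entries}
    return all((j, i) in pattern for i, j in pattern)


# Made-up 3x3 examples, for illustration only.
sym_pattern = [(0, 0, 4.0), (0, 1, -1.0), (1, 0, 2.5), (1, 1, 3.0), (2, 2, 1.0)]
unsym_pattern = [(0, 0, 4.0), (0, 2, -1.0), (1, 1, 3.0), (2, 2, 1.0)]

print(has_symmetric_pattern(sym_pattern))    # True
print(has_symmetric_pattern(unsym_pattern))  # False
```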

Thanks,
Tim

p.s., you'll need a wide screen to read these results.
--------------------------------------------------------------------------------

This is on a lightly loaded UltraSparc, May 15-17, 1996,
128MB memory, 2GB swap space.  A large difference between CPU and
wall-clock time indicates that the method is thrashing in swap space.

Method.A is the method using DEFAULT parameters, except UMFPACK uses u=0.01.
Method.B uses non-default parameters.

        MA41.A: no max transversal, u=0.01 (defaults)
        MA41.B: max transversal, u=0.01
        UMF*.A: BTF and no symmetric preference (defaults), u=0.01
        UMF*.B: no BTF, and with symmetric preference, u=0.01


"total time" is analysis+factorize, not including solve time.
All times in seconds.

method     total time        |  num. factorize  |  solve time      |  nz in | flop     | error
           cpu       wall    |  cpu      wall   |  cpu       wall  |  LU    | count    | (max norm)


matrices/Garon/garon1.rua
MA41.A     0.835     0.938     0.729     0.810     0.036     0.078    357037 2.6470e+07     5.15D-12
MA41.B     0.869     0.941     0.730     0.790     0.036     0.037    357037 2.6470e+07     5.15D-12
UMF2.A     3.844     3.951     3.304     3.332     0.068     0.082    728318 1.3160e+08   0.6468E-04
UMF2.B     2.501     2.504     2.103     2.110     0.057     0.057    606371 8.2460e+07   0.3974E-07
UMF1.A     4.077     4.087     3.863     3.889     0.083     0.083    821459 1.1397e+08   0.1419E-03
UMF1.B     2.102     2.109     1.641     1.643     0.058     0.058    558734 5.5743e+07   0.7491E-06


matrices/Garon/garon2.rua
MA41.A     7.645     8.046     7.133     7.492     0.215     0.217   2396585 3.4220e+08     2.76D-11
MA41.B     7.785     8.187     7.141     7.499     0.216     0.224   2396585 3.4220e+08     2.76D-11
UMF2.A    76.076    81.396    82.050   167.636     1.013    27.110   7322867 3.2870e+09   0.6202E+02
UMF2.B    31.693    31.801    30.761    31.005     0.386     0.386   4556143 1.2800e+09   0.5692E-04
UMF1.A    73.480    82.058    86.705   239.346     1.098    28.960   7100507 2.9960e+09   0.7279E-01
UMF1.B    25.236    25.665    25.874    26.828     0.408     0.408   4099241 9.2035e+08   0.1846E-04
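
The note above about CPU versus wall-clock time can be checked mechanically.
A minimal sketch in Python, using the "num. factorize" columns for garon2.rua
from the table. The cutoff of 1.5 is an arbitrary assumption made here for
illustration, not something stated with the original results, and
wall_over_cpu and likely_thrashing are hypothetical helper names.

```python
# Numerical-factorization (cpu, wall) times in seconds for garon2.rua,
# copied from the table above.
GARON2_FACTORIZE = {
    "MA41.A": (7.133, 7.492),
    "MA41.B": (7.141, 7.499),
    "UMF2.A": (82.050, 167.636),
    "UMF2.B": (30.761, 31.005),
    "UMF1.A": (86.705, 239.346),
    "UMF1.B": (25.874, 26.828),
}

# Assumed cutoff, chosen only for illustration.
THRASH_RATIO = 1.5


def wall_over_cpu(cpu, wall):
    """Return the ratio of wall-clock time to CPU time."""
    return wall / cpu


def likely_thrashing(times, cutoff=THRASH_RATIO):
    """Return the methods whose wall/cpu ratio exceeds the cutoff."""
    return [method for method, (cpu, wall) in times.items()
            if wall_over_cpu(cpu, wall) > cutoff]


if __name__ == "__main__":
    for method, (cpu, wall) in GARON2_FACTORIZE.items():
        print(f"{method}  wall/cpu = {wall_over_cpu(cpu, wall):.2f}")
    print("likely thrashing:", likely_thrashing(GARON2_FACTORIZE))
```

With this cutoff only UMF2.A and UMF1.A are flagged. These are the
default-parameter UMFPACK runs, which also have the largest nz in LU for
garon2.rua in the table (7322867 and 7100507).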