The University of Florida Sparse Matrix Collection
For a general-audience description of the matrices and images
posted here, see our short note on Visualizing
Sparse Matrices.
(click reload to see a replay of the animation).
Maintained by Tim Davis
and
Yifan Hu, AT&T Research.
From the abstract of the paper
The University of Florida Sparse Matrix Collection:
We describe the University of Florida Sparse Matrix Collection, a large
and actively growing set of sparse matrices that arise in real applications.
The Collection is widely used by the numerical linear algebra community for
the development and performance evaluation of sparse matrix algorithms. It
allows for robust and repeatable experiments:
robust because performance results with artificially-generated matrices can be
misleading, and repeatable because matrices are curated and made publicly
available in many formats.
Its matrices cover a wide spectrum of domains, include those arising from
problems with underlying 2D or 3D geometry (as structural engineering,
computational fluid dynamics, model reduction, electromagnetics, semiconductor
devices, thermodynamics, materials, acoustics, computer graphics/vision,
robotics/kinematics, and other discretizations) and those that typically do not
have such geometry (optimization, circuit simulation, economic and
financial modeling, theoretical and quantum chemistry, chemical process
simulation, mathematics and statistics, power networks, and other networks and
graphs).
We provide software for accessing and managing the Collection, from
MATLAB, Mathematica, Fortran, and C, as well as an
online search capability. Graph visualization of the matrices is provided, and
a new multilevel coarsening scheme is proposed to facilitate this task.
The collection also appears as a
Public Data Set hosted by
Amazon Web Services, at
aws.amazon.com:
Sample Gallery of the University of Florida Sparse Matrix Collection:
Click on the thumbnails below for a close-up.
The images above of matrices in the UF Sparse Matrix Collection were created by
Yifan Hu, AT&T.
As of July 2011, it contains 2547 problems (some of which are sequences of
dozens of matrices). The largest has a dimension of 118 million, and the
matrix with the most nonzero entries has almost 2 billion of them. The
matrices are available in three formats: MATLAB mat-file, Rutherford-Boeing,
and Matrix Market. Note that the MATLAB mat-files can only be read by MATLAB
7.0 or later.
This collection is managed by Tim Davis, with images created by Yifan Hu.
``Editors'' of other collections
are attributed, via the Problem.ed field in each problem set. Problem.author
is the matrix creator. Other collections are always welcome.
Click here for a paper describing the collection.
The UFget
package includes a Java program (UFgui)
for browsing and downloading the matrices in any format on any platform.
Below is a screenshot (click on it to download the code):
Browse the collection:
Statistics:
MATLAB and stand-alone Java GUI interfaces:
- Click here for the UFget MATLAB and
UFgui Java interfaces.
UFget.m provides
for simple access to the collection, right inside your MATLAB workspace.
From inside MATLAB, UFget will download a matrix, cache it locally, and
load it into your MATLAB workspace. No need to use a browser to get a matrix.
You can even use the built-in index to search for matrices that fit your
criteria ... all inside MATLAB.
For example, to download all symmetric matrices
into MATLAB, in increasing size as measured by nnz(A):
index = UFget ; % get index of the UF Sparse Matrix Collection
ids = find (index.numerical_symmetry == 1) ;
[ignore, i] = sort (index.nnz (ids)) ;
ids = ids (i) ;
for id = ids
Prob = UFget (id) % Prob is a struct (matrix, name, meta-data, ...)
A = Prob.A ; % A is a symmetric sparse matrix
end
The
DIMACS10
Challenge Problems can be accessed in the UF Collection,
using the
dimacs10
toolbox.
Software:
The SuiteSparse collection of packages includes
all of the software that I used to generate these web pages, and to
manage the matrices themselves (creating the Matrix Market and Rutherford/
Boeing files, for example).
You don't need SuiteSparse to access the matrices, however.
References:
-
To cite this collection, use the following:
The University of Florida Sparse Matrix Collection
T. A. Davis and Y. Hu,
ACM Transactions on Mathematical Software (to appear),
http://www.cise.ufl.edu/research/sparse/matrices.
- For additional background, see: Duff, I.S, Grimes, R. G, and Lewis, J. G,
Sparse matrix test problems,
ACM Transactions on Mathematical Software,
vol 15, no. 1, pp 1-14, 1989.
This describes the Harwell-Boeing collection which is the starting point
of the UF Sparse Matrix Collection (the first 292 matrices).
-
The Rutherford-Boeing format is described in the document
Rutherford-Boeing Sparse Matrix Collection
-
See the Matrix Market
for a description of the Matrix Market format.
To submit matrices to this collection:
Sparse matrices from real applications are critical to the development of
sparse matrix algorithms. Many sparse matrix algorithm developers
use the matrices at this site to test their methods. If you would like
the next generation of sparse matrix methods to work well on matrices
from your problem domain, then please submit matrices to the collection,
at:
http://www.cise.ufl.edu/dropbox/www
(for recipient: "davis@cise.ufl.edu").
If you upload a file there, I will automatically be notified via email.
Please include details about the matrix.
In particular, include a paragraph or more about the problem the matrix
represents. Include any citations: journal articles, web pages,
conference papers, books, etc that give more details.
You cannot include this description in the
http://www.cise.ufl.edu/dropbox/www form, so please include it an uploaded file, or email it to me.
Use any reasonable format; just tell me what you use. I prefer either the
Matrix Market format, or a MATLAB *.mat file.
Another simple method is the triplet format. The triplet format is a simple
ascii file with nz lines; each line contains a row index, column index, and
numerical value of one entry in the matrix (two values for a complex matrix,
the real part followed by the imaginary part). Duplicates are OK - these are
summed in the output matrix. The triplets can be in any order. If the matrix
dimension cannot be inferred from the row and column indices, please tell me
what they are in another file or email message.
If you wish to include other data (right-hand-sides, solutions,
cost vector c for a linear programming problem, and so on),
use a separate file for each matrix in your
problem. A dense vector of length n should appear as a file with n lines, and
one entry per line (or use the Matrix Market format for dense matrices).
Right-hand sides are particularly important for testing
iterative solvers for sparse linear systems.
Graph drawings, by Yifan Hu
Yifan Hu, at AT&T Labs
has created a
graph drawing program
that can generate truly beautiful drawings of a large graph, based solely on
the connectivity (that is, a sparse matrix). Take a look at his
drawings of the matrices in the UF Sparse Matrix Collection.
Each square matrix in the UF Sparse Matrix Collection
has a link to his graph drawings; clicking on them
will bring up his web page for that matrix, including a link to a higher
resolution image.
For a demo of how Yifan's algorithm works, see the
GraphPlot function, which he wrote for Mathematica,
or you can view it here by right-clicking the figure below and selecting "Play".
(or just click "reload" on your browser).
Below is Yifan Hu's graph drawing of the
Chen/pkustk01 matrix that
I obtained from Pu Chen, Beijing University.
The matrix is a model of the
Beijing Botanical Garden Conservatory.
Overlayed on top of the graph is a picture of the actual building.