Bibliography

ABE+97
P. Alpatov, G. Baker, C. Edwards, J. Gunnels, G. Morrow, J. Overfelt, R. van de Geijn, and Y.-J. J. Wu.
PLAPACK: Parallel linear algebra package design overview.
In Supercomputing, 1997.
Also available at http://www.supercomp.org/sc97/proceedings/TECH/ALPATOV/INDEX.HTM.
ABKP03
Mark F. Adams, Harun H. Bayraktar, Tony M. Keaveny, and Panayiotis Papadopoulos.
Applications of algebraic multigrid to large-scale finite element analysis of whole bone micro-mechanics on the IBM SP.
In Proceedings SC'03. IEEE Press, 2003.
Also available at http://www.sc-conference.org/sc2003.
ACC+90
R. Alverson, D. Callahan, D. Cummings, B. Koblenz, A. Porterfield, and B. Smith.
The Tera computer system.
In Proceedings of the ACM International Conference on Supercomputing, pages 1-6, June 1990.
ACK+02
David P. Anderson, Jeff Cobb, Eric Korpela, Matt Lebofsky, and Dan Werthimer.
SETI@home: An experiment in public-resource computing.
Communications of the ACM, 45(11):56-61, November 2002.
ADE+01
V. Aslot, M. Domeika, R. Eigenmann, G. Gaertner, W. B. Jones, and B. Parady.
SPEComp: A new benchmark suite for measuring parallel computer performance.
In OpenMP Shared Memory Parallel Programming, volume 2104 of Lecture Notes in Computer Science, pages 1-10. Springer Verlag, 2001.
AE03
Vishal Aslot and Rudolf Eigenmann.
Quantitative performance analysis of the SPEC OMP 2001 benchmarks.
Scientific Programming, 11:105-124, 2003.
AGH00
Ken Arnold, James Gosling, and David Holmes.
The Java(TM) Programming Language.
Addison-Wesley Pub Co, 3rd edition, 2000.
AIS77
Christopher Alexander, Sara Ishikawa, and Murray Silverstein.
A Pattern Language: Towns, Buildings, Construction.
Oxford University Press, 1977.
AJMJS02
Al-Jaroodi, Mohamed, Jiang, and Swanson.
A comparative study of parallel and distributed Java projects.
In IPDPS Workshop on Java for Parallel and Distributed Computing, 2002.
AML+99
E. Ayguade, X. Martorell, J. Labarta, M. Gonzalez, and N. Navarro.
Exploiting multiple levels of parallelism in OpenMP: a case study.
In Proceedings of the 1999 International Conference on Parallel Processing, pages 172-80. IEEE Comput. Soc., 1999.
And00
Gregory R. Andrews.
Foundations of Multithreaded, Parallel, and Distributed Programming.
Addison-Wesley, 2000.
ARv03
C. Addison, Y. Ren, and M. van Waveren.
OpenMP issues arising in the development of parallel BLAS and LAPACK libraries.
Scientific Programming, 11(2):95-104, 2003.
BB99
Christian Brunschen and Mats Brorsson.
OdinMP/CCP: A portable implementation of OpenMP for C.
In Proceedings of the European Workshop on OpenMP, 1999.
Also available at http://www.commmunity.org/eayguade/resPub/papers/ewomp99/brunschen.pdf
BBC+03
C. Bell, D. Bonachea, Y. Cote, J. Duell, P. Hargrove, P. Husbands, C. Iancu, M. Welcome, and K. Yelick.
An evaluation of current high-performance networks.
In Proceedings of IPDPS, 2003.
BBE+99
Steve W. Bova, Clay P. Breshears, Rudolf Eigenmann, Henry Gabb, Greg Gaertner, Bob Kuhn, Bill Magro, Stefano Salvini, and Veer Vatsa.
Combining message passing and directives in parallel applications.
SIAM News, 32(9), November 1999.
BC87
K. Beck and W. Cunningham.
Using pattern languages for object-oriented programs.
Presented at the Workshop on Specification and Design, held in conjunction with OOPSLA 1987. Available at http://c2.com/doc/oopsla87.html.
BC00
M. Baker and B. Carpenter.
A proposed Jini infrastructure to support a Java message passing implementation. 
In Proceedings of the 2nd Annual Workshop on Active Middleware Services.  Kluwer Academic Publishers, 2000. Held at HPDC-9.
BCC+97
L. S. Blackford, J. Choi, A. Cleary, E. D'Azevedo, J. Demmel, I. Dhillon, J. Dongarra, S. Hammarling, G. Henry, A. Petitet, K. Stanley, D. Walker, and R. C. Whaley.
ScaLAPACK Users' Guide.
Society for Industrial and Applied Mathematics, Philadelphia, PA, 1997.
BCKL98
M. Baker, B. Carpenter, S. Ko, and X. Li.
mpiJava: A Java interface to MPI.
Presented at the First UK Workshop on Java for High Performance Network Computing, Europar 1998. Available at http://www.hpjava.org/papers/mpiJava/mpiJava.pdf.
BCM+91
R. Bjornson, N. Carriero, T. G. Mattson, D. Kaminsky, and A. Sherman.
Experience with Linda.
Technical Report RR-866, Yale University Computer Science Department, August 1991.
BDK95
A. Baratloo, P. Dasgupta, and Z. M. Kedem.
CALYPSO:  A novel software system for fault tolerant parallel processing on distributed platforms.
In Proceedings of the 4th IEEE International Symposium on High Performance Distributed Computing, 1995.
Beo
Beowulf.org: The Beowulf cluster site.
http://www.beowulf.org.
BGMS98
Satish Balay, William D. Gropp, Lois Curfman McInnes, and Barry F. Smith.
PETSc home page.
http://www.mcs.anl.gov/petsc, 1998.
BH86
Josh Barnes and Piet Hut.
A hierarchical O(N log N) force calculation algorithm.
Nature, 324(4), December 1986.
BJK+96
Robert D. Blumofe, Christopher F. Joerg, Bradley C. Kuszmaul, Charles E. Leiserson, Keith H. Randall, and Yuli Zhou.
Cilk: An efficient multithreaded runtime system.
Journal of Parallel and Distributed Computing, 37(1):55-69, 25 August 1996.
BKS91
R. Bjornson, C. Kolb, and A. Sherman.
Ray tracing with network Linda.
SIAM News, 24(1), January 1991.
BMR+96
Frank Buschmann, Regine Meunier, Hans Rohnert, Peter Sommerlad, and Michael Stal.
Pattern-Oriented Software Architecture, Volume 1: A System of Patterns.
John Wiley & Sons Ltd, 1996.
BP99
Robert D. Blumofe and Dionisios Papadopoulos.
Hood: A user-level threads library for multiprogrammed multiprocessors.
Technical report, University of Texas, 1999.
See also http://www.cs.utexas.edu/users/hood/.
BT89
D. P. Bertsekas and J. N. Tsitsiklis.
Parallel and Distributed Computation: Numerical Methods.
Prentice-Hall, 1989.
But97
David R. Butenhof.
Programming with POSIX(R) Threads.
Addison-Wesley, 1st edition, 1997.
CD97
A. Cleary and J. Dongarra.
Implementation in ScaLAPACK of divide-and-conquer algorithms for banded and tridiagonal linear systems.
Technical Report CS-97-358, University of Tennessee, Knoxville, TN 37996, USA, 1997.
Also available as LAPACK Working Note #124 from http://www.netlib.org/lapack/lawns/.
CDK+00
Rohit Chandra, Leonardo Dagum, Dave Kohr, Dror Maydan, Jeff McDonald, and Ramesh Menon.  
Parallel Programming in OpenMP.  
Morgan Kaufmann Publishers, 2000.
Cen
The center for programming models for scalable parallel computing.
http://www.pmodels.org.
CG86
K. L. Clark and S. Gregory.
PARLOG: Parallel programming in logic.
ACM Trans. Programming Language Systems, 8(1):1-49, 1986.
CG91
N. Carriero and D. Gelernter.
How to Write Parallel Programs: A First Course.
MIT Press, 1991.
CGMS94
N. J. Carriero, D. Gelernter, T. G. Mattson, and A. H. Sherman.
The Linda alternative to message-passing systems.
Parallel Computing, 20:633-655, 1994.
CKP+93
D. Culler, R. Karp, D. Patterson, A. Sahay, K. E. Schauser, R. Subramonian, E. Santos, and T. von Eicken.
LogP: Toward a realistic model of parallel computation.
In ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 1-12, May 1993.
CLL+99
James Cowie, Hongbo Liu, Jason Liu, David Nicol, and Andy Ogielski.
Towards realistic million-node internet simulations.
In Proceedings of the 1999 International Conference on Parallel and Distributed Processing, 1999.
See also http://www.ssfnet.org.
CLW+00
A. Choudhary, W. Liao, D. Weiner, P. Varshney, R. Linderman, and R. Brown.
Design, implementation, and evaluation of parallel pipelined STAP on parallel computers.
IEEE Transactions on Aerospace and Electronic Systems, 36(2):528-548, April 2000.
CMD+00
R. Chandra, R. Menon, L. Dagum, D. Kohr, D. Maydan, and J. McDonald.
Parallel Programming in OpenMP.
Morgan Kaufmann Publishers, 2000.
Co
Co-Array Fortran.
http://www.co-array.org.
Con
Concurrent ML.
http://cml.cs.uchicago.edu.
COR
CORBA.
www.corba.org.
CPP01
Barbara Chapman, Amit Patil, and Achal Prabhakar.
Performance-oriented programming for NUMA architectures.
In R. Eigenmann and M. J. Voss, editors, Proceedings of WOMPAT 2001 (LNCS 2104), pages 137-154. Springer Verlag, 2001.
CS95
J. O. Coplien and D. C. Schmidt, editors.
Pattern Languages of Program Design.
Addison-Wesley, 1995.
DD97
J. J. Dongarra and T. Dunigan.
Message-passing performance of various computers.
Concurrency: Practice and Experience, 9(10):915-926, 1997.
DFF+02
Jack Dongarra, Ian Foster, Geoffrey Fox, Ken Kennedy, Andy White, Linda Torczon, and William Gropp, editors.
The Sourcebook of Parallel Computing.
Morgan Kaufmann, 2002.
DFP+94
S. Das, R. M. Fujimoto, K. Panesar, D. Allison, and M. Hybinette.
GTW: A Time Warp system for shared memory multiprocessors.
In Proceedings of the 1994 Winter Simulation Conference, pages 1332-1339, 1994.
DGO+94
P. Dinda, T. Gross, D. O'Hallaron, E. Segall, J. Stichnoth, J. Subhlok, J. Webb, and B. Yang.
The CMU task parallel program suite.
Technical Report CMU-CS-94-131, School of Computer Science, Carnegie Mellon University, March 1994.
DKK90
Jack Dongarra, Alan H. Karp, and David J. Kuck.
1989 Gordon Bell prize.
IEEE Software, 7(3):100-104, 110, 1990.
Dou86
A. Douady.
Julia sets and the Mandelbrot set.
In H. O. Peitgen and P. H. Richter, editors, The Beauty of Fractals: Images of Complex Dynamical Systems, page 161. Berlin: Springer-Verlag, 1986.
DS80
E. W. Dijkstra and C. S. Scholten.
Termination detection for diffusing computations.
Information Processing Letters, 11(1), August 1980.
DS87
J. J. Dongarra and D. C. Sorensen.
A fully parallel algorithm for the symmetric eigenvalue problem.
SIAM J. Sci. and Stat. Comp., 8:S139-S154, 1987.
EG88
David Eppstein and Zvi Galil.
Parallel algorithmic techniques for combinatorial computation.
Annual Reviews in Computer Science, 3:233-283, 1988.
Ein00
David Einstein.
Compaq a winner in gene race.
Forbes.com, June 26, 2000.
http://www.forbes.com/2000/06/26/mu7.html.
EM
Rudolf Eigenmann and Timothy G. Mattson.
OpenMP tutorial, part 2: Advanced OpenMP.
A tutorial presented at SC'2001, Denver, Colorado, USA, 2001.
EV01
Rudolf Eigenmann and Michael J. Voss, editors.
OpenMP Shared Memory Parallel Programming, volume 2104 of Lecture Notes in Computer Science.
Springer-Verlag, 2001.
EWO01
Special issue.
Scientific Programming, 9(2-3), 2001.
Selected papers from the Second European Workshop on OpenMP (EWOMP 2000).
FCO90
J. T. Feo, D. C. Camm, and R. R. Oldehoeft.
A report on the SISAL language project.
Journal of Parallel and Distributed Computing, 12:349, 1990.
FHA99
Eric Freeman, Susanne Hupfer, and Ken Arnold.
JavaSpaces(TM) Principles, Patterns, and Practice.
Addison-Wesley, 1999.
FJL+88
G. Fox, M. Johnson, G. Lyzenga, S. Otto, J. Salmon, and D. Walker.
Solving Problems on Concurrent Processors.
Prentice Hall, 1988.
FK03
I. Foster and C. Kesselman.
The Grid 2: Blueprint for a New Computing Infrastructure, 2nd edition.
Morgan Kaufmann Publishers, 2003.
FLR98
Matteo Frigo, Charles Leiserson, and Keith Randall.
The implementation of the Cilk-5 multithreaded language.
In Proceedings of 1998 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 1998.
Fly72
M. J. Flynn.
Some computer organizations and their effectiveness.
IEEE Transactions on Computers, C-21(9), 1972.
GAM+00
M. Gonzalez, E. Ayguade, X. Martorell, J. Labarta, N. Navarro, and J. Oliver.
NanosCompiler: Supporting flexible multilevel parallelism in OpenMP.
Concurrency: Practice and Experience, Special Issue on OpenMP, 12(12):1205-1218, October 2000.
GG90
L. Greengard and W. D. Gropp.
A parallel version for the fast multipole method.
Computers Math. Applic., 20(7), 1990.
GGHvdG01
John A. Gunnels, Fred G. Gustavson, Greg M. Henry, and Robert A. van de Geijn.
FLAME: Formal linear algebra methods environment.
ACM Trans. Math. Soft., 27(4):422-455, December 2001.
Also see http://www.cs.utexas.edu/users/flame/.
GHJV95
Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides.
Design Patterns: Elements of Reusable Object-Oriented Software.
Addison-Wesley, 1995.
GL96
G. H. Golub and C. F. Van Loan.
Matrix Computations, 3rd Edition.
The Johns Hopkins University Press, 1996.
Gloa
Global Arrays.
http://www.emsl.pnl.gov/docs/global/ga.html.
Glob
The Globus Alliance.
http://www.globus.org/.
GLS99
William Gropp, Ewing Lusk, and Anthony Skjellum.
Using MPI: Portable Parallel Programming with the Message-Passing Interface.
The MIT Press, 1999.
GOS94
T. Gross, D. O'Hallaron, and J. Subhlok.
Task parallelism in a High Performance Fortran framework.
IEEE Parallel & Distributed Technology, 2(2):16-26, 1994.
Also see http://www.cs.cmu.edu/afs/cs.cmu.edu/project/iwarp/member/fx/public/www/fx.html.
GS98
William Gropp and Marc Snir.
MPI: The Complete Reference, 2nd Edition.
MIT Press, 1999.
Gus88
John L. Gustafson.
Reevaluating Amdahl's law.
Commun. ACM, 31(5):532-533, 1988.
Har91
R. J. Harrison.
Portable tools and applications for parallel computers.
Int. J. Quantum Chem., 40:847-863, 1991.
HC01
Cay S. Horstmann and Gary Cornell.
Core Java 2, Volume II: Advanced Features.
Prentice Hall PTR, 5th edition, 2001.
HC02
Cay S. Horstmann and Gary Cornell.
Core Java 2, Volume I: Fundamentals.
Prentice Hall PTR, 6th edition, 2002.
HFRS99
N. Harrison, B. Foote, H. Rohnert, and D. C. Schmidt, editors.
Pattern Languages of Program Design 4.
Addison-Wesley, 1999.
HHS01
William W. Hargrove, Forrest M. Hoffman, and Thomas Sterling.
The do-it-yourself supercomputer.
Scientific American, August 2001.
Hil
Hillside Group.
http://hillside.net.
HLCZ99
Y. C. Hu, Honghui Lu, Al Cox, and W. Zwaenepoel.
OpenMP for networks of SMPs.
In Proceedings of 13th International Parallel Processing Symposium and 10th Symposium on Parallel and Distributed Processing, pages 302-310. IEEE Computer Society, 1999.
Hoa74
C. A. R. Hoare.
Monitors: An operating system structuring concept.
Communications of the ACM, 17(10):549-557, 1974.
HPF
Paul Hudak, John Peterson, and Joseph Fasel.
A Gentle Introduction to Haskell Version 98.
http://www.haskell.org/tutorial.
HPF97
High Performance Fortran Forum: High Performance Fortran Language specification, version 2.0.
http://dacnet.rice.edu/Depts/CRPC/HPFF, 1997.
HPF99
Japan Association for High Performance Fortran: HPF/JA language specification, version 1.0.
http://www.hpfpc.org/japhf/spec/jahpf_e.html, 1999.
HS86
W. Daniel Hillis and Guy L. Steele, Jr.
Data parallel algorithms.
Communications of the ACM, 29(12):1170-1183, 1986.
Hud89
P. Hudak.
The conception, evaluation, and application of functional programming.
ACM Computing Surveys, 21(3):359-411, 1989.
IBM02
The IBM BlueGene/L Team.
An overview of the BlueGene/L supercomputer.
In Proceedings of SC'2002.
http://sc-2002.ort/paperpdfs/pap.pap207.pdf
IEE
IEEE.
The Open Group Base Specifications, Issue 6, IEEE Std 1003.1, 2004 Edition.
http://www.opengroup.org/onlinepubs/009695399/toc.htm
J92
J. JáJá.
An Introduction to Parallel Algorithms.
Addison-Wesley, 1992.
Jag96
R. Jagannathan.
Dataflow models.
In A. Y. H. Zomaya, editor, Parallel and Distributed Computing Handbook, chapter 8. McGraw-Hill, 1996.
Java
Java 2 platform.
java.sun.com.
Javb
Java 2 platform, enterprise edition (j2ee).
java.sun.com/j2ee.
JCS98
G. Judd, M. Clement, and Q. Snell.
DOGMA: Distributed object group management architecture.
Concurrency: Practice and Experience, 1998.
Jef85
David R. Jefferson.
Virtual time.
ACM Transactions on Programming Languages and Systems (TOPLAS), 7(3):404-425, 1985.
JSRa
JSR 133: Java(TM) memory model and thread specification revision.
http://www.jcp.org/en/jsr/detail?id=133.
JSRb
JSR 166: Concurrency utilities.
http://www.jcp.org/en/jsr/detail?id=166.
JSRc
Concurrency JSR-166 interest site.
http://gee.cs.oswego.edu/dl/concurrency-interest/index.html.
KLK+03
Seung Jo Kim, Chang Sung Lee, Jeong Ho Kim, Minsu Joh, and Sangsan Lee.
IPSAP: A high-performance parallel finite element code for large-scale structural analysis based on domain-wise multifrontal technique.
In Proceedings SC'03. IEEE Press, 2003.
Also available at http://www.sc-conference.org/sc2003.
LAM
LAM/MPI parallel computing.
http://www.lam-mpi.org/.
Lam78
Leslie Lamport.
Time, clocks, and the ordering of events in distributed systems.
Communications of the ACM, 21(7):558-565, 1978.
LDSH95
Hans Lischka, Holger Dachsel, Ron Shepard, and Robert J. Harrison.
The parallelization of a general ab initio multireference configuration interaction program: The COLUMBUS program system.
In T. G. Mattson, editor, Parallel Computing in Computational Chemistry, ACS Symposium Series 592, pages 75-83. American Chemical Society, 1995.
Lea
Doug Lea.
Overview of package util.concurrent.
http://gee.cs.oswego.edu/dl/classes/EDU/oswego/cs/dl/util/concurrent/intro.html.
Lea00a
Doug Lea.
Concurrent Programming in Java: Design Principles and Patterns.
Addison-Wesley, second edition, 2000.
Lea00b
Doug Lea.
A Java fork/join framework.
In Java Grande, pages 36-43, 2000.
LK98
Michael Ljungberg and M. A. King, editors.
Monte Carlo Calculations in Nuclear Medicine: Applications in Diagnostic Imaging.
Institute of Physics Publishing, 1998.
Man
Manta: Fast Parallel Java.
http://www.cs.vu.nl/manta.
Mas97
M. Mascagni.
Some methods of parallel pseudorandom number generation.
In R. Schreiber, M. Heath, and A. Ranade, editors, Algorithms for Parallel Processing, pages 277-288. Springer Verlag, New York, Berlin, 1997.
Mat87
F. Mattern.
Algorithms for distributed termination detection.
Distributed Computing, 2(3):161-175, 1987.
Mat94
T. G. Mattson.
The efficiency of Linda for general purpose scientific programming.
Scientific Programming, 3:61-71, 1994.
Mat95
T. G. Mattson, editor.
Parallel Computing in Computational Chemistry, ACS Symposium Series 592.
American Chemical Society, 1995.
Mat96
T. G. Mattson.
Scientific computation.
In A. Zomaya, editor, Parallel and Distributed Computing Handbook. McGraw Hill, 1996.
Mat03
T. G. Mattson.
How good is OpenMP?
Scientific Programming, 11:81-93, 2003.
Mesa
MPI (Message Passing Interface) 2.0 Standard.
http://www.mpi-forum.org/docs/docs.html.
Mesb
Message Passing Interface Forum.
http://www.mpi-forum.org.
Met
Metron, Inc.
SPEEDES (synchronous parallel environment for emulation and discrete-event simulation).
http://www.speedes.com.
MHC+99
A. A. Mirin, R. H. Cohen, B. C. Curtis, W. P. Dannevik, A. M. Dimits, M. A. Duchaineau, D. E. Eliason, D. R. Schikore, S. E. Anderson, D. H. Porter, P. R. Woodward, L. J. Shieh, and S. W. White.
Very high resolution simulation of compressible turbulence on the IBM-SP system.
In Proceedings SC99. IEEE Press, 1999.
Also available at http://www.sc-conference.org/SC1999/.
Min97
S. Mintchev.
Writing programs in JavaMPI, 1997.
Mis86
J. Misra.
Distributed discrete-event simulation.
Computing Surveys, 18(1), 1986.
MPI
MPICH--a portable implementation of MPI.
http://www-unix.mcs.anl.gov/mpi/mpich/.
MPS02
W. Magro, P. Petersen, and S. Shah.
Hyper-threading technology: Impact on compute-intensive workloads.
Intel Technology Journal, 06(01), 2002.
ISSN 1535-766X.
MR95
T. G. Mattson and G. Ravishanker.
Portable molecular dynamics software for parallel computing.
In T. G. Mattson, editor, Parallel Computing in Computational Chemistry, ACS Symposium Series 592, page 133. American Chemical Society, 1995.
MRB97
R. C. Martin, D. Riehle, and F. Buschmann, editors.
Pattern Languages of Program Design 3.
Addison-Wesley, 1997.
MSW96
T. G. Mattson, David S. Scott, and Stephen Wheat.
A TeraFLOP in 1996: The ASCI TeraFLOP supercomputer.
In Proceedings of the International Parallel Processing Symposium, 1996.
NA01
Dimitrios S. Nikolopoulos and Eduard Ayguadé.
A study of implicit data distribution methods for OpenMP using the SPEC benchmarks.
In Rudolf Eigenmann and Michael J. Voss, editors, OpenMP Shared Memory Parallel Programming, volume 2104 of Lecture Notes in Computer Science, pages 115-129. Springer Verlag, 2001.
NBB01
Jeffrey S. Norris, Paul G. Backes, and Eric T. Baumgartner.
PTEP: The parallel telemetry processor.
In IEEE Proceedings (Aerospace Conference), volume 7, pages 7-3339 - 7-3345, 2001.
Also see wits.jpl.nasa.gov/public/pubs/PTEP-IEEEAS01.pdf.
NHK+02
J. Nieplocha, R. Harrison, M. Krishnan, B. Palmer, and V. Tipparaju.
Combining shared and distributed memory models: Evolution and recent advancements of the Global Array Toolkit.
In Proc. POHLL'2002 Workshop of ICS-2002, 2002.
NHL94
J. Nieplocha, R. J. Harrison, and R. J. Littlefield.
Global Arrays: A portable shared memory model for distributed memory computers.
In Proceedings of Supercomputing '94, pages 340-349, 1994.
NHL96
J. Nieplocha, R. J. Harrison, and R. J. Littlefield.
Global Arrays: A nonuniform memory access programming model for high-performance computers.
The Journal of Supercomputing, 10:197-220, 1996.
NM92
Robert H. B. Netzer and Barton P. Miller.
What are race conditions?: Some issues and formalizations.
ACM Lett. Program. Lang. Syst., 1(1):74-88, 1992.
Omn
Omni OpenMP compiler.
http://phase.hpcc.jp/Omni/home.html.
OMP
OpenMP: Simple, portable, scalable SMP programming.
http://www.openmp.org.
OSG03
Ryan M. Olson, Michael W. Schmidt, and Mark S. Gordon.
Enabling the efficient use of SMP clusters: the GAMESS/DDI model.
In Proceedings SC'03. IEEE Press, 2003.
Also available at http://www.sc-conference.org/sc2003.
Pac96
Peter Pacheco.
Parallel Programming with MPI.
Morgan Kaufmann, 1996.
Pat
The Pattern Languages of Programs conference.
http://jerry.cs.uiuc.edu/~plop/.
PH95
S. J. Plimpton and B. A. Hendrickson.
Parallel molecular dynamics algorithms for simulation of molecular systems.
In T. G. Mattson, editor, Parallel Computing in Computational Chemistry, ACS Symposium Series 592, pages 114-132. American Chemical Society, 1995.
PH98
David A. Patterson and John L. Hennessy.
Computer Organization and Design: The Hardware/Software Interface.
Morgan Kaufmann Publishers, second edition, 1998.
PLA
PLAPACK: Parallel linear algebra package.
http://www.cs.utexas.edu/users/plapack.
Pli95
S. J. Plimpton.
Fast parallel algorithms for short-range molecular dynamics.
J. Comp. Phys., 117:1-19, 1995.
PS00
Tom Porter and Galyn Susman.
On site: Creating lifelike characters in Pixar movies.
Communications of the ACM, 43(1):25-29, 2000.
PS04
Bill Pugh and Jaime Spacco.
MPJava: high-performance message passing in Java using java.nio.
In 16th Workshop on Languages and Compilers for Parallel Computing, 2003. Revised papers volume 2958 of Lecture Notes in Computer Science.
Also appeared in the Proceedings of MASPLAS'03. http://www.cs.haverford.edu/masplas/masplas03-01.pdf.
PFTV88
W. H. Press, B. P. Flannery, S. A. Teukolsky, and W. T. Vetterling.
Numerical Recipes in C: The Art of Scientific Computing.
Cambridge University Press, 1988.
Rep99
John H. Reppy.
Concurrent Programming in ML.
Cambridge University Press, 1999.
RHB03
John W. Romein, Jaap Heringa, and Henri E. Bal.
A million-fold speed improvement in genomic repeats detection.
In Proceedings SC'03. IEEE Press, 2003.
http://www.sc-conference.org/sc2003.
RHC+96
J. V. W. Reynders, P. J. Hinker, J. C. Cummings, S. R. Atlas, S. Banerjee, W. F. Humphrey, S. R. Karmesin, K. Keahey, M. Srikant, and M. Tholburn.
POOMA.
In Parallel Programming using C++. The MIT Press, 1996.
RMC+98
R. Radhakrishnan, D. E. Martin, M. Chetlur, D. Madhava Rao, and P. A. Wilsey.
An object-oriented Time Warp simulation kernel.
In D. Caromel, R. R. Oldehoeft, and M. Tholburn, editors, Proceedings of the International Symposium on Computing in Object-Oriented Parallel Environments (ISCOPE'98) LNCS 1505, 1998.
Also available as http://www.ececs.uc.edu/~pa
Sca
The ScaLAPACK project.
http://www.netlib.org/scalapack/.
Sci03
Special issue on OpenMP and its applications.
Scientific Programming, 11(2), 2003.
SER
Java servlet technology.
http://java.sun.com/products/servlet/.
SET
SETI@home: The search for extraterrestrial intelligence.
http://setiathome.ssl.berkeley.edu/.
SHPT00
S. Shah, G. Haab, P. Petersen, and J. Throop.
Flexible control structures for parallelism in OpenMP.
Concurrency: Practice and Experience, 12:1219-1239, 2000.
SHTS01
Mitsuhisa Sato, Motonari Hirano, Yoshio Tanaka, and Satoshi Sekiguchi.
OmniRPC: A grid RPC facility for cluster and global computing in OpenMP.
In Rudolf Eigenmann and Michael J. Voss, editors, OpenMP Shared Memory Parallel Programming, volume 2104 of Lecture Notes in Computer Science, pages 130-136. Springer-Verlag, 2001.
SHGZ99
A. Scherer, Honghui Lu, T. Gross, and W. Zwaenepoel.
Transparent adaptive parallelism on NOWS using OpenMP.
ACM SIGPLAN Notices (ACM Special Interest Group on Programming Languages), 34(8):96-106, August 1999.
SN90
Xian-He Sun and Lionel M. Ni.
Another view on parallel speedup.
In Proceedings of the 1990 conference on Supercomputing, pages 324-333. IEEE Computer Society Press, 1990.
SR98
Daryl A. Swade and James F. Rose.
OPUS: A flexible pipeline data processing environment.
In Proceedings of the AIAA/USU Conference on Small Satellites, September 1998.
See also http://www.stsci.edu/resources/software_hardware/opus/index_html.
SS94
Leon Sterling and Ehud Shapiro.
The Art of Prolog: Advanced Programming Techniques. 2nd Edition
MIT Press, 1994.
SSD+94
Anthony Skjellum, Steven G. Smith, Nathan E. Doss, Alvin P. Leung, and Manfred Morari.
The design and evolution of Zipcode.
Parallel Computing, 20(4):565-596, 1994.
SSGF00
C. P. Sosa, C. Scalmani, R. Gomperts, and M. J. Frisch.
Ab initio quantum chemistry on a ccNUMA architecture using OpenMP III.
Parallel Computing, 26(7-8):843-56, July 2000.
SSOG93
J. Subhlok, J. Stichnoth, D. O'Hallaron, and T. Gross.
Exploiting task and data parallelism on a multicomputer.
In Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, May 1993.
Sun90
V. S. Sunderam.
PVM: A framework for parallel distributed computing.
Concurrency: Practice and Experience, 4(2):315-339, 1990.
Tho95
John Thornley.
Performance of a class of highly-parallel divide-and-conquer algorithms.
Technical report, Caltech, 1995.
http://resolver.caltech.edu/CaltechCSTR:1995.cs-tr-95-10.
Tita
Titanium home page.
http://titanium.cs.berkeley.edu/intro.html.
Top
TOP500 supercomputer sites.
http://www.top500.org.
UPC
Unified Parallel C.
http://upc.gwu.edu.
VCKS96
J. M. Vlissides, J. O. Coplien, N. L. Keith, and D. C. Schmidt, editors.
Pattern Languages of Program Design 2.
Addison-Wesley, 1996.
vdG97
Robert A. van de Geijn.
Using PLAPACK.
MIT Press, 1997.
vdSD03
Aad J. van der Steen and Jack J. Dongarra.
Overview of recent supercomputers.
http://www.top500.org/ORSC/, 2003.
VJKT00
M. Valero, K. Joe, M. Kitsuregawa, and H. Tanaka, editors.
High Performance Computing: Third International Symposium, volume 1940 of Lecture Notes in Computer Science.
Springer-Verlag, 2000.
vRBH+98
Robbert van Renesse, Kenneth P. Birman, Mark Hayden, Alexey Vaysburd, and David Karr.
Building adaptive systems using Ensemble.
Software--Practice and Experience, 28(9):963-979, August 1998.
See also http://www.cs.cornell.edu/Info/Projects/Ensemble.
Wie01
Frederick Wieland.
Practical parallel simulation applied to aviation modeling.
In Proceedings of the Workshop on Parallel and Distributed Simulation, pages 109-116, 2001.
Win95
A. Windemuth.
Advanced algorithms for molecular dynamics simulation: The program PMD.
In T. G. Mattson, editor, Parallel Computing in Computational Chemistry, ACS Symposium Series 592. American Chemical Society, 1995.
WSG95
T. L. Windus, M. W. Schmidt, and M. S. Gordon.
Parallel implementation of the electronic structure code GAMESS.
In T. G. Mattson, editor, Parallel Computing in Computational Chemistry, ACS Symposium Series 592, pages 16-28. American Chemical Society, 1995.
WY95
C.-P. Wen and K. Yelick.
Portable runtime support for asynchronous simulation.
In International Conference on Parallel Processing, August 1995.
X393
Accredited Standards Committee X3.
Parallel extensions for Fortran.
Technical Report X3H5/93-SDI revision M, American National Standards Institute, April 1993.
YWC+95
K. Yelick, C.-P. Wen, S. Chakrabarti, E. Deprit, J. Jones, and A. Krishnamurthy.
Portable parallel irregular applications.
In Workshop on Parallel Symbolic Languages and Systems, October 1995.
To appear in Lecture Notes in Computer Science.
ZJS+02
Hans P. Zima, Kazuki Joe, Mitsuhisa Sato, Yoshiki Seo, and Masaaki Shimasaki, editors.
Proceedings of HPF International Workshop: Experiences and Progress (HiWEP 2002), volume 2327 of Lecture Notes in Computer Science.
Springer-Verlag, 2002.