| 2009 | ||
|---|---|---|
| 67 | Mercedes Marqués, Gregorio Quintana-Ortí, Enrique S. Quintana-Ortí, Robert A. van de Geijn: Out-of-Core Computation of the QR Factorization on Multi-core Processors. Euro-Par 2009: 809-820 | |
| 66 | Mercedes Marqués, Gregorio Quintana-Ortí, Enrique S. Quintana-Ortí, Robert A. van de Geijn: Solving "large" dense matrix problems on multi-core processors. IPDPS 2009: 1-8 | |
| 65 | Gregorio Quintana-Ortí, Francisco D. Igual, Enrique S. Quintana-Ortí, Robert A. van de Geijn: Solving dense linear systems on platforms with multiple hardware accelerators. PPOPP 2009: 121-130 | |
| 64 | Gregorio Quintana-Ortí, Enrique S. Quintana-Ortí, Robert A. van de Geijn, Field G. Van Zee, Ernie Chan: Programming matrix algorithms-by-blocks for thread-level parallelism. ACM Trans. Math. Softw. 36(3): (2009) | |
| 2008 | ||
| 63 | Gregorio Quintana-Ortí, Enrique S. Quintana-Ortí, Ernie Chan, Robert A. van de Geijn, Field G. Van Zee: Design of scalable dense linear algebra libraries for multithreaded architectures: the LU factorization. IPDPS 2008: 1-8 | |
| 62 | Gregorio Quintana-Ortí, Enrique S. Quintana-Ortí, Ernie Chan, Robert A. van de Geijn, Field G. Van Zee: Scheduling of QR Factorization Algorithms on SMP and Multi-Core Architectures. PDP 2008: 301-310 | |
| 61 | Ernie Chan, Field G. Van Zee, Paolo Bientinesi, Enrique S. Quintana-Ortí, Gregorio Quintana-Ortí, Robert A. van de Geijn: SuperMatrix: a multithreaded runtime scheduling system for algorithms-by-blocks. PPOPP 2008: 123-132 | |
| 60 | Jeffrey R. Diamond, Behnam Robatmili, Stephen W. Keckler, Robert A. van de Geijn, Kazushige Goto, Doug Burger: High performance dense linear algebra on a spatially distributed processor. PPOPP 2008: 63-72 | |
| 59 | Gregorio Quintana-Ortí, Enrique S. Quintana-Ortí, Alfredo Remón, Robert A. van de Geijn: An Algorithm-by-Blocks for SuperMatrix Band Cholesky Factorization. VECPAR 2008: 228-239 | |
| 58 | Field G. Van Zee, Paolo Bientinesi, Tze Meng Low, Robert A. van de Geijn: Scalable parallelization of FLAME code via the workqueuing model. ACM Trans. Math. Softw. 34(2): (2008) | |
| 57 | Kazushige Goto, Robert A. van de Geijn: Anatomy of high-performance matrix multiplication. ACM Trans. Math. Softw. 34(3): (2008) | |
| 56 | Paolo Bientinesi, Brian C. Gunter, Robert A. van de Geijn: Families of algorithms related to the inversion of a Symmetric Positive Definite matrix. ACM Trans. Math. Softw. 35(1): (2008) | |
| 55 | Kazushige Goto, Robert A. van de Geijn: High-performance implementation of the level-3 BLAS. ACM Trans. Math. Softw. 35(1): (2008) | |
| 54 | Enrique S. Quintana-Ortí, Robert A. van de Geijn: Updating an LU Factorization with Pivoting. ACM Trans. Math. Softw. 35(2): (2008) | |
| 2007 | ||
| 53 | Ernie Chan, Field G. Van Zee, Enrique S. Quintana-Ortí, Gregorio Quintana-Ortí, Robert A. van de Geijn: Satisfying your dependencies with SuperMatrix. CLUSTER 2007: 91-99 | |
| 52 | Robert A. van de Geijn: The science of programming dense linear algebra libraries. CLUSTER 2007 | |
| 51 | Bryan Marker, Field G. Van Zee, Kazushige Goto, Gregorio Quintana-Ortí, Robert A. van de Geijn: Toward Scalable Matrix Multiply on Multithreaded Architectures. Euro-Par 2007: 748-757 | |
| 50 | Ernie Chan, Enrique S. Quintana-Ortí, Gregorio Quintana-Ortí, Robert A. van de Geijn: SuperMatrix out-of-order scheduling of matrix operations for SMP and multi-core architectures. SPAA 2007: 116-125 | |
| 49 | Ernie Chan, Marcel Heimlich, Avi Purkayastha, Robert A. van de Geijn: Collective communication: theory, practice, and experience. Concurrency and Computation: Practice and Experience 19(13): 1749-1783 (2007) | |
| 2006 | ||
| 48 | Ernie Chan, Robert A. van de Geijn, William Gropp, Rajeev Thakur: Collective communication on architectures that support simultaneous communication over multiple links. PPOPP 2006: 2-11 | |
| 47 | Thierry Joffrain, Tze Meng Low, Enrique S. Quintana-Ortí, Robert A. van de Geijn, Field G. Van Zee: Accumulating Householder transformations, revisited. ACM Trans. Math. Softw. 32(2): 169-179 (2006) | |
| 46 | Gregorio Quintana-Ortí, Robert A. van de Geijn: Improving the performance of reduction to Hessenberg form. ACM Trans. Math. Softw. 32(2): 180-194 (2006) | |
| 2005 | ||
| 45 | Tze Meng Low, Robert A. van de Geijn, Field G. Van Zee: Extracting SMP parallelism for dense linear algebra algorithms from high-level specifications. PPOPP 2005: 153-163 | |
| 44 | Paolo Bientinesi, John A. Gunnels, Margaret E. Myers, Enrique S. Quintana-Ortí, Robert A. van de Geijn: The science of deriving dense linear algebra algorithms. ACM Trans. Math. Softw. 31(1): 1-26 (2005) | |
| 43 | Paolo Bientinesi, Enrique S. Quintana-Ortí, Robert A. van de Geijn: Representing linear algebra algorithms in code: the FLAME application program interfaces. ACM Trans. Math. Softw. 31(1): 27-59 (2005) | |
| 42 | Brian C. Gunter, Robert A. van de Geijn: Parallel out-of-core computation and updating of the QR factorization. ACM Trans. Math. Softw. 31(1): 60-78 (2005) | |
| 41 | Paolo Bientinesi, Inderjit S. Dhillon, Robert A. van de Geijn: A Parallel Eigensolver for Dense Symmetric Matrices Based on Multiple Relatively Robust Representations. SIAM J. Scientific Computing 27(1): 43-66 (2005) | |
| 2004 | ||
| 40 | E. W. Chan, M. F. Heimlich, Avi Purkayastha, Robert A. van de Geijn: On optimizing collective communication. CLUSTER 2004: 145-155 | |
| 39 | E. W. Chan, M. F. Heimlich, Avi Purkayastha, Robert A. van de Geijn: Attaining higher performance in collective communication. CLUSTER 2004: 484 | |
| 38 | John A. Gunnels, Fred G. Gustavson, Greg Henry, Robert A. van de Geijn: A Family of High-Performance Matrix Multiplication Algorithms. PARA 2004: 256-265 | |
| 37 | Paolo Bientinesi, John A. Gunnels, Fred G. Gustavson, Greg Henry, Margaret E. Myers, Enrique S. Quintana-Ortí, Robert A. van de Geijn: Rapid Development of High-Performance Linear Algebra Libraries. PARA 2004: 376-384 | |
| 36 | Paolo Bientinesi, Sergey Kolos, Robert A. van de Geijn: Automatic Derivation of Linear Algebra Algorithms with Application to Control Theory. PARA 2004: 385-394 | |
| 35 | Thierry Joffrain, Enrique S. Quintana-Ortí, Robert A. van de Geijn: Rapid Development of High-Performance Out-of-Core Solvers. PARA 2004: 413-422 | |
| 2003 | ||
| 34 | Enrique S. Quintana-Ortí, Robert A. van de Geijn: Formal derivation of algorithms: The triangular Sylvester equation. ACM Trans. Math. Softw. 29(2): 218-243 (2003) | |
| 2002 | ||
| 33 | Thuan D. Cao, John F. Hall, Robert A. van de Geijn: Parallel Cholesky Factorization of a Block Tridiagonal Matrix. ICPP Workshops 2002: 327-335 | |
| 2001 | ||
| 32 | John A. Gunnels, Robert A. van de Geijn, Daniel S. Katz, Enrique S. Quintana-Ortí: Fault-Tolerant High-Performance Matrix Multiplication: Theory and Practice. DSN 2001: 47-56 | |
| 31 | Brian C. Gunter, Wesley C. Reiley, Robert A. van de Geijn: Parallel Out-of-Core Cholesky and QR Factorization with POOCLAPACK. IPDPS 2001: 179 | |
| 30 | John A. Gunnels, Greg Henry, Robert A. van de Geijn: A Family of High-Performance Matrix Multiplication Algorithms. International Conference on Computational Science (1) 2001: 51-60 | |
| 29 | John A. Gunnels, Fred G. Gustavson, Greg Henry, Robert A. van de Geijn: FLAME: Formal Linear Algebra Methods Environment. ACM Trans. Math. Softw. 27(4): 422-455 (2001) | |
| 28 | Enrique S. Quintana-Ortí, Robert A. van de Geijn: Specialized Parallel Algorithms for Solving Lyapunov and Stein Equations. J. Parallel Distrib. Comput. 61(10): 1489-1504 (2001) | |
| 2000 | ||
| 27 | John A. Gunnels, Robert A. van de Geijn: Formal Methods for High-Performance Linear Algebra Libraries. The Architecture of Scientific Software 2000: 193-210 | |
| 1999 | ||
| 26 | James Overfelt, Yuhong Fu, Gregory J. Rodin, Robert A. van de Geijn: Application Driven Fast Summation Methods. PPSC 1999 | |
| 25 | Enrique S. Quintana-Ortí, Robert A. van de Geijn: Fast Parallel Kernels for Selected Problems in Control Theory. PPSC 1999 | |
| 1998 | ||
| 24 | Gregory S. Baker, John A. Gunnels, Greg Morrow, Béatrice Rivière, Robert A. van de Geijn: PLAPACK: High Performance through High-Level Abstraction. ICPP 1998: 414- | |
| 23 | John A. Gunnels, Calvin Lin, Greg Morrow, Robert A. van de Geijn: A Flexible Class of Parallel Matrix Multiplication Algorithms. IPPS/SPDP 1998: 110-116 | |
| 1997 | ||
| 22 | Phillip Alpatov, Gregory S. Baker, Carter Edwards, John A. Gunnels, Greg Morrow, James Overfelt, Robert A. van de Geijn, Yuan-Jye J. Wu: PLAPACK: Parallel Linear Algebra Package. PPSC 1997 | |
| 21 | Robert A. van de Geijn, Jerrell Watts: SUMMA: scalable universal matrix multiplication algorithm. Concurrency - Practice and Experience 9(4): 255-274 (1997) | |
| 20 | Domingo Giménez, Vicente Hernández, Robert A. van de Geijn, Antonio M. Vidal: A block Jacobi method on a mesh of processors. Concurrency - Practice and Experience 9(5): 391-411 (1997) | |
| 19 | Almadena Yu. Chtchelkanova, John A. Gunnels, Greg Morrow, James Overfelt, Robert A. van de Geijn: Parallel implementation of BLAS: general techniques for Level 3 BLAS. Concurrency - Practice and Experience 9(9): 837-857 (1997) | |
| 1996 | ||
| 18 | Domingo Giménez, Robert A. van de Geijn, Vicente Hernández, Antonio M. Vidal: Exploiting the Symmetry on the Jacobi Method on a Mesh of Processors. PDP 1996: 377-384 | |
| 17 | Michael Barnett, David G. Payne, Robert A. van de Geijn, Jerrell Watts: Broadcasting on Meshes with Wormhole Routing. J. Parallel Distrib. Comput. 35(2): 111-122 (1996) | |
| 16 | Brian Grayson, Robert A. van de Geijn: A High Performance Parallel Strassen Implementation. Parallel Processing Letters 6(1): 3-12 (1996) | |
| 1995 | ||
| 15 | Kenneth Klimkowski, Robert A. van de Geijn: Anatomy of a Parallel Out-of-Core Dense Linear Solver. ICPP (3) 1995: 29-33 | |
| 14 | Michael Barnett, Richard J. Littlefield, David G. Payne, Robert A. van de Geijn: Global Combine Algorithms for 2-D Meshes with Wormhole Routing. J. Parallel Distrib. Comput. 24(2): 191-201 (1995) | |
| 13 | Jerrell Watts, Robert A. van de Geijn: A Pipelined Broadcast for Multidimensional Meshes. Parallel Processing Letters 5: 281-292 (1995) | |
| 1994 | ||
| 12 | Michael Barnett, Lance Shuler, Satya Gupta, David G. Payne, Robert A. van de Geijn, Jerrell Watts: Building a high-performance collective communication library. SC 1994: 107-116 | |
| 11 | Edward J. Barragy, Graham F. Carey, Robert A. van de Geijn: Performance and Scalability of Finite Element Analysis for Distributed Parallel Computation. J. Parallel Distrib. Comput. 21(2): 202-212 (1994) | |
| 10 | Robert A. van de Geijn: On Global Combine Operations. J. Parallel Distrib. Comput. 22(2): 324-328 (1994) | |
| 9 | Jack Dongarra, Robert A. van de Geijn, David W. Walker: Scalability Issues Affecting the Design of a Dense Linear Algebra Library. J. Parallel Distrib. Comput. 22(3): 523-537 (1994) | |
| 1993 | ||
| 8 | Michael Barnett, Richard J. Littlefield, David G. Payne, Robert A. van de Geijn: Global Combine on Mesh Architectures with Wormhole Routing. IPPS 1993: 156-162 | |
| 7 | James Demmel, Jack Dongarra, Robert A. van de Geijn, David W. Walker: LAPACK for Distributed Memory Architectures: The Next Generation. PPSC 1993: 323-329 | |
| 6 | Jack Dongarra, Robert A. van de Geijn, R. Clinton Whaley: Two Dimensional Basic Linear Algebra Communication Subprograms. PPSC 1993: 347-352 | |
| 5 | Michael Barnett, Richard J. Littlefield, David G. Payne, Robert A. van de Geijn: Efficient Communication Primitives on Mesh Architectures with Hardware Routing. PPSC 1993: 943-948 | |
| 4 | John G. Lewis, Robert A. van de Geijn: Distributed memory matrix-vector multiplication and conjugate gradient algorithms. SC 1993: 484-492 | |
| 1992 | ||
| 3 | Jack Dongarra, Robert A. van de Geijn: Reduction to condensed form for the eigenvalue problem on distributed memory architectures. Parallel Computing 18(9): 973-982 (1992) | |
| 1991 | ||
| 2 | Ed Anderson, Annamaria Benzoni, Jack Dongarra, Steve Moulton, Susan Ostrouchov, Bernard Tourancheau, Robert A. van de Geijn: LAPACK for Distributed Memory Architectures: Progress Report. PPSC 1991: 625-630 | |
| 1990 | ||
| 1 | Duncan G. Hudson III, Robert A. van de Geijn: An asymptotically 100% efficient parallel implementation of the nonsymmetric QR algorithm. SPDP 1990: 243-249 | |