Publications of Torsten Hoefler
Grzegorz Kwasniewski and Marko Kabić and Maciej Besta and Joost VandeVondele and Raffaele Solcà and Torsten Hoefler:

 Red-Blue Pebbling Revisited: Near Optimal Parallel Matrix-Matrix Multiplication

(In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC19), Nov. 2019)
Best Paper Finalist, SC19 Best Student Paper (1/87)


Abstract

We propose COSMA: a parallel matrix-matrix multiplication (MMM) algorithm that is near communication-optimal for all combinations of matrix dimensions, processor counts, and memory sizes. The key idea behind COSMA is to derive an optimal (up to a factor of 0.03% for 10MB of fast memory) sequential schedule and then parallelize it, preserving I/O optimality. To achieve this, we use the red-blue pebble game to precisely model MMM dependencies and derive constructive and tight sequential and parallel I/O lower bound proofs. Compared to 2D or 3D algorithms, which fix the processor decomposition upfront and then map it to the matrix dimensions, COSMA reduces communication volume by up to 3 times. COSMA outperforms the established ScaLAPACK, CARMA, and CTF algorithms in all scenarios, by up to 12.8x (2.2x on average), achieving up to 88% of Piz Daint’s peak performance. Our work does not require any hand tuning and is maintained as an open source implementation.
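
To make the notion of I/O cost in a sequential schedule concrete, the following is a minimal sketch that counts words moved between slow and fast memory for a simple tiled multiplication. It is not COSMA's schedule or implementation. It assumes a two-level memory model in which an a × b tile of C stays in fast memory while one column segment of A and one row segment of B are streamed in per step. The function names (`tiled_mmm_io`, `fast_memory_needed`, `largest_square_tile`) are hypothetical and chosen only for illustration.

```python
import math


def tiled_mmm_io(m, n, k, a, b):
    """Count (loads, stores) in words for C = A @ B with A of size m x k
    and B of size k x n, under an assumed outer-product tiled schedule:
    each a x b tile of C is held in fast memory while, for every one of
    the k steps, a column segment of A and a row segment of B are loaded."""
    loads = 0
    stores = 0
    for i0 in range(0, m, a):
        ti = min(a, m - i0)          # tile height (edge tiles may be smaller)
        for j0 in range(0, n, b):
            tj = min(b, n - j0)      # tile width
            loads += k * (ti + tj)   # ti words of A and tj words of B per step
            stores += ti * tj        # the finished C tile is written back once
    return loads, stores


def fast_memory_needed(a, b):
    """Words of fast memory this sketch assumes: the C tile plus one
    column segment of A and one row segment of B."""
    return a * b + a + b


def largest_square_tile(S):
    """Largest a such that an a x a tile fits into S words of fast memory
    under the assumption made in fast_memory_needed."""
    a = math.isqrt(S)
    while a > 0 and fast_memory_needed(a, a) > S:
        a -= 1
    return a


if __name__ == "__main__":
    m = n = k = 512
    S = 4096                          # assumed fast memory size in words
    a = largest_square_tile(S)
    print("untiled (1 x 1 tiles):", tiled_mmm_io(m, n, k, 1, 1))
    print(f"square tiles ({a} x {a}):", tiled_mmm_io(m, n, k, a, a))
```

When the tile sizes divide m and n evenly, this sketch loads mnk(1/a + 1/b) words, so with a = b close to √S the loads approach roughly 2mnk/√S, plus mn stores. Growing the C tile until it fills fast memory is what drives the communication volume down in this simplified model.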

Documents

Publisher URL: http://doi.acm.org/10.1145/3295500.3356181
 

BibTeX

@inproceedings{,
  author={Grzegorz Kwasniewski and Marko Kabić and Maciej Besta and Joost VandeVondele and Raffaele Solcà and Torsten Hoefler},
  title={{Red-Blue Pebbling Revisited: Near Optimal Parallel Matrix-Matrix Multiplication}},
  year={2019},
  month={Nov.},
  booktitle={Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC19)},
  source={http://www.unixer.de/~htor/publications/},
}


© Torsten Hoefler