A generic and adaptive approach for workload distribution in multitier cluster systems with an application to distributed matrix multiplication

  • We present a novel approach for distributing matrix multiplications among GPU-equipped nodes in a cluster system. In this context, we discuss the resulting challenges and possible solutions. Additionally, we state an algorithm which outperforms optimized GPU BLAS libraries for small matrices. Furthermore, we provide a novel theoretical model for distributing algorithms within homogeneous computation systems with multiple hierarchies. Within this model, we develop an algorithm which can find the optimal distribution parameters for each involved subalgorithm. We provide a detailed analysis of the algorithm's space and time complexities and justify its use with a structured evaluation on a small GPU-equipped Beowulf cluster.

Metadata
Author: Uwe Handmann, Thomas Kopinski, Darius Malysiak
Parent Title (English): In 16th IEEE International Symposium on Computational Intelligence and Informatics, Budapest, Hungary
Document Type: Conference Proceeding
Language: English
Year of Completion: 2015
Release Date: 2019/07/08
Institutes: Fachbereich 1 - Institut Informatik
DDC class: 000 Generalities, computer science, information science / 000 Generalities, science
Licence (German): No Creative Commons