DB-SpGEMM: A Massively Distributed Block-Sparse Matrix-Matrix Multiplication for Linear-Scaling DFT Calculations


Linear-scaling O(N) density functional theory (DFT) is a significant advance in computational materials science, especially for simulations of large systems where traditional cubic-scaling methods become computationally prohibitive. The core operation in O(N) methods is sparse general matrix-matrix multiplication (SpGEMM), which is also the major performance bottleneck. Making SpGEMM efficient requires exploiting the inherent sparsity pattern of the matrices involved. Targeting block-sparse matrices with moderate block sizes and regular block shapes, we have developed a distributed block-sparse matrix-matrix multiplication (DB-SpGEMM) algorithm for large-scale DFT calculations. With optimizations in distributed matrix storage, computational task decomposition, asynchronous task scheduling, and load balancing, we have implemented a linear-scaling method based on this algorithm within the discontinuous Galerkin density functional theory (DGDFT) framework. On the new Sunway supercomputer, our approach achieves an 8x to 10x speedup over the original version on monolayer phosphorene systems and demonstrates superior scalability.
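
To make the block-sparse product concrete, the following is a minimal serial sketch in Python. It is not the DB-SpGEMM implementation: the storage layout (a dictionary mapping block coordinates to small dense blocks) and the names `dense_matmul`, `add_into`, and `block_spgemm` are assumptions chosen for illustration, and it leaves out distribution, asynchronous scheduling, and load balancing entirely. It shows only the property that such algorithms exploit: work is done only for pairs of nonzero blocks A(i,k) and B(k,j) that share the inner block index k.

```python
from collections import defaultdict


def dense_matmul(a, b):
    """Multiply two small dense blocks stored as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def add_into(acc, block):
    """Accumulate `block` into `acc` in place (both have the same shape)."""
    for r, row in enumerate(block):
        for c, value in enumerate(row):
            acc[r][c] += value


def block_spgemm(a_blocks, b_blocks):
    """Multiply block-sparse matrices given as {(block_row, block_col): block}.

    Only nonzero blocks are stored, so absent keys are implicit zero blocks.
    """
    # Index B by block row so every A(i,k) finds its partners B(k,j) directly.
    b_by_row = defaultdict(list)
    for (k, j), b_blk in b_blocks.items():
        b_by_row[k].append((j, b_blk))

    c_blocks = {}
    for (i, k), a_blk in a_blocks.items():
        for j, b_blk in b_by_row.get(k, []):
            product = dense_matmul(a_blk, b_blk)
            if (i, j) in c_blocks:
                add_into(c_blocks[(i, j)], product)
            else:
                c_blocks[(i, j)] = product
    return c_blocks


# Hypothetical 2x2 grid of 2x2 blocks; block (1, 0) of A and (0, 1) of B are zero.
a_blocks = {(0, 0): [[1, 2], [3, 4]],
            (0, 1): [[1, 0], [0, 1]],
            (1, 1): [[2, 0], [0, 2]]}
b_blocks = {(0, 0): [[1, 0], [0, 1]],
            (1, 0): [[1, 1], [1, 1]],
            (1, 1): [[5, 6], [7, 8]]}
c_blocks = block_spgemm(a_blocks, b_blocks)
```

Each (A(i,k), B(k,j)) pair in the inner loop is an independent small dense product followed by an accumulation into C(i,j). In a distributed setting these pairs are the natural unit of work to decompose and schedule, although how DB-SpGEMM actually partitions and balances them is not specified here.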