Libraries

SLATE

Next-generation software library for dense linear algebra operations designed to replace ScaLAPACK for modern HPC systems.

Dense Linear Algebra, Eigenvalue Problems, High-Performance Computing, Hermitian/Symmetric, Quasi-Hermitian/Symmetric, GPU Acceleration, Distributed Memory

Key Features

  • Full coverage of LAPACK and ScaLAPACK functionality
  • Parallel BLAS operations with GPU acceleration
  • Dense linear system solvers
  • Least squares solvers
  • Singular value and eigenvalue solvers
  • Support for distributed-memory systems
  • Built on MPI and OpenMP standards
  • Integration with vendor libraries (MKL, cuBLAS, rocBLAS, etc.)
  • Modern C++ design with task-based parallelism
  • Optimized for both GPU-accelerated and multi-core systems
Language

C++

License

BSD-3

ChASE

A modern C++ implementation of the Chebyshev Accelerated Subspace Eigenvalue Solver for computing selected eigenpairs of large Hermitian and quasi-Hermitian matrices.

Dense Linear Algebra, Eigenvalue Problems, High-Performance Computing, Hermitian/Symmetric, Quasi-Hermitian/Symmetric, GPU Acceleration, Subspace Iteration, Distributed Memory

Key Features

  • Optimized for dense Hermitian eigenproblems
  • Optimized for dense quasi-Hermitian eigenproblems (Bethe–Salpeter equations)
  • Highly efficient parallel implementation
  • GPU acceleration support
  • Subspace iteration
  • Excellent scalability for large matrices
  • Distributed memory support
  • Multi-GPU support
Language

C++

License

BSD-3
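
The Chebyshev acceleration named in the solver's title can be illustrated on scalars. A Chebyshev polynomial of degree m, with the interval [a, b] mapped onto [-1, 1], stays bounded by 1 in magnitude inside that interval and grows rapidly outside it. Applying the same polynomial of a matrix to a block of vectors therefore amplifies the components along eigenvectors whose eigenvalues lie outside [a, b]. The sketch below is plain Python and does not use ChASE's API; the function name `chebyshev_filter_gain` and the interval values are hypothetical, chosen only for illustration.

```python
def chebyshev_filter_gain(lam, a, b, degree):
    """Evaluate the Chebyshev polynomial T_degree at lam after the
    affine map that sends the interval [a, b] onto [-1, 1]."""
    center = (a + b) / 2.0
    half_width = (b - a) / 2.0
    t = (lam - center) / half_width
    prev, curr = 1.0, t  # T_0(t) and T_1(t)
    if degree == 0:
        return prev
    # Three-term recurrence: T_{k+1}(t) = 2 t T_k(t) - T_{k-1}(t)
    for _ in range(degree - 1):
        prev, curr = curr, 2.0 * t * curr - prev
    return curr


# Assumption for this example: unwanted eigenvalues lie in [1, 10],
# wanted eigenvalues lie below 1.
for lam in (0.0, 0.5, 1.0, 5.0, 10.0):
    gain = abs(chebyshev_filter_gain(lam, 1.0, 10.0, degree=10))
    print(f"lambda = {lam:5.1f}  |filter| = {gain:.3e}")
```

Values of lambda inside [1, 10] produce a gain of at most 1, while values below 1 are amplified more strongly the further they sit from the interval.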

SLEPc

A PETSc-based toolkit for the (partial) solution of various types of eigenvalue problems, focusing on large-scale sparse or matrix-free problems with iterative algorithms.

Sparse Linear Algebra, Eigenvalue Problems, Singular Value Problems, High-Performance Computing, Hermitian/Symmetric, Non-Hermitian/Symmetric, Quasi-Hermitian/Symmetric, GPU Acceleration, Distributed Memory

Key Features

  • Covers both standard and generalized eigenproblems, either Hermitian or non-Hermitian
  • Support for structured eigenproblems (Bethe–Salpeter and others)
  • Easy selection of available solvers: Krylov-Schur, Jacobi-Davidson, LOBPCG, contour integral, etc.
  • Built-in support for shift-and-invert spectral transformation, as well as polynomial filters
  • (Partial) solution of several types of singular value problems: SVD, GSVD, HSVD
  • Polynomial eigenvalue problems (either quadratic or higher degree) with or without (implicit) linearization
  • Support for general nonlinear eigenvalue problems involving almost any nonlinear function (including rational, square root, exponential)
  • Basic functionality for computing the action of a matrix function on a vector
  • Tight integration with PETSc, leveraging all linear system solvers and basic infrastructure
  • MPI and GPU parallelism
  • Real or complex arithmetic, with single, double, or quad precision
Language

C

License

BSD-2
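
The shift-and-invert spectral transformation listed above rests on a simple mapping: if A has an eigenvalue λ, then (A − σI)⁻¹ has the eigenvalue θ = 1/(λ − σ) with the same eigenvector. Eigenvalues of A closest to the shift σ become the largest in magnitude, which is where iterative eigensolvers typically converge first. The sketch below shows only this scalar mapping in plain Python; it does not call SLEPc, and the function names and the sample spectrum are hypothetical.

```python
def shift_and_invert(eigenvalues, sigma):
    """Map each eigenvalue lam of A to theta = 1 / (lam - sigma),
    the corresponding eigenvalue of inv(A - sigma * I)."""
    return [1.0 / (lam - sigma) for lam in eigenvalues]


def recover(theta, sigma):
    """Invert the mapping: lam = sigma + 1 / theta."""
    return sigma + 1.0 / theta


spectrum = [-3.0, 0.5, 2.1, 2.6, 9.0]  # made-up eigenvalues of A
sigma = 2.0                            # made-up target shift

thetas = shift_and_invert(spectrum, sigma)
dominant = max(thetas, key=abs)        # what an iterative solver finds first
print("eigenvalue of A nearest the shift:", recover(dominant, sigma))
```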

DLA-Future

A distributed linear algebra library implemented using C++ std::execution. It provides an asynchronous C++ interface, a synchronous C interface, and a synchronous ScaLAPACK-like C interface.

Dense Linear Algebra, Eigenvalue Problems, High-Performance Computing, Hermitian/Symmetric, GPU Acceleration, Distributed Memory

Key Features

  • Dense Symmetric/Hermitian eigenproblems
  • Dense Symmetric/Hermitian generalized eigenproblems
  • Cholesky factorization and inverse of positive definite matrices
  • Highly efficient parallel implementation
  • Integration with vendor libraries (MKL, cuBLAS, rocBLAS, etc.)
  • Modern C++ design with C++ standard task-based parallelism (std::execution)
  • Multi-core systems support
  • NVIDIA GPU acceleration support
  • AMD GPU acceleration support
  • Multi-node and multi-GPU support through MPI
  • Real or complex arithmetic, with single or double precision
  • Installable with Spack
Language

C++

License

BSD-3

DLA-Future-Fortran

Fortran wrappers for DLA-Future.

Dense Linear Algebra, Eigenvalue Problems, High-Performance Computing, Hermitian/Symmetric, GPU Acceleration, Distributed Memory

Key Features

  • Dense Symmetric/Hermitian eigenproblems
  • Dense Symmetric/Hermitian generalized eigenproblems
  • Cholesky factorization and inverse of positive definite matrices
  • Highly efficient parallel implementation
  • Integration with vendor libraries (MKL, cuBLAS, rocBLAS, etc.)
  • Modern C++ design with C++ standard task-based parallelism (std::execution)
  • Multi-core systems support
  • NVIDIA GPU acceleration support
  • AMD GPU acceleration support
  • Multi-node and multi-GPU support through MPI
  • Real or complex arithmetic, with single or double precision
  • Installable with Spack
Language

Fortran

License

BSD-3

COSMA

COSMA is a parallel, high-performance, GPU-accelerated, communication-optimal matrix-matrix multiplication library.

Dense Linear Algebra, High-Performance Computing, GPU Acceleration, Distributed Memory

Key Features

  • Dense matrix-matrix multiplication
  • Communication-optimal implementation
  • Highly efficient parallel implementation
  • Integration with vendor libraries (MKL, cuBLAS, rocBLAS, etc.)
  • Multi-core systems support
  • NVIDIA GPU acceleration support
  • AMD GPU acceleration support
  • Multi-node and multi-GPU support through MPI
  • Real or complex arithmetic, with single or double precision
  • Installable with Spack
Language

C++

License

BSD-3

ELPA

Highly efficient and highly scalable direct eigensolvers for symmetric (Hermitian) matrices.

Dense Linear Algebra, Eigenvalue Problems, High-Performance Computing, Hermitian/Symmetric, Quasi-Hermitian/Symmetric, GPU Acceleration, Distributed Memory

Key Features

  • Dense Hermitian standard and generalized eigenproblems
  • Dense quasi-Hermitian eigenproblems (skew-symmetric and related Bethe–Salpeter)
  • 1-stage and 2-stage direct eigensolver algorithms; the 2-stage solver is especially efficient when only part of the eigenspectrum is needed
  • Distributed dense matrix-matrix multiplication
  • Support for NVIDIA, AMD, and Intel GPUs
  • Demonstrated pre-exascale runs: full eigenspectrum of a 3,200,000 × 3,200,000 real double-precision matrix on LUMI
Language

Fortran

License

LGPL-3

NTPoly

NTPoly is a massively parallel library for computing the functions of sparse, symmetric matrices based on polynomial expansions.

High-Performance Computing, Distributed Memory, Sparse Linear Algebra, Hermitian/Symmetric

Key Features

  • Sparse matrix-matrix multiplication
  • Highly efficient parallel implementation
  • Multi-core systems support
  • Real or complex arithmetic
  • General polynomials (standard, Chebyshev, Hermite)
  • Transcendental functions (trigonometric, exponential, logarithm)
  • Matrix roots and inverses
  • Density matrix purification, sign function, polar decomposition
Language

Fortran, C++, Python

License

MIT
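
Density matrix purification is a good example of computing a matrix function from polynomial expansions, since it needs nothing but matrix-matrix multiplication. One classic scheme is the McWeeny iteration P ← 3P² − 2P³, which drives eigenvalues below 1/2 to 0 and eigenvalues above 1/2 to 1, so a symmetric starting matrix converges to a projector. The sketch below uses small dense lists in plain Python purely for illustration; it is not NTPoly's API, it is not claimed to be the scheme NTPoly uses internally, and the helper names and starting matrix are hypothetical.

```python
def matmul(a, b):
    """Dense product of two matrices stored as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def mcweeny_purify(p, iterations=20):
    """Repeat P <- 3 P^2 - 2 P^3 a fixed number of times."""
    n = len(p)
    for _ in range(iterations):
        p2 = matmul(p, p)
        p3 = matmul(p2, p)
        p = [[3.0 * p2[i][j] - 2.0 * p3[i][j] for j in range(n)]
             for i in range(n)]
    return p


# Made-up symmetric start with eigenvalues on either side of 1/2
# (approximately 0.72 and 0.28).
p0 = [[0.6, 0.2],
      [0.2, 0.4]]

p = mcweeny_purify(p0)
trace = sum(p[i][i] for i in range(len(p)))
print("trace of purified matrix:", trace)  # number of eigenvalues pushed to 1
```

In a sparse setting the same iteration is attractive because each step costs only two matrix products, and small entries can be dropped to preserve sparsity.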

Chameleon

Dense linear algebra subroutines for heterogeneous and distributed architectures.

Dense Linear Algebra, High-Performance Computing, GPU Acceleration, Distributed Memory, Hermitian/Symmetric

Key Features

  • Dense matrix multiplication (GEMM, SYMM, TRMM)
  • Dense matrix factorization and linear system solve (GETR, POTR, GEQR, GELQ, GELS)
  • Integration with vendor libraries (Blis/Flame, OpenBLAS, MKL, cuBLAS, rocBLAS, etc.)
  • Multi-core systems support
  • NVIDIA GPU acceleration support
  • AMD GPU acceleration support
  • Multi-node and multi-GPU support through MPI
  • Real or complex arithmetic, with single or double precision
  • Installable with CMake, GNU Guix, Homebrew, Spack
Language

C

License

CeCILL-C

Library Category Distribution

Overview of library categorizations.

  • Dense Linear Algebra: 7 libraries
  • Sparse Linear Algebra: 2 libraries
  • Eigenvalue Problems: 6 libraries
  • Singular Value Problems: 1 library
  • High-Performance Computing: 9 libraries
  • Hermitian/Symmetric: 8 libraries
  • Non-Hermitian/Symmetric: 1 library
  • Quasi-Hermitian/Symmetric: 4 libraries
  • GPU Acceleration: 8 libraries
  • Distributed Memory: 9 libraries
  • Subspace Iteration: 1 library
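
The counts in this overview can be reproduced by tallying the category tags listed under each library entry on this page. The sketch below transcribes those tags into a dictionary and counts them with the standard library; the variable names are hypothetical, and the short constants are only abbreviations for the tag labels.

```python
from collections import Counter

DENSE, SPARSE = "Dense Linear Algebra", "Sparse Linear Algebra"
EIG, SVD = "Eigenvalue Problems", "Singular Value Problems"
HPC, GPU, DIST = "High-Performance Computing", "GPU Acceleration", "Distributed Memory"
HERM, NONHERM = "Hermitian/Symmetric", "Non-Hermitian/Symmetric"
QUASI, SUBSPACE = "Quasi-Hermitian/Symmetric", "Subspace Iteration"

# Tags transcribed from the library entries on this page.
LIBRARY_TAGS = {
    "SLATE": [DENSE, EIG, HPC, HERM, QUASI, GPU, DIST],
    "ChASE": [DENSE, EIG, HPC, HERM, QUASI, GPU, SUBSPACE, DIST],
    "SLEPc": [SPARSE, EIG, SVD, HPC, HERM, NONHERM, QUASI, GPU, DIST],
    "DLA-Future": [DENSE, EIG, HPC, HERM, GPU, DIST],
    "DLA-Future-Fortran": [DENSE, EIG, HPC, HERM, GPU, DIST],
    "COSMA": [DENSE, HPC, GPU, DIST],
    "ELPA": [DENSE, EIG, HPC, HERM, QUASI, GPU, DIST],
    "NTPoly": [HPC, DIST, SPARSE, HERM],
    "Chameleon": [DENSE, HPC, GPU, DIST, HERM],
}

# Count how many libraries carry each tag.
counts = Counter(tag for tags in LIBRARY_TAGS.values() for tag in tags)

for tag, n in sorted(counts.items()):
    noun = "library" if n == 1 else "libraries"
    print(f"{tag}: {n} {noun}")
```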