Browsing Division of Electrical, Electronics, and Computer Science (EECS) by Advisor "Reddy, Uday Kumar B"
High Performance GPU Tensor Core Code Generation for Matmul using MLIR
State of the art in high-performance deep learning is primarily driven by highly tuned libraries. These libraries are often hand-optimized and tuned by expert programmers using low-level abstractions with significant effort. ...
Optimizing Dense Matrix Computations with PolyMage
Linear algebra computations and other arbitrary affine accesses are ubiquitous in applications from domains like scientific computing, digital signal processing (DSP), and deep neural networks. Libraries such as OpenBLAS, ...