CHARGE: Accelerating GNN Training via CPU Sampling in Heterogeneous CPU–GPU Environments
Abstract
Graph Neural Networks (GNNs) have demonstrated exceptional performance across a wide range of applications, driving their widespread adoption. Current frameworks employ CPU and GPU resources, either in isolation or heterogeneously, to train GNNs, incorporating mini-batching and sampling techniques to mitigate the scalability challenges posed by limited GPU memory. Sample-based GNN training comprises three stages: Sampling, Extraction, and Training. Existing systems orchestrate these stages across the CPU and GPU in various ways, but our extensive experiments reveal that not every stage is equally suited to both processors; notably, CPU sampling can outperform GPU sampling for certain samplers. Moreover, most frameworks cannot adapt to different samplers, datasets, and hardware configurations.
In this thesis, we propose CHARGE, a system that leverages competitive CPU sampling to accelerate end-to-end GNN training. An intelligent controller assigns each stage (Sampling, Extraction, and Training) to the most appropriate processor, CPU or GPU, regardless of the sampler, dataset, batch size, model, or underlying hardware. Built atop the DGL framework, CHARGE retains ease of programmability while delivering substantial performance improvements over state-of-the-art systems across multiple samplers, datasets, and models.