Focus on computer systems and parallel computing research for advancing machine learning, particularly large language model (LLM) systems. Topics include GPU data compression, KV cache and memory management, profiling and parallel scaling of LLM workloads, kernel optimizations, optimizers, parallel training and inference, and hardware-software co-design.