Lightning Talk: GPU.x – Dynamic GPU Fraction and Sharing on ML Mainstream Framework – Tiejun Chen

December 21, 2023

Lightning Talk: GPU.x – Dynamic GPU Fraction and Sharing on ML Mainstream Framework – Tiejun Chen, VMware

Organizations are investing heavily in bringing AI accelerators into their data centers or using them on the public cloud, yet they continue to struggle to manage these critical resources cost-effectively and efficiently. Some existing approaches address this problem, but they are heavyweight and inflexible. In this talk, we review whether and how we can address the challenges of expensive and limited machine learning compute resources such as GPUs, and identify solutions for GPU fractional optimization with our technical PoC, GPU.x, which uses a transparent backend Python hook inside upstream ML frameworks. It is lightweight, easy to use, and flexible, and requires no code changes to your AI applications.
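The abstract does not describe how GPU.x is implemented, so the following is only a minimal sketch of the general idea of a transparent Python hook that enforces a GPU fraction. Everything in it is an assumption made for illustration: `fake_framework` stands in for an ML framework's allocator, and `install_fraction_hook` and `GpuQuotaExceeded` are hypothetical names, not part of GPU.x or of any real framework.

```python
import functools
import types

# Hypothetical stand-in for an ML framework's allocator module (not a real API).
fake_framework = types.SimpleNamespace()


def _allocate(nbytes):
    """Pretend to allocate nbytes of GPU memory and return a handle."""
    return {"nbytes": nbytes}


fake_framework.allocate = _allocate


class GpuQuotaExceeded(RuntimeError):
    """Raised when an allocation would exceed the configured GPU fraction."""


def install_fraction_hook(module, func_name, total_bytes, fraction):
    """Wrap module.func_name so cumulative allocations stay within
    fraction * total_bytes. Returns the original function so it can be restored."""
    original = getattr(module, func_name)
    limit = int(total_bytes * fraction)
    used = 0

    @functools.wraps(original)
    def hooked(nbytes, *args, **kwargs):
        nonlocal used
        if used + nbytes > limit:
            raise GpuQuotaExceeded(
                f"requested {nbytes} bytes, {limit - used} bytes left in quota"
            )
        result = original(nbytes, *args, **kwargs)
        used += nbytes  # count only allocations that succeeded
        return result

    setattr(module, func_name, hooked)
    return original


# The hook is installed once, outside the application code.
install_fraction_hook(fake_framework, "allocate", total_bytes=1000, fraction=0.25)

# Application code is unchanged: it still calls fake_framework.allocate(...).
handle = fake_framework.allocate(200)
```

Because the wrapper replaces the attribute on the module itself, callers keep using the same name and need no changes, which is the property the abstract emphasizes. A real hook would target an actual framework entry point; the abstract does not state which ones GPU.x uses.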


by The Linux Foundation
