[Shorts-3] GPU memory management -> Automatically select the GPU with the most unused RAM
When your team shares a pool of GPUs, the usual practice is to run nvidia-smi and manually assign a free GPU (one with the most unused RAM) to your program. Instead, you can include the following snippet to do the selection automatically. First install the NVIDIA management library bindings:
pip install pynvml
import pynvml

pynvml.nvmlInit()

# Count the number of GPU devices
num_devices = pynvml.nvmlDeviceGetCount()

# List to store the free GPU RAM (in GB) of each device
storage_details = []
for dvcidx in range(num_devices):
    # Get a handle to the ith GPU
    h = pynvml.nvmlDeviceGetHandleByIndex(dvcidx)
    # Query its memory info
    info = pynvml.nvmlDeviceGetMemoryInfo(h)
    # Convert free memory from bytes to GB
    free = info.free / (1024 * 1024 * 1024)
    storage_details.append(free)

pynvml.nvmlShutdown()

# Sort the GPU indices in descending order of unused GPU memory
storage_details_indices = sorted(range(len(storage_details)),
                                 key=lambda k: storage_details[k],
                                 reverse=True)
Include this snippet in your modelling script and pick GPU devices from the variable storage_details_indices, which lists the GPU indices from most to least free memory.
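For example, here is a minimal sketch of how you might use the result, assuming a PyTorch modelling script (MyModel is a placeholder for your own model, not part of the snippet above):

import torch

# Pick the GPU with the most free memory
best_gpu = storage_details_indices[0]
device = torch.device(f"cuda:{best_gpu}")

# Move your model (and later your batches) to that device
model = MyModel().to(device)  # MyModel is a hypothetical model class

Alternatively, you can set the CUDA_VISIBLE_DEVICES environment variable to str(best_gpu) before any CUDA initialization, so that your framework only sees the selected GPU.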