[Shorts-3] GPU memory management -> Automatically select the GPU with the most unused RAM
When your team shares a pool of GPUs, the usual practice is to run nvidia-smi and manually assign a free GPU (one with the most unused RAM) to your program. Instead, you can include the following snippet to do the selection automatically. First install the NVIDIA management library bindings:
pip install pynvml
import pynvml

pynvml.nvmlInit()

# Count the number of GPU devices
num_devices = pynvml.nvmlDeviceGetCount()

# List to store the free GPU RAM (in GB) of each device
storage_details = []
for dvcidx in range(num_devices):
    # Get a handle to the ith GPU
    h = pynvml.nvmlDeviceGetHandleByIndex(dvcidx)
    # Query its memory info
    info = pynvml.nvmlDeviceGetMemoryInfo(h)
    # Convert free memory from bytes to GB
    free = info.free / (1024 * 1024 * 1024)
    storage_details.append(free)

pynvml.nvmlShutdown()

# Sort the GPU indices in descending order of unused GPU memory
storage_details_indices = sorted(range(len(storage_details)),
                                 key=lambda k: storage_details[k],
                                 reverse=True)
Include this snippet in your modelling script and pick GPU devices from the variable storage_details_indices, which lists the GPU indices from most to least free memory.
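For example, here is a minimal sketch of how you might use the result, assuming a PyTorch modelling script (MyModel is a placeholder for your own model, not part of the snippet above):

import torch

# Pick the GPU with the most free memory
best_gpu = storage_details_indices[0]
device = torch.device(f"cuda:{best_gpu}")

# Move your model (and later your batches) to that device
model = MyModel().to(device)  # MyModel is a hypothetical model class

Alternatively, you can set the CUDA_VISIBLE_DEVICES environment variable to str(best_gpu) before any CUDA initialization, so that your framework only sees the selected GPU.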