ramalama-musa - Man Page
Setting Up RamaLama with MUSA Support on Linux systems
This guide walks through the steps required to set up RamaLama with MUSA support.
Install the MT Linux Driver
Download the appropriate MUSA SDK and follow the installation instructions provided in the MT Linux Driver installation guide.
Install the MT Container Toolkit
Obtain the latest MT CloudNative Toolkits and follow the installation instructions provided in the MT Container Toolkit installation guide.
Setting Up MUSA Support
$ (cd /usr/bin/musa && sudo ./docker setup $PWD)
$ docker info | grep mthreads
 Runtimes: mthreads mthreads-experimental runc
 Default Runtime: mthreads
Testing the Setup
Test the Installation
Run the following command to verify the setup:
docker run --rm --env MTHREADS_VISIBLE_DEVICES=all ubuntu:22.04 mthreads-gmi
Expected Output
If everything is configured correctly, the output should look similar to this:
Thu May 15 01:53:39 2025
---------------------------------------------------------------
mthreads-gmi:2.0.0          Driver Version:3.0.0
---------------------------------------------------------------
ID   Name           |PCIe                |%GPU  Mem
     Device Type    |Pcie Lane Width     |Temp  MPC Capable
                                         |      ECC Mode
+-------------------------------------------------------------+
0    MTT S80        |00000000:01:00.0    |0%    3419MiB(16384MiB)
     Physical       |16x(16x)            |59C   YES
                                         |      N/A
---------------------------------------------------------------

---------------------------------------------------------------
Processes:
ID   PID       Process name                 GPU Memory Usage
+-------------------------------------------------------------+
  No running processes found
---------------------------------------------------------------
MUSA_VISIBLE_DEVICES
RamaLama respects the MUSA_VISIBLE_DEVICES environment variable if it's already set in your environment. If it is not set, RamaLama defaults to using all the GPUs detected by mthreads-gmi.
You can specify which GPU devices should be visible to RamaLama by setting this variable before running RamaLama commands:
export MUSA_VISIBLE_DEVICES="0,1"  # Use GPUs 0 and 1
ramalama run granite
This is particularly useful in multi-GPU systems where you want to dedicate specific GPUs to different workloads.
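The selection behavior described above can be sketched in a few lines of Python. This is a model of the documented convention only, not RamaLama's actual implementation; the `detected_gpus` parameter is a stand-in for the GPU count reported by mthreads-gmi.

```python
import os

def select_musa_devices(detected_gpus: int) -> list[int]:
    """Return the GPU indices a MUSA-aware launcher would use.

    Mirrors the documented behavior: honor MUSA_VISIBLE_DEVICES when it
    is set (a comma-separated list of indices, e.g. "0,1"), otherwise
    fall back to every detected GPU.
    """
    value = os.environ.get("MUSA_VISIBLE_DEVICES")
    if value is None:
        return list(range(detected_gpus))
    return [int(idx) for idx in value.split(",") if idx.strip()]
```

For example, with `MUSA_VISIBLE_DEVICES="0,1"` exported as shown above, `select_musa_devices(4)` yields `[0, 1]` even though four GPUs were detected.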
History
May 2025, Originally compiled by Xiaodong Ye <yeahdongcn@gmail.com>