Devstral 2 · Local Deployment Guide
This page does one thing: get Devstral running by the shortest path, and answer 'which model should I choose, and what hardware do I need?'
Before You Start (Save Time First)
Suggestion
First check the hardware recommendations to decide whether you want to run 24B or 123B; naming/alias questions are collected in the FAQ.
Hardware Recommendations (So You Don't Waste Time)
Devstral Small 2 (Recommended for Individual Developers)
GPU
≥ 24GB VRAM (RTX 3090 / 4090 / L40)
RAM
≥ 32GB
System
Linux / macOS (Apple Silicon can run quantized builds)
Goal: let you 'use it out of the box' instead of spending a week fighting environment setup
Devstral 2 (123B, More for Teams/Servers)
GPU
Multi-GPU / ≥ 128GB VRAM (inference-server class)
Usage
Team-wide inference service, heavy tasks, and long context
If you just want to experience the workflow, starting with 24B is more cost-effective
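To see roughly why these numbers land where they do, here is a back-of-the-envelope sketch in Python. The function and the overhead figure are illustrative assumptions, not a precise calculator; real usage depends on quantization scheme, context length, and backend.

```python
# Rough VRAM estimate: weights at the chosen quantization, plus a cushion
# for KV cache and runtime overhead. Treat this as a sanity check only.

def estimate_vram_gb(params_b: float, bits_per_weight: float, overhead_gb: float = 4.0) -> float:
    weights_gb = params_b * bits_per_weight / 8  # 1B params at 8 bits ~= 1 GB
    return weights_gb + overhead_gb

# Devstral Small 2 (24B) at ~4-bit quantization: fits a single 24GB card, tightly
print(f"24B  @ 4-bit: ~{estimate_vram_gb(24, 4):.0f} GB")
# Devstral 2 (123B) at ~4-bit: well past a single consumer GPU before KV cache even grows
print(f"123B @ 4-bit: ~{estimate_vram_gb(123, 4):.0f} GB")
```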
Ollama (Simplest)
One command to get it running
ollama run devstral-2
Model library address: https://ollama.com/library/devstral-2
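Once the model is pulled, Ollama also serves a local HTTP API on port 11434. A minimal sketch of calling it from Python with the `requests` library; the model name is taken from the command above, so adjust it if your local tag differs:

```python
import requests

# Ollama's local chat endpoint; the server listens on port 11434 by default.
resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "devstral-2",  # same name as in `ollama run devstral-2`
        "messages": [
            {"role": "user", "content": "Write a Python function that reverses a linked list."}
        ],
        "stream": False,  # return one JSON object instead of a token stream
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["message"]["content"])
```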
When is Ollama suitable?
- You want to quickly verify whether it 'feels right'
- You care more about 'one command to start a service' than squeezing out maximum performance
- Your team is already standardizing on an Ollama workflow
GGUF / llama.cpp (Common in the Community)
Recommended Process (copy as-is, then adjust to your setup)
- Download a GGUF quantized model from Hugging Face
- Load it with llama.cpp / LM Studio / text-generation-webui
- Adjust threads / batch size / context window to fit your project (see the sketch after this list)
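If you prefer to script the same flow, here is a minimal sketch using `huggingface_hub` and `llama-cpp-python`. The repo id and filename are hypothetical placeholders; substitute the GGUF you actually downloaded.

```python
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Placeholder repo/file names -- point these at the actual GGUF quant you chose.
model_path = hf_hub_download(
    repo_id="your-org/Devstral-Small-2-GGUF",   # hypothetical repo id
    filename="devstral-small-2.Q4_K_M.gguf",    # hypothetical quant file
)

llm = Llama(
    model_path=model_path,
    n_ctx=131072,      # context window; shrink this if you run out of memory
    n_threads=8,       # CPU threads; match your physical core count
    n_batch=512,       # prompt batch size; larger is faster but uses more memory
    n_gpu_layers=-1,   # offload all layers to the GPU if they fit, else set a lower number
)
```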
Recommended Parameters (as a starting point)
- Temperature: 0.15
- Context: 128k–256k
Note: this is not 'the only correct answer', just a set of defaults that tends to be stable
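Applied to the `llm` object from the llama.cpp sketch above, the defaults look like this; the prompt and `max_tokens` value are illustrative:

```python
# Low temperature keeps code edits close to deterministic; the large n_ctx set at
# load time covers the long-context use case.
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Refactor this function to remove the global state."}],
    temperature=0.15,   # the recommended low-temperature default
    max_tokens=1024,
)
print(out["choices"][0]["message"]["content"])
```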