Automated Toolchain – Downloads, builds, and manages the llama.cpp toolchain with [LmcppToolChain].
Supported Platforms – Linux, macOS, and Windows with CPU, CUDA, and Metal support.
Multiple Versions – Each release tag and backend is cached separately, allowing you to install multiple versions of llama.cpp.
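To make the "cached separately" idea concrete, here is a minimal sketch of how a cache directory could be derived from a release tag and a backend. The `Backend` enum, the `install_dir` helper, the `<root>/<tag>/<backend>` layout, the cache root, and the example tag are all hypothetical, chosen for illustration; they are not the crate's actual types or on-disk layout.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical backend selector; not the crate's real type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Backend {
    Cpu,
    Cuda,
    Metal,
}

impl Backend {
    /// Directory name used for this backend in the sketched layout.
    fn dir_name(self) -> &'static str {
        match self {
            Backend::Cpu => "cpu",
            Backend::Cuda => "cuda",
            Backend::Metal => "metal",
        }
    }
}

/// Builds a directory unique to one (release tag, backend) pair.
/// The `<root>/<tag>/<backend>` layout is an assumption for illustration.
fn install_dir(root: &Path, release_tag: &str, backend: Backend) -> PathBuf {
    root.join(release_tag).join(backend.dir_name())
}

fn main() {
    let root = Path::new("/tmp/lmcpp-cache"); // hypothetical cache root
    let tag = "b5890"; // example tag only
    for backend in [Backend::Cpu, Backend::Cuda, Backend::Metal] {
        // Each pair gets its own directory, so installs never overwrite each other.
        println!("{}", install_dir(root, tag, backend).display());
    }
}
```

Because the tag and the backend are both part of the path, installing a second backend or a newer tag leaves existing installs untouched.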
Blazing Fast UDS
UDS IPC – Integrates with llama-server’s Unix-domain-socket client on Linux, macOS, and Windows.
Fast! – Is it faster than HTTP? Yes. Is it measurably faster? Maybe.
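For readers unfamiliar with HTTP over a Unix domain socket, the sketch below shows the general idea using only the Rust standard library. It is not how lmcpp implements its client: `get_over_uds`, `serve_once`, and `demo` are hypothetical helpers, the server is a stand-in that returns a fixed body, and `std::os::unix` limits this particular sketch to Unix-like systems.

```rust
use std::io::{Read, Write};
use std::os::unix::net::{UnixListener, UnixStream};
use std::path::Path;
use std::thread;

/// Sends a minimal HTTP/1.1 GET over a Unix domain socket and returns the raw reply.
fn get_over_uds(socket: &Path, route: &str) -> std::io::Result<String> {
    let mut stream = UnixStream::connect(socket)?;
    let request = format!("GET {route} HTTP/1.1\r\nHost: localhost\r\nConnection: close\r\n\r\n");
    stream.write_all(request.as_bytes())?;
    let mut reply = String::new();
    stream.read_to_string(&mut reply)?; // reads until the server closes the connection
    Ok(reply)
}

/// Stand-in server for the demo: answers one request with a fixed body.
fn serve_once(listener: UnixListener) -> std::io::Result<()> {
    let (mut conn, _) = listener.accept()?;
    let mut request = Vec::new();
    let mut buf = [0u8; 512];
    // Read until the blank line that ends the request headers.
    while !request.windows(4).any(|w| w == b"\r\n\r\n") {
        let n = conn.read(&mut buf)?;
        if n == 0 {
            break;
        }
        request.extend_from_slice(&buf[..n]);
    }
    conn.write_all(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\nConnection: close\r\n\r\nok")?;
    Ok(())
}

/// Binds a temporary socket, serves one request on a thread, and returns the client's reply.
fn demo() -> std::io::Result<String> {
    let path = std::env::temp_dir().join(format!("uds-sketch-{}.sock", std::process::id()));
    let _ = std::fs::remove_file(&path); // clear a stale socket file, if any
    let listener = UnixListener::bind(&path)?;
    let server = thread::spawn(move || serve_once(listener));
    let reply = get_over_uds(&path, "/health");
    server.join().expect("server thread panicked")?;
    let _ = std::fs::remove_file(&path);
    reply
}

fn main() -> std::io::Result<()> {
    println!("{}", demo()?);
    Ok(())
}
```

The request and response are ordinary HTTP; only the transport changes from a TCP port to a socket file, which is why the speed difference may be small.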
Fully Typed / Fully Documented
Server Args – All llama-server arguments are implemented by [ServerArgs].
Endpoints – Each endpoint has request and response types defined.
Good Docs – Every parameter was researched to improve upon the original llama-server documentation.
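As a rough illustration of what typed server arguments buy you, the sketch below maps a small struct onto llama-server style command-line flags. `SketchArgs` and `to_argv` are hypothetical and cover only three options; the real [ServerArgs] type is far more complete and may translate arguments differently.

```rust
/// Hypothetical, heavily reduced stand-in for a typed argument struct.
#[derive(Debug, Default)]
struct SketchArgs {
    hf_repo: Option<String>,
    ctx_size: Option<u32>,
    n_predict: Option<i32>,
}

impl SketchArgs {
    /// Renders only the fields that were set, as flag/value pairs.
    fn to_argv(&self) -> Vec<String> {
        let mut argv = Vec::new();
        if let Some(repo) = &self.hf_repo {
            argv.extend(["--hf-repo".to_string(), repo.clone()]);
        }
        if let Some(n) = self.ctx_size {
            argv.extend(["--ctx-size".to_string(), n.to_string()]);
        }
        if let Some(n) = self.n_predict {
            argv.extend(["--n-predict".to_string(), n.to_string()]);
        }
        argv
    }
}

fn main() {
    let args = SketchArgs {
        hf_repo: Some("bartowski/google_gemma-3-1b-it-qat-GGUF".to_string()),
        ctx_size: Some(4096),
        ..Default::default()
    };
    // Unset fields produce no flags, so the server keeps its own defaults for them.
    println!("{:?}", args.to_argv());
}
```

With typed fields, a misspelled option or a wrongly typed value fails at compile time instead of when the server process starts.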
CLI Tools & Web UI
lmcpp-toolchain-cli – Manage the llama.cpp toolchain: download, build, cache.
lmcpp-server-cli – Start, stop, and list servers.
Easy Web UI – Use [LmcppServerLauncher::webui] to start with HTTP and the Web UI enabled.
use lmcpp::*;

fn main() -> LmcppResult<()> {
    let server = LmcppServerLauncher::builder()
        .server_args(
            ServerArgs::builder()
                .hf_repo("bartowski/google_gemma-3-1b-it-qat-GGUF")?
                .build(),
        )
        .load()?;

    let res = server.completion(
        CompletionRequest::builder()
            .prompt("Tell me a joke about Rust.")
            .n_predict(64),
    )?;

    println!("Completion response: {:#?}", res.content);
    Ok(())
}
# With the default model:
cargo run --bin lmcpp-server-cli -- --webui
# Or with a specific model from a URL:
cargo run --bin lmcpp-server-cli -- --webui -u https://huggingface.co/bartowski/google_gemma-3-1b-it-qat-GGUF/blob/main/google_gemma-3-1b-it-qat-Q4_K_M.gguf
# Or with a specific local model:
cargo run --bin lmcpp-server-cli -- --webui -l /path/to/local/model.gguf
How It Works
Your Rust App
│
├─→ LmcppToolChain (downloads / builds / caches)
│ ↓
├─→ LmcppServerLauncher (spawns & monitors)
│ ↓
└─→ LmcppServer (typed handle over UDS*)
│
├─→ completion() → text generation
└─→ other endpoints → typed requests and responses