v1.5.0 · System prompts + 28 languages

Run large language models
on your phone.

No cloud, no internet, no tracking.

MIT — free & open source · 25+ supported models · 0 network calls at inference
Gemma 3n E2B
3.41 GB
What is machine learning?
Thinking...

Machine learning is a field of computer science where systems are taught to learn patterns and make decisions from data, without being explicitly programmed for every specific task.

The goal is to enable computers to predict outcomes or classify new, unseen data. For example, if you show it thousands of pictures labeled "cat" and "dog," the machine learns the features that define a cat versus a dog.

120 tokens · 1.5s · 80 tok/s
Message Send
Runs open-weights from
Qwen Gemma Meta Llama Nemotron Phi-4 Granite Mistral DeepSeek Liquid AI
Why on-device

Private by construction. Fast by design.

The model weights live on your phone. Prompts, drafts and answers never touch a server — there isn't one to touch.

01

Fully offline

Airplane mode? Subway? No signal at all? No difference. Inference happens on your SoC, never on someone else's server.

02

Your roles, saved

Save reusable system prompts once and pick the right one for any model. Keep tone, role and output format consistent across every session.

03

Tune every knob

Temperature, Top-P, Top-K, Min-P, repetition penalty, seed, context size — each model remembers its own settings.
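For the curious, the first two of those knobs can be sketched in a few lines. This is a hypothetical illustration of what temperature and Top-K do to a token distribution — the function and names are not the app's actual internals:

```kotlin
import kotlin.math.exp

// Illustrative sketch: how temperature and Top-K reshape a token
// distribution before sampling. Hypothetical helper, not app code.
fun sampleDistribution(logits: DoubleArray, temperature: Double, topK: Int): DoubleArray {
    // Temperature divides the logits before softmax:
    // values < 1 sharpen the distribution, values > 1 flatten it.
    val scaled = logits.map { it / temperature }

    // Top-K keeps only the K most likely tokens and masks out the rest.
    val cutoff = scaled.sortedDescending()[minOf(topK, scaled.size) - 1]
    val masked = scaled.map { if (it >= cutoff) it else Double.NEGATIVE_INFINITY }

    // Softmax over the survivors (max-subtraction for numerical stability).
    val maxLogit = masked.maxOrNull()!!
    val exps = masked.map { exp(it - maxLogit) }
    val total = exps.sum()
    return exps.map { it / total }.toDoubleArray()
}
```

With `topK = 2`, a three-token vocabulary collapses to two candidates; the token actually emitted is then drawn from this distribution using the configured seed.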

04

Save anywhere

Models are multi-gigabyte. Download them to any folder — internal storage, SD card, or an external drive. Move them between locations any time.

Model library · 25+

Pick a brain. Tap download. Chat.

Browse community-made models optimized for mobile — compact enough to fit on your phone, powerful enough to be useful.

0 pkts/s
Zero network traffic during inference. After the model is downloaded, you can pull the SIM, turn off Wi-Fi, and keep chatting. There is no telemetry, no analytics, no "helpful" background sync.
[Chart: bytes sent to server stays at zero as tokens generated grows from 100 to 10M.]
How it works

Three taps from app open to first token.

STEP 01

Pick a model

Browse the curated list or paste your own model URL. Sizes range from 267 MB to 5.4 GB.

Qwen 3 1.7B
Gemma 3 1B · 806 MB
Llama 3.2 3B · 2.02 GB
Phi-4 mini · 2.49 GB
DeepSeek R1 1.5B · 1.12 GB
STEP 02

Download and resume

Reliable background downloads with progress notifications — speed, ETA, and automatic resume when the connection drops.

Qwen 3 1.7B · 73%
12.4 MB/s · ETA 0:28
Wi-Fi · Home · Resuming...
→ /storage/emulated/0/LMPlayground/
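Resume-on-reconnect comes down to one HTTP header: ask the server only for the bytes you don't have yet. A minimal sketch with hypothetical helper names (not the app's actual downloader), assuming the server supports range requests:

```kotlin
import java.io.File
import java.io.FileOutputStream
import java.net.HttpURLConnection
import java.net.URL

// Build the Range header for a partial download: if we already have N bytes
// on disk, ask the server for everything from byte N onward.
fun rangeHeader(bytesOnDisk: Long): String? =
    if (bytesOnDisk > 0) "bytes=$bytesOnDisk-" else null

// Hypothetical resumable download — appends to the partial file instead of
// starting over. Real code would also verify the 206 Partial Content status.
fun downloadWithResume(url: String, dest: File) {
    val conn = URL(url).openConnection() as HttpURLConnection
    rangeHeader(dest.length())?.let { conn.setRequestProperty("Range", it) }
    conn.inputStream.use { input ->
        FileOutputStream(dest, /* append = */ true).use { input.copyTo(it) }
    }
}
```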
STEP 03

Chat locally

Your conversation history stays on your device. Reasoning is shown inline. No accounts, no API keys.

Summarize this email.
The sender is moving Thursday's sync to Friday 10am PT — confirm or propose another time.
● on-device · 38 tok/s
Languages · 28

Speaks your language.
Right out of the box.

The whole app is translated into 28 languages. On first launch, we'll pick a model that speaks yours.

28 locales
Spotted a wrong translation? Please open a PR on GitHub.
Under the hood

Engineered for phones,
not data centers.

Built on llama.cpp with GGUF quantized models. Native C++ inference with ARM-optimized kernels means less heat, more tokens per second, and no unexpected background drain.

  • Inference engine: llama.cpp
  • Model format: GGUF · Q4_K_M
  • Kernels: KleidiAI + OpenMP
  • Min Android: API 30 (Android 11)
  • Architecture: arm64-v8a
  • UI: Jetpack Compose · Material 3
  • License: MIT
// LlamaCpp.kt — Kotlin ↔ C++ bridge to llama.cpp
package com.druk.llamacpp

class LlamaCpp {
    companion object {
        init {
            System.loadLibrary("llamacpp")
        }
    }

    external fun init(): Int
    external fun systemInfo(): String
    external fun loadModel(
        path: String,
        progressCallback: LlamaProgressCallback,
    ): LlamaModel
    external fun probeModelMetadata(path: String): Array<String>?
}
// no network, no telemetry, no keys.
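As a taste of what sits behind that bridge: every GGUF file opens with the ASCII magic `GGUF` followed by a little-endian version number, so a quick sanity check is possible before handing a path to native code. A hypothetical helper (the app's real metadata parsing happens in C++):

```kotlin
import java.io.File
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Check the 8-byte GGUF header: ASCII magic "GGUF" + little-endian u32 version.
// Hypothetical pre-flight check; real parsing is done in native code.
fun looksLikeGguf(file: File): Boolean {
    if (file.length() < 8) return false
    val header = ByteArray(8)
    file.inputStream().use { if (it.read(header) != 8) return false }
    val magic = String(header, 0, 4, Charsets.US_ASCII)
    val version = ByteBuffer.wrap(header, 4, 4).order(ByteOrder.LITTLE_ENDIAN).int
    return magic == "GGUF" && version >= 1
}
```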
FAQ

Questions you had
before you asked.

Still stuck? Open an issue on GitHub or reach out directly.

What is LM Playground?
LM Playground lets you run large language models directly on your Android device. All processing happens locally — no cloud servers needed.
What models are supported?
The app supports models in GGUF format with Q4_K_M quantization, including Qwen 3, Gemma 3, Llama 3.2, Phi‑4 mini, and DeepSeek R1 Distill in various sizes.
How much storage space do I need?
Model sizes vary from a few hundred MB (small models like Qwen 3 0.6B) to several GB (larger models like DeepSeek R1 7B). Make sure you have enough free space before downloading.
Does the app require an internet connection?
Internet is only needed to download models. Once downloaded, models run completely offline on your device.
Is my data private?
Yes. All conversations are processed locally on your device. No data is sent to external servers.
Why is model loading slow?
Larger models take more time to load into memory. Loading times depend on your device's hardware. Once loaded, the model stays in memory until you unload it.
Which devices work best?
Devices with more RAM can run larger models. For best performance, use a device with at least 6 GB of RAM for small models and 8+ GB for larger ones.
Can I load a custom GGUF model?
Yes. Place your .gguf file in the storage folder selected in Settings → Models (the same folder used for downloads). The app will pick it up automatically and show it in the model selector alongside the built-in catalog. Chat template and tokenizer settings are read from the GGUF metadata. If a specific model doesn't work, please open an issue on GitHub.
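The pickup described above amounts to a directory scan. A sketch with a hypothetical function name (the real app also reads each file's metadata):

```kotlin
import java.io.File

// List model names for every .gguf file in the chosen models folder,
// matching the extension case-insensitively and sorting for stable order.
// Hypothetical helper mirroring the behavior described in the FAQ.
fun discoverLocalModels(modelsDir: File): List<String> =
    modelsDir.listFiles { f -> f.isFile && f.extension.equals("gguf", ignoreCase = true) }
        ?.map { it.nameWithoutExtension }
        ?.sorted()
        ?: emptyList()
```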
Can I change where models are stored?
Yes. Go to Settings, then Models, and use the "Change Folder" option to select a different storage location.
How do I delete a model?
Go to Settings, then Models. In the "Downloaded" section, tap the delete icon next to the model you want to remove.

Any model in your pocket.

Seconds to install. Minutes to download a model. Then you're done with the cloud.
