FastFlowLM is a lightweight runtime for running LLMs, including vision, audio, and MoE models, directly on AMD Ryzen AI NPUs, with no GPU required. It is 10× more power-efficient than GPU inference, supports context lengths of up to 256k tokens, and installs in under 20 seconds. The Ollama-compatible CLI makes it easy to adopt.
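
Because the server follows Ollama's conventions, existing Ollama clients should work largely unchanged. Below is a minimal sketch of querying it over the standard Ollama REST API; the port (Ollama's default, 11434) and the model tag `llama3.2:1b` are assumptions, not details confirmed by this README, so check the FastFlowLM docs for the actual values.

```python
# Minimal sketch: call an Ollama-compatible server for a single completion.
# Assumptions: the FastFlowLM server is already running locally, listens on
# Ollama's default port 11434, and the model tag below has been pulled.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3.2:1b",            # hypothetical model tag; adjust to your setup
        "prompt": "Why are NPUs power-efficient?",
        "stream": False,                   # request one JSON response instead of a stream
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])             # generated text, per the Ollama API schema
```

Keeping the API surface identical to Ollama's is what makes adoption easy: existing client libraries, UIs, and scripts written against Ollama can point at FastFlowLM without code changes.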