Vibe Shit

Sunday, Jul 5

1

Nano vLLM

Lightning-fast LLM inference engine built from scratch in 1,200 lines of Python

Cursor | Claude 3.5

#llm-inference #pytorch #optimization #vllm #deep-learning

Wednesday, Jun 3

1

AirLLM

Run 70B LLMs on 4GB GPUs with zero quantization using memory-optimized inference

Cursor, Claude Code | Claude 3.5, GPT-4

#llm-inference #memory-optimization #large-language-models #gpu-optimization #open-source

Sunday, May 17

1

Dream Server

Self-hosted AI stack — LLM inference, chat, voice, agents, workflows, RAG, and image generation. No cloud required.

n8n, ComfyUI | Llama, Mistral

#self-hosted #local-ai #llm-inference #ai-agents #workflow-automation #rag #privacy-first

2

Tuesday, May 5

1

Candle

Minimalist, high-performance ML framework for Rust with GPU support and LLM inference

Cursor, GitHub Copilot | LLaMA, Whisper

#rust #ml-framework #llm-inference #huggingface #gpu

Monday, May 4

1

fastllm

High-performance C++ LLM inference engine — run DeepSeek 671B on a single GPU

Cursor, GitHub Copilot | DeepSeek, Qwen

#llm-inference #deepseek #quantization #c++ #self-hosted

Thursday, Apr 30

1

Lemonade

Refreshingly fast local AI server — run LLMs privately on your own GPU or NPU for free

Cursor, GitHub Copilot | Llama, Qwen

#local-ai #llm-inference #openai-compatible #amd #mcp-server

Monday, Apr 27

1

Shimmy

Python-free Rust inference server with OpenAI-compatible API for local LLMs — single binary, free forever.

Cursor, GitHub Copilot | Llama, Mistral

#local-ai #llm-inference #openai-compatible #rust #developer-tools

Sunday, Apr 26

1

mistral.rs

Fast, flexible LLM inference engine in Rust with multimodal support and agentic features

Cursor, GitHub Copilot | Mistral, Qwen

#llm-inference #rust #multimodal #quantization #ai-agents

2

ggml

Lightweight tensor library powering local LLM inference on any hardware

Cursor, GitHub Copilot | GPT-2, LLaMA

#llm-inference #tensor-library #quantization #llama.cpp #local-ai

Thursday, Apr 16

1

oMLX

Local LLM inference server for Mac, optimized for AI coding workflows

Claude Code, Cursor | Claude 3.5, GPT-4

#llm-inference #apple-silicon #local-ai #ai-coding #macos-app

Monday, Apr 13

1

ATLAS

Self-hosted coding assistant on a $500 GPU, no cloud required.

Cursor, Claude Code | Qwen 3.5, Claude 3.5

#local-ai #coding-assistant #open-source #self-hosted #llm-inference

Tuesday, Apr 7

1

GPT4All

Run powerful LLMs locally on any device, no GPU or API required.

Cursor, Claude Code | Llama 2, Mistral

#local-llm #llm-inference #open-source #ai-tools #privacy-first

whichllm

Find the best local LLM for your hardware with real benchmarks, not guesses.

Cursor, Claude Code | Claude 3.5, GPT-4

#local-llm #benchmarking #cli-tool #hardware-optimization #llm-inference