DeepSeek-V3 is a powerful 671B-parameter Mixture-of-Experts LLM that activates only 37B parameters per token. It combines Multi-head Latent Attention with the DeepSeekMoE architecture, and was pre-trained on 14.8 trillion tokens followed by supervised fine-tuning (SFT) and reinforcement learning (RL) stages. It is one of the most capable open-weight models available.
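The gap between 671B total and 37B activated parameters comes from sparse expert routing: each token is sent through only a small top-k subset of experts. A minimal sketch of that idea, with hypothetical sizes and a plain softmax gate (not DeepSeek's actual routing code):

```python
import numpy as np

rng = np.random.default_rng(0)

n_experts, k, d = 8, 2, 16          # hypothetical sizes for illustration
experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]
gate_w = rng.standard_normal((d, n_experts))

def moe_forward(x):
    """Route a single token vector x through its top-k experts."""
    logits = x @ gate_w                  # one gating score per expert
    top = np.argsort(logits)[-k:]        # indices of the k highest-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()             # softmax over the selected experts only
    # Weighted sum of the chosen experts' outputs; the other experts run no compute.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(d)
out = moe_forward(token)
print(out.shape)  # (16,)
```

Because only `k` of the `n_experts` weight matrices are touched per token, compute scales with the activated parameters rather than the total parameter count.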