Homunculus 12B and GLM-4–32B-Base-32K: 2 new Arcee AI research-oriented models
Julien Simon



Publish Date: Jul 8

In this video, I introduce two research-oriented models that Arcee AI recently released on Hugging Face.

Homunculus is a 12-billion-parameter instruction model distilled from Qwen3-235B onto the Mistral AI Nemo backbone. It was purpose-built to preserve Qwen's two interaction modes, /think (deliberate chain-of-thought) and /nothink (concise answers), while running on a single consumer GPU, and even on a CPU, as demonstrated in the video.
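To illustrate the two-mode style, here is a minimal sketch of how a prompt could carry the mode tag. The exact tag placement and the Hugging Face repo id (`arcee-ai/Homunculus`) are assumptions, not taken from the video; check the model card for the canonical chat template.

```python
def build_prompt(question: str, think: bool = True) -> str:
    """Prefix a user prompt with a Qwen-style mode tag:
    /think for deliberate chain-of-thought, /nothink for concise answers.
    (Tag placement is an assumption; the model card is authoritative.)"""
    mode = "/think" if think else "/nothink"
    return f"{mode} {question}"

# Hypothetical usage with the transformers pipeline API
# (repo id arcee-ai/Homunculus is an assumption):
# from transformers import pipeline
# pipe = pipeline("text-generation", model="arcee-ai/Homunculus")
# out = pipe([{"role": "user",
#              "content": build_prompt("Explain model distillation.")}])

print(build_prompt("Explain model distillation.", think=False))
# → /nothink Explain model distillation.
```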

GLM-4-32B-Base-32K is an enhanced version of GLM-4-32B-Base-0414 from THUDM (Tsinghua University), specifically engineered for robust performance over an extended context window. While the original model's capabilities degraded beyond 8,192 tokens, this version maintains strong performance up to a 32,000-token context, making it well suited to tasks requiring long-context understanding and processing.
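When feeding long documents to a model like this, it helps to pre-check whether the input fits the 32,000-token budget. Below is a rough sketch using the common ~4-characters-per-token heuristic for English text; this ratio is an assumption, and the model's own tokenizer should be used for an exact count.

```python
def fits_context(text: str,
                 max_tokens: int = 32_000,
                 chars_per_token: float = 4.0) -> bool:
    """Rough pre-check: estimate token count from character length.
    The 4-chars-per-token ratio is only a heuristic for English text;
    run the model's tokenizer for a real count before trusting this."""
    estimated_tokens = len(text) / chars_per_token
    return estimated_tokens <= max_tokens

doc = "word " * 10_000          # ~50,000 characters
print(fits_context(doc))        # ~12,500 estimated tokens → True
```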
