Maxim Saplin

Maxim Saplin @maximsaplin

About: ツ Manager, Engineer, Open-source Maintainer

Joined: Oct 12, 2019

articles - 75 total

XYZ% of Code is Now Written by AI... Who Cares?

Microsoft CEO Satya Nadella said that "as much as 30% of the company’s code is now written by...

May 1

GPT 4.1, o3, o4-mini - OpenAI releases through the lens of LLM_Chess

This will be a quick post. I've run the recent OpenAI models through LLM Chess eval: o4-mini and o3...

Apr 21

Mercury Coder - A Quick Test of Diffusion Language Model

I have recently touched on how diffusion/transformer models come into new domains - specifically the...

Apr 18

Llama 4 - 10M Context? Coding? Decent Follow-up?

Meta released the long-awaited Llama 4 models on Saturday, April 5. Llama 3 came out on April 26,...

Apr 8

4o Image Gen - Diffusion/Transformer Cross-over Trend?

In March we saw 2 major releases of image generation tools (Google, OpenAI) that are very much...

Mar 31

SGLang vs llama.cpp - A Quick Speed Test

Recently, I stumbled upon a post about SGLang, an open-source LLM inference engine that boasts 2-5x...

Feb 17

OpenAI o3-mini Tested in LLM Chess

OpenAI has recently presented its newest reasoning model - o3-mini. At Medium and High reasoning...

Feb 13

Qwen2.5 Max Release Went Unnoticed in Deepseek Hysteria

Last week, Chinese Big-Tech company Alibaba released its best model to date: Qwen-Max. It is a...

Feb 3

DeepSeek-V3: Laziness vs Eagerness

In late 2023, people complained about GPT-4 Turbo's laziness - often the model didn't complete tasks....

Jan 27

Deepseek R1 vs OpenAI o1

Deepseek R1 is out - available via Deepseek API or free Deepseek chat. If you are following LLM/Gen...

Jan 23

OpenAI o1/o3 - Be Careful What you Wish For...

Hallucination is also a latent fear accompanying the copy-pasting of a long scroll of text from a...

Jan 14

OpenAI o3 - Thinking Fast and Slow

OpenAI has teased the o3 model today—a further development of the "reasoning" model and a successor...

Dec 20 '24

Tried Phi-4, It didn't Impress

Phi-4 14B has been recently released. Benchmarks look promising, e.g. it beats GPT-4o in Math: I...

Dec 18 '24

Gemini 2.0 Released, Reminding of "AI Hitting the Wall" Talks

Today Google has presented a major update to its flagship SOTA model - Gemini 2.0. What caught my...

Dec 11 '24

The Power of Pragmatism: Engineering Cultures and China's Ascendancy

A company decided to rename its product XYZ to ABC. Assets, texts in the app, and web pages were...

Dec 3 '24

Can LLMs Play Chess? I've Tested 13 Models (GPT-4o, Claude 3.5, Gemini 1.5 etc.)

UPD January 25, 2025: Deepseek R1 is another model that broke the ceiling of zero wins showing...

Nov 21 '24

Microsoft Autogen Has Split in 2... Wait 3... No, 4 Parts

One of the most popular open-source frameworks of 2023, which explored the application of AI agents...

Nov 18 '24

Llama 3.1 Nemotron 70B - Quirks and Features

The new open model by Nvidia, Nemotron 70B, has recently been the hotspot of "this is wild", "this is...

Oct 29 '24

DDR5 Speed, CPU and LLM Inference

This is the 3rd part of my investigations of local LLM inference speed. Here're the 1st and 2nd...

Oct 12 '24

Gen AI Hype - the Never Ending Excitement

"When sensationalism wins over nuance, we lose our ability to think." is a great quote by Lex Fridman...

Sep 23 '24

OpenAI o1 Release is so Reminiscent of Apple Events - it's an Incremental Update

Yesterday OpenAI introduced "A new series of reasoning models for solving hard problems", which...

Sep 13 '24

Continue.dev: The Swiss Army Knife That Sometimes Fails to Cut

First Impressions: I had low expectations of Continue.dev when I first installed it in...

Sep 10 '24

Python 3.13 RC1 - a Quick CPU Benchmark

Python 3.13 is due to be released in October, yet the first release candidate was published earlier...

Aug 25 '24

llama.cpp: CPU vs GPU, shared VRAM and Inference Speed

This is the 2nd part of my investigations of local LLM inference speed. Here're the 1st and 3rd...

Aug 22 '24

Convergence of LLMs: 2024 Trend Solidified by Llama 3.1 Release

The recent release of Llama 3.1 was reminiscent of many releases this year. It underlined a trend...

Jul 25 '24

DoLa and MT-Bench - A Quick Eval of a new LLM trick

Decoding by Contrasting Layers (DoLa) is a technique suggesting a different approach to calculating...

Jul 11 '24

4090 - ECC ON vs ECC OFF

Fine-tuning an LLM via Hugging Face TRL/Torch: ECC On: 2.22 epochs/day; ECC Off: 2.33 epochs/day...

Jun 25 '24

MT-Bench: Comparing different LLM Judges

By default, MT-Bench uses OpenAI as a service provider with a gpt-4 model ID, which is a vanilla...

Jun 8 '24

Nvidia's 1000x Performance Boost Claim Verified

Nvidia's keynote at the recent Computex was full of bold marketing and messaging, bordering on...

Jun 4 '24

LLM Fine-tuning on RTX 4090: 90% Performance at 55% Power

At just a fraction of the power, the 4090 is capable of delivering almost full performance. While running...

May 29 '24