Maxim Saplin

Maxim Saplin @maximsaplin

About: ツ AI in Software Dev, Open-source

Joined: Oct 12, 2019

Articles - 82 total

Ran out of Cursor tokens and switched to GitHub Copilot: Side-by-Side

DISCLAIMER! The best AI coding tool is the one available to you, that gives you the best model and...

28 19 · Feb 18

Long-horizon agents: OpenCode + GPT-5.2 Codex Experiment

Sequoia Capital has recently published a blog post arguing that AGI has been achieved because...

9 0 · Jan 22

Cursor-like Semantic Rules in GitHub Copilot

Both GitHub Copilot and Cursor offer ways to define guardrails for agents in the form of Instructions...

8 0 · Jan 8

AI Dev: Plan Mode vs. SDD — A Weekend Experiment

Three months ago, I tested Kiro's Spec-Driven Development (SDD) workflow and walked away impressed...

14 4 · Dec 4 '25

AI Dev: Testing Kiro

Kiro is yet another VSCode fork (just like Cursor or Windsurf) that integrates AI coding features....

17 10 · Aug 25 '25

LLMs are Bad at Math

LLMs are known to struggle with math. Not in those PhD level tasks from AIME eval, where the...

13 1 · Jun 13 '25

Grok 3 API - Reasoning Tokens are Counted Differently

I've learned it the hard way... If you use the recently released Grok-3 Mini reasoning model (which...

10 0 · May 15 '25

XYZ% of Code is Now Written by AI... Who Cares?

Microsoft CEO Satya Nadella said that "as much as 30% of the company’s code is now written by...

43 8 · May 1 '25

GPT 4.1, o3, o4-mini - OpenAI releases through the lens of LLM_Chess

This will be a quick post. I've run the recent OpenAI models through the LLM Chess eval: o4-mini and o3...

10 0 · Apr 21 '25

Mercury Coder - A Quick Test of Diffusion Language Model

I have recently touched on how diffusion/transformer models come into new domains - specifically the...

10 0 · Apr 18 '25

Llama 4 - 10M Context? Coding? Decent Follow-up?

Meta has brought the long-awaited Llama 4 models on Saturday, April 5. Llama 3 came out on April 26,...

25 1 · Apr 8 '25

4o Image Gen - Diffusion/Transformer Cross-over Trend?

In March we saw 2 major releases of image generation tools (Google, OpenAI) that are very much...

20 0 · Mar 31 '25

SGLang vs llama.cpp - A Quick Speed Test

Recently, I stumbled upon a post about SGLang, an open-source LLM inference engine that boasts 2-5x...

20 3 · Feb 17 '25

OpenAI o3-mini Tested in LLM Chess

OpenAI has recently presented its newest reasoning model - o3-mini. At Medium and High reasoning...

16 0 · Feb 13 '25

Qwen2.5 Max Release Went Unnoticed in Deepseek Hysteria

Last week, Chinese Big-Tech company Alibaba released its best model to date: Qwen-Max. It is a...

11 0 · Feb 3 '25

DeepSeek-V3: Laziness vs Eagerness

In late 2023, people complained about GPT-4 Turbo's laziness - often the model didn't complete tasks....

19 0 · Jan 27 '25

Deepseek R1 vs OpenAI o1

Deepseek R1 is out - available via Deepseek API or free Deepseek chat. If you are following LLM/Gen...

152 23 · Jan 23 '25

OpenAI o1/o3 - Be Careful What you Wish For...

Hallucination is also a latent fear accompanying the copy-pasting of a long scroll of text from a...

9 2 · Jan 14 '25

OpenAI o3 - Thinking Fast and Slow

OpenAI has teased the o3 model today—a further development of the "reasoning" model and a successor...

203 10 · Dec 20 '24

Tried Phi-4, It didn't Impress

Phi-4 14B has been recently released. Benchmarks look promising, e.g. it beats GPT-4o in Math: I...

22 0 · Dec 18 '24

Gemini 2.0 Released, Reminding of "AI Hitting the Wall" Talks

Today Google has presented a major update to its flagship SOTA model - Gemini 2.0. What caught my...

13 1 · Dec 11 '24

The Power of Pragmatism: Engineering Cultures and China's Ascendancy

A company decided to rename its product XYZ to ABC. Assets, texts in the app, and web pages were...

8 0 · Dec 3 '24

Can LLMs Play Chess? I've Tested 13 Models (GPT-4o, Claude 3.5, Gemini 1.5 etc.)

UPD September 15, 2025: Reasoning models opened a new chapter in Chess performance, the most recent...

44 9 · Nov 21 '24

Microsoft Autogen Has Split in 2... Wait 3... No, 4 Parts

One of the most popular open-source frameworks of 2023, which explored the application of AI agents...

43 12 · Nov 18 '24

Llama 3.1 Nemotron 70B - Quirks and Features

The new open model by NVIDIA, Nemotron 70B, has recently been the hotspot of "this is wild", "this is...

53 2 · Oct 29 '24

DDR5 Speed, CPU and LLM Inference

This is the 3rd part of my investigations of local LLM inference speed. Here're the 1st and 2nd...

26 5 · Oct 12 '24

Gen AI Hype - the Never Ending Excitement

"When sensationalism wins over nuance, we lose our ability to think." is a great quote by Lex Fridman...

16 1 · Sep 23 '24

OpenAI o1 Release is so Reminiscent of Apple Events - it's an Incremental Update

Yesterday OpenAI introduced "A new series of reasoning models for solving hard problems", which...

14 4 · Sep 13 '24

Continue.dev: The Swiss Army Knife That Sometimes Fails to Cut

First Impressions I had low expectations from Continue.dev when I first installed it in...

58 4 · Sep 10 '24

Python 3.13 RC1 - a Quick CPU Benchmark

Python 3.13 is due to be released in October, yet the first release candidate was published earlier...

11 1 · Aug 25 '24
