This will be a quick post. I've run the recent OpenAI models through the LLM Chess eval:
- o4-mini and o3 demonstrate solid chess performance and instruction following
- GPT 4.1 didn't qualify due to multiple model errors
- 4.1 Mini is a good increment over 4o Mini, while 4.1 Nano didn't impress
Below is a matrix view of the models' performance, with the Y-axis showing chess proficiency and the X-axis showing instruction following:
P.S. The "Notes" section of the leaderboard website dives deeper into each model's performance.