GPT 4.1, o3, o4-mini - OpenAI releases through the lens of LLM_Chess
Maxim Saplin

Publish Date: Apr 21

This will be a quick post. I've run the recent OpenAI models through the LLM Chess eval:

  • o4-mini and o3 demonstrate solid chess performance and instruction following
  • GPT 4.1 didn't qualify due to multiple model errors
  • 4.1 Mini is a good step up from 4o Mini; 4.1 Nano didn't impress

Below is a matrix view of the models' performance, with the Y-axis showing chess proficiency and the X-axis showing instruction following:

LLM Chess Matrix View

P.S. The "Notes" section of the leaderboard website dives deeper into the models' performance.
