Code Arena | Overall

View overall rankings across AI models on agentic coding tasks involving multi-step reasoning and tool use.

Mar 20, 2026
209,727 votes
56 models
| Rank | Rank Spread | Organization | License | Score | Votes | Price | Context |
|---|---|---|---|---|---|---|---|
| 1 | 1-2 | Anthropic | Proprietary | 1548 +12/-12 | 4,059 | $5 / $25 | 1M |
| 2 | 1-2 | Anthropic | Proprietary | 1546 +12/-12 | 3,317 | $5 / $25 | 1M |
| 3 | 3-3 | Anthropic | Proprietary | 1521 +9/-9 | 5,876 | $3 / $15 | 1M |
| 4 | 4-4 | Anthropic | | 1489 +7/-7 | 13,259 | $5 / $25 | 200K |
| 5 | 5-8 | Anthropic | Proprietary | 1465 +7/-7 | 13,313 | $5 / $25 | 200K |
| 6 | 5-14 | OpenAI | Proprietary | 1457 +17/-17 | 1,486 | N/A | N/A |
| 7 | 5-12 | Google | Proprietary | 1454 +10/-10 | 4,364 | $2 / $12 | 1M |
| 8 | 6-15 | Z.ai | MIT | 1445 +10/-10 | 4,316 | $1 / $3.20 | 202.8K |
| 9 | 5-15 | MiniMax | Proprietary | 1445 +14/-14 | 2,015 | $0.30 / $1.20 | 204.8K |
| 10 | 6-15 | Z.ai | MIT | 1439 +10/-10 | 4,971 | $0.39 / $1.75 | 202.8K |
| 11 | 6-15 | Google | Proprietary | 1437 +7/-7 | 17,483 | $2 / $12 | 1M |
| 12 | 7-15 | Google | Proprietary | 1436 +7/-7 | 13,404 | $0.50 / $3 | 1M |
| 13 | 6-16 | Xiaomi | Proprietary | 1436 +16/-16 | 1,350 | $1 / $3 | 1M |
| 14 | 8-15 | Moonshot | Modified MIT | 1431 +9/-9 | 5,987 | $0.60 / $3 | N/A |
| 15 | 7-19 | OpenAI | Proprietary | 1428 +16/-16 | 1,574 | N/A | N/A |
| 16 | 15-22 | MiniMax | Modified MIT | 1410 +9/-9 | 5,796 | $0.20 / $1.17 | 196.6K |
| 17 | 15-22 | Moonshot | Modified MIT | 1409 +11/-11 | 3,632 | $0.45 / $2.20 | 262.1K |
| 18 | 14-23 | OpenAI | Proprietary | 1409 +12/-12 | 2,973 | $1.75 / $14 | 400K |
| 19 | 15-28 | OpenAI | Proprietary | 1400 +16/-16 | 1,531 | $1.75 / $14 | 400K |
| 20 | 16-27 | MiniMax | MIT | 1399 +8/-8 | 9,584 | $0.27 / $0.95 | 196.6K |
| 21 | 16-27 | | | 1395 +7/-7 | 11,042 | $0.50 / $3 | 1M |
| 22 | 16-28 | OpenAI | Proprietary | 1392 +12/-12 | 3,835 | $1.25 / $10 | 400K |
| 23 | 19-28 | Anthropic | | 1389 +6/-6 | 16,012 | $3 / $15 | 200K |
| 24 | 18-28 | OpenAI | Proprietary | 1388 +9/-9 | 6,255 | $1.25 / $10 | 400K |
| 25 | 19-30 | Alibaba | Apache 2.0 | 1386 +10/-10 | 4,535 | $0.39 / $2.34 | 262.1K |
| 26 | 19-28 | Anthropic | Proprietary | 1386 +6/-6 | 17,832 | $3 / $15 | 200K |
| 27 | 19-30 | Anthropic | Proprietary | 1384 +9/-9 | 8,738 | $15 / $75 | 200K |
| 28 | 21-32 | | | 1373 +14/-14 | 1,941 | $2 / $6 | 2M |
| 29 | 26-32 | DeepSeek | MIT | 1370 +8/-8 | 7,445 | $0.26 / $0.38 | 163.8K |
| 30 | 26-32 | Alibaba | Apache 2.0 | 1367 +11/-11 | 3,239 | $0.26 / $2.08 | 262.1K |
| 31 | 28-34 | Z.ai | MIT | 1354 +9/-9 | 8,522 | $0.39 / $1.90 | 204.8K |
| 32 | 28-35 | Alibaba | Apache 2.0 | 1352 +12/-12 | 2,951 | $0.20 / $1.56 | 262.1K |
| 33 | 31-37 | OpenAI | Proprietary | 1339 +7/-7 | 13,088 | $1.25 / $10 | 400K |
| 34 | 31-37 | | | 1338 +8/-8 | 6,850 | $0.09 / $0.29 | 262.1K |
| 35 | 32-38 | OpenAI | Proprietary | 1338 +8/-8 | 7,901 | $1.75 / $14 | 400K |
| 36 | 33-38 | Moonshot | Modified MIT | 1328 +6/-6 | 14,436 | $1.15 / $8 | 262.1K |
| 37 | 33-39 | OpenAI | Proprietary | 1326 +9/-9 | 6,346 | $1.25 / $10 | 400K |
| 38 | 35-41 | DeepSeek | MIT | 1322 +8/-8 | 8,886 | $0.26 / $0.38 | 163.8K |
| 39 | 38-41 | Anthropic | Proprietary | 1309 +6/-6 | 15,758 | $1 / $5 | 200K |
| 40 | 37-41 | MiniMax | Apache 2.0 | 1309 +9/-9 | 8,602 | $0.26 / $1 | 196.6K |
| 41 | 38-43 | | | 1302 +14/-14 | 2,109 | $0.09 / $0.29 | 262.1K |
| 42 | 41-43 | DeepSeek | MIT | 1285 +10/-11 | 5,012 | $0.27 / $0.41 | 163.8K |
| 43 | 41-43 | Alibaba | Apache 2.0 | 1282 +6/-6 | 15,471 | $0.40 / $1.60 | 262.1K |
| 44 | 44-49 | KwaiKAT | Proprietary | 1258 +15/-15 | 1,925 | $0.21 / $0.83 | 256K |
| 45 | 44-50 | Google | Proprietary | 1251 +16/-16 | 1,479 | $0.25 / $1.50 | 1M |
| 46 | 44-50 | Alibaba | Apache 2.0 | 1249 +16/-16 | 1,818 | $0.16 / $1.30 | 262.1K |
| 47 | 44-51 | OpenAI | Proprietary | 1240 +17/-17 | 1,503 | $0.25 / $2 | 400K |
| 48 | 44-51 | Alibaba | Proprietary | 1238 +17/-17 | 1,560 | N/A | N/A |
| 49 | 44-50 | xAI | Proprietary | 1234 +9/-9 | 6,977 | $0.20 / $0.50 | 2M |
| 50 | 45-53 | Mistral | Apache 2.0 | 1221 +20/-20 | 1,031 | $0.50 / $1.50 | N/A |
| 51 | 48-53 | xAI | Proprietary | 1205 +19/-19 | 1,242 | $0.20 / $0.50 | N/A |
| 52 | 50-53 | Google | Proprietary | 1205 +13/-13 | 3,365 | $1.25 / $10 | 1M |
| 53 | 50-53 | Mistral | Modified MIT | 1198 +17/-17 | 1,603 | N/A | N/A |
| 54 | 54-55 | xAI | Proprietary | 1149 +23/-23 | 936 | $0.20 / $0.50 | 2M |
| 55 | 54-56 | xAI | Proprietary | 1138 +22/-22 | 989 | $0.20 / $1.50 | 256K |
| 56 | 55-56 | Mistral | Proprietary | 1094 +22/-22 | 1,003 | $0.40 / $2 | 128K |

Remove Style Control Leaderboard Plots

Fraction of Model A Wins for All Non-tied A vs. B Battles

Confidence Intervals on Model Strength (via Bootstrapping)

Battle Count for Each Combination of Models (without Ties)

Average Win Rate Against All Other Models (Uniform Sampling and No Ties)
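
The plot titles above name quantities that can be derived from raw pairwise battles. The following is a minimal sketch of how they might be computed, not the leaderboard's actual pipeline. The record layout `(model_a, model_b, winner)`, the model names `m1` to `m3`, and the helper names are all hypothetical. The bootstrap here resamples battles and recomputes the average win rate; the site's "model strength" is presumably a fitted rating, which this sketch does not attempt to reproduce.

```python
import random
from collections import defaultdict

# Hypothetical battle records: (model_a, model_b, winner),
# where winner is "a", "b", or "tie". Not real leaderboard data.
battles = [
    ("m1", "m2", "a"),
    ("m1", "m2", "a"),
    ("m2", "m1", "a"),
    ("m1", "m3", "a"),
    ("m3", "m1", "tie"),
    ("m2", "m3", "b"),
    ("m3", "m2", "a"),
]


def pairwise_counts(records):
    """Return (wins, counts) over non-tied battles.

    wins[x][y]   = number of times x beat y
    counts[x][y] = number of non-tied battles between x and y
    """
    wins = defaultdict(lambda: defaultdict(int))
    counts = defaultdict(lambda: defaultdict(int))
    for a, b, winner in records:
        if winner == "tie":
            continue  # ties are excluded, as in the plot titles
        victor, loser = (a, b) if winner == "a" else (b, a)
        wins[victor][loser] += 1
        counts[a][b] += 1
        counts[b][a] += 1
    return wins, counts


def average_win_rate(records):
    """Average each model's win fraction over its opponents, weighting
    every opponent equally (uniform sampling, no ties)."""
    wins, counts = pairwise_counts(records)
    rates = {}
    for model, opponents in counts.items():
        fractions = [wins[model][opp] / n for opp, n in opponents.items()]
        rates[model] = sum(fractions) / len(fractions)
    return rates


def bootstrap_interval(records, model, rounds=200, alpha=0.05, seed=0):
    """Percentile bootstrap interval for one model's average win rate."""
    rng = random.Random(seed)
    samples = []
    for _ in range(rounds):
        # Resample battles with replacement, same size as the original set.
        resample = [rng.choice(records) for _ in records]
        rate = average_win_rate(resample).get(model)
        if rate is not None:  # model may be absent from a resample
            samples.append(rate)
    if not samples:
        raise ValueError(f"no resample contained a non-tied battle for {model!r}")
    samples.sort()
    low = samples[int((alpha / 2) * len(samples))]
    high = samples[int((1 - alpha / 2) * len(samples)) - 1]
    return low, high


rates = average_win_rate(battles)
low, high = bootstrap_interval(battles, "m1")
print(rates)
print(low, high)
```

With only seven toy battles the interval is very wide. The same effect appears in the leaderboard, where models with fewer votes tend to show wider score intervals and rank spreads.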