DevQualityEval v1.0
https://github.com/symflower/eval-dev-quality
Ruby score: percentage of total possible score across all tasks for language Ruby (higher is better; axis from 0% to 100%)
Model | Ruby score
DeepSeek: R1 Distill Qwen 1.5B | 9.00%
Meta: Llama 3.2 1B (Instruct) | 9.42%
Mistral: Mixtral 8x7B (Base) (v0.1) | 9.70%
Mistral: Mistral Tiny (v0.3) | 14.10%
XWin-LM: Xwin 70B | 20.17%
Liquid: LFM 7B | 21.29%
Liquid: LFM 3B | 23.73%
Cohere: Command R (08-2024) | 26.25%
Meta: Llama 3.2 3B (Instruct) | 26.25%
Liquid: LFM MoE 40B | 27.22%
Cohere: Command | 27.38%
Mistral: Mistral 7B (Instruct) | 27.70%
Cognitive Computations: Dolphin 2.6 Mixtral 8x7B | 27.73%
Microsoft: WizardLM-2 7B | 27.78%
Microsoft: Phi-3 Medium (Instruct) (128K) | 28.00%
NousResearch: Hermes 13B | 28.17%
DeepSeek: DeepSeek R1 Distill Qwen 14B | 28.56%
Qwen: Qwen 2 7B (Instruct) | 28.75%
Google: Gemma 2 27B | 29.12%
Mistral: Mistral 7B (Instruct) (v0.3) | 29.33%
Cohere: Command R (03-2024) | 29.40%
NousResearch: Hermes 2 Mixtral 8x7B (DPO) | 30.82%
Teknium: OpenHermes 2.5 Mistral 7B | 31.37%
Jon Durbin: Airoboros 70B | 31.79%
Perplexity: Llama 3.1 Sonar 8B | 34.68%
OpenChat: OpenChat 3.5 7B | 35.29%
Meta: Llama 3 8B (Instruct) | 37.00%
Microsoft: WizardLM-2 8x22B | 39.09%
Microsoft: Phi-3 Mini (Instruct) (128K) | 39.44%
Google: Gemma 2 9B | 39.78%
Google: Gemini Pro 1.5 | 40.65%
Microsoft: Phi-3.5 Mini (Instruct) (128K) | 42.88%
Cohere: Command R7B (12-2024) | 44.45%
NousResearch: Hermes 3 405B (Instruct) | 45.28%
Meta: Llama 3.1 8B (Instruct) | 45.66%
DeepSeek: DeepSeek R1 Distill Llama 70B | 48.28%
Mistral: Mistral NeMo (v24.07) | 48.54%
Mistral: Mixtral 8x22B (Instruct) (v0.1) | 49.49%
Cohere: Command R+ (08-2024) | 49.69%
Mistral: Mistral Small (v24.02) | 50.77%
Cohere: Command R+ (04-2024) | 51.05%
Cognitive Computations: Dolphin 2.9.2 Mixtral 8x22B | 51.48%
Mistral: Pixtral 12B (v2409) | 52.87%
Mistral: Codestral Mamba | 54.43%
Amazon: Nova Micro 1.0 | 57.54%
NousResearch: Hermes 2 Pro - Llama-3 8B | 58.07%
AionLabs: Aion-1.0-Mini | 59.12%
Qwen: Qwen2.5 7B (Instruct) | 59.25%
AI21: Jamba 1.5 Large | 59.29%
Mistral: Mixtral 8x7B (Instruct) (v0.1) | 63.19%
NousResearch: Hermes 3 70B (Instruct) | 64.10%
Mistral: Mistral Medium | 64.17%
Mistral: Ministral 3B | 67.56%
Databricks: DBRX 132B (Instruct) | 67.93%
Qwen: QwQ 32B | 71.53%
DeepSeek: DeepSeek R1 Distill Qwen 32B | 71.91%
Qwen: Qwen 2 72B (Instruct) | 71.94%
AI21: Jamba-Instruct | 73.17%
Mistral: Ministral 8B | 73.36%
Mistral: Mistral Small 3 | 73.51%
NVIDIA: Llama 3.1 Nemotron 70B (Instruct) | 73.57%
DeepSeek: DeepSeek R1 | 73.91%
AI21: Jamba 1.5 Mini | 74.25%
Qwen: Qwen-Turbo (2024-11-01) | 75.48%
Mistral: Pixtral Large (2411) | 75.53%
Mistral: Mistral Large 2 (2411) | 76.31%
Anthropic: Claude 3 Sonnet | 77.12%
Meta: Llama 3 70B (Instruct) | 77.35%
Amazon: Nova Lite 1.0 | 79.41%
Amazon: Nova Pro 1.0 | 80.06%
Google: Gemini Flash 1.5 8B | 80.25%
Mistral: Mistral Large 2 (2407) | 81.76%
AionLabs: Aion-1.0 | 82.59%
Meta: Llama 3.1 70B (Instruct) | 82.85%
Anthropic: Claude 3 Opus | 82.86%
01.AI: Yi Large | 83.03%
Anthropic: Claude 3 Haiku | 83.25%
Meta: Llama 3.1 405B (Instruct) | 83.55%
Meta: Llama 3.3 70B (Instruct) | 83.78%
Qwen: Qwen2.5 72B (Instruct) | 85.15%
Perplexity: Llama 3.1 Sonar 70B | 85.21%
Perplexity: Llama 3 Sonar 70B (Online) | 85.22%
DeepSeek: DeepSeek V3 | 85.56%
Qwen: Qwen2.5 32B Instruct | 85.88%
DeepSeek: DeepSeek V2.5 | 86.00%
Microsoft: Phi 4 | 86.63%
Mistral: Codestral (2501) | 86.70%
Anthropic: Claude 3.5 Sonnet (2024-10-22) | 88.42%
Google: Gemini Flash 2.0 | 88.44%
Anthropic: Claude 3.5 Haiku (2024-10-22) | 88.78%
Anthropic: Claude 3.7 Sonnet (2025-02-19) | 89.85%
xAI: Grok-2 (1212) | 89.93%
Google: Gemini Flash 1.5 | 90.35%
MiniMax: MiniMax-01 | 90.58%
Qwen: Qwen-Plus | 90.59%
Anthropic: Claude 3.5 Sonnet (2024-06-20) | 91.10%
Anthropic: Claude 3.7 Sonnet (Thinking) | 91.43%
Qwen: Qwen-Max | 91.53%
Qwen: Qwen2.5 Coder 32B (Instruct) | 92.57%
OpenAI: o1-mini (2024-09-12) | 92.68%
OpenAI: GPT-4o-mini (2024-07-18) | 92.95%
Google: Gemini 2.0 Flash Lite | 93.44%
OpenAI: o3-mini (2025-01-31) (reasoning_effort=high) | 94.18%
OpenAI: o3-mini (2025-01-31) (reasoning_effort=low) | 94.19%
OpenAI: o3-mini (2025-01-31) (reasoning_effort=medium) | 95.11%
OpenAI: GPT-4o (2024-11-20) | 95.47%
OpenAI: o1-preview (2024-09-12) | 95.55%