bench-7c1c2f89

2026-02-27 08:55:15 UTC · 245.0s · 3 models · 3 prompts · Total cost: $0.31
#1 flux-pro 4.3/5 WINNER | Best Value: flux-schnell
3 models · 3 prompts (n=3) · $0.31 · 245s · ⚠ Small sample — scores may vary with more prompts

Model Rankings

Rank  Model              Visual Quality  Prompt Adherence  Text Rendering  Overall  Samples  Conf  Agree
1     flux-pro (WINNER)  4.0             4.7               4.3             4.3      n=3      97%   78%
2     flux-dev           4.3             4.7               3.7             4.2      n=3      97%   78%
3     flux-schnell       4.3             4.3               3.7             4.1      n=3      98%   44%
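The ranking figures can be reproduced from the per-image consensus scores listed later in this report. The following is a minimal sketch, assuming each dimension column is the mean over the three prompts and Overall is the mean of the three dimension means; the report does not state the aggregation formula, and the names `SCORES`, `summarize`, and `rank_models` are illustrative only.

```python
from statistics import mean

# Consensus scores per prompt, copied from the per-image details in this report.
# Prompt order: golden retriever, Tokyo street, coffee cup.
# Tuple order: (visual_quality, prompt_adherence, text_rendering).
SCORES = {
    "flux-pro": [(5, 4, 5), (3, 5, 3), (4, 5, 5)],
    "flux-dev": [(5, 5, 5), (3, 4, 1), (5, 5, 5)],
    "flux-schnell": [(5, 4, 5), (3, 4, 1), (5, 5, 5)],
}


def summarize(per_prompt):
    """Return (per-dimension means, overall) for one model.

    Assumption: overall is the unweighted mean of the dimension means.
    """
    dims = [mean(column) for column in zip(*per_prompt)]
    return dims, mean(dims)


def rank_models(scores):
    """Sort models by overall score, highest first."""
    rows = [(name, *summarize(per_prompt)) for name, per_prompt in scores.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)


for name, dims, overall in rank_models(SCORES):
    formatted = " ".join(f"{d:.1f}" for d in dims)
    print(f"{name}: {formatted} overall={overall:.1f}")
```

Under that assumption the sketch yields the same one-decimal values as the table above, which suggests (but does not prove) that the tool aggregates this way.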

Consensus Analysis — 3 Judges

Two primary judges score each dimension independently. When their scores differ by at most 0.5, the result is marked high agreement. Otherwise the dimension is marked disputed, a third tiebreaker judge scores it, and the median of the three scores is used.
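The two-plus-one rule described above can be sketched as follows. This is an illustration under stated assumptions, not Evalytic's actual code: the function name `consensus` and the callable-based tiebreaker are hypothetical, and the report does not say how two agreeing scores are combined, so the sketch averages them.

```python
from statistics import median

# Stated in the report: a difference of at most 0.5 counts as agreement.
AGREEMENT_THRESHOLD = 0.5


def consensus(primary_a, primary_b, tiebreaker=None):
    """Return (score, status) for one dimension.

    `tiebreaker` is a zero-argument callable that returns the third judge's
    score. It is only invoked when the primary judges disagree, which is
    what keeps the number of tiebreaker requests low.
    """
    if abs(primary_a - primary_b) <= AGREEMENT_THRESHOLD:
        # Assumption: agreeing scores are averaged (not specified in the report).
        return (primary_a + primary_b) / 2, "high"
    if tiebreaker is None:
        raise ValueError("primary judges disagree and no tiebreaker was given")
    third = tiebreaker()
    return median([primary_a, primary_b, third]), "disputed"


# flux-schnell, golden retriever prompt, Visual Quality: 5.0 vs 4.0, tiebreaker 5.0
print(consensus(5.0, 4.0, lambda: 5.0))
# flux-dev, Tokyo prompt, Text Rendering: both primaries gave 1.0
print(consensus(1.0, 1.0))
```

Passing the tiebreaker as a callable reflects the cost profile in this run: only the 9 disputed dimensions out of 27 triggered a third request.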
High Agreement: 67%
Disputed: 33%
Total Scores: 27
Tiebreakers: 9
Judge Scoring Bias
Judge             Role        Avg Score  vs Consensus  Scores Given
gemini-2.5-flash  primary     4.22       +0.00         27
gemini-3-flash    primary     4.15       -0.07         27
gemini-2.5-pro    tiebreaker  4.00       n/a*          9
* Tiebreaker only scores disputed dimensions, so its average is not directly comparable to primary judges.
Disputed Dimensions (9)

Dimension Profile

Cost Efficiency

Rank  Model                      Score  Cost/Image  Score/$  vs Winner
1     flux-schnell (BEST VALUE)  4.1    $0.0030     1370     -0.2 quality, 94% cheaper
2     flux-dev                   4.2    $0.0250     169      -0.1 quality, 50% cheaper
3     flux-pro (WINNER)          4.3    $0.0500     87       baseline

flux-schnell is 94% cheaper than flux-pro and scores only 0.2 points lower overall (4.1 vs. 4.3).
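The cost-efficiency figures can be checked with a few lines of arithmetic. This sketch assumes Score/$ is the unrounded overall score divided by the cost per image, and that "cheaper" is the relative difference in cost per image against flux-pro. Neither formula is stated in the report, but both reproduce the table. The names `SCORE_SUMS`, `score_per_dollar`, and `percent_cheaper` are illustrative.

```python
# Sum of the nine consensus scores per model (3 prompts x 3 dimensions),
# added up from the per-image details in this report.
SCORE_SUMS = {"flux-schnell": 37, "flux-dev": 38, "flux-pro": 39}
COST_PER_IMAGE = {"flux-schnell": 0.0030, "flux-dev": 0.0250, "flux-pro": 0.0500}
SCORES_PER_MODEL = 9


def score_per_dollar(model):
    """Unrounded overall score divided by cost per image (assumed formula)."""
    overall = SCORE_SUMS[model] / SCORES_PER_MODEL
    return overall / COST_PER_IMAGE[model]


def percent_cheaper(model, baseline="flux-pro"):
    """Relative saving in cost per image versus the baseline model."""
    return (1 - COST_PER_IMAGE[model] / COST_PER_IMAGE[baseline]) * 100


for model in SCORE_SUMS:
    print(model, round(score_per_dollar(model)), f"{percent_cheaper(model):.0f}% cheaper")
```

Dividing the rounded score instead (4.1 / 0.0030) gives about 1367 rather than 1370, which is why the sketch works from the unrounded mean.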

A golden retriever playing in autumn leaves

flux-schnell
4.7/5
1270ms · $0.0030
Score Details
Visual Quality: 5.0/5 (95% conf) disputed
gemini-2.5-flash: 5.0, gemini-3-flash: 4.0, gemini-2.5-pro: 5.0

The image exhibits excellent visual quality with sharp focus on the subject, realistic textures for the dog's fur and the autumn leaves, and a pleasing depth of field. There are no visible artifacts or inconsistencies, contributing to a professional and coherent appearance.
Prompt Adherence: 4.0/5 (100% conf) disputed
gemini-2.5-flash: 4.0, gemini-3-flash: 5.0, gemini-2.5-pro: 4.0

The image perfectly captures all elements of the prompt, featuring a happy golden retriever in a playful pose surrounded by a thick layer of autumn leaves. The seasonal setting is consistently represented in both the foreground and the blurred background.
Text Rendering: 5.0/5 (100% conf) high
gemini-2.5-flash: 5.0, gemini-3-flash: 5.0

No text was expected to be rendered in this image based on the provided prompt, which describes the visual content rather than text to be included. The image successfully contains no text.
flux-dev
5.0/5
2593ms · $0.0250
Score Details
Visual Quality: 5.0/5 (100% conf) disputed
gemini-2.5-flash: 5.0, gemini-3-flash: 4.0, gemini-2.5-pro: 5.0

The image exhibits professional quality with no visible artifacts or distortions. The puppy is sharply rendered with realistic fur texture and expressive features, while the background and foreground leaves are beautifully blurred to create a strong sense of depth and focus.
Prompt Adherence: 5.0/5 (100% conf) high
gemini-2.5-flash: 5.0, gemini-3-flash: 5.0

The image perfectly captures all elements of the prompt. It features a golden retriever puppy actively playing amidst a vibrant carpet of autumn leaves, with a warm, autumnal background.
Text Rendering: 5.0/5 (100% conf) high
gemini-2.5-flash: 5.0, gemini-3-flash: 5.0

No text was expected to be rendered in this image based on the prompt, and indeed, no text is present. The image focuses solely on the visual depiction of a golden retriever in autumn leaves.
flux-pro
4.7/5
6368ms · $0.0500
Score Details
Visual Quality: 5.0/5 (100% conf) disputed
gemini-2.5-flash: 4.0, gemini-3-flash: 5.0, gemini-2.5-pro: 5.0

The image is of excellent, professional quality, closely resembling a high-end photograph. The subject is rendered with exceptional detail and realism, from the texture of the fur to the reflections in the eyes. The composition, lighting, and use of a shallow depth of field are all executed perfectly, resulting in a visually stunning and coherent image with no visible artifacts.
Prompt Adherence: 4.0/5 (95% conf) high
gemini-2.5-flash: 4.0, gemini-3-flash: 4.0

The image perfectly captures the subject and setting, featuring a golden retriever surrounded by a thick layer of autumn leaves. While the dog is lying down rather than engaging in active, dynamic play, its happy expression and the environment strongly align with the prompt's intent.
Text Rendering: 5.0/5 (100% conf) high
gemini-2.5-flash: 5.0, gemini-3-flash: 5.0

No text was expected in this image based on the prompt, and no text is present. The image focuses solely on the visual depiction of a golden retriever in autumn leaves.

A futuristic Tokyo street at night with neon signs

flux-schnell
2.7/5
1493ms · $0.0030
Score Details
Visual Quality: 3.0/5 (100% conf) disputed
gemini-2.5-flash: 4.0, gemini-3-flash: 3.0, gemini-2.5-pro: 3.0

The image demonstrates excellent composition, lighting, and atmosphere, creating a convincing nighttime city scene. However, the quality is significantly reduced by the fact that all the text on the numerous signs, which are a central focus of the image, is nonsensical and garbled. While the characters mimic the style of Japanese script, they are not legitimate, which breaks the realism of the scene upon closer inspection.
Prompt Adherence: 4.0/5 (90% conf) disputed
gemini-2.5-flash: 3.0, gemini-3-flash: 4.0, gemini-2.5-pro: 4.0

The image strongly captures the essence of a Tokyo street at night with an abundance of glowing signs. While it perfectly represents the location and lighting, it lacks specific 'futuristic' elements, looking more like a contemporary photograph of modern-day Japan.
Text Rendering: 1.0/5 (100% conf) disputed
gemini-2.5-flash: 1.0, gemini-3-flash: 2.0, gemini-2.5-pro: 1.0

The image is filled with text on numerous neon signs, intended to look like Japanese. However, the vast majority of the characters are garbled, malformed, and do not form legible words or phrases. While the AI successfully captures the aesthetic of a neon-lit Japanese street, it fails to render any coherent language.
flux-dev
2.7/5
2600ms · $0.0250
Score Details
Visual Quality: 3.0/5 (90% conf) high
gemini-2.5-flash: 3.0, gemini-3-flash: 3.0

The image presents a visually appealing night street scene with good lighting and atmospheric reflections. However, the quality is significantly hampered by the pervasive incoherent and unreadable text on almost all prominent neon signs, which is a common AI artifact.
Prompt Adherence: 4.0/5 (90% conf) high
gemini-2.5-flash: 4.0, gemini-3-flash: 4.0

The image strongly depicts a Tokyo-like street at night, illuminated by numerous vibrant neon signs. While the architecture isn't explicitly futuristic, the intense neon lighting and overall cyberpunk aesthetic successfully convey a futuristic atmosphere.
Text Rendering: 1.0/5 (95% conf) high
gemini-2.5-flash: 1.0, gemini-3-flash: 1.0

The text on all the neon signs is completely garbled and unreadable. It consists of random strokes and shapes that do not form coherent characters, making it impossible to decipher any meaning.
flux-pro
3.7/5
4956ms · $0.0500
Score Details
Visual Quality: 3.0/5 (90% conf) high
gemini-2.5-flash: 3.0, gemini-3-flash: 3.0

The image effectively captures a vibrant cyberpunk aesthetic with strong lighting and realistic wet-road reflections. However, it suffers from typical AI artifacts such as garbled text on signs and poorly defined background figures, which detract from the overall realism upon closer inspection.
Prompt Adherence: 5.0/5 (100% conf) high
gemini-2.5-flash: 5.0, gemini-3-flash: 5.0

The image perfectly captures a futuristic Tokyo street at night, dominated by an abundance of vibrant neon signs. The atmosphere and details align precisely with the prompt's description, creating an excellent visual representation.
Text Rendering: 3.0/5 (90% conf) high
gemini-2.5-flash: 3.0, gemini-3-flash: 3.0

The word 'TOKYO' is rendered clearly and correctly on the prominent yellow sign, which is a major focal point. However, the vast majority of the other text in the image consists of garbled, nonsensical characters that merely mimic the appearance of Japanese script.

A steaming cup of coffee on a wooden table, morning light

flux-schnell
5.0/5
1882ms · $0.0030
Score Details
Visual Quality: 5.0/5 (100% conf) high
gemini-2.5-flash: 5.0, gemini-3-flash: 5.0

The image exhibits professional quality with excellent lighting, sharp focus on the subject, and realistic textures. There are no visible artifacts or inconsistencies, and the steam from the coffee is rendered naturally.
Prompt Adherence: 5.0/5 (100% conf) high
gemini-2.5-flash: 5.0, gemini-3-flash: 5.0

The image perfectly captures all elements of the prompt. There is a steaming cup of coffee, it is placed on a wooden table, and the lighting clearly depicts morning light with warm sunbeams and shadows.
Text Rendering: 5.0/5 (100% conf) high
gemini-2.5-flash: 5.0, gemini-3-flash: 5.0

No text was expected in the image based on the prompt, and indeed, no text is present in the generated image. The image successfully avoids generating any unintended or garbled text.
flux-dev
5.0/5
2092ms · $0.0250
Score Details
Visual Quality: 5.0/5 (95% conf) disputed
gemini-2.5-flash: 5.0, gemini-3-flash: 3.0, gemini-2.5-pro: 5.0

The image exhibits excellent visual quality with professional-grade rendering. The lighting is beautifully executed, creating a warm and inviting atmosphere, and the steam rising from the cup appears highly realistic.
Prompt Adherence: 5.0/5 (100% conf) high
gemini-2.5-flash: 5.0, gemini-3-flash: 5.0

The image is a perfect visual representation of the prompt, capturing every requested element with high fidelity. The steam, the wooden texture of the table, and the specific quality of morning light are all clearly present and well-executed.
Text Rendering: 5.0/5 (100% conf) high
gemini-2.5-flash: 5.0, gemini-3-flash: 5.0

No text was expected to be rendered in this image based on the provided prompt. The image successfully depicts the scene without any text elements.
flux-pro
4.7/5
5092ms · $0.0500
Score Details
Visual Quality: 4.0/5 (95% conf) disputed
gemini-2.5-flash: 5.0, gemini-3-flash: 4.0, gemini-2.5-pro: 4.0

The image exhibits excellent visual quality, characterized by professional rendering, sharp focus on the main subject, and a highly coherent composition. The lighting and atmospheric effects are particularly well-executed, contributing to a realistic and appealing scene.
Prompt Adherence: 5.0/5 (100% conf) high
gemini-2.5-flash: 5.0, gemini-3-flash: 5.0

The image perfectly captures all elements of the prompt. There is a steaming cup of coffee, placed on a wooden table, illuminated by warm morning light.
Text Rendering: 5.0/5 (100% conf) high
gemini-2.5-flash: 5.0, gemini-3-flash: 5.0

No text was expected or generated in this image. The prompt described a visual scene rather than specific text to be rendered.

Cost Summary

Category                                                        Requests  Cost
fal.ai generation                                               9         $0.2340
consensus:gemini-2.5-flash,gemini-3-flash,gemini-2.5-pro judge  63        $0.0720
Total                                                                     $0.3060

Configuration

Models: flux-schnell, flux-dev, flux-pro
Judge: gemini-2.5-flash
Judges: gemini-2.5-flash, gemini-3-flash, gemini-2.5-pro
Consensus Mode: Adaptive 2+1
Dimensions: visual_quality, prompt_adherence, text_rendering
Pipeline: text2img
Evalytic Version: 0.2.1
Platform: darwin-arm64