The Arena
Two AIs battle on the same prompt. You judge blind, without knowing the brands. That is the playable arena, built on a sample of battles. The ranking aggregates 139,935 real votes from the open compar:IA dataset, enriched in real time by Joute visitor votes.
The AI arena, without brand bias.
When you choose your AI model, brand gets in the way. ChatGPT, Claude, Gemini: the names carry weight. The arena breaks that bias. You read two answers to the same prompt, without knowing who wrote them. You vote for the best. Then we reveal.
Joute did not invent the format. It is the method of compar:IA, from the French Ministry of Culture: since 2024, compar:IA has collected 139,935 real votes on hundreds of models in French, released under Open License 2.0. Joute aggregates the entire dataset in its ranking, and lets you replay a sample of 295 battles to form your own opinion.
Your Joute votes count. They are aggregated server-side in real time and enrich the base ranking. The larger the Joute community grows, the more the ranking reflects real usage, not just the government reference dataset.
fais moi un poéme sur le temps pourris du pas-de-calais
"Le Temps Pourri du Pas-de-Calais" Le ciel est lourd, comme un vieux drap mouillé, Pesant sur les toits, sur les champs noyés. Le vent siffle bas, un chant désolé, Et la pluie en pleurs ronge les pavés. Les falaises grises, usées par les ans, Se dressent encore, mais pour combien de temps ? Le sel de la mer, âpre et tenace, Mord les murs, les cœurs, les rêves las. Les corons noirs, sous la brume qui traîne, Gardent l’écho des coups de grisou. Le temps s’étire, lent, sans haine, Comme un mineur qui n’a plus de feu. Et nous, debout dans ce jour sans lumière, On attend l’éclair, le soleil qui perce. Mais le Pas-de-Calais, terre de misère, Nous offre en partage… son temps qui pourrit. - *Pour ceux qui savent que le vent du Nord* *Emporte les jours, mais jamais l’espoir.*
Le ciel est lourd, couleur de cendres, Une brume épaisse étouffe les champs. Le vent se lève, cruel et moqueur, Il glace les os et mord le cœur. La pluie dessine sur les carreaux De longs serpents gris, lents et si faux. Elle tambourine sur les toits, Comme un spleen ancien, toujours là, sournois. Les pavés luisants, noirs et tristes, Reflettent l’heure qui persiste. Le soleil n’est qu’un souvenir, Une vieille image à l’avenir. Mais dans ce deuil de gris et d’eau, Sous ce manteau de froid halo, Reste une force, âpre et tenace, Celle du Nord, qui jamais ne passe. Car sous ce temps qui semble maudit, Le cœur du Nord jamais ne mentit. Il brave l’orage, le crachin rasoir, Et garde en lui un drôle d’espoir. Alors laissons gronder la tempête, Avec ses rafales de mal de tête. Ici, le mauvais temps est roi, Mais c’est notre chose, à nous, chez moi.
The arena ranking
139,935 votes
Ranking built on the 139,935 real blind votes from the open compar:IA dataset (French Ministry of Culture). Each brand aggregates all of its tested models, from the smallest to the largest. No editorial score is factored in.
The percentage is a win rate: the share of battles won out of all battles the brand played in the compar:IA dataset. The number of battles varies from one brand to another.
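As a rough illustration of the win rate described above, here is a minimal Python sketch. The `(brand_a, brand_b, winner)` record layout and the `win_rates` function name are assumptions made for the example, not the actual compar:IA schema. The sketch also assumes that ties count as battles played but not as wins, which the source does not specify.

```python
from collections import defaultdict

def win_rates(battles):
    """Compute a per-brand win rate from blind battle records.

    battles: iterable of (brand_a, brand_b, winner) tuples, where winner
    is "a", "b", or "tie" (hypothetical layout, for illustration only).
    """
    played = defaultdict(int)
    won = defaultdict(int)
    for brand_a, brand_b, winner in battles:
        # Every battle counts as played for both brands involved.
        played[brand_a] += 1
        played[brand_b] += 1
        # Assumption: a tie adds no win to either side.
        if winner == "a":
            won[brand_a] += 1
        elif winner == "b":
            won[brand_b] += 1
    # Win rate = battles won / battles played, per brand.
    return {brand: won[brand] / played[brand] for brand in played}
```

Because the number of battles differs between brands, a win rate computed this way is more reliable for brands with many battles than for brands with few.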
Three steps, one minute per battle.
You read both answers
Same prompt, two AIs, identities hidden. You see A and B, not their names. No logo, no brand color. Just the text.
You vote for the best
A wins, B wins, tie, or both weak. No registration required, just a click. The vote is anonymous: it is keyed to a hash of your IP address and user agent (UA), with no cookie.
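One way an IP+UA hash can be computed is sketched below. This is an assumption-laden example, not Joute's actual implementation: the `voter_fingerprint` name, the use of HMAC-SHA-256, and the server-side `secret` are all hypothetical choices made for illustration.

```python
import hashlib
import hmac

def voter_fingerprint(ip: str, user_agent: str, secret: bytes) -> str:
    """Return an opaque identifier for a voter without storing IP or UA.

    Assumption: a keyed hash (HMAC-SHA-256) with a server-side secret,
    so the raw IP address cannot be recovered by brute force alone.
    """
    message = f"{ip}|{user_agent}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()
```

The same visitor on the same browser yields the same fingerprint, which is enough to detect repeat votes without setting a cookie.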
We reveal, we aggregate
Names appear: you see whether your intuition matches. Your vote is added to the Joute ranking in real time.
How we build the arena ranking.
The model: Bradley-Terry, not a raw score
We do not add up wins. We use the Bradley-Terry statistical model, the standard for pairwise rankings: it is closely related to Elo ratings in chess and is used by LMSYS Chatbot Arena. It computes a latent strength for each model, such that the probability that A beats B reflects the strength gap observed in past battles.
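To make the idea concrete: in the Bradley-Terry model, each model i has a positive strength s_i, and the probability that A beats B is s_A / (s_A + s_B). The sketch below fits those strengths with the classic iterative (minorization-maximization) update. It is a minimal illustration, not Joute's production code: the `wins` input layout and both function names are assumptions, ties are ignored, and each model is assumed to have at least one win and one loss.

```python
from collections import defaultdict

def fit_bradley_terry(wins, iterations=200):
    """Estimate Bradley-Terry strengths from pairwise win counts.

    wins: dict mapping (winner, loser) -> number of battles won
    (hypothetical layout). Returns strengths normalized to sum to 1.
    """
    models = sorted({m for pair in wins for m in pair})
    strength = {m: 1.0 for m in models}
    total_wins = defaultdict(float)   # wins per model
    games = defaultdict(float)        # battles per unordered pair
    for (winner, loser), n in wins.items():
        total_wins[winner] += n
        games[frozenset((winner, loser))] += n

    for _ in range(iterations):
        updated = {}
        for i in models:
            # Sum of n_ij / (s_i + s_j) over every opponent j that i faced.
            denom = 0.0
            for j in models:
                if i == j:
                    continue
                n_ij = games.get(frozenset((i, j)), 0.0)
                if n_ij:
                    denom += n_ij / (strength[i] + strength[j])
            updated[i] = total_wins[i] / denom if denom else strength[i]
        # Strengths are only defined up to a scale factor, so renormalize.
        norm = sum(updated.values())
        strength = {m: s / norm for m, s in updated.items()}
    return strength

def win_probability(strength, a, b):
    """Probability that model a beats model b under Bradley-Terry."""
    return strength[a] / (strength[a] + strength[b])
```

Unlike a raw win count, this accounts for who each model faced: beating a strong opponent raises a model's strength more than beating a weak one.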
Two combined signals
compar:IA signal: the base ranking is sourced from the 139,935 real votes in the French Ministry of Culture dataset. This is the prior: a known strength for each model.
Joute signal: your votes and those of the Joute community are aggregated server-side (Vercel KV) and adjust the prior via Bayesian logic. The more votes accumulate, the more the Joute signal weighs against the initial compar:IA ranking.
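The exact update rule is not published here, so the following is only a sketch of one common way to combine a prior with new votes: treat the compar:IA estimate as a set of pseudo-votes and add the Joute votes on top (a Beta-Binomial update). The function name and the `prior_weight` parameter are hypothetical.

```python
def blended_win_probability(prior_prob, prior_weight, joute_wins, joute_votes):
    """Blend a prior win probability with newly observed votes.

    prior_prob:   P(A beats B) from the compar:IA-based ranking.
    prior_weight: how many pseudo-votes the prior is worth (assumption).
    joute_wins:   Joute votes where A beat B.
    joute_votes:  total Joute votes on the A-vs-B pairing.
    """
    # Beta-Binomial update: prior pseudo-counts plus observed counts.
    alpha = prior_prob * prior_weight + joute_wins
    beta = (1.0 - prior_prob) * prior_weight + (joute_votes - joute_wins)
    return alpha / (alpha + beta)
```

With no Joute votes the result equals the prior. As Joute votes accumulate, the result moves toward the win rate observed on Joute, which matches the behavior described above.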
Data freshness
The compar:IA dataset is re-synced monthly (first Monday of the month). Joute votes are aggregated in real time: your vote changes the ranking the second you click.
100% real confrontations
Everything comes from compar:IA, the open dataset of the French Ministry of Culture: 139,935 real votes cast blind by French-speaking users, under Open License 2.0. The ranking is its aggregate, enriched in real time by Joute votes. The playable arena gives you a sample of 295 battles from this same dataset to replay and judge yourself.
Everything we get asked about the arena.
What is the Joute AI Arena?
Where do the battles and votes come from?
How is the ranking calculated?
Are my votes anonymous?
Why an arena rather than a classic benchmark?
How often is the ranking updated?
The ranking evolves every week. Don't miss it.
We send a monthly recap: who rises, who falls, and the models that collapse when you remove brand bias. No spam, one-click unsubscribe.

