"Stop guessing which prompt is better. Duel them and let the data decide."
I designed, coded, and deployed PromptDuel within a single 24-hour sprint on Christmas Day 2025.
The Problem
When developing AI agents, small changes to a prompt's wording can lead to drastically different outputs. Tracking those variations in spreadsheets is messy. You need a way to blind-test outputs against each other to get clean, unbiased data.
The Solution
PromptDuel solves the "vibe check" problem. It is a lightweight, structured environment for evaluating LLM outputs side by side.
Key Features
- ⚖️ Side-by-Side Arena: A clean, split-screen interface for comparing two text outputs (supports Markdown).
- 🫣 Blind Testing Mode: Model names are hidden from voters to ensure unbiased feedback.
- 🔗 Instant Sharing: Generate public, read-only links for clients or team members to cast votes.
- 📊 Analytics Dashboard: Track vote velocity and win rates visually.
- 🔐 Secure: Supabase Row Level Security (RLS) policies ensure each voter can only read and write the rows they are allowed to (see the sketch below).
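
To give a feel for how blind testing, sharing, and voting fit together, here is a minimal sketch using supabase-js. The `duels` and `votes` table names and the columns (`share_slug`, `output_a`, `output_b`, `winner`) are illustrative placeholders, not the actual schema:

```ts
// Minimal sketch of the blind-voting flow (assumed table/column names).
import { createClient } from "@supabase/supabase-js";

const supabase = createClient(
  process.env.NEXT_PUBLIC_SUPABASE_URL!,
  process.env.NEXT_PUBLIC_SUPABASE_ANON_KEY!
);

// Load a duel via its public share slug. Model names are deliberately
// not selected, so the voter only ever sees two anonymous outputs.
export async function getDuel(slug: string) {
  const { data, error } = await supabase
    .from("duels")
    .select("id, output_a, output_b")
    .eq("share_slug", slug)
    .single();
  if (error) throw error;
  return data;
}

// Record a vote for side "a" or "b"; RLS policies decide whether the
// anonymous caller is allowed to insert into the votes table.
export async function castVote(duelId: string, winner: "a" | "b") {
  const { error } = await supabase
    .from("votes")
    .insert({ duel_id: duelId, winner });
  if (error) throw error;
}
```

Keeping model names out of the public query is what makes the read-only share links safe to send around: voters see the outputs, never the labels.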
Tech Stack
I chose a stack focused on speed and reliability:
- Frontend: Next.js 14 (App Router) + Tailwind CSS
- UI Library: shadcn/ui + Lucide icons
- Backend/Auth: Supabase (PostgreSQL + RLS)
- Visualization: Recharts (see the dashboard sketch below)
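
As a rough idea of how the analytics dashboard renders vote velocity, a chart can be a thin Recharts wrapper in a Next.js client component. The `VoteVelocityChart` name and `VotePoint` shape here are hypothetical, just to show how the pieces fit:

```tsx
"use client";

import {
  LineChart,
  Line,
  XAxis,
  YAxis,
  Tooltip,
  ResponsiveContainer,
} from "recharts";

// Hypothetical shape: one point per day with the number of votes cast.
type VotePoint = { day: string; votes: number };

export function VoteVelocityChart({ data }: { data: VotePoint[] }) {
  return (
    <ResponsiveContainer width="100%" height={240}>
      <LineChart data={data}>
        <XAxis dataKey="day" />
        <YAxis allowDecimals={false} />
        <Tooltip />
        <Line type="monotone" dataKey="votes" stroke="#8884d8" />
      </LineChart>
    </ResponsiveContainer>
  );
}
```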
