About
I run The Litmus Lab. It's me, a laptop, and five AI APIs.
What This Is
Every day, I send the same yes-or-no questions to five major AI models: Claude, GPT, Gemini, DeepSeek, and Grok. Each one has to answer YES or NO. No explanations, no qualifiers, no hedging. Every answer is published permanently.
Over time, the data shows patterns: where models agree, where they split, and how often they change their minds. That last part, the drifts, is the most interesting. Same question, same settings, different day, different answer. Nobody's changing anything. They just wobble.
As of today, the record includes over 300 questions, 40,000+ runs, and 2,000+ documented drifts.
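For anyone who wants the drift idea in concrete terms, here is a minimal sketch in Python. It is not the actual Litmus Lab code. It assumes each run is stored as a (date, question ID, model, answer) tuple with ISO-format dates, and it assumes a drift means a model's answer to the same question differs from its answer on the previous run. The function name, record layout, and sample data are all made up for illustration.

```python
from collections import defaultdict


def count_drifts(runs):
    """Return a list of drifts found in `runs`.

    `runs` is an iterable of (date, question_id, model, answer) tuples,
    where date is an ISO string like "2026-03-01" and answer is "YES" or "NO".
    This record layout is an assumption, not the site's real schema.
    """
    # Group answers by (question, model), in date order.
    history = defaultdict(list)
    for date, question_id, model, answer in sorted(runs, key=lambda r: r[0]):
        history[(question_id, model)].append((date, answer))

    # A drift (as assumed here) is any flip between consecutive runs.
    drifts = []
    for (question_id, model), entries in history.items():
        for (_, prev_answer), (date, answer) in zip(entries, entries[1:]):
            if answer != prev_answer:
                drifts.append((question_id, model, date, prev_answer, answer))
    return drifts


# Hypothetical sample data: one model flips, the other holds steady.
sample_runs = [
    ("2026-03-01", "q1", "claude", "YES"),
    ("2026-03-02", "q1", "claude", "NO"),
    ("2026-03-03", "q1", "claude", "NO"),
    ("2026-03-01", "q1", "gpt", "YES"),
    ("2026-03-02", "q1", "gpt", "YES"),
]
drifts = count_drifts(sample_runs)
```

In this sample, only the first model's switch from YES to NO on the second day counts as a drift. Holding the same answer on the third day does not.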
What This Is Not
This is not a truth engine. I don't label answers as right or wrong. I document what models say when forced to commit. The data speaks for itself.
Why I Built This
I'm an engineer with 30 years of B2B experience across several industries. When I started using AI models daily, I noticed they hedge constantly, piling up paragraphs of qualifiers to avoid committing to anything. I wanted to know what happens when you take that away.
So I built a system that forces them to commit. The answers are the "what." The drifts are the "how sure." That's the whole thesis.
Independence
I'm not affiliated with Anthropic, OpenAI, Google, DeepSeek, xAI, or anyone else. No corporate funding, no sponsorship, no ads. I fund this myself. Anyone can submit a question. I curate what runs. The methodology is public. The data is raw.
Founded February 2026
Built by one person who loves data and has zero patience for non-answers. Start with the record or read the methodology.