Hacking Secret Hitler: From Bayesian Math to AI Agents

Over the past few weeks, my friends and I have been obsessed with Secret Hitler. If you’ve never played it, it’s a game of deception, deduction, and lying to your friends’ faces. But after a few intense sessions, my engineer brain kicked in. I realized that while the game relies on "social reads," it’s ultimately a game of probabilities.

My intuition was simple: There must be an algorithm to rank suspected Fascists.

I pulled up my favorite LLM and started brainstorming. It pointed me toward Bayes' Theorem, a formula for updating probabilities as new evidence comes in. That’s when I went down the rabbit hole.

Phase 1: The "Math" Solver

I wrote a script to track the game state. I didn't want a bot that just played randomly; I wanted a calculator that saw through the lies. I tweaked variables to model human behavior:

What is the probability a Liberal lies? (Near zero).
What is the probability a Fascist lies? (High, but calculated).
How does "social pressure" propagate suspicion?
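
To make the Bayes step concrete, here is a minimal sketch of a single update. It asks how suspicious a President should look after claiming to have drawn three Fascist cards from a fresh deck (11 Fascist, 6 Liberal). The lie probabilities are the ones from my script; the function names, the 3-in-7 prior, and the assumption that a lying Fascist always claims "FFF" are simplifications for illustration, not the repo's exact model.

```python
from math import comb

P_F_PRES_LIES = 0.7  # Fascist President lie rate (same value as in the script)
P_L_PRES_LIES = 0.0  # Liberals are assumed to always tell the truth


def prob_draw_all_fascist(deck_f: int, deck_l: int) -> float:
    """Chance that a 3-card draw is all Fascist (hypergeometric)."""
    return comb(deck_f, 3) / comb(deck_f + deck_l, 3)


def posterior_fascist(prior: float, deck_f: int = 11, deck_l: int = 6) -> float:
    """P(President is Fascist | President claims 'FFF') via Bayes' theorem."""
    p_fff = prob_draw_all_fascist(deck_f, deck_l)
    # A claim of FFF is either true, or a lie told after a different draw.
    like_fascist = p_fff + (1 - p_fff) * P_F_PRES_LIES
    like_liberal = p_fff + (1 - p_fff) * P_L_PRES_LIES
    evidence = like_fascist * prior + like_liberal * (1 - prior)
    return like_fascist * prior / evidence


if __name__ == "__main__":
    # Illustrative prior: 3 of 7 players are on the Fascist team.
    print(round(posterior_fascist(prior=3 / 7), 3))  # prints 0.705
```

One "FFF" claim moves the President from about 43% to about 70% under these assumptions, which is why repeated suspicious claims add up so quickly.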

I brought the script to our next game night. Every round, I fed it the data: Who is President? Who is Chancellor? What was claimed? What was enacted?

The results were terrifying. Over a few rounds, the script started ranking players. By the mid-game, it was predicting the Fascists with scary accuracy. I realized then how hard it actually is to hide from math.

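Here is the logic engine I used. It uses a particle filter approach: it keeps many candidate role assignments, weights each one by how well it explains what happened at the table, and uses those weights to estimate the probability of every player's role: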

REPO : https://github.com/samarthmahendraneu/secret-hiter.ai

Python:

# secret_hitler_inference.py
# A Bayesian approach to catching Fascists

import random
import string
from collections import defaultdict
from math import comb

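# ===== Behavior Model Parameters =====
P_F_PRES_LIES = 0.7      # chance a Fascist President lies
P_L_PRES_LIES = 0.0      # Liberals are assumed never to lie
P_F_CHAN_LIES = 0.6      # chance a Fascist Chancellor lies
SOCIAL_INFLUENCE = 0.35  # how strongly suspicion spreads between players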

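def deck_likelihood(draw_F, deck_F, deck_L):
    """Hypergeometric probability of drawing exactly draw_F Fascist cards in a 3-card draw."""
    total_cards = deck_F + deck_L
    if total_cards < 3 or draw_F < 0 or draw_F > 3:
        return 1e-6  # invalid input: return a tiny floor rather than 0
    # Ways to pick draw_F Fascist cards times ways to pick the remaining Liberal cards
    return comb(deck_F, draw_F) * comb(deck_L, 3 - draw_F) / comb(total_cards, 3)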

def likelihood_of_obs(obs, assignment, deck_F, deck_L, player_scores):
    pres_idx, chan_idx, pres_claim_draw, pres_claim_pass, chan_claim_got, enacted = obs
    pres_role = assignment[pres_idx]
    chan_role = assignment[chan_idx]

    # Calculate probability based on role behavior and card stats
    # ... (Full implementation in the repo)

    return deck_like * lie_like * enact_like
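
Since the snippet above elides the details, here is a stripped-down sketch of the same idea for intuition. It is not the repo's implementation: instead of sampling particles it enumerates every possible Fascist team (only 35 for 7 players), it looks only at the President's claim, it takes the deck counts as given, and it assumes a lying Fascist always claims three Fascist cards. All function names and the sample observations are hypothetical.

```python
from itertools import combinations
from math import comb

P_F_PRES_LIES = 0.7  # same value as the script's behavior model


def draw_prob(draw_f, deck_f, deck_l):
    """Hypergeometric chance of exactly draw_f Fascist cards in a 3-card draw."""
    return comb(deck_f, draw_f) * comb(deck_l, 3 - draw_f) / comb(deck_f + deck_l, 3)


def claim_likelihood(claim_f, is_fascist, deck_f, deck_l):
    """P(President claims claim_f Fascist cards | President's team)."""
    p_truth = draw_prob(claim_f, deck_f, deck_l)
    if not is_fascist:
        return p_truth  # Liberals report the real draw
    if claim_f == 3:
        # Either a true FFF draw, or a lie told after any other draw.
        return p_truth + (1 - p_truth) * P_F_PRES_LIES
    return p_truth * (1 - P_F_PRES_LIES)  # Fascist chose to tell the truth


def fascist_probabilities(n_players, n_fascists, observations):
    """Weight every possible Fascist team, then read off per-player marginals."""
    weights = {}
    for team in combinations(range(n_players), n_fascists):
        w = 1.0
        for pres, claim_f, deck_f, deck_l in observations:
            w *= claim_likelihood(claim_f, pres in team, deck_f, deck_l)
        weights[team] = w
    total = sum(weights.values())
    return [
        sum(w for team, w in weights.items() if p in team) / total
        for p in range(n_players)
    ]


if __name__ == "__main__":
    # Made-up observations: (president index, claimed Fascist cards, deck F, deck L)
    obs = [(0, 3, 11, 6), (1, 2, 8, 6), (0, 3, 6, 5)]
    for player, prob in enumerate(fascist_probabilities(7, 3, obs)):
        print(f"player {player}: {prob:.2f}")
```

In this toy run player 0, who claimed "FFF" twice, rises to the top of the ranking, while player 1, who admitted to a mixed draw, drops below the players who have not been President yet.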

Phase 2: The "Human" Problem

The math script was cool, but it had a flaw: It required me to have friends available to play.

I wanted to test strategies and play solo, but existing online bots for Secret Hitler are usually terrible. They play randomly and don't understand nuance. That’s when I thought: What if I built AI Agents with memory and personalities?

I designed a new system using OpenAI's API and Structured Outputs. I didn't just want them to vote; I wanted them to argue.

How the Agents Work

Memory: Just like a human, each agent has a memory buffer. They don't remember the whole game perfectly (that would be cheating), but they remember recent accusations, votes, and specific lies.
Backstory: To prevent boring discussions, I gave every agent a persona. Some are "data nerds," others are "aggressive accusers," and some are just "clueless vibe-checkers."
Structured Decision Making: Using Pydantic models, the AI doesn't just output text; it returns a specific action (vote, nominate, discard) along with its reasoning.
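
The memory and backstory ideas can be sketched in a few lines. This is an illustrative version, not the code from the repo: the `AgentMemory` class, its `max_events` limit, and the `build_prompt` helper are hypothetical names, and a bounded `deque` is just one simple way to get "remembers recent events, forgets old ones."

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class AgentMemory:
    """Keeps only the most recent events, so older ones are forgotten."""
    max_events: int = 5
    events: deque = field(init=False)

    def __post_init__(self):
        # A deque with maxlen silently drops the oldest entry when full.
        self.events = deque(maxlen=self.max_events)

    def remember(self, event: str) -> None:
        self.events.append(event)

    def as_prompt(self) -> str:
        return "\n".join(f"- {event}" for event in self.events)


def build_prompt(name: str, persona: str, memory: AgentMemory) -> str:
    """Combine the persona with what the agent still remembers."""
    return f"You are {name}, {persona}.\nWhat you remember:\n{memory.as_prompt()}"


if __name__ == "__main__":
    memory = AgentMemory(max_events=3)
    for turn in range(1, 6):
        memory.remember(f"Turn {turn}: someone accused someone")
    # Only turns 3, 4, and 5 survive in the prompt.
    print(build_prompt("Riley", "an over-analyzer", memory))
```

Capping the buffer also keeps each prompt short, which matters when every agent makes several API calls per round.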

The Turing Test

After fixing a mountain of bugs, I finally sat down to play. I was assigned Fascist, up against six AI agents.

I expected to steamroll them. I was wrong.

I was genuinely surprised by their coordination. They didn't just look at the cards; they looked at the votes. When I tried to defend myself, an agent named "Riley" (configured as an over-analyzer) pointed out a contradiction in my voting history from three turns ago. My fellow Fascist AI teammate stayed quiet, likely realizing that defending me would blow their cover.

They successfully identified me and froze me out of the government. It felt like playing with real people.

The Future of Tabletop Gaming

This experiment proved to me that we are on the verge of a new era in gaming. We aren't far from a future where AI fills in the missing seats at your board game night, indistinguishable from your human friends (except maybe they won't spill drinks on the cards).

If you want to try this yourself, the code is below. It connects to OpenAI, generates personalities, and manages the complex state of Secret Hitler automatically.

The Code

You can run this in your terminal. It handles the deck, the rules, and the AI players.

REPO : https://github.com/samarthmahendraneu/secret-hiter.ai

Python:

# secret_hitler_terminal_ai.py
# Simple terminal-based Secret Hitler experience with AI-driven players.

import json
import random
from dataclasses import dataclass, field
from typing import Dict, List, Optional
from openai import OpenAI
from pydantic import BaseModel, Field

# ---------- CONFIG ----------
# OpenAI() reads the OPENAI_API_KEY environment variable by default.
# Export it in your shell, or pass it explicitly: OpenAI(api_key="sk-...")
MODEL = "gpt-4o-mini"  # Recommended for speed/cost
client = OpenAI()

# ================== STRUCTURED OUTPUT MODELS ==================
class AIDecision(BaseModel):
    action: str
    target: Optional[str] = None
    choice: Optional[str] = None
    text: Optional[str] = None
    discard_index: Optional[int] = None
    enact_card: Optional[str] = None

class AIComment(BaseModel):
    action: str = "comment"
    text: str = Field(..., description="A short, natural comment in character")

# ================== GAME LOGIC ==================
@dataclass
class Player:
    name: str
    role: str  # "L", "F", or "H"
    persona: str
    memory: List[str] = field(default_factory=list)
    alive: bool = True

    def system_prompt(self) -> str:
        # Logic to inform the AI of its secret role and known information
        if self.role == "L":
            role_text = "You are a Liberal. Protect democracy."
            info = "You know no one else's role."
        elif self.role == "F":
            role_text = "You are a Fascist. Deceive the Liberals."
            info = "You know your teammates."
        else:
            role_text = "You are Hitler. Pretend to be a Liberal."
            info = "You do NOT know who the Fascists are."

        return f"You are {self.name}, {self.persona}. {role_text} {info}"

# ... (Rest of the game engine, deck management, and voting logic)

def ai_decide(player: Player, table: Table, action: str, context: str) -> AIDecision:
    """
    The core brain: Sends game state to LLM and requests a structured decision.
    """
    full_context = f"""
    {player.system_prompt()}
    GAME STATE: {table.game_state()}
    CURRENT ACTION: {action.upper()}
    CONTEXT: {context}

    Make a strategic decision based on your role.
    """

    resp = client.beta.chat.completions.parse(
        model=MODEL,
        messages=[{"role": "user", "content": full_context}],
        response_format=AIDecision,
    )
    return resp.choices[0].message.parsed

# ... (Main loop handles phases: Nomination -> Vote -> Legislative -> Executive Action)

if __name__ == "__main__":
    # Start the game
    print("Booting up AI Agents...")
    main()

Note: The full script includes the complete game loop, win conditions (policies vs. Hitler assassination), and memory management functions.
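
The win check is one of the easier pieces to picture. Here is a minimal sketch based on the standard Secret Hitler rules: Liberals win with five Liberal policies or by killing Hitler; Fascists win with six Fascist policies, or by getting Hitler elected Chancellor once three Fascist policies are on the board. The function name and arguments are hypothetical and may not match the full script.

```python
from typing import Optional

LIBERAL_POLICIES_TO_WIN = 5
FASCIST_POLICIES_TO_WIN = 6
HITLER_CHANCELLOR_THRESHOLD = 3  # Fascist policies needed before this win applies


def check_winner(
    liberal_policies: int,
    fascist_policies: int,
    hitler_alive: bool,
    hitler_elected_chancellor: bool,
) -> Optional[str]:
    """Return the winning team, or None if the game continues."""
    if liberal_policies >= LIBERAL_POLICIES_TO_WIN or not hitler_alive:
        return "Liberals"
    if fascist_policies >= FASCIST_POLICIES_TO_WIN:
        return "Fascists"
    if hitler_elected_chancellor and fascist_policies >= HITLER_CHANCELLOR_THRESHOLD:
        return "Fascists"
    return None


if __name__ == "__main__":
    # Hitler elected Chancellor with three Fascist policies enacted.
    print(check_winner(2, 3, hitler_alive=True, hitler_elected_chancellor=True))
```

Calling it after every enacted policy, successful election, and execution covers all four endings.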

This project started as a math curiosity and ended as a realization that AI agents are becoming incredibly capable social actors. Give the code a spin.
