# ArtoriaZero

A decoder-only transformer that plays chess through pure pattern recognition; no search, no MCTS.

> Inspired by "Grandmaster-Level Chess Without Search" (Ruoss et al., 2024). This is an independent implementation, not the original paper's model.
## Features

- **No Search**: A pure neural-network policy; no MCTS, no alpha-beta, no search tree.
- **Transformer Architecture**: LLaMA-style decoder with RMSNorm, SwiGLU, and bidirectional attention.
- **Behavioral Cloning**: Trained to imitate strong players directly from millions of chess games.
- **Dual Head**: A policy head for move prediction plus a value head for position evaluation.
- **Multiple Scales**: Small (19M), Mid (100M), and Large (500M) parameter variants.
- **Instant Inference**: A single forward pass per move; no thinking time, no depth limits.
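The search-free move loop above reduces to one argmax over the policy head's logits, restricted to legal moves. A minimal sketch of that step (the function and argument names here are illustrative, not the repo's actual API):

```python
import math

def select_move(policy_logits, legal_move_ids):
    """Pick a move with a single argmax over legal moves.

    policy_logits: sequence of logits over the full move vocabulary,
    as produced by the policy head in one forward pass.
    legal_move_ids: indices of moves legal in the current position.
    No search tree, no rollouts: the highest-logit legal move wins.
    """
    best_id, best_logit = None, -math.inf
    for move_id in legal_move_ids:
        if policy_logits[move_id] > best_logit:
            best_id, best_logit = move_id, policy_logits[move_id]
    return best_id

# Toy example: 5-move vocabulary, only moves 1 and 3 are legal.
# Move 2 has the highest raw logit but is illegal, so move 3 is chosen.
print(select_move([2.0, 0.5, 3.0, 1.5, -1.0], [1, 3]))  # -> 3
```

Masking to legal moves is what lets a raw imitation policy play full games: the network never needs to learn the rules perfectly, only to rank the legal options.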
## Model Variants
| Variant | d_model | Layers | Heads | Params |
|---|---|---|---|---|
| Small | 256 | 8 | 8 | ~19M |
| Mid | 512 | 16 | 8 | ~100M |
| Large | 1024 | 40 | 32 | ~500M |
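The table above can be captured as a small config registry; the sketch below is hypothetical (the repo's actual config names may differ) and shows the one structural invariant the numbers must satisfy: `d_model` divides evenly across attention heads.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelConfig:
    d_model: int
    n_layers: int
    n_heads: int

# Illustrative registry mirroring the Model Variants table.
VARIANTS = {
    "small": ModelConfig(d_model=256,  n_layers=8,  n_heads=8),
    "mid":   ModelConfig(d_model=512,  n_layers=16, n_heads=8),
    "large": ModelConfig(d_model=1024, n_layers=40, n_heads=32),
}

def head_dim(cfg: ModelConfig) -> int:
    # Per-head dimension; d_model must be a multiple of n_heads.
    assert cfg.d_model % cfg.n_heads == 0
    return cfg.d_model // cfg.n_heads

print(head_dim(VARIANTS["small"]))  # -> 32
print(head_dim(VARIANTS["mid"]))   # -> 64
```

Note that Small and Large share the same per-head dimension (32); Large scales by adding heads and layers rather than widening each head.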