I ran a language model on a PS2

The Emotion Engine has 32 MB of RAM total, so the trick is streaming weights from CD-ROM one matrix at a time during the forward pass: only activations, the KV cache, and embeddings live in RAM. This means models larger than RAM can still run; they just read more from disc. I had to build a custom quantized format (PSNT), work around endianness issues, write a tokenizer pipeline, and reimplement most of the PS2 SDK from scratch (releasing that separately). The model itself is also custom: a 10M-parameter Llama-style architecture I trained specifically for this. And it works. On real hardware.
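
As a rough illustration of the streaming idea, here is a minimal C sketch: a single fixed scratch buffer holds the current weight matrix, which is read from disc right before each matrix-vector product and then overwritten by the next layer's weights, so only activations stay resident. The function name and the plain-float, row-major layout are illustrative assumptions, not the project's actual code (the real weights are quantized PSNT streamed from CD-ROM).

```c
#include <stdio.h>

/* Sketch of per-matrix weight streaming (illustrative, not the real code).
 * `scratch` is a fixed buffer sized for the largest matrix; every layer
 * reuses it, so resident memory stays bounded regardless of model size.
 *
 * Computes y = W x, where W (rows x cols, row-major float) is read from
 * `f` into `scratch` immediately before the multiply. Returns 0 on
 * success, -1 on a short read (disc error). */
static int streamed_matvec(FILE *f, float *scratch,
                           int rows, int cols,
                           const float *x, float *y) {
    size_t n = (size_t)rows * (size_t)cols;
    if (fread(scratch, sizeof(float), n, f) != n)
        return -1;
    for (int r = 0; r < rows; r++) {
        float acc = 0.0f;
        for (int c = 0; c < cols; c++)
            acc += scratch[(size_t)r * cols + c] * x[c];
        y[r] = acc;
    }
    return 0;
}
```

The forward pass then becomes a sequence of such calls, one per weight matrix, with the file cursor advancing through the weights in layer order so the disc reads are sequential.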

  • LLM
  • Open source
  • Code generation

AI Summary

This project demonstrates running a 10M parameter Llama-style language model on a PlayStation 2 by streaming weights from the CD-ROM due to the console's limited RAM. It involved creating a custom quantized format, modifying the PS2 SDK, and developing a custom tokenizer.

Recommended for

Retro computing enthusiasts, embedded systems developers, and AI researchers interested in resource-constrained environments

Why it matters

Shows that modern AI models can run on severely limited hardware by streaming weights from disc and building custom software across the entire stack.

Key features

  • Runs a 10M parameter Llama-style language model on a PlayStation 2.
  • Streams model weights from CD-ROM to overcome the 32 MB RAM limitation.
  • Utilizes a custom quantized format (PSNT) for model weights.
  • Includes a custom tokenizer pipeline.
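
The post doesn't document the PSNT layout, so the sketch below only illustrates two of the ingredients it mentions, in generic form: a byte-swap helper of the kind an endianness fix needs, and dequantization under one common scheme (int8 weights with a per-matrix float scale). Both the layout and the names are assumptions, not the actual format.

```c
#include <stdint.h>

/* Byte-swap a 32-bit word for loading data whose endianness doesn't
 * match the host's (the kind of fix the post alludes to). */
static uint32_t swap32(uint32_t v) {
    return (v >> 24)
         | ((v >> 8) & 0x0000ff00u)
         | ((v << 8) & 0x00ff0000u)
         | (v << 24);
}

/* Dequantize n int8 weights into floats as w = q * scale. A per-matrix
 * scale is one common quantization scheme; PSNT's real scheme may differ. */
static void dequant_q8(const int8_t *q, float scale, int n, float *out) {
    for (int i = 0; i < n; i++)
        out[i] = (float)q[i] * scale;
}
```

In a streaming setup like this one, dequantization would happen on the fly as each matrix is read from disc, so full-precision weights never need to exist in RAM all at once.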

Use cases

  • A retro-computing enthusiast could use this to experiment with AI on vintage hardware, showcasing the capabilities of older systems for modern tasks.
  • A game developer specializing in retro-style games might integrate this LLM into a PlayStation 2 title to generate dynamic in-game dialogue or lore, adding a unique interactive element.
  • A student learning about AI model optimization could study the techniques used to quantize and stream model weights, applying these principles to resource-constrained embedded systems.