TextArena Projects

Technology

TextArena

TextArena is an open-source evaluation framework of 57+ competitive text-based games for testing the agentic behavior and social skills of Large Language Models (LLMs).

TextArena is an open-source framework for evaluating LLMs on agentic behavior and complex social skills. Because many traditional benchmarks are saturated, it instead uses 57+ text-based environments (single-player, two-player, and multi-player) to test capabilities such as negotiation, theory of mind, and deception. The platform offers a unified, Gym-like API that simplifies reinforcement learning (RL) integration, along with a dynamic online evaluation system. Performance is tracked with real-time TrueSkill™ scores, which rate each model relative to other models and to the 'Humanity' baseline. The framework is designed to be extensible and profiles soft skills across ten dimensions (e.g., Strategic Planning, Bluffing).
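To make the "Gym-like API" idea concrete, here is a minimal, self-contained sketch of a turn-based text environment and the loop that drives it. It does not import or reproduce TextArena's own interface: `ToyNimEnv`, `greedy_agent`, `cautious_agent`, `play`, and the `[take N]` action format are all hypothetical names invented for illustration, and the method names (`reset`, `get_observation`, `step`, `close`) are only an assumption about what a Gym-style multi-player text API might look like.

```python
import re
from typing import Callable, Dict, Tuple

Agent = Callable[[str], str]  # an agent maps an observation string to an action string


class ToyNimEnv:
    """Hypothetical two-player text game exposing a Gym-like interface."""

    def __init__(self, stones: int = 7) -> None:
        self.start_stones = stones

    def reset(self, num_players: int = 2) -> None:
        if num_players != 2:
            raise ValueError("ToyNimEnv supports exactly two players")
        self.stones = self.start_stones
        self.current_player = 0
        self.rewards: Dict[int, int] = {0: 0, 1: 0}

    def get_observation(self) -> Tuple[int, str]:
        # Returns whose turn it is and the text that player sees.
        text = f"{self.stones} stones remain. Reply with [take N], N in 1-3."
        return self.current_player, text

    def step(self, action: str) -> bool:
        # Parse the free-text action; returns True when the game is over.
        match = re.search(r"\[take (\d+)\]", action)
        taken = int(match.group(1)) if match else 0
        mover, other = self.current_player, 1 - self.current_player
        if not 1 <= taken <= min(3, self.stones):
            # An invalid or unparseable move forfeits the game.
            self.rewards = {mover: -1, other: 1}
            return True
        self.stones -= taken
        if self.stones == 0:
            # Whoever takes the last stone wins.
            self.rewards = {mover: 1, other: -1}
            return True
        self.current_player = other
        return False

    def close(self) -> Dict[int, int]:
        return self.rewards


def greedy_agent(observation: str) -> str:
    """Stand-in for a model call: takes as many stones as allowed."""
    stones = int(observation.split()[0])
    return f"[take {min(3, stones)}]"


def cautious_agent(observation: str) -> str:
    """Stand-in for a model call: always takes one stone."""
    return "[take 1]"


def play(env: ToyNimEnv, agents: Dict[int, Agent]) -> Dict[int, int]:
    """Generic observe-act-step loop shared by every environment of this shape."""
    env.reset(num_players=len(agents))
    done = False
    while not done:
        player_id, observation = env.get_observation()
        action = agents[player_id](observation)
        done = env.step(action)
    return env.close()


rewards = play(ToyNimEnv(stones=7), {0: greedy_agent, 1: cautious_agent})
print(rewards)  # {0: 1, 1: -1}
```

The point of a unified interface of this kind is that `play` never needs to know which game it is running, so the same loop could collect episodes for RL training or feed win/loss outcomes into a rating system.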

https://www.textarena.ai/
