WebLLM Projects

WebLLM

WebLLM is the high-performance, open-source LLM inference engine that runs large language models directly in your browser with WebGPU acceleration.

WebLLM is a high-performance, open-source inference engine from the MLC-AI team, designed to run Large Language Models (LLMs) entirely within the client's web browser. It uses WebGPU for hardware acceleration and WebAssembly (WASM) for efficient CPU computation, delivering near-native performance without a backend server. Because inference happens locally, user data never leaves the device and there are no cloud API costs, while the engine remains compatible with the OpenAI API, including streaming and JSON-mode generation.
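To make the OpenAI-compatible, in-browser workflow concrete, here is a minimal sketch of loading a model and streaming a chat completion with WebLLM. It assumes the `@mlc-ai/web-llm` npm package, a WebGPU-capable browser, and uses one of the prebuilt MLC model IDs as an example; consult the project page above for the current model list and exact API details.

```typescript
import { CreateMLCEngine } from "@mlc-ai/web-llm";

async function demo() {
  // Download and initialize a model in the browser (cached after first load).
  // The model ID below is an example; pick any ID from the prebuilt model list.
  const engine = await CreateMLCEngine("Llama-3.1-8B-Instruct-q4f32_1-MLC", {
    initProgressCallback: (progress) => console.log(progress.text),
  });

  // OpenAI-style chat completion with streaming enabled.
  const stream = await engine.chat.completions.create({
    messages: [{ role: "user", content: "Explain WebGPU in one sentence." }],
    stream: true,
  });

  // Consume streamed deltas as they arrive, just as with the OpenAI SDK.
  for await (const chunk of stream) {
    console.log(chunk.choices[0]?.delta?.content ?? "");
  }
}

demo();
```

Because the API mirrors OpenAI's, existing client code that targets `chat.completions.create` can often be pointed at WebLLM with minimal changes.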

https://mlc.ai/web-llm
