LLaMA-65B
Meta AI’s 65-billion-parameter foundation model, the largest of a family whose smaller variants outperform GPT-3 while running on a single GPU.
LLaMA-65B is the flagship variant of Meta’s Large Language Model Meta AI (LLaMA) family, trained on roughly 1.4 trillion tokens drawn from publicly available sources such as CommonCrawl and GitHub. It uses a standard transformer architecture with a few targeted modifications, including rotary positional embeddings (RoPE), SwiGLU activation functions, and RMSNorm pre-normalization, and is competitive with DeepMind’s Chinchilla-70B and Google’s PaLM-540B on most benchmarks. By releasing the weights under a research license, Meta enabled the open-source community to build high-performance inference tools such as llama.cpp, which run the model on a single node (e.g. 8x A100 GPUs) or, after quantization, on consumer hardware: at 4-bit precision the 65B weights shrink from roughly 130 GB in fp16 to about 33 GB.
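To make the two named modifications concrete, below is a minimal, self-contained PyTorch sketch of a SwiGLU feed-forward step and rotary positional embeddings. It is illustrative rather than a reproduction of Meta’s reference code: the function names and tensor shapes here are assumptions, and this RoPE uses the “rotate-half” formulation, whereas LLaMA’s implementation rotates interleaved coordinate pairs (equivalent up to a fixed permutation of dimensions).

import torch
import torch.nn.functional as F

def swiglu_ffn(x, w_gate, w_up, w_down):
    # SwiGLU feed-forward block: down( silu(x @ w_gate) * (x @ w_up) ).
    # The SiLU-activated path gates the "up" projection element-wise.
    return (F.silu(x @ w_gate) * (x @ w_up)) @ w_down

def apply_rope(x, base=10000.0):
    # x: (seq_len, n_heads, head_dim). Rotates pairs of feature dimensions
    # by an angle proportional to token position, so relative position is
    # encoded directly in the query/key vectors ("rotate-half" variant).
    seq_len, _, head_dim = x.shape
    half = head_dim // 2
    inv_freq = base ** (-torch.arange(half, dtype=torch.float32) / half)      # (half,)
    angles = torch.arange(seq_len, dtype=torch.float32)[:, None] * inv_freq   # (seq_len, half)
    cos = angles.cos()[:, None, :]  # broadcast over the heads dimension
    sin = angles.sin()[:, None, :]
    x1, x2 = x[..., :half], x[..., half:]
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)

# Tiny illustrative shapes; LLaMA-65B itself uses hidden size 8192,
# FFN size 22016, and 64 attention heads of dimension 128.
q = torch.randn(16, 4, 32)   # (seq_len, n_heads, head_dim)
q_rot = apply_rope(q)        # same shape, with position baked in
h = torch.randn(16, 64)      # hidden states entering the FFN
out = swiglu_ffn(h, torch.randn(64, 172), torch.randn(64, 172), torch.randn(172, 64))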