Technology
PyTorch/XLA
Enables PyTorch models to run efficiently on XLA devices (Google TPUs, etc.) using the XLA deep learning compiler.
PyTorch/XLA is a Python package that connects PyTorch to the XLA (Accelerated Linear Algebra) deep learning compiler, providing a high-performance backend for PyTorch. This integration lets users run standard PyTorch models on XLA devices, primarily Google Cloud TPUs, with minimal code modification. The system uses lazy tensor tracing: it records PyTorch operations into an intermediate representation (IR) graph, lowers that graph to HLO (High Level Operations, XLA's input format), and hands it to the XLA compiler for optimization, compilation, and execution on the accelerator. Because XLA optimizes the whole graph at once, this approach can achieve high Model FLOPs Utilization (MFU) and supports cost-efficient scaling across thousands of TPU cores for large-scale training and inference jobs.
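The lazy tensor tracing described above can be illustrated with a minimal, self-contained sketch. The `LazyTensor` class and `materialize` function below are hypothetical stand-ins, not the real torch_xla API: the point is only that arithmetic operators record nodes into an IR graph, and actual computation is deferred until a result is needed, which is when a real system would compile and execute the graph.

```python
# Conceptual sketch of lazy tensor tracing (hypothetical classes, not the
# torch_xla API). Operators build an IR graph; nothing executes until
# materialize() is called, mirroring how PyTorch/XLA defers work to the
# XLA compiler.

class LazyTensor:
    """Records operations into a graph instead of executing them eagerly."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op          # operation name, e.g. "add", "mul", "const"
        self.inputs = inputs  # upstream LazyTensor nodes
        self.value = value    # concrete payload, only for "const" leaves

    def __add__(self, other):
        return LazyTensor("add", (self, other))

    def __mul__(self, other):
        return LazyTensor("mul", (self, other))


def constant(v):
    """Wrap a plain number as a leaf node of the IR graph."""
    return LazyTensor("const", value=v)


def materialize(t):
    """Walk the recorded graph and execute it.

    In PyTorch/XLA this is the point where the traced graph would be
    lowered to HLO, compiled by XLA, and run on the accelerator.
    """
    if t.op == "const":
        return t.value
    args = [materialize(i) for i in t.inputs]
    if t.op == "add":
        return args[0] + args[1]
    if t.op == "mul":
        return args[0] * args[1]
    raise ValueError(f"unknown op: {t.op}")


x = constant(2)
y = constant(3)
z = x * y + x            # no arithmetic has run yet; z is just a graph node
print(z.op)              # → add
print(materialize(z))    # graph executed only here → 8
```

In the real system the analogous trigger is a step boundary (e.g. `xm.mark_step()` in torch_xla), at which point the accumulated graph is compiled once and the compiled executable is reused on later steps with the same shape.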