uPyTorch: PyTorch on a microcontroller with 520KB RAM

Built MicroPython bindings for PyTorch's C++ tensor kernels, targeting the ESP32, which has just 520KB of SRAM.
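
To put the 520KB figure in perspective, here is a rough back-of-envelope sketch of how much tensor data could fit at all. It assumes float32 storage and 1KB = 1024 bytes; the names (`SRAM_BYTES`, `max_float32_elements`) are illustrative and not part of the project. The real headroom is smaller, since the firmware, the MicroPython heap, and the stack share the same SRAM.

```python
# Back-of-envelope memory budget. All names are illustrative assumptions,
# not taken from the upytorch code base.
from math import isqrt

SRAM_BYTES = 520 * 1024   # 520KB from the post, assuming 1KB = 1024 bytes
BYTES_PER_FLOAT32 = 4     # assumes tensors are stored as float32


def max_float32_elements(budget_bytes: int) -> int:
    """Upper bound on the number of float32 elements that fit in budget_bytes."""
    return budget_bytes // BYTES_PER_FLOAT32


max_elems = max_float32_elements(SRAM_BYTES)
max_square_side = isqrt(max_elems)  # largest n such that an n x n matrix fits

print(max_elems)        # 133120
print(max_square_side)  # 364
```

So even with every byte available, a single square float32 matrix tops out around 364 x 364, which is why dropping everything except the tensor math matters.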

The core idea: strip PyTorch down to its bare tensor operations, cross-compile for Xtensa, and expose them through MicroPython's C API. No autograd, no JIT, no CUDA -- just the math.

    import torch

    x = torch.tensor([1.0, 2.0, 3.0])
    y = x * 2 + 1
    print(y)  # tensor([3.0, 5.0, 7.0])

This runs on hardware that costs $4.

github.com/ljk53/upytorch