Articles by Tag #onnxruntime

Using an AI Coding Agent to Ship 2-Bit Quantization for WebGPU

How a developer paired with an AI agent to find and fix five layered bugs in ONNX Runtime's GPU...

Feb 11

Bringing 2-Bit Quantization to ONNX Runtime's WebGPU Backend

A story of five bugs, bit-level debugging, and running transformer models at 2-bit precision in the...

Feb 11

TFLite vs ONNX Runtime: Pi Zero Latency at 32ms vs 89ms

ONNX Runtime is 2.8x faster than TFLite on Raspberry Pi Zero — and I didn't expect...

Mar 1

ONNX Runtime vs TFLite Android: 3x Speed Benchmark

The Benchmark That Made Me Question Everything: TFLite was supposed to be the gold standard...

Feb 14

ONNX Runtime Mobile: 8ms Inference on iPhone 13

The 200ms Problem: Your edge AI model works great on a dev server. 30ms inference, low...

Feb 11