The Install That Finally Broke My Patience
I didn’t go looking for the fastest way to install PyTorch using uv out of curiosity.
I searched for it at 1:47 AM, staring at a terminal that had been “Installing…” for far too long.
I wasn’t compiling CUDA from source.
I wasn’t training a large model.
I just wanted PyTorch installed so I could run an experiment.
By 2:15 AM, I had a different setup — and a very clear conclusion.
I’d already been using uv while managing Python version upgrades—especially when testing Python 3.13 across multiple environments—which made it the obvious thing to try here too.
Why Installing PyTorch Is Still Slow in 2026
PyTorch isn’t a normal Python package.
It’s heavy by design:
- Large binary wheels
- CPU and multiple CUDA variants
- Platform-specific builds
- Optional dependencies that multiply install time
Traditional Python tooling isn’t broken — it’s just not optimized for this scale.
That’s why installs feel painful in CI, Docker, and fresh machines.
Prerequisites
Before using uv to install PyTorch, make sure you have:
- Python 3.9+ (3.10+ recommended; recent PyTorch releases no longer support 3.8)
- uv installed
- Basic understanding of CPU vs CUDA PyTorch builds
To install uv:
pip install uv
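If you'd rather not bootstrap uv through pip, Astral also publishes a standalone installer script; either route works, so use whichever fits your setup:
curl -LsSf https://astral.sh/uv/install.sh | sh
Either way, confirm uv is on your PATH with uv --version before continuing.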
The pip Workflow I Used for Years
For a long time, this was my default:
pip install torch torchvision torchaudio
Sometimes it worked quickly.
Most times it didn’t.
In CI and Docker, it often took 5–7 minutes, especially with cold caches.
That’s when I stopped blaming PyTorch — and started questioning the installer.
Why uv Makes a Real Difference
uv doesn’t change PyTorch.
It changes how dependencies are installed:
- Faster dependency resolution
- Parallel wheel downloads
- Aggressive but safe caching
- Less Python-level overhead
Those differences matter when wheels are large.
The Fastest Way to Install PyTorch Using uv (CPU-Only)
For CPU-only PyTorch, this is the fastest, cleanest method I’ve found:
uv pip install torch torchvision torchaudio
On clean GitHub Actions runners, this typically finishes in ~60–90 seconds, compared to ~5 minutes with pip.
Timing context: measured with PyTorch CPU builds on GitHub Actions ubuntu-latest runners, standard networking, and a warm CDN cache.
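One caveat worth knowing: on Linux, the default PyPI wheels for torch typically pull in CUDA runtime packages as dependencies, which inflates the download even if you never touch a GPU. If you want a strictly CPU-only environment, point uv at PyTorch's CPU wheel index (same command, one extra flag):
uv pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu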
Installing CUDA PyTorch (The Right Way)
Before installing CUDA builds, verify your CUDA driver version:
nvidia-smi
You’ll see output like:
CUDA Version: 12.1
Then install the matching wheel index.
Example for CUDA 12.1:
uv pip install torch torchvision torchaudio \
--index-url https://download.pytorch.org/whl/cu121
⚠️ Installing the wrong CUDA variant will either:
- Fail at import time, or
- Silently fall back to CPU
Always match the wheel index to your installed driver.
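A quick sanity check catches both failure modes. This one-liner simply prints the installed version and whether the CUDA runtime can actually see a GPU:
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
If it prints False on a machine with a working driver, you almost certainly installed the wrong wheel variant.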
Version Pinning (Don’t Skip This)
Speed does not replace correctness.
If you care about reproducibility, pin versions explicitly, and always verify the current releases on pytorch.org before pinning.
Example:
uv pip install torch==X.Y.Z torchvision==A.B.C
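If you keep your pins in a requirements file, uv can resolve and lock them for you. A minimal sketch, assuming a hypothetical requirements.in that lists your top-level dependencies:
uv pip compile requirements.in -o requirements.txt
uv pip sync requirements.txt
compile writes fully pinned versions, and sync makes the environment match that lock file exactly.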
Docker Example (uv + PyTorch)
RUN uv pip install --system torch torchvision torchaudio
This single change consistently reduced Docker build times for ML services in CI.
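For context, here is a minimal Dockerfile sketch around that line. The base image, the CPU wheel index, and the app layout are assumptions for illustration, not a drop-in for your service:
FROM python:3.12-slim
# uv itself is tiny; installing it via pip keeps the image simple
RUN pip install uv
# --system: install into the image's Python, no virtualenv needed
RUN uv pip install --system torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
WORKDIR /app
COPY . /app
The --system flag matters here: there is no virtual environment inside the container, so uv installs straight into the image's interpreter.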
Using uv with pyproject.toml
If your project already uses pyproject.toml:
[project]
dependencies = [
    "torch>=2.0",
    "torchvision",
]
Install everything with:
uv pip install --system .
No extra tooling.
No Poetry runtime.
No pip slowdown.
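For local development, the same project installs just as cleanly into a fresh virtual environment. A minimal sketch (uv venv creates a .venv in the current directory):
uv venv
source .venv/bin/activate
uv pip install .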
pip vs uv: Real-World Comparison
| Method | CPU Install Time | CUDA Install Time | Cache Handling |
|---|---|---|---|
| pip | ~5m 30s | ~7m | Basic |
| uv | ~1m 15s | ~1m 45s | Advanced |
Context: Medium project (~45 dependencies), GitHub Actions runners, CPU and CUDA builds.
Where This Really Pays Off: CI
In CI pipelines:
- pip installs frequently pushed jobs toward timeout limits
- uv installs stayed predictable and fast
That reliability matters more than raw speed.
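As a rough sketch, the relevant GitHub Actions steps can be as small as this (the step names and the CPU index are my assumptions, not a prescribed pipeline):
- name: Install uv
  run: pip install uv
- name: Install PyTorch (CPU)
  run: uv pip install --system torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu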
Migration Journey (Low Risk)
Phase 1: Local Test
Behavior was identical — just faster.
Phase 2: CI Trial
Build times dropped immediately.
Phase 3: Docker Adoption
No regressions.
Common Gotchas (And Fixes)
- Cache growing too large?
  Clean it safely (see the size check after this list):
  uv cache clean
- "No matching distribution found"?
  Check your Python version and platform.
- CUDA import errors?
  Verify that the nvidia-smi output and the wheel index match.
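Related to the cache item above: if you want to see how much space the cache is using before you clean it, uv cache dir prints the cache location, so a quick size check looks like this:
du -sh "$(uv cache dir)"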
Troubleshooting Quick Wins
- Don’t mix pip and uv in the same environment
- Always match CUDA wheels to your driver version
- Use --system for Docker and CI
- Pin versions in production workloads
A Real-World Use Case
This wasn’t a toy experiment or a benchmark I ran once and forgot. It was a FastAPI inference service with around fifty dependencies, built into a container, deployed to Kubernetes, and rebuilt on every pull request. Each CI run had to install PyTorch from scratch, and that step alone used to dominate the pipeline.
Builds regularly stretched past eight minutes, not because the code was slow, but because the installer was. The application itself was stable, the model was unchanged, and the infrastructure wasn’t the problem. The waiting was.
After switching that single install step to uv, the entire dynamic changed. The same service built in roughly two minutes, CI jobs stopped flirting with timeout limits, and the pipeline finally behaved predictably again. Nothing else in the stack changed — only the way PyTorch was installed — and that’s when it stopped feeling like an optimization and started feeling like basic reliability.
When uv Might Not Matter Much
If you:
- Install PyTorch once a year
- Work on long-lived machines
- Don’t use CI or Docker
Then uv won’t feel dramatic.
Final Thoughts
The fastest way to install PyTorch using uv isn’t about chasing benchmarks.
It’s about removing friction from the most boring, repeated step in ML workflows.
uv didn’t change PyTorch.
It changed how long I wait to start working.
And once you experience that difference, it’s very hard to go back.
