Most teams start distributing model artifacts through S3 buckets, shared NFS mounts, or ad-hoc scripts that copy files between machines. It works until it doesn’t. You lose track of versions, caching is nonexistent, and every deployment environment needs its own access pattern.
ORAS (OCI Registry As Storage) solves this by treating container registries as general-purpose artifact stores. You push model files the same way you push Docker images – with tags, digests, and layer caching – but without wrapping anything in a container. Your existing registry infrastructure (Docker Hub, GitHub Container Registry, AWS ECR) becomes your model distribution layer.
Why ORAS for Model Artifacts
Container registries already handle the hard parts of artifact distribution: content-addressable storage, deduplication, geo-replication, access control, and pull-through caching. ORAS lets you store arbitrary files in these registries using the OCI artifact spec.
The practical benefits:
- Versioning with tags and digests – tag a model v1.0, latest, or by commit SHA, and always get reproducible pulls via digest
- Layer deduplication – if your tokenizer and config stay the same between model versions, the registry only stores them once
- Existing infrastructure – if you already run a registry for containers, you don’t need a separate system for models
- Pull-through caching – registries like Harbor can cache pulls, so your GPU nodes don’t all hammer the same origin
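The deduplication point follows from content addressing: a registry identifies each blob by the SHA-256 digest of its bytes, so an unchanged file maps to the same blob across versions. Here is a minimal sketch of that idea using only the standard library. The helper name oci_digest and the sample file contents are illustrative and not part of ORAS:

```python
import hashlib
import tempfile
from pathlib import Path


def oci_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the OCI-style digest string (sha256:<hex>) of a file's bytes."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        # Read in chunks so multi-gigabyte weight files never sit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return f"sha256:{h.hexdigest()}"


with tempfile.TemporaryDirectory() as tmp:
    v1, v2 = Path(tmp, "v1"), Path(tmp, "v2")
    v1.mkdir()
    v2.mkdir()
    # Same tokenizer in both versions, different (fake) weights.
    (v1 / "tokenizer.json").write_text('{"vocab": ["a", "b"]}')
    (v2 / "tokenizer.json").write_text('{"vocab": ["a", "b"]}')
    (v1 / "model.safetensors").write_bytes(b"weights-v1")
    (v2 / "model.safetensors").write_bytes(b"weights-v2")

    same_tokenizer = oci_digest(v1 / "tokenizer.json") == oci_digest(v2 / "tokenizer.json")
    same_weights = oci_digest(v1 / "model.safetensors") == oci_digest(v2 / "model.safetensors")

print(f"tokenizer shared: {same_tokenizer}, weights shared: {same_weights}")
```

Identical bytes give identical digests, which is why only the changed weights need new storage.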
Installing the ORAS CLI
Grab a release for your platform (v1.2.2 is shown here; check the ORAS releases page for the current version):

```bash
# Linux (amd64)
curl -LO https://github.com/oras-project/oras/releases/download/v1.2.2/oras_1.2.2_linux_amd64.tar.gz
tar -xzf oras_1.2.2_linux_amd64.tar.gz
sudo mv oras /usr/local/bin/

# macOS (Homebrew)
brew install oras

# Verify installation
oras version
```
Pushing Model Artifacts
Say you have a trained sentiment model with three files: the weights, a config, and a tokenizer. Push all of them as a single artifact:
```bash
oras push ghcr.io/myorg/models/sentiment:v1.0 \
  ./model.safetensors:application/vnd.safetensors \
  ./config.json:application/json \
  ./tokenizer.json:application/json
```
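Each file becomes its own layer with its own media type and digest. This matters because it makes selective retrieval possible: oras pull downloads every layer by default, but a client that only needs the config to check hyperparameters can read the manifest and fetch that one blob by digest (for example with oras blob fetch) instead of downloading the 2 GB weights file.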
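To make the per-layer structure concrete, here is a small sketch that looks up one file's layer descriptor in a manifest. It assumes the manifest was produced by oras push, which records each file name in the org.opencontainers.image.title annotation. The function name find_layer and the sample manifest (with placeholder digests) are hypothetical:

```python
TITLE_ANNOTATION = "org.opencontainers.image.title"


def find_layer(manifest: dict, filename: str) -> dict:
    """Return the layer descriptor whose title annotation matches filename."""
    for layer in manifest.get("layers", []):
        annotations = layer.get("annotations") or {}
        if annotations.get(TITLE_ANNOTATION) == filename:
            return layer
    raise KeyError(f"no layer titled {filename!r} in manifest")


# Hypothetical manifest, trimmed to the fields used here; digests are placeholders.
sample_manifest = {
    "schemaVersion": 2,
    "layers": [
        {
            "mediaType": "application/vnd.safetensors",
            "digest": "sha256:" + "a" * 64,
            "size": 2_000_000_000,
            "annotations": {TITLE_ANNOTATION: "model.safetensors"},
        },
        {
            "mediaType": "application/json",
            "digest": "sha256:" + "b" * 64,
            "size": 512,
            "annotations": {TITLE_ANNOTATION: "config.json"},
        },
    ],
}

config_layer = find_layer(sample_manifest, "config.json")
print(config_layer["digest"], config_layer["size"])
```

The resulting digest can then be used in a digest reference such as ghcr.io/myorg/models/sentiment@sha256:... to fetch only that blob. Check oras blob fetch --help for the exact flags in your version.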
Pull everything back:
```bash
oras pull ghcr.io/myorg/models/sentiment:v1.0 --output ./model-artifacts/
```
Tag an existing artifact with a new tag without re-uploading:
```bash
oras tag ghcr.io/myorg/models/sentiment:v1.0 latest
oras tag ghcr.io/myorg/models/sentiment:v1.0 production
```
Inspect what’s inside an artifact before pulling:
```bash
oras manifest fetch ghcr.io/myorg/models/sentiment:v1.0 | python3 -m json.tool
```
Authenticating with GitHub Container Registry
Before pushing to ghcr.io, authenticate with a personal access token that has write:packages scope:
```bash
echo "$GITHUB_TOKEN" | oras login ghcr.io --username "$GITHUB_USERNAME" --password-stdin
```
For AWS ECR, get a temporary token:
```bash
# 123456789012 is a placeholder; AWS account IDs are 12 digits
aws ecr get-login-password --region us-east-1 | \
  oras login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com
```
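In scripts, the ECR registry hostname is usually assembled from the account ID and region rather than hard-coded. Here is a small sketch of that, assuming a standard commercial AWS region (other partitions use a different domain suffix). The helper name ecr_registry_host is hypothetical:

```python
import re

_ACCOUNT_ID_RE = re.compile(r"\d{12}")


def ecr_registry_host(account_id: str, region: str) -> str:
    """Build the private ECR registry hostname for an account and region."""
    if not _ACCOUNT_ID_RE.fullmatch(account_id):
        raise ValueError("AWS account IDs are exactly 12 digits")
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com"


# 123456789012 is a placeholder account ID, not a real account.
host = ecr_registry_host("123456789012", "us-east-1")
print(host)
```

Validating the ID up front catches a truncated or mistyped account number before it surfaces as a confusing login failure.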
Python Wrapper for CI/CD
Calling oras from Python via subprocess keeps pipeline integration simple: it adds no dependencies beyond the CLI itself. Here’s a wrapper that handles login, pushing, pulling, tagging, and manifest inspection, and raises an error carrying the CLI’s stderr when a command fails:
```python
import json
import os
import subprocess
from typing import Optional


class OrasClient:
    """Thin wrapper around the oras CLI for use in CI/CD pipelines."""

    def __init__(self, registry: str = "ghcr.io"):
        self.registry = registry

    def _run(self, args: list[str], stdin: Optional[str] = None) -> subprocess.CompletedProcess:
        """Run an oras subcommand and raise with its stderr if it fails."""
        result = subprocess.run(
            ["oras"] + args,
            input=stdin,
            capture_output=True,
            text=True,
        )
        if result.returncode != 0:
            raise RuntimeError(f"oras command failed: {result.stderr.strip()}")
        return result

    def login(self, username: str, password: str) -> None:
        # Pass the password on stdin so it never appears in the process list.
        self._run(
            ["login", self.registry, "--username", username, "--password-stdin"],
            stdin=password,
        )

    def push(self, repo: str, tag: str, files: dict[str, str]) -> str:
        """Push files with media types. files = {relative_path: media_type}."""
        ref = f"{self.registry}/{repo}:{tag}"
        file_args = [f"{path}:{media}" for path, media in files.items()]
        result = self._run(["push", ref] + file_args)
        print(f"Pushed {ref}")
        return result.stdout.strip()

    def pull(self, repo: str, tag: str, output_dir: str = ".") -> None:
        ref = f"{self.registry}/{repo}:{tag}"
        os.makedirs(output_dir, exist_ok=True)
        self._run(["pull", ref, "--output", output_dir])
        print(f"Pulled {ref} to {output_dir}")

    def tag(self, repo: str, existing_tag: str, new_tag: str) -> None:
        ref = f"{self.registry}/{repo}:{existing_tag}"
        self._run(["tag", ref, new_tag])
        print(f"Tagged {ref} as {new_tag}")

    def manifest(self, repo: str, tag: str) -> dict:
        ref = f"{self.registry}/{repo}:{tag}"
        result = self._run(["manifest", "fetch", ref])
        return json.loads(result.stdout)


if __name__ == "__main__":
    # Usage
    client = OrasClient(registry="ghcr.io")
    client.login(os.environ["GITHUB_USERNAME"], os.environ["GITHUB_TOKEN"])

    # Push model artifacts
    client.push(
        repo="myorg/models/sentiment",
        tag="v1.0",
        files={
            "model.safetensors": "application/vnd.safetensors",
            "config.json": "application/json",
            "tokenizer.json": "application/json",
        },
    )

    # Promote to production
    client.tag("myorg/models/sentiment", "v1.0", "production")

    # Pull on a serving node
    client.pull("myorg/models/sentiment", tag="production", output_dir="./models/sentiment")
```
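Since push returns the CLI's stdout, a pipeline can record the pushed digest and deploy by digest instead of by mutable tag. The sketch below assumes the output contains a line of the form "Digest: sha256:<hex>". Treat that format as an assumption and verify it against your ORAS version. The helper names and the sample output are hypothetical:

```python
import re
from typing import Optional

_DIGEST_RE = re.compile(r"^Digest:\s*(sha256:[0-9a-f]{64})\s*$", re.MULTILINE)


def extract_digest(push_output: str) -> Optional[str]:
    """Return the sha256 digest from oras push output, or None if absent."""
    match = _DIGEST_RE.search(push_output)
    return match.group(1) if match else None


def pinned_ref(registry: str, repo: str, digest: str) -> str:
    """Build an immutable reference of the form registry/repo@digest."""
    return f"{registry}/{repo}@{digest}"


# Hypothetical output for illustration; the digest is a placeholder.
sample_output = (
    "Pushed [registry] ghcr.io/myorg/models/sentiment:v1.0\n"
    "Digest: sha256:" + "0123456789abcdef" * 4 + "\n"
)
digest = extract_digest(sample_output)
print(pinned_ref("ghcr.io", "myorg/models/sentiment", digest))
```

Storing the pinned reference alongside evaluation results makes it possible to pull exactly the bytes that were evaluated, even after tags move.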
GitHub Actions Integration
Here’s a workflow that pushes a model artifact to GHCR after training completes:
```yaml
# .github/workflows/push-model.yml
name: Push Model Artifact

on:
  workflow_dispatch:
    inputs:
      model_tag:
        description: "Model version tag (e.g., v1.0)"
        required: true

permissions:
  contents: read   # needed by actions/checkout once permissions are set explicitly
  packages: write

jobs:
  push-model:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install ORAS
        run: |
          curl -LO https://github.com/oras-project/oras/releases/download/v1.2.2/oras_1.2.2_linux_amd64.tar.gz
          tar -xzf oras_1.2.2_linux_amd64.tar.gz
          sudo mv oras /usr/local/bin/

      - name: Download trained model
        run: |
          # Replace with your actual model download step
          aws s3 cp s3://my-training-bucket/runs/latest/ ./artifacts/ --recursive

      - name: Login to GHCR
        run: echo "${{ secrets.GITHUB_TOKEN }}" | oras login ghcr.io --username ${{ github.actor }} --password-stdin

      - name: Push model artifact
        run: |
          cd artifacts
          oras push ghcr.io/${{ github.repository_owner }}/models/sentiment:${{ inputs.model_tag }} \
            ./model.safetensors:application/vnd.safetensors \
            ./config.json:application/json \
            ./tokenizer.json:application/json

      - name: Tag as latest
        run: |
          oras tag ghcr.io/${{ github.repository_owner }}/models/sentiment:${{ inputs.model_tag }} latest
```
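A pipeline like this fails late if the download step produced an incomplete directory. One option is a preflight check before the push step. The sketch below is an illustration rather than part of the workflow above: the function name missing_artifacts is hypothetical, and the file list mirrors the three files used in this article.

```python
import tempfile
from pathlib import Path

REQUIRED_FILES = ("model.safetensors", "config.json", "tokenizer.json")


def missing_artifacts(directory: str, required: tuple[str, ...] = REQUIRED_FILES) -> list[str]:
    """Return the required file names that are absent from directory."""
    root = Path(directory)
    return [name for name in required if not (root / name).is_file()]


# Demo against a temporary directory that only contains the config.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "config.json").write_text("{}")
    demo_missing = missing_artifacts(tmp)

print(demo_missing)
```

In CI, a non-empty result would be a reason to exit with a non-zero status before calling oras push.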
Versioning Strategy
A consistent tagging convention keeps every deployed model traceable to its source. Use semantic versions for stable releases and commit SHAs to tie an artifact to the exact code that produced it:
```bash
# Semantic version after evaluation passes
oras push ghcr.io/myorg/models/sentiment:v2.1.0 ./model.safetensors:application/vnd.safetensors

# Git SHA for exact reproducibility
oras tag ghcr.io/myorg/models/sentiment:v2.1.0 sha-a1b2c3d

# Environment promotion
oras tag ghcr.io/myorg/models/sentiment:v2.1.0 staging
oras tag ghcr.io/myorg/models/sentiment:v2.1.0 production

# List tags for a repository
oras repo tags ghcr.io/myorg/models/sentiment
```
The production tag is a moving pointer. Your serving infrastructure always pulls production, and promoting a model is just retagging. Rollback is retagging the previous version.
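Generating tags in code keeps the convention consistent across pipelines. The sketch below builds the version and SHA tags and checks them against the tag grammar from the OCI distribution spec (at most 128 characters from a restricted set). The function name release_tags, the v and sha- prefixes, and the 7-character short SHA follow the examples in this article and are conventions, not requirements:

```python
import re

# Tag grammar from the OCI distribution spec.
_OCI_TAG_RE = re.compile(r"[a-zA-Z0-9_][a-zA-Z0-9._-]{0,127}")
_SEMVER_RE = re.compile(r"\d+\.\d+\.\d+")
_SHA_RE = re.compile(r"[0-9a-f]{7,40}")


def release_tags(version: str, commit_sha: str) -> list[str]:
    """Return [version tag, short-SHA tag] for a release, validating both."""
    if not _SEMVER_RE.fullmatch(version):
        raise ValueError(f"expected MAJOR.MINOR.PATCH, got {version!r}")
    if not _SHA_RE.fullmatch(commit_sha):
        raise ValueError(f"expected a hex commit SHA, got {commit_sha!r}")
    tags = [f"v{version}", f"sha-{commit_sha[:7]}"]
    for tag in tags:
        if not _OCI_TAG_RE.fullmatch(tag):
            raise ValueError(f"{tag!r} is not a valid OCI tag")
    return tags


# Placeholder commit SHA for illustration.
print(release_tags("2.1.0", "a1b2c3d4e5f60718293a4b5c6d7e8f9012345678"))
```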
Common Errors and Fixes
denied: permission_denied when pushing to GHCR – your token needs write:packages scope. Generate a new PAT at GitHub Settings > Developer settings > Personal access tokens, and make sure the token has access to the target organization.
manifest unknown on pull – the tag doesn’t exist. Check available tags with oras repo tags ghcr.io/myorg/models/sentiment. Typos in the repository path are the usual culprit.
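Because typos are the usual cause, a script can suggest the closest existing tag when a pull fails. This is a sketch using the standard library's difflib. The tag list would normally come from oras repo tags, but is hard-coded here, and the helper name suggest_tag is hypothetical:

```python
import difflib
from typing import Optional


def suggest_tag(requested: str, available: list[str]) -> Optional[str]:
    """Return the closest existing tag to requested, or None if nothing is similar."""
    matches = difflib.get_close_matches(requested, available, n=1, cutoff=0.6)
    return matches[0] if matches else None


# Hard-coded for illustration; in practice, parse the output of oras repo tags.
available_tags = ["v1.0", "v2.1.0", "latest", "staging", "production"]
print(suggest_tag("prodution", available_tags))
```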
unsupported media type – some older registries don’t support OCI artifacts. Docker Hub has supported them since late 2022, but self-hosted registries (Harbor < 2.6, Nexus < 3.42) may need upgrades. Check your registry’s OCI artifact support before committing to ORAS.
Large file timeouts – model files over 1 GB can time out on slow connections, and ORAS doesn’t support resumable uploads natively. Split large models into shards before pushing so each upload is smaller and a retry has less to redo. You can also try raising the HTTP client timeout, but confirm that your ORAS version actually honors the ORAS_HTTP_TIMEOUT variable used below before relying on it:

```bash
# Set a longer timeout for large files
# (verify that your ORAS version reads ORAS_HTTP_TIMEOUT; this is unconfirmed)
ORAS_HTTP_TIMEOUT=3600 oras push ghcr.io/myorg/models/llm:v1.0 \
  ./model-00001-of-00004.safetensors:application/vnd.safetensors \
  ./model-00002-of-00004.safetensors:application/vnd.safetensors \
  ./model-00003-of-00004.safetensors:application/vnd.safetensors \
  ./model-00004-of-00004.safetensors:application/vnd.safetensors
```
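Without resumable uploads, the practical fallback is to retry the whole command with a growing delay. Here is a minimal retry sketch. It assumes the push is wrapped in a callable that raises RuntimeError on failure, as the OrasClient wrapper in this article does. The function name retry and its default values are illustrative:

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(
    fn: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn until it succeeds, doubling the delay after each RuntimeError."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except RuntimeError:
            if attempt == attempts:
                raise  # out of attempts: surface the last failure
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

With the wrapper from earlier, a call might look like retry(lambda: client.push(repo, tag, files)). The sleep parameter exists so tests can inject a fake instead of waiting.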
ERRO[0000] credentials not found after logging in – ORAS stores credentials in ~/.docker/config.json by default. If you’re running in a container or CI environment where that path is missing or not writable, set the config path explicitly at login, and pass the same --registry-config flag to later commands such as push and pull:

```bash
echo "$GITHUB_TOKEN" | oras login ghcr.io --username "$GITHUB_USERNAME" --password-stdin --registry-config /tmp/oras-config.json
```
ECR repository doesn’t exist – AWS ECR requires you to create the repository before pushing. Unlike GHCR, it won’t auto-create:
```bash
aws ecr create-repository --repository-name models/sentiment --region us-east-1
```