Cheatsheet: Share your work on the Hub
Authenticate first
```python
from huggingface_hub import notebook_login

notebook_login()  # in a notebook
# hf auth login   # in a terminal
```

Paste a token from your account settings; it is cached for later uploads.
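For scripts or CI jobs where pasting a token is not possible, a non-interactive login can be sketched as below. This is a minimal sketch, not the course's method: the helper names `read_hub_token` and `login_from_env` are hypothetical, and it assumes the token has been exported in an environment variable such as `HF_TOKEN`.

```python
import os


def read_hub_token(env_var="HF_TOKEN"):
    """Return the token stored in the environment, or None if unset or empty."""
    token = os.environ.get(env_var, "").strip()
    return token or None


def login_from_env(env_var="HF_TOKEN"):
    """Log in without a prompt when a token is available.

    Returns True if a login was attempted, False if no token was found.
    """
    token = read_hub_token(env_var)
    if token is None:
        return False
    # Imported lazily so read_hub_token works without the library installed.
    from huggingface_hub import login

    login(token=token)
    return True
```

Calling `login_from_env()` at the top of a training script keeps the token out of the source code.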
The three routes (easiest to most manual)
| Route | Use when | Key calls |
|---|---|---|
| `push_to_hub` API | Almost always | `push_to_hub=True`, `trainer.push_to_hub()`, `model.push_to_hub()` |
| `huggingface_hub` library | Programmatic repo management | `create_repo`, `upload_file` |
| git + git-lfs | Full manual control | `git lfs install`, clone, add, commit, push |
Route 1: push_to_hub (preferred)
```python
from transformers import TrainingArguments

# During Trainer training:
training_args = TrainingArguments(
    "bert-finetuned-mrpc", save_strategy="epoch", push_to_hub=True
)
# ... trainer.train() ...
trainer.push_to_hub()  # final push + auto-generated model card

# Or directly on objects:
model.push_to_hub("dummy-model")
tokenizer.push_to_hub("dummy-model")  # always push the tokenizer too
```

Set `hub_model_id="org/name"` in `TrainingArguments` to target an organization. `trainer.push_to_hub()` auto-writes a card with hyperparameters and metrics.
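Repo identifiers take the form `name` (your own namespace) or `namespace/name` (an explicit user or organization). The helper below is a hypothetical illustration of that convention, not part of any library; `build_repo_id` and the `my-org` namespace are made up for the example.

```python
from typing import Optional


def build_repo_id(name: str, namespace: Optional[str] = None) -> str:
    """Combine an optional namespace and a repo name into a Hub repo id."""
    if not name or "/" in name:
        raise ValueError("name must be non-empty and must not contain '/'")
    if namespace is None:
        # No namespace given: the repo lands under the logged-in user.
        return name
    return f"{namespace}/{name}"


personal_id = build_repo_id("dummy-model")
org_id = build_repo_id("dummy-model", namespace="my-org")
```

The resulting string is the kind of value passed as `hub_model_id` or as the first argument of `push_to_hub`.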
Route 2: huggingface_hub library
```python
from huggingface_hub import create_repo, upload_file

create_repo("dummy-model")  # private=True to hide; repo_type="dataset" or "space"
upload_file(
    path_or_fileobj="config.json",
    path_in_repo="config.json",
    repo_id="username/dummy-model",
)  # HTTP, no git; <5GB per file
```

Route 3: git + git-lfs (barebones)

```bash
git lfs install
git clone https://huggingface.co/username/dummy-model
# In Python, save into the clone: model.save_pretrained(dir) + tokenizer.save_pretrained(dir)
cd dummy-model
git add . && git commit -m "First model version" && git push
```

git-lfs auto-tracks large weight files; small JSON files use plain git.
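The split between plain git and git-lfs is decided by filename patterns. The sketch below imitates that decision in Python for illustration only: the two patterns are an assumption chosen to match the weight formats named in this cheatsheet, and the real list lives in the repo's `.gitattributes` file.

```python
from fnmatch import fnmatch

# Assumed patterns for illustration; check .gitattributes for the real ones.
LFS_PATTERNS = ("*.bin", "*.safetensors")


def handler_for(filename, patterns=LFS_PATTERNS):
    """Return 'git-lfs' if the filename matches a tracked pattern, else 'git'."""
    if any(fnmatch(filename, pattern) for pattern in patterns):
        return "git-lfs"
    return "git"


handlers = {
    name: handler_for(name)
    for name in ("config.json", "model.safetensors", "pytorch_model.bin")
}
```

Here `handlers` maps `config.json` to `git` and both weight files to `git-lfs`.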
What a usable model repo contains
| File | What it is | Handler |
|---|---|---|
| `config.json` | Architecture + settings | git |
| weights (`*.bin` / `*.safetensors`) | The trained parameters | git-lfs |
| tokenizer files | Vocab, `tokenizer_config.json`, special tokens | git |
| `README.md` | The model card | git |
The config, weights, and tokenizer files are what `from_pretrained` expects, which is why one line loads the model; the `README.md` is there for human readers.
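Before pushing, it can help to check a local folder against the list above. This is a minimal sketch under stated assumptions: `missing_repo_files` is a hypothetical helper, and it uses `tokenizer_config.json` as a stand-in for "tokenizer files", since the exact vocabulary files differ between tokenizers.

```python
import tempfile
from pathlib import Path

# Files checked by exact name; tokenizer_config.json stands in for the tokenizer.
FIXED_FILES = ("config.json", "tokenizer_config.json", "README.md")
WEIGHT_PATTERNS = ("*.bin", "*.safetensors")


def missing_repo_files(directory):
    """Return a list describing which expected files are absent from directory."""
    root = Path(directory)
    missing = [name for name in FIXED_FILES if not (root / name).is_file()]
    # At least one weight file in either format is enough.
    has_weights = any(
        next(root.glob(pattern), None) is not None for pattern in WEIGHT_PATTERNS
    )
    if not has_weights:
        missing.append("weights (*.bin or *.safetensors)")
    return missing


# Demo on a throwaway folder that only contains a config file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "config.json").write_text("{}")
    report = missing_repo_files(tmp)
```

An empty list means the folder has everything in the table.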
The model card (README.md) essentials
- What it is for (the task, the intended use)
- Where it came from (base checkpoint, training data)
- What it cannot do (limitations, biases, language scope)
- How to use it (a short code snippet)
A good card is the difference between a model others adopt and weights nobody touches.
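The four essentials can be turned into a skeleton so that no section is forgotten. The sketch below is one possible layout, not an official template: the section titles, the `render_model_card` helper, and the sample text are all illustrative assumptions.

```python
# Section titles chosen for this sketch, one per essential listed above.
SECTION_TITLES = ("Intended use", "Origin", "Limitations", "How to use")


def render_model_card(model_name, sections):
    """Build README.md text, marking any section without content as TODO."""
    lines = [f"# {model_name}", ""]
    for title in SECTION_TITLES:
        body = sections.get(title, "").strip() or "TODO"
        lines += [f"## {title}", "", body, ""]
    return "\n".join(lines)


card = render_model_card(
    "dummy-model",
    {"Intended use": "Placeholder text describing the task."},
)
```

Writing `card` to `README.md` in the repo folder gives a draft in which the remaining `TODO` markers show what still needs to be written.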
Words to use precisely
- Namespace: your username or an organization you can write to.
- git-lfs: Git Large File Storage; tracks large binaries (weights) separately from plain git.
- Model card: the repo’s `README.md`; documentation + honesty statement.
Recommended further study
- Hugging Face LLM Course, Chapter 4: “Sharing models and tokenizers.” huggingface.co/learn/llm-course/chapter4. Released under Apache 2.0; this lesson mirrors its structure with original prose.