Most developers use Git daily without ever looking at what’s inside the .git directory. Understanding Git’s object model turns it from a black box into a transparent, debuggable system — and makes the mental model for rebasing, merging, and reflog much clearer.
The Object Store
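Everything in Git (file content, directory structures, commits, and tags) is stored as a content-addressed object in .git/objects/. Each object is named by the SHA-1 hash of a short header plus its uncompressed content. The first two hex characters of the hash become a subdirectory and the remaining 38 the filename; the file itself holds the zlib-compressed data.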
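To make the addressing scheme concrete, here is a minimal Python sketch of how a loose object could be named and encoded. The function names (object_id, loose_path, encode_loose) are illustrative and not part of Git, and the sketch assumes a SHA-1 repository.

```python
import hashlib
import zlib


def object_id(obj_type: str, content: bytes) -> str:
    """Return the SHA-1 ID Git assigns to an object of the given type."""
    # Git hashes a small header ("<type> <size>" + NUL byte) followed by the raw content.
    header = f"{obj_type} {len(content)}".encode() + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


def loose_path(oid: str) -> str:
    """Map an object ID to its loose-object path under .git/objects/."""
    # The first two hex characters name a directory, the other 38 the file.
    return f".git/objects/{oid[:2]}/{oid[2:]}"


def encode_loose(obj_type: str, content: bytes) -> bytes:
    """Return the bytes stored on disk: header + content, zlib-compressed."""
    header = f"{obj_type} {len(content)}".encode() + b"\x00"
    return zlib.compress(header + content)


oid = object_id("blob", b"hello\n")
print(oid)
print(loose_path(oid))
```

The hash is taken over the uncompressed bytes, so the compression level used on disk has no effect on an object's name.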
There are four object types:
| Type | Description |
|---|---|
| blob | File contents (no filename, no metadata) |
| tree | Directory listing: maps filenames to blob/tree hashes |
| commit | Points to a tree + parent commit(s) + author + message |
| tag | Annotated tag: points to a commit with extra metadata |
Blobs: Storing File Content
When you stage a file, Git creates a blob from its contents. The plumbing command git hash-object -w does the same thing by hand.
The blob stores only the raw content — no filename, no timestamp. Two files with identical content share one blob. This is how Git deduplicates content efficiently.
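The deduplication falls out of the hashing rule. The sketch below uses a hypothetical set of files and a plain dictionary as a stand-in for the object store; blob_id is an illustrative helper, not a Git API.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Compute a blob's ID from its content alone (SHA-1 repository assumed)."""
    header = b"blob " + str(len(content)).encode() + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


# Hypothetical working tree: two paths share the same content.
files = {
    "README.md": b"hello\n",
    "docs/copy-of-readme.md": b"hello\n",
    "main.py": b"print('hi')\n",
}

# A toy object store keyed by blob ID; the filename never enters the hash.
store = {}
for path, content in files.items():
    store[blob_id(content)] = content

print(len(files), "files ->", len(store), "blobs")
```

Because the path is not part of the hashed data, renaming or copying a file never creates a new blob.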
Trees: Storing Directories
A tree object maps names to hashes. Each entry, as listed by git ls-tree or git cat-file -p, contains:
- Mode: 100644 for a regular file, 040000 for a directory, 100755 for an executable
- Type: blob or tree
- Hash: the SHA-1 of the object
- Name: the filename or directory name
Nested directories are represented by nested tree objects. The entire working tree at any commit is a single root tree with sub-trees.
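As a rough sketch of the nesting, the Python below serializes tree entries and hashes them the way a SHA-1 repository would. The tree_id helper and the file names are assumptions made for illustration; the entry layout (mode, space, name, NUL byte, 20 raw hash bytes) follows Git's documented tree format.

```python
import hashlib


def tree_id(entries):
    """Hash a tree built from (mode, name, object_id_hex) entries.

    Each entry is serialized as "<mode> <name>", a NUL byte, then the 20 raw
    bytes of the object ID. Directory modes are stored as "40000"; the
    leading zero in "040000" appears only in printed listings.
    """
    def sort_key(entry):
        mode, name, _ = entry
        # Git sorts entries by name, treating directories as if they ended in "/".
        return name.encode() + (b"/" if mode == "40000" else b"")

    body = b""
    for mode, name, oid in sorted(entries, key=sort_key):
        body += mode.encode() + b" " + name.encode() + b"\x00" + bytes.fromhex(oid)
    header = b"tree " + str(len(body)).encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


EMPTY_BLOB = "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391"

# A sub-tree for a hypothetical src/ directory, then a root tree pointing to it.
src_tree = tree_id([("100644", "app.py", EMPTY_BLOB)])
root_tree = tree_id([
    ("100644", "README.md", EMPTY_BLOB),
    ("40000", "src", src_tree),
])
print(root_tree)
```

Since the root tree's hash covers the hashes of its sub-trees, a change to any file changes every tree on the path up to the root, and nothing else.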
Commits: The Spine of History
A commit stores:
- Tree: a pointer to the root tree (a snapshot of every tracked file)
- Parent(s): zero parents for the initial commit, one for normal commits, two or more for merges
- Author + Committer: name, email, timestamp (the two can differ after rebasing)
- Message: the commit message
A commit is a snapshot, not a diff. Git computes diffs on the fly by comparing tree objects.
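The fields listed above can be sketched as a small serializer. This is a simplified illustration, assuming a SHA-1 repository and ignoring optional headers such as signatures; commit_id and the identity string are hypothetical.

```python
import hashlib


def commit_id(tree, parents, author, committer, message):
    """Serialize a commit in Git's text layout and return its SHA-1 ID."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]  # none for a root commit
    lines.append(f"author {author}")
    lines.append(f"committer {committer}")
    body = ("\n".join(lines) + "\n\n" + message).encode()
    header = b"commit " + str(len(body)).encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


# Hypothetical identity: "<name> <email> <unix timestamp> <timezone>".
who = "Ada Example <ada@example.com> 1700000000 +0000"
empty_tree = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"

root = commit_id(empty_tree, [], who, who, "Initial commit\n")
child = commit_id(empty_tree, [root], who, who, "Second commit\n")
# Same tree, author, and message, but a different parent gives a different ID.
sibling = commit_id(empty_tree, [child], who, who, "Second commit\n")
print(root, child, sibling, sep="\n")
```

The parent hash is part of the hashed data, so a commit's ID depends on its entire history.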
How Branches Work
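A branch is just a 41-byte file under .git/refs/heads/ containing a commit hash: 40 hex characters plus a newline. Creating a branch is nearly instantaneous because Git only has to write that one small file.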
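The following sketch mimics that file write in a temporary directory rather than a real repository. The commit ID is a made-up placeholder, and in practice you would let git branch or git update-ref manage refs (Git may also store them in a packed-refs file).

```python
import os
import tempfile

commit = "0123456789abcdef0123456789abcdef01234567"  # placeholder 40-char ID

with tempfile.TemporaryDirectory() as repo:
    heads = os.path.join(repo, ".git", "refs", "heads")
    os.makedirs(heads)

    # "Creating a branch" in this toy model is one small file write.
    ref_path = os.path.join(heads, "feature")
    with open(ref_path, "w", newline="\n") as f:
        f.write(commit + "\n")

    ref_size = os.path.getsize(ref_path)
    with open(ref_path) as f:
        ref_target = f.read().strip()

print(ref_size)  # 40 hex characters + 1 newline
```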
HEAD is a symbolic ref, usually pointing to a branch. In that case .git/HEAD contains a line such as ref: refs/heads/main.
When HEAD points directly to a commit hash (not a branch), you’re in detached HEAD state.
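A small sketch of telling the two states apart from the file's contents. describe_head is a hypothetical helper and the commit ID is made up.

```python
def describe_head(head_contents: str) -> str:
    """Classify the contents of a .git/HEAD file."""
    text = head_contents.strip()
    if text.startswith("ref: "):
        # Symbolic ref: HEAD names a branch, which in turn names a commit.
        target = text[len("ref: "):]
        prefix = "refs/heads/"
        name = target[len(prefix):] if target.startswith(prefix) else target
        return "on branch " + name
    # Otherwise HEAD holds a raw commit ID: detached HEAD.
    return "detached at " + text[:7]


print(describe_head("ref: refs/heads/main\n"))
print(describe_head("0123456789abcdef0123456789abcdef01234567\n"))
```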
The Index (Staging Area)
The staging area (.git/index) is a binary file that maps paths to blob hashes. It represents the next commit — your working directory minus any unstaged changes.
When you run git add, Git:
- Hashes the file content → creates a blob object
- Updates the index entry for that path to point to the new blob
When you run git commit, Git:
- Reads the current index
- Creates tree objects for each directory
- Creates a commit object pointing to the root tree and the previous commit
- Updates the current branch ref to point to the new commit
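The two step lists above can be modelled in memory. This is a toy sketch, not Git's implementation: the index is a plain dictionary instead of a binary file, every file gets mode 100644, tree entries use a simplified name sort, and the identity line is hypothetical.

```python
import hashlib

objects = {}   # object ID -> (type, raw bytes): a toy object store
index = {}     # path -> blob ID: a toy staging area
refs = {"refs/heads/main": None}


def write_object(obj_type, body):
    header = f"{obj_type} {len(body)}".encode() + b"\x00"
    oid = hashlib.sha1(header + body).hexdigest()
    objects[oid] = (obj_type, body)
    return oid


def add(path, content):
    """git add: hash the content into a blob, then point the index at it."""
    index[path] = write_object("blob", content)


def write_tree(prefix=""):
    """Build tree objects from the index, one per directory, bottom-up."""
    entries = {}
    for path, oid in index.items():
        if not path.startswith(prefix):
            continue
        head, _, rest = path[len(prefix):].partition("/")
        if rest:  # the path continues below this directory: build a sub-tree
            entries[head] = ("40000", write_tree(prefix + head + "/"))
        else:
            entries[head] = ("100644", oid)
    body = b""
    for name in sorted(entries):  # simplified ordering, see lead-in
        mode, oid = entries[name]
        body += f"{mode} {name}".encode() + b"\x00" + bytes.fromhex(oid)
    return write_object("tree", body)


def commit(message, branch="refs/heads/main"):
    """git commit: snapshot the index, link to the parent, move the branch."""
    lines = [f"tree {write_tree()}"]
    if refs[branch]:
        lines.append(f"parent {refs[branch]}")
    lines.append("author Ada Example <ada@example.com> 1700000000 +0000")
    lines.append("committer Ada Example <ada@example.com> 1700000000 +0000")
    body = ("\n".join(lines) + "\n\n" + message + "\n").encode()
    refs[branch] = write_object("commit", body)
    return refs[branch]


add("README.md", b"hello\n")
add("src/app.py", b"print('hi')\n")
first = commit("Initial commit")
add("README.md", b"hello again\n")
second = commit("Update README")
print(len(objects), "objects stored")
```

After both commits the store holds three blobs, three trees, and two commits: the unchanged src tree is shared between the two snapshots instead of being written twice.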
Rebasing Under the Hood
Running git rebase main while on a branch named feature replays that branch’s commits:
- Finds the common ancestor of feature and main
- For each commit on feature (after the ancestor), computes the diff from its parent
- Applies each diff on top of the tip of main, creating new commit objects with new hashes
- Moves the feature branch ref to the last new commit
The original commits still exist in the object store until garbage collection runs. They’re accessible via git reflog.
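The replay steps can be sketched with a toy history model. Everything here is an assumption made for illustration: commits are dictionaries holding a parent and a full snapshot, IDs are hashes of JSON rather than real commit objects, history is linear, and conflicts are ignored.

```python
import hashlib
import json

commits = {}  # commit ID -> {"parent": ID or None, "snapshot": {path: content}}


def make_commit(parent, snapshot):
    # Toy ID: hash of parent + snapshot (real Git hashes a commit object).
    data = json.dumps({"parent": parent, "snapshot": snapshot}, sort_keys=True)
    cid = hashlib.sha1(data.encode()).hexdigest()
    commits[cid] = {"parent": parent, "snapshot": snapshot}
    return cid


def ancestors(cid):
    """Return the chain from cid back to the root, newest first."""
    chain = []
    while cid is not None:
        chain.append(cid)
        cid = commits[cid]["parent"]
    return chain


def diff(old, new):
    """Changed or added paths map to new content; deleted paths map to None."""
    changed = {p: c for p, c in new.items() if old.get(p) != c}
    changed.update({p: None for p in old if p not in new})
    return changed


def apply(snapshot, changes):
    result = dict(snapshot)
    for path, content in changes.items():
        if content is None:
            result.pop(path, None)
        else:
            result[path] = content
    return result


def rebase(branch_tip, onto_tip):
    onto_history = set(ancestors(onto_tip))
    # Step 1: commits on the branch that are not already in onto's history.
    to_replay = [c for c in ancestors(branch_tip) if c not in onto_history]
    new_tip = onto_tip
    for cid in reversed(to_replay):  # oldest first
        parent = commits[cid]["parent"]
        parent_snapshot = commits[parent]["snapshot"] if parent else {}
        # Step 2: the diff this commit introduced relative to its parent.
        changes = diff(parent_snapshot, commits[cid]["snapshot"])
        # Step 3: apply it on top of the new base, producing a new commit.
        new_tip = make_commit(new_tip, apply(commits[new_tip]["snapshot"], changes))
    return new_tip  # Step 4: the caller moves the branch ref here.


base = make_commit(None, {"a.txt": "1"})
main = make_commit(base, {"a.txt": "1", "m.txt": "main work"})
feature_old = make_commit(base, {"a.txt": "1", "f.txt": "feature work"})
feature_new = rebase(feature_old, main)
print(feature_new != feature_old, feature_old in commits)
```

The rebased commit has a different parent and snapshot, hence a different ID, while the original commit remains in the store untouched.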
Reflog: The Safety Net
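The reflog is local-only. By default its entries expire after 90 days, or after 30 days if the commit is no longer reachable from the current tip. It’s what makes git reset --hard recoverable for committed work; uncommitted changes discarded by the reset are not recorded.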
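Each reflog entry is one line of text in a file such as .git/logs/HEAD, recording the old ID, the new ID, who made the change, when, and why. The sketch below parses two hypothetical entries; the IDs, identity, and messages are made up, and parse_reflog_line is an illustrative helper.

```python
import re

REFLOG_LINE = re.compile(
    r"^(?P<old>[0-9a-f]{40}) (?P<new>[0-9a-f]{40}) "
    r"(?P<who>.*) (?P<time>\d+) (?P<tz>[+-]\d{4})\t(?P<message>.*)$"
)


def parse_reflog_line(line: str) -> dict:
    match = REFLOG_LINE.match(line.rstrip("\n"))
    if match is None:
        raise ValueError("not a reflog line: " + line)
    return match.groupdict()


# Made-up IDs standing in for two commits.
first_id = "1" * 40
second_id = "2" * 40
who = "Ada Example <ada@example.com>"

log = (
    f"{first_id} {second_id} {who} 1700000000 +0000\tcommit: Add parser\n"
    f"{second_id} {first_id} {who} 1700000060 +0000\treset: moving to HEAD~1\n"
)

entries = [parse_reflog_line(line) for line in log.splitlines()]
# The commit "lost" by the reset is the old value recorded in the last entry.
lost = entries[-1]["old"]
print(lost)
```

With the old ID in hand, a command such as git reset --hard followed by that ID, or git branch with a new name and that ID, brings the commit back.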
Packfiles
As a repo grows, Git packs loose objects into binary packfiles under .git/objects/pack/ for efficiency.
Inside a packfile, Git stores objects as deltas against similar objects, achieving significant compression. git gc triggers repacking.
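For a first look inside a packfile, the fixed 12-byte header is easy to read: the signature PACK, a version number, and the object count. The sketch below parses a synthetic header built in memory rather than a real .pack file; parse_pack_header is an illustrative name.

```python
import struct


def parse_pack_header(data: bytes):
    """Read the 12-byte header at the start of a .pack file."""
    # Signature "PACK", then version and object count as big-endian 32-bit ints.
    signature, version, count = struct.unpack(">4sII", data[:12])
    if signature != b"PACK":
        raise ValueError("not a packfile")
    return version, count


# A synthetic header for illustration: version 2, three objects.
sample = b"PACK" + struct.pack(">II", 2, 3)
print(parse_pack_header(sample))
```

The object entries that follow the header, including the delta encoding, are considerably more involved and are not covered by this sketch.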
Practical Takeaways
- Commits are snapshots, not diffs: understanding this clarifies why git rebase creates new commits
- Branches are pointers: moving, renaming, and deleting them is cheap
- Staging is explicit: the index gives you fine-grained control over what goes into a commit
- Committed work is rarely lost: until its reflog entry expires and GC runs, the reflog can recover any commit you’ve been on
Understanding the object model makes commands like git cherry-pick, git bisect, and git stash far less mysterious — they’re all just manipulating the same four object types.