How Git Uses SHA-1 for Commit History


May 8, 2025 - 23:23

Unlocking the Internals of Git’s Immutable Architecture

Git is more than a version control system: it is a content-addressable object store whose commit history is chained together by SHA-1 hashes. This design makes history tamper-evident, traceable, and consistent across distributed clones.

Anatomy of a Git Commit

Every Git commit is identified by a SHA-1 hash computed over the commit object. That object references a tree, which in turn captures the entire state of the project at a point in time.

git log --pretty=raw

Output:

commit 9fceb02b21337d3025f69e22f68c82d20a000000
tree 36b74b3b8f6a...
parent cf23df2207d9...
author John Doe <email> <timestamp> <timezone>
committer John Doe <email> <timestamp> <timezone>

Commit Object Breakdown

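  • Commit SHA-1: Hash of the commit object itself (tree, parents, metadata, and message).
  • Tree SHA-1: Identifies the root tree object, a snapshot of the directory structure.
  • Parent SHA-1: Links to prior commits (commit chaining); a merge commit has more than one parent.
  • Metadata: Author, committer (each with a timestamp), and commit message.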
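To make the breakdown concrete, here is a minimal Python sketch of how a commit ID can be derived from those fields. It assumes the standard loose-object framing (type, size, NUL byte, then the body); the identity string, timestamp, and messages are made-up placeholders, not values from a real repository.

```python
import hashlib

def object_id(obj_type: str, body: bytes) -> str:
    """SHA-1 over '<type> <size>', a NUL byte, and the body."""
    header = f"{obj_type} {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()

def commit_body(tree: str, parents: list, ident: str, message: str) -> bytes:
    """Serialize the commit fields in the order Git writes them."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {ident}", f"committer {ident}", "", message]
    return ("\n".join(lines) + "\n").encode()

EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # well-known ID of the empty tree
IDENT = "Jane Example <jane@example.com> 1700000000 +0000"  # hypothetical identity

root = object_id("commit", commit_body(EMPTY_TREE, [], IDENT, "First commit"))
child = object_id("commit", commit_body(EMPTY_TREE, [root], IDENT, "Second commit"))
# Same tree, author, and message, but a different parent: the ID changes.
child_alt = object_id("commit", commit_body(EMPTY_TREE, ["0" * 40], IDENT, "Second commit"))

print(root)
print(child)
print(child != child_alt)
```

Because each commit embeds its parent's ID, rewriting any ancestor changes the ID of every descendant. That is what makes the history chain tamper-evident.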

SHA-1 and Security

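Git relies on SHA-1 hashing for referential integrity. SHA-1 itself is no longer considered collision resistant, so Git adds its own safeguards:

  • A change in any object results in a new hash, and therefore a new ID for everything that references it.
  • Since version 2.13, Git uses a collision-detecting SHA-1 implementation (sha1dc) that rejects inputs produced by known collision attacks.
  • Git 2.29 and later can create SHA-256 repositories (git init --object-format=sha256), although interoperability with SHA-1 repositories is still limited.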
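The first and last points can be illustrated with a short Python sketch. It assumes SHA-256 repositories hash the same "type, size, NUL, content" framing as SHA-1 repositories, which matches my understanding of the format but is worth checking against Git's own documentation.

```python
import hashlib

def object_id(obj_type: str, body: bytes, algo: str = "sha1") -> str:
    """Hash '<type> <size>', a NUL byte, and the body with the chosen algorithm."""
    header = f"{obj_type} {len(body)}\0".encode()
    return hashlib.new(algo, header + body).hexdigest()

original = b"Hello Git\n"
tampered = b"Hello git\n"  # a single character differs

for algo in ("sha1", "sha256"):
    a = object_id("blob", original, algo)
    b = object_id("blob", tampered, algo)
    # Prints the algorithm, the ID length in hex digits, and whether the IDs differ.
    print(algo, len(a), a != b)
```

SHA-1 IDs are 40 hex digits and SHA-256 IDs are 64, so the two object formats cannot be confused by length alone.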

Use git fsck to validate object integrity:

git fsck --full
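Conceptually, one of the checks behind an integrity scan is "does each stored object still hash to its own name?". The sketch below imitates only that one check for loose objects in a temporary directory. It is not how git fsck is implemented, and the helper names write_loose and check_loose are invented for illustration.

```python
import hashlib
import tempfile
import zlib
from pathlib import Path

def write_loose(objects_dir: Path, obj_type: str, body: bytes) -> str:
    """Store an object under <first 2 hex chars>/<remaining 38> and return its ID."""
    raw = f"{obj_type} {len(body)}\0".encode() + body
    oid = hashlib.sha1(raw).hexdigest()
    path = objects_dir / oid[:2] / oid[2:]
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(zlib.compress(raw))
    return oid

def check_loose(objects_dir: Path) -> list:
    """Return IDs of loose objects whose content no longer hashes to their name."""
    bad = []
    for path in sorted(objects_dir.glob("??/*")):
        expected = path.parent.name + path.name
        try:
            actual = hashlib.sha1(zlib.decompress(path.read_bytes())).hexdigest()
        except zlib.error:
            actual = None  # not even valid zlib data
        if actual != expected:
            bad.append(expected)
    return bad

with tempfile.TemporaryDirectory() as tmp:
    objects = Path(tmp)
    good = write_loose(objects, "blob", b"Hello Git\n")
    victim = write_loose(objects, "blob", b"Hello World\n")
    # Simulate corruption: replace one object with different, validly compressed content.
    (objects / victim[:2] / victim[2:]).write_bytes(zlib.compress(b"blob 6\0oops!\n"))
    corrupted = check_loose(objects)

print(corrupted == [victim])
```

The real command does considerably more, such as checking connectivity between objects and reading packfiles.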

hash-object: Understanding the Core Command

The git hash-object command is a plumbing-level tool that computes an object ID (a SHA-1 hash by default) and optionally stores the object.

Key Features:

  • Can compute a hash outside any repository; only write mode needs one.
  • Deterministic: same input → same output.
  • Supports write mode (-w) to persist objects.

echo "Hello Git" | git hash-object --stdin
# Output: 8cf2d8a03c123f8824ac46aa20a6b924ad44f0c8

Add the object to .git/objects:

echo "Hello Git" | git hash-object -w --stdin
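For readers who want to see what the command computes, the following Python sketch reproduces the blob hash without calling Git. The function name hash_blob is my own; the only assumption is the "blob <size>" header followed by a NUL byte.

```python
import hashlib

def hash_blob(data: bytes) -> str:
    """Equivalent of `git hash-object --stdin` for a SHA-1 repository."""
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()

# echo appends a newline, and that newline is part of the hashed bytes.
with_newline = hash_blob(b"Hello World\n")
without_newline = hash_blob(b"Hello World")

print(hash_blob(b""))   # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 (the empty blob)
print(with_newline)
print(with_newline != without_newline)
```

Note that the file name never enters the hash. Two files with identical bytes share one blob, whatever they are called.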

Inside .git: Object-Oriented Versioning

When git init is run, Git creates the .git/ directory as the project database.

Structure:

.git/
├── HEAD
├── config
├── objects/
│   ├── info/
│   ├── pack/
│   └── [hashed objects]
├── refs/
├── hooks/
└── index

The objects/ Directory

  • Houses all blobs, trees, commits, and tags, either as loose files or inside packfiles under pack/.
  • Each loose object is stored as:
    • Folder: First 2 characters of the SHA-1
    • File: Remaining 38 characters

Example:

.git/objects/55/7db03de997c86a4a028e1ebd3a1ceb225be238
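The folder and file split is simple enough to express in a few lines of Python. This is an illustrative helper (the name loose_object_path is mine), and it covers only loose objects in a SHA-1 repository.

```python
from pathlib import PurePosixPath

def loose_object_path(oid: str, git_dir: str = ".git") -> PurePosixPath:
    """Map a 40-character SHA-1 to the location of its loose object file."""
    if len(oid) != 40 or any(c not in "0123456789abcdef" for c in oid):
        raise ValueError("expected a 40-character lowercase hex SHA-1")
    return PurePosixPath(git_dir) / "objects" / oid[:2] / oid[2:]

print(loose_object_path("557db03de997c86a4a028e1ebd3a1ceb225be238"))
# .git/objects/55/7db03de997c86a4a028e1ebd3a1ceb225be238
```

Splitting on the first two characters spreads objects across up to 256 subdirectories, which keeps any single directory from growing too large.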

Example: Blob Object Storage and Retrieval

echo "Hello Git Internals!" | git hash-object -w --stdin

Retrieve it:

git cat-file -p <hash>
# Output: Hello Git Internals!

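Abbreviated hashes are accepted as long as they are unambiguous within the repository. For example, the object stored at the path shown earlier can be read with:

git cat-file -p 557db03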
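The uniqueness rule can be sketched as a prefix lookup. The function below is a simplified stand-in for Git's own resolution logic: it searches a plain list of IDs, and the second ID in the sample list is made up purely to create an ambiguous prefix.

```python
def resolve_prefix(prefix: str, known_ids: list) -> str:
    """Return the single ID that starts with prefix, or raise if none or several match."""
    if len(prefix) < 4:  # Git does not accept abbreviations shorter than 4 hex digits
        raise ValueError("prefix too short")
    matches = [oid for oid in known_ids if oid.startswith(prefix)]
    if not matches:
        raise KeyError(f"no object matches {prefix}")
    if len(matches) > 1:
        raise ValueError(f"prefix {prefix} is ambiguous")
    return matches[0]

ids = [
    "557db03de997c86a4a028e1ebd3a1ceb225be238",
    "557d" + "0" * 36,  # hypothetical ID sharing the first four characters
    "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391",
]

print(resolve_prefix("557db03", ids))  # 557db03de997c86a4a028e1ebd3a1ceb225be238
```

With this list, the four-character prefix 557d would be rejected as ambiguous, while 557db03 is long enough to identify one object.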

Git Object Format & Compression

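Internally, Git stores each object as:

<type> <size>\0<content>

Example:

blob 11\0Hello World

  • blob: object type
  • 11: size of the content in bytes
  • \0: null byte separator

The object ID is the hash of this uncompressed byte sequence. Git then compresses it with zlib before writing it to disk.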
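A round trip through this format looks roughly like the following in Python. The helper names encode_object and decode_object are mine, and the sketch handles only the loose-object encoding, not packfiles.

```python
import zlib

def encode_object(obj_type: str, content: bytes) -> bytes:
    """Build '<type> <size>', a NUL byte, and the content, then zlib-compress it."""
    return zlib.compress(f"{obj_type} {len(content)}\0".encode() + content)

def decode_object(stored: bytes):
    """Inverse of encode_object: return (type, content) after checking the size field."""
    raw = zlib.decompress(stored)
    header, _, content = raw.partition(b"\0")  # split at the first NUL byte only
    obj_type, size = header.decode().split(" ")
    if int(size) != len(content):
        raise ValueError("size field does not match content length")
    return obj_type, content

stored = encode_object("blob", b"Hello World")
print(decode_object(stored))  # ('blob', b'Hello World')
```

Splitting at the first NUL matters because the content may contain NUL bytes of its own, as binary files and tree objects do.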

Advanced: Building Commit History by Hand

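Step 1: Blob Storage

echo "Hello World" > hello.txt
git hash-object -w hello.txt

Step 2: Tree & Commit Generation

git add hello.txt
git commit -m "First commit"

These are porcelain commands; the plumbing equivalents are git update-index, git write-tree, and git commit-tree.

Git creates:

  • A blob for the file (the same object as in Step 1, since the content is identical).
  • A tree mapping the name hello.txt to that blob.
  • A commit referencing the tree.

git log -1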
# commit c7a78d3...

Inspect:

git cat-file -t c7a78d3
git cat-file -p c7a78d3
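To show how the three objects reference each other, here is a Python sketch that builds the blob, tree, and commit IDs in memory. It assumes a flat tree of regular files with mode 100644; the identity string is a placeholder, so the resulting commit ID will not match the c7a78d3... shown above.

```python
import hashlib

def object_id(obj_type: str, body: bytes) -> str:
    """SHA-1 over '<type> <size>', a NUL byte, and the body."""
    return hashlib.sha1(f"{obj_type} {len(body)}\0".encode() + body).hexdigest()

def tree_body(entries: list) -> bytes:
    """Serialize (mode, name, hex_id) entries: '<mode> <name>', NUL, 20 raw hash bytes.

    Entries are sorted by name, which is enough for a flat list of regular files;
    Git's real ordering treats directory names specially.
    """
    body = b""
    for mode, name, oid in sorted(entries, key=lambda e: e[1].encode()):
        body += f"{mode} {name}".encode() + b"\0" + bytes.fromhex(oid)
    return body

IDENT = "Jane Example <jane@example.com> 1700000000 +0000"  # hypothetical identity

blob = object_id("blob", b"Hello World\n")
tree = object_id("tree", tree_body([("100644", "hello.txt", blob)]))
commit_text = f"tree {tree}\nauthor {IDENT}\ncommitter {IDENT}\n\nFirst commit\n"
commit = object_id("commit", commit_text.encode())

print("blob  ", blob)
print("tree  ", tree)
print("commit", commit)
```

The tree stores the blob's ID as 20 raw bytes rather than 40 hex characters, which is why git cat-file -p is needed to print a tree in readable form.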

Annotated Tags Internals

Create a tag:

git tag -a v1.0 -m "First release"

An annotated tag is also an object (a lightweight tag is only a ref):

git cat-file -t v1.0   # Output: tag
git cat-file -p v1.0

Tag Object Fields

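  • object: ID of the object the tag points to (usually a commit)
  • type: type of the tagged object; usually "commit", but it can also be tree, blob, or tag
  • tag: name
  • tagger: name, email, and timestamp of whoever created the tag
  • message: tag message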
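These fields map directly onto the text of a tag object. The Python sketch below serializes and re-parses one; the target ID is a placeholder of forty zeros and the tagger identity is invented, so the printed tag ID is for illustration only.

```python
import hashlib

def object_id(obj_type: str, body: bytes) -> str:
    """SHA-1 over '<type> <size>', a NUL byte, and the body."""
    return hashlib.sha1(f"{obj_type} {len(body)}\0".encode() + body).hexdigest()

def tag_body(target: str, target_type: str, name: str, tagger: str, message: str) -> bytes:
    """Serialize the header fields, a blank line, then the message."""
    text = (
        f"object {target}\n"
        f"type {target_type}\n"
        f"tag {name}\n"
        f"tagger {tagger}\n"
        f"\n{message}\n"
    )
    return text.encode()

def parse_tag(body: bytes) -> dict:
    """Split headers from the message and return them as a dictionary."""
    headers, _, message = body.decode().partition("\n\n")
    fields = dict(line.split(" ", 1) for line in headers.split("\n"))
    fields["message"] = message.rstrip("\n")
    return fields

TARGET = "0" * 40  # placeholder commit ID
TAGGER = "Jane Example <jane@example.com> 1700000000 +0000"  # hypothetical identity

body = tag_body(TARGET, "commit", "v1.0", TAGGER, "First release")
print(parse_tag(body))
print(object_id("tag", body))
```

Since the tag is hashed like any other object, an annotated tag has its own ID, distinct from the commit it points to.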

Final Thoughts

Git’s commit model, built atop SHA-1 (or, optionally, SHA-256), is a masterclass in content-addressable storage. Every file, directory, and history point is an immutable, verifiable object.

Whether you’re debugging history, scripting Git automation, or studying internals—understanding the object model and SHA-1 plumbing is key to Git mastery.

Want to go deeper? Clone Git itself and inspect its C source, or experiment with git plumbing commands in a sandbox repo.

Follow me for more deep dives into Git, dev tools, and the low-level internals that power modern development.