GIT Internal (part 1)

Huy Nguyen

.NET developer | Software engineer | Database | Algorithm

Published: September 5, 2024

If you're a developer, you've probably used Git, and you probably use it every day. But have you ever wondered how Git is structured internally? What happens when we commit code, or how merges are handled?

In today’s post, let’s "dissect" Git, a revolutionary piece of software in the programming world. This post assumes you are already comfortable with basic commands such as git commit, git branch, and git merge.

Everything is a hash

Yes, you heard that right. Inside Git, everything is identified by a hash, including your code. Keep this idea in mind because the rest of the post builds on it.

Git Objects: Blob, Tree, and Commit

The most basic building block in Git is an object. This post covers three object types: blobs, trees, and commits (Git also has annotated tag objects, which are out of scope here). Every object is identified by a hash.

Blob

A blob (binary large object) stores the content of a file.
It differs from a file in that it carries no metadata such as the file name or creation date; a blob contains only the file's raw bytes.
A blob is identified by a SHA-1 hash computed from its content (more precisely, from a short header of the form "blob <size>", a NUL byte, and then the content).
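
As a minimal sketch of how a blob ID can be reproduced outside Git, the Python snippet below hashes a short header ("blob", the byte length, a NUL byte) followed by the raw content. The function name git_blob_hash is hypothetical and chosen only for illustration, and the sketch assumes a repository that uses the default SHA-1 object format.

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """Return the SHA-1 ID Git would assign to a blob with this content."""
    # Git hashes "blob <size>\0" + content, not the bare content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


# The empty blob has a well-known ID.
print(git_blob_hash(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391

# The file name plays no role: only the bytes are hashed.
print(git_blob_hash(b"HELLO WORLD") == git_blob_hash(b"HELLO WORLD"))  # True
```

If Git is installed, the result for a real file should match what git hash-object prints for that file.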

Tree

A tree represents a directory and is also identified by a SHA-1 hash.
Each entry in a tree records a file mode, a name, and the hash of a blob or of another tree. Think of blobs as leaf nodes, and trees as non-leaf nodes in a tree structure.

For example, consider a tree for a file system in which the root directory contains one file /test.js and one subdirectory /docs. The /docs directory contains two files: /docs/pic.png and /docs/1.txt.
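
The example directory can be hashed bottom-up in a few lines of Python. This is a sketch under stated assumptions rather than Git's actual implementation: the helper names hash_object and tree_body are hypothetical, the file contents are made up, and only regular files (mode 100644) and directories (mode 40000) are handled.

```python
import hashlib


def hash_object(kind: str, body: bytes) -> str:
    """SHA-1 over '<kind> <size>\\0' + body, the layout Git uses for object IDs."""
    header = f"{kind} {len(body)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def tree_body(entries):
    """Serialize tree entries given as (mode, name, hex_hash) tuples."""
    def sort_key(entry):
        mode, name, _ = entry
        # Git sorts directory names as if they ended with a slash.
        return (name + "/" if mode == "40000" else name).encode()

    body = b""
    for mode, name, hex_hash in sorted(entries, key=sort_key):
        # Each entry is "<mode> <name>\0" followed by the raw 20-byte hash.
        body += mode.encode() + b" " + name.encode() + b"\x00" + bytes.fromhex(hex_hash)
    return body


# Made-up file contents for the example layout.
test_js = hash_object("blob", b"console.log('hi');\n")
pic_png = hash_object("blob", b"\x89PNG placeholder bytes")
one_txt = hash_object("blob", b"HELLO WORLD")

# Leaves first: the /docs tree points to two blobs.
docs_tree = hash_object("tree", tree_body([
    ("100644", "pic.png", pic_png),
    ("100644", "1.txt", one_txt),
]))

# The root tree points to one blob and one subtree.
root_tree = hash_object("tree", tree_body([
    ("100644", "test.js", test_js),
    ("40000", "docs", docs_tree),
]))

print(root_tree)
```

Because the hash of /docs is part of the root tree's content, any change below /docs changes the root hash as well.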

Commit

Commits are something you're likely very familiar with. They’re the result of the git commit command you use daily.
A commit is like a snapshot, recording the state of the entire directory at a given point in time.
A commit contains the hash of the root tree, the author (the person who wrote the change), the committer, the commit message, and the commit time. A commit also points to its parent commits: none for the very first commit, one for an ordinary commit, and two or more for a merge commit.
It’s also identified by a SHA-1 hash.
Each commit records a full snapshot of the project at the moment you run git commit, not just a list of changes since the previous commit.
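
To make the fields concrete, here is a rough Python sketch of how a commit's content could be assembled and hashed. The helper names (hash_object, commit_body), the identities, and the timestamp are hypothetical, and the sketch assumes the same person is both author and committer with a fixed +0000 time zone.

```python
import hashlib


def hash_object(kind: str, body: bytes) -> str:
    """SHA-1 over '<kind> <size>\\0' + body."""
    header = f"{kind} {len(body)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def commit_body(tree, parents, identity, timestamp, message):
    """Build the text of a commit: tree, parents, author, committer, message."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]  # zero, one, or several parents
    stamp = f"{identity} {timestamp} +0000"
    lines.append(f"author {stamp}")
    lines.append(f"committer {stamp}")
    return ("\n".join(lines) + "\n\n" + message + "\n").encode()


# The well-known ID of the empty tree stands in for a real root tree.
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"

mine = hash_object("commit", commit_body(
    EMPTY_TREE, [], "Alice <alice@example.com>", 1725494400, "Initial commit"))
yours = hash_object("commit", commit_body(
    EMPTY_TREE, [], "Bob <bob@example.com>", 1725494400, "Initial commit"))

# Same tree, different author: the commit IDs differ.
print(mine != yours)  # True

# A second commit names the first one as its parent.
second = hash_object("commit", commit_body(
    EMPTY_TREE, [mine], "Alice <alice@example.com>", 1725498000, "Second commit"))
print(second)
```

Since the parent hash is part of the hashed content, each commit ID depends on the entire history behind it.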

Q&A Section

At this point, some might wonder: if each commit contains the full snapshot of the directory at that moment, doesn’t that mean we have to store a lot of data with every commit?

Let's jump into an example. In the directory tree above, suppose we change the content of the file 1.txt from "HELLO WORLD" to "HELLO WORLD!". The blob for 1.txt gets a new hash, and so do the /docs tree that points to it and the root tree that points to /docs.

At first glance, it looks like the new commit saves a lot of data, but if you look closely, you’ll notice that the unchanged parts, here the blobs for test.js and pic.png, are reused as they are.

In conclusion, if an object hasn’t changed, Git doesn’t create a new copy of that object but keeps the original intact.
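
The effect can be simulated with a toy content-addressed store in Python. This is only a sketch: the dictionary named store stands in for Git's object database, the helpers put and write_tree are hypothetical, the file contents are made up, and real Git additionally compresses and packs objects.

```python
import hashlib

store = {}  # object ID -> raw object bytes, a stand-in for .git/objects


def put(kind: str, body: bytes) -> str:
    """Store an object under its SHA-1 ID; an existing ID is never rewritten."""
    data = f"{kind} {len(body)}".encode("ascii") + b"\x00" + body
    oid = hashlib.sha1(data).hexdigest()
    store.setdefault(oid, data)
    return oid


def write_tree(directory: dict) -> str:
    """Recursively store a directory given as {name: bytes or nested dict}."""
    entries = []
    for name, node in directory.items():
        if isinstance(node, dict):
            entries.append(("40000", name, write_tree(node)))
        else:
            entries.append(("100644", name, put("blob", node)))
    # Directory names sort as if they ended with a slash.
    entries.sort(key=lambda e: (e[1] + "/" if e[0] == "40000" else e[1]).encode())
    body = b"".join(
        mode.encode() + b" " + name.encode() + b"\x00" + bytes.fromhex(oid)
        for mode, name, oid in entries
    )
    return put("tree", body)


first = {
    "test.js": b"console.log('hi');\n",
    "docs": {"pic.png": b"\x89PNG placeholder bytes", "1.txt": b"HELLO WORLD"},
}
root1 = write_tree(first)
count1 = len(store)
print(count1)  # 5 objects: 3 blobs + 2 trees

second = {
    "test.js": b"console.log('hi');\n",
    "docs": {"pic.png": b"\x89PNG placeholder bytes", "1.txt": b"HELLO WORLD!"},
}
root2 = write_tree(second)
new_objects = len(store) - count1
print(new_objects)  # 3 new objects: blob for 1.txt, /docs tree, root tree
```

The second snapshot adds only three objects; the blobs for test.js and pic.png are found under their existing IDs and are not stored again.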

Recap

Blob: Contains the file content.
Tree: Maps the directory and points to blobs and other trees.
Commit: A snapshot of the project (through its root tree) together with the author, message, time, and parent commits.
For example, if I have a file with the content "Hello world" and you have a file with exactly the same content, the blob hash will be the same because it’s computed from the file’s content alone.
Similarly, if I have a folder containing a subfolder and other files, and you have an identical folder with the same structure, file names, file modes, and file contents, the tree hash will be the same.
However, if you commit and I commit, the commit hashes will almost certainly be different because the hash also covers the author, the commit message, and the commit time.

In the next part, I'll talk about branches in Git. Stay tuned!
