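For those new to Git, this is scary. But don't worry: all the commits are still there.
Various GUIs, including Visual Studio, gate access to Git (which may be good or bad, depending on your point of view), so you can't see what's really happening. I don't use these GUIs, because they keep you from seeing what's going on, so I can't say precisely what each clicky button in your GUI does. However, the way Git works is this: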
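You might (in fact, you should) object at this point: how do we know whether HEAD means a commit or a branch name? Git's answer is: I'll pick whichever one I want at the moment. Some things need a branch name, in which case HEAD turns into a branch name. Some things need a commit, in which case HEAD turns into a commit.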
Basically, Git has two internal ways of asking what HEAD is right now. One gives a branch-name answer, such as master or main or whatever; the other gives you a raw commit hash ID.
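From the command line, the two questions correspond roughly to git symbolic-ref (which branch name is HEAD attached to?) and git rev-parse (which commit does HEAD select?). The following is a minimal sketch in a throwaway repository; the identity, file name, and commit message are made up for illustration.

```shell
set -e
# Throwaway identity and repository, so nothing outside this sketch is touched.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the initial branch main on any Git version

echo hello > file.txt
git add file.txt
git commit -q -m "first commit"

# Question 1: which branch name is HEAD attached to?
branch=$(git symbolic-ref --short HEAD)
# Question 2: which commit does HEAD select?
commit=$(git rev-parse HEAD)

echo "branch name: $branch"
echo "commit hash: $commit"
```

The first command fails when HEAD is not attached to any branch name, which is one way scripts can tell the two situations apart.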
OK, so, with that in mind, we now recall that git log prints out a log like this:
commit eb27b338a3e71c7c4079fbac8aeae3f8fbb5c687 (...)
Author: ...
...
commit fe3fec53a63a1c186452f61b0e55ac2837bf18a1
...
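That is, we see all these weird hash IDs spilling out, one at a time. The hash ID is the true name of each commit. Every commit gets a globally unique hash ID: no two different commits are allowed to have the same one. That is why hash IDs are so big and ugly. They look random. They're not actually random, but they are unpredictable.3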
A branch name like main translates into a commit hash ID. A raw hash ID is already a hash ID. Either way, Git can find the commit, as long as it is given the right hash ID.
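To watch the name-to-hash translation happen, git rev-parse can be given a branch name, a full hash ID, or a unique abbreviation of one. This is a small sketch in a scratch repository; the identity and file contents are placeholders.

```shell
set -e
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main
echo hello > file.txt
git add file.txt
git commit -q -m "first commit"

by_name=$(git rev-parse main)            # branch name -> hash ID
by_hash=$(git rev-parse "$by_name")      # a hash ID is already a hash ID
short=$(git rev-parse --short main)      # abbreviated form of the same hash ID
by_short=$(git rev-parse "$short")       # abbreviation -> full hash ID

echo "main     -> $by_name"
echo "full     -> $by_hash"
echo "$short  -> $by_short"
git cat-file -t "$by_name"               # the object found this way is a commit
```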
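Each commit contains a full snapshot of every file,4 plus some metadata: information about the commit itself, such as who made it, when, and whatever log message they chose to write at the time. Crucially for Git itself, one item in this metadata is the raw hash ID of the previous commit.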
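One more seemingly random fact about commits is useful here: once a commit is made, no part of it can ever be changed. This is how the hash IDs actually work, and it is crucial to Git as a distributed version control system. But it also means that no Git commit can hold the raw hash ID of its future child commits, because we don't know what those will be when we create the commit. A commit can store the "name" (hash ID) of its parent, because we do know the parent when we create the child.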
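The stored parent hash ID is visible in the raw commit object, which git cat-file -p prints. The sketch below makes two commits in a scratch repository (file name, messages, and identity are invented for the example) and reads the parent line out of the second one.

```shell
set -e
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main

echo one > file.txt
git add file.txt
git commit -q -m "first"
first=$(git rev-parse HEAD)

echo two > file.txt
git add file.txt
git commit -q -m "second"

# Raw commit object: tree (the snapshot), parent, author, committer, message.
git cat-file -p HEAD

# Pull out just the parent hash ID from the header of each commit.
parent=$(git cat-file -p HEAD | sed -n 's/^parent //p')
root_parent=$(git cat-file -p "$first" | sed -n 's/^parent //p')
echo "parent of second commit: $parent"
```

The very first commit has no parent line at all, which is how the backwards walk knows where to stop.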
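What this means for us is that commits remember their parents, and this forms a sort of backwards-looking chain. All we have to do is remember the raw hash ID of the latest commit. When we do that, we end up with a chain that we can draw like this: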
... <-F <-G <-H <--main
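Here, the name main holds the real hash ID of the latest commit, which for drawing purposes we're calling H. Commit H in turn holds the hash ID of earlier commit G, which holds the hash ID of still-earlier commit F, and so on.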
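We can now see how git log works: it starts with the current commit, H, as selected by the current branch, main. To make main the current branch, we attach the special name HEAD to the name main: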
...--F--G--H <-- main (HEAD)
Git uses HEAD to find main, uses main to find H, and shows us H. Then Git uses H to find G and shows us G; then it uses G to find F, and so on.
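The same backwards walk can be done by hand, which may make the mechanism more concrete. This is only a sketch of the idea for a simple linear history, not how git log is implemented; the three commits are given the subjects F, G, and H to match the drawing, and everything else is a placeholder.

```shell
set -e
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main

for msg in F G H; do
  echo "$msg" > file.txt
  git add file.txt
  git commit -q -m "$msg"
done

# Start where HEAD points, then keep following the first parent.
commit=$(git rev-parse HEAD)
walked=""
while [ -n "$commit" ]; do
  subject=$(git log -1 --format=%s "$commit")
  echo "$commit $subject"
  walked="$walked$subject"
  # The root commit has no parent, so rev-parse prints nothing and the loop ends.
  commit=$(git rev-parse --verify --quiet "$commit^" || true)
done
```

The loop visits H, then G, then F, and stops because F has no parent in this scratch repository.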
When we want to look at any historical commit, we pick it out by its hash ID and tell Git: attach HEAD directly to that commit. We can draw that like this:
...--F <-- HEAD
      \
       G--H <-- main
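When we run git log now, Git translates HEAD to a hash ID (which it finds directly this time; there is no attached branch name) and shows us commit F. Then git log moves backwards from there. Where are commits G and H? They're nowhere to be seen!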
But that's OK: if we run git log main, git log starts with the name main, rather than the name HEAD. That finds commit H, which git log shows; then git log moves on to G, and so on. Or, we can even run:
git log --branches
or:
git log --all
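to have git log find all the branches, or all the references ("references" include branches and tags, but also other kinds of names).
(This opens another, separate can of worms, which is all about how git log handles cases where it "wants" to show more than one commit "at the same time". I won't go there at all in this answer.)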
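A quick way to convince yourself that G and H are still there is to try this in a scratch repository. The sketch below builds the F, G, H chain, attaches HEAD directly to F, and compares what the different git log starting points find; the commit subjects and identity are illustrative only.

```shell
set -e
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main

for msg in F G H; do
  echo "$msg" > file.txt
  git add file.txt
  git commit -q -m "$msg"
done

# Pick out commit F by its hash ID and attach HEAD directly to it.
f=$(git rev-parse main~2)
git checkout -q "$f"

from_head=$(git log --format=%s | tr -d '\n')        # starts at HEAD
from_main=$(git log --format=%s main | tr -d '\n')   # starts at the name main
all_count=$(git rev-list --count --all)              # starts at every reference

echo "git log        : $from_head"
echo "git log main   : $from_main"
echo "commits via --all: $all_count"
```

Starting from HEAD finds only F, while starting from main finds H, G, and F.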
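This "viewing a historical commit" mode is called detached HEAD mode in Git. That's because the special name HEAD is no longer attached to a branch name. To re-attach your HEAD, you just pick a branch name, using git checkout or (in Git 2.23 or later) git switch: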
git switch main
for instance. You have now checked out the commit selected by the branch name main, and HEAD is now re-attached to the name main.
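To check whether HEAD is attached or detached, one option is git symbolic-ref, which succeeds only when HEAD is attached to a branch name. The sketch below detaches HEAD in a scratch repository and then re-attaches it; it uses git checkout so that it also runs on Git versions older than 2.23, and the commits are placeholders.

```shell
set -e
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main

for msg in G H; do
  echo "$msg" > file.txt
  git add file.txt
  git commit -q -m "$msg"
done

# Detach: attach HEAD directly to the older commit, by hash ID.
git checkout -q "$(git rev-parse main~1)"
if git symbolic-ref -q HEAD >/dev/null; then before=attached; else before=detached; fi

# Re-attach: pick a branch name (git switch main does the same in Git 2.23+).
git checkout -q main
after=$(git symbolic-ref --short HEAD)

echo "before: $before"
echo "after : attached to $after"
```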
Before we stop, there's one more really important thing to learn, which is: how branches grow. But let me get footnotes out of the way first.
1 There's an exception to this rule, necessary in a new, totally empty repository that has no commits at all. That exception can be used in a weird way later, in a non-empty repository. You won't be making use of this though.
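2 The lowercase variant, head, often "works" on Windows and macOS (but not on Linux and other systems). This is deceptive, though, because if you start using the git worktree feature, head (lowercase) stops working correctly (it will sometimes get you the wrong commit!) while HEAD (uppercase) keeps working. If you dislike typing in all caps, consider using the shorthand @ character, which you can use in place of HEAD.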
3 Git uses a cryptographic hash here: the same sort of thing found in cryptocurrencies, though not as strong (Git still uses SHA-1 by default, which is outdated in cryptographic terms).
4 The snapshots are stored in a special, read-only, Git-only, compressed and de-duplicated format. Git shows commits as "changes since previous commit" but stores commits as snapshots.
How Git branches grow
Suppose we have the following situation:
...--G--H <-- main (HEAD)
We now want to make a new commit, but we'd like to put it on a new branch. So we first ask Git to make a new branch name, and point that name to commit H too:
git branch develop
which results in:
...--G--H <-- develop, main (HEAD)
Now we pick develop as the name to have HEAD attached to, with git checkout or git switch:
...--G--H <-- develop (HEAD), main
Note that we're still using commit H. We're just using it through the other name now. The commits up through and including H are on both branches.
We now make a new commit, the usual way we do in Git. Once we're ready, we run git commit and give Git a log message to put in the metadata for the new commit. Git now:
- saves a snapshot of every file (de-duplicated as usual);
- uses the current commit as the parent for the new commit, so that our new commit (which we'll call I) will point backwards to existing commit H;
- adds our configured user.name and user.email as the author and committer of this new commit, using "now" as the date-and-time;
- uses our log message; and
- actually writes all of this out as a commit, which assigns it its unique hash ID. (The uniqueness comes in part from the date-and-time stamp, and in part from the input hash ID H, and in part from the snapshot we've saved: everything that is in the new commit goes into making up the new random-looking hash ID, which is why we can't predict it.)
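The steps in this list can be approximated by hand with Git's lower-level commands, git write-tree and git commit-tree. This is only a rough sketch of the idea (git commit does more, such as running hooks), and the identity is supplied through environment variables rather than user.name and user.email so that the example stays self-contained.

```shell
set -e
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main
echo one > file.txt
git add file.txt
git commit -q -m "H"
git checkout -q -b develop

# Stage a change, as we would before running git commit.
echo two >> file.txt
git add file.txt

tree=$(git write-tree)        # save the snapshot
parent=$(git rev-parse HEAD)  # the current commit becomes the parent
new=$(git commit-tree -p "$parent" -m "I" "$tree")   # write the commit object

git cat-file -p "$new"        # shows tree, parent, author, committer, message
echo "new commit: $new"
echo "develop is still at: $(git rev-parse develop)"
```

At this point the new commit exists and points back to its parent, but no branch name has been updated yet.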
So now we have this new commit I, pointing back to existing commit H:
...--G--H
         \
          I
Now Git does the other bit of magic that makes it all work: git commit writes I's hash ID into the current branch name. That is, Git uses HEAD to find the name of the current branch, and updates the hash ID stored in that branch name. So our picture is now:
...--G--H <-- main
         \
          I <-- develop (HEAD)
The name HEAD is still attached to the branch name develop, but the branch name develop now selects commit I, not commit H.
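The whole sequence can be replayed in a scratch repository to confirm which name moves. In this sketch the commit subjects H and I mirror the drawings, and the file contents and identity are made up.

```shell
set -e
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main
echo one > file.txt
git add file.txt
git commit -q -m "H"
h=$(git rev-parse HEAD)

git branch develop          # new name, pointing to H as well
git checkout -q develop     # attach HEAD to develop (git switch develop in Git 2.23+)

echo two >> file.txt
git add file.txt
git commit -q -m "I"        # writes I and moves the current branch name

main_tip=$(git rev-parse main)
develop_tip=$(git rev-parse develop)
echo "HEAD is attached to: $(git symbolic-ref --short HEAD)"
echo "main    -> $main_tip"
echo "develop -> $develop_tip"
```

Only develop moves; main keeps selecting H, and the new commit's parent is H.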
It's commit I that leads back to commit H. The name just lets us find the commit. The commits are what really matter: branch names are just there to let us find the last commit. Whatever hash ID is in that branch name, Git says that that commit is the last commit on that branch. So since main says H right now, H is the last commit on main; since develop says I right now, I is the last commit on develop. Commits up through H are still on both branches, but I is only on develop.
Later, if we like, we can have Git move the name main. Once we move main to I:
...--G--H--I <-- develop, main
then all commits are once again on both branches. (I left out HEAD this time because we might not care which branch we are "on", if both select I. In fact, we can delete either name, but not both, because both names select the same commit and that's all we need to find the right hash ID. If we were to write this hash ID down somewhere, we might not need any name. But that would be ... yucky, at best. We have a computer; let's have it save the big ugly hash IDs for us, in nice neat names.)
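One common way to move main forward like this is a fast-forward merge, though it is not the only one. The sketch below, again in a scratch repository with placeholder commits, moves main to I and then deletes develop to show that commit I remains findable through the remaining name.

```shell
set -e
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main
echo one > file.txt
git add file.txt
git commit -q -m "H"
git checkout -q -b develop
echo two >> file.txt
git add file.txt
git commit -q -m "I"
i=$(git rev-parse develop)

# Move the name main forward to I without creating any new commit.
git checkout -q main
git merge -q --ff-only develop
main_tip=$(git rev-parse main)

# Both names now select I, so one of them can go.
git branch -q -d develop
echo "main -> $main_tip"
git log --format=%s main
```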