A Detailed Guide To Understand How Git-Rebase Works | by Dwen | Jul, 2022

Usage details and working principle


After a week of in-depth study, I thoroughly understood the working principle of Git-Rebase. Today, I am going to write a more in-depth analysis blog and share it with you.

The documentation for the git rebase command describes it as “Reapply commits on top of another base tip”.

This definition sounds a bit abstract, but it can be understood as changing the base of a branch from one commit to another, making it appear as if the branch was created from another commit.

As shown below:

Suppose we create a Feature branch from Master’s commit A for new feature development. Commit A is then the base of Feature.

Then Master added two new commits B and C, and Feature added two new commits D and E.

Now suppose the new feature depends on commits B and C, so we need to integrate those two Master commits into the Feature branch. To keep the commit history clean, we can switch to the Feature branch and perform the rebase operation:

git rebase master

The rebase first finds the nearest common ancestor of the two branches (the current branch Feature and the target base branch Master), which is commit A.

It then takes the commits the current branch has added since that ancestor (D and E), extracts their changes and saves them temporarily, and points the current branch at commit C, the commit that the target base Master points to.

Finally, using C as the new base, it applies the saved changes one by one in their original order.

We can also understand the above as changing the base of the Feature branch from commit A to commit C, so that it looks as if the branch had been created from commit C and commits D and E were added afterwards.

But that is only how it looks. Internally, Git copies the changes from commits D and E, creates new commits D’ and E’, and applies them on top of the new base (A→B→C).

Although the new Feature branch looks the same as before, it is made up of brand new commits.

The essence of the rebase operation is to discard some existing commits and create corresponding new commits that carry the same changes but are different commit objects with different IDs.
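To see this for ourselves, here is a minimal sketch that rebuilds the A to E history described above in a throwaway directory and rebases Feature onto Master. The commit helper, the file names and the demo identity are assumptions made for illustration only. The script compares the ID and the patch ID of commit D before and after the rebase.

```shell
#!/bin/sh
set -eu
# Throwaway identity so commits work without any global Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch "master"

# Hypothetical helper: each commit adds its own file, so nothing conflicts.
commit() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "commit $1"
}

commit A
git checkout -q -b feature
commit D
commit E
git checkout -q master
commit B
commit C
git checkout -q feature

old_d=$(git rev-parse feature~1)                                # original D
old_patch=$(git show "$old_d" | git patch-id | cut -d' ' -f1)

git rebase -q master

new_d=$(git rev-parse feature~1)                                # recreated D'
new_patch=$(git show "$new_d" | git patch-id | cut -d' ' -f1)

echo "D  commit ID: $old_d"
echo "D' commit ID: $new_d"
echo "patch IDs:    $old_patch / $new_patch"
git log --format='%h %s' feature
```

The two commit IDs differ while the two patch IDs are the same, which is what "same changes, different commits" means in practice.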

The main purpose.

rebase is often used to rewrite commit history. The following usage scenarios are common in most Git workflows:

  • We pulled a feature branch from the master branch for feature development locally.
  • The remote master branch later merged some new commits.
  • We want to integrate the latest changes from master in the feature branch.

The above scenarios can also be achieved using merge, but using rebase allows us to maintain a linear and cleaner commit history. Suppose we have the following branches:

  D---E  feature
 /
A---B---C  master

Now we will use merge and rebase in turn to integrate commits B and C of the master branch into the feature branch, add a commit F to the feature branch, and then merge the feature branch into master. Finally, we will compare the commit histories produced by the two methods.

Use merge.

  • Switch to the feature branch: git checkout feature.
  • To merge updates from the master branch: git merge master.
  • Add a new commit F: git add . && git commit -m "commit F".
  • Switch back to the master branch and perform a fast-forward merge: git checkout master && git merge feature.

The execution process is shown in the following figure:

We will get the following commit history:

* 6fa5484 (HEAD -> master, feature) commit F
*   875906b Merge branch 'master' into feature
|\
* | 5b05585 commit E
* | f5b0fc0 commit D
| * d017dff commit C
| * 9df916f commit B
|/
* cb932a6 commit A

Use rebase.

The steps are basically the same as using merge, the only difference is that the command in step 2 is replaced with: git rebase master.

The execution process is shown in the following figure:


We will get the following commit history:

* 74199ce (HEAD -> master, feature) commit F
* e7c7111 commit E
* d9623b0 commit D
* 73deeed commit C
* c50221f commit B
* ef13725 commit A

It can be seen that the commit history formed by using the rebase method is completely linear and compared to the merge method, there is one less merge commit, which looks cleaner.
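If you want to reproduce the comparison, a rough sketch like the one below runs both workflows in two throwaway repositories and counts the commits. The make_repo and commit helpers and the demo identity are hypothetical names introduced for this example.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: each commit adds its own file, so nothing conflicts.
commit() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "commit $1"
}

# Hypothetical helper: fresh repo with A on master, D-E on feature,
# B-C on master, and feature checked out at the end.
make_repo() {
  cd "$(mktemp -d)"
  git init -q
  git symbolic-ref HEAD refs/heads/master
  commit A
  git checkout -q -b feature
  commit D
  commit E
  git checkout -q master
  commit B
  commit C
  git checkout -q feature
}

# Workflow 1: merge master into feature, add F, fast-forward master.
make_repo
git merge -q -m "Merge branch 'master' into feature" master
commit F
git checkout -q master
git merge -q feature
merge_merges=$(git rev-list --merges --count HEAD)
merge_total=$(git rev-list --count HEAD)

# Workflow 2: rebase feature onto master, add F, fast-forward master.
make_repo
git rebase -q master
commit F
git checkout -q master
git merge -q feature
rebase_merges=$(git rev-list --merges --count HEAD)
rebase_total=$(git rev-list --count HEAD)

echo "merge workflow:  $merge_total commits, $merge_merges merge commit(s)"
echo "rebase workflow: $rebase_total commits, $rebase_merges merge commit(s)"
```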

Why keep your commit history clean?

What’s the benefit of a seemingly cleaner commit history?

  • It satisfies some developers’ sense of tidiness.
  • When you need to go back through the commit history to track down a bug, it is easier to locate the commit that introduced it.

This matters especially when you use git bisect to search for a bug among dozens or hundreds of commits, or when a large feature branch has to pull updates from the remote master branch frequently.

Using rebase to integrate the remote changes into the local repository is a better option.

The result of pulling remote changes with merge is that every time you want to get the latest progress on the project, there will be a redundant merge commit.

And the result of using rebase is more in line with our intent: I want to build on other people’s completed work with my changes.

Other ways to rewrite commit history.

Using git commit --amend is more convenient when we only want to amend the most recent commit.

It works in the following scenarios:

  • We just made a commit, but haven’t pushed it to a public branch yet.
  • We then notice that the last commit left some loose ends, such as a comment line we forgot to delete or a small typo. We can fix it quickly, but we don’t want to add a separate commit for it.
  • Or we just feel that the message of the last commit was not written well enough and want to change it.

In these cases, we can stage the new changes (or skip this step if only the message needs fixing) and run git commit --amend. An editor window opens in which the message of the last commit can be modified. The staged changes and the new message are then applied to the last commit.

If we have already pushed the last commit to the remote branch, a normal push will now be rejected with an error. Provided the branch is not a public branch, we can force the push with git push --force.

Note that, as with rebase, Git doesn’t actually modify the previous commit internally. It creates a brand new commit and points the branch at it.
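A small sketch of the amend scenario, run in a throwaway repository: the last commit has a typo in its message and is missing a file, and we fix both with a single amend. The file names and the demo identity are made up for this example. Passing -m is an assumption made so the script runs without opening an editor.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master

echo "hello" > app.txt
git add app.txt
git commit -q -m "commit A"

echo "feature" > feature.txt
git add feature.txt
git commit -q -m "comit B"            # typo, and notes.txt was forgotten
before=$(git rev-parse HEAD)

echo "notes" > notes.txt
git add notes.txt                      # stage the forgotten change
git commit -q --amend -m "commit B"    # replace the last commit
after=$(git rev-parse HEAD)

echo "before amend: $before"
echo "after amend:  $after"
git log --format='%h %s'
```

The history still contains two commits, but the tip now has a different ID, and the old commit object is still in the repository until Git garbage-collects it.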

Rewrite commit history using rebase’s interactive mode.

The git rebase command has two modes: standard and interactive. In the previous examples, we used the default standard mode. Add the -i or --interactive option after the command to use the interactive mode.

The difference between the two modes?

As we mentioned earlier, rebase is “reapplying commits on top of another base”, and in the process of reapplying, these commits are recreated and can naturally be modified.

In standard mode, the commits of the current working branch are applied directly on top of the tip of the target branch.

Interactive mode allows us to rewrite, reorder, and delete commits through an editor and a set of commands before they are reapplied.

The most common usage scenarios of the two are therefore different:

  • Standard mode is often used to integrate the latest changes from other branches in the current branch.
  • Interactive mode is often used to edit the commit history of the current branch, such as merging multiple small commits into a larger one.

It’s more than just branches.

Although our previous examples performed rebase operations between two different branches, the argument passed to the rebase command is not limited to branches.

Any commit reference can be considered a valid rebase base object, including a commit ID, branch name, tag name, or a relative reference such as HEAD~1.

Naturally, if we rebase onto an earlier commit of the current branch, all commits after that commit are reapplied on the current branch, which allows us to make changes to those commits in interactive mode.
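As a rough illustration, the sketch below names the same earlier commit in four different ways (branch, tag, relative reference, full ID) and lists the commits that lie after it, which are the ones a rebase onto that commit would reapply. The branch name old, the tag v1, the commit helper and the demo identity are assumptions for this example.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master

# Hypothetical helper: each commit adds its own file.
commit() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "commit $1"
}

for c in A B C D; do commit "$c"; done

git branch old HEAD~2          # a branch pointing at commit B
git tag v1 HEAD~2              # a tag pointing at commit B
base_id=$(git rev-parse HEAD~2)

for base in old v1 HEAD~2 "$base_id"; do
  # <base>..HEAD is the set of commits after the base, oldest first.
  replayed=$(git log --reverse --format=%s "$base"..HEAD | tr '\n' ' ')
  echo "base $base -> reapplied: $replayed"
done
```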

This brings us to the main topic of this article. As mentioned earlier, running rebase in interactive mode on a commit of the current branch (indirectly) rewrites all commits after that commit. We will walk through this in detail with the following example.

Suppose we have the following commits in the feature branch:

74199cebdd34d107bb67b6da5533a2e405f4c330 (HEAD -> feature) commit F
e7c7111d807c1d5209b97a9c75b09da5cd2810d4 commit E
d9623b0ef9d722b4a83d58a334e1ce85545ea524 commit D
73deeedaa944ef459b17d42601677c2fcc4c4703 commit C
c50221f93a39f3474ac59228d69732402556c93b commit B
ef1372522cdad136ce7e6dc3e02aab4d6ad73f79 commit A

The next thing we will do is:

  • Merge B and C into a new commit, keeping only the commit message of the original commit C.
  • Delete commit D.
  • Move commit E after commit F and rename it (i.e. modify its commit message) to commit H.
  • Include a new file change in commit F and rename it to commit G.

Since the commits we need to modify are B→C→D→E→F, we need to use commit A as the new “base”, and all commits after commit A will be reapplied:

git rebase -i ef1372522cdad136ce7e6dc3e02aab4d6ad73f79 # The parameter is the ID of commit A

Next, you will enter the following editor interface:

pick c50221f commit B
pick 73deeed commit C
pick d9623b0 commit D
pick e7c7111 commit E
pick 74199ce commit F
# Rebase ef13725..74199ce onto ef13725 (5 commands)
# Commands:
# p, pick <commit> = use commit
# r, reword <commit> = use commit, but edit the commit message
# e, edit <commit> = use commit, but stop for amending
# s, squash <commit> = use commit, but meld into previous commit
# f, fixup <commit> = like "squash", but discard this commit's log message
# x, exec <command> = run command (the rest of the line) using shell
# b, break = stop here (continue rebase later with 'git rebase --continue')
# d, drop <commit> = remove commit
# l, label <label> = label current HEAD with a name
# t, reset <label> = reset HEAD to a label
# m, merge [-C <commit> | -c <commit>] <label> [# <oneline>]

Note: the commit information after the commit ID above is for descriptive purposes only, modifying them here will have no effect.

The specific operation commands have been explained in great detail in the editor’s comments, so we directly perform the following operations:

1. Make the following changes to commits B and C.

pick c50221f commit B
s 73deeed commit C

Since commit B is the first of these commits, we can’t squash or fixup it (there is no previous commit). We also don’t need to reword commit B, because when commit C is squashed into commit B later, Git lets us edit the message of the combined commit.

Note that this interface lists commits from top to bottom, oldest to newest. Changing the command for commit C to s (or squash) or f (or fixup) therefore merges it into the previous commit B (the line above). The difference between the two commands is whether the commit message of C is retained: squash lets us edit the combined message, while fixup silently keeps only B’s message. Because we want to end up with C’s message, we use s here.

2. Remove commit D.

d d9623b0 commit D

3. Move commit E after commit F and modify its commit info.

pick 74199ce commit F
r e7c7111 commit E

4. Add a new file change in commit F.

e 74199ce commit F

5. Save and exit. The complete list now reads:

pick c50221f commit B
s 73deeed commit C
d d9623b0 commit D
e 74199ce commit F
r e7c7111 commit E

Next, the commands we modify or retain for each commit are executed in order from top to bottom:

  • The pick command for commit B is executed automatically, so no interaction is required.
  • Then the squash command for commit C is executed. It opens a new editor interface that lets us edit the commit message of the combined commit:

# This is a combination of 2 commits.
# This is the 1st commit message:

commit B

# This is the commit message #2:

commit C

We delete the commit B line, then save and exit. The merged commit will use commit C as its commit message.

  • The drop operation on commit D is also performed automatically, with no interactive steps.
  • Conflicts may occur during the rebase. When they do, the rebase pauses, and we need to edit the conflicting files and resolve the conflicts manually.
  • Once a conflict is resolved, mark it as resolved with git add/rm <conflicted_files> and then run git rebase --continue to carry on with the rebase.
  • Alternatively, run git rebase --abort to abort the rebase and return to the state before it started.
  • Since we moved commit F up, the edit operation on F is performed next. The rebase pauses and returns us to the shell with a message like this:

Stopped at 74199ce...  commit F
You can amend the commit now, with

  git commit --amend

Once you are satisfied with your changes, run

  git rebase --continue

  • We add a new code file and run git commit --amend to fold it into the current commit (i.e. F), change its commit message to commit G in the editor, and finally run git rebase --continue to resume the rebase.
  • Finally, the reword operation on commit E is performed, and we change its commit message to commit H in the editor.
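The conflict handling mentioned in the list above does not come up in our example, because the commits there do not touch the same lines. As a separate, minimal sketch, the script below provokes a conflict on purpose in a throwaway repository and then resolves it. The file name, the resolution and the demo identity are assumptions. GIT_EDITOR=true is used only so that the script does not stop in an editor when the rebase continues.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master

echo "base" > file.txt
git add file.txt
git commit -q -m "commit A"

git checkout -q -b feature
echo "feature" > file.txt
git commit -q -am "commit D"

git checkout -q master
echo "master" > file.txt
git commit -q -am "commit B"
git checkout -q feature

# The rebase exits with a non-zero status when commit D cannot be applied.
git rebase master >/dev/null 2>&1 || echo "rebase paused on a conflict"
git status --short                     # file.txt is listed as unmerged

# Resolve by hand (here: keep both lines), then mark as resolved.
printf 'master\nfeature\n' > file.txt
git add file.txt
GIT_EDITOR=true git rebase --continue >/dev/null

git log --format='%h %s'
```

Running git rebase --abort instead of the last three commands would put the feature branch back where it was before the rebase started.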

You’re done! Finally, let’s confirm the commit history after the rebase.

64710dc88ef4fbe8fe7aac206ec2e3ef12e7bca9 (HEAD -> feature) commit H
8ab4506a672dac5c1a55db34779a185f045d7dd3 commit G
1e186f890710291aab5b508a4999134044f6f846 commit C
ef1372522cdad136ce7e6dc3e02aab4d6ad73f79 commit A

Exactly as expected, and you can also see that all commit IDs after commit A have changed, which confirms that Git recreated those commits as we said earlier.
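For readers who want to replay the whole exercise without typing into an editor, here is a hedged sketch that drives the same rebase from a script. It is not the interactive session described above: the todo list is written to a temporary file and handed to Git through GIT_SEQUENCE_EDITOR, the squash message is edited with sed through GIT_EDITOR, and the edit and reword steps are replaced by exec lines that run git commit --amend. These substitutions, the commit helper, the file names and the demo identity are all assumptions made so the script can run unattended.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/feature

# Hypothetical helper: each commit adds its own file, so nothing conflicts.
commit() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "commit $1"
}

for c in A B C D E F; do commit "$c"; done

base=$(git rev-parse HEAD~5)            # commit A
hb=$(git rev-parse --short HEAD~4)
hc=$(git rev-parse --short HEAD~3)
hd=$(git rev-parse --short HEAD~2)
he=$(git rev-parse --short HEAD~1)
hf=$(git rev-parse --short HEAD)

# The todo list from the walkthrough; exec lines stand in for edit/reword.
todo=$(mktemp)
cat > "$todo" <<EOF
pick $hb commit B
squash $hc commit C
drop $hd commit D
pick $hf commit F
exec echo G > G.txt && git add G.txt && git commit -q --amend -m "commit G"
pick $he commit E
exec git commit -q --amend -m "commit H"
EOF

# cp overwrites Git's todo file; sed deletes the "commit B" line from the
# combined squash message, so only "commit C" remains.
GIT_SEQUENCE_EDITOR="cp '$todo'" \
GIT_EDITOR="sed -i -e '/^commit B\$/d'" \
git rebase -i "$base" >/dev/null

git log --format='%h %s'
```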

# 1. Do a rebase before merging.

Another common use of rebase is to perform a rebase before pushing to the remote for merging, generally to ensure a clean commit history.

We first develop in our own feature branch. When development is complete, we rebase the feature branch onto the latest master branch, resolve any conflicts in advance, and then push the result to the remote.

This way, the maintainer of the master branch of the remote repository no longer needs to merge and create an additional merge commit, but only needs to perform a fast-forward merge.

Even with multiple branches being developed in parallel, you end up with a completely linear commit history.
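A minimal sketch of this idea, using git merge --ff-only to make the fast-forward requirement explicit: before the rebase, master cannot be fast-forwarded to the feature branch. After the rebase it can. The commit helper, the ff_before variable and the demo identity are assumptions for this example.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master

# Hypothetical helper: each commit adds its own file.
commit() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "commit $1"
}

commit A
git checkout -q -b feature
commit D
commit E
git checkout -q master
commit B

# master has moved on, so a fast-forward to feature is refused.
if git merge -q --ff-only feature 2>/dev/null; then
  ff_before=yes
else
  ff_before=no
fi
echo "fast-forward possible before rebase: $ff_before"

git checkout -q feature
git rebase -q master                   # replay D and E on top of B

git checkout -q master
git merge -q --ff-only feature         # now succeeds, no merge commit
git log --format='%h %s'
```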

# 2. rebase to another branch.

With rebase we can also compare two branches, take the changes that only one of them contains, and apply those changes to a third branch. For example:

    F---G  patch
   /
  D---E  feature
 /
A---B---C  master

Suppose we created a branch patch based on commit D of the feature branch, and added commits F and G.

Now we want to merge the changes made in patch into master and publish them, but we don’t want to merge feature yet. In this case, we can use the --onto <branch> option of rebase:

git rebase --onto master feature patch

The above will take the patch branch, compare the changes it made based on the feature, and reapply those changes on the master branch, making the patch look as if the changes were directly based on master. The patch after execution is as follows.

A---B---C---F'---G' patch

Then we can switch to the master branch and perform a fast-forward merge on the patch:

git checkout master
git merge patch
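The following sketch rebuilds the three branches from the diagram in a throwaway repository and runs the same commands, so the result can be inspected. The commit helper, the file names and the demo identity are assumptions for illustration.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master

# Hypothetical helper: each commit adds its own file, so nothing conflicts.
commit() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "commit $1"
}

commit A
git checkout -q -b feature
commit D
git checkout -q -b patch               # patch starts at commit D
commit F
commit G
git checkout -q feature
commit E
git checkout -q master
commit B
commit C

# Replay the commits in feature..patch (F and G) on top of master.
git rebase -q --onto master feature patch
git log --format='%h %s' patch

git checkout -q master
git merge -q --ff-only patch           # fast-forward master to patch
```

After the script runs, master contains A, B, C, F' and G', while D and E remain only on feature.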

git pull via rebase strategy.

After a certain version of Git, running git pull directly will prompt the following message:

warning: Pulling without specifying how to reconcile divergent branches is
discouraged. You can squelch this message by running one of the following
commands sometime before your next pull:

  git config pull.rebase false  # merge (the default strategy)
  git config pull.rebase true   # rebase
  git config pull.ff only       # fast-forward only

It turns out that git pull can also integrate changes by rebasing, because git pull is essentially equivalent to git fetch followed by git merge.

In the second step, we can directly replace git merge with git rebase to merge the changes obtained by fetch, which also avoids additional merge commits to maintain a linear commit history.

The difference between the two has been compared above. We can regard the Master branch in the comparison example as a remote branch and the Feature branch as a local branch.

When we perform a git pull locally, we actually pull the changes from the Master and merge them into the Feature branch.

If both branches have different commits, the default git merge method produces a single merge commit to integrate those commits.

Using git rebase is equivalent to recreating the local branch based on the latest commit of the remote branch, and then reapplying the locally added commit.

There are several specific ways to use it:

  • Add the option each time you run the pull command: git pull --rebase.
  • Set the configuration item for the current repository: git config pull.rebase true. Add the --global option after git config to make the setting apply to all repositories.
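To try this without a real server, the sketch below uses a local bare repository as the "remote" and two working copies. The directory names origin.git, alice and bob, the commit helper and the demo identity are all assumptions for this example. Bob has a local commit that is not on the remote, Alice pushes a new commit, and Bob then pulls with --rebase.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Hypothetical helper: each commit adds its own file.
commit() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "commit $1"
}

git init -q --bare origin.git          # stands in for the remote server

git init -q alice
(
  cd alice
  git symbolic-ref HEAD refs/heads/master
  commit A
  git remote add origin "$work/origin.git"
  git push -q origin master
)

git clone -q -b master "$work/origin.git" bob

(cd alice && commit B && git push -q origin master)   # remote moves on

cd bob
commit L                               # local-only commit
git pull -q --rebase origin master     # fetch, then replay L on top of B
git log --format='%h %s'
```

Bob’s history ends up as A, B, L with no merge commit, which is the linear result described above.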

In the above scenarios, rebase is very powerful, but we also need to realize that it is not a panacea, and it is even dangerous for novices.

If you are a little careless, you may find that commits are missing from the git log, or that you are stuck at some step of a rebase and don’t know how to get back.

We’ve mentioned above that rebase has the advantage of maintaining a clean, linear commit history, but it’s important to be aware of the following potential drawbacks:

If the rebase involves commits that have already been pushed, a forced push is required to push the rebased commits to the remote.

  • So never rebase a public branch (that is, a branch other people are also developing on). Otherwise, someone else’s git pull will merge a confusing mix of old and rewritten commits into their local history, which then gets pushed back to the remote branch. The remote commit history will be thrown into disorder, and in severe cases, it may pose a risk to your personal safety.
  • It is not friendly to novices, who are likely to “lose” some commits by mistake in interactive mode (although these can actually be retrieved).
  • If you use rebase frequently to integrate updates to the master branch, a potential consequence is that you will encounter more and more conflicts that need to be merged.
  • Although you can handle these conflicts during the rebase process, this is not a permanent solution, and it is more recommended to frequently merge into the master branch and then create new feature branches, rather than using a long-lived feature branch.

There are also arguments that we should avoid rewriting the commit history at all.

There is an argument that the commit history of a repository is a record of what actually happened. It is a historical document, which is valuable in itself and cannot be altered arbitrarily.

From this point of view, changing the commit history is blasphemy: you’re using a lie to cover up what actually happened.

What if the commit history resulting from the merge is a mess? Since this is the case, these traces should be preserved so that future generations can refer to them.

And frequent use of rebase can make it harder to locate bugs from historical commits, see Why you should stop using Git rebase.

After rebase in interactive mode and executing commands like squash or drop on commits, commits are removed directly from the branch’s git log. If you make a mistake, you’ll break a sweat thinking those commits are gone forever.

But these commits are not actually deleted. As mentioned above, Git does not modify (or delete) the original commit, but recreates a new batch of commits and points the current branch tip to the new commit.

So we can use git reflog to find and repoint the original commits to restore them, which undoes the entire rebase.

Thanks to Git, even if you do something like rebase or commit --amend to rewrite the commit history, it doesn’t really lose any commits.

git reflog command.

Reflogs are a mechanism that Git uses to record updates to the branch tip of the local repository. It records all commits that the branch tip has ever pointed to, so reflogs allow us to find and switch to a commit that is not currently referenced by any branch or tag.

Whenever the branch tip is updated for any reason (by switching branches, pulling new changes, rewriting history, or adding new commits), a new record will be added to the reflogs.

This way, every commit we make locally will definitely be recorded in reflogs.

Even after rewriting the commit history, reflogs contain information about the old state of the branch and allow us to revert to that state if needed.

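Note that reflogs are not stored permanently. By default, entries expire after 90 days, and entries for commits that are no longer reachable from the current branch tip (such as commits replaced by a rebase) expire after 30 days. The two limits are controlled by gc.reflogExpire and gc.reflogExpireUnreachable.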

Let’s continue from the previous example. Suppose we want to restore the A→B→C→D→E→F commit history that the feature branch had before the rebase. The original five commits after A no longer appear in the git log, so we need to find them in the reflogs. Running git reflog gives the following:

64710dc (HEAD -> feature) HEAD@{0}: rebase (continue) (finish): returning to refs/heads/feature
64710dc (HEAD -> feature) HEAD@{1}: rebase (continue): commit H
8ab4506 HEAD@{2}: rebase (continue): commit G
1e186f8 HEAD@{3}: rebase (squash): commit C
c50221f HEAD@{4}: rebase (start): checkout ef1372522cdad136ce7e6dc3e02aab4d6ad73f79
74199ce HEAD@{5}: checkout: moving from master to feature

The reflog records the whole process of switching branches and rebasing. Searching further down, we find the commit F that disappeared from the git log:

74199ce HEAD@{15}: commit: commit F

Next, we re-point the tip of the feature branch to the original commit F via git reset:

# We want to restore the files in the working tree as well, so we use the --hard option
$ git reset --hard 74199ce
HEAD is now at 74199ce commit F

Running git log again shows that everything is back to the way it was:

74199cebdd34d107bb67b6da5533a2e405f4c330 (HEAD -> feature) commit F
e7c7111d807c1d5209b97a9c75b09da5cd2810d4 commit E
d9623b0ef9d722b4a83d58a334e1ce85545ea524 commit D
73deeedaa944ef459b17d42601677c2fcc4c4703 commit C
c50221f93a39f3474ac59228d69732402556c93b commit B
ef1372522cdad136ce7e6dc3e02aab4d6ad73f79 commit A
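The recovery can be rehearsed safely in a throwaway repository. The sketch below drops commit D with a scripted interactive rebase, then finds the original commit F in the reflog and resets the branch to it. Using sed as the sequence editor and searching the reflog with grep are assumptions made so that the script runs unattended. The commit helper and the demo identity are also made up for this example.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/feature

# Hypothetical helper: each commit adds its own file.
commit() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "commit $1"
}

for c in A B C D E F; do commit "$c"; done
tip_before=$(git rev-parse HEAD)

# Deleting a line from the todo list drops that commit.
GIT_SEQUENCE_EDITOR="sed -i -e '/commit D/d'" \
git rebase -i "$(git rev-parse HEAD~5)" >/dev/null 2>&1

echo "after the rebase:"
git log --format='%h %s'

# Look up the reflog entry written when commit F was originally created.
old_tip=$(git reflog --format='%H %gs' \
  | grep -E '^[0-9a-f]+ commit: commit F$' | head -n 1 | cut -d' ' -f1)

git reset -q --hard "$old_tip"

echo "after the reset:"
git log --format='%h %s'
```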

Thank you for reading this article.

Stay tuned for more.
