
Rebase. Squash. Merge. Repeat.

You open a pull request. The CI checks pass. A reviewer leaves a comment:

"Please squash your commits and rebase onto the latest main."

If you are new to contributing to open source or working on a team with a structured workflow, that request can feel like an obstacle between you and getting your work merged. It is not. It is a signal that the project cares about its history, and that caring about history is worth your time too.

This post covers the contributor side: what to do to a branch before opening a pull request (or merge request, as GitLab calls it). That means rebasing, squashing, and signing.

The maintainer side, including the three GitHub merge strategies and why squash and merge is the right default, is covered in Squash and Merge: A Better Default.

A lot of what follows is inspired by Marc Gasch's Git rebase, squash...oh my!. This post builds on that foundation and connects these practices to the structured commit workflow covered in Conventional Commits: How to Write a Better Git Commit Message and the contributor expectations described in CONTRIBUTING.md: Writing Practical Contribution Guidelines for GitHub Repositories.

The Branch-Based Workflow

The foundation of everything here is the branch-based workflow. The idea is simple: never work directly on main. For every piece of work, whether it is a new feature, a bug fix, or a documentation update, create a dedicated branch off of main, do the work there, then integrate it back via a pull request.

git checkout main
git pull origin main
git checkout -b feat/add-datastore-cluster-support

The branch is yours. You can commit freely, make mistakes, backtrack, and experiment. The important thing is that main remains stable and that the branch has a clear purpose. A branch named feat/add-datastore-cluster-support tells anyone who looks at it exactly what it contains. A branch named my-changes tells them nothing.

Two setups are common. If you are a member of a team with direct push access to the repository, you push your branch to the shared repository and origin is the same repository everyone else is working in. If you are an external contributor or prefer to work from your own copy, you fork the repository to your own account, push your branch to your fork, and open the pull request from there. In that case, origin points to your fork. Add the upstream repository as a separate remote so you can fetch new commits from it:

git remote add upstream https://github.com/example/repo.git

The rebase steps later in this post use origin/main as the target, which is correct for the team workflow. Fork contributors should substitute upstream/main wherever origin/main appears.
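The fork setup can be tried end to end without touching a real project. The sketch below is only an illustration: local bare repositories stand in for the hosted upstream and fork, and every path, branch name, and commit message in it is hypothetical.

```shell
set -eu
# Throwaway identity so commits work without any global Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

work="$(mktemp -d)"
cd "$work"

# Local stand-ins for the hosted repositories (hypothetical paths).
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/main
echo base > seed/README
git -C seed add README
git -C seed commit -q -m "chore: initial commit"
git clone -q --bare seed upstream.git      # plays the upstream project
git clone -q --bare upstream.git fork.git  # plays your fork

# Contributor clone: origin is the fork, upstream is added by hand.
git clone -q fork.git contributor
cd contributor
git remote add upstream "$work/upstream.git"

# Upstream moves forward while you work on a branch.
echo more >> "$work/seed/README"
git -C "$work/seed" commit -q -am "docs: unrelated upstream change"
git -C "$work/seed" push -q "$work/upstream.git" main

git checkout -q -b feat/demo
echo feature > feature.txt
git add feature.txt
git commit -q -m "feat: contributor work"

# Fork contributors fetch from upstream and rebase onto upstream/main.
git fetch -q upstream
git rebase -q upstream/main
ahead="$(git rev-list --count upstream/main..HEAD)"
behind="$(git rev-list --count HEAD..upstream/main)"
echo "ahead of upstream/main: $ahead, behind: $behind"
```

After the rebase the branch sits directly on top of upstream/main, which is the state a maintainer wants to see in a pull request from a fork.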

The Problem With Merge

When your feature branch falls behind main (because other people are merging their own work in the meantime), you need to bring those changes into your branch before you can merge. The most obvious way to do that is to merge main into your feature branch:

git checkout feat/add-datastore-cluster-support
git fetch origin
git merge origin/main

This works. Git creates a merge commit that records the point where the two branches joined, and you continue from there. But if you do this repeatedly over the life of a feature branch, you accumulate merge commits that describe housekeeping rather than progress. The history of your branch ends up interleaved with unrelated work from main.

When that branch gets merged into main, all of those merge commits come with it. Over time, git log becomes a branching tangle rather than a readable record. The project history stops telling a coherent story.

* 3f7a921 Merge branch 'main' into feat/add-datastore-cluster-support
* b4c1e03 wip
* 9a2d8f1 Merge branch 'main' into feat/add-datastore-cluster-support
* 7e6a3c4 more fixes
* 2b5d0e9 fix
* 8c4f1a7 actually works now

That is not a history anyone can use.
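To see how quickly the merge commits pile up, here is a small sketch that repeats the catch-up merge twice in a throwaway repository. It assumes a local main standing in for origin/main, and the branch name, file names, and messages are made up for the demonstration.

```shell
set -eu
# Throwaway identity so commits work without any global Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git commit -q --allow-empty -m "chore: initial commit"
git checkout -q -b feat/demo

for round in 1 2; do
  # Feature work on the branch.
  echo "feature $round" >> feature.txt
  git add feature.txt
  git commit -q -m "wip $round"

  # Someone else lands unrelated work on main.
  git checkout -q main
  echo "other $round" >> other.txt
  git add other.txt
  git commit -q -m "docs: unrelated change $round"

  # Catch up by merging main into the feature branch.
  git checkout -q feat/demo
  git merge -q --no-edit main
done

# Count the merge commits that now live on the feature branch.
merges="$(git rev-list --merges --count main..feat/demo)"
echo "merge commits on the branch: $merges"
git log --graph --oneline main..feat/demo
```

Two rounds of catching up leave two merge commits on a branch that only contains two commits of real work.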

Rebase: Replaying Your Work on Top of Main

Rebasing is the alternative. Instead of creating a merge commit that joins two branches, a rebase takes your commits and replays them on top of a new base. The result is a linear history that reads as if you had done your work after the latest changes to main, even if the reality was more complicated.

git checkout feat/add-datastore-cluster-support
git fetch origin
git rebase origin/main

Git temporarily sets aside your commits, moves your branch to the tip of origin/main, then replays each of your commits one at a time on top of it. If there are conflicts, Git pauses and lets you resolve them before continuing.

The result is that your branch now starts from the current tip of main. When it gets merged, there are no merge commits to clutter the history. The log reads cleanly:

* 4a9e1b2 feat(datastore): add support for datastore clusters
* 1c3f5d8 test(datastore): add cluster selection tests

That is the difference. Merge preserves every junction in the history. Rebase linearizes it so that the history reflects what changed, not how the branches moved around each other while it was happening.
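The same situation handled with a rebase can be sketched in a throwaway repository. As before, a local main stands in for origin/main, and the names and messages are assumptions made for the example.

```shell
set -eu
# Throwaway identity so commits work without any global Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git commit -q --allow-empty -m "chore: initial commit"

# Feature work starts from the initial commit.
git checkout -q -b feat/demo
echo feature > feature.txt
git add feature.txt
git commit -q -m "feat: add feature"

# Meanwhile main moves forward.
git checkout -q main
echo other > other.txt
git add other.txt
git commit -q -m "docs: unrelated change"

# Replay the feature commit on top of the new tip of main.
git checkout -q feat/demo
git rebase -q main

merges="$(git rev-list --merges --count main..feat/demo)"
parent="$(git rev-parse feat/demo~1)"
main_tip="$(git rev-parse main)"
echo "merge commits on the branch: $merges"
git log --graph --oneline feat/demo
```

The branch now has no merge commits, and the parent of its only commit is the tip of main.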

The Golden Rule

There is one rule about rebasing that you must internalize before you use it on anything that matters:

Never rebase commits that have already been pushed to a shared branch.

Rebasing rewrites history. When you rebase, each commit gets a new SHA hash. If someone else has pulled your branch and based work on those commits, a rebase will make your history diverge from theirs in a way that is painful to reconcile.

The rule is simple: rebase freely on branches that only you are using. Never rebase main, develop, or any branch that other people are actively working with. If you are unsure whether a branch is shared, err toward merging.

On your personal feature branch, rebasing is safe and desirable. The branch has not been merged. Nobody is depending on its history being stable. Rebase as often as you want to keep it current with main.
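A quick way to convince yourself that a rebase really does rewrite history is to record a commit's hash before and after. The sketch below does that in a throwaway repository (hypothetical names, local main in place of origin/main) and uses git patch-id to show that the change itself is identical even though the commit is new.

```shell
set -eu
# Throwaway identity so commits work without any global Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git commit -q --allow-empty -m "chore: initial commit"

git checkout -q -b feat/demo
echo feature > feature.txt
git add feature.txt
git commit -q -m "feat: add feature"

git checkout -q main
echo other > other.txt
git add other.txt
git commit -q -m "docs: unrelated change"

git checkout -q feat/demo
before="$(git rev-parse HEAD)"
git rebase -q main
after="$(git rev-parse HEAD)"

# The patch-id is a hash of the diff only, so it survives the rebase.
patch_before="$(git show "$before" | git patch-id --stable | cut -d' ' -f1)"
patch_after="$(git show "$after" | git patch-id --stable | cut -d' ' -f1)"

echo "commit before rebase: $before"
echo "commit after rebase:  $after"
echo "patch-id before:      $patch_before"
echo "patch-id after:       $patch_after"
```

Anyone who based work on the old hash is now pointing at a commit your branch no longer contains, which is exactly why the rule applies to shared branches.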

Squashing: Cleaning Up Before the Pull Request

While you were working on your feature branch, you probably made commits like these:

* a1b2c3d add initial datastore cluster support
* 4e5f6a7 fix typo in variable name
* 8b9c0d1 wip: add tests
* 2e3f4a5 tests passing now
* 6b7c8d9 address review feedback
* 0e1f2a3 address more review feedback

Those commits document how you got to the solution. They are not useful to anyone reading the history six months from now trying to understand what the feature did. The intermediate steps, the typo fixes, the "wip" commits, should not appear in main.

Squashing condenses multiple commits into a single, coherent one before the branch is merged. The tool for this is interactive rebase.

git rebase --interactive origin/main

This opens your editor with a list of your commits since main:

pick a1b2c3d add initial datastore cluster support
pick 4e5f6a7 fix typo in variable name
pick 8b9c0d1 wip: add tests
pick 2e3f4a5 tests passing now
pick 6b7c8d9 address review feedback
pick 0e1f2a3 address more review feedback

Each line starts with a command. pick means keep the commit as-is. To squash commits into the one above them, change pick to squash (or s):

pick a1b2c3d add initial datastore cluster support
squash 4e5f6a7 fix typo in variable name
squash 8b9c0d1 wip: add tests
squash 2e3f4a5 tests passing now
squash 6b7c8d9 address review feedback
squash 0e1f2a3 address more review feedback

After you save and close the editor, Git replays the commits, combining the squashed ones into the first pick. It then opens your editor again with all of the commit messages concatenated, so you can write a single clean message for the combined result.

That is where the Conventional Commits format comes in. Write one message that accurately describes the complete change:

feat(datastore): add support for datastore clusters

Builders and post-processors now accept a `datastore_cluster` key as
an alternative to specifying a single `datastore`. When a cluster is
specified, vSphere Storage DRS selects the target datastore within
the cluster based on the active placement policy.

Closes: #574

The intermediate work vanishes from the history. The result is one commit that does one thing and says clearly what it does.
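If you want to rehearse the squash before doing it on a real branch, the sketch below scripts the interactive rebase in a throwaway repository. It is a stand-in for editing the todo list by hand: GIT_SEQUENCE_EDITOR rewrites the list with sed (GNU sed is assumed for the -i flag), GIT_EDITOR accepts the concatenated message, and the final message is set with an amend. The branch, file, and commit names are hypothetical, and a local main plays the role of origin/main.

```shell
set -eu
# Throwaway identity so commits work without any global Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git commit -q --allow-empty -m "chore: initial commit"

git checkout -q -b feat/demo
for msg in "add initial support" "fix typo" "wip: add tests"; do
  echo "$msg" >> feature.txt
  git add feature.txt
  git commit -q -m "$msg"
done

# Keep the first "pick" and turn every later one into "squash".
export GIT_SEQUENCE_EDITOR="sed -i -e '2,\$ s/^pick/squash/'"
# Accept the concatenated commit message without opening an editor.
export GIT_EDITOR=true
git rebase -q --interactive main

# Replace the concatenated message with one clean message.
git commit -q --amend -m "feat(datastore): add support for datastore clusters"

count="$(git rev-list --count main..feat/demo)"
subject="$(git log -1 --format=%s)"
echo "commits on the branch: $count"
echo "subject: $subject"
```

Three working commits collapse into one, and the file content is the same as it was before the squash.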

Interactive Rebase Commands

The interactive rebase editor supports several commands beyond pick and squash. Here are the ones worth knowing:

Command   Short   Action
pick      p       Keep the commit as-is
reword    r       Keep the commit but edit its message
edit      e       Stop and let you amend the commit
squash    s       Meld into the previous commit, combining messages
fixup     f       Meld into the previous commit, discarding this message
drop      d       Remove the commit entirely

fixup is particularly useful. When you have a commit that corrects something in the commit above it, fixup folds it in without prompting you to edit the combined message. The correction disappears into the commit it belongs with, and you never have to think about its message because it gets discarded.

Handling Non-Fast-Forward Errors

If you had already pushed the branch before rebasing, the next regular push is rejected:

 ! [rejected]        feat/add-datastore-cluster-support -> feat/add-datastore-cluster-support (non-fast-forward)
error: failed to push some refs to 'origin'
hint: Updates were rejected because the tip of your current branch is behind its remote counterpart.

This happens because rebasing changed the commit SHAs on your branch. The remote still has the old history, and a regular push will not overwrite it. The solution is a force push with the lease flag:

git push --force-with-lease origin feat/add-datastore-cluster-support

--force-with-lease is the safer version of --force. Before overwriting the remote branch, it checks that the branch still points at the commit your local remote-tracking ref recorded at your last fetch. If someone else has pushed to your branch while you were rebasing (unlikely on a personal feature branch, but possible), the push fails with an error rather than silently overwriting their work. Use --force-with-lease over --force every time.
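The three outcomes can be reproduced locally. The following sketch is an illustration under stated assumptions: a bare repository in a temporary directory plays the remote, two clones named alice and bob (hypothetical) play you and a teammate, and an amend stands in for the rebase since both rewrite the pushed commit.

```shell
set -eu
# Throwaway identity so commits work without any global Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

work="$(mktemp -d)"
git init -q "$work/alice"
cd "$work/alice"
git symbolic-ref HEAD refs/heads/main
echo base > base.txt
git add base.txt
git commit -q -m "chore: initial commit"
git clone -q --bare . "$work/remote.git"
git remote add origin "$work/remote.git"

# Push a first version of the feature branch.
git checkout -q -b feat/demo
echo one > feature.txt
git add feature.txt
git commit -q -m "wip"
git push -q origin feat/demo

# Rewrite the pushed commit, then try a regular push.
git commit -q --amend -m "feat: add feature"
if git push -q origin feat/demo 2>/dev/null; then
  plain=accepted
else
  plain=rejected
fi

# The lease holds: nobody else has touched the remote branch.
if git push -q --force-with-lease origin feat/demo 2>/dev/null; then
  lease_alone=accepted
else
  lease_alone=rejected
fi

# A teammate pushes to the same branch.
git clone -q "$work/remote.git" "$work/bob"
git -C "$work/bob" checkout -q feat/demo
echo two >> "$work/bob/feature.txt"
git -C "$work/bob" commit -q -am "fix: teammate change"
git -C "$work/bob" push -q origin feat/demo

# Rewrite again without fetching: the lease is now stale.
git commit -q --amend -m "feat: add feature (reworded)"
if git push -q --force-with-lease origin feat/demo 2>/dev/null; then
  lease_stale=accepted
else
  lease_stale=rejected
fi

echo "regular push after rewrite:        $plain"
echo "force-with-lease, branch untouched: $lease_alone"
echo "force-with-lease, teammate pushed:  $lease_stale"
```

The last push is refused, so the teammate's commit is still on the remote. A plain --force at that point would have discarded it.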

Commit Signing

Commit signing is the final piece of branch hygiene worth covering. A signed commit cryptographically verifies that it was made by the identity attached to it. A commit attributed to a given email address might have been made by anyone who set git config user.email to that address; a signed commit from that address proves the author controls the corresponding key.

For a full walkthrough of GPG and SSH key setup, verified badges on GitHub and GitLab, no-reply email configuration, DCO sign-off, and git hooks for automatic signing, see Signing Your Git Commits: From Zero to Verified.

If you need to sign commits that you already made without a signature, rebase with --exec to amend each one retroactively:

git rebase --interactive --exec "git commit --amend --no-edit --gpg-sign" origin/main

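This replays every commit in the range through git commit --amend, adding a signature to each one without changing its content or message. Because the commits are rewritten, their SHAs change, so a branch that was already pushed needs a --force-with-lease push afterward.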
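Before rewriting anything, it can help to check which commits on the branch actually lack a signature. A minimal sketch, assuming a throwaway repository with hypothetical commits and a local main in place of origin/main, uses the %G? placeholder of git log, which prints N for a commit with no signature and G for a good one:

```shell
set -eu
# Throwaway identity so commits work without any global Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git commit -q --no-gpg-sign --allow-empty -m "chore: initial commit"

# Two deliberately unsigned commits on a feature branch.
git checkout -q -b feat/demo
for n in 1 2; do
  echo "change $n" >> feature.txt
  git add feature.txt
  git commit -q --no-gpg-sign -m "feat: change $n"
done

# One line per commit: short hash, signature status, subject.
git log --format='%h %G? %s' main..feat/demo

# Count the commits whose status is N (no signature).
unsigned="$(git log --format='%G?' main..feat/demo | grep -c '^N$' || true)"
echo "unsigned commits: $unsigned"
```

If the count is zero on your real branch, the retroactive signing rebase is unnecessary.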

With the branch rebased, squashed, signed, and pushed, the pull request is ready for review. The maintainer's choice of merge strategy determines what lands in main. A full discussion of the three GitHub merge strategies, why squash and merge is the right default, and how to configure GitHub to enforce it is in Squash and Merge: A Better Default.

A Complete Workflow

Putting all of this together, here is the workflow from branch creation through merge. It reflects the contributor flow described in the contributing guidelines post, with the rebase and squash steps made explicit.

Start from a current main:

git checkout main
git fetch origin
git merge --ff-only origin/main
git checkout -b feat/add-datastore-cluster-support

Work and commit freely on the branch:

# ... make changes ...
git add .
git commit --message "wip: initial datastore cluster support"
# ... more changes ...
git add .
git commit --message "fix: correct variable name"

Keep the branch current as main moves forward:

git fetch origin
git rebase origin/main

Resolve any conflicts that arise, stage the resolved files, then continue:

git add <resolved-files>
git rebase --continue
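The conflict step is easier to trust once you have walked through it somewhere harmless. This sketch forces a conflict in a throwaway repository and resolves it; the file name, the values, and the choice to keep the feature branch's value are all arbitrary assumptions for the demonstration, and a local main stands in for origin/main.

```shell
set -eu
# Throwaway identity so commits work without any global Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
echo "timeout = 10" > config.txt
git add config.txt
git commit -q -m "chore: initial commit"

# The feature branch and main both change the same line.
git checkout -q -b feat/demo
echo "timeout = 30" > config.txt
git commit -q -am "feat: raise timeout"
git checkout -q main
echo "timeout = 20" > config.txt
git commit -q -am "fix: adjust timeout"

# The rebase stops on the conflicting commit.
git checkout -q feat/demo
if git rebase -q main >/dev/null 2>&1; then
  status=clean
else
  status=conflict
fi
echo "rebase result: $status"

# Resolve by writing the intended content, stage it, and continue.
echo "timeout = 30" > config.txt
git add config.txt
GIT_EDITOR=true git rebase --continue >/dev/null 2>&1

ahead="$(git rev-list --count main..feat/demo)"
echo "commits on top of main: $ahead"
cat config.txt
```

If a resolution goes wrong partway through, git rebase --abort returns the branch to the state it was in before the rebase started.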

Before opening the pull request, squash and clean up:

git fetch origin
git rebase --interactive origin/main

Use the interactive rebase editor to squash work-in-progress commits, drop anything that was reverted, reword any commit whose message no longer accurately describes the change, and write a final message that conforms to Conventional Commits.

Push the cleaned branch:

git push --force-with-lease origin feat/add-datastore-cluster-support

Open a pull request with a well-formed title. The title becomes the subject of the squash commit, so follow the Conventional Commits format: feat(datastore): add support for datastore clusters. Depending on the repository's settings, GitHub uses it as the default commit subject when you (or a maintainer) squash and merge.

Update after review feedback:

When a reviewer asks for changes, make them on the branch and amend or fixup the relevant commit rather than adding a new "address review feedback" commit:

git add .
git commit --fixup <sha-of-commit-to-fix>
git rebase --interactive --autosquash origin/main
git push --force-with-lease origin feat/add-datastore-cluster-support

The --autosquash flag moves each fixup! commit directly below the commit it targets and marks it as fixup in the interactive editor, so you just save and close without rearranging anything manually.
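Here is a sketch of the fixup flow in a throwaway repository. Setting GIT_SEQUENCE_EDITOR to true stands in for saving and closing the editor unchanged; the file names and commit messages are hypothetical, and a local main plays the role of origin/main.

```shell
set -eu
# Throwaway identity so commits work without any global Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git commit -q --allow-empty -m "chore: initial commit"

# Two real commits on the feature branch.
git checkout -q -b feat/demo
echo "cluster support" > cluster.txt
git add cluster.txt
git commit -q -m "feat(datastore): add cluster support"
target="$(git rev-parse HEAD)"
echo "cluster tests" > tests.txt
git add tests.txt
git commit -q -m "test(datastore): add cluster tests"

# Review feedback touches the first commit, not the latest one.
echo "review fix" >> cluster.txt
git add cluster.txt
git commit -q --fixup "$target"

# Accept the todo list exactly as --autosquash arranged it.
GIT_SEQUENCE_EDITOR=true git rebase -q --interactive --autosquash main

count="$(git rev-list --count main..feat/demo)"
leftover="$(git log --format=%s main..feat/demo | grep -c '^fixup!' || true)"
echo "commits on the branch: $count"
echo "fixup! commits left:   $leftover"
git log --oneline main..feat/demo
```

The branch ends with the same two commits it started with, and the review fix is folded into the first one.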

Merge using squash and merge. The maintainer squash merges the pull request. One clean commit lands on main with the PR title as the subject. The working history of the branch is gone from main, and the PR itself remains accessible by number if anyone needs to trace back through it. For the full case for squash and merge and how to configure GitHub to enforce it, see Squash and Merge: A Better Default.

Why This Matters

The payoff for this discipline is not immediately visible. The pull request gets merged, the branch is deleted, and the feature is deployed. The history is not something you look at every day.

But when something breaks, the history is exactly where you look. A clean, linear history lets git bisect find the commit that introduced a regression in minutes. A history full of merge commits, "wip" messages, and fixups makes the same search take hours. And git revert can cleanly undo one pull request's worth of work instead of requiring you to figure out which scattered commits belonged to a given change.
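As a rough illustration of the bisect case, the sketch below builds a short linear history in a throwaway repository, plants a regression in one commit, and lets git bisect run find it. The status file and its contents are a made-up stand-in for a real test suite.

```shell
set -eu
# Throwaway identity so commits work without any global Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
echo ok > status.txt
git add status.txt
git commit -q -m "chore: initial commit"
good="$(git rev-parse HEAD)"

for i in 1 2 3 4 5 6; do
  echo "change $i" >> changes.txt
  # Commit 4 introduces the (hypothetical) regression.
  if [ "$i" = 4 ]; then echo fail > status.txt; fi
  git add -A
  git commit -q -m "feat: change $i"
done

# HEAD is known bad, the initial commit is known good.
git bisect start HEAD "$good" >/dev/null
# The check exits 0 on a good commit and 1 on a bad one.
git bisect run grep -qx ok status.txt >/dev/null
culprit="$(git log -1 --format=%s refs/bisect/bad)"
git bisect reset >/dev/null 2>&1

echo "first bad commit: $culprit"
```

Because every commit on the linear history is a complete, self-describing change, the commit that bisect lands on is also the unit you would revert.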

When someone asks why a particular design decision was made, or why a specific constraint exists, the commit body is where you find the answer. The approach Marc Gasch describes in his article, and what this post has tried to expand on, is about treating the history as a first-class artifact of the project: something that should be as readable and useful as the code itself.

The habit is not difficult to build. A few extra commands before you open a pull request. The return compounds every time someone reads git log and finds what they were looking for without having to decode the noise.

References