An SCM such as Git is more than just a database for source code. It's not only the thing you need to interact with to get code to production, but also a log of changes on a project.
It's not just the last couple of weeks of commits that are worth looking at. Any commit remains relevant weeks, months and years later.
A commit serves multiple purposes. The first one is to explain a change during its review and the second is to explain a change to a future reader. It all revolves around communication.
I gave a talk based on this post at #NoRuKo 2020. Scroll to the bottom to check out the video and slides.
Git is about talking to your team. It's about talking to yourself, from the future. What about that colleague who hasn't even joined the team yet? Yes, them too.
Git is a great resource to document why a change was made and the considerations that were part of the process of creating the commit. It's the documentation of how the project evolved.
An example: Using git blame in Practice
Do you use git blame? Is it useful or is it a bunch of gibberish commits, including classics like: "WIP", "Fix bug", "update tests", "Move method", or my favorite: "..." (Yes, it's just three dots)?
After some digging, you've found the change that introduced a bug, but you are no further in finding why the change was made in commit "WIP". You ask the person who authored it for more information, but they may not remember why the change from a year ago was necessary or they may no longer work at the company. So you're going to have to assume things about why a change was made and work off of that. That should not be necessary.
An example: Pull Request reviews
When you or your team are reviewing a Pull Request, the same thing is happening, but you can ask why a change was made while it's still fresh in their memory. But you first need to ask. To make a good review, the reviewer also needs to know if and which alternative solutions (if any) were considered and why. We need to make better commits to make that possible.
Communication in commits
Why "The why" is important
A commit with only a subject says very little. At best it tells you which change was made, which may be useful to tell what code was changed, but the reason why the change was made is missing.
Explaining why a change was made is a good moment to reflect on the process that got you there. Was the change made because it was the most logical choice, or were there alternative approaches, and why were these not chosen?
If anyone else ever looks back on the commit, they can see if the restrictions a change was written under still apply or not, or how a certain bug slipped in under the assumptions a change was written.
I use writing commit messages to rubber duck problems. When I'm stuck on a change, rather than grab my rubber ducky, I explain it in writing. This way, I also immediately have my commit message for my team.
Explaining the "why"
To explain the "why" of why a commit was made, the commit message can explain the bug that's occurring that prompted a fix, along with its solution. This way, we have the reason for the commit as well as how the issue was fixed.
For example, there is a method that needs to be called with an Integer value. In the production app, an invalid argument error is reported on the line the method is called. Upon closer inspection, it is found that the method is sometimes called with a String value.
The quickest commit to write would be something along the lines of "Fix method call", but this commit without a message doesn't explain the actual problem.
A commit should explain the situation that causes the error—some user input isn't cast to an Integer. And how it was fixed—the controller action "update" will now cast the user input to an Integer.
# Subject Fix Ticket#set_priority being called with Strings # The scenario When Ticket#set_priority was called from the TicketsController the priority value was sometimes a String type. This caused an `InvalidArgumentError`: InvalidArgumentError: Invalid argument type given `String`, expected `Integer` 1: from /app/lib/models/ticket.ext:23 in `set_priority` 2: from /app/lib/controllers/tickets_controller.ext:45 in `update` # How The `user_params` method will now cast all `priority` params to an Integer before sending it to the Ticket model. # Alternatives <Colleague A> and I chose to cast it to an Integer in the `user_params` method so that all other actions in the controller can also be able to use this logic. This way, not all the methods need to implement their own casting, making for a more fragile setup, something that's easy to miss when a new action is added. This allows us to clean up the `new` and `create` methods as well. We didn't want to move the casting to the `set_priority` method because this controller is the only place it can ever be called with a String. # Additional sources Fixes #1234
Communication in Pull Requests
When a Pull Request on GitHub is created with only one commit, it prefills the Pull Request title and description with the commit subject and message. This is not by accident. This way, the important information of the Pull Request is right there at the top. That's not to say that a good Pull Request only has one commit, oh no, but it does help a lot with creating smaller Pull Requests.
If a Pull Request has multiple commits, I recommend you to use the same format. Repeating all the commits in the Pull Request's description as separate headings.
I use the important commit's subject as the Pull Request title or write a brand new one if it helps identify all the combined changes.
There's very little additional context a Pull Request needs after a good title and description. Commits should contain the important bits of a Pull Request. If more context needs to be added to the Pull Request body, it should have been part of one of the commits.
This also prevents a lot of back and forth between author and reviewers to explain the changes in a Pull Request, speeding up the review time.
Keep Pull Requests small
If a set of changes start to become large and difficult to review, send in smaller chunks as Pull Requests early to get feedback on the changes. This way you know you're on the right track.
If you find yourself refactoring code while working on a change, send in a separate Pull Request for the refactor if possible. This way, the Pull Requests are easier to review because their size is smaller and focus only on one type of change. Your team can get started reviewing the refactor while you work on finishing the other changes.
Nobody truly likes to review Pull Requests with 20 commits or 20_000 line changes. It's difficult to find the important changes between all the noise.
Also, make sure every Pull Request can be merged independently, so it doesn't have to stay open until the entire change is done.
Present a tidy history
A reviewer doesn't need to experience the exact same process you went through to get to the final changes. If you first went for approach A and halfway through, switched to approach B, do not include both A and B, but only the final result. But do document the alternatives and reasons why approach B was chosen in the commit message.
Rebase your commits to present a tidy history with just the necessary changes. Nobody needs to see when you merged the upstream develop branch back into your branch. Or when you signed off for the day and committed all your changes into "WIP" to 'back it up' on a remote server. This is noise that hides changes in undescriptive commits, and why they were made.
# History as it was logged during development * WIP * Add button to edit user profile * Merge remote-tracking branch 'origin/develop' * Refactor button component * Fix tests
# Tidy history ready for review * Refactor button component * Add button to edit user profile
Any additional commits as a result of reviews are of course fine, as they communicate the changes that were made as part of the review. Allowing reviewers to see what changes were made based on their feedback. (So no "Processed feedback" or "Update path/to/file.ext" commits.)
Git is about communicating with your team. Document why you made a change in your git commits. Your team, your future self, and your future colleague will thank you.
Make sure all the context needed is present in commits and Pull Requests in terms of why a change was made, how and what alternatives there were.
Improve Pull Request review turn around time by keeping them small. And make them better by understanding why a change was necessary and review it with that thought in mind.
Talk at #NoRuKo 2020
I gave a talk a based on this post at #NoRuKo 2020. Check out the video and slides. In this talk I go a little bit more in detail about the ways communication can go wrong commits, git blames, and Pull Requests, and how we can improve that.