Git log is not a changelog
agateau.com
The purpose of a git commit message is to answer the question “why does this commit exist?” That is the principal question you should be answering when you type `git commit`. This is the question you will be asking yourself when you find that commit in `git blame` or if it shows up in `git bisect`. Try to help your future self out.
The changelog, on the other hand, answers the question “why does the customer care about these changes?” It could be the same reason as why the commit exists, but the question is different. If the customer doesn’t care, maybe it doesn’t even need an entry. Maybe why they care is slightly different than the reason the commit(s) exist.
This is why I advocate for a hand-updated CHANGELOG.md. It’s a very small amount of writing that forces developers to consider how their PRs will impact customers, which is something an auto-generator will never be able to do well.
We autogenerate our CHANGELOG.md from Changelog-[section] lines extracted from our git commit messages (the Release Captain massages them if necessary), and have a GH bot which checks that you have a Changelog- note somewhere in your PR commits (it can be Changelog-None: ....).
This avoids merge hell on the CHANGELOG.md file.
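A minimal sketch of that extraction step, assuming each note is a single line of the form `Changelog-<Section>: text` and that `Changelog-None` marks a commit with nothing to report. The exact line format and the function names are assumptions for illustration, not the commenter's actual tooling.

```python
import re
from collections import defaultdict

# Assumed format: one note per line, e.g. "Changelog-Fixed: crash on empty input".
NOTE_RE = re.compile(r"^Changelog-([A-Za-z]+):[ \t]*(.*)$", re.MULTILINE)


def extract_notes(messages):
    """Group changelog notes from commit messages by section, skipping 'None'."""
    sections = defaultdict(list)
    for message in messages:
        for section, text in NOTE_RE.findall(message):
            if section.lower() != "none":
                sections[section].append(text.strip())
    return dict(sections)


def has_note(messages):
    """The check a PR bot could run: at least one Changelog- line is present."""
    return any(NOTE_RE.search(message) for message in messages)


def render(sections):
    """Render the grouped notes as Markdown, one heading per section."""
    lines = []
    for section in sorted(sections):
        lines.append(f"### {section}")
        lines.extend(f"- {text}" for text in sections[section])
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"
```

Because the notes live in commit messages rather than in a shared file, two PRs never edit the same lines, which is where the merge conflicts would otherwise come from.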
Setting `CHANGELOG.md merge=union` in .gitattributes mostly eliminates spurious merge conflicts on the changelog.
Oh wow, thanks for the tip! Had no idea this existed.
Me too! Awesome, thanks!
Doesn't make sense to automatically do this, some changes can be mutually exclusive
> This avoid merge hell on the CHANGELOG.md file.
Right, if you don't do something like this you can be in merge hell very quickly...
I do hand-updated CHANGELOG.md files, for all my projects.
Sometimes, my Git commits can be kind of verbose (and I'll occasionally put in smartass comments).
I wouldn't dream of mining my Git logs for CHANGELOG entries.
Actually, I hand-run many things that a lot of developers automate.
Part of it is because I have control issues...
A couple jobs ago, I was at a company that had a "mandatory" squash / rebase / merge workflow so history would be clean. On top of that, they forced all their developers to update a change log as part of the merge. That file was a source of contention / merge conflict for nearly every PR, often requiring additional rounds of rebasing. On top of that, it was full of information that could've been gathered from the git log. Big waste of time, in my opinion.
We had something kind of similar, but designed to avoid merge contention. We had each PR include a randomly-generated number, and a `changelogs/` folder where you could add a `${number}.md` that was either blank or had a message that would be added to the changelog. After you made the PR, you could run a bash script to edit the PR to contain the number and generate the `${number}.md` file.
It felt kind of silly and I don't know if anyone actually looked at the changelogs, but it took 2 minutes out of my day and worked well.
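For what it's worth, the release-time side of that scheme can be very small. A rough sketch, assuming the fragments live in a `changelogs/` folder as `<number>.md` files and that a blank file means "no entry"; the function name and the bullet-list output format are made up for illustration.

```python
from pathlib import Path


def assemble_changelog(fragment_dir):
    """Collect non-blank `<number>.md` fragments into a bullet list."""
    entries = []
    # Sort by file name so the output is stable; the numbers are random,
    # so this order carries no meaning.
    for path in sorted(Path(fragment_dir).glob("*.md")):
        # Collapse whitespace so each fragment becomes a single bullet line.
        text = " ".join(path.read_text(encoding="utf-8").split())
        if text:  # a blank fragment means "nothing to announce"
            entries.append(f"- {text}")
    return "\n".join(entries)
```

Since every PR adds its own file, no two PRs touch the same lines, so there is nothing to conflict on.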
> The purpose of a git commit message is to answer the question “why does this commit exist?”
Why? Every time I have asked this people say because you'll search the logs (which I have never done in my life) or because "it's good practice"
If you have never needed to find out why a particular piece of code is the way it is, then you have been very lucky to work in very clean code bases. In my own experience, this has been an infrequent but inevitable part of work - maybe once a month or so, I have had to understand whether a particular piece of code, that seems wrong, had a good reason for existing or not.
Sometimes it turned out to be a mistake in the original commit, or working around a limitation that no longer exists, other times it has saved me from re-introducing a bug that someone had spent effort fixing.
ArrayBoundCheck didn't say they never had to find out why a particular piece of code is the way it is, just that they didn't have to search through log messages.
For instance, when you use "git blame" and similar tools, the log messages are not involved. You might end up reading the log message of the commit that was responsible for a change, but you didn't search log messages to get there.
In a project with poor commit messages, they will be of little use; reading them won't produce much value, let alone searching.
Anyway, that seems like the best possible interpretation of the user's comment.
tsimionescu didn't say anything about searching the logs.
The commit message explains (/should explain) why the commit was made. That is useful even if you arrive at the commit via git blame, bisect, etc.
I think you just haven't discovered you can? I don't know of anyone arbitrarily looking through logs, but it's incredibly useful with git blame when you get to a section of code and don't understand it or typically it's done in an odd or unintuitive way. The blame shows who wrote it, information as to what they were working on via the actual commit message, and branch information.
If the person still works with you, you can just ask them about it.
If they don't or you don't want to bother them, the commit message tells you what the change was and often times why to give you a better context.
All the teams I've worked on included information like feature, bug, defect, and the associated ticket number in the branch name, so you have the information at hand to go look at the ticket directly and see what requirements were needed.
I'm starting to understand why. It appears my workflow has the information other people would want from a git log elsewhere (test, specs, examples, etc)
Let's say you have some code written 2 years ago. 2 months ago someone made a bug fix in the code replacing a few of the lines. How would you know from test, specs or examples which specific lines were modified and by whom? I don't get it.
Why would I want to do that? Usually if there's a problem I'll either write a new test or see if someone modified a test I thought was covering the case
Usually people 'own' a file or part of the system so that wouldn't really be happening anyway
> Why would I want to do that?
Because sometimes we can learn from history. If a mistake was made at some point it can be good to understand why. Of course you can adopt the mindset that you don't care what ended up causing a bug, but learning from mistakes is a good thing.
> see if someone modified a test I thought was covering the case
How do you see if someone modified the test then? I feel like we are maybe misunderstanding each other because to me, and seemingly most other commenters here, this is such an obvious no-brainer that it seems something is lost in the communication.
> Because sometimes we can learn from history. If a mistake was made at ...
A commit message helping me learn anything that I didn't get from a code comment sounds like a stretch. I'm doubting this
> How do you see if someone modified the test then? I feel like we are maybe misunderstanding eachother
Clearly. I was suggesting I might see commit messages if I'm using git blame to find out if a case was removed from a set of test or if it never existed in the first place. But I don't see how messages would help at all in anything I do. What I'm looking for is far too specific to be in a commit message and this whole thread about using commit messages to learn sounds nonsensical.
As I look at the top ten committers to our key service, two of us are still here, the other eight (including the lead from day one) switched teams or left the company over the last five years. Unfortunately I don’t think 27% turnover per year is unusually high in tech, so ownership isn’t a good replacement for written records.
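The 27% figure checks out if you assume a constant yearly departure rate that compounds (the comment doesn't spell out how it was derived, so that is an assumption): 2 of 10 remaining after 5 years means a yearly retention of (2/10)^(1/5).

```python
def annual_turnover(remaining, original, years):
    """Constant yearly turnover rate implied by `remaining` of `original` staying."""
    retention_per_year = (remaining / original) ** (1 / years)
    return 1 - retention_per_year


rate = annual_turnover(remaining=2, original=10, years=5)
print(f"{rate:.1%}")  # roughly 27.5%
```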
That's right; for log messages to be useful, they have to be very disciplined. They have to stick to a certain format and then if the information you're looking for is of the kind which is provided by that schema, then they are useful.
As an example, say log messages are strictly required to contain a bug database ticket number (even if they are not fixes: bug database tracks tasks too) then that is useful; you can quickly search the git log for a bug number to find all of its commits.
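A small sketch of that kind of lookup, assuming commits are available as `(hash, message)` pairs and tickets are written as `#1234`; both are assumptions for illustration. On a real repository, `git log --grep` does the searching for you.

```python
import re


def commits_for_ticket(commits, ticket_number):
    """Return the hashes of commits whose message mentions `#<ticket_number>`."""
    # The lookahead stops #123 from matching #1234.
    pattern = re.compile(rf"#{ticket_number}(?!\d)")
    return [sha for sha, message in commits if pattern.search(message)]
```

This only works because the format is enforced: a commit that refers to the ticket as "bug 1234" or not at all simply won't be found.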
Doesn't mean others are in the same boat as you. The team I work on frequently writes detailed commits, to the point where I can search (and several times have successfully searched) for information from years ago on a block of code and find the ticket number and the developer's (even my own) reasoning at the time.
Maybe it depends on the project but I've found myself doing this so often that I won't stop, it's such a small task that has given me so much benefit. If it doesn't help anyone then nothing was lost.
Why are you searching a year or more back? Is this using git blame?
To explain why a certain block of code exists. When the code came into existence is rarely that important. You just want to know why.
Why does the code fence against a particular circumstance you didn't think should be possible? Why does it call out to something you think is unrelated? Those questions can be answered by a proper commit message.
Sometimes, in some kinds of projects, the commit messages look like this:
JIRA: #1234 Adjusted the FOOBAR parameter from 42 to 73.
To know the "why", you have to read the ticket; you will not find anything in the git log.
Oh jeez, please do not make me run “git log” and then open a hundred tabs in an old bug tracker that may or may not still exist to figure out when a problem may have been introduced. I want code reviewers to insist on at least somewhat useful messages for us to skim at 3 AM.
I set a breakpoint and run the test suite for that answer
That answers what the code does, but not why it does what it does.
So you never need to understand when or why a bug was introduced? Or you never need to understand why the current behavior is as it is?
Why don’t you put that as a comment on the code?
Comments are for why code is written the way it is, commits are for why the code is written in the first place.
You _could_ tag every line of code with
// JIRA-123 The PM wants this to be blue
but if you did it would become unreadable and wouldn't be kept up to date.
Because if 37 changes are done to the same 15 line function over time, the amount of comment material will dwarf the function. And most of it will pertain to historic versions of the function which are not what actually appears below the comment; a comment made 13 revisions ago makes sense for the 13-revision-old version of the function.
You just update the comment?
I work on several legacy codebases which were:
- written by people who were not me
- written by people who have since left the company
- the original documentation has rotted away through wiki replacements, issue tracker replacements, or being lost via people turnover or system replacement.
The result in many cases is the code is the _only_ documentation of the system behaviour, so seeing what was introduced together is important context to understanding why it is the way it is. I probably run git blame more than git commit at this point and there's a real QoL difference between the good commit messages and the "changes for ticket12345" commit messages.
Never looking at git history is like, not reading comments or something. It's an incredibly valuable resource for understanding why the existing code is the way it is.
You didn't answer why. My code passes all the unit tests and we almost always have real code using it immediately. The function works. Why am I reading it? The only things I read are bug reports (usually a spec problem, not normally a logic bug), new features, or test outputs.
If someone complains that our system did something weird, sometimes it’s a mistake we can just fix, but sometimes it’s a non-obvious consequence of some requirement (which I might not have been aware of) that we have to explain to our consumer. It also helps a lot to tell whether it has been like that for a week or five years.
I can’t even imagine having a spec that fully answers anything like this; Microsoft tried for that level of detail but I found they couldn’t keep it up to date.
Because you'd want to know why code does what it does.
If a bug pops up, you'd want to know why some code exists and if the original premise of the code is as it should be or needs revision.
I think it's very rare that a developer receives all that information up front in a team. Eg. Old codebase, lost knowledge or people leaving.
I think commits should contain atomic-yet-meaningful changes and the commit message should describe this as well as possible.
It's worth rewriting the history to achieve this and squashing or splitting commits until this is the case. You shouldn't do this for the benefit of your users or a changelog, you should do this in order that it is easier to bisect the history or for other contributors to understand exactly the change a commit relates to. There is nothing worse than commits which combine a working bug fix with a half-written feature -- split them out!
Obviously, it's possible to inadvertently create a misleading history by re-arranging the order that work was done or getting rid of failed attempts at a solution, but generally the false reality is easier to understand and good understanding is key.
Yeah, we don’t do much but I started writing up a brief description with a link to the PR, hotfix, or commit, so we can easily find links to relevant changes if we need to. It’s not that difficult to write it up manually. Automating it is too prone to either errors or a less than helpful message.
Yes the Git log is the ChangeLog.
For instance, I retired the ChangeLog in the TXR project in 2015; commit messages continue to be in the ChangeLog format. A ChangeLog file could easily be produced from the commit messages.
Replicating that information in a file that is checked into git is silly; you're just begging for merge conflicts. Any time anyone sends you a patch, if it is not rebased to your current HEAD, you have a guaranteed conflict in the ChangeLog file.
Why do that to yourself?
> The changelog targets your users. It must answer questions like:
> "What cool new feature is in this version?"
> "Is this annoying bug fixed?"
> "Is it safe to upgrade, or do I need to adjust my code/workflow to this new version?"
Oh, I see what this person's problem is. He's referring to some lower case "changelog" that everyone calls "release notes".
I agree; the git log is not your release notes.
Please don't call "release notes" "changelog" in 2022.
Release notes aren't a change log because, doh, they don't (exactly) log the changes.
(Author here) Yes, "release notes" is probably the appropriate term. The issue remains however: many projects generate their release notes from their git commit messages.
Regarding the conflicts issue, this is where tools like Changie can help: instead of modifying one single file, every merge adds its own separated entry. They are then assembled together at release time with `changie batch $VERSION` to produce a single file for $VERSION, then merged into the global changelog/release notes file with `changie merge`.
This is something you could do with a commit hook which verifies that commit messages have a well-formed release notes entry section, and some seven line Awk script to combine them together when needed.
If "every merge adds its own separated entry" and that doesn't refer to a special section of the commit message, you're doing something silly.
I find that many programmers are hyper-focused on writing automation for things that feel like a chore. This thinking has its purpose, but it’s almost an addiction.
For these folks, putting deliberate effort into change logs, release notes, and documentation feels wasteful.
My hunch is that this is due to a missing feedback loop: we are unlikely to get feedback about documentation, and more likely to get feedback on our project’s code.
My own writing improved after a past project had a strong feedback loop with my documentation’s intended audience. This has been so damn rare in my career, that it’s never surprising when I meet programmers who are uncomfortable with technical writing.
I think the reason programmers want to automate documentation is:
- they are programmers, automating things is their job, that's what they are good at, so of course they will do that
- there is the general "don't repeat yourself" idea. Documentation repeats the code so, ideally, if both are needed one should be generated from the other. Sometimes the code can be generated from the documentation, but most of the time it can't, so documentation becomes secondary to the code.
I very much agree with this. Sometimes it's good to stop and think about what is lost when automating a task instead of doing it manually.
Use multiple "-m" parameters in your git commit.
git commit -m "feat: script pretty print" -m "added variables for bold, normal, and a nice blue arrow"
becomes:
feat: script pretty print
added variables for bold, normal, and a nice blue arrow
in your git log output. Use extra "-m" sections for stuff like ticket references, or other relevant information like a link to a design document.
Alternatively just avoid the `-m` flag and open the message in `$EDITOR`
But then I have to figure out how to exit vim.
MVP
That depends what you mean by a changelog.
According to the GNU coding standards, which have been used for decades for a large amount of the core software on a Linux system, what you should put in the changelog looks quite a lot like a good git commit log[0]. And Linux currently uses the git log as the changelog, and IIRC had a similar format in the pre-git era.
> The changelog targets your users. It must answer questions like:
> "What cool new feature is in this version?"
> "Is this annoying bug fixed?"
> "Is it safe to upgrade, or do I need to adjust my code/workflow to this new version?"
To me, that sounds like what a lot of projects would call "release notes" or (per GNU) "NEWS file"[1], not the changelog.
[0] https://www.gnu.org/prep/standards/standards.html#Change-Log...
[1] https://www.gnu.org/prep/standards/standards.html#NEWS-File
I agree, I used to have a NEWS file in my projects (later a NEWS.md), but as others commented, the meaning of the term "changelog" has changed. Sites like https://keepachangelog.com/ really refer to release notes or news.
GNU's notion of a changelog predates the widespread use of source control, and is indeed there to cover what is now covered by git history. It's not what anyone using modern software development approaches means by a changelog.
> It's not what anyone using modern software development approaches means by a changelog.
Well, it's a good job then that this forum is cool with blurring the subtle distinctions between technical terms that have been in use for decades, and is happy to just go with whatever mainstream definition of a word has the most traction...
...on this site for news about breaking into computer systems.
/s
Yes, a commit log isn't a changelog. However, a good commit log can make writing your change log much easier. While this isn't an automatic process, writing a changelog becomes a bit of filtering of the commit messages as well as rephrasing them for the intended audience.
FreeBSD uses
Relnotes: yes, or text for the release notes
in commit messages to note that the commit has significant user-visible effects.
> Git commit messages and changelogs do not have the same target audience. [...] Some people dislike merge commits. [...] If that bothers you, you can always ask contributors to rebase before merging.
I think that squashing and merging, much like commit messages and changelogs, also have different target audiences.
If there are contributors to the project who aren't proficient with git, asking them to rebase would be a huge obstacle for them and create a much worse mess. Squashing simplifies their workflow and allows any mistakes (e.g. checking in the database, accidental merges from the wrong branch, etc.) to be cleaned up and kept out of the permanent history.
Agree with the premise of the article: changelogs—or rather, news files—are not the commit history. Describing the changes between releases at a high level is super important but also a skill that’s hard to acquire.
I thought I had written about this way back as well, but what I found instead is a post from 2005 that’s tangentially related. I remember that some projects back in the day tried to replicate the equivalent of git log in their ChangeLog file… by hand!
Here is the post: https://jmmv.dev/2005/08/manual-changelogs-thing-of-past.htm... — and pardon the English and its structure. Not as nice as I’d like it to be, but that was written 17 years ago!
The missing part of this is filtering git history with "git log --first-parent", but that only works if your repo has good a) merge commit messages and b) hygiene in creating your merges so that your "Pretty" history is on branch 1.
On the other hand, it takes a lot of discipline to make the second branch useful at all. If you constantly make short commits with messages like "stuff", why bother keeping them in the repo at all? There's a ton of junk you don't need in the typical code review, and making it useful requires... Discipline.
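To make the `--first-parent` idea concrete, here is a toy model of the walk: starting from the tip, follow only each commit's first parent, so anything that came in through the second parent of a merge is skipped. The commit names and the dict-based graph are hypothetical; real git does this for you.

```python
def first_parent_history(parents, head):
    """Walk from `head` following only each commit's first parent."""
    history = []
    commit = head
    while commit is not None:
        history.append(commit)
        commit_parents = parents.get(commit, [])
        commit = commit_parents[0] if commit_parents else None
    return history


# Toy graph (hypothetical commit names): M1 and M2 are merge commits on the
# main line; "wip" and "stuff" arrived as second parents of those merges.
parents = {
    "M2": ["M1", "stuff"],
    "stuff": ["M1"],
    "M1": ["root", "wip"],
    "wip": ["root"],
    "root": [],
}

print(first_parent_history(parents, "M2"))  # ['M2', 'M1', 'root']
```

The "pretty" history is only as good as the merge commit messages on that first-parent line, which is the discipline point made above.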
At work we squash and use standard-version to create a generic change log, then have a script that ingests that change log and goes to the PR to grab any associated tickets, screenshots, and release notes the developer may have written. The body of the PR is split into sections (just using markdown headers), so there is control over what goes into the release notes.
A git log is not a change log, sure. But PRs can contain a lot of useful information.
I use the git log to feed my changelog. I prefix the stuff that's supposed to go in the release notes with an asterisk and the technical boring stuff is just a normal line. Then at release time I have a script that pulls the asterisk-prefixed lines from the change log into the RELEASENOTES.md. I wouldn't want to bother with more.
How do you know that your asterisked commit should go into the release notes? What if...
* fixed thing X so that user can do Y
broke things so that another commit is needed:
* fixed X again, so that user can finally do Y (for realz this time)
This would not make a great release note.
you just wouldn't put a * in front of the second one
or you might have an alternate prefix for "things that go in development build changelogs but not final changelogs", e.g. "-"
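A sketch of what such a release-time script might boil down to, assuming `* ` marks release-note lines and `- ` marks development-build-only lines as suggested above. The function name and the input shape (a list of commit message strings) are assumptions, not the original poster's script.

```python
def release_note_lines(commit_messages, prefix="* "):
    """Pull the lines marked with `prefix` out of a list of commit messages."""
    notes = []
    for message in commit_messages:
        for line in message.splitlines():
            if line.startswith(prefix):
                notes.append(line[len(prefix):].strip())
    return notes
```

Calling it with the default prefix gives the final release notes; calling it with `prefix="- "` gives the extra entries for development builds.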
When you squash and merge a PR, GitHub will by default prefix commit messages of the squashed commits with an asterisk.
I completely agree with this. Every single project with « conventional commits » that tries to generate release notes ends up with completely useless release notes. Just bite the bullet and write for humans, it’s not that hard, especially if you do it on the fly as suggested by the post.
If you write dev facing projects, unless your volume is that high, your commit history can absolutely be your change log using conventional commits.
Otherwise your PM can write change logs based on JIRA or something. Your change log generated from commits can still be useful for incident management, etc.
NixOS maintainer here, and this has been really annoying me for months. I can't really know what in a 100+ line git log is a breaking change that needs special attention and what is totally uninteresting for consumers of the piece of software.
use https://www.conventionalcommits.org/en/v1.0.0/, it has extensions that can autogenerate a nice looking changelog.
Yep, something like Git Cliff[1] is great for generating release notes from your commit messages.
And conventional commits are a good thing to do regardless of whether you use them for release notes or not. Commit messages should be helpful and immediately obvious; too often it's "fixed bug" or "finally figured out foo!", which really tells you nothing, so it might as well not have a message.
This is the kind of tool I dislike. It does generate a "nice looking changelog", but the result is not as useful as a separately written changelog, because the content is not curated, so the signal-to-noise ratio can be quite low.
Yeah, at work we combine that with https://semantic-release.gitbook.io/semantic-release/ for changelog generation and continuous deployment to NPM. It takes some getting used to, but I like it.
we use https://github.com/anchore/chronicle to generate release notes in a changelog format using the issues and PRs from GitHub as the source of truth. In this way time well spent in the curation of issues and PRs (which is something we need to do anyway) means that we automatically get release notes for free. (disclaimer: I'm the author of chronicle)
Your changelog is also not release notes.
For me, it's as simple as my audiences are different, hence what I choose to put in them will often be different.
Most software is written for internal use and not sold to external customers. For this case Changelogs are busy work without much utility up until the point your organization is big enough that you and your "customer" may as well be separate companies. In that case you're better off writing a blog post or release announcement than a text file change log which is for grey beards, not users.
instead of a changelog, would be neat to ship a 'spec' file that says what features the codebase provides, as well as detailed semantics
plus maybe a 'fixes' list (because a spec file doesn't need to say 'what used to not work')
then compute diffs of these using git history to produce a changelog
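A very rough sketch of the diffing step, assuming the 'spec' is just a mapping from feature name to a description of its semantics. That shape, and the `Added`/`Removed`/`Changed` labels, are invented here for illustration; the comment doesn't say what the spec file would look like.

```python
def spec_changelog(old_spec, new_spec):
    """Describe the difference between two feature specs as changelog lines."""
    lines = []
    for name in sorted(new_spec.keys() - old_spec.keys()):
        lines.append(f"Added: {name}")
    for name in sorted(old_spec.keys() - new_spec.keys()):
        lines.append(f"Removed: {name}")
    for name in sorted(old_spec.keys() & new_spec.keys()):
        if old_spec[name] != new_spec[name]:
            lines.append(f"Changed: {name}")
    return lines
```

The old and new specs would come from two revisions of the same file in git history; bug fixes would not show up in this diff, which is why the separate 'fixes' list is needed.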
Apparently nobody knows the difference between NEWS and CHANGELOG anymore. Maybe re-read the GNU Coding Standards. https://www.gnu.org/prep/standards/standards.html#NEWS-File
git log lets you check if you might be missing something from the change log (aka release notes) though.
Or if you didn’t actually release something you thought you did.
no changelog until somebody complains?
export that cr*p out of whatever issue tracker you're using. and merge commits are great
Alternately: be religious about (merged) branches and changelog entries being 1:1.