Conventional Commits: A specification for structured commit messages
conventionalcommits.org
Imagine the following future:
“Have you linted and unit tested your commit message?”
“Junior Developer wanted. 10 years of Conventional Commits experience required.”
“Download Conventionalizer! Now you can write Conventional Commits in plain English, having all the syntax automatically generated! (node, erlang OTP and Jerry’s pre-alpha TensorFlow binding library required. Windows support coming soon.)”
Something tells me the authors are hard at work solving a problem nobody needs solving.
Oh come on, this entire "spec" can be summarized in two sentences. It can be validated with a 13-character regex.
I share some of your sentiment though: I feel like the biggest reason to enforce a style like this is not for "machine readable commit messages" (I mean, why?), but to encourage people to split refactors and features in separate commits. This makes it easier to understand what's going on later.
I think this site should've begun with that, and left the spec as a footnote.
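For what it's worth, the regex validation mentioned above can be sketched in a few lines. This is only a rough sketch: it is not the 13-character regex the comment alludes to and not the official grammar. The pattern and the name `is_conventional` are assumptions made for illustration, and only the first line is checked against the `type(scope)!: description` shape.

```python
import re

# Hypothetical pattern: a lowercase type, an optional (scope), an optional "!",
# then ": " and a non-empty description. Not the official grammar.
HEADER_RE = re.compile(r"^[a-z]+(\([^()\s]+\))?!?: \S.*$")


def is_conventional(message: str) -> bool:
    """Return True if the first line of the message matches HEADER_RE."""
    lines = message.splitlines()
    return bool(lines) and HEADER_RE.match(lines[0]) is not None
```

Note that this accepts any lowercase type, so a typo such as `feet:` would pass; restricting the type to a fixed list is a separate decision.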
It's handy for auto bumping versions, too. If you have "feat" commits you know to bump minor, "fix" to bump patch, and "BREAKING" to bump major.
We use this in our lerna monorepo and it works like a charm as the CI can just bump whatever packages based on the paths and commit messages.
> Why?
The machine-readable part is useful for generating changelogs (eg. broken out by type) or implementing semver (eg. detecting breaking changes).
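As a rough illustration of the "changelog broken out by type" idea, here is a minimal sketch that groups commit subject lines by their prefix. The regex, the `other` bucket, and the function name `group_by_type` are assumptions for the example, not part of any particular tool.

```python
import re
from collections import defaultdict

# Hypothetical header pattern: type, optional (scope), optional "!", description.
HEADER_RE = re.compile(
    r"^(?P<type>[a-z]+)(\((?P<scope>[^()]+)\))?(?P<bang>!)?: (?P<desc>.+)$"
)


def group_by_type(subjects):
    """Group commit subjects by type; unparseable subjects go under 'other'."""
    groups = defaultdict(list)
    for subject in subjects:
        match = HEADER_RE.match(subject)
        if match:
            groups[match.group("type")].append(match.group("desc"))
        else:
            groups["other"].append(subject)
    return dict(groups)
```

A release script could then print the `feat` and `fix` groups first and push everything else to the tail end.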
but isn't it an antipattern to generate change-logs from commit subject lines?
That all depends on what you do with the meta-data in the commits. For example, at my work, we analyze the history at the level of merges, not commits, and that works pretty well for us.
Merge information is incredibly useful to developers (even if only to provide a smart "git bisect"). But would you generate a user-visible changelog from your merges? I personally wouldn't, and that's the point GP was making.
> But would you generate a user-visible changelog from your merges?
We can, and do. Knowing that the merges are going to wind up as items in the changelog leads us to size and structure them accordingly, or at least make a best effort to. The end result is a changelog where most PRs are somehow "noteworthy" and the ones that aren't (usually "chore" type in Conventional Commits lingo) can all be grouped together down the tail end.
The point isn't to produce a perfect result, but one that is "good enough". With this system, the friction to produce a new release is so low, and we have so many projects, that we can easily push out a steady stream of releases at very low cost with this approach. And following the pattern across all the projects has been great for consistency, compared to the relative chaos in patterns and procedures that we had before.
Of course not. How else would you do it?
Have a dedicated changelog you maintain[1].
Patch descriptions are for developers, not for users -- aside from the fact they're the wrong granularity (no user cares if it took 20 patches and three PR cycles to implement a feature), they should contain details and justification that are only useful for future debugging or for review purposes (which users also don't care about). And if you have a bugfix for a previously-merged patch that hasn't yet been released, why would you include the bugfix in the changelog?
Yes, with the right format and discipline you could generate reasonable changelogs from your commit logs -- but at that point it's so much easier to just keep a CHANGELOG.md.
In my experience it's unlikely this will be complete or accurate on any project maintained by a team of more than a couple of people.
Common practice for a long time has been to include a ticket id in the commit and then use a script to pull all the ticket ids from the commits since the last release and pull the (user facing) release note info from ticketing system.
Of course you don't want commit messages going to users but you don't want to rely on handcrafted lists either. Particularly when a release as far as a customer is concerned is infrequent and contains changes from many repos.
Make a human read the commit history (or tickets) and summarize changes in the language useful for users, not for developers.
Exactly. I can read commit messages myself, but there's a lot of stuff in there that's not relevant to me as a consumer - e.g. refactorings, cleanups, etc.
I'm sure the human appreciates some help from a script that can parse metadata in the commit messages.
You ever play or work on a game? Changelogs are a big deal over there. Automatically collecting them would be highly valuable.
Do this from your JIRA (or equivalent) tickets then; you already have user-oriented descriptions there. You can easily collect all tasks and bug tickets done during a sprint and generate a changelog. Though in practice I would expect someone to clean it up by hand.
How do you link the tickets to the code that was actually built to be sure the changelog is complete and accurate?
It's usual to include the ticket id in the commit or PR so you can pull it from logs at build time and have a canonical list. Then grab the info to include from the ticketing system.
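The "pull the ticket ids from the logs" step could look roughly like the sketch below. The ticket format (an uppercase project key, a dash, and a number, as in `GAME-12`) and the function name are assumptions; fetching the release-note text from the ticketing system is left out because it depends entirely on the tracker in use.

```python
import re

# Hypothetical ticket format: PROJECT-123 (uppercase key, dash, digits).
TICKET_RE = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")


def tickets_since_release(commit_messages):
    """Return the unique ticket ids found in the messages, in first-seen order."""
    seen = []
    for message in commit_messages:
        for ticket in TICKET_RE.findall(message):
            if ticket not in seen:
                seen.append(ticket)
    return seen
```

Commits that mention no ticket simply contribute nothing, which is also a cheap way to spot them.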
Ha! Maybe I just worked in a less organized game studio, but the only reliable record was the commit log.
Except when people would literally just mash their keyboard like this:
Commit 1234: iuadiuasdbuidawbiywbuqbqbdfpbdpiube
"THIS ONE COMMIT MESSAGE DRIVES PROFESSIONAL DEVELOPERS CRAZY!!!"
> feat: a commit of the type feat introduces a new feature to the codebase
instead of 'feat:', why not 'feature:'?
I dislike partial abbreviation because it is confusing; yes doc for document and max for maximum make sense but in this case feat is literally a different word?
And while we're at it, why don't we just use full sentences?
Before:
> feat: allow provided config object to extend other configs
After:
> Add option for config object to extend other configs
I know this isn't the point of changelogs, but I've been using the verbs from KeepAChangelog to start my commit messages and it's been going well so far.
> Add template preview to status page
> Change textarea to increase height on `:focus`
> Remove deprecated CLI flags
> Fix margin styles causing layout problems
I agree that full sentences (e.g. like Email subjects or blog post titles) are better. Tagging commits with feature/bug/etc is not particularly useful, as more often than not the line between feature, bug, etc is blurry. At times it can also be unclear what tag to use, leading to arbitrary choices. For example: is a performance improvement a feature, or a bug fix? The tags also add no value when reading commit messages.
Setting that aside, the conventional commit "standard" (https://xkcd.com/927/) doesn't focus on what I think is the most important aspect of a commit: a good commit subject and message. In fact, prefixing the subject line with certain tags limits the amount of characters you have for writing the message; assuming you want to stick with the usual 50 character limit.
> feature/bug/etc is not particularly useful
It's useful for automatically determining the next semantic release version by inspecting the commit history alone.
fix: <-- patch
feat: <-- minor
breaking: <-- major
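A minimal sketch of that mapping, assuming breaking changes are flagged with a `!` after the type/scope or a `BREAKING CHANGE:` footer rather than a literal `breaking:` prefix. The function names `bump_level` and `bump` are made up for illustration and this is not how any specific release tool is implemented.

```python
import re

# Hypothetical header pattern: type, optional (scope), optional "!", description.
HEADER_RE = re.compile(r"^(?P<type>[a-z]+)(\([^()]+\))?(?P<bang>!)?: .+")
ORDER = {None: 0, "patch": 1, "minor": 2, "major": 3}


def bump_level(messages):
    """Return 'major', 'minor', 'patch' or None for a list of full commit messages."""
    level = None
    for message in messages:
        lines = message.splitlines()
        match = HEADER_RE.match(lines[0]) if lines else None
        if match is None:
            continue  # not a conventional header; ignored in this sketch
        has_footer = any(
            line.startswith(("BREAKING CHANGE:", "BREAKING-CHANGE:"))
            for line in lines[1:]
        )
        if match.group("bang") or has_footer:
            candidate = "major"
        elif match.group("type") == "feat":
            candidate = "minor"
        elif match.group("type") == "fix":
            candidate = "patch"
        else:
            candidate = None
        if ORDER[candidate] > ORDER[level]:
            level = candidate
    return level


def bump(version, level):
    """Apply a bump level to a plain 'X.Y.Z' version string."""
    major, minor, patch = (int(part) for part in version.split("."))
    if level == "major":
        return f"{major + 1}.0.0"
    if level == "minor":
        return f"{major}.{minor + 1}.0"
    if level == "patch":
        return f"{major}.{minor}.{patch + 1}"
    return version
```

For example, `bump("1.2.3", bump_level(["fix: a", "feat: b"]))` yields `"1.3.0"`, since the highest-ranked commit type wins.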
So why not just use patch, minor, major? Then we don’t even need the indirection.
Some fixes will require breaking changes. Fixes can be part of the message with the issue #, etc.
In this case you're supposed to add a ! after the type, i.e. fix!: instead of fix:. By using patch, minor and major you lose some information (the purpose of the commit).
That seems inaccurate, it looks more like fix: feat: are both potentially patch level while fix!: feat!: breaking change: and breaking change!: indicate major changes... possibly? Ouf I think the commit is just absolutely the wrong level to encode this at - I much prefer ticket level encoding of this information.
I was explaining how standard-version [0] works. Conventional Commits homepage doesn't mention anything about the '!' syntax. [1] I'm unsure if you're suggesting that's how you think it should work, or how it actually works. I haven't ever tried using the '!' syntax, so I can't say for certain.
In terms of release management, it makes configuring CI jobs simpler with one less parameter. If you automatically release merges to master, you can use standard-version with conventional commit syntax. On other projects, I've seen people use GitHub PR labels to mark 'major', 'minor', or 'patch' releases (the CI system reads this information when generating releases).
If you feel it's inappropriate for this information to live in your commit history, you'll need to specify it through one of these other options.
Since I don't have a dedicated team for this sort of infrastructure (I maintain my own Jenkins jobs), I find that Conventional Commits get the job done, so I can focus on other things. There could be better ways, but I have more pressing problems that demand my attention, with higher priority than optimizing my CI/CD configurations.
[0]: https://github.com/conventional-changelog/standard-version
[1] has:
"or appends a ! after the type/scope, introduces a breaking API change"
I've got an "Unreleased" section in my changelog in which I log any changes that are relevant to consumers, with sections for bug fixes, new features, and breaking changes. No need to clutter my commits with that.
> the usual 50 character limit.
People do this? I guess it's fine if your commits are <100 lines of code but it seems needlessly concise, to the point of losing information.
Personally, almost all my commits follow email styling: first line tries to be concise, newline, then a list of what changes have been made along with a brief justification. Feels like a bare minimum unless you'd rather tie everything into PRs (which can't be seen from the command line). This is why, for people unfamiliar with git, I refuse to tell them about commit -m: I feel it leads to undocumented history when you need to audit code.
disclaimer: I tend to work on long lived projects (5+ years)
Edit: bonus point that I forgot - by having a detailed commit log that reads like prose, it's easy to explain "everything you guys are working on" to the non technical folks.
You're arguing against a strawman because you misunderstood the parent comment.
The character limit was only referring to the subject line. Under it you can have as many lines of additional context as you wish. That's the normal way; the subject is brief, the message is arbitrarily long.
For what it's worth, the linux kernel limits it to around 70 chars and uses a subsystem prefix, so clearly that can work on large long-lived complex projects.
Your argument against 50 characters in the entire message, including the body, is not one anyone would ever argue for, so you've constructed a complete strawman.
And some useful context: the 50-character limit is the result of the subject lines then being readable next to a commit graph that Git can display.
Surely the most important part of a commit is the code committed, not the accompanying message.
Yes and no. Both are important.
The code is what actually gets run, sure, but good code with bad or misleading documentation can cause trouble later on. Commit messages are documentation.
And commit messages are possibly the best form of documentation when it comes to debugging a problem -- it lets you know exactly what a developer was thinking when the code was written.
Why? Why is there such a strong desire for full sentences?
It's pedantic to me. I'm completely fine with terse incomplete sentences. They are not harder to understand. Plus I work with international teams. Terse is often easier to write and understand for non-native speakers.
I worked on an international team with non-native speakers and we found proper casing and punctuation easier to read.
OCD seems prevalent in programmers. For me full sentence comment / commit message requirements are the ultimate process for the sake of process. Such a bikeshedding waste of time. Such policies serve absolutely no objective purpose. They're only there to satisfy some leader's aesthetic senses and feeling of control.
As for "easier to read": thousands of newspaper headlines and magazine article titles provide strong evidence otherwise. Even HN itself is proof it doesn't matter.
I also dislike unnecessary abbreviations. "Feat" saves you just 3 characters, but earns you ugliness and confusion.
It seems like it was made by someone who accidentally spilt coffee on their keyboard which made their 'U' key sticky.
I like my "Feature" much more than "feat". This is my default commit message that I edit to contain what I want:
<Type>: <Description>
# Type can be:
# - Feature: A new feature
# - Bugfix: A bug fix
# - Docs: Documentation only changes
# - Styling: Changes that do not affect the meaning of the code (white-space, formatting, missing semi-colons, etc)
# - Refactor: A code change that neither fixes a bug nor adds a feature
# - Performance: A code change that improves performance
# - Tests: Adding missing tests
# - Chore: Changes to the build process or auxiliary tools and libraries such as documentation generation
Nice! "Styling" could be ambiguous though. I personally use "Formatting" instead.
I typically go with "Linting". "Formatting" could mean something different in some contexts.
That's an even better choice. Thank you!
And only saves you like 3 symbols. Made a pull request fixing it https://github.com/conventional-commits/conventionalcommits....
You're not saving them any work by creating that PR. That's better discussed and agreed on in advance.
Titles of commit messages should be short and concise, any extra information can go into the body of the commit. A common guideline is to have the title capped at 50 characters. If you work on the CLI, having a concise git log is far easier to skim than having very long commit messages and if I want to know more about the commit, I will check the body. It's also what a lot of websites use to truncate the title.
From my experience, a few characters less do matter (which is also why I dropped conventional commits and just use "Add blah blah to blah", "Fix typo in user-facing message").
`feat` is only a saver over `feature` if you're making a descriptor that was optional into something required - I admit I've never worked with a public commit history but for our internal projects commit messages are expected to give some justification or explanation of the necessity of the change without any formatting specifically enforced (though we require branches to contain at least one commit that pulls in the related issue ticket #).
I much prefer encoding structured information like this at the ticket level where history can be more easily corrected and items are expected to be visible for all of time.
Probably because tools like github truncate large titles
Because it makes you feel good from sounding like you've just accomplished a feat when you commit the feature.
Commit messages should be short. If you're familiar with conventional commit syntax (and chances are, your team will tell you to follow it, if your repository follows it), then your mind will automatically expand 'feat' to 'feature'.
Tha i wha I cam her t writ a wel. D w reall nee t repea th infamou Uni mistak wit th "creat" syscal? Thin o th childre!
I'm sticking to the GNU ChangeLog format, thanks.
https://www.gnu.org/prep/standards/html_node/Change-Logs.htm...
This widely used format gives details about what is being done to each function.
This was designed to be used in a ChangeLog file, so it has to be adapted for repository use. We don't have to record the date and name, since that is in the commit meta-data. We write a commit title, and then the ChangeLog entry becomes the details placed after the blank line. That entry is mandatory: no title-only commits! There can be one or more discussion paragraphs between the title and the ChangeLog entry. We know that these paragraphs aren't ChangeLog entry material because they don't begin with the asterisk.
Like this: http://www.kylheku.com/cgit/txr/commit/?id=b2739251281d7f6ef...
Personally, I feel like this style puts the focus on what rather than the why. I also dislike that it seems to be centered on multiple changes in one commit.
We've been following conventional commits for our front end code for the last year or so at my work. In other repositories, we've loosely followed the keep a change log conventions. I find conventional commits great when your repository will produce a package to be consumed by others. For example, conventional commits for our shared JS code helps us produce great change logs and helps us easily follow semver for the NPM packages our other applications use.
However, I don't find it that useful in the final applications, even counterproductive, since it typically will take up quite a bit of space in the commit title. Many of our front end devs completely ignore title length conventions now.
Why don’t they put the additional information in the body of the commit?
I see this in nearly every company I go to - everyone rushing to skip over adding anything useful to the permanent log by using git commit -m rather than a plain git commit.
This is the place where (mentioned elsewhere in this thread) things like issue tracker links and other context can and _should_ go if you're using something like CC.
Oh, we do. We are generally pretty great at filling in good details in the body. I didn't mention that originally because I didn't think it was noteworthy.
The main problem is very long commit titles that end up looking like:
feat(SomeScope.OtherScope.Class): add support for abc and xyz option
For use cases where this level of rigour is desired, it would be nice to have real separate metadata vs. convention. Doing this by convention is unreliable.
You can get a basic level of enforcement for free by turning on the "Semantic Pull Requests" bot that will let you know when you forget the type (or use an invalid one):
https://github.com/probot/semantic-pull-requests
It obviously won't catch your mistake if you forget to mark a breaking change as breaking, but it's a start.
I don’t think this is so much to make your commit messages better, as it is to make sure that all of them can be automatically processed into changelog and semver updates.
This is the way to think about it. It's concise enough and has tooling in enough languages to where generating the changelog from the commit messages is just a CI step, but it doesn't offer much more. I like and have used Conventional Commits for several years, but the goal is just tooling around telling others what changed outside of reading the git log, e.g. PMs who want an HTML artifact.
This strikes me as quintessential bike shedding, process for the sake of process.
At $DAYJOB, we organically switched from not having any formal style to having an internal formal style. People seemed to want the benefits of tooling integration and clearer communication.
Right now, we are switching SCM's and are looking at adopting Conventional to replace our internal style. I've already started using Conventional and have really appreciated it. It makes it fast and succinct (remember, line length "requirements" in git) to get the information you need even in one-line logs. Also, it makes CHANGELOG maintenance easier, whether using an automated tool or doing it by-hand.
Not happy with the other ones, I've created my own commit style validation tool, committed [0] and have deployed it on my open source projects. Like code style enforcement in CI, I like delegating this to a tool since it makes the requirement very clear for contributors.
The one thing I'm disappointed with in Conventional is that they did not follow git conventions for multi-line trailers.
Similar experience here. On really big teams, sure, you can bikeshed the format a ton, but the options are all close enough, and CC has some good tooling, so we just ran with it. Results have been fine, didn’t waste a bunch of time debating it.
Haven’t figured out a good way to integrate co-authors easily with it though.
Wouldn't Co-Authors just be a footer/trailer?
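To make the footer/trailer idea concrete, here is a simplified sketch that treats the last paragraph of a message as `Key: value` trailers. It is an assumption-laden illustration, not a reimplementation of git's own trailer handling: multi-line (continuation) trailers and keys containing spaces are not supported, and `parse_trailers` is a made-up name.

```python
import re

# Simplified trailer shape: a token made of letters, digits and dashes, then ": value".
TRAILER_RE = re.compile(r"^(?P<key>[A-Za-z][A-Za-z0-9-]*): (?P<value>.+)$")


def parse_trailers(message):
    """Return (key, value) pairs from the last paragraph, or [] if it is not all trailers."""
    paragraphs = message.strip().split("\n\n")
    if len(paragraphs) < 2:
        return []  # subject only, so there is no trailer block
    trailers = []
    for line in paragraphs[-1].splitlines():
        match = TRAILER_RE.match(line)
        if match is None:
            return []  # the last paragraph is ordinary body text
        trailers.append((match.group("key"), match.group("value")))
    return trailers
```

With this shape, co-authors are just the values whose key is `Co-authored-by`.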
Isn't wording a bit off? "scope" should describe what the commit DOES, not what you are personally DOING, and not what you were intended to DO.
"body", optionally, describes WHY.
Also it feels like more of a convention for a personal project with optional C(I|D) automation prerequisites. In a team there should be a clear and emphasized place for the issue tracking info (ticket number, task id etc etc)
I quite like the idea of `scope` for large, multi-component projects, so you can tell instantly from the commit message what component has been changed.
I’ve found the commit message guidelines at https://git-scm.com/book/en/v2/Distributed-Git-Contributing-... to be very helpful for clarity.
“ The last thing to keep in mind is the commit message. Getting in the habit of creating quality commit messages makes using and collaborating with Git a lot easier. As a general rule, your messages should start with a single line that’s no more than about 50 characters and that describes the changeset concisely, followed by a blank line, followed by a more detailed explanation. The Git project requires that the more detailed explanation include your motivation for the change and contrast its implementation with previous behavior — this is a good guideline to follow. Write your commit message in the imperative: "Fix bug" and not "Fixed bug" or "Fixes bug."”
50 chars seems pretty arbitrary to me. I'd rather have a useful commit message. I've seen some pretty contorted messages conveying no real info in order to meet an imaginary character limit.
The 50 chars is for the subject. The commit message (body) has no overall character limit (apart from a character limit per line).
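The subject/body distinction can be sketched as a small check. The 50-character subject limit comes from the discussion above; the 72-character body line limit, the message wording, and the name `lint_message` are assumptions chosen for the example.

```python
def lint_message(message, subject_limit=50, body_limit=72):
    """Return a list of problems: long subject, missing blank line, long body lines."""
    lines = message.splitlines()
    if not lines or not lines[0].strip():
        return ["empty subject"]
    problems = []
    if len(lines[0]) > subject_limit:
        problems.append(f"subject is {len(lines[0])} chars (limit {subject_limit})")
    if len(lines) > 1 and lines[1].strip():
        problems.append("missing blank line after subject")
    # Body lines start at line 3; only their per-line length is checked.
    for number, line in enumerate(lines[2:], start=3):
        if len(line) > body_limit:
            problems.append(f"line {number} is {len(line)} chars (limit {body_limit})")
    return problems
```

Both limits are parameters, so a team that prefers a 72-character subject can call `lint_message(msg, subject_limit=72)`.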
Right. But that's the limit I don't get. If I always have to view the expanded set of commit messages to understand anything about the commit, what's the value in a short subject? And why 50 chars? That's even more restrictive than the normal 72/76/80 chars.
I think a limit is helpful to keep the subject line short, but I frequently (like, daily) struggle to fit useful messages into only 50 chars. Personally, I find 72 chars to be the sweet spot.
I wish GitHub or Azure DevOps made this configurable (because I use them) - anyone know if any other hosted git systems have this as a configurable option?
I'm in favor of a team being able to set a limit if they'd like and tools adapting. Personally, I'd rather a 90 char commit message that conveys useful info than a contorted message intended to fit in 1/3 the size of an SMS message.
It's so that when you print the git-log graph, you don't end up with wrapping of the subject text for deeply-nested trees.
Also, in the kernel generally you reference a different commit in a commit message with the form 'commit <12 char commit id> ("subject")'. So further restricting it makes sense.
Okay. But I contend that makes the whole log worth less because the short commit messages are often devoid of useful info. Granted, I don't have an email-based patch workflow; I look at logs on both GitHub and pulled in from the CLI. On GitHub, things are even worse because I have to constantly expand commits to see what actually changed.
In any event, I think teams should be free to adopt the workflow that works best for them without tools getting in the way. GitHub wrapping commit messages at 70 chars, even though I have more than plenty enough horizontal real estate, isn't helping anyone.
It's still overly restrictive and ossified from terminals / email subjects or whatever.
I do something like this but for branch names. This spec recommends a squash-merge workflow to turn branches into commits before merge. Why would I wanna do that? It seems like throwing away a lot of detail unnecessarily.
SemVer is generally good practice but I don’t like promotion to religion. For example, during the pre-release of the firebase-functions SDK we shifted SemVer by one: 0.2.1 was a feature addition from 0.2.0 and a breaking change from 0.1.
Similarly there are rare cases where I’ve swept breaking changes under the rug because they were severe bug or security fixes that affected a corner case unlikely to be seen in the wild.
I believe the first one is semver: before 1.0, anything can change at any time. https://semver.org/
Commit messages are just that - an additional communication tool. As long as any format helps keep the understanding within a team clear with a minimum of overhead, so be it.
After all the commit message is secondary to the actual code committed.
I'm sure everyone can share an episode when a nicely worded commit had to be followed up with an ugly 'Fix a typo' message.
The most practical convention is the one that's automated to some degree, for example, issue/feature tag auto-linking or some template-driven messages. Either way, the message should not become an ultimate hoop to jump through before the actual commit and one more thing to 'maintain'; the code should be the focus.
In my experience, a commit message describing the committed behavior (even when intended) helps tie the code to the overall scope. When it's a bugfix, it still must be tied to the correct expected behavior.
So in some sense a commit message could serve as an auxiliary level of unit testing. Of course, I'd rather put effort into enforcing the actual practice of unit testing than into structuring the commit messages.
> I'm sure everyone can share an episode when a nicely worded commit had to be followed up with an ugly 'Fix a typo' message.
There is `git commit --fixup` and `git rebase -i --autosquash` for that ;)
We’ve found conventional commits useful in our mono repo. Instead of letting the authors deal with versioning (which sometimes breaks dependencies), our build pipeline determines the semver from the commit messages. This has made it easier to deal with releases for around two dozen packages by developers spread across three different countries.
> When you used a type not of the spec, e.g. feet instead of feat
This actually had me laughing quite a bit. Because of my love for dad jokes, here are some less conventional commits:
"fete" : adding holiday support
"braking change" : a change of pace
"nix" : removing a feature
"suffix" : adding a nice to have
Conventional commits pair nicely with a Lerna monorepo when deploying multiple JS packages at once. Auto-generated changelogs and automatic semver for packages. It's worked well for us over the past year.
https://github.com/lerna/lerna/blob/master/commands/version/...
lost me at “feat”
I would rather shorten the standard tags around it than have to shorten the 80 char short commit message. Typically tooling allows you to add your own tags (e.g. imp: for improvement) so you could add feature, but I find myself needing those extra few characters more often than not
I like best the irony of "refactor!". A breaking change with a title meaning there should not be semantic changes in the code...
I have to admit that in the GitHub and PR era I rarely look at individual commits or their messages. I look at whole PRs.
prefixes such as "fix: " are better expressed at the bottom of the commit message body.
They are metadata, and as such they shouldn't take more attention than the actual data.
This matters when you are in a bug hunt in production - you want to find the culprit commit as efficiently as possible, without distractions.
maybe it's just me, but things like this sap half the fun out of development.
I have a system that creates commit messages automatically. The commit messages themselves are YAML so that they can contain various bits of metadata - current task id, timestamps for oldest/newest known builds associated with that task etc...
feat of strength or great strength of feet?
...don't follow.
This would be better if it was called the Committer Covenant.