Data on Swearing in Programming Languages
mahdiyusuf.com
I'm sure this same story came up a while ago, but whatever. Here's my take on the type of swearing in commits as per the story (XXXX represents a swear word of your choice).
C++: "Finally fixed it. My XXXX head hurts"
Javascript/Ruby: "Finished adding 15 new features. I am XXXX awesome"
C: "No time to swear, busy hacking. Oh XXXX whoops"
Java/C#: "Wouldn't swear, unprofessional & boss is watching"
Python: "All happy, & swears are naughty"
PHP: "Whats a commit?"
(ps. Just for fun, no flamebait intended!)
I'm willing to bet a lot of the Javascript obscenity is "F*ck browser X", or words to that effect.
I know that the most popular tag in our issue tracker is #fuckie.
It's particularly fun to pronounce aloud as if it rhymed with "ducky".
Haskell and Lisp programmers are ascended beings, and are beyond the mortal desire to cuss in commit logs.
I've sworn my fair share when trying to write Haskell bindings to libraries that assume all languages have global variables.
CPython's module initialization functions are particularly obnoxious about this.
I'm going to assume that at least 25% of swearing in Ruby is attributable to William Morgan: http://sup.rubyforge.org/svn/trunk/lib/sup/imap.rb
Obligatory: and then there's Perl, which is swearing.
I realize this post is a little tongue-in-cheek, however, you cannot use it to draw any meaningful statistical conclusions. This data makes no sense until you factor in the relative popularity of these languages on Github. We don't know if PHP has little swearing because it is a great language, or because PHP is rare on Github. Similarly, the reason Javascript and Ruby score so highly is most probably that they are extremely popular languages on Github.
"To make sure that the popularity of one language over another didn’t skew the results, Vos grabbed an equal number of commit messages per language."
I don't think it really matters. It may be that PHP developers who swear a lot don't use Github, or that Ruby developers who don't swear use Bitbucket, etc.
#define DEFINE_GUID_RIGHT_FUCKING_NOW_DAMMIT(x) // ... stuff
I wrote that at 2AM once when I was at a start-up. I'm not proud. (Well, okay, I am).
A couple years later (after the start-up had imploded) I was asked to consult to fix issues that cropped up in the code at a customer site -- they'd bought an SDK and were having problems. During a walk-through of my fixes I found myself explaining that line of code to a suit.
The suit nodded. "GUIDs. Yeah, getting those right is tricky."
[That consulting gig was sweet; short, but I was able to charge $300 an hour. I should have charged more, they basically didn't care.]
How can there be so little swearing with PHP? That's all I'm doing when forced to write in that language. Love this type of stuff though!
I'm not sure there's as much of a culture of using version control among PHP devs (they generally just upload the new version via FTP); otherwise I'm sure it'd be much higher.
The results are normalized to the number of commits.
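For anyone wondering what normalizing by commit count looks like in practice, here's a rough sketch. It is not the method from the article: the word list, the function names (`swear_rate`, `rates_by_language`) and the sample messages are all made up for illustration.

```python
import re

# Hypothetical word list; the list used in the original analysis is not given here.
SWEAR_PATTERN = re.compile(r"\b(?:damn|dammit|crap)\b", re.IGNORECASE)


def swear_rate(messages):
    """Return the fraction of commit messages containing at least one listed word."""
    if not messages:
        return 0.0
    hits = sum(1 for message in messages if SWEAR_PATTERN.search(message))
    return hits / len(messages)


def rates_by_language(commits_by_language):
    """Map each language to its per-commit swear rate, so that languages
    with different numbers of commits can be compared."""
    return {
        language: swear_rate(messages)
        for language, messages in commits_by_language.items()
    }


# Made-up sample data, only to show the shape of the input.
sample = {
    "C": ["Run, dammit, run!", "Fix off-by-one in parser"],
    "Python": ["Add tests", "Refactor loader", "Update docs", "Damn typo"],
}

for language, rate in rates_by_language(sample).items():
    print(f"{language}: {rate:.0%} of commits")
```

A raw count would put both languages at one swear each; dividing by the number of commits is what separates them. It still says nothing about who chooses to host their code where.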
No kidding. Usually it's "...what the fuck?" when reading the documentation.
Almost all of my work is in either C or Python. When I do my embedded system development in C, I generally spend my day swearing at myself and cursing the world. When I do other components in Python, I generally think "wow. that was easy and fast. And it's so expressive. Wow! I'm in a great mood!". It makes perfect sense to me.
I'm surprised at the Javascript figure. I leave a few swears in server-side script but the only people that will see that are other programmers; Javascript, anyone can read that.
That said, I've found a couple of rants in my old JS from the IE6 days that, well, if I clean them up they're not really sentences.
Looks like he scanned commit messages, not comments, so the only people who could read them are people browsing Github or people who have access to the repo some other way.
I can certainly understand the high level in C++. But Ruby vs. Python is odd. There are so many similarities in the Ruby & Python communities. The main difference seems to be that Ruby has One Framework to Rule Them All, while Python has ... various things. Maybe that has something to do with it?
There’s something really fishy about this. Or, if there isn’t, I really want an explanation.
I find using http://www.google.com/codesearch is more accurate, since it covers a lot more code.
I don't swear in comments or commit messages, but I do often catch myself quietly muttering a string of obscenities while coding. At least I hope it's quiet.
Ha. I just made a hack about this at PennApps - http://www.commitlogsfromlastnight.com
Half of these commit messages may just be sourced from http://whatthecommit.com/
I'd say PHP is lowest because it has the most non-English-speaking devs. It might be interesting to check across natural languages.
I'm surprised C# is so low. Maybe a broken-mice/keyboards chart would be more descriptive.
From an old bumper sticker:
C code.
C code run.
Run, code, run.
Run, dammit, run!
Maybe it is the demographics of the programmers who are using the languages.
Well my C# libraries on github have got some fucking catching up to do.
Funny thing. It would also be interesting to break down the percentage of each word used by language, to see which one is more #wtf or less #shit.