On Code Review

hugodias.substack.com

75 points by HugoDias 4 years ago · 57 comments

Reader

V-2 4 years ago

"Approving a pull request without even testing the code it’s very dangerous"

Well; code review is not QA. My approval means - I'm OK with how the code layer is knitted. It doesn't mean I've tested the changes.

Of course, if there's some logical problem with the implementation (such as the author seemingly failing to handle an edge case), any careful reviewer should catch this out.

Still, this is a situation where the overlook could be detected through theoretical code analysis, so to speak: which is what a review is. That's not testing.

"some 2~3 line changes may not need to be tested, but those are the exception, not the rule."

Generally speaking everything needs to be tested (not necessarily manually), but that's beyond the point. Testing is not reviewing, and vice versa.

kashug 4 years ago

I agree that code review is not QA. But I read it as the writer of the post comes from a team where they do QA on pull-requests. And QA is assumed to be done on a per PR basis. But here I think different teams will have different approaches and pipelines.
On my current team, we want most tests to be automatic. But the manual regression testing that might be needed is mostly done by other developers. A separate test-environment is created for each PR, and the link posted in the pull-request. So when looking at a pull-quest we expect people to do both a code-review and QA.
In my team approving a PR means you approve for the code to go to production. So you need to consider both code quality and whatever QA is needed. But thats just our way t do it - other teams do things differently. And I think thats OK, there is no silver bullet.
wokwokwok 4 years ago

Bluntly, this is the problem with code reviews: at the review stage it’s too late.
Changes from this type of review are usually fine details (“use a constant for this”, “consider this edge case”, “prefer this style…”) that can be done easily by the reviewer with no context.
If you’re lucky, you might get meaningful feedback (“consider using this approach instead…”), but many people just use code reviews as a gateway to merging code (1).
This is the pragmatic approach; just trust the other developers and doing their jobs and do a light pass check to ensure that everyone is aligned on approach and style.
..but, that’s not what you were tasked with.
You were asked to review the code, not the syntax.
You cannot review code you do not understand.
I’m sorry, but your mental parser does not run code. It does not render ui, update databases or generate concurrent race conditions. You may be able to approximate some of those, but most people can only do all of those things by actually running the code.
So… you may get some value from looking at code; but I question, in almost all cases, if teams contributed significant value by doing so.
[1] - see https://blog.jetbrains.com/upsource/2017/01/18/code-review-a..., etc. google “code review as gateway”.
fredrb 4 years ago

> Well; code review is not QA. My approval means - I'm OK with how the code layer is knitted. It doesn't mean I've tested the changes.
I'm 100% on-board with this. However, I understand were OP's mindset comes from. A few years ago I was working at a development shop in a fairly large project, where we had little to no automated tests. We frequently had PRs that would break main features. At some point developers were required to perform smoke tests alongside code reviews.
lhnz 4 years ago
```
  > Well; code review is not QA. My approval means - I'm OK
  > with how the code layer is knitted. It doesn't mean 
  > I've tested the changes.
```
You will be able to do more substantiative comments if you pull down their code and test it. It will improve the quality of your review, since it can be situated in the context of how the program works instead of how the code is abstracted. Your reviews should consider the practical workings of a program, and not just consistency with patterns or superficial bugs.
I know that more experienced people are able to notice bugs or edge cases due to having a lot of experience and understanding of prior art, but noticing incorrectly designed/implemented logic/behaviour is much easier if your review includes execution of their code.
If you leave this to some later QA stage or QA team then you're just making your reviews less effective and increasing the size of the feedback loop.
- siva7 4 years ago
  
  It’s a matter of efficient role separation. Most teams i’m aware of do not manually test a PR in the same cycle as a code review. It is absolutely expected that the PR author already tested his changes sufficiently. The manual test is usually reserved for the product or QA team.
  - lhnz 4 years ago
    
    It doesn't seem possible to me that "role separation" could be more efficient.
    If you wait until code is merged into `master` before doing further testing, you're saving yourself 15 minutes of time, but when a problem is found everybody will be disconnected from the source of the bug.
    If you, as a software engineer, pull down the branch and try running the code then you'll be able to find bugs, edge-cases, misunderstandings of the design (etc) and then literally point out where they occurred in the source code to your colleagues at the point of the review.
    It saves a bunch of time to do it this way, because it only takes <15 minutes to pull down a branch and execute the code, and finding problems at this point reduces everybody's feedback loop and allows you to give meaningful feedback in relation to the diff of the source code. (Note: I'd not do this with trivial changes.)
    (I think the only reason that people rest on "role separation" here is because they've never actually tried doing anything deeper so can't comprehend that it helps to find deeper problems.)
  - pixel_tracing 4 years ago
    
    Good luck getting junior or mid level engineers to do this properly each and every single time :) hell I’ve seen even seniors do this improperly
    
    ljm 4 years ago
    
    It's a lot easier to do in an application that's easy to boot up locally. A lot of software in web-startup world grows out of that stage once the k8s/distributed architecture earworm burrows in.
    
    WorldMaker 4 years ago
    
    It can be an automation thing. With tools like (Github) Codespaces there are places today that have the ability to deploy every random PR branches and do PRs in a full code editor backed with a running deployment that you can browse. Admittedly, that's a lot easier to do when what you are deploying is web-based and automating spinning up new URLs (even just proxy forwarded localhost URLs in the case of Codespaces) is easy and things are a lot harder to automate with for instance mobile apps.
    
    siva7 4 years ago
    
    If there are engineers on the team wo assign PR for review that aren’t tested by themselves or don’t contain automated software tests (if the project has such standard) it’s time for the manager to have a 1on1 discussion about professional standards. Sometimes juniors aren’t aware of these shortcomings or are coming from a shop with bad development practices. They just need sometimes to learn professional practices and there is nothing lost.
    
    BobbyJo 4 years ago
    
    If you can get people to test in review, I don't see why you couldn't get them to test before review. If you can't get your developers to test appropriately, then you've hired bad developers.
  - ozim 4 years ago
    
    Efficient role separation?
    When devs would only write code and never click anything around in the system how can they implement anything good?
    I don't believe in "efficient role separation" that is 100%. Devs still need to click the system around and still need to attend meetings that give business context. The same for QA, they should understand basic exceptions and know where to find logs to make good reports and not just make screenshot and say "not passed" and drop all troubleshooting on the devs or ops.
    Best outcomes I see are when you have multidisciplinary team where team members work together. Of course you have Dev or QA specialists but you have to have people who don't throw stuff over the wall and are focusing on delivering togheter.
NotAnOtter 4 years ago

A submitted code review comes with the following assumptions
1. The submitter approves of the code being merged into master 2. The submitted has tested the happy path and everything seems to work

chrismorgan 4 years ago

I also find code review to be an exceptionally effective way of learning a new language, especially languages that have a different mental model from languages you’re already familiar with. I do some Rust training and mentorship, and have found the most effective ways of teaching (once past the basics) to be reviewing their code, giving suggestions on why this pattern might not scale well in Rust, how you can take advantage of ownership here, how to be pals with the borrow checker rather than cloning stuff all over the place there, how expression orientation can make your code neater and more readable, how iterator chaining could help you play to Rust’s strengths, how to work with rather than against Result-based error handling, why you should really give up trying to implement something in an inheritancy way and what an alternative design might be, &c. &c.

gocartStatue 4 years ago

I find post-coding, pre-merge code reviews ineffective, for a change. The code was already written. Time/money spent. Finding out that solution is subpar at this stage is costly, and delaying integration makes ineffective teams.

If you want to enforce standards, automate it (there are multiple highly configurable tools for that). If you want good solutions, pair program. If you want to maintain good codebase, review it periodically (codebase, not some changes to it contained in pull request) and refactor. Use TDD / BDD to ensure the app still works.

V-2 4 years ago

"I find post-coding, pre-merge code reviews ineffective, for a change. The code was already written. Time/money spent."
This implies the need for smaller pull requests maybe?
If work is broken into smaller chunks, addressing peer feedback isn't as costly, because there's time to back out of poor decisions if the peers get alerted to them early on.
Changing the course once the implementer has indeed invested lots of time and effort into a flawed solution is ineffective, true; but that might just mean the feedback was asked for too late.
"If you want to enforce standards, automate it"
Agreed, as long as you're thinking standards as in indentation, naming rules etc. All the simple stuff can be done with code inspections, and employing humans to enforce these "mechanical" standards is a waste of time. No argument there!
However, there are higher-level standards that can't really be automated though, or at least not effectively. Eg. various architectural conventions and the like.
"If you want to maintain good codebase, review it periodically (codebase, not some changes to it contained in pull request) and refactor"
Well, I can see how it could work in a small, lean project... but in some multi module behemoth with millions of LOC and history spanning back years?
"The code was already written. Time/money spent" rings much more strongly in that scenario that in reference to "regular" code review.
- KronisLV 4 years ago
  > This implies the need for smaller pull requests maybe?
  This makes me feel like perhaps pull requests shouldn't always pull/merge the changes directly, but instead there should be something like the GitLab "Start a review" functionality, where you can queue up code comments, before submitting all of them at once, but for pull requests.
  As a Jira analogy, imagine the totality of the changes that you want to merge as an issue, but the individual steps/smaller requests as sub-tasks. The entire task would only be considered done after all of the sub-tasks are done, similarly, the changes should only be merged after all of the sub-requests have been successfully reviewed. Thus, development could happen more iteratively, with feedback sooner.
  Otherwise you end up with something like the following being separate requests:
  - FEATURE-123 add data models for the feature - FEATURE-123 add service logic for the feature - FEATURE-123 add logging and validations for the feature - FEATURE-123 add unit tests for the feature - FEATURE-123 add REST endpoints for the feature - FEATURE-123 add front end code to forms - FEATURE-123 add front end logic to use the REST endpoints - FEATURE-123 add integration tests for headless browser
  You don't actually want these changes to end up in the main branch before they are done. Alone, they provide no actual value and could only confuse people. Furthermore, if there are changes that are needed to the data model after it has already been merged, then you'd also need to do those within a later merge request, or one in the middle, which would just pollute the history and serve as spam, instead of being able to add more commits to the other sub request.
  To me, all of this seems like a problem that stems from developers viewing code review as something to only be done at the moment when you want to merge the feature to your main branch. Even the tooling that we use is built with this in mind. It feels like there are better alternatives.
  EDIT: So, you'd need something like the following:
  - (WIP, 3/8 reviewed) FEATURE-123 add new feature // feature-123 branch, cannot be merged into main yet, review the below first - (reviewed) add data models for the feature // feature-123-data-models branch, will be merged into previous - (reviewed) add service logic for the feature // feature-123-service-logic branch, will be merged into previous - (reviewed) add logging and validations for the feature // feature-123-logging-and-validations branch, will be merged into previous - (in review, 9/13 comments resolved) add unit tests // feature-123-unit-tests branch, will be merged into previous - (registered) add REST endpoints for the feature // feature-123-rest-endpoints branch, will be merged into previous - (registered) add front end code to forms // feature-123-front-end-code branch, will be merged into previous - (registered) add front end logic to use the REST endpoints // feature-123-front-end-rest branch, will be merged into previous - (registered) add integration tests for headless browser // feature-123-integration-tests branch, will be merged into previous
  (personally i think merges make the most sense, but some people prefer rebasing; it's just an example)
  You can technically do the above already, by having a main feature branch and functionality specific feature branches all of which get merged into the main feature branch, before then being merged into the main branch after the feature is complete. It's just that people usually live with the view of: "One merge request == one feature", which i don't really think should be the case.
  - kashug 4 years ago
    
    > You don't actually want these changes to end up in the main branch before they are done
    I see no problem with that if it is a small change that can de deployed safely. (for instance hidden behind a feature-toggle where applicable).
    We tend to strive for small, short lived branches. We usually don't want PRs with more than a couple of days work - to keep them small, easy to test. It also makes the amount of changes going to production at any time small.
    The best would be if any specific feature is small enough to only be a few days work. But out experience is that it is not allways easy to do that - and sometimes some features will take weeks before they are done.
  - tsimionescu 4 years ago
    
    While I think this would be nice to have in principle, I think the devil is really in the details. For example, the structure you show for these sub-commits is one I generally hate to review - the data models may look OK when I look at them, but then when I see the implementation code, I realize that the data models were inadequate etc.
  - barrkel 4 years ago
    
    FWIW, that series of small separate requests, reviewed separately, is pretty much how code review works in Google. Each change is attached to the same tracking issue which is what tracks completion, rather than any single or group change being committed.
    Yes, alone they don't provide actual value, but they are small (easy to review) and can be trivially / automatically rolled back (database migrations aside) and can be deployed (without being used) safely, and even when they are exercised, it may be via a switch, itself toggled with a pull request which itself can be rolled back in isolation.
    Whatever you think is a small change, think even smaller. Think creating a class and stubbing out many of the methods with TODOs.
    Delaying the merge until all commits required for the feature wouldn't work in a large monorepo - just keeping small commits up to date can be work when there's enough people working around the clock. There's no feasible way to sustain continuously rebasing in a feature branch model in a large monorepo.
  - V-2 4 years ago
    
    "You don't actually want these changes to end up in the main branch before they are done."
    True - and it sounds like a textbook scenario for introducing a feature branch... this approach has worked well for me.
    The idea of queueing up "sub-requests" is interesting, but it comes at the cost of creating another layer of complexity (in my view).
    
    gocartStatue 4 years ago
    
    Feature branches introduce more complexity than real value (bold statement, right)?
    I really enjoy Dave Farley's videos on youtube - "Continuous Delivery" channel. A friend recently sent me this piece from Atlassian: https://www.atlassian.com/git/tutorials/comparing-workflows/...
    
    KronisLV 4 years ago
    
    Agreed! I guess what i'm trying to say (in an awfully wordy way) is that we could probably improve the UI for pull/merge requests and gain a lot from it.
    I mean, GitHub and GitLab already improved many development workflows with their inline comments (as well as other features, like code recommendations) and it feels like sooner or later more improvements are bound to come!
    
    WorldMaker 4 years ago
    
    I don't have as much experience with GitLab, but I actually think Github has a lot of tools in their PR system today for doing "early" or "work in progress" PRs incrementally. You can start a PR on an empty branch. You can apply a label "Work in Progress" and start discussion immediately. If you review a chunk of code the PR is pretty good about showing you what you've already reviewed versus what is new in the most recent update. If for some reason you feel a need to start over and force push an entirely new branch to the PR, Github's PR system handles it better than you would expect (and auto-closes old comments, etc).
    I've found the tools are better than people expect them to be and the hard part for me is convincing junior developers that it is "okay" to start PRs "early", that they don't have to feel like they've finished a task to get comments/reviews on work in progress. There's a lot of embarrassment/group politics in the way sometimes from using "PR early" workflows.
    
    boleary-gl 4 years ago
    
    GitLab employee here - that is true of GitLab too.
    But I agree the bigger challenge than any tooling challenge is getting the team convinced that starting a PR/MR early is not only okay but also the way to operate. Creating physiological safety around it as well as making iteration a part of the day to day workflow is a challenge but an important one.
frank_bin 4 years ago

I tend to agree. I prefer design drafts (documents) instead of pair programming, though.

andrewingram 4 years ago

I've observed a small (yet growing) sentiment emerging lately that code reviews aren't all they're cracked up to be, and can actually be a net negative. I'm somewhat on the fence and can see all the arguments, though still err on the side of code review. But I am somewhat convinced that the way most teams (at least those i've seen) practice them isn't ideal.

ozim 4 years ago

I think the problem here is that people who thought CR will be some "silver bullet" see it is not working for them.
Code reviews are in my opinion still a good tool. It works well in other industries as well. You want to bounce stuff from the other people before pushing it further.
Where it goes wrong is people - so it is people problem not the tool problem. If you do not have good people you won't fix them with code reviews.
It is the same with agile, managers tend to think they can just follow some agile coach and can hire anyone to the team. But the truth is, I see bad teams going bad with or without agile and good teams agile or not going strong.
freddref 4 years ago

Do you have links to reading on the arguments against code review?
- andrewingram 4 years ago
  
  I'll see what I can dig up. Some of it is from Allen Holub, who I appreciate is often met with mixed responses here; but his angle is mostly that if you have sufficient up-front collaboration (he favours mobbing), code review don't offer you anything on top of what automated tools provide, and can actually erode team trust/morale because they're perceived as a barrier often introduced as a way to say "no" to people.
  I don't agree 100%, but I am on-board with the sentiment. I've seen systemic problems of under-communication pre-review which results in me looking at a PR and seeing a fundamental issue, but not knowing how to approach tackling it without hurting feelings. I don't think eliminating reviews is the solution, but I think it's much less of a critical step than it's often made out to be. My take is that if reviews are actually preventing major issues with released code, they're almost certainly including work that should've happened much earlier in the development process.

pfdietz 4 years ago

Just asking the person who wrote the code to explain it better so I could properly review it can be helpful. I've had cases where this led to "yeah, that's not right, I'll redo it" even if I had no clue what was going on.

monster_group 4 years ago

Just to be clear - I am strongly in favor of code reviews and do them every day. However, the article advocates spending 20-40% time on doing code reviews. Good luck convincing your manager why your productivity is so low when others are churning out high quality features (because you spent 20-40% of your time reviewing others' code but they didn't spend commensurate time reviewing yours). Net result - other people's code is higher quality, they get more done. Your code is lower quality and you get less done. Good luck getting that promotion or even retaining your job.

CraigJPerry 4 years ago

In those environments i'd expect a tech lead to take the initiative to stand up and explain / defend the team's quality control processes.
If the management rebukes, you move employer.
Quality is not something the customer or your manager is responsible for. It's entirely on us as professional developers.
That said - code review is not the only way to ensure quality. Do what works. Pairing generally works even better for getting quality review time in.
detaro 4 years ago

These things obviously are to a degree a matter of team-level process and policy. In other teams it might very well be "good luck getting that promotion if you constantly refuse to participate properly in code reviews".

codeapprove 4 years ago

Almost everyone agrees that code review is important and worth spending time on, so why don't we think much about our code review tools? Most of us just accept whatever GitHub gives us (esp. smaller or newer teams)

I'm building CodeApprove (https://codeapprove.com) to be a much more powerful, enjoyable, and efficient code review tools for teams on GitHub. If you're interested in trying our Alpha email me.

raducu 4 years ago

I must be the exception, but I've only had bad or really "meh" experiences with code reviews:

Doing the reviews myself:

A lot of simple code splintered in 100 files -- I and all others that I know only review the diffs.

if(THIS_COERNER_CASE){do this ; do that}; xml configs relating to this or that business case

There's nothing to review, really, I don't know the business case, the code looks okish.

I really don't care about unused imports (unless you make a special commit called "code cleanup" that pollutes history and I hate you for that), lines too long, if you used nested ifs or returns, unless it's really nasty code (which I very rarely stumbled upon, usually written by some coding style nazzi in his youth);

Feedback for my own/others code:

- this or that "coding style" issues by some coding style nazzi; almost fine, it improved my inner compiler/code formatter

- "don't use this language feature, because I don't like it" from the highly oppinionated nutjob that leaves the company in a couple of months anyway to join a smaller company where his genius is appreciated more

- "why did you do this and that, you should have done...." half an hour later in person/over the phone conversation... "oh, yeah, I guess that actually makes sense"

- "whaaaaaat, you expect me review this monster commit? Why didn't you break it in multiple smaller commits?" -- because I care about the convenience of the person that is actually interested in investigating a feature/bug 5 years from now and not of the lazy code reviewer now

- "How could this security bug have slipped us, we have CODE REVIWS, don't we?"

eertami 4 years ago

> because I care about the convenience of the person that is actually interested in investigating a feature/bug 5 years from now and not of the lazy code reviewer now
Doesn't seem like such a big problem to make a PR with multiple commits, and then squash and merge into 1 commit after the review.
- pm215 4 years ago
  
  Personally I think that is pessimal, because if I am the person investigating in 5 years time I want to see the change split up into multiple logical commits. So lots-of-commits-plus-squash-and-merge requires the developer to do the work of splitting up the feature implementation but doesn't give the benefit of having a usefully split set of commits to the future-debugging-person...
  - raducu 4 years ago
    
    The change is a feature, it is by deffinition a "logical commit".
    Like "migrated this JMS queue to kafka", that can span across tens of files.
    Just because the commit is big doesn't mean it's not also logical and consistent.
    In fact the reverse can be said -- if you split the commit in smaller commits (and your build is still green with each commit) -- each commit can seem contrieved, and good luck later on, when you try to understand how a piece of code works and how this xml/drl/properties file is connected to this piece of code if you have multiple commits.
    If it's a single commit, you can usually narrow down your search tremendously and can easily see the logical deoendencies.
    So for me, if it's a JIRA issue, I usually squash my commits before I push.
  - WorldMaker 4 years ago
    
    I agree. If you want the "clean" straight line from the git DAG just use --no-ff in your PR system and you get a nice clean integration log with things like git log --first-parent (git praise --first-parent, git bisect --first-parent) and can still keep the extra history around for those of us that find it useful when investigating things 5/10 years down the road.
- maccard 4 years ago
  
  > Doesn't seem like such a big problem to make a PR with multiple commits, and then squash and merge into 1 commit after the review.
  I fundamentally disagree here. If you're reviewing code, the code that is pushed should be what is reviewed, not a draft of what is reviewed..
  - somebodythere 4 years ago
    
    If you squash and merge shouldn't the final pushed code be exactly the same as what was reviewed?
    
    WorldMaker 4 years ago
    
    Not necessarily. There's a lot of dark arts to merges and there's a lot that can be hidden in them. If you've never heard of, as just one for instance, the git rerere cache and campfire horror stories of how it can be corrupted consider yourself lucky.
HugoDaniel 4 years ago

you forgot the one where you go like this
- "you are right, i completely forgot about that. thank you for catching this, i am changing the code and asking you to review it again after the fix."
chromanoid 4 years ago

The most efficient peer reviews I participated in were early code walkthroughs (prototyping/implementation phase) combined with code reviews. The actual code reviews were much more interesting because you could see what and how the programmer implemented points of critique that came up in the walkthrough. Navigating through the code was also much easier (with bigger changesets), because you already knew to some extent what was changed.

habitue 4 years ago

I am really looking for a "code review is a waste of time, let's skip it" article. Seems like there's a market need that isn't being fulfilled.

V-2 4 years ago

Sounds like the good, old "X considered harmful" template, where the headline is clickbaity, but the content actually boils down to "X considered harmful ...if done badly" (doh).

posharma 4 years ago

Hasn't enough been already said and written about the importance and effectiveness of code reviews? Do we really need yet another article on it?

mskec 4 years ago

Not everyone reads all the articles. This one has been very useful to me and I'm thankful for the submission.
- posharma 4 years ago
  
  Any software engineering book worth its salt covers this topic. Besides a quick internet search on code review best practices will give you several well written articles. What’s the point of writing and discussing the same thing repeatedly. Seriously move on already! My point is why beat a dead horse. Instead write about something new.

Settings

On Code Review

Keyboard Shortcuts