Repeat yourself, do more than one thing, and rewrite everything (2018)
programmingisterrible.com

I've said it before and I'll say it again - I should encapsulate it as a law.
BeetleB's Law of DRY: Every article that complains about DRY will be a strawman argument.
The DRY acronym came from The Pragmatic Programmer, and almost every instance of DRY people complain about is not at all what is advocated in the book. There are different ways of interpreting what the authors wrote, but my version is: "If you have two separate requirements that are very similar, keep them separate in code. If your duplicated code is one requirement, then DRY it into one location in your code."
So this:
> Following “Don’t Repeat Yourself” might lead you to a function with four boolean flags, and a matrix of behaviours to carefully navigate when changing the code.
Is not DRY. In fact, having boolean arguments is almost a guarantee that you've violated DRY.
Another way to know you've violated DRY: If one requirement changes, do I need to add if conditions to the DRY'd function to ensure some other requirement doesn't break? If yes, you're in violation.
Never tie multiple requirements into one function. Or rather, do it, but don't call it an application of DRY.
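To make that concrete, here's a minimal sketch (a hypothetical reporting example, not from the book) of a flag-matrix function versus keeping the requirements separate and DRYing only the shared knowledge:

    # Two requirements wrongly merged "for DRY": every new requirement
    # grows the flag matrix, and a change for one risks breaking the other.
    def format_report(data, for_email=False, include_totals=False):
        lines = [f"{row['name']}: {row['amount']}" for row in data]
        if include_totals:
            lines.append(f"Total: {sum(row['amount'] for row in data)}")
        sep = "<br>" if for_email else "\n"
        return sep.join(lines)

    # Kept separate: each function tracks exactly one requirement. The piece
    # of knowledge they genuinely share (the line format) is the part to DRY.
    def format_line(row):
        return f"{row['name']}: {row['amount']}"

    def email_report(data):
        return "<br>".join(format_line(row) for row in data)

    def accounting_report(data):
        lines = [format_line(row) for row in data]
        lines.append(f"Total: {sum(row['amount'] for row in data)}")
        return "\n".join(lines)

When the email requirement changes, email_report changes; accounting_report can't break.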
I personally agree with your point of view on what DRY is. But I don't think that's what's being taught or talked about anymore. When people say DRY they usually mean an abstraction layer that merges two similar concepts into one, and the DRY you talk about is just standard operating procedure. Static analysis tools will ding you on DRY rules if you have code that looks pretty similar, urging you to refactor it to be the same. Etc.
Literally had to tune out a super senior employee this morning who was talking about this flavor of DRY.
Invoking the same function from two places in the code with vaguely similar parameters is not repeating yourself, dude.
Sometimes I wonder if the amount of sanity in the industry is finite and adding more people just makes us all crazier.
It's a zero-sum sanity game, and the companies trade us around trying to win the most of it.
+100
Which is exactly why static analysis tools that force you to do something need to be shot. Static analysis tools that inform you about a possible duplicate are totally fine. Give me an option to disable that particular instance.
Coincidentally, microservices do away with such problems in many cases, due to the fact that the code is "separate" and thus analyzers and sticklers don't find the "duplicates", and you can write beautifully simple code. Unfortunately it then has the opposite problem of leading to things like this Netflix architecture https://res.infoq.com/presentations/netflix-chaos-microservi... but for something as simple as a personal blog (yes I exaggerate - slightly).
In the end I think the only solution is to have the right people and stay small enough to keep the right culture. That probably goes against all your metrics and growth goals of the company of course.
The only term I’ve found that gets me any relief (and honestly, it’s not much) from this sort of people is “idiomatic”
You don’t need to factor out idioms. That’s lunacy.
Doesn’t matter what it originally meant; people take it to mean you shouldn’t have repeated code, and that’s the DRY we all live with, which results in bad abstractions. I don’t think it’s a straw man at all, unless you use your specific definition of DRY, which isn’t very useful.
People are welcome to co-opt the acronym and give it another meaning. The issue is that the original DRY is a damn good principle, and it is more important to give it a name and propagate that knowledge.
If all we do is rail against the "new" DRY and forget the original one, then we are at a net loss.
I'm with you - the violations of DRY I still see regularly are clear cases of copying and pasting exactly the same logic (or magic literal value) when there was no reason not to put it in a helper function or named constant that could be referred to in both places. A code review I did yesterday had that - there were 5 or 6 lines of code that, starting with a particular regex, did some data massaging. It was determined in some cases a different regex was needed for a second pass over the data, and so the submitter had simply copied the 5-6 lines of code and just changed the regex used. I see that sort of thing 20 or 30 times more often than code that gets itself into knots because of excessive abstraction trying to avoid code repetition.
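For concreteness, a rough sketch of the fix, with hypothetical massaging logic standing in for the actual code from that review:

    import re

    # Before: these 5-6 lines were copy-pasted with only the regex changed.
    # After: one helper, parameterized on the pattern, called from both places.
    def extract_codes(text, pattern):
        """Find matching codes, normalise case, drop empties, dedupe."""
        matches = re.findall(pattern, text)
        cleaned = [m.strip().upper() for m in matches]
        return sorted(set(c for c in cleaned if c))

    raw_text = "ab-1234 shipped; AB-1234 pending; XYZ999 returned"
    first_pass = extract_codes(raw_text, r"[A-Za-z]{2}-\d{4}")  # ['AB-1234']
    second_pass = extract_codes(raw_text, r"[A-Za-z]{3}\d{3}")  # ['XYZ999']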
Some people feel productive when they can type a bunch of code quickly. And the only way to do that is to get muscle memory for writing the same code over and over and over.
I think I can categorically state that virtually every DRY violation I've come across involved little more typing than Ctrl+C Ctrl+V.
I’ve dealt with a few bad coding patterns over the years that were not copy and paste.
Though there have been a couple memorable occasions where I had to completely remove a bad pattern from the code to get people to stop using it.
To be fair, in your example it sounds like the repetition is very local and easily recognised for what it is. Not ideal, but hardly a poster child for when DRY is impactful.
If the change was otherwise good, I would remark on the repetition as "here's how I would write it differently" and not "go back and fix it now".
The first time someone changes four of the cases the same way but misses the fifth, though. That sounds like a good time to refactor.
Just came across another slightly more interesting example. We have code that has to do essentially the same thing for three different document types: it loops through a passed-in list and for each item, adds an object to a document, then initialises that object with details from the item. The "logic" in all 3 cases is exactly the same, yet there were 3 different implementations, one for each document type. The only difference between the 3 is that the function you need to call to add the object is different for each document type, and the library we're using (the MS DocumentFormat.OpenXml library as it happens) doesn't provide an abstraction for just calling the same function regardless of document type. In fact, creating such an abstraction wouldn't be complex at all - if you look at the decompiled MS source, they've basically implemented the same function 3 times, but using an internal function not available to us. As it happens there was a bug in our initialisation code (missing null check!), but of course it was duplicated across all 3 functions. I basically had 3 options:

a) make the same null check fix in all 3 versions, thus resulting in 3 even longer functions with the same logic

b) collapse all 3 functions into one and provide an abstraction for the "adding object to document" part

c) refactor the 3 functions to use the same helper function just to do the object initialisation

In the end I went with c) because it required the least amount of code, and the only genuine duplication really is the foreach statement. But arguably if MS had kept their library DRY in the first place we would never have ended up with so much code duplication.

This is a significantly better example! When non-DRYness leaks out over interfaces it becomes much more of a problem.
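To illustrate option c) from the comment above: a minimal Python sketch with hypothetical stand-ins (the real DocumentFormat.OpenXml API is C# and names these calls differently):

    class WordDoc:
        def __init__(self):
            self.parts = []
        def add_drawing(self):
            self.parts.append({})
            return self.parts[-1]

    class SheetDoc:
        def __init__(self):
            self.parts = []
        def add_chart(self):
            self.parts.append({})
            return self.parts[-1]

    # The initialisation, including the once-missing null check, is shared,
    # so the bug fix lives in one place...
    def init_object(obj, details):
        if details is not None:
            obj["details"] = details

    # ...while each document type keeps its own loop, because the "add"
    # call is named differently per type (the only duplication left).
    def fill_word_doc(doc, items):
        for item in items:
            init_object(doc.add_drawing(), item)

    def fill_sheet_doc(doc, items):
        for item in items:
            init_object(doc.add_chart(), item)

    fill_word_doc(WordDoc(), ["a", None, "b"])
    fill_sheet_doc(SheetDoc(), ["c"])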
I love the principle but do we want to save the principle or save the acronym? At this point IMO it's a lost cause. I wish someone with a following would make a retronym of TIE or SYNC to express it.
I remember when The Pragmatic Programmer 20th Anniversary edition came out, the authors, in interviews and in the new edition itself, described DRY as "the most misunderstood" concept in the book. If for 20 years that is the most misunderstood concept, then maybe the name is not the best.
I propose the following: DRBL - Don't Repeat Business Logic
It's funny, because "dribble" or "drool" is the opposite of being dry ;)
Although it doesn't have as fun of a pronunciation in English, DRBR is probably a better acronym? Don't Repeat Business Requirements
Well, DART (Don’t Assert Redundant Truths) might be a better name for the same principle, though it’s perhaps somewhat opaque when doing imperative rather than declarative (e.g., functional/logic/relational) programming, since people are less likely to consider the former to constitute “asserting truths” in the first place. But it does get more to the point that the thing you want to avoid isn’t code that looks similar, or even which mechanically does the same thing, but code that represents the same facts.
I have a set that’s harder to corrupt, but still not bulletproof:
Source of truth, system of record.
Business decisions should have a source of truth. But the domain of things we want to duplicate or not duplicate is bigger than just the business rules.
If you are asking two questions, it’s okay to have two implementations. If you ask the same question twice, you should get the same answer, not just the same output.
But miscreants can twist the meaning of five different parts of what I just said. Like what even is a business rule? It’s whatever the last thing they said before you got them to stop talking. But if they come back later and want something that disagrees with what they already asked, they’ll wriggle like a fish on a river bank trying to gyrate a way to interpret what they asked for to say you’re wrong (and therefore you should work nights and weekends and we don’t owe you a raise).
The problem is with the entire concept of development "principles". They are a bad way to propagate knowledge. I suspect more people have an incorrect understanding of DRY than not. Seems like a net loss to me.
We should ditch these principles altogether and focus on teaching a deeper understanding of these concepts that captures the nuances.
I don't think you can move away from principles in general. The reality is that most SW design is subjective. Not repeating code is a generally good principle. It's just that the misapplication of DRY is following one good principle but violating another one (requirements should be decoupled).
In any case, the reason I go on the anti-rant rant each time is because when I use DRY appropriately, I don't want some idiot flagging me in a code review saying "Don't do this. DRY is bad. Here are N blog posts explaining why" - when none of the blog posts are complaining about what I am doing.
I don't think you can separate the principle from its misapplication. It's misapplied because it tries to stuff useful knowledge into a memorable phrase and the nuance is lost.
Flagging code in code review is another great example of harmful behaviour principles encourage. I've stopped referencing principles altogether in code review and I encourage others to do the same. Instead I focus on trying to explain the specific impact the code will have on our specific codebase.
> I don't think you can separate the principle from its misapplication.
In practice you are right, but this is just part of the human condition. There is no substitute for experience. Wisdom can’t be taught. The map is not the territory. Yada yada.
Principles still have value as short-hand for knowledgeable practitioners though. In fact, they have outsize value in this case because strong programmers will recognize and reflect on both the upsides as well as downsides discussed here. Communication bandwidth is the single most important thing to scale teams working on irreducibly complex domains.
I also sign up to the school of thought where "principles" have gone bankrupt. When I see someone quoting a principle as the sole reason for a code change, it automatically shouts that it's a shallow explanation without much thought.
We could adopt a principle to solve this problem: something that reminds us that abstractions aren't free.
This is the source of the problem of the blind application of principles - which tend to increase the number of abstractions. They aren't free, and people act as if they are.
Even good abstractions have a cost. But in a good abstraction, the benefits outweigh the costs.
Looks good to me. I also see that people treat abstraction or indirection as "always good", or at least "free".
People are still coming up with best practices for accounting, a field with thousands of years of history. Principles are fine but not final
I don’t think it was co-opted, I think it never stuck. PP didn’t invent the idea of deduplication. I’m not even sure they invented DRY. But when that book was brand new there were already people misusing the idea of deduplication.
I feel exactly the same way about DI.
Structuring code so that abstractions don't depend upon implementation details is in my top 3 principles of all time (along with pure functions and good typing).
DI frameworks a la Spring and Guice just annoy me.
Yep. And then you get bloated constructors taking in a dozen arguments many of which are just needed to pass along to the parent class constructor, and - drum roll - the CI tools complain because your constructor is similar to one in another completely different class that just happens to need similar dependencies due to the abstractions and that's a copy-paste-detection fail.
So you refactor everything, factor out the constructor, and then it passes but now you need to add a new dependency so you're right back to the same nonsense. And/or you have tons of classes getting dependencies they don't even need, because some do so the parent has to have them all.
Traits can help some.
But the abstractions and DI that were supposed to make things easier still often make things more complicated.
DI has a place. In my opinion it's for plugin systems. If you have a core system that enables third parties to extend it, DI can be brilliant.
I think it's very useful
Yeah, I'm pretty burnt out on the DRY articles. It's so easy to misrepresent something as an absolute and talk about how it is wrong. As for DRY, it also means they don't understand, or aren't honest about, what DRY is about, and I'm immediately skeptical of the author.
> As for DRY, it also means they don't understand, or aren't honest about, what DRY is about, and I'm immediately skeptical of the author.
Indeed - I didn't bother reading the rest of the article.
Your re-definition of DRY is essentially that when code does the "EXACT" same thing it should be pulled into one place. Which is not something people will disagree with.
But the actual DRY definition is a little more nuanced.
> Every piece of knowledge must have a single, unambiguous, authoritative representation within a system
And this is what OP is referring to. It's the little abstractions that become big abstractions in the name of DRY that can over complicate code bases.
When it comes to heuristics intention doesn't matter. If the end result of DRY is that most people over-apply it then it is a bad heuristic.
> Your re-definition of DRY is essentially that when code does the "EXACT" same thing it should be pulled into one place. Which is not something people will disagree with
Sorry, not sure how this is my redefinition, and I would not by default agree to this. If you have code that does the exact same thing, but they are for separate requirements (which does happen), then I would not recommend refactoring to one function.[1]
If they are for the same requirement, I would.
> > Every piece of knowledge must have a single, unambiguous, authoritative representation within a system
> And this is what OP is referring to.
Sorry, but my original comment is that this is not what the OP is referring to. If you abstract into something with a lot of booleans, chances are that function is now related to multiple pieces of knowledge.
[1] I may still do it, but with the understanding that I may need to undo it when one of the requirements changes.
I prefer to use the rule of "single source of truth": if you have some business logic, make sure you always call the same code to run that logic.
That way you're not looking for abstraction layers to avoid lines of code. You are looking to make sure you don't have subtle bugs that only happen in some code paths. You also only have one place to update as the business rules change.
It's the same as DRY but avoids confusion.
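A minimal sketch with a made-up business rule:

    # One authoritative implementation of the rule. Checkout, invoicing,
    # and the admin UI all call this; none of them re-derive it inline.
    def bulk_discount_total(quantity, unit_price):
        """Orders of 100+ units get 10% off - a single place to change."""
        rate = 0.10 if quantity >= 100 else 0.0
        return quantity * unit_price * (1 - rate)

    checkout_total = bulk_discount_total(120, 4.00)  # 432.0
    invoice_total = bulk_discount_total(120, 4.00)   # same answer, same code path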
> The DRY acronym came from The Pragmatic Programmer
> almost every instance of DRY people complain about is not at all what is advocated in the book.
It's futile to bang the drum insisting that people follow the original meaning of what the creators intended. Look at what happened to Agile. Completely perverted from its original meaning. You can't argue the case for Agile anymore just by saying that the creators of the Agile Manifesto meant for it to be something completely different, because what other term can you use to describe the micromanaging, process-heavy framework that has currently taken its place?
DRY may have been introduced with a caveat to not conflate two pieces of code that are only incidentally similar, but people have completely disregarded it and used it to propagate code monstrosities. It's not a strawman argument, we need a term to describe situations where functions end up using four boolean flags.
I agree. The big gotcha with DRY is what counts a duplicate. To me it's duplicate _requirements_ not just code that happens to look similar at the moment.
I phrase it as "duplication is a tool". Which is to say if something is actually definitely for sure the same requirement, you can de-duplicate it to have the compiler/tooling enforce and uphold that design constraint. This is good!
Many "duplicates" are really only similar, temporarily, and by coincidence. In those cases, it's not really "duplication" and keeping it separate is almost certainly a better choice.
That's what usually happens when less experienced devs read up on DRY (among other classics that every dev has heard of as a "best practice") and try to adhere to it biblically, paired with an anxious code-review environment where code has to be "clean code" perfect. The early MVC pattern communities (Rails etc.), with their overuse of acronyms like DRY, contributed their weight to misguiding new devs.
> BeetleB's Law of DRY: Every article that complains about DRY will be a strawman argument.
Counterpoint: Every article that defends DRY will be a No True Scotsman argument.
No True Scotsman is relevant only when one cannot objectively define a True Scotsman (i.e. not in a recursive manner).
Here we have an objective, original, unambiguous definition of DRY. So no, this is not a No True Scotsman fallacy.
I tend to implement my DRY with OOP (polymorphism).
That doesn't go over well, with today's crowd. Not considered "cool."
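For what it's worth, a small sketch of that style - a hypothetical template-method example where the shared workflow is written once and subclasses override only what differs:

    from abc import ABC, abstractmethod

    class Exporter(ABC):
        def export(self, rows):
            # The shared skeleton lives here, once.
            header = self.render_header()
            body = "\n".join(self.render_row(r) for r in rows)
            return f"{header}\n{body}"

        @abstractmethod
        def render_header(self): ...

        @abstractmethod
        def render_row(self, row): ...

    class CsvExporter(Exporter):
        def render_header(self):
            return "name,amount"
        def render_row(self, row):
            return f"{row['name']},{row['amount']}"

    class HtmlExporter(Exporter):
        def render_header(self):
            return "<tr><th>name</th><th>amount</th></tr>"
        def render_row(self, row):
            return f"<tr><td>{row['name']}</td><td>{row['amount']}</td></tr>"

    print(CsvExporter().export([{"name": "ore", "amount": 3}]))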
I don't see how you can throw away the modern definition of DRY and put critiques of it (the modern definition) to bed just by pointing out the historical origination. They're completely different topics, and the discussions thus need to be kept separate.
They can't be meaningfully separated, because they're using the same phrase with almost, but not exactly, the same meaning. One version has a sanity check (don't repeat the actual same logic/information, at least not excessively; it's usually paired with the "Rule of Three": three repetitions, then look for a refactor), and the other doesn't (don't repeat anything that happens to look alike, don't actually think, follow this rule like it's written in stone). If these aren't distinguished, then you end up with everyone talking past each other, both thinking the other is an idiot (rightly, from both perspectives).
People arguing against the latter are making a sane and reasonable argument. And people arguing for the former are making a sane and reasonable argument. But if, in the same discussion, both senses are meant without qualification or clarification then only confusion will be found.
Then start your critique of DRY by pointing out there are multiple definitions of it, and that you are referring to one definition.
If most people "straw man" your position, then it's not really that they're straw-manning it; they misunderstand your point because it is not clear enough.
If your catchphrase is Don't Repeat Yourself and some people take it to mean they shouldn't repeat themselves, the fault is entirely with the catchphrase you used.
If a third set of people then take it upon themselves to tell everyone that Actually You Should Repeat Yourself Sometimes, this is not them attacking a strawman, it's an attempt to clear up the confusion caused by the phrase/acronym DRY.
The original definition of DRY was "Every piece of knowledge must have a single, unambiguous, authoritative representation within a system." That's not what Don't Repeat Yourself means, if read literally. Because DRY sounds like it applies to code instead of to knowledge, of course it's widely misinterpreted! If they'd called it the Fight Unnecessary Copying of Knowledge principle nobody would be having this argument and we'd all get to save ourselves a lot of time.
Yeah, that's a good one. I have found, however, that denormalized tables designed for the queries that you need to do can make a massive impact on performance, especially at scale.
However, it's also a massive pain in the Automated System Structure to keep them updated properly.
>“Don’t Repeat Yourself” often gets interpreted as “Don’t Copy Paste” or to avoid repeating code within the codebase
When I think of the most difficult to understand code I've come across it was probably written by someone who lives and breathes that interpretation of DRY.
But it doesn't end with code comprehension. Extreme abstraction and countless files and components also lead to buggy and difficult to maintain code.
It's easy to lose understanding of branches and business flow when abstraction exists in the extreme.
DRY needs to be balanced with SRP (Single Responsibility Principle). You can legitimately have two functions that are exactly the same but they should not be DRY'd up if they are actually serving different purposes.
The use cases will likely diverge in the future, and if the functions are DRY'd, a change made for one caller can easily introduce bugs in the calling code that you're not working on. Eventually the single function will likely accumulate a lot of conditions, which is a red flag for this situation.
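A short sketch with hypothetical requirements:

    # Today these are textually identical, but they serve two requirements:
    # a legal retention rule vs. a UI caching heuristic. Kept separate, each
    # can change without an `if purpose == ...` creeping into a shared one.
    def retention_days_for_audit(record_type):
        return 365 if record_type == "invoice" else 90

    def cache_days_for_display(record_type):
        return 365 if record_type == "invoice" else 90

    # If legal later demands 7 years for invoices, only the first function
    # changes; the cache heuristic (and its callers) stay untouched.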
>You can legitimately have two functions that are exactly the same but they should not be DRY'd up if they are actually serving different purposes.
I use what I've come to call the "Rule of Three" here.
The first time, don't even consider any kind of abstraction. Just do the thing.
The second time, don't abstract yet. Repeat yourself, but do it in a way that will scale and make abstraction possible, while mentally noting how you might abstract it.
The third time, abstract it.
Adhering to this, the vast majority of code will never reach that third stage, thus taming the complexity beast.
I do something similar, but not as precisely specified as this. But I like it.
It's similar to my own rule about when to add a tool. I wait until I'm doing something else and I feel the lack of that specific tool. The first time, it's on my radar. The second time, it's time to get the tool. If you get it on first impulse, you'll drown in tools (and won't give the tools you have a fair shake). If you wait too long, you'll acclimate to not using the proper tool.
I use the rule of "Annoyance". When I get annoyed by having to rewrite what already exists, I'll try to make it DRY, or when I have to fix things in multiple places many times.
I outsource a lot of decisions to that feeling, so I can focus on other things.
It's only a matter of time anyway before it's really easy to ask ChatGPT to refactor everything into this perfect code.
> Adhering to this, the vast majority of code will never reach that third stage, thus taming the complexity beast.
yeah, because you probably forgot about at least one of the other repetitions somewhere in the code
as a developer you should know if it makes sense or not
I call it WET, “Write Everything Twice”. It’s catchy enough that people remember/follow it, esp with the antonym making it memorable :)
Thanks for putting the process of what I did twice in the last two days into clear, coherent words and logic.
I usually phrase this as balancing against loose coupling.
I've had bad experiences with the single responsibility principle. It sounds kind of right, but in practice "responsibility" is too vague and often surprisingly hard to agree on what is e.g. one responsibility vs. three responsibilities.
By contrast, loose coupling is more objective and can (at least in theory) be measured.
I call these frankenframeworks. The constant drive to DRY and reach the supposed nirvana of code being a DSL of pure business logic leads to more and more implementation details being shoved under the rug to deeper and deeper layers. But for some reason there’s no foresight that any non-trivial change requires changing more than just the business logic and so you have to resort to bolting on config options, weird hooks, mixins, “concerns”, and global state for no reason other than it’s all you can do to reach down the layers.
> When I think of the most difficult to understand code I've come across it was probably written by someone who lives and breathes that interpretation of DRY.
Totally agree with this interpretation and why I wrote a recent article about it: https://bower.sh/anti-pattern
Great article, particularly agree with the tests part. Sanity check tests (function doesn't throw an exception in common path) are essential but following all possible paths and verifying all outputs is the road to diminishing returns and subpar output.
Back when I took a comp sci course 16 years ago (shit), I was taught that a single function should try to fit within a screen. The idea is that breaking a task up into digestible steps would foster more readable code, self-documentation, and code reuse.
Good idea, wrong metric. What should fit in a screen is the concept. If you take a big, hairy, complicated thing and just chop it into several functions that fit on a screen you have gained nothing.
If you name the functions reasonably, and the functions don't interact except through arguments and return values, you have absolutely gained something.
Probably a good case for inlining: http://number-none.com/blow/blog/programming/2014/09/26/carm...
Inlining is the compiler's job, not mine.
You probably should read the article as Carmack makes the case for inlining as a style, not inlining as an optimisation.
I did read the article. I think I'm less in favor of that style than he is (or at least was), but even he said that the problems it addresses arise from subroutines not being pure functions and so isn't a proponent of it in the case the comment you replied to was describing. And when not writing in that style, inlining is for the compiler to do.
The original poster said if you chop a big hairy complicated thing into functions to fit on a screen you've gained nothing. The response is that if they're named reasonably and pure then you have.
Now, the reason I posted that inlining article from Carmack is that it leans heavily into this line of advice: "if a function is only called from a single place, consider inlining it."
Taking a complicated mess and breaking it into functions that are called from a single place seems better served by organising the code (maybe with duplication if needed) and writing inline comments rather than devoting a call tree to that section of code.
That's when you rotate the display to vertical orientation.
"big, hairy, complicated thing" is a code smell and you would consider doing something about it, before it becomes truly a monster. Breaking it up into its constituent parts may be the way to go, if only because my brain's CPU cache is about 8 bytes of faulty memory.
> The idea is that breaking a task up into digestible steps would both harbor more readable code, self-documentation and code reuse.
Sometimes. Other times it means you need to jump around in a file (or jump between files, even) to understand what's going on.
Which is fine - any software of non-trivial complexity is going to require looking at the behaviour of multiple execution units, which may well be in different files. But if they're named sensibly and the dependencies are clearly maintained, the typical programmer is going to have a much easier time of "understand[ing] what's going on" than a single huge largely-unstructured blob of code. The moment you can't quickly see where a block starts and ends (without using IDE shortcuts) then the code has a readability issue.
I've dealt with code like that, and often wrote it. It's incredibly hard to get right IMO. It's an intuition that you develop over years of trial and error, not an exact science or set of rules.
> When I think of the most difficult to understand code I've come across it was probably written by someone who lives and breathes that interpretation of DRY.
Good code rhymes! It's better to err on the side of less abstraction and create patterns that will become familiar to the next person working with your code.
Reading code is harder than writing it and most of the time coding is about communicating with humans or "ghosts", i.e. people you might never meet, people who worked on the code in a different context or "era".
https://sonnet.io/posts/code-sober-debug-drunk/
https://sonnet.io/posts/emotive-conjugation/#:~:text=Ghost%2...
> If a replacement isn’t doing something useful after three months, odds are it will never do anything useful.
This is painful to read, but unfortunately rings true.
As an aside, when I saw the domain name/year I thought I'd find an update to one of my favorite programming rants of all time, "programming sucks."[1]
That rant gets better every time I read it.
This is oddly comforting to read
I found myself stuck in analysis paralysis and fear of not creating a perfect app. I became my worst enemy. My app was praised by VP, managers and staff using it, yet I saw it as a pile of garbage.
Then I realized it doesn't have to be perfectly DRY; it could technically just be spaghetti. It isn't spaghetti, but some things could be improved. While better-designed apps are easier to work with, sometimes there are situations where it is impossible to create a formal design document, so you just need to 'send it'.
The next iteration will improve many things, but if were to do those things initially the app would be in development for years, and now it is running a business.
With time I've progressively become less concerned with staying DRY, which perhaps counterintuitively has made it easier to avoid spaghetti problems. It's easier to keep things clean with a handful of near-duplicates that are tailored to the needs of their call sites than it is with a single implementation trying to do everything.
It's a bit more work to keep behavior consistent across duplicates but I'll take it if it means less untangling work for myself in the future.
I’ve found the same, and have leaned more into patterns and proximity as guides. Find good patterns that can be repeated easily and predictably. Also, keep related code close together so it’s easy to find and copy somewhere else. Often times there are higher level abstractions that emerge which can then be “dry”ed out, but trying to do that too early creates more problems than it solves.
Garbage that shipped and has customers and a purpose (and maybe makes money, if your company is interested in that) is called Legacy. Perfect code that never shipped doesn't have a name.
Worst case, your garbage code gets you 6-12 months with customers and it has to be thrown away. No big deal, you said it was garbage and now you've got 6-12 months of actual knowledge of what your customers need and want, instead of what you thought they would need. You can make new legacy garbage that's much better than the first version now.
When I have analysis paralysis and feel my app is a pile of garbage, I identify which part of the code I am most afraid of and then rewrite it and put tests around it, whatever it takes to become super confident that that one part I was afraid of is now working correctly.
I think of building software kind of like making pottery or sculpting in general. I just get something useful working to start with, it doesn't matter what the structure is. It's a lump of clay to be refined.
"Always do X" and "Never do Y" are almost always bad advice. Live by rules of thumb but don't become a zealot or rigid purist. Some duplication is acceptable, but lots is probably a sign that something is factored poorly or the wrong tool for the job.
Rules of thumb include but are not limited to: KISS, YAGNI, and DRY.
Another good rule of thumb is make things easy to figure out for future maintainers who you have yet to meet and may never. Programming is communicating with a future human, not just a machine. It's about people. (Insert Soylent Green jokes here.)
At least in ordinary CRUD, I find that simple, re-composable mini-components get me far more reuse than big swiss-army-knife-like components. Small components that can be copied, tweaked, remixed, or ignored with ease are more flexible.
Also, communicating via strings and string maps (dictionaries) makes them easier to mix and match than complex data structures/classes. String maps are relatively simple yet flexible for structure passing. You lose a little compile-time type checking by going string-centric, but there are workarounds, such as optional named parameters that switch on type scrubbing when needed. (I love optional named parameters. Every language should have them.)
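A rough sketch of that dict-passing style, with hypothetical component names (the trade-off being that typos in keys aren't caught at compile time):

    # Mini-components that communicate via plain dicts: each reads the keys
    # it cares about and ignores the rest, so they mix and match freely.
    def with_defaults(record, **defaults):
        """Optional named parameters supply fallbacks for missing keys."""
        return {**defaults, **record}

    def render_badge(record):
        r = with_defaults(record, color="gray", label="?")
        return f"[{r['label']}:{r['color']}]"

    def render_row(record):
        r = with_defaults(record, name="(unnamed)")
        return f"{r['name']} {render_badge(record)}"

    print(render_row({"name": "build", "color": "green", "label": "ok"}))
    print(render_row({}))  # unknown fields tolerated, missing ones defaulted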
> Never rewrite your code from scratch, ever!
Is this really a common sentiment?
When it comes to rewriting others' code, it's prudent to keep in mind that it's naturally harder to understand code written by someone else. Just because you're confused in the first five minutes of looking at something doesn't mean it's an unsalvageable spaghetti. It's too easy to underestimate the time and cost of a rewrite and confuse your lack of knowledge for a fault in the codebase. Of course sometimes a rewrite is still appropriate after that consideration.
If it's your own code then you probably have a better judgement than anyone whether it's in need of a rewrite.
Doesn't everybody tend to rewrite major components of something in its early stages? Though I find as I gain experience over the years I have to rewrite/"draft" code less and less.
Some of the most significant jumps in quality I've seen have been in total rewrites of my projects, at least when the project in question was at least moderately complex.
There even used to be a project that I'd rewrite every so often (but never publish) just to see how much better each iteration was. That fell to the wayside because I got busy, but I should probably pick it back up at some point.
Do you keep the project runnable at all times when you rewrite? My issue is twofold:
1) You can not do anything with the pieces until they are back together
2) If everything goes well, you get to see the exact same behavior as before. It can be faster, easier to modify or add stuff, perhaps even more elegant on the inside, but it will still be the same application.
Not the person you originally asked, but I have the same rewriting habit.
Automated tests are essential for rewriting code. If I want to rewrite untested code I either add tests or don't rewrite it. The rewritten code also needs to be tested.
Whenever possible, split up big, untestable rewrites into a series of smaller, testable rewrites. Let's say I want to rewrite component A with subcomponents B and C to use some new library Foo. I might first rewrite B to use Foo, then rewrite A (and A's tests!) to work with rewritten B. Then I rewrite C to use Foo, then rewrite A again to work with rewritten C. Then I finally go rewrite A to use Foo. This is more coding than doing the whole thing in one go (I rewrote A 3 times!), but it's a net time saver because when I make a mistake I can quickly find and fix it.
When I'm on the clock I rewrite code for practical reasons (maybe the current structure of the code can't support some new requirement, or technical debt has gotten high enough to make maintenance difficult, or whatever). This is rare-ish. Rewriting code, especially production code, is risky and time consuming and just generally not worth it.
Most of the time when I rewrite code I'm rewriting my own code, and I'm doing it for personal reasons. I like good code, I like reading it and I like writing it. To me code has aesthetic qualities, code can be beautiful and elegant. It's also educational. Imagine a writer that never edits their work; they're probably not a very good writer. It's also fun! I'm already familiar with the problem domain, so I can devote my entire focus to solving the problem instead of splitting my attention between solving the problem and figuring out wtf is going on.
For #1, yeah I do. If I'm working on more routine/boilerplatey parts I might go a little longer without running, but it's pretty important to me to verify that each bit is working as expected before moving on. It's easy to wind up in a mess if I'm operating on the assumption that what's been written so far all works.
For #2, yeah that's true, but for me less visible improvements are gratifying, because not only is the thing being rewritten being improved, but I can also apply learnings to other projects that make rewriting them less necessary. Also, it just bugs me when there's reasonable obtainable improvements in optimization, flexibility, etc that I've left on the table… feels like I left the job half-done which isn't a great feeling.
https://www.joelonsoftware.com/2000/04/06/things-you-should-...
"They did it by making the single worst strategic mistake that any software company can make: They decided to rewrite the code from scratch."
This is putting the whole company on hold to rewrite the entire product. I took the original article's statement a lot more generally, like rewriting a module or inner library, but I could've taken it wrong.
I wonder if that article has caused more damage over the years than it prevented.
I think there needs to be a version of this for pure FP languages like Haskell or even OCaml or F#; almost none of these maxims and aphorisms seem to apply.
Abstraction? It has a completely different meaning in this context. Our business is abstraction: creating precise definitions and semantic meaning where none existed before. It is much easier to create abstractions and sufficiently demonstrate their laws hold. So much so that we often design our programs and libraries with abstractions first.
This forces our programs to deal with the side-effects of interacting with the world outside of our programs' memory-space at the very edges of our program. We can prove a great deal of our code is correct by construction which lets us focus our testing efforts on a much smaller portion of our programs.
However even in non-FP languages I think a lot of these problems do go away if you use the above definition of abstraction and spend a bit more time thinking about the problem up front before writing code. Not too much, mind you, because the enemy of a good plan is a perfect one; however enough that you know what the essential properties and laws are at least tends to help and reduce the amount of code you need to consider and write.
FP has a better definition of abstraction where it is closer to the sense of “simplify”.
In procedural programming it might just mean indirection. Or silly metaphors.
If you're not a programmer and are just stumbling around trying to code, ideas like DRY and abstractions and modularity are super useful. I work with scientists/PhD students helping with their code from time to time, and it's easy to forget how much basics we take for granted.
If you're a career programmer or want to be one, then yes, it's better to try things out and figure out from experience why these principles exist. Then, you can break the rules, because you understand their purpose and limitations now.
Here, I only apply Don't Repeat Yourself to data, not code, making it mean that each piece of knowledge should have a single authoritative reference. E.g. avoid (if possible) places where state is synced.
That's a great way to summarize it.
With the corollary that "hard-coded data" is still data, not code.
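A tiny sketch of that corollary: store the fact once and derive the rest, rather than syncing two copies (hypothetical example):

    from dataclasses import dataclass, field

    # Storing both `items` and an `item_count` field would mean two places
    # to keep in sync; deriving the count means it can never drift.
    @dataclass
    class Cart:
        items: list = field(default_factory=list)

        @property
        def item_count(self):
            return len(self.items)

    cart = Cart()
    cart.items.append("widget")
    assert cart.item_count == 1  # no second copy of this fact to update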
aka single source of truth
Lots of good ideas here, especially that duplication is better than the wrong abstraction.
One of the best pieces of advice I got really early in my career was write something 3 times before you decide to abstract it. Until you’ve done that, you just don’t know what parts you can really abstract and you’re likely wasting time. Pretty much every time I’ve ignored that advice I’ve regretted it.
Same. Writing something more than once gives you more than one perspective on the problem space. And with a better understanding of the problem space, you're more likely to find an optimal solution.
aka WET: Write Everything Twice.
I found the article to be sound on modularity and building upon what works; as the old adage states, "if it ain't broke, don't fix it" holds true. However, there is always room for technology improvements when one monitors and measures the entire lifecycle of a transaction system. The world's systems exist in the way that they do today because someone took a risk on a design, and the uptake of what "works" only spreads as the acceptance of said design is proven.

I have wasted most of my adult life rewriting the same system in its entirety five times, and am now in the process of rewriting it for the sixth, this time applying it to a different industry. The design was proven over several decades in the critical-uptime, high-transaction-volume payments industry, and now that same design is being generalized into other industries. The other industries' applications may not have the same transaction-volume requirements as the financial industry's designs, but refactoring what works preserves the critical-availability properties, as well as the scaling flexibility to meet high transaction volume should any other applied industry demand the same growth.
Never trust an abstract principle by itself. Come up with your own concrete examples.
Tinker with these examples to improve your understanding of underlying factors of the principle.
Look for examples that might contradict the principle and understand why.
Let us take DRY for example:
If I make a change in one place, and I have to know about 4 other far away places to make the same change, will that cause problems?
What if there are two duplicate lines of code in the same file, right below each other?
What if it's a 3000 line file that's duplicated? How about a 4 line function?
What if changing it in one place doesn't mean you have to change the other place?
This is something I wish was more widely discussed. Abstract principles are only templates for how to act, not prescriptive rules on what to do.
Much like a craftsman uses various stencils and measures to guide their craft, it's not simply using the stencil that is necessary but applying that stencil with skill and judgment, and when working with others, sharing those stencils kindly.
Ignore advice on what not to do.
Listen to advice on how to accomplish tasks.
A carpenter does not study how not to hang a door.
Likewise, don't listen to advice on how not to write code.
I don't think it's quite the same though... or at least I can make an argument for learning about ways not to do programming tasks because it generalizes.
There are patterns between ways not to do programming related things, e.g. use the single responsibility principle, use pure functions.
There are also so many ways to accomplish programming tasks, it's useful to be able to filter down that multitude of ways or notice "this stack overflow post has 5 bad patterns, maybe I shouldn't use it".
I don't know if there's a fancy programming acronym for this, but as much as DRY or SRP my rule is this: if I were to break this chunk of code off into a function, it would:
* Give this chunk of code a name
* Clearly document, in types and names, the inputs and outputs at its boundaries, without having to discover this through a breakpoint
Does the clarity of adding that documentation outweigh the indirection?
A great example is a set of 4-5 if-conditions. Looking at them might mean unavoidable, arcane-looking complexity with regexes or who knows what. Now instead it's called:
if ($this->orderIsValid($order))
Isn't that nicer in most cases for the person reading it, who's trying to understand what the larger function does? Yes, even if it's only used in that one spot. A lot of this is subjective, so I'm not going to pretend to have written the programmer's stone tablet of rules, but that's my strategy.
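In Python terms, roughly (the checks here are hypothetical):

    import re

    # Before: four arcane conditions inline in the big function.
    # After: one named predicate that documents its inputs and output.
    def order_is_valid(order: dict) -> bool:
        """True if the order can be submitted for fulfilment."""
        return (
            bool(order.get("items"))
            and order.get("total", 0) > 0
            and re.fullmatch(r"[A-Z]{2}\d{5}", order.get("postcode", "")) is not None
            and order.get("status") == "open"
        )

    if order_is_valid({"items": ["x"], "total": 9.5, "postcode": "AB12345", "status": "open"}):
        print("submit")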
My experience of this is interesting.
When I'm writing code I care a lot about, I don't have to worry about DRY stuff, because I can't really write the code without figuring out the right abstractions. It starts DRY.
But when I'm cranking out reporting code or boilerplate of some kind and I just want to finish the job, my work starts out dripping wet copypasta. I test it. I then do some squeezing -- refactoring -- to unify some of the abstractions and delete the almost-dup code. I try to DRY it up to a reasonable level. But, I confess, I don't subject the abstractions to as much scrutiny as the code I care about. My successors probably hate me for that. But...
The problem with DRY and SRP is that you might be simply moving the complexity instead of reducing it.
Eg with a bunch of one-line functions, you then need to call them, or they call each other, and then the complexity is in the call graph rather than laid out in a single function.
Code needs to be as complex as the problem it is solving, the challenge is to avoid complexity beyond that, and ideally have the code be 'transparent' to the problem, ie it's easy to see the problem and how it's being solved from the code.
As an aside, didn't 'fat controller' stop being a thing in the early 2010s, with the problem changing to 'god models'?
Your code is useless if no one is using it. If a tree falls in a forest and no one hears about it, it made absolutely no sound. Everything comes after that.
Many programmers put the cart before the horse by subscribing to tidbits of "best practice". Once you have people using it, you can think about making it right and fast, then making the making of it right and fast, then having the right people making it right and fast. And turtles all the way down from there.
When you have flags in your function, you usually really have two functions in one, which can be a problem. In practice, I usually break these kinds of functions down if they're handling radically different cases; it does add some duplication, but most times it's just the boilerplate of the language/platform rather than the actual work itself.
Am I the only person who hates feature flags? We're doing it so we can do trunk-based dev. It's ridiculous, it makes every feature so much more complicated. All for the sake of not managing some branches.
DRY is the difference between a good and a not-good-enough programmer. Good code is easy to operate and extend later. Non-DRY code is tech debt.
I don't disagree with a single thing in the OP.