What matters is the company’s long-term success
I will begin this article by providing some context. What follows is intended for people who are striving to succeed in entrepreneurship by creating value. There are various definitions of what creating value entails, but ultimately it involves solving problems for others. If you’re trying to succeed by creating value, your success will be determined by your customers: you need to build something that customers actually want to use and that they prefer over your competitors’ offerings. If you haven’t accomplished this, then by definition you have not succeeded at creating value for others.
That is the context in which I am writing. This article is for companies, thinkers, and entrepreneurs who aspire to succeed by creating something valuable to others.
Naturally, there are other methods of “succeeding” — namely making money. There’s a sort of spectrum of how companies can operate. On one end lie the companies I’ve been talking about, the ones who prioritize creating value. On the opposite end of the spectrum, there are companies focused on raising money and creating valuation, without necessarily caring about creating value. This type of founder can use VC funds to grow, to raise more VC funds, to grow more, ad infinitum, cashing out and becoming richer and richer with each round.
The thinking in this article is irrelevant to these types of companies. They are more concerned with appearing successful rather than actually creating success, so they might want to do the opposite of what I advise here.
Finally, when I use the term “UI/UX improvement” in this article, here is precisely what I mean: a small improvement in the look and feel of a workflow in a product, one that makes it prettier or more polished but doesn’t change anything about the underlying workflow. This does not include changes that massively reduce friction or overhaul the entire way users execute the workflow.
An attempt at explaining statistical intuitions
Implementing a product feature can be compared to flipping a coin:
- There is uncertainty about whether the change will have a positive or negative impact.
- The outcomes are often binary (Bernoulli) in nature: i.e., the user converts or they don’t.
To measure the effectiveness of a new feature, we typically create an experiment: 50% of users experience the old version (without the new feature) and 50% experience the new version (with it). The feature’s success can then be measured through metrics such as retention and conversion rates. This is what is typically known as an A/B test.
It is important to note that the outcomes of A/B tests are deeply influenced by random variation. To illustrate this, imagine trying to improve the performance of a website where a picture of a coin is displayed, showing the side of “heads.” You might then conduct an A/B test where the coin is flipped, showing tails instead. After running 1000 people through the experiment, results may show a slight increase in conversion rates from 2.01% to 2.02% when changing from heads to tails. However, this does not mean that changing to tails actually improves conversion rates. With these numbers, the change in conversion rate is indistinguishable from noise. In other words, it’s not statistically significant.
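To make this concrete, here is a minimal sketch of the standard two-proportion z-test applied to that scenario. The counts are hypothetical and scaled up to 10,000 users per arm so that 2.01% and 2.02% correspond to whole numbers of conversions:

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical counts: 10,000 users per arm, so 2.01% and 2.02%
# correspond to exactly 201 and 202 conversions.
conv_a, n_a = 201, 10_000   # control: heads
conv_b, n_b = 202, 10_000   # variant: tails

p_a, p_b = conv_a / n_a, conv_b / n_b
p_pool = (conv_a + conv_b) / (n_a + n_b)
se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
z = (p_b - p_a) / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided test

print(f"z = {z:.2f}, p = {p_value:.2f}")  # z ≈ 0.05, p ≈ 0.96: pure noise
```

A p-value of roughly 0.96 means a difference this small is exactly what random chance would produce; at the original 1,000 users, the noise would dwarf the difference even more thoroughly.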
Let’s set aside A/B tests for a minute and think about the plain case of coin flipping. If you flip a coin 10 times, the outcome will rarely be exactly 5 heads and 5 tails (in fact, this will only happen about 24.6% of the time). But this doesn’t mean that the coin is not fair. The uneven results are a result of probabilities and randomness.
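You can verify that 24.6% figure directly from the binomial formula:

```python
from math import comb

# Probability of exactly 5 heads in 10 flips of a fair coin.
p_exactly_five = comb(10, 5) * 0.5**10
print(f"{p_exactly_five:.1%}")  # 24.6%
```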
Similarly, if you do get 5 heads and 5 tails, that does not necessarily mean the coin is fair either!
“True Effect” vs “Minimum Detectable Effect”
Any probabilistic phenomenon has a “true probability.” Once you’ve been thinking about statistics and probabilities and A/B tests for years, you might forget that all of these are merely artifacts to help you reason about uncertainty. The reality is that everything has a “true value.” The conversion rate of your website has a true value.
Not that it matters for the purposes of this article, but a probabilistic experiment with a binary outcome (such as a coin toss or a website conversion) is typically modeled as a Bernoulli trial, and the distribution of the number of successes across many trials is a binomial distribution. Binomial distributions still have a mean and a standard deviation, so at a conceptual level they should be familiar to anyone with some understanding of Gaussian distributions.
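As a quick sketch (the 2% true conversion rate here is a hypothetical, chosen to match the scale of the chart below):

```python
from math import sqrt

# Mean and standard deviation of a binomial(n, p) distribution.
n, p = 10_000, 0.02               # hypothetical: 10,000 visitors, 2% true rate
mean = n * p                      # expected conversions: 200
std = sqrt(n * p * (1 - p))       # ~14 conversions of pure sampling noise
print(mean, round(std, 1))        # 200.0 14.0
```

That ±14 conversions of sampling noise around an expected 200 is exactly why observed rates wobble around the true value.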
The picture below attempts to illustrate what I mean by “true probability.” It shows a binomial distribution for conversion rates for a sample of 10,000 users:
[Figure: binomial distribution of observed conversion rates for a sample of 10,000 users]
The problem is that the true probability is not knowable to us in the real world. We can compute observed conversion rates, but our website also has an actual conversion rate: the percentage of all the humans in our target market who would convert if we showed them our site. The interesting part is that this “true conversion rate” is a scalar, just a simple number. It’s not a curve of any kind.
Following this reasoning, changes made to a website or app also have what I’m calling a “true effect”: whether, and by how much, a change moves the true value of the business outcome we’re measuring (in this case, conversion rate). However, these true effects are often too small to measure accurately. To detect them, you would need sample sizes so large that your experiments would run for far too long (perhaps they’d never finish). For this reason, we choose a desired minimum detectable effect for our experiments, and if that effect isn’t observed with statistical significance at our chosen sample size, we consider the experiment to have failed to improve outcomes.
The specific mathematical formulas for determining the minimum detectable effect and sample size are beyond the scope of this article, but the main takeaway is this: most changes you make to an app or website have effects that are too small to be measured with statistical significance.
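To make that takeaway concrete, here is a back-of-the-envelope sample-size calculation using the standard normal-approximation formula; the 2% baseline and 0.1-percentage-point lift are hypothetical:

```python
from statistics import NormalDist

def sample_size_per_arm(p_base, mde, alpha=0.05, power=0.8):
    """Normal-approximation sample size for a two-sided two-proportion test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_var = p_base + mde
    variance = p_base * (1 - p_base) + p_var * (1 - p_var)
    return (z_alpha + z_beta) ** 2 * variance / mde**2

# Detecting a 0.1-percentage-point lift on a 2% baseline:
print(round(sample_size_per_arm(0.02, 0.001)))  # ~315,000 users per arm
```

At a hypothetical 10,000 visitors per week, filling both arms of that experiment would take over a year of traffic.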
Small UI/UX improvements
Based on my experience working at product companies and running A/B tests, small UI/UX changes almost never produce statistically significant improvements in business outcomes (I’ve literally never seen a UI/UX experiment produce a stat-sig positive result). A caveat: if a website is truly terrible, you can redesign it to meet the standard of competitors in its niche, and in that case you will almost certainly drive positive business outcomes.
But if the website is already at or above the standard of its niche, you can try until you’re blue in the face and UI/UX work will rarely, if ever, produce measurable positive business results. You might spend $50,000 on amazing design consultants to give your website a marginally better, Apple-like look, and it is unlikely to help with conversions or sales. In my job at Noom, I witnessed A/B tests on UI/UX improvements many times and never saw a positive result. In my tenure as CTO of KiddieKredit, we’ve spent a lot of money on design that didn’t drive measurable results.
My hypothesis is as follows: when there are small UI/UX issues, an elite UI/UX designer will suggest changes that improve the customer experience, which in turn move the needle of the “true conversion rate” ever so slightly. But since these improvements are rarely large enough to reach statistical significance, their value remains hidden and unmeasurable. What’s happening with successive UI/UX improvements is more or less this (again using a chart with the same data as above, but zoomed in):
[Chart: the same binomial distribution, zoomed in, showing successive small shifts in the true conversion rate]
The UI/UX improvements are making the business better. You just can’t measure it.
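A small simulation illustrates the point. Suppose a UI polish truly lifts conversion from 2.00% to 2.05% (hypothetical numbers), and we repeatedly run a standard A/B test with 10,000 users per arm:

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(42)

def is_significant(p_control, p_variant, n_per_arm, alpha=0.05):
    """Simulate one A/B test and check for a one-sided stat-sig lift."""
    c = sum(random.random() < p_control for _ in range(n_per_arm))
    v = sum(random.random() < p_variant for _ in range(n_per_arm))
    p_pool = (c + v) / (2 * n_per_arm)
    se = sqrt(p_pool * (1 - p_pool) * 2 / n_per_arm)
    z = ((v - c) / n_per_arm) / se
    return z > NormalDist().inv_cdf(1 - alpha)

# A real 2.00% -> 2.05% improvement, tested 200 times:
wins = sum(is_significant(0.0200, 0.0205, 10_000) for _ in range(200))
print(f"detected in {wins} of 200 experiments")  # typically well under 10%
```

Roughly nine times out of ten, the experiment calls a genuinely positive change a failure.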
The Compounding Nature of UI/UX and its Siren Song
There is still hope for UI/UX designers working in data-driven orgs.
Measuring UI/UX changes in isolation is not a fair way to judge their impact because UI/UX compounds on itself. Making one small UI/UX improvement on an average website is unlikely to result in a positive outcome. Even multiple small improvements may not yield measurable positive outcomes. However, making a substantial number of small improvements can transform a mediocre product into an above-average one.
So, if you have average UI/UX and a data-driven culture, you might be caught in a trap: you tried making a few UI/UX improvements, they yielded no measurable results, and so you reduced your investment in UI/UX.
However, an alternative course of action is this: continue to invest in UI/UX. Start by hiring an elite-level designer (or designers, depending on the size of your company and the surface area of your product). Then, following their lead, run 50+ UI/UX improvement experiments and ship any that don’t reduce business outcomes (lower conversion rates and the like). Once you’ve accumulated 50+ UI/UX improvements, A/B test the old version of the website (before the 50 improvements) against the current version. If you do this, there will likely be an improvement in business outcomes such as increased revenue, improved retention, or better customer satisfaction. Individually, the changes were too small to measure; as a group, they can drive impact.
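Here is the same style of power calculation applied to this strategy, with hypothetical numbers: 50 improvements, each nudging the true conversion rate up by 0.02 percentage points:

```python
from math import sqrt
from statistics import NormalDist

def power(p1, p2, n_per_arm, alpha=0.05):
    """Approximate one-sided power of a two-proportion z-test."""
    se = sqrt(p1 * (1 - p1) / n_per_arm + p2 * (1 - p2) / n_per_arm)
    z_crit = NormalDist().inv_cdf(1 - alpha)
    return 1 - NormalDist().cdf(z_crit - (p2 - p1) / se)

base = 0.0200          # hypothetical baseline conversion rate
lift_each = 0.0002     # each improvement adds 0.02 percentage points
combined = base + 50 * lift_each

n = 20_000
print(f"one change: {power(base, base + lift_each, n):.0%}")  # ~7%: invisible
print(f"all fifty:  {power(base, combined, n):.0%}")          # ~100%: obvious
```

Each change on its own is statistically invisible, but the accumulated bundle is unmistakable.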
Now, with that said, there are additional challenges. First and foremost, a backwards-ablation test like the one just described would be very costly to run. During the time you spent making those 50+ UI/UX improvements, a bunch of other, non-UI/UX things will have changed in your app or website, so running the test will require a lot of custom code and effort.
So, Should I Invest in UX or Not?
The second and more important challenge with investing in UI/UX is that most companies do not have, and cannot hire, elite-level UI/UX designers. Many companies that strive for excellent UI/UX are led by individuals without any kind of respectable UI/UX background. Trying to achieve Apple-level UI/UX without having worked at an elite UX company like Apple is a foolish endeavor and may be a sign that leadership is mistaking ego for competence.
With that said, if you do have an elite level designer on your team, you should consider creating a structure where they are free to collaborate with engineering and introduce UI/UX improvements on a consistent basis. You can’t set the designer completely loose because they’ll often suggest things that are too difficult to implement and would have a massive technological cost, but you can create a culture where design improvements are a constant thing and don’t need to compete tooth and nail against feature work. If you do this (and the designer is actually great), the investment will pay off. Your product will be better and your brand will be enhanced in a way that will be very difficult to emulate. It will act as a mini-moat.