This is part 1 of our series about eng hiring in the age of AI. Now that hiring is back, so are we! Want to hire with us? Just fill in this form.
Many people have heard of interviewing.io, but most think of us as just a mock interview platform. What you probably don’t know is that for six years, we ran the largest live, blind eng hiring experiment of all time.
You’ve probably heard about the blind orchestra auditions described by Malcolm Gladwell in Blink. When orchestras had musicians play behind a screen during auditions, effectively hiding their gender, the likelihood of women advancing to later rounds increased by about 50%, and over time, the proportion of women in top symphonies more than tripled.1
After I read Blink, the idea got stuck in my head, and I started interviewing.io so we could do the same thing.
The experiment
For six years, from 2016 to 2022, we ran the software engineering version of the orchestra experiment, except instead of hiding gender, we hid resumes.
Our users did anonymous mock interviews on our platform, and we aggregated their interview outcomes to surface top performers.
Then, top performers could book anonymous interviews at their companies of choice. The twist: companies didn’t see any resumes. Their first interaction with each candidate was a fully anonymous technical interview with an engineer from that company. The interview was exactly the same as the company’s usual technical phone screen, except for the anonymity. At the end of the interview, both sides decided whether to move forward; only then would the candidate unmask.

With this approach, from 2016 to 2022, we hosted about 10,000 real (but anonymous) interviews.
Why did we do this?
Simply put, hiring is broken, and we wanted to fix it. “Hiring is broken” has become a bit of a cliche though. How is it broken? What does broken actually mean?
I’ve spent years trying to simply articulate everything that’s wrong with the status quo’s approach to hiring. Here’s everything that’s wrong with hiring in one diagram.

Essentially, everyone is chasing the same set of candidates who look good. Unfortunately, many of those candidates aren’t looking right now, and many aren’t good. (We have a lot of data showing that pedigree doesn’t really predict performance – check out the Appendix at the bottom of this post).
But, there isn’t a straightforward way to filter for good candidates.

So, instead of being able to look directly for the right people, you have to rely on proxies. The most obvious proxy is where someone has worked before, which usually turns into looking for people from FAANG and FAANG-adjacent companies (and maybe a few very high-profile top-tier startups as well).
But, those people may not be good, and they’re probably not looking right now. So, hiring takes forever.
Our customers (without us) spend something like 200 staff hours and $40,000 per hire, split between a lot of time on sourcing (because not that many people respond) and a lot of time interviewing people who fail (because people who look good aren’t necessarily good).
And, of course, without us, great non-traditional candidates languish in the online application black hole without getting a chance to show what they can do.
That’s what this experiment was trying to fix.
The results
With our blind approach, over six years, we placed thousands of engineers at FAANG and FAANG-adjacent companies and top-tier startups.

Of our users who got offers from top-tier companies, 46% (almost half!) didn’t have either a top school or a top company on their resume. In a normal (not blind) hiring process, these candidates wouldn’t even have gotten an interview.
In fact, many of these non-traditional candidates had been rejected by the very same company where they later got an offer through us. Many of our customers were surprised to discover, after the candidate unmasked, that the candidate already existed in their ATS and had previously been rejected at the resume stage. One of our candidates had been rejected three times by the same company at the resume stage. With us, he “snuck in” because he got the chance to interview anonymously, and he ended up getting hired. He’s still there today.
Our blind model wasn’t charity. It was good business. Because we used interview performance data, not resumes:
- Our average hire had 10 years of experience (lest you think good senior engineers don’t go for stuff like this).
- Our candidates (both traditional and non-traditional) converted 3X better than other sources (including internal referrals). Not surprising; we used performance data to predict performance!
- On average, cost per hire went from $40k to about $10k for our customers. This was driven by less time spent on everything, especially sourcing and interviewing – when you know who’s good, you don’t need to interview nearly as many people.
- Our non-traditional candidates were 2X more likely to accept offers and stayed at their job 15% longer than their traditional counterparts after placement.
In the six years we did hiring, we confirmed something that I had hoped was true at the start: it’s possible to make money and do good business while doing the right thing. Hiring great people, rather than people who look great, not only makes hiring fairer, but also cheaper, faster, and better.
We sadly had to put our hiring product on ice when the hiring freezes and the resulting downturn started in 2022, but, now that hiring is back, we’re relaunching it. You can sign up for early access here.
Why use us?
You don’t have to use us! But we are the only recruiting marketplace out there that uses interview performance data to find great engineers. And having performance data is a big deal and helps make hiring way more efficient.
It’s the only way we see to fix broken hiring. Here’s why.
With data from mock interviews, we know who’s good because we know how they do in interviews. And we know who’s looking because they’re practicing on interviewing.io. About 10,000 engineers sign up for interviewing.io every month. We know who’s good and looking, and we can help you hire faster, cheaper, and more efficiently than ever before. In other words, you’ll be hiring from the green part of the Venn diagram instead of the red.

Do you want to be hiring from the green part rather than the red part? Try us out. You can use us as a source of excellent candidates and/or you can use our predictive model2 to help you surface the best people from your inbound applications. Just fill in this form for early access, and we’ll be in touch.
Appendix
Many candidates who “look good” aren’t actually good
“Look good” really means that candidates have worked at top-tier companies and have attended top-tier schools.
If you, like many people in our industry, believe that candidate pedigree is the be-all and end-all of hiring, I don’t know if I’m going to be able to convince you otherwise, but I’ll sure as hell try.
Here’s all the data we have, drawn from tens of thousands of interview outcomes. The conclusion: pedigree does not predict performance.
Recruiters and hiring managers can’t reliably identify talent and all disagree on what a good candidate looks like
Probably because pedigree is not a useful signal, neither recruiters nor hiring managers can identify what a good candidate looks like.
We ran two studies, ten years apart, where we showed anonymized resumes to recruiters and hiring managers and asked them to identify the strong candidates. Not only did they do poorly at this task (about as well as a coin flip), but everyone disagreed on what a good candidate looked like.
How to spot great non-traditional candidates
We’d be remiss if, while sharing the results of our hiring experiment, we didn’t also share some of the things we learned about how to spot great non-traditional candidates.
Of course, because our system was blind, we didn’t have to do this work. We had interview performance data instead. But, we were able to look through our roster of non-traditional top performers and see what their resumes had in common. Here are the trends and what you should look for.
- TAs, tutors, and graders in upper-level CS classes at state schools and second-tier schools. A candidate from a state school who served as a teaching assistant was trusted to explain complex material and evaluate others’ work. That’s far more predictive of competence than a brand name on a diploma.
- On the other hand, having been a coding bootcamp instructor is a BAD signal. It often correlates with people who couldn’t find traction as engineers and pivoted to teaching too early.
- This is oddly specific, but look for graduates of the Bradfield School of Computer Science. Bradfield is sadly on an “indefinite hiatus”, but for years it was a small but very rigorous program built to fill the gaps left by traditional CS degrees and bootcamps. Students went deep on operating systems, compilers, and systems design, i.e., the hard stuff that great engineers need to know. If you see Bradfield on a resume, you’re looking at someone who chose to invest serious time in mastering fundamentals.
- Distance traveled and pattern breaking. Freada Kapor Klein from Kapor Capital coined the term “distance traveled” more than two decades ago. It refers to what someone accomplished, in the context of where they started. For instance, Kapor Klein recommends that, in their admissions processes, universities consider not just the number of AP tests an applicant has passed but the number of AP tests divided by the total number offered at their high school. If an applicant took 5 AP tests and their school offered 27, that paints a very different picture from another applicant who also took 5 AP tests when that’s the total number offered at their school (there’s a small illustrative sketch of this ratio right after this list). Kapor Capital uses distance traveled as one of their metrics for determining which entrepreneurs to fund. One can easily apply this concept to hiring as well. Candidates who taught themselves to code while working another job, switched careers, or repeatedly punched above their weight (faster promotions than their peers, ambitious side projects, contributions to hard open-source problems) tend to keep doing it. That combination of persistence and self-direction is hard to teach and worth betting on.
- High-signal bullet points that read like they were written by a smart person who cares. Look for bullets that describe what they actually built or owned, not word salad.
❌ “Participated in agile ceremonies” or “Worked in cross-functional teams and communicated effectively on business goals”
✅ “Built a caching layer for the internal analytics dashboard used by sales and support, so queries stopped hitting production and page loads dropped from minutes to seconds.”
- Two or fewer typos or grammatical errors. In some past research, I learned that the number of typos is the most important attribute of a resume and that the resumes of the strongest candidates consistently had two or fewer typos. The number of typos/grammatical errors was way more predictive than where candidates had previously worked, for instance.
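To make the distance-traveled ratio from the list above concrete, here’s a minimal sketch of how you might compute it for the AP-test example. The function name and the bare taken-divided-by-offered formula are our own illustration of the idea, not a metric Kapor Capital publishes.

```python
def distance_traveled(tests_taken: int, tests_offered: int) -> float:
    """Fraction of the AP tests available to a student that they actually took.

    Illustrative only: taking 5 of the 5 tests offered (1.0) signals more
    initiative than taking 5 of 27 (~0.19), even though the raw count is equal.
    """
    if tests_offered == 0:
        return 0.0  # no tests offered, so there's nothing to measure
    return tests_taken / tests_offered

print(distance_traveled(5, 27))  # ~0.19, well-resourced school
print(distance_traveled(5, 5))   # 1.0, took everything on offer
```

The same normalization mindset applies to resumes: judge what someone shipped relative to the opportunities they actually had.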
Finally, because I’d rather show than just tell, here are the profiles of three of our top-performing hires. All three are now at FAANG and FAANG+ companies, but I’ll show you what their profiles looked like before they got their first break.
Candidate 1

Candidate 2
This candidate was kind enough to let us unmask him, so you can visit his LinkedIn and take a look for yourself!

Candidate 3
This candidate was also kind enough to let us unmask him, so you can visit his LinkedIn and take a look for yourself!

And remember, non-traditional candidates are 2X easier to close and stay at companies 15% longer.
Footnotes:
1. Primary study: Orchestrating Impartiality: The Impact of “Blind” Auditions on Female Musicians. Key findings: Introducing screens increased the probability that a woman would advance from preliminary rounds by ~50%. Over time, the share of women in major symphony orchestras rose from ~10% in the 1970s to ~35%–40% by the mid-1990s. ↩
2. Yes, we have AI as well. But, it’s trained on anonymized data, not recruiter and hiring manager preferences. I’ll share more about it in an upcoming post. ↩