Baking the Most Average Chocolate Chip Cookie


The Pudding

Cookie recipes written by computers.
What could go wrong?

It’s hard to mess up a chocolate chip cookie. In the 80 years since the treat’s invention, thousands of recipe variations have been written to make the treat more gooey, crispy, chewy and generally tastier than its predecessors. Yet, it’s possible that this go-to dessert’s tastiness can be pushed even further.

We wondered if there was a way to leverage computers and hundreds of pre-existing recipes to create the most average chocolate chip cookie. Would it be bland and unremarkable? Or, perhaps like averaging human facial features, the result would be even better than each of its individual parts. Maybe an average cookie would be the most delicious of them all.

But what is an average cookie? We decided to interpret this idea using three different methods: a mathematical average, predictive text algorithms, and neural networks. After we fed each algorithm over 200 chocolate chip cookie recipes, each one generated something new. And, yes, we actually baked them.

The Mathematical Average Cookie

“Chewy and very chocolatey, no one would suspect these cookies were made with everything in your pantry.”

For our first attempt, we threw our recipe creation back to grade school and simply averaged the amount of every ingredient in our set of recipes. That means we calculated the average amount of flour, the average amount of butter, and so on. Of course, that leads to some unusual complications. Quantities stop being whole numbers (how do you measure 2.85 eggs?), and ingredients that appear sparingly in the data set, like molasses or black pepper, get seriously watered down in the average. We could pretend they’re not there, since who can taste 0.002 cups of applesauce in a batch of 48 cookies? But for science, we decided to keep all 60 ingredients.
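As a rough illustration (not the code used for this story), the averaging can be sketched in a few lines of Python. The function name `average_recipe` and the three tiny recipes are made up for the example, and the sketch assumes every amount has already been converted to a common unit per ingredient. A recipe that omits an ingredient counts as zero for it, which is why rare ingredients shrink so much.

```python
from collections import defaultdict


def average_recipe(recipes):
    """Average each ingredient's amount over all recipes.

    A recipe that omits an ingredient contributes 0 for it, so rare
    ingredients end up with very small averages.
    """
    totals = defaultdict(float)
    for recipe in recipes:
        for ingredient, amount in recipe.items():
            totals[ingredient] += amount
    return {name: total / len(recipes) for name, total in totals.items()}


# Three invented recipes (cups, or whole eggs); not the real data set.
recipes = [
    {"flour": 3.0, "butter": 1.0, "eggs": 2},
    {"flour": 4.0, "butter": 1.0, "eggs": 3},
    {"flour": 3.5, "butter": 1.5, "eggs": 3, "applesauce": 0.3},
]

average = average_recipe(recipes)
for name, amount in average.items():
    print(f"{amount:.3f} {name}")
```

Only one of the three toy recipes uses applesauce, so its 0.3 cups is divided by three, the same dilution that produces amounts like 0.002 cups in the full list.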

Averaging numbers, like the amounts of everything in our recipes, is relatively straightforward, but things get more complicated when we think about how to average the recipe instructions. After all, we need to know what to do with all of our ingredients. It turns out that, using a tool called word vectors, we can effectively treat words as numbers. Here’s how it works:

Creating the Recipe

Step 1

Collect words used in the chocolate chip cookie recipes.

Step 2

Group words together that are used similarly in the recipes.

Step 3

If these words were grouped on a graph, we could assign a numeric value to each word based on where it falls on the graph. This number is called a “word vector”.

Step 4

Then we can use these word vectors to find the average value for an entire sentence. This is called a “sentence vector”.

Step 5

If we look at all of the sentences from our recipes, we’ll find that many have slightly different meanings, but very similar sentence vectors because they contain similar words.

Step 6

Now, if we group all of the sentences with similar sentence vectors together on a graph, we’ll end up with groups of similar sentences.

Step 7

Last, we find the sentence that is closest to the center of each group and use those sentences for our recipe instructions.

Using this method, we found about 9 distinct clusters of sentences, and by picking the sentence closest to the center of each one, we ended up with recipe instructions that actually work pretty well. We did, however, have to decide when to add the 50+ ingredients that didn’t end up in our sentence vectors.
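The sketch below illustrates Steps 4 and 7 only: averaging word vectors into a sentence vector, then picking the sentence nearest the center of one group. It is a toy under stated assumptions, not the pipeline used for this story. The two-dimensional `WORD_VECTORS` are hand-picked (real word vectors are learned from text and have many more dimensions), the function names are hypothetical, and the grouping step is skipped by assuming the three sentences already belong to one cluster.

```python
import math

# Hypothetical hand-picked 2-D word vectors, for illustration only.
WORD_VECTORS = {
    "preheat": (0.9, 0.1), "heat": (0.8, 0.2), "oven": (1.0, 0.0),
    "the": (0.5, 0.5), "to": (0.5, 0.5),
    "350": (0.9, 0.0), "degrees": (0.9, 0.1),
}


def mean_vector(vectors):
    """Component-wise average of a list of equal-length vectors."""
    return tuple(sum(component) / len(vectors) for component in zip(*vectors))


def sentence_vector(sentence):
    """Step 4: average the vectors of the words in a sentence."""
    words = [w for w in sentence.lower().split() if w in WORD_VECTORS]
    if not words:
        raise ValueError(f"no known words in: {sentence!r}")
    return mean_vector([WORD_VECTORS[w] for w in words])


def most_central(sentences):
    """Step 7: return the sentence whose vector is closest to the group's center."""
    center = mean_vector([sentence_vector(s) for s in sentences])
    return min(sentences, key=lambda s: math.dist(sentence_vector(s), center))


# One group of similar sentences, assumed to be already clustered.
group = [
    "Preheat the oven to 350 degrees",
    "Preheat oven",
    "Heat the oven to 350 degrees",
]
print(most_central(group))
```

With roughly 9 clusters, running `most_central` once per cluster would give one representative instruction per group.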

3.526 cups flour
0.394 tsp baking powder
1.139 tsp salt
1.370 tsp baking soda
1.133 cups butter
1.025 cups sugar
1.194 cups light brown sugar
0.125 cups dark brown sugar
2.980 tsp vanilla
2.855 whole eggs
1.833 cups semisweet chocolate chips
0.291 cups milk chocolate chips
0.112 cups dark chocolate chips
0.049 cups white chocolate chips
0.354 cups bittersweet chocolate chips
0.014 tsp almond extract
0.011 cups almonds
0.002 cups applesauce
0.019 tbsp bourbon
0.098 cups bread flour
0.006 cups brown rice flour
0.082 cups cake flour
0.378 oz cake mix
0.019 cups chocolate covered raisins
0.028 tsp cinnamon
0.006 cups coconut
0.019 tsp coconut extract
0.128 cups cookie mix
0.001 tsp coriander
0.057 tbsp corn syrup
0.137 tsp cornstarch
0.006 tsp cream
0.009 cups crispy rice
0.1019 tsp espresso powder
0.002 cups graham cracker crumbs
0.003 cups honey
0.006 tsp lemon juice
0.096 tsp liqueur
0.005 cups macadamia nuts
0.032 tbsp maple syrup
0.050 cups margarine
0.538 tbsp milk
0.005 tbsp molasses
0.002 cups Nesquik mix
0.002 tsp nutmeg
0.055 cups nuts
0.227 cups oats
0.006 cups peanut butter
0.002 cups peanut butter chips
0.062 cups pecans
0.038 oz pudding mix
0.006 cups raisins
0.160 cups shortening
0.088 tbsp sour cream
0.027 tsp cream of tartar
0.022 cups toffee
0.020 cups vegetable oil
0.019 tsp vinegar
0.326 cups walnuts
0.010 cups water
0.048 cups wheat flour
0.005 tsp white pepper
0.003 tsp xanthan gum
0.010 cup zucchini

Bake 350°F for 8 - 10 min

The Predictive Text Cookie

“Big and flat, these cookies deliver a whopping taste of shortening and brown sugar.”

Next, we decided to try something a bit more complicated called predictive text. Essentially, predictive text is like the autosuggest feature in a messenger app: you start with one word, and it gives you suggestions for the words that might follow. The suggestions you receive on your phone most likely come from a pre-loaded program that “learns” based on your texting habits. What if predictive text only knew about the word usage in chocolate chip cookie recipes? That’s the question behind this experimental cookie.

Using our chocolate chip cookie recipe dataset, we created a big list of 4-grams: sets of 4 words or punctuation marks that appear together, such as:

using a metal spatula

carefully transfer the cookies

at least one hour

We can count how often each 4-gram appears in the text and determine how likely it is that a specific 4-gram will appear instead of another. Here’s how it works:

Creating the Recipe

Step 1

Select three words that appear in the recipe text in order.

Step 2

Find which 4-grams from our dictionary contain those 3 words, in that order.

Step 3

Imagine that we chose the first option, “combine the flour,”. Now we need to find 4-grams that overlap with our choice.

Step 4

If we choose the 3rd option, “the flour, salt” we now have added two words to our original string (technically, a word and a punctuation mark).

Step 5

To speed this up, we automate the process, but the computer needs to know which 4-gram to pick. We used a process guided by probability, so 4-grams that occur often are more likely to be chosen than 4-grams that occur only once.
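The steps above can be sketched in Python roughly as follows. This is a minimal illustration, not the code behind this story: the helper names (`tokenize`, `count_ngrams`, `next_token`, `generate`) and the three-sentence corpus are invented, and the sketch assumes each step extends the text by one token, chosen among the 4-grams that start with the last three tokens and weighted by how often each 4-gram was seen.

```python
import random
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase words and punctuation marks."""
    return re.findall(r"[\w']+|[^\w\s]", text.lower())


def count_ngrams(tokens, n=4):
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def next_token(context, counts, rng):
    """Pick the token that follows a 3-token context, weighted by 4-gram counts."""
    context = tuple(context)
    options = {gram[-1]: c for gram, c in counts.items() if gram[:-1] == context}
    if not options:
        return None  # no 4-gram starts with this context
    tokens, weights = zip(*options.items())
    return rng.choices(tokens, weights=weights, k=1)[0]


def generate(seed, counts, rng, max_tokens=20):
    """Extend a 3-token seed one token at a time."""
    out = list(seed)
    while len(out) < max_tokens:
        token = next_token(out[-3:], counts, rng)
        if token is None:
            break
        out.append(token)
    return out


# Invented toy corpus; the real data set had about 200 recipes.
corpus = (
    "Combine the flour, salt and baking soda. "
    "Combine the flour, sugar and butter. "
    "Combine the flour, salt and baking powder."
)
counts = count_ngrams(tokenize(corpus))
rng = random.Random(0)  # fixed seed so runs are repeatable
recipe = generate(("combine", "the", "flour"), counts, rng)
print(" ".join(recipe))
```

In this toy corpus "the flour, salt" appears twice and "the flour, sugar" once, so after "combine the flour," the sketch picks "salt" about two times out of three.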

Beware of never-ending loops!

Using predictive text generated a pretty follow-able recipe, but it can have some issues. If we choose the single most common 4-gram every single time, we can find ourselves stuck in an endless loop. Look what happened when our computer ran into one very unusual ingredient: cannellini beans.

...sifted 2.4 cup canned white cannelini beans, and the baking soda and 1 teaspoon salt in a large microwave safe mixing bowl...sift flour, cocoa powder, sifted 2.4 cup canned white cannelini beans, and the baking soda and 1 teaspoon salt in a large microwave safe mixing bowl...

To save our recipe from endlessly looping (and to save our tastebuds from whoever is putting beans in cookies!), we removed the cannellini-filled recipe from our dataset.
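To see why always taking the most common 4-gram can loop, here is a small sketch with an invented, deliberately repetitive text. The function `greedy_generate` and its loop check are hypothetical additions for illustration; the fix described above was to remove the offending recipe, not to detect loops in code.

```python
from collections import Counter


def count_ngrams(tokens, n=4):
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def greedy_generate(seed, counts, max_tokens=50):
    """Always append the most common continuation.

    Returns (tokens, looped). Because the choice is deterministic, seeing
    the same 3-token context twice means the output would repeat forever.
    """
    out = list(seed)
    seen = {tuple(out[-3:])}
    while len(out) < max_tokens:
        context = tuple(out[-3:])
        options = [(c, gram[-1]) for gram, c in counts.items() if gram[:3] == context]
        if not options:
            return out, False  # dead end: nothing follows this context
        out.append(max(options)[1])  # highest count wins
        context = tuple(out[-3:])
        if context in seen:
            return out, True  # back where we have been: a loop
        seen.add(context)
    return out, False


# Invented text that repeats itself, standing in for the looping recipe.
tokens = ("sift the flour and salt in a bowl then "
          "sift the flour and salt in a bowl").split()
counts = count_ngrams(tokens)
output, looped = greedy_generate(("sift", "the", "flour"), counts)
print(" ".join(output))
print("loop detected:", looped)
```

Choosing by probability instead of always taking the top option gives the generator a chance to leave such a cycle whenever a context has more than one continuation.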

4.0 cups butter flavored shortening
3.333 cups packed brown sugar
? cups white sugar
4.0 cups all purpose flour
1.143 tsp baking soda
0.738 tsp baking powder
1.0 whole egg
1.0 whole egg yolk
2.0 cups semisweet chocolate chips
0.8 tbsp vanilla extract

Bake 350°F for 7 minutes

The Neural Network Cookie

“Like caramelized cookie brittle. It’s not terrible but it’s not a cookie.”

Our last recipe was created using deep learning, one of the most compelling recent advances in artificial intelligence. An algorithm called a neural network has changed the game in facial recognition, speech recognition, and image processing in the last few years. Neural networks train on a set of data, like a set of pictures, text documents, or cookie recipes, and can learn the patterns inherent in their input without very much guidance, if any, from humans.

Trained neural networks can even create their own works of art. So if we trained a neural network on a set of chocolate chip cookie recipes, we could ask it to generate its own rendition of the recipe. That’s the idea behind our neural network cookie. Here’s how it works:

Creating the Recipe

Step 1

Collect the ingredients and directions from lots of chocolate chip cookie recipes.

Step 2

The neural network needs to find patterns in these words, so it breaks up all of the words into individual letters.

Step 3

The algorithm looks for patterns in how a single letter is used and what other letters typically come before or after it. This is called training.

Step 4

After the neural network has trained on enough recipes, it can begin to guess how a recipe would be written. So, if you give it a randomly assigned letter, it can try to guess which letters might come next. It continues letter by letter until an entire recipe emerges.
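The network itself is too large to show here, but the data preparation in Steps 2 and 3 can be sketched. The Python below is an assumption-laden illustration, not the setup used for this story: it breaks a made-up snippet of recipe text into letters, numbers each distinct character, and builds the (previous letters, next letter) pairs a character-level model would train on. The names `build_vocab` and `training_pairs` and the window size `context_len` are hypothetical.

```python
def build_vocab(text):
    """Map each distinct character to an integer id."""
    return {ch: i for i, ch in enumerate(sorted(set(text)))}


def training_pairs(text, context_len=5):
    """Slide a window over the text: each context predicts the next character."""
    return [
        (text[i:i + context_len], text[i + context_len])
        for i in range(len(text) - context_len)
    ]


# Invented snippet standing in for the recipe data set.
text = "1 cup sugar\n1 cup flour\n"

vocab = build_vocab(text)
pairs = training_pairs(text, context_len=5)

# Networks work on numbers, so each character is replaced by its id.
encoded = [([vocab[ch] for ch in context], vocab[target])
           for context, target in pairs]

print(pairs[0])    # the first five characters and the one that follows
print(encoded[0])  # the same pair as integer ids
```

Nothing in these pairs says what "sugar" means. The model only ever sees which character tends to follow which, one letter at a time.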

Watch out for made-up words!

Because neural networks piece together language letter by letter, they don’t have any understanding of the meaning of words in recipes, and so they sometimes make up new words.

...And repeated or missing ingredients!

Neural networks are great at learning the format of a typical recipe, but they’re not so excellent at understanding which ingredients go together. A network may not realize that it already added a cup of sugar and will then suggest that you add another cup of sugar to your ingredients...and another...and another.

It also may not notice that some ingredients, like eggs, are important, so they may be left out completely.

4.0 cups all purpose flour
2.0 tsp baking soda
1.0 cups white sugar
4.0 whole eggs
2.0 tsp vanilla
1.0 cup semisweet chocolate chips
? cup walnuts
0.5 cups white sugar
0.75 cups granulated sugar
0.8 cups white sugar
1.218 cups packed brown sugar
0.5 cups white sugar
1.0 cup white sugar
1.2 cups packed brown sugar
1.0 cup white sugar
2.0 tsp baking soda

Bake for 10-12 min

Bake for 10-12 min (again)

Our Leftovers

Our computer made a valiant effort and did create three brand-new recipes! Whether its creations qualify as true chocolate chip cookies is still to be decided.

If you wanted to give these cookies a try, feel free to save the recipes, bake them yourself, and then let us know what you think! Oh, and if you don’t want hundreds of cookies (created from our recipes or your favorites), spread the love and that cookie-goodness with friends, family, or co-workers, have a bake sale to raise funds for hungry kids, or donate them. After all, it’s hard to go wrong with chocolate chip cookies.

Our Methods

We searched the internet, including recipe databases AllRecipes and Epicurious, for chocolate chip cookie recipes. Our search returned 915 recipes. We removed any recipes that were not representative of traditional chocolate chip cookies using several criteria: the title could not contain another dessert (e.g., “Chocolate chip cookie ice cream sandwiches” or “Chocolate chip cookie cake”), a flavor could not be mentioned in the title (e.g. “Peanut butter chocolate chip cookies” or “Banana chocolate chip cookies”), and the title could not identify the recipe as an alternative formulation of the cookie (“Vegan” or “Gluten free”). This brought our sample of traditional chocolate chip cookie recipes down to 221. Before any of the text processing or averaging methods were applied, recipes were scaled to make 48 servings.
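A rough sketch of this cleanup in Python is shown below. It is an illustration only: the keyword list `EXCLUDED_TERMS` is a guess built from the examples named above, not the full set of criteria, and the function names and sample titles are made up.

```python
# Hypothetical exclusion terms, taken from the examples in the text.
EXCLUDED_TERMS = (
    "ice cream", "cake",        # another dessert in the title
    "peanut butter", "banana",  # a flavor in the title
    "vegan", "gluten free",     # an alternative formulation
)


def is_traditional(title):
    """Keep a recipe only if its title contains none of the excluded terms."""
    lowered = title.lower()
    return not any(term in lowered for term in EXCLUDED_TERMS)


def scale_recipe(ingredients, servings, target=48):
    """Scale every amount so the recipe makes `target` servings."""
    factor = target / servings
    return {name: amount * factor for name, amount in ingredients.items()}


titles = [
    "Best Chocolate Chip Cookies",
    "Chocolate Chip Cookie Ice Cream Sandwiches",
    "Banana Chocolate Chip Cookies",
    "Vegan Chocolate Chip Cookies",
]
kept = [t for t in titles if is_traditional(t)]
print(kept)

# A made-up recipe for 24 cookies, doubled to make 48.
print(scale_recipe({"flour": 2.0, "eggs": 1}, servings=24))
```

Scaling before averaging matters: without it, a recipe for 96 cookies would count twice as heavily in the average as one for 48.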

Note that our text processing methods (specifically, the neural network and predictive text) produced instructions that were not always physically possible to complete. For example, some ingredients are listed but never used in the instructions, and the instructions sometimes reference ingredients that were never listed. As we attempted to make these recipes, we sought to adhere as closely to the recipe as possible, making guesses where required.

Many thanks to Jan Diehm for design assistance on this story and for the header photograph.

Hungry for more computer-generated recipes?

We weren’t the first ones to use computers to generate new recipes and (we hope) we won’t be the last! Check out these recipes for pie, cake and much more. If you’re more of a cooking-show fan, you may enjoy watching Elle make a Thanksgiving dinner, a Valentine’s Day dessert, and a mystery meal all generated by various algorithms.