Building ShopGPT: Merchant discovery powered by the OpenAI Embeddings API


At Postscript, we have a monthly stipend that team members can spend at any of our 10,000+ customers. This gets them to go through our merchants’ purchase and SMS flows on a regular cadence and helps them stay up to date with ecommerce customer journeys and strategy.

As we grew past 10,000 merchants, it became harder for our team to sift through all of the shops to find products we wanted or new, fun merchants that we hadn’t previously bought from. I wanted to build a proof of concept of a tool to search these merchants, and at the same time give myself a good excuse to play around with AI APIs outside of the standard ChatGPT use cases.

I also believe all technical leaders should figure out ways to stay sharp, so I created ShopGPT* using the OpenAI Embeddings API.

*(Please ignore the HTTPS warnings — that was a battle that I refused to fight with AWS within my time limit)

The Embeddings API takes a piece of text and returns a compact numerical representation of it: a vector of numbers called an embedding. Texts with similar meanings end up with vectors that are close together, which makes it possible to measure how similar two pieces of text are and to cluster them into categories.

Here is an example from OpenAI of clustering around similar topics (animals vs. athletes).

The API allows for two mathematical representations of text to be compared to find similarity, even if the words aren't the same. For example, "Baby clothes for toddlers" is mathematically similar to "Shirts for 4 year olds" but none of the identifying words are the same.
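To make "mathematically similar" concrete, here is a minimal sketch using cosine similarity, one common way to compare embeddings. The three-dimensional vectors are made-up placeholders for illustration; real embeddings come from the API and have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy vectors, NOT real API output.
baby_clothes = [0.9, 0.1, 0.2]   # "Baby clothes for toddlers"
kids_shirts = [0.8, 0.2, 0.3]    # "Shirts for 4 year olds"
fishing_gear = [0.1, 0.9, 0.1]   # an unrelated phrase

similar = cosine_similarity(baby_clothes, kids_shirts)
unrelated = cosine_similarity(baby_clothes, fishing_gear)
print(similar > unrelated)
```

The two clothing phrases share no keywords, yet their vectors point in nearly the same direction, so they score higher than the unrelated pair.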

Create Store Descriptions

To build our custom search interface, I needed to create paragraph-long descriptions of each store. Unfortunately, the Shopify API has no great fields for describing what a store sells in a sentence or paragraph. Instead, I built the store descriptions from each website’s OG meta description or the entire HTML body of the homepage, in combination with the top product names from each store.
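Pulling the OG description out of a homepage can be sketched with Python's standard-library HTML parser. This is an illustration rather than the code ShopGPT actually runs; the class name, function name, and sample HTML are hypothetical.

```python
from html.parser import HTMLParser


class OGDescriptionParser(HTMLParser):
    """Captures the content of the first <meta property="og:description"> tag."""

    def __init__(self):
        super().__init__()
        self.og_description = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.og_description is not None:
            return
        attr_map = dict(attrs)
        if attr_map.get("property") == "og:description":
            self.og_description = attr_map.get("content")


def extract_og_description(html):
    """Return the OG description string, or None if the page has none."""
    parser = OGDescriptionParser()
    parser.feed(html)
    return parser.og_description


# Hypothetical sample page, for illustration only.
sample_html = (
    '<html><head><meta property="og:description" '
    'content="Premium outdoor goods."></head><body></body></html>'
)
print(extract_og_description(sample_html))
```

When the function returns None, the page has no OG description and the description has to come from the other inputs, such as the homepage body.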

I asked GPT to create a human-readable paragraph summary from the HTML web scrape, product names, and OG description. The final result is a decent description of a Shopify merchant that didn’t previously exist.
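As a rough sketch of this step, the three inputs can be assembled into a single prompt. The prompt wording, function name, and character limit below are assumptions, since the actual prompt isn't shown here, and the call that sends the prompt to the model is left out.

```python
def build_summary_prompt(store_url, og_description, product_names, page_text,
                         max_page_chars=2000):
    """Assemble a summarization prompt from the scraped inputs (hypothetical wording)."""
    products = ", ".join(product_names)
    return (
        f"Write a one-paragraph, human-readable description of the online store {store_url}.\n"
        f"OG description: {og_description or 'not available'}\n"
        f"Top products: {products}\n"
        # Truncate the scrape so a long homepage doesn't blow up the prompt size.
        f"Homepage text: {page_text[:max_page_chars]}\n"
    )


prompt = build_summary_prompt(
    "duckcamp.com",
    None,  # pretend this page had no OG description
    ["lightweight hunting shirts", "bamboo crews", "signature fishing shirts"],
    "placeholder homepage text",
)
print(prompt)
```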

Here’s an example for the store duckcamp.com:

Duck Camp is a store that specializes in premium outdoor goods for hunting and fishing apparel. They offer a variety of products such as lightweight hunting shirts, bamboo crews, and signature fishing shirts.

Create Embeddings from Descriptions

Once I had the store descriptions, I created an embedding for each one and cached the results. I used the OpenAI template for product recommendations as a starting point: I created embeddings of the store descriptions, saved them to a pickle file, and stored it in S3. When the application starts, the pickle file is loaded into the app’s memory.
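The caching step can be sketched as below. The vectors are placeholders rather than real API output, the second store name is invented, and the S3 upload and download are omitted so the example runs locally; a temporary directory stands in for the bucket.

```python
import pickle
import tempfile
from pathlib import Path


def save_embeddings(embeddings, path):
    """Write a {store: vector} dict to a pickle file."""
    with open(path, "wb") as f:
        pickle.dump(embeddings, f)


def load_embeddings(path):
    """Read the {store: vector} dict back, e.g. once at application startup."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Placeholder vectors; in the real pipeline these would come from the Embeddings API.
store_embeddings = {
    "duckcamp.com": [0.1, 0.9, 0.1],
    "baby-store.example": [0.9, 0.1, 0.2],  # hypothetical store
}

with tempfile.TemporaryDirectory() as tmp_dir:
    cache_path = Path(tmp_dir) / "store_embeddings.pkl"
    save_embeddings(store_embeddings, cache_path)
    loaded = load_embeddings(cache_path)

print(loaded == store_embeddings)
```

One caveat with this design: unpickling can execute arbitrary code, so a pickle file should only be loaded from a location you control.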

Query Embeddings, Return Relevant Results

Finally, I created an API and frontend that let users search the embeddings for merchants relevant to their query.

When a user submits a query, the system creates an embedding of the query and compares it to the embeddings of the merchants. It finds the merchant embeddings with the smallest distances to the query embedding, then maps those results back to the natural language descriptions of the stores, which are shown to the user.

A cool note is that the OpenAI GPT APIs aren’t used at all during the search (only the Embeddings API, to embed the query), which makes the entire process incredibly fast and cheap.
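A minimal sketch of that lookup, assuming cosine distance as the distance measure and a plain in-memory dictionary of embeddings. The vectors, the two extra store names, and their descriptions are hypothetical, and the query vector is hard-coded where the real system would call the Embeddings API.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; smaller means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def search(query_embedding, store_embeddings, descriptions, top_k=3):
    """Return (store, description) pairs for the top_k closest stores."""
    ranked = sorted(
        store_embeddings,
        key=lambda store: cosine_distance(query_embedding, store_embeddings[store]),
    )
    return [(store, descriptions[store]) for store in ranked[:top_k]]


# Toy data: placeholder vectors and partly invented stores.
store_embeddings = {
    "duckcamp.com": [0.1, 0.9, 0.1],
    "baby-store.example": [0.9, 0.1, 0.2],
    "coffee-store.example": [0.2, 0.2, 0.9],
}
descriptions = {
    "duckcamp.com": "Premium outdoor goods for hunting and fishing apparel.",
    "baby-store.example": "Clothes for babies and toddlers.",
    "coffee-store.example": "Small-batch roasted coffee.",
}

# Stand-in for the embedding of a query like "fishing shirts".
query_embedding = [0.15, 0.85, 0.05]

results = search(query_embedding, store_embeddings, descriptions, top_k=2)
for store, description in results:
    print(store, "->", description)
```

A linear scan like this is fine for a few hundred stores; the sort touches every embedding on each query, which is where a dedicated vector index starts to pay off as the data grows.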

This was a pretty rushed ~1.5 day side project, so I could have done much better in ways I both understand and don’t. Here are some high level changes I would make with more time, based on what I know:

  • The data was limited to 300 of our top stores; this could and should be far more.

  • Eventually, with more data, this would need a vector DB to store and compare embeddings instead of holding them all in memory.

  • UI could be better

    • add photos of stores and products

  • More products could be used to increase accuracy of description.

  • This could be turned into product search in addition to merchant search, similar to Shopify’s implementation

  • Add product prices / reviews / etc

It’s only a matter of time before a company ends up building a killer product discovery app for consumers (whether Shopify or someone else).

With the OpenAI Embeddings API and some creativity, you can create a custom search interface for any database of information. Many companies are already doing this with company Slacks / emails / codebases. It’s essentially the power of Google, deployable anywhere.

This was all done on consumer hardware and can run with minimal resources, which massively lowers the barrier to entry for creating amazing products.

I’m incredibly excited to see what folks build with these APIs.

Adam Turner is the CEO of Postscript.io, a Series C, 50m+ revenue startup helping Shopify merchants make SMS their #1 revenue channel.