Show HN: Aventos – An experiment in cheap AI SEO

aventos.dev

19 points by JimsonYang 21 days ago · 16 comments · 2 min read

Hi HN, we built Aventos, a cheap way to track company mentions in LLMs.

Aventos is an experiment we're doing after spending ~6 weeks working on various projects in the AI search / GEO / AEO space.

One thing that surprised us is how most tools in this category work. Traditionally, they simulate ChatGPT or Perplexity queries by attempting to reverse engineer the search process. Over the past year, many have shifted to scraping live ChatGPT results instead, since those are significantly cheaper and reflect real outputs more closely.

Building and maintaining scrapers is tedious and fragile, so recently a number of SaaS products have emerged that effectively wrap a small number of third-party ChatGPT/Perplexity/Google AIO/etc scraping APIs. What felt odd to us is that many of these tools still charge $70–$200+ per month, despite largely being wrappers around the same underlying data providers.

So we wanted to test a simple idea: if the core cost is just API usage and commodity infrastructure, and software costs are lower because of AI, can we be a successful startup if we price near our costs?

What we have so far:

1. Analytics similar to other tools (tracking AI citations, AI search results, and competitor mentions)

2. Content creation features (early and still being improved)

We’d love feedback, especially from a non-marketing perspective, on:

* bugs

* confusing terminology or tabs

* anything that feels hand-wavy or misleading

There’s a demo account available if you want to poke around:

username: divit.endal4@gmail.com password: password

Happy to answer questions about what other things we've built in the space, how these tools work, etc.

gamegod 21 days ago

I stumbled across this a few weeks ago via Google or Kagi. One problem I have with a lot of AI tooling today is that I can't tell if it's a legit product, or just a toy that somebody vibe-coded in a day. AI can easily crap out a website that looks like this.

This is all just my 2 cents, so take it with a grain of salt, but I see trust and authenticity as a huge issue (made worse by LLMs), and it's doubly so with AI-based companies because it attracts flies.

One question I have: it's not clear to me that this doesn't all just boil down to plain old SEO. Does your platform generate recommended actions on how to improve your ChatGPT ranking? (And how is that different from just improving your PageRank?)

  • JimsonYangOP 21 days ago

    Thanks for the feedback. To answer your question: 1) Yes, you still need good content, just as if you were ranking for SEO. However, trad SEO (I'm shorthanding because I'm lazy) really requires you to be in the top 5 results, and that has been made worse by Google AIO, zero-click answers, and a bunch of other things. For AI SEO you need to be in the top 20-30. Easier, yes, but fundamentals like writing good content are still needed. 2) The difference between trad SEO and AI SEO is that the content you have to create to rank well has changed, often needing multiple pieces to rank well. For instance, for the prompt "Which AI company has the safest AI models?", the Google results show a lot of pages listing various companies, ranked from safest to least safe, which is almost promotional content. But if you look at what ChatGPT is searching and getting, it's a lot more safety indexes, reports, and technical information.

    There are SEO experts who believe that trad SEO and AI SEO are the same, and that's fair. We're in the group that believes they're pretty much the same, although we're starting to reach an inflection point and that will change in the future.

    To answer your second question: we don't have an icon to give recommended actions. I think that's a mistake we made during planning, where we saw "AI recommended actions" and questioned their validity. For instance, it doesn't make sense to edit and change Wikipedia because that's incredibly hard, although a few platforms do recommend that. The idea was that users would improve their keyword research and blog content strategy and we would just assist in figuring out what to write about, which we still think is the best approach. However, it seems that wasn't very clear. It's different from improving PageRank because the strategy has shifted. Instead of optimizing for the 1st and 2nd result, you're optimizing for topics as a whole, and you can even appear on the second page of Google and still be picked up. If you mean improving your PageRank by making good content that's unique and high quality, then I fully agree that you need to improve your PageRank.

  • add-sub-mul-div 21 days ago

    So many LLM submissions here are self promotion spam from accounts with no other activity, which further makes them seem low effort and decreases the average amount of trust they've earned or deserve.

satvikpendem 21 days ago

> many of these still tools charge $70–$200+ per month, despite largely being wrappers around the same underlying data providers

That is business. You should be charging a premium for features that companies would like to use, that's one of the first rules of B2B.

  • JimsonYangOP 21 days ago

    That’s an awesome point. I think it's an aversion on my end to charging a higher price, partly because it's still a pretty green field, so there’s a lot that hasn’t been built yet: features that companies would like to use and that we would charge for.

    Will we charge a premium in the future? Yeah, but only where we’re confident there’s real ROI and we have a great product to back that. For instance, AI attribution in this space is incredibly difficult right now. And in my opinion, without hard numbers (revenue, sign-ups, conversions, etc.), I can’t justify asking for premium pricing.

    Maybe I'm thinking about it wrong tho and making excuses for myself, I think you have a good point

    • chupchap 16 days ago

      You could create a separate pricing for demanding businesses that need to track multiple products/brands that come under its umbrella, while retaining the existing pricing for smaller businesses and individuals

n_u 16 days ago

> wrap a small number of third-party ChatGPT/Perplexity/Google AIO/etc scraping APIs

Can you explain a little bit how this works? I'm guessing the third-parties query ChatGPT etc. with queries related to your product and report how often your product appears? How do they produce a distribution of queries that is close to the distribution of real user queries?

  • JimsonYangOP 15 days ago

    1) How third parties query ChatGPT about your product: For ChatGPT specifically, they open a headless browser, ask a question, and capture the results, such as the response and any citations. From there, they extract entities from the response. During onboarding I’m asked who my competitors are, and those names are the entities that get recognized in the response. For example, if the query is “what are the best running shoes” and the response is something like “Nike is good, Adidas is okay, and On is expensive,” and my company is On, entity recognition against my list of competitors is used to see which ones appear in the response and in which order.

    If this weren’t automated, the process would look like this: someone manually reviews each response, pulls out the companies mentioned and their order, and then presents that information.
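
A minimal sketch of the entity-matching step described above, assuming plain string matching against the brand list collected at onboarding. The function name `rank_mentions` and the extra brand "Hoka" are hypothetical; real tools may well use a proper NER model instead.

```python
import re


def rank_mentions(response: str, brands: list[str]) -> list[str]:
    """Return the tracked brands in the order they first appear in the response."""
    hits = []
    for brand in brands:
        # Case-sensitive, whole-word match so the brand "On" does not match
        # the preposition "on" or the inside of a word like "only".
        match = re.search(rf"\b{re.escape(brand)}\b", response)
        if match:
            hits.append((match.start(), brand))
    # Sort by position of first mention; brands never mentioned are dropped.
    return [brand for _, brand in sorted(hits)]


response = "Nike is good, Adidas is okay, and On is expensive."
print(rank_mentions(response, ["On", "Nike", "Adidas", "Hoka"]))
# ['Nike', 'Adidas', 'On']
```

Short brand names like "On" are the awkward case here: a case-insensitive match would fire on almost every response, which is one reason exact matching alone is only a rough stand-in for entity recognition.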

    2) Distribution of queries: This is a bit of a dirty secret in the industry (intentional or not). Ideally you take snapshots and measure them over time to get a distribution. However, a lot of tools will run a query once across different AI systems, take the results, and call it done.

    Obviously, that isn’t very representative. If you search “best running shoes,” there are many possible answers, and different companies behave differently. What better tools like Profound do is run the same prompt multiple times. By my estimates, Profound runs each prompt up to 8 times. This gives a broader snapshot of what tends to show up every day. You then aggregate those snapshots over time to approximate a distribution.
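
The aggregation over repeated runs can be sketched like this. It is only an illustration: `mention_rates` is a hypothetical helper, and the eight runs below are made-up data, not real ChatGPT output.

```python
from collections import Counter


def mention_rates(runs: list[list[str]]) -> dict[str, float]:
    """Fraction of runs in which each brand appeared at least once."""
    counts: Counter = Counter()
    for brands_in_run in runs:
        # set() so a brand mentioned twice in one response counts once.
        counts.update(set(brands_in_run))
    return {brand: n / len(runs) for brand, n in counts.items()}


# Hypothetical brand lists extracted from 8 runs of the same prompt.
runs = [
    ["Nike", "Adidas", "On"],
    ["Nike", "Hoka"],
    ["Nike", "Adidas"],
    ["On", "Nike"],
    ["Nike", "Adidas", "Hoka"],
    ["Adidas", "Nike"],
    ["Nike", "On"],
    ["Hoka", "Nike"],
]
rates = mention_rates(runs)
print(rates["Nike"], rates["Adidas"], rates["On"])
# 1.0 0.5 0.375
```

Repeating this daily and storing the rates gives the series of snapshots that gets aggregated over time.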

    As a side note: you might argue that running a prompt 8 times isn’t statistically significant, and that’s partially true. However, LLMs tend to regress toward the mean and surface common answers over repeated runs, and we found 8 runs to be a good indicator. The level of completeness depends on the prompt (e.g. "what should I have for dinner" vs. "what is good accounting software for startups"). I can touch on that more if you want.
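
To put a number on the "8 runs" caveat, here is a sketch that computes a Wilson score interval for a mention rate. This is a standard statistics formula swapped in for illustration, not something the thread says Profound or Aventos uses, and it assumes the runs are independent.

```python
import math


def wilson_interval(hits: int, runs: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 is roughly 95% confidence)."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    p = hits / runs
    denom = 1 + z**2 / runs
    center = (p + z**2 / (2 * runs)) / denom
    half = z * math.sqrt(p * (1 - p) / runs + z**2 / (4 * runs**2)) / denom
    return max(0.0, center - half), min(1.0, center + half)


# A brand that shows up in 4 of 8 runs.
low, high = wilson_interval(4, 8)
print(f"{low:.2f} to {high:.2f}")
# 0.22 to 0.78
```

With 8 runs, a brand seen half the time could plausibly have a true mention rate anywhere from about a fifth to about four fifths, so a single day's batch is a coarse signal and the aggregation over days carries most of the weight.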

    • n_u 15 days ago

      As I understand, in normal SEO the number of unique queries that could be relevant to your product is quite large, but you might focus on a small subset of them ("running shoes", "best running shoes", "running shoes for 5k", etc.) because you assume that those top queries capture a significant portion of the distribution (e.g. perhaps those 3 queries capture >40% of all queries related to running shoe purchases).

      Here the distribution is all queries relevant to your product made by someone who would be a potential customer. Short and directly relevant queries like "running shoes" will presumably appear more times than much longer queries. In short, you can't possibly hope to generate the entire distribution, so you sample a smaller portion of it.

      But in LLM SEO it seems that assumption is not true. People will have much longer queries that they write out as full sentences: "I'm training for my first 5k, I have flat feet and tore my ACL four years ago. I mostly run on wet and snowy pavement, what shoe should I get?" which probably makes the number of queries you need to sample to get a large portion of the distribution (40% from above) much higher.

      I would even guess it's the opposite and the number of short queries like "running shoes" fed into an LLM without any further back and forth is much lower than longer full sentence queries or even conversational ones. Additionally because the context of the entire conversation is fed into the LLM, the query you need to sample might end up being even longer

      for example: user: "I'm hoping to exercise more to gain more cardiovascular fitness and improve the strength of my joints, what activities could I do?"

      LLM: "You're absolutely right that exercise would help improve fitness. Here are some options with pros and cons..."

      user: "Let's go with running. What equipment do I need to start running?"

      LLM: "You're absolutely right to wonder about the equipment required. You'll need shoes and ..."

      user: "What shoes should I buy?"

      All of that is to say, this seems to make AI SEO much more difficult than regular SEO. Do you have any approaches to tackle that problem? Off the top of my head I would try generating conversations and queries that could be relevant and estimating their relevance with some embedding model & heuristics about whether keywords or links to you/competitors are mentioned. It's difficult to know how large a sample is required, though, without having access to all conversations, which OpenAI etc. are unlikely to give you.
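
A rough sketch of the scoring idea floated above. It assumes a bag-of-words cosine similarity as a dependency-free stand-in for a real embedding model, plus a single-word keyword heuristic; the function names and the 0.2 bonus are arbitrary, hypothetical choices.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercased token counts; a crude placeholder for an embedding vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def relevance(conversation: str, product_description: str, keywords: set[str]) -> float:
    """Similarity to the product description, plus a bonus if a tracked keyword appears."""
    tokens = bag_of_words(conversation)
    similarity = cosine(tokens, bag_of_words(product_description))
    # Heuristic: bump the score when a single-word brand or product keyword shows up.
    bonus = 0.2 if set(tokens) & {k.lower() for k in keywords} else 0.0
    return min(1.0, similarity + bonus)


description = "running shoes for road and trail"
keywords = {"Nike", "shoes"}
shoe_query = relevance("What shoes should I buy for running?", description, keywords)
dinner_query = relevance("what should I have for dinner", description, keywords)
print(shoe_query > dinner_query)
# True
```

Generated conversations could be ranked by this score and only the top ones sent to the scraper, though that does nothing to answer the harder question of how large the sample needs to be.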

      • JimsonYangOP 14 days ago

        Short answer: it depends, and idk. When I was doing some testing with prompts like "what should I have for dinner", adding variations ("hey AI", "plz", etc.) didn't change the intent much, as AI is really good at pulling out intent. But obviously if you say "I'm on keto, what should I have for dinner", it's going to ignore things like "garlic, pesto, and pasta noodles", although it pulls a similar response to "what's a good keto dinner". From there, we really assume the user knows their customers and what types of prompts led them to ChatGPT. You might've noticed sites asking if you came from ChatGPT; I would take that a step further and ask them to type the prompt they used.

        But you do bring up a good point, because not all prompts are equal, especially with personalization. So how do we solve that problem? I'm not sure. I have yet to see anything in the industry. The only thing that came close was when a security-focused browser extension started selling data to AEO companies; that's how some companies get "prompt volume data".

        • n_u 14 days ago

          I see what you are saying, perhaps no matter the conversation before as long as it doesn't filter out some products via personalized filters (e.g. dietary restrictions) it will always give the same answers. But I do feel the value prop of these AI chatbots is that they allow personalization. And then it's tough to know if 50% of the users who would previously have googled "best running shoes" instead now ask detailed questions about running shoes given their injury history etc and that changes what answers the chatbot gives.

          I feel like without knowing the full distribution, it's really tough to know how many/what variations of the query/conversation you need to sample. This seems like something where OpenAI etc. could offer their own version of this to advertisers and have much better data because they know it all.

          Interesting problem though! I always love probability in the real world. Best of luck, I played around with your product and it seems cool.

twooclock 16 days ago

After demo login I'm stuck at the Dashboard tour modal. Can't start the tour or dismiss the modal. FF on mobile.

Tried it again and somehow managed to bypass it. Site is not really mobile friendly. I understand, but maybe you should inform a potential user somewhere?

  • JimsonYangOP 15 days ago

    oh wow, didn't even notice that. That's a good catch. sry that happened to you. Will be removing the tours

dyauspitr 16 days ago

I don’t understand what the business is here. Is this just scraping ChatGPT results that are public to check if your business has been mentioned in the answer?

  • JimsonYangOP 15 days ago

    you got it perfectly. Taking scraped results and presenting the data to customers. Based on these results, SEO/AEO experts then make suggestions like "write about X" to help you "improve" in the results. Is it part dark magic part truth? probably, but that's the SEO industry

    I think the biggest issue in the industry is that there's no way to definitely tell how useful these converted users are, or whether performance is drastically better from showing up in ChatGPT. That said, attribution itself is a difficult issue in the online marketing industry
