Show HN: Hercules – Open-Source software testing agent

testzeus.com

10 points by Smilinrobin a year ago · 13 comments · 2 min read

Hey HN,

After spending way too many late nights wrestling with brittle test scripts and UIs that seem to change just to spite me, we decided to do something about it. So we built Hercules, an open-source testing agent aiming to make end-to-end testing of modern web applications less of a Herculean task (pun absolutely intended).

P.S. Paid solutions can cost upwards of $15,000 and still struggle with maintenance.

What is Hercules?

Hercules is an AI-powered testing agent that turns simple Gherkin steps into fully automated end-to-end tests, eliminating the need for coding skills. It leverages Large Language Models (LLMs) to reason and perform actions based on test requirements.

Key Features

- Gherkin In, Results Out: Just provide your tests in Gherkin format, and Hercules executes them automatically, outputting results as JUnit XML and HTML reports.
- Free and Open Source: Hercules is released under the AGPL v3 license, allowing anyone to use, modify, and contribute.
- Salesforce Ready: Capable of handling intricate UIs, Hercules excels in testing platforms like Salesforce.
- No Code Required: Automates Gherkin features without the need for scripting or dealing with selectors.
- Multilingual Support: Supports multiple languages out of the box, facilitating global collaboration.
- Precisely Accurate: Records execution videos and captures network logs for reliable test results.
- No Maintenance: Features auto-healing capabilities to adapt to changes without manual intervention.
- CI/CD Ready: Easily integrates into your CI/CD pipeline and can run in Docker containers.
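For reference, the input is a plain Gherkin feature file. The scenario below is a hand-written illustration of the format, not one of the project's bundled samples:

```gherkin
Feature: Login
  Scenario: Valid user can log in
    Given I am on the login page
    When I enter a valid username and password
    And I click the "Sign in" button
    Then I should see the dashboard
```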

Why is Hercules Different?

Unlike traditional testing tools, Hercules acts as an intelligent agent that can think, reason, and adapt. It's built on a multi-agent architecture, allowing it to plan and execute tasks based on the provided Gherkin steps. Additionally, it comes packed with built-in tools such as browser and API drivers, and users can build and attach external tools too.
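To make the "plan, then act" idea concrete, here is a toy sketch of such a control loop in Python. Everything here is a hypothetical illustration, not Hercules's actual internals: the planner is a hard-coded stub standing in for the LLM, and the tool names are made up.

```python
# Toy sketch of a plan-then-act agent loop (illustrative only).
# A real agent would ask an LLM to choose the next action; here the
# "planner" is a stub that maps Gherkin steps to tool invocations.

def fake_planner(gherkin_step: str) -> tuple[str, str]:
    """Stand-in for the LLM: pick a tool and an argument for a step."""
    if gherkin_step.startswith("Given I am on"):
        return ("browser.goto", "https://example.com/login")
    if gherkin_step.startswith("When I click"):
        return ("browser.click", "button#signin")
    return ("noop", "")

# Registry of callable tools; users could attach their own here.
TOOLS = {
    "browser.goto": lambda arg: f"navigated to {arg}",
    "browser.click": lambda arg: f"clicked {arg}",
    "noop": lambda arg: "nothing to do",
}

def run_feature(steps: list[str]) -> list[str]:
    """Plan each step, dispatch to the chosen tool, collect results."""
    results = []
    for step in steps:
        tool, arg = fake_planner(step)
        results.append(TOOLS[tool](arg))
    return results

if __name__ == "__main__":
    print(run_feature([
        "Given I am on the login page",
        "When I click the sign-in button",
    ]))
```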

Try It Out

The source code and detailed documentation are available on GitHub: https://github.com/test-zeus-ai/testzeus-hercules

To get started:

- Install via PyPI:

  ```bash
  pip install testzeus-hercules
  playwright install --with-deps
  ```
- Or use the Docker image:

  ```bash
  docker pull testzeus/hercules:latest
  ```
We've included sample feature files and a quick-start guide in the README to help you get up and running quickly.
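Because the results come out as standard JUnit XML, any CI system that understands that format can consume them directly. As an illustration (the sample report below is generic JUnit XML written for this sketch, not a verbatim Hercules report), a quick pass/fail summary can be pulled out with Python's standard library:

```python
# Summarize a JUnit XML report (illustrative; assumes the standard
# <testsuite tests=... failures=...> attributes and <failure> children).
import xml.etree.ElementTree as ET

SAMPLE = """<testsuite name="login.feature" tests="3" failures="1" errors="0">
  <testcase name="valid login"/>
  <testcase name="wrong password"><failure message="expected error banner"/></testcase>
  <testcase name="logout"/>
</testsuite>"""

def summarize(xml_text: str) -> dict:
    """Return total/failed counts and the names of failing test cases."""
    suite = ET.fromstring(xml_text)
    return {
        "tests": int(suite.get("tests", "0")),
        "failures": int(suite.get("failures", "0")),
        "failed_cases": [
            tc.get("name")
            for tc in suite.findall("testcase")
            if tc.find("failure") is not None
        ],
    }

if __name__ == "__main__":
    print(summarize(SAMPLE))
```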

Feedback Welcome

We'd love to hear your thoughts, experiences, or even horror stories about testing gone wrong. Feel free to open issues on GitHub or join forces on our Discord: https://discord.gg/4fyEMWVD

SmilinrobinOP a year ago

Thanks HackerNews for not letting us update the post. But the Discord link had some bug (yes, we need more testing), so here's the link for joining our Discord server: https://discord.gg/JhSAGhfR55

dhorthy a year ago

salesforce-ready is an interesting specialization, but makes a ton of sense if you're going after enterprise. Stoked to give this a spin.

lemme know if/when you build the CLI agent that write my gherkin too

deepak_heatm a year ago

Sad. If Ollama has limitations with function calling, can we try NexusRaven-13B or function-calling Mistral 7B?

  • SmilinrobinOP a year ago

    In theory, yes, we could, but would it yield "good enough" results for a testing agent? Probably not. The LLM here isn't just responsible for tool calling; it's also doing other intricate things such as planning the next steps based on the input feature file and generating the browser/API automation code. In our experiments we found that OpenAI's GPT-4o performs best, followed by Haiku or Grok.

tacticalglare a year ago

What would this need to be compatible with Ollama? I would love to run this with a local LLM!

  • shriyansha a year ago

    We tried GPT-4o, Anthropic's Haiku, Groq's 8B tool-use preview, and Llama 3.2 8B with tool support via Groq, and all worked well.

    With Ollama on our machines, we didn't get good results, maybe because of model limitations on function calling.

    We are experimenting with more (and smaller) models so that running with local models becomes possible.

    That said, if you check the README and the example config file, there is a way to give it a try with Ollama.

  • SmilinrobinOP a year ago

    You can already try other LLMs. Steps in docs.
