Ask HN: I want to build my own query language

3 points by bewal416 2 months ago · 11 comments · 2 min read


Our product is starting to get more and more requests for custom reports. I’ve built some basic tables with filters and exports to Excel/PDF, but they fall short of the nuance our customers need, especially those in regulated markets.

One customer needs full first names but only the first letter of the last name. Another needs a very specific JOIN with an entity almost no other customer cares about. To accommodate, I’ve been building custom Looker reports for each customer, which won't scale well.

I started looking into how other SaaS companies solved this. Many built their own SQL-like query languages:

- Salesforce -> SOQL
- Shopify -> ShopifyQL
- Stripe -> Sigma

All of them seemed to address the same problem I’m seeing: customers have unique reporting needs that no-code GUIs can’t handle. A drag and drop builder is great for non-techies, but most real requests require joins and transformations, and I’m trying to avoid becoming a consulting shop for every customer.

I'm particularly impressed by Stripe Sigma because of how they combine SQL with an LLM layer. Users can ask for a report in plain English, customize it in a lightweight BI tool, and edit the query only when needed.

Has anyone gone through this or have advice on alternative approaches? I’m open to any direction here.

doctorzook 2 months ago

Unless your data is really unusual, I’d generally recommend that you avoid writing your own query language and processor: it’s just damn hard to make it work well. Instead, look at how to put something like DuckDB in front of your data so people can just write SQL.

  • PaulHoule 2 months ago

    Or a step up from that: build a compiler that converts queries in a human-friendly or application-specific language to SQL or something similar.
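A minimal sketch of that compiler idea, assuming a tiny application-specific "language" that is just a list of report field names. The `FIELDS` catalog, the `contacts` table, and its columns are hypothetical; the `%s` placeholder assumes a MySQL driver that uses the `format` parameter style. It also covers the "first letter of the last name" request from the original post.

```python
# Hypothetical catalog: each report field maps to a vetted SQL expression.
# The table and column names are assumptions made for this sketch.
FIELDS = {
    "first_name": "first_name",
    "last_initial": "LEFT(last_name, 1)",
    "signup_date": "DATE(created_at)",
}


def compile_report(fields, customer_id):
    """Compile a list of report field names into a (sql, params) pair."""
    if not fields:
        raise ValueError("a report needs at least one field")
    columns = []
    for name in fields:
        # Only catalog entries reach the SQL text, so user input is never
        # interpolated into the query directly.
        if name not in FIELDS:
            raise ValueError(f"unknown field: {name!r}")
        columns.append(f"{FIELDS[name]} AS {name}")
    sql = f"SELECT {', '.join(columns)} FROM contacts WHERE customer_id = %s"
    return sql, (customer_id,)
```

Because every column comes from the catalog, customers can only ask for expressions you have already reviewed, at the cost of less flexibility than raw SQL.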

benoau 2 months ago

I'd stick with SQL, they can pull queries straight out of ChatGPT if they don't know it themselves.

If everyone lives within one database I'd throw up a per-customer read-only database in front of it for running their queries so they don't create performance issues.

  • bewal416OP 2 months ago

    We do have a single-tenant DB. That’s one of my architecture challenges: how to handle permissions and trim the schema down to only the entities my users need.

    • benoau 2 months ago

      Possibly achieve that with some views or w/e the equivalent is in your database, and database accounts that can only access those views.

      Another option might be to let them ingest their data directly into the existing BI tools they use where they can do whatever they want, cool thing about that is it can entrench you into their infrastructure and it offloads a lot of this complexity you're dealing with.

      • bewal416OP 2 months ago

        Okay, just spent the whole day tinkering with this:

        1) I create a baseline set of views I want my customers to have
        2) For each new customer, I’ll run a script that creates a replica of those views, filtered by their customer ID
        3) I’ll allow my customers to write pure SQL, limiting them to only SELECT queries and a couple niche business rules, as well as masking any DB-level errors, because that just feels wrong
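Step 3 (SELECT-only plus masked errors) can be sketched as follows. This uses SQLite from Python's standard library purely as a stand-in so the example runs anywhere; it is not how MySQL enforces anything, and `ReportError`, `select_only`, and `run_customer_query` are hypothetical names.

```python
import sqlite3

# Actions a read-only report query needs; everything else is denied.
_ALLOWED = {sqlite3.SQLITE_SELECT, sqlite3.SQLITE_READ, sqlite3.SQLITE_FUNCTION}


class ReportError(Exception):
    """Generic error shown to customers instead of raw database messages."""


def select_only(action, arg1, arg2, db_name, source):
    """SQLite authorizer callback: allow reads, deny every other action."""
    return sqlite3.SQLITE_OK if action in _ALLOWED else sqlite3.SQLITE_DENY


def run_customer_query(conn, sql):
    """Run one customer-supplied statement and mask any database error."""
    try:
        return conn.execute(sql).fetchall()
    except (sqlite3.Error, sqlite3.Warning):
        # Log the real error server-side; the customer only sees this.
        raise ReportError("Your query could not be run.") from None


# Usage: populate as an admin, then lock the connection down.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, total REAL)")
conn.execute("INSERT INTO orders VALUES (1, 9.5)")
conn.commit()
conn.set_authorizer(select_only)
rows = run_customer_query(conn, "SELECT id, total FROM orders")
```

The point of the sketch is that the restriction lives in the database connection rather than in string checks on the SQL text, which are easy to get wrong.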

        How does that approach sound?

        • benoau 2 months ago

          I think the main thing you're missing is creating an account in the DB that only has access to those views, so for each customer you'd do something like:

              CREATE USER customer_xyz WITH PASSWORD 'foo';
          
              CREATE VIEW customer_xyz_data AS SELECT * FROM data_stuff WHERE customer_id=x;
          
              GRANT SELECT ON customer_xyz_data TO customer_xyz;
          
          So then two things are happening: SELECT-only is enforced by the grant no matter what, and their account is categorically unable to touch anything outside of that view, so as long as you run their queries through that account it will always be sandboxed.

          You can enforce all of that yourself, but ultimately if they're using an account that can read/write other tables you will always have to be careful to sanitize their input, not just restricting it to SELECT but also limiting joins and nested queries.
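A rough sketch of a per-customer provisioning script along these lines, written for MySQL since that is the database named later in the thread. The `reporting` schema, the `customer_<id>` naming scheme, and the function name are assumptions; `IDENTIFIED BY RANDOM PASSWORD` assumes MySQL 8.0.18 or later. The function only builds the statements and leaves execution to whatever admin connection you use.

```python
import re

# Identifiers cannot be bound as query parameters, so validate them strictly.
_IDENT = re.compile(r"[a-z][a-z0-9_]*")


def provisioning_statements(customer_id, base_views, schema="reporting"):
    """Build the CREATE USER / CREATE VIEW / GRANT statements for one customer."""
    customer_id = int(customer_id)  # rejects anything that is not a number
    for name in [schema, *base_views]:
        if not _IDENT.fullmatch(name):
            raise ValueError(f"unsafe identifier: {name!r}")

    account = f"'customer_{customer_id}'@'%'"
    statements = [f"CREATE USER {account} IDENTIFIED BY RANDOM PASSWORD"]
    for base in base_views:
        view = f"{schema}.customer_{customer_id}_{base}"
        statements.append(
            f"CREATE VIEW {view} AS "
            f"SELECT * FROM {schema}.{base} WHERE customer_id = {customer_id}"
        )
        # The account gets SELECT on this view and nothing else.
        statements.append(f"GRANT SELECT ON {view} TO {account}")
    return statements
```

For example, `provisioning_statements(42, ["orders", "invoices"])` yields one `CREATE USER` followed by a `CREATE VIEW` and `GRANT SELECT` pair per base view.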

          • bewal416OP 2 months ago

            Gotcha. Yeah- I was thinking of working with my engineers to figure out a permissions layer, but I understand enforcing that at the DB-level would guarantee security.

            Dumb question- is creating a set of Views for each customer even efficient for my MySQL database? I could realistically see us having ~12 customer-facing views- is having 12*N views a smart and scalable way to architect this?

            • benoau 2 months ago

              A view is just a query that pretends to be a table, so it will come down to the complexity of that query. Each time you're querying the view it will be running the combination of the user's query against the view's query so the performance comes down to whether your DB is optimized around basically "SELECT field1, field2, field3 FROM (SELECT * FROM data_stuff WHERE customer_id=x)". Whether you execute that query as a view or as ad-hoc SQL doesn't make a difference itself.
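To make the "a view is just a query" point concrete, here is a small check that the view and the equivalent inline subquery return the same rows. It uses SQLite only because it ships with Python; the sample rows are invented, and this says nothing about how MySQL's optimizer plans either form (running `EXPLAIN` on both queries there would be the way to compare).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE data_stuff (customer_id INTEGER, field1 TEXT, field2 INTEGER);
    INSERT INTO data_stuff VALUES (1, 'a', 10), (1, 'b', 20), (2, 'c', 30);
    CREATE VIEW customer_1_data AS
        SELECT * FROM data_stuff WHERE customer_id = 1;
""")

# The customer's query, run against the view.
via_view = conn.execute(
    "SELECT field1, field2 FROM customer_1_data ORDER BY field2"
).fetchall()

# The same query with the view's definition inlined as a subquery.
via_subquery = conn.execute(
    "SELECT field1, field2 FROM "
    "(SELECT * FROM data_stuff WHERE customer_id = 1) ORDER BY field2"
).fetchall()
```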

              "Your side" of this can be optimized easily enough, but the user-submitted queries are likely to be inefficient or miss indexes, which is why one database per customer can be better since they each have their own resources.

              You can create the views and accounts as needed and destroy them when sessions end rather than keeping them permanently too, so when the user signs in you create the view and account, after the session or some period of inactivity you remove them.
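The create-on-sign-in, drop-on-sign-out idea could look roughly like this. `execute` is a hypothetical callable that runs one statement on an admin connection, the `customer_<id>` naming and the `data_stuff` source follow the earlier example, and the SQL assumes MySQL 8.0.18 or later.

```python
from contextlib import contextmanager

SOURCE_TABLE = "data_stuff"  # fixed here; never taken from user input


@contextmanager
def customer_session(execute, customer_id):
    """Create a customer's account and view for a session, then drop both."""
    cid = int(customer_id)  # rejects anything that is not a number
    account = f"'customer_{cid}'@'%'"
    view = f"customer_{cid}_data"
    execute(f"CREATE USER {account} IDENTIFIED BY RANDOM PASSWORD")
    try:
        execute(
            f"CREATE VIEW {view} AS "
            f"SELECT * FROM {SOURCE_TABLE} WHERE customer_id = {cid}"
        )
        execute(f"GRANT SELECT ON {view} TO {account}")
        yield account, view
    finally:
        # Runs on normal exit and on errors, so nothing is left behind.
        execute(f"DROP VIEW IF EXISTS {view}")
        execute(f"DROP USER IF EXISTS {account}")
```

In a real app the "session" would more likely end on a logout or inactivity timer than at the end of a `with` block, so treat the context manager as shorthand for paired setup and teardown.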

              • bewal416OP 2 months ago

                Makes sense. The fact that my SQL Editor puts tables and views in the same section on its left sidebar was the main reason I did a double-take.

                The idea of deleting and recreating views is an interesting one. I see that as a really cool approach- considering we can go without it as a v1 then include it as we scale.

                Thank you for all your advice so far! This has been truly helpful.
