Show HN: Omni – Open-source workplace search and chat, built on Postgres

165 points by prvnsmpth a day ago · 51 comments · 2 min read

Reader

Hey HN!

Over the past few months, I've been working on building Omni - a workplace search and chat platform that connects to apps like Google Drive/Gmail, Slack, Confluence, etc. Essentially an open-source alternative to Glean, fully self-hosted.

I noticed that some orgs find Glean to be expensive and not very extensible. I wanted to build something that small to mid-size teams could run themselves, so I decided to build it all on Postgres (ParadeDB to be precise) and pgvector. No Elasticsearch, or dedicated vector databases. I figured Postgres is more than capable of handling the level of scale required.

To bring up Omni on your own infra, all it takes is a single `docker compose up`, and some basic configuration to connect your apps and LLMs.

What it does:

- Syncs data from all connected apps and builds a BM25 index (ParadeDB) and HNSW vector index (pgvector)

- Hybrid search combines results from both

- Chat UI where the LLM has tools to search the index - not just basic RAG

- Traditional search UI

- Users bring their own LLM provider (OpenAI/Anthropic/Gemini)

- Connectors for Google Workspace, Slack, Confluence, Jira, HubSpot, and more

- Connector SDK to build your own custom connectors

Omni is in beta right now, and I'd love your feedback, especially on the following:

- Has anyone tried self-hosting workplace search and/or AI tools, and what was your experience like?

- Any concerns with the Postgres-only approach at larger scales?

Happy to answer any questions!

The code: https://github.com/getomnico/omni (Apache 2.0 licensed)

acidburnNSA a day ago

* "Self-hosted: Runs entirely on your infrastructure. No data leaves your network."

* "Bring Your Own LLM: Anthropic, OpenAI, Gemini, or open-weight models via vLLM."

With so many newbies wanting these kinds of services it might be worth adjusting the first bullet to say: "No data leaves your network, at least as long as you don't use any Anthropic, OpenAI, or Gemini models via the network of course"

prvnsmpthOP a day ago

That's a good point, it might make sense to clarify that for individuals who want to self-host. I'll make the change, thanks!
cjonas a day ago

Most organizations are going to be self hosting on aws, gcp or azure... So as long as you use their inference services as your LLM then you can keep it all within the private network
- acidburnNSA a day ago
  
  Even self-hosting on AWS, GCP, or Azure isn't local enough for certain application, such as people doing export-controlled work where any sysadmin or person with physical access to the server/data is required to be a US Person (or equivalent in other countries). This is the niche that the govcloud solutions are aimed at serving. But some people just want to build big actually-private, actually self-hosted systems and do their own physical and network security.
  - whattheheckheck 17 hours ago
    
    AWS Bedrock seems to say the inference code is only scanned for CASM and no one trains on your data.
    
    acidburnNSA 17 hours ago
    
    Are all people with physical access to the servers or network access to the hosts guaranteed to be US persons? Are all physical and network accesses logged for audits? That's the kind of thing govcloud promises that export control auditors want to see.
    I felt like "Confidential Compute" tech could solve this issue once and for all but I'm not so sure after seeing some of the attacks people can do with physical access.
    Another option of course is to not use cloud at all and have your own rack in a locked room with a good security system and/or armed US person guards.
- prvnsmpthOP a day ago
  
  Exactly, enterprise customers almost always use private model endpoints on their cloud provider for any serious deployments. Data stays within the customer's VPC, data security and privacy is guaranteed by the cloud providers.

philippemnoel a day ago

(ParadeDB maintainer here). This is super cool. Congrats on the project, and I'm excited to see ParadeDB be used to power this kind of use case. If there's anything else you need to ship Omni, don't hesitate to reach out to me!

dmix a day ago

This is a good time to be offering hybrid search extensions. I just did that myself recently with pgvector for a documentation site.
Does ParadeDB work with Render? They seem to have a whitelist of extensions https://render.com/docs/postgresql-extensions
- philippemnoel a day ago
  
  We just made a blueprint for it! https://github.com/paradedb/render-blueprint
  One-click deploy with Render, and we're directly in contact with the core team to get it added to their official docs. I hear the PR is up internally :)
  - dmix a day ago
    
    Sweet, nice work
prvnsmpthOP a day ago

Thanks Philippe! You guys have been super helpful on slack!
- philippemnoel a day ago
  
  Anytime! We have some vector search work coming in the next few weeks/months that I expect you'll find interesting. Stay tuned :)

aitchnyu 9 hours ago

Tangential, when you mentioned "Full-text (BM25) and semantic (pgvector) search", what are the significant benefits of the latter? I used to think of BM25 indexes as vectors of documents, which support search, "more like this" etc.

zaphoyd a day ago

How are you managing multiplayer and permissions? I see in the docs that you can add multiple users and that queries are filtered by the requesting user such that the user only sees what they have access to. The docs aren't particularly clear on how this is being accomplished.

Does each user do their own auth and the ingest runs for each user using stored user creds, perhaps deduplicating the data in the index, but storing permissions metadata for query time filtering?

Or is there a single "team" level integration credential that indexes everything in the workspace and separately builds a permissions model based on the ACLs from the source system API?

prvnsmpthOP a day ago

So it depends on the app - e.g., Google has domain-wide delegation where the workspace admin can provide service account creds that allow us to impersonate all users in the workspace and index all their files/email. During indexing, we determine the users/groups who have permissions file and persist that in the db. (It's not perfect, because Google Drive permission model is a bit complex, but I'm working on it.) This model is much simpler than doing per-user OAuth.
In general, the goal is to use an org-wide installation method wherever possible, and record the identify of the user we are impersonating when ingesting data in the ACL. There are some gaps in the permission-gathering step in some of the connectors, I'm still working on fixing those.

PhilippGille a day ago

How does it compare to Onyx (rebranded from Danswer, with more chat focus, while Danswer was more RAG focus on company docs/comms)?

- https://onyx.app/

- Their rebranded Onyx launch: https://news.ycombinator.com/item?id=46045987

- Their orignal Danswer launch: https://news.ycombinator.com/item?id=36667374

prvnsmpthOP a day ago

So far both projects are quite similar… the only major difference being the search index. Onyx uses vespa.ai for BM25 and vector search, I decided to go down the Postgres-only route.

Doublon a day ago

Interesting!

I also started to build something similar for us, as an PoC/alternative to Glean. I'm curious how you handle data isolation, where each user has access to just the messages in their own Slack channels, or Jira tickets from only workspaces they have access to? Managing user mapping was also super painful in AWS Q for Business.

prvnsmpthOP a day ago

Thank you!
Currently permissions are handled in the app layer - it's simply a WHERE clause filter that restricts access to only those records that the user has read permissions for in the source. But I plan to upgrade this to use RLS in Postgres eventually.
For Slack specifically, right now the connector only indexes public channels. For private channels, I'm still working on full permission inheritance - capturing all channel members, and giving them read permissions to messages indexed from that channel. It's a bit challenging because channel members can change over time, and you'll have to keep permissions updated in real-time.

keyle a day ago

I've done some RAG using postgres and the vector db extension, look into it if you're doing that type of search; it's certainly simpler than bolting another solution for it.

prvnsmpthOP a day ago

Yeah, Omni uses Postgres and pgvector for search. ParadeDB is essentially just Postgres with the pgsearch extension that brings in Tantivy, a full-text search engine (like Apache Lucene).

Lapalux a day ago

Can it connect to Teams?

patates a day ago

Tangeant: Why is integrating with teams SO difficult?
I started parsing its system logs to create entries in our system automatically to book my times - just not todeal with their silly REST api requirements.
prvnsmpthOP a day ago

Not yet, there’s a Microsoft connector implementation, but it only does Sharepoint, OneDrive, Outlook etc. and I haven’t tested it thoroughly yet. Teams required some special setup to work IIRC, so I skipped it. Will keep it on the roadmap though!

andai a day ago

Nice! Could you elaborate on "not just a basic RAG"?

prvnsmpthOP a day ago

Thank you!
Typical RAG implementations I’ve seen take the user query and directly run it against the full-text search and embedding indexes. This produces sub-par results because the query embedding doesn’t really capture fully what the user is really looking for.
A better solution is to send the user query to the LLM, and let it construct and run queries against the index via tool calling. Nothing too ground-breaking tbh, pretty much every AI search agent does this now. But it produces much better results.
- andai 14 hours ago
  
  I call this ralphgrep

vladdoster a day ago

Multiple pages link to a `API Reference` that returns a 404

prvnsmpthOP a day ago

Oops, sorry! That page is still a WIP, haven't pushed it yet. The plan was to expose the main search and chat APIs so that users can build integrations with third-party messaging apps (e.g. Slack), but haven't gotten around to properly documenting all the APIs yet.

jFriedensreich a day ago

Can we please not change the meaning of chat to mean agent interface? It was painful to see crypto suddenly meaning token instead if cryptography. Plus i really dont want to “chat” with ai. its a textual interface

prvnsmpthOP a day ago

Fair point, although I think we have OpenAI to blame for that - for buying chat.com and pointing it to the most popular textual AI interface of them all :)

octoclaw a day ago

The Postgres-only approach is a really smart call for this scale. I've run pgvector alongside BM25 (via ParadeDB) for internal search at work and it handles mid-size corpora surprisingly well. The operational simplicity of one database vs. managing Elasticsearch + a vector DB + Postgres is a huge win for small teams.

One thing I'd watch out for: HNSW index rebuild times can get painful once you cross ~5M vectors. We ended up doing incremental inserts with a background reindex job. Not a dealbreaker, just something to plan for early.

Also curious how you handle permission syncing. That's usually where self-hosted workplace search gets tricky. Google Drive permissions in particular are a nightmare to mirror accurately.

Settings

Show HN: Omni – Open-source workplace search and chat, built on Postgres

Keyboard Shortcuts