Show HN: Cowork/Codex DOCX plugin. Uses 2x fewer tokens than the docx skill

5 points by tanin 2 months ago · 6 comments · 1 min read

Hi HNers,

I'd like to share our DOCX plugin for Cowork and Codex.

It uses 2-5x fewer tokens than the traditional docx skill because it doesn't write any code or execute Python/Node scripts. It is also much more reliable.

Our DOCX plugin converts docx<->html bidirectionally. This means AI only operates on HTML. AI is excellent and very efficient when it comes to HTML.

Most libraries (if not all) support docx->html, but none supports html->docx. This is what is novel about our approach.

Here's the demo: https://drive.google.com/file/d/1UNlUJYwkNX3NiANDkLLb3UoRSms...

We've been using it in-house for redlining legal documents, and we love it. If you redline docx files, please give it a try: https://github.com/LegalRabbit-AI/legalrabbit-docx-claude-pl...

xms17189 2 months ago

Interesting approach. Does keeping the model in HTML also preserve enough structure for tracked changes/comments, or do you handle those as a separate layer when converting back to DOCX?

taninOP 2 months ago

Thank you!
My thesis is that an intermediate layer would eventually end up being equivalent to the docx format, so I've decided not to have any intermediate representation.
We convert the docx to HTML and send it to the AI. When the AI rewrites the HTML and sends it back, we diff the rewritten HTML against the docx's document.xml and apply the modifications. This is a simplified explanation; there is a bunch of validation and processing going on as well.
For tracked changes and comments, we simply use dedicated tags in the HTML for those things, e.g. <ins>, <del>, <commentRangeStart>, etc.
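To make the diff step concrete, here is a minimal sketch, not the plugin's actual code. It assumes the original and rewritten content have already been reduced to plain text, and it uses Python's standard difflib to mark word-level changes with <ins> and <del>. The function name `redline` is hypothetical; the real plugin diffs against document.xml and has to do much more (validation, formatting, comments).

```python
import difflib
import html


def redline(original: str, rewritten: str) -> str:
    """Return HTML that marks word-level changes with <ins> and <del>.

    Hypothetical helper for illustration only: it compares plain text,
    whereas the plugin described above works against document.xml.
    """
    old_words = original.split()
    new_words = rewritten.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)

    parts = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        old_chunk = html.escape(" ".join(old_words[i1:i2]))
        new_chunk = html.escape(" ".join(new_words[j1:j2]))
        if tag == "equal":
            parts.append(old_chunk)
        if tag in ("delete", "replace"):
            # Words present only in the original become deletions.
            parts.append(f"<del>{old_chunk}</del>")
        if tag in ("insert", "replace"):
            # Words present only in the rewrite become insertions.
            parts.append(f"<ins>{new_chunk}</ins>")
    return " ".join(parts)


if __name__ == "__main__":
    print(redline("The fee is 100 dollars", "The fee is 250 dollars"))
    # The fee is <del>100</del> <ins>250</ins> dollars
```

Diffing on words rather than characters keeps each tracked change readable to a human reviewer, which seems to matter for redlining; a real implementation would presumably also need to map each change back to the right run in the document.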

StahlGuo 2 months ago

I'll try it today. Sounds good.

taninOP 2 months ago

Please let me know if you have questions or suggestions.

dev-kdrainc 2 months ago

Thanks for sharing! I like your approach to working under the hood! Great job

taninOP 2 months ago

Thank you!
