Show HN: I built an LLM comment detector for HN (I got banned)

4 points by umairnadeem123 3 months ago · 6 comments · 2 min read

Got banned from HN a few days ago for LLM posting. I honestly deserved it: 100+ comments in a few days is just abusive.

I have RSI so I use voice and LLM to type. Dictate my thoughts, model shapes the sentences. I got lazy about where the line was and automated too much.

After getting unbanned I went through all the comments dang has flagged for LLM posting over the years (https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...) and looked for patterns. Some are obvious, some surprised me:

- curly/typographic quotes (“ ” instead of " ") or even ’ vs ' (that’s with a curly apostrophe reads as LLM, that's with a straight one reads as human)

- humans typing in a browser text box produce straight ASCII. finding curly quotes in a plain HN comment means the text was generated elsewhere and pasted in

- exactly 3 paragraphs of 1-2 sentences each - extremely common LLM output shape

- examples always come in threes - "for example, X, Y, and Z"

- → arrows and — em dashes (sometimes swapped for – en dashes or plain - hyphens to evade detection)

- overly sycophantic openers - "great point", "this is really interesting" before saying anything

- fake personal framing - "in practice I've found..." immediately followed by a generic claim
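A minimal sketch of how the cheaper surface checks in the list above could be expressed in Python. This is not the detector's actual code: the function name `surface_signals`, the signal names, and the opener list are all made up for illustration, and the "examples in threes" and "fake personal framing" patterns are left out because they need more than string matching.

```python
import re

# Hypothetical constants: typographic quotes, plus the two openers quoted above.
CURLY_QUOTES = "\u201c\u201d\u2018\u2019"
ARROWS_AND_EM_DASHES = "\u2192\u2014"
SYCOPHANTIC_OPENERS = ("great point", "this is really interesting")


def sentence_count(paragraph: str) -> int:
    """Rough sentence count: split after ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return len([p for p in parts if p])


def surface_signals(text: str) -> dict[str, bool]:
    """Return which of the simple surface signals fire for one comment."""
    paragraphs = [p for p in re.split(r"\n\s*\n", text.strip()) if p.strip()]
    return {
        "curly_quotes": any(ch in text for ch in CURLY_QUOTES),
        "arrows_or_em_dashes": any(ch in text for ch in ARROWS_AND_EM_DASHES),
        "three_short_paragraphs": len(paragraphs) == 3
        and all(1 <= sentence_count(p) <= 2 for p in paragraphs),
        "sycophantic_opener": text.strip().lower().startswith(SYCOPHANTIC_OPENERS),
    }
```

Each check is a weak signal on its own (plenty of humans paste text from an editor that curls quotes), so a real scorer would presumably weight and combine them rather than treat any single one as proof.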

Built a detector around these + some heavier signals (TF-IDF cosine similarity across a user's comment history, optional Anthropic/OpenAI LLM pass). You can paste any HN comment URL/ID or just raw text and see which signals fire.
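For the TF-IDF part, here is a rough standard-library sketch of what "cosine similarity across a user's comment history" could look like. The post does not say how the real detector tokenizes, weights, or aggregates, so the smoothed IDF formula, the mean-of-all-pairs aggregation, and every function name here are assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; a deliberately simple, assumed tokenizer."""
    return re.findall(r"[a-z0-9']+", text.lower())


def tfidf_vectors(comments):
    """Build one sparse TF-IDF vector (a dict) per comment."""
    docs = [Counter(tokenize(c)) for c in comments]
    n = len(docs)
    doc_freq = Counter(term for doc in docs for term in doc)
    # Smoothed IDF, so a term that appears in every comment keeps some weight.
    idf = {t: math.log((1 + n) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return [{t: tf * idf[t] for t, tf in doc.items()} for doc in docs]


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_pairwise_similarity(comments):
    """Average cosine similarity over all pairs of a user's comments."""
    vecs = tfidf_vectors(comments)
    pairs = [(i, j) for i in range(len(vecs)) for j in range(i + 1, len(vecs))]
    if not pairs:
        return 0.0
    return sum(cosine(vecs[i], vecs[j]) for i, j in pairs) / len(pairs)
```

The idea, under these assumptions, is that a history full of near-identical phrasing scores close to 1, while comments with no shared vocabulary score 0. Where to put the threshold is a tuning question this sketch does not answer.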

I ran my own banned comments through it. They score 70-85. Sounds about right.

https://hn-bot-detector.vercel.app/

gh: https://github.com/umairnadeem/hn-bot-detector

I wrote this post myself btw

cd4761 3 months ago

Curly quotes as a signal is clever — it's not about the content being "AI-like", it's about the text never having touched a browser text box. Simple but hard to fake.

umairnadeem123OP 3 months ago

something deeply ironic about this.
- cd4761 3 months ago
  
  Ha — I'll take that as a compliment on my prose style.
  - umairnadeem123OP 3 months ago
    
    honestly the best way to beat my detector is to just write worse

Aeglaecia 3 months ago

can't believe i never noticed the quote thing before. tis a mere matter of time before these tropes become redundant, but for the moment they are very useful to memorize. thank you for sharing.

i tried using your website. it mostly works but it broke a little bit:

Anthropic parse error: ```json { "score": 15, "reasons": [ "Strong personal voice with natural argumentative tone and direct confrontat

umairnadeem123OP 3 months ago

ah yes. looks like it was going over the max token limit. if you retry, it should work again!
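A sketch of one way to handle that failure more gracefully, assuming the model reply is JSON wrapped in a Markdown code fence (as the error above suggests) that can get cut off mid-string. `parse_model_json` is a hypothetical helper, not something from the repo.

```python
import json

FENCE = "`" * 3  # the Markdown code-fence marker, built here to avoid writing it literally


def parse_model_json(raw):
    """Return (data, error). Strips an optional json code fence before parsing."""
    body = raw.strip()
    if body.startswith(FENCE):
        body = body[len(FENCE):]
        if body.startswith("json"):
            body = body[len("json"):]
    if body.endswith(FENCE):
        body = body[:-len(FENCE)]
    try:
        return json.loads(body), None
    except json.JSONDecodeError as exc:
        # A reply cut off by the token limit usually ends up here.
        return None, f"unparseable model output (possibly truncated): {exc.msg}"
```

With something like this, a truncated reply could trigger a retry or a fallback to the heuristic score instead of surfacing a raw parse error to the user.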
