Docs directories are doomed

16 points by lubujackson 2 months ago · 24 comments

I feel like the article either doesn't really contain any information, or is describing the concept of "code comments" after being translated through 35 languages.

wizzwizz4 2 months ago

While it feels like these sorts of observations are shouting into the void, I think they do have a cumulative impact. It's probably another 5 years until the hype's over, but that'll be 5 years of the voices of reason saying "you've reinvented X" for each X of pre-AI best practice.
I admit, I am somewhat excited to see what's actually left, after the hype has gone away. Because there might actually be something. LLMs can only contribute to projects where there's a severe deficiency, or there's enough of a specification that a heuristic-guided fuzzer could do the same job quicker. LLMs are worse at translation than much smaller seq2seq transformer models. LLM apparent writing ability is mainly attributable to plagiarism and the LLMentalist effect. LLM apparent sentience is mainly attributable to the ELIZA effect. But once you strip away all the hype, will we be left with a pearl, or just bits of dead clam?

holliplex 2 months ago

I would describe myself as pretty AI-positive in software engineering, and even in technical writing, but something about seeing diagrams that are clearly generated by Nano Banana Pro immediately makes me stop reading. Weird!

iamcalledrob 2 months ago

Same.
I think I've unintentionally trained myself to notice (and tune out) both AI illustrations and AI writing.
At a deep instinctual level, knowing that someone hasn't spent much time or effort creating the content makes me not want to reciprocate with time or effort.
I've realised that my brain literally tunes out AI illustrations, much as it does with ad banners.
Perhaps because they're so easy to generate, I encounter illustrations more often, so they're no longer a signal of quality.
bartvk 2 months ago

That's not the only thing clearly generated. "Some looming issues", "some thorny issues", it's full of these weird AI sayings. The whole thing feels weirdly written.
- lelanthran 2 months ago
  
  > The whole thing feels weirdly written
  I keep repeating this: AI-written prose lives in an uncanny valley, clearly grammatically correct but still weirdly off.
  Why do we think that the AI generated code is any better?
  I described AI-generated code as feeling very "alien" to me, but I'm not sure that that is the correct term.
  - holliplex 2 months ago
    
    I think it's mostly just that we are very good at picking up on patterns, and it's extremely noticeable that half the internet has started writing in the same voice with the same tics. If Claude were quietly posting away in 2017 I don't think anyone would think twice about its output.
  - Ferret7446 2 months ago
    
    We're well past the point where humans can reliably identify AI generated content. Sure, you might often correctly identify AI content, but part of that is due to how much AI content there is; you can call everything AI generated and still have a high ratio of correctness. Meanwhile, I guarantee there's a lot of AI content that you're failing to notice.
    Rather than using the AI bogeyman, why not analyze things as-is? If it's good or bad, does it matter if it's AI or human? Or are you in denial about some existential fear?
outime 2 months ago

To me, the something in this case is the mangled text and the weird "lighting" in some of the icons. Not the worst I've seen but it definitely puts you off.
BoredPositron 2 months ago

Depends. Making an ASCII diagram, or one in cali, and adding flair with Nano seems fine: you do the logic, Nano adds the flavor.
Traubenfuchs 2 months ago

It's all wonky with "hand drawn letters" and slightly off with low fidelity and repetitive usage of graphical primitives.
Low quality trash that is offensive to be given to read because the author didn't actually give enough shits to spend a few minutes creating the graphics by hand.
I don't want to work with people like "Jim Yagmin", people that consider this kind of output acceptable. This immediately makes me expect sub par "good enough" work with no attention to detail. Just slop it at the wall and see what sticks!

rokkamokka 2 months ago

The blog reads like an advertisement for some product that doesn't exist (yet?). It seems hard to store, access, and accurately update this context-to-code mapping.

CharlieDigital 2 months ago

Here's an easier solution that actually works, gives any agent FREE long term memory (platform agnostic and zero infrastructure!), always accurate context that is self-maintained by the LLM.

Use the idiomatic comments for your language.

Here is a snippet of our prompt for C# (and a similar one for TS):

    - Use idiomatic C# code comments when writing public methods and properties
    - `<summary>` concise description of the method or property
    - `<remarks>` the "why"; provide domain or business context, chain of thought, and reasoning; mention related methods, types, and files
    - `<param>` document any known constraints on inputs, special handling, etc.
    - `<returns>` note the expected return value
    - `<example>` provide a brief example of correct usage
    - Use inline comments sparingly where it adds clarity to complex code
    - Update comments as you modify the code; ensure they are consistent with the intent of the code

What happens: when the LLM stumbles upon this code in the future, it reads the comments and basically "re-hydrates" some past state into context. The `<remarks>` one is doing heavy lifting here because it is asked to provide its train of thought and mention related artifacts (future LLM knows where else to look).

You already know the agents are going to read your code again when they gather context, so just leave the instructions and comments inline.

The LLM is very good at keeping these up-to-date on refactors (we are still doing human code reviews) and a bonus is that it makes it very easy to review the code to see why the LLM generated some function or property because the reasoning is right there for the human as well.
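For illustration, here is roughly what a comment written under a prompt like that might look like on the TS side, using TSDoc-style tags. This is only a sketch: the `Order` type, the `computeDiscountCents` function, and the business rule in `@remarks` are all hypothetical and not taken from any real codebase.

```typescript
/** Hypothetical order shape, used only for this illustration. */
interface Order {
  subtotalCents: number;
  isWholesale: boolean;
}

/**
 * Computes the discount, in cents, for an order.
 *
 * @remarks
 * Why: (hypothetical rule) wholesale customers get a flat 10% discount and
 * retail customers get none. The result is rounded down so the discount can
 * never exceed what was agreed. Amounts are integers in cents to avoid
 * floating-point drift. Related: the `Order` interface above.
 *
 * @param order - `subtotalCents` must be a non-negative integer.
 * @returns The discount in cents; 0 for non-wholesale orders.
 *
 * @example
 * computeDiscountCents({ subtotalCents: 10000, isWholesale: true })
 */
function computeDiscountCents(order: Order): number {
  // Enforce the constraint documented under @param.
  if (!Number.isInteger(order.subtotalCents) || order.subtotalCents < 0) {
    throw new RangeError("subtotalCents must be a non-negative integer");
  }
  if (!order.isWholesale) {
    return 0;
  }
  // 10% of the subtotal, rounded down to whole cents.
  return Math.floor(order.subtotalCents / 10);
}
```

The `@remarks` block carries the "why" and the pointers to related code, which is the part a later reader (human or agent) cannot recover from the function body alone.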

qntmfred 2 months ago

I wish IDEs had better support for quickly toggling the display of comments. You make a good point about it making sense for the docs to live alongside the code; I just don't usually want to see giant comment blocks everywhere while I'm operating in a codebase.
- CharlieDigital 2 months ago
  
  JetBrains Rider has this, as does VS Code: CMD+K CMD+/ to fold, CMD+K CMD+J to unfold.

soapdog 2 months ago

Well, maybe I want my docs folder to be useful for humans checking my code and don't care about LLMs at all.

CrzyLngPwd 2 months ago

Another day, another person telling us how we should change what we have been doing for decades, just to make sure the machine can drop thousands of lines of debt that we'll never review, or worse yet, that a machine will review.

I guess the hope is that the middle managers will finally be able to get rid of the annoying techies, this time, as has been the promise for decades.

Maybe these LLMs are the silver bullet to finally free us so we can dance, paint, write poetry, and fuck instead of working.

Not that I consider writing code to be work, since it's always been the easy bit for me, but yeah, just as the machines have taken music, art, poetry, etc, why not let them take everything we enjoy.

PS - You'll prise copilot in vsc from my cold dead fingers :-)

chewbacha 2 months ago

Common AI blog posts that seem generated:

1. State problem created by AI

2. Provide simple solution

3. State it cannot work and AI won’t help

4. Describe another way to solve for AI with more work

This feels like at least the third blog post I’ve read that follows this pattern and has the hallmarks of generated text.

People are playing the LLM slot machine for engagement blogs.

postit 2 months ago

I'm of the opinion that detailed development docs (ADRs, specs …) should live alongside each package, and not only for the sake of AI agents.

High-level and user docs go in /docs.

CharlieDigital 2 months ago

The problem with this approach is that it is expensive in context: even for a one-sentence change, the LLM has to read whole docs into context.
Better solution I mentioned above: inline, idiomatic code comments for your language and have the LLM dump reasoning into the `<remarks>`, `@remarks`, etc. of the comment block.
Now you get free, always up-to-date, platform-agnostic, zero-infrastructure long-term memory that works with any agent, because every agent is forced to read it when it reads the method. It can never be missed the way secondary docs can.
It saves context because instead of reading a 2000 token document for 100 tokens of relevant context, it just reads the comments for the specific method and hydrates long term memory just-in-time with almost certain activation rate without additional prompting.
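To make the "read only the comment for the specific method" idea concrete, here is a minimal sketch. It assumes plain `function name(` declarations preceded by a `/** ... */` block; `docCommentFor` and the `sample` source are hypothetical helpers written for this illustration, not part of any agent or tool. A real agent simply sees the comment because it sits next to the code it is already reading.

```typescript
/**
 * Returns the doc comment directly above `function <functionName>(`,
 * or null if the function is missing or has no doc comment attached.
 * Illustrative only: this is string matching, not a real parser.
 */
function docCommentFor(source: string, functionName: string): string | null {
  const declIndex = source.indexOf(`function ${functionName}(`);
  if (declIndex === -1) return null;

  const before = source.slice(0, declIndex);
  const end = before.lastIndexOf("*/");
  if (end === -1) return null;

  // The comment only counts if nothing but whitespace (or `export`)
  // sits between it and the declaration.
  const gap = before.slice(end + 2).trim();
  if (gap !== "" && gap !== "export") return null;

  const start = before.lastIndexOf("/**", end);
  if (start === -1) return null;
  return before.slice(start, end + 2);
}

// Hypothetical source file contents.
const sample = `
/**
 * Adds two numbers.
 * @remarks Kept pure so callers can memoize it. (Hypothetical rationale.)
 */
function add(a: number, b: number): number { return a + b; }

function undocumented(): void {}
`;

const comment = docCommentFor(sample, "add");
// Only this comment needs to enter context, not a separate document.
console.log(comment !== null ? comment.length : 0, "of", sample.length, "characters");
```

Character counts stand in for tokens here, since counting tokens would require a tokenizer; the point is only that the relevant slice is much smaller than the whole file or a separate doc.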

rurban 2 months ago

No. If you cannot keep docs in sync, let the agent keep them in sync. Better to keep the docs in the repo, not in a wiki or elsewhere.

munchler 2 months ago

I can’t tell if “table of context” is a clever new phrase or a typo.

soulofmischief 2 months ago

The constant changing of tense in this article makes it very hard to read.
