Just to clarify, this is not my Substack; I’m just sharing it because I found it insightful.

The author describes himself as a “fractional CTO” (no clue what that means, don’t ask me) and advisor. His clients asked him how they could leverage AI, so he decided to experience it for himself. From the author (emphasis mine):

I forced myself to use Claude Code exclusively to build a product. Three months. Not a single line of code written by me. I wanted to experience what my clients were considering—100% AI adoption. I needed to know firsthand why that 95% failure rate exists.

I got the product launched. It worked. I was proud of what I’d created. Then came the moment that validated every concern in that MIT study: I needed to make a small change and realized I wasn’t confident I could do it. My own product, built under my direction, and I’d lost confidence in my ability to modify it.

Now when clients ask me about AI adoption, I can tell them exactly what 100% looks like: it looks like failure. Not immediate failure—that’s the trap. Initial metrics look great. You ship faster. You feel productive. Then three months later, you realize nobody actually understands what you’ve built.

  • MangoCats@feddit.it · 3 hours ago
    Yeah, context management is one big key. The “compacting conversation” hack is a good one: you can continue conversations indefinitely, but after each compaction it throws away some context that you thought was valuable.

    The best explanation I have heard for the current limitations is that there is a “context sweet spot” for Opus 4.5 somewhere short of 200,000 tokens. As your context window fills past 100,000 tokens, you reach “optimal understanding” of whatever is in there; as you continue on toward 200,000 tokens, the hallucinations start to increase. As a hack, they “compact the conversation” and throw out less useful tokens, getting you back to the “essential core” of what you were discussing, so you can keep feeding it new prompts and get new responses with a lower hallucination rate. But with that lower hallucination rate also comes lower comprehension of what you said before the compacting event(s).
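
    As a rough sketch of the kind of threshold-triggered compaction being described (not Claude Code’s actual implementation; count_tokens and summarize are hypothetical helpers, and the token numbers are just the ones from this comment):

    ```python
    # Rough sketch of threshold-triggered compaction. Not Claude Code's real
    # logic; count_tokens() and summarize() are hypothetical stand-ins.

    MAX_CONTEXT = 200_000   # window size mentioned above
    COMPACT_AT = 180_000    # trigger a bit before the hard limit (arbitrary margin)

    def maybe_compact(messages, count_tokens, summarize):
        """If the conversation is getting close to the window limit, replace the
        older messages with a lossy summary so new prompts still fit."""
        total = sum(count_tokens(m["content"]) for m in messages)
        if total < COMPACT_AT:
            return messages                    # still fits, keep the full history
        summary = summarize(messages[:-2])     # lossy step: detail is discarded here
        return [{"role": "system",
                 "content": "Summary of earlier conversation: " + summary}] + messages[-2:]
    ```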

    Some describe an aspect of this as the “lost in the middle” phenomenon: the compacting event tends to hang on to the very beginning and very end of the context window more aggressively than the middle, so it is mostly “middle of the window” content that gets dropped.
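
    A toy illustration of that “keep the ends, drop the middle” behavior (again just a sketch with a hypothetical count_tokens helper, not how Claude Code actually decides what to keep):

    ```python
    # Toy "keep the ends, drop the middle" heuristic. Hypothetical sketch only;
    # count_tokens() stands in for a real tokenizer.

    def compact_keep_ends(messages, count_tokens, budget=100_000, head=4, tail=8):
        """Keep the first `head` and last `tail` messages verbatim and drop
        middle messages, oldest first, until the history fits the budget.
        Whatever sat in the middle is what gets 'lost'."""
        kept = list(messages)

        def total():
            return sum(count_tokens(m["content"]) for m in kept)

        while total() > budget and len(kept) > head + tail:
            del kept[head]            # discard the oldest message after the head
        return kept
    ```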