Just to clarify: this is not my Substack. I’m just sharing it because I found it insightful.

The author describes himself as a “fractional CTO” (no clue what that means, don’t ask me) and advisor. His clients asked him how they could leverage AI, so he decided to experience it for himself. From the author (emphasis mine):

I forced myself to use Claude Code exclusively to build a product. Three months. Not a single line of code written by me. I wanted to experience what my clients were considering—100% AI adoption. I needed to know firsthand why that 95% failure rate exists.

I got the product launched. It worked. I was proud of what I’d created. Then came the moment that validated every concern in that MIT study: I needed to make a small change and realized I wasn’t confident I could do it. My own product, built under my direction, and I’d lost confidence in my ability to modify it.

Now when clients ask me about AI adoption, I can tell them exactly what 100% looks like: it looks like failure. Not immediate failure—that’s the trap. Initial metrics look great. You ship faster. You feel productive. Then three months later, you realize nobody actually understands what you’ve built.

  • phed@lemmy.ml · 19 hours ago
    I do a lot with AI but it is not good enough to replace humans, not even close. It repeats the same mistakes after you tell it no, and it doesn’t remember things from 3 messages ago when it should. You have to keep re-explaining the goal to it. It’s wholly incompetent. And yeah, when you have it do stuff you aren’t familiar with or didn’t create yourself, definitely - I either have it write a commentary, or I take the time right then to ask it what x or y does and then add a comment.

    • kahnclusions@lemmy.ca · 8 hours ago

      Even worse, the ones I’ve evaluated (like Claude) constantly fail to even compile because, for example, they mix usages of different SDK versions. When instructed to use version 3 of some package, it will add the right version as a dependency but then still write code against missing or deprecated APIs from the previous version that are obviously unavailable.
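
      The kind of mismatch I mean looks roughly like this - the package name and APIs below are made up purely for illustration, not from any real SDK:

      ```ts
      // Hypothetical package "acme-sdk": the prompt asked for v3, and the agent
      // duly added  "acme-sdk": "^3.0.0"  to package.json...
      import { createClient } from "acme-sdk";

      const client = createClient({ apiKey: process.env.ACME_KEY ?? "" });

      // ...but the generated call site still uses a v2-era method that was
      // removed in v3, so the project doesn't even type-check:
      //   error TS2339: Property 'fetchAll' does not exist on type 'Client'.
      const widgets = await client.fetchAll("widgets");

      // What a human reading the v3 changelog would have written instead
      // (again, hypothetical):
      // const widgets = await client.resources("widgets").list();
      ```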

      More time (and money, and electricity) is wasted trying to prompt it towards correct code than simply writing it yourself, and at the end of the day you have a smoking turd that no one even understands.

      LLMs are a dead end.

      • MangoCats@feddit.it · 8 hours ago

        constantly fail to even compile because, for example, they mix usages of different SDK versions

        Try an agentic tool like Claude Code - it closes the loop by testing the compilation for you and fixing its mistakes (like human programmers do) before bothering you for another prompt. I was where you are now 6 months ago; the tools have improved dramatically since then.
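
        Roughly, “closing the loop” means something like the sketch below - this is not Claude Code’s actual implementation, just the general shape of a compile-and-retry loop, with callModel standing in for whatever model API the agent uses:

        ```ts
        import { spawnSync } from "node:child_process";
        import { writeFileSync } from "node:fs";

        // Stand-in for the real model call an agent would make.
        async function callModel(prompt: string): Promise<string> {
          throw new Error("stub - wire up an actual model API here");
        }

        // Generate code, compile it, and feed compiler errors straight back to
        // the model instead of surfacing every failed attempt to the user.
        async function closeTheLoop(file: string, task: string): Promise<void> {
          let code = await callModel(task);
          for (let attempt = 1; attempt <= 5; attempt++) {
            writeFileSync(file, code);
            // Type-check only; tsc reports errors on stdout with a non-zero exit code.
            const result = spawnSync("npx", ["tsc", "--noEmit", file], { encoding: "utf8" });
            if (result.status === 0) {
              console.log(`compiles after ${attempt} attempt(s)`);
              return;
            }
            code = await callModel(
              `Your last attempt failed to compile:\n${result.stdout}\nFix the errors and return the complete corrected file.`
            );
          }
          console.warn("still not compiling after 5 attempts - handing back to the human");
        }
        ```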

        From TFS:

        I needed to make a small change and realized I wasn’t confident I could do it. My own product, built under my direction, and I’d lost confidence in my ability to modify it.

        That sounds like a “fractional CTO problem” to me (IMO a fractional CTO is a guy who convinces several small companies that he’s a brilliant tech genius who will help them make their important tech decisions without actually paying full-time attention to any of them. Actual tech experience: optional.)

        If you have lost confidence in your ability to modify your own creation, that’s not a tools problem - you are the tool, that’s a you problem. It doesn’t matter whether you’re using an LLM coding tool, a team of human developers, or a pack of monkeys to code your applications: if you don’t document and test and formally develop an “understanding” of your product that not only you but all stakeholders can grasp to the extent they need to, you’re just letting the development run wild - you’re lacking formal software development process maturity. LLMs can do that faster than a pack of monkeys, or a bunch of kids you hired off Craigslist, but it’s the exact same problem no matter how you slice it.

        • III@lemmy.world · 5 hours ago

          The LLM comparison to a team of human developers is a great example. But like outsourcing your development, an LLM is less a tool and more just delegation. And yes, you can dig in deep to understand all the stuff the LLM was delegated to do, the same as you can get deeply involved with a human development team to maintain an understanding. But most of the time the sell is that you can save time - which means you aren’t expecting to micromanage your development team.

          It is a fractional CTO problem, but the actual issue is that developers are effectively being asked to become fractional CTOs by using LLMs, because they are measured against expected productivity increases that leave little time for understanding.

          • Upgrayedd1776@sh.itjust.works · 5 hours ago

            That’s an interesting take - developers being expected to also become fractional CTOs. There is probably a larger-than-estimated knowledge and experience gap there, and unless you have the knack for managing people, you’ll probably run into more problems than you’re used to as just a code jockey.

          • MangoCats@feddit.it · 5 hours ago

            the sell is that you can save time

            How do you know when salespeople (and lawyers) are lying? It’s only when their lips are moving.

            developers are effectively being asked to become fractional CTOs by using LLMs, because they are measured against expected productivity increases that leave little time for understanding.

            That’s the kind of thing that works out in the end. Like outsourcing to Asia, etc. - it does work for some cases, and it can bring sustainable improvements to the bottom line, but nowhere near as quickly, easily, or cheaply as the people selling it say.

    • Echo Dot@feddit.uk · 17 hours ago

      There’s no point telling it not to do x, because as soon as you mention x it goes into its context window.

      It has no filter. It’s as if you had no choice in your actions and had to act on every thought that came into your head: if you were told not to do a thing, you would immediately start thinking about doing it.

      • kahnclusions@lemmy.ca · 8 hours ago

        I’ve noticed this too, it’s hilarious(ly bad).

        Especially with image generation, which we were using to make some quick avatars for a D&D game. “Draw a picture of an elf.” Generates images of elves that all have one weird earring. “Draw a picture of an elf without an earring.” Great, now the elves have even more earrings.

        • MangoCats@feddit.it · 8 hours ago

          I find this kind of performance varies from one model to the next. I have definitely experienced the bad-image-getting-worse phenomenon - especially with MS Copilot - but different models will perform differently.

      • MangoCats@feddit.it · 8 hours ago

        There’s no point telling it not to do x, because as soon as you mention x it goes into its context window.

        Reminds me of the Sonny Bono high-speed downhill skiing problem: don’t fixate on that tree - if you fixate on the tree you’re going to hit the tree; fixate on the open space to the side of it.

        LLMs do “understand” words like “not” and “don’t”, but they also seem to work better with positive examples than negative ones.
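
        For example - made-up prompts, just to illustrate the framing of describing what you want rather than what you don’t:

        ```ts
        // Negative framing: "earring" still lands in the context and tends to
        // leak into the output.
        const negativePrompt = "Draw a picture of an elf without an earring.";

        // Positive framing: name only the features you actually want to see.
        const positivePrompt = "Draw a picture of an elf with plain, bare ears.";
        ```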