Just want to clarify, this is not my Substack, I’m just sharing this because I found it insightful.

The author describes himself as a “fractional CTO”(no clue what that means, don’t ask me) and advisor. His clients asked him how they could leverage AI. He decided to experience it for himself. From the author(emphasis mine):

I forced myself to use Claude Code exclusively to build a product. Three months. Not a single line of code written by me. I wanted to experience what my clients were considering—100% AI adoption. I needed to know firsthand why that 95% failure rate exists.

I got the product launched. It worked. I was proud of what I’d created. Then came the moment that validated every concern in that MIT study: I needed to make a small change and realized I wasn’t confident I could do it. My own product, built under my direction, and I’d lost confidence in my ability to modify it.

Now when clients ask me about AI adoption, I can tell them exactly what 100% looks like: it looks like failure. Not immediate failure—that’s the trap. Initial metrics look great. You ship faster. You feel productive. Then three months later, you realize nobody actually understands what you’ve built.

    • theneverfox@pawb.social
      link
      fedilink
      English
      arrow-up
      20
      ·
      1 day ago

      AI isn’t good at changing code, or really even understanding it… It’s good at writing it, ideally 50-250 lines at a time

      • MangoCats@feddit.it
        link
        fedilink
        English
        arrow-up
        1
        arrow-down
        1
        ·
        7 hours ago

        It’s good at writing it, ideally 50-250 lines at a time

        I find Claude Sonnet 4.5 to be good up to 800 lines at a chunk. If you structure your project into 800ish line chunks with well defined interfaces you can get 8 to 10 chunks working cooperatively pretty easily. Beyond about 2000 lines in a chunk, if it’s not well defined, yeah - the hallucinations start to become seriously problematic.

        The new Opus 4.5 may have a higher complexity limit, I haven’t really worked with it enough to characterize… I do find Opus 4.5 to get much slower than Sonnet 4.5 was for similar problems.

      • Evotech@lemmy.world
        link
        fedilink
        English
        arrow-up
        8
        arrow-down
        2
        ·
        edit-2
        1 day ago

        I’m just not following the mindset of “get ai to code your whole program” and then have real people maintain it? Sounds counter productive

        I think you need to make your code for an Ai to maintain. Use Static code analysers like SonarQube to ensure that the code is maintainable (cognitive complexity)!and that functions are small and well defined as you write it.

        • theneverfox@pawb.social
          link
          fedilink
          English
          arrow-up
          8
          ·
          19 hours ago

          I don’t think we should be having the AI write the program in the first place. I think we’re barreling towards a place where remotely complicated software becomes a lost technology

          I don’t mind if AI helps here and there, I certainly use it. But it’s not good at custom fit solutions, and the world currently runs on custom fit solutions

          AI is like no code solutions. Yeah, it’s powerful, easier to learn and you can do a lot with it… But eventually you will hit a limit. You’ll need to do something the system can’t do, or something you can’t make the system do because no one properly understands what you’ve built

          At the end of the day, coding is a skill. If no one is building the required experience to work with complex systems, we’re going to be swimming in a world of endless ocean of vibe coded legacy apps in a decade

          I just don’t buy that AI will be able to take something like a set of State regulations and build a complaint outcome. Most of our base digital infrastructure is like that, or it uses obscure ancient systems that LLMs are basically allergic to working with

          To me, we’re risking everything on achieving AGI (and using it responsibly) before we run out of skilled workers, and we’re several game changing breakthroughs from achieving that

          • MangoCats@feddit.it
            link
            fedilink
            English
            arrow-up
            1
            arrow-down
            1
            ·
            7 hours ago

            I think we’re barreling towards a place where remotely complicated software becomes a lost technology

            I think complicated software has been an art more than a science, for the past 30 years we have been developing formal processes to make it more of a procedural pursuit but the art is still very much in there.

            I think if AI authored software is going to reach any level of valuable complexity, it’s going to get there with the best of our current formal processes plus some more that are being (rapidly) developed specifically for LLM based tools.

            But eventually you will hit a limit. You’ll need to do something…

            And how do we surpass those limits? Generally: research. And for the past 20+ years where do we do most of that research? On the internet. And where were the LLMs trained, and what are they relatively good at doing quickly? Internet research.

            At the end of the day, coding is a skill. If no one is building the required experience to work with complex systems

            So is semiconductor design, application of transistors to implement logic gates, etc. We still have people who can do that, not very many, but enough. Not many people work in assembly language anymore, either…

      • lepinkainen@lemmy.world
        link
        fedilink
        English
        arrow-up
        4
        arrow-down
        13
        ·
        1 day ago

        I’ve made full-ass changes on existing codebases with Claude

        It’s a skill you can learn, pretty close to how you’d work with actual humans

        • MangoCats@feddit.it
          link
          fedilink
          English
          arrow-up
          1
          ·
          7 hours ago

          pretty close to how you’d work with actual humans

          That has been my experience as well. It’s like working with humans who have extremely fast splinter skills, things they can rip through in 10 minutes that might take you days, weeks even. But then it also takes 5-10 minutes to do some things that you might accomplish in 20 seconds. And, like people, it’s not 100% reliable or accurate, so you need to use all those same processes we have developed to help people catch their mistakes.

        • TheBlackLounge@lemmy.zip
          link
          fedilink
          English
          arrow-up
          5
          ·
          1 day ago

          What full ass changes have you made that can’t be done better with a refactoring tool?

          I believe Claude will accept the task. I’ve been fixing edge cases in a vibe colleague’s full-ass change all month. Would have taken less time to just do it right the first time.

          • MangoCats@feddit.it
            link
            fedilink
            English
            arrow-up
            1
            ·
            7 hours ago

            True that LLMs will accept almost any task, whether they should or not. True that their solutions aren’t 100% perfect every time. Whether it’s faster to use them or not I think depends a lot on what’s being done, and what alternative set of developers you’re comparing them with.

            What I have seen across the past year is that the number of cases where LLM based coding tools are faster than traditional developers has been increasing, rather dramatically. I called them near useless this time last year.

    • BarneyPiccolo@lemmy.today
      link
      fedilink
      English
      arrow-up
      11
      ·
      1 day ago

      I don’t know shit about anything, but it seems to me that the AI already thought it gave you the best answer, so going back to the problem for a proper answer is probably not going to work. But I’d try it anyway, because what do you have to lose?

      Unless it gets pissed off at being questioned, and destroys the world. I’ve seen more than few movies about that.

      • MangoCats@feddit.it
        link
        fedilink
        English
        arrow-up
        3
        ·
        7 hours ago

        AI already thought it gave you the best answer, so going back to the problem for a proper answer is probably not going to work.

        There’s an LLM concept/parameter called “temperature” that determines basically how random the answer is.

        As deployed, LLMs like Claude Sonnet or Opus have a temperature that won’t give the same answer every time, and when you combine this with feedback loops that point out failures (like compliers that tell the LLM when its code doesn’t compile), the LLM can (and does) the old Beckett: try, fail, try again, fail again, fail better next time - and usually reach a solution that passes all the tests it is aware of.

        The problem is: with a context window limit of 200,000 tokens, it’s not going to be aware of all the relevant tests in more complex cases.

      • Evotech@lemmy.world
        link
        fedilink
        English
        arrow-up
        6
        ·
        edit-2
        1 day ago

        You are in a way correct. If you keep sending the context of the “conversation” (in the same chat) it will reinforce its previous implementation.

        The way ais remember stuff is that you just give it the entire thread of context together with your new question. It’s all just text in text out.

        But once you start a new conversation (meaning you don’t give any previous chat history) it’s essentially a “new” ai which didn’t know anything about your project.

        This will have a new random seed and if you ask that to look for mistakes etc it will happily tell you that the last Implementation was all wrong and here’s how to fix it.

        It’s like a minecraft world, same seed will get you the same map every time. So with AIs it’s the same thing ish. start a new conversation or ask a different model (gpt, Google, Claude etc) and it will do things in a new way.

        • TheBlackLounge@lemmy.zip
          link
          fedilink
          English
          arrow-up
          10
          ·
          1 day ago

          Doesn’t work. Any semi complex problem with multiple constraints and your team of AIs keeps running circles. Very frustrating if you know it can be done. But what if you’re a “fractional CTO” and you get actually contradictory constraints? We haven’t gotten yet to AIs who will tell you that what you ask is impossible.

          • MangoCats@feddit.it
            link
            fedilink
            English
            arrow-up
            1
            ·
            6 hours ago

            your team of AIs keeps running circles

            Depending on your team of human developers (and managers), they will do the same thing. Granted, most LLMs have a rather extreme sycophancy problem, but humans often do the same.

            We haven’t gotten yet to AIs who will tell you that what you ask is impossible.

            If it’s a problem like under or over-constrained geometry or equations, they (the better ones) will tell you. For difficult programing tasks I have definitely had the AIs bark up all the wrong trees trying to fix something until I gave them specific direction for where to look for a fix (very much like my experiences with some human developers over the years.)

            I had a specific task that I was developing in one model, and it was a hard problem but I was making progress and could see the solution was near, then I switched to a different model which did come back and tell me “this is impossible, you’re doing it wrong, you must give up this approach” up until I showed it the results I had achieved to-date with the other model, then that same model which told me it was impossible helped me finish the job completely and correctly. A lot like people.

          • Evotech@lemmy.world
            link
            fedilink
            English
            arrow-up
            3
            ·
            18 hours ago

            Yeah right now you have to know what’s possible and nudge the ai in the right direction to use the correct approach according to you if you want it to do things in an optimized way

        • BarneyPiccolo@lemmy.today
          link
          fedilink
          English
          arrow-up
          1
          arrow-down
          2
          ·
          1 day ago

          Maybe the solution is to keep sending the code through various AI requests, until it either gets polished up, or gains sentience, and destroys the world. 50-50 chance.

          This stuff ALWAYS ends up destroying the world on TV.

          Seriously, everybody is complaining about the quality of AI product, but the whole point is for this stuff to keep learning and improving. At this stage, we’re expecting a kindergartener to product the work of a Harvard professor. Obviously, were going to be disappointed.

          But give that kindergartener time to learn and get better, and they’ll end up a Harvard professor, too. AI may just need time to grow up.

          And frankly, that’s my biggest worry. If it can eventually start producing results that are equal or better than most humans, then the Sociopathic Oligarchs won’t need worker humans around, wasting money that could be in their bank accounts.

          And we know what their solution to that problem will be.

          • MangoCats@feddit.it
            link
            fedilink
            English
            arrow-up
            2
            ·
            6 hours ago

            This stuff ALWAYS ends up destroying the world on TV.

            TV is also full of infinite free energy sources. In the real world warp drive may be possible, you just need to annihilate the mass of Jupiter with an equivalent mass of antimatter to get the energy necessary to create a warp bubble to move a small ship from the orbit of Pluto to a location a few light years away, but on TV they do it every week.