Just want to clarify, this is not my Substack, I’m just sharing this because I found it insightful.

The author describes himself as a “fractional CTO” (no clue what that means, don’t ask me) and advisor. His clients asked him how they could leverage AI. He decided to experience it for himself. From the author (emphasis mine):

I forced myself to use Claude Code exclusively to build a product. Three months. Not a single line of code written by me. I wanted to experience what my clients were considering—100% AI adoption. I needed to know firsthand why that 95% failure rate exists.

I got the product launched. It worked. I was proud of what I’d created. Then came the moment that validated every concern in that MIT study: I needed to make a small change and realized I wasn’t confident I could do it. My own product, built under my direction, and I’d lost confidence in my ability to modify it.

Now when clients ask me about AI adoption, I can tell them exactly what 100% looks like: it looks like failure. Not immediate failure—that’s the trap. Initial metrics look great. You ship faster. You feel productive. Then three months later, you realize nobody actually understands what you’ve built.

  • utopiah@lemmy.world
    1 day ago

    FWIW that’s a good question but IMHO the better question is:

    What kind of small things have you vibed out that you needed that didn’t actually exist, or at least that you couldn’t find after a 5min search on open source forges like Codeberg, GitLab, GitHub, etc?

    Because making something quick that kind of works is nice… but why even do so in the first place if it’s already out there, maybe maintained but at least tested?

    • MangoCats@feddit.it
      6 hours ago

      making something quick that kind of works is nice… but why even do so in the first place if it’s already out there, maybe maintained but at least tested?

      In a sense, this is what LLMs are doing for you: regurgitating stuff that’s already out there. But… they are “bright” enough to remix the various bits into custom solutions. So there might already be a NWS API access app example, and a Waveshare display example, and so on, but there’s not a specific example that codes up a local weather display for the time period and parameters you want to see (like, temperature and precipitation every 15 minutes for the next 12 hours at a specific location) on the particular display you have. Oh, and would you rather build that in C++ instead of Python? Yeah, LLMs are actually pretty good at remixing little stuff like that into things you’re not going to find exact examples of ready to your spec.

    • ipkpjersi@lemmy.ml
      7 hours ago

      I built a MAL clone using AI, nearly 700 commits of AI. Obviously I was responsible for the quality of the output and reviewing and testing that it all works as expected, and leading it in the right direction when going down the wrong path, but it wrote all of the code for me.

      There are other MAL clones out there, but none of them do everything I wanted, so that’s why I built my own project. It started off as an inside joke with a friend, and eventually materialized as an actual production-ready project. It’s limited more by design of the fact that it relies on database imports and delta edits rather than the fact that it was written by AI, because that’s just the nature of how data for these types of things tend to work.

    • Victor@lemmy.world
      1 day ago

      Since you put such emphasis on “better”: I’d still like to have an answer to the one I posed.

      Yours would be a reasonable follow-up question if we noticed that their vibed projects are utilities already available in the ecosystem. 👍

      • utopiah@lemmy.world
        1 day ago

        Sure, you’re right, I just worry (maybe needlessly) about people re-inventing the wheel because it’s “easier” than searching, without properly understanding the cost of the entire process.

        • MangoCats@feddit.it
          5 hours ago

          people re-inventing the wheel because it’s “easier” than searching, without properly understanding the cost of the entire process.

          A good LLM will do a web search first and copy its answer from there…

          • MalMen@masto.pt
            5 hours ago

            @MangoCats @utopiah exactly this… I did some small stuff out of pastoring LLMs, but first I searched for what I need. Usually I find a small repo that kind of does what I want, then I clone it, change it a bit with the help of an LLM, and if I think it is useful I open a PR and let the maintainer decide if it’s good or not.

      • utopiah@lemmy.world
        1 day ago

        Open an issue to explain why it’s not enough for you? If you can make a PR for it that actually implements the things you need, do it?

        My point isn’t to say everything is already out there and perfectly fits your need, only that a LOT is already out there. If we all re-invent the wheel in our own corner, it’s basically impossible to learn from each other.

        • MangoCats@feddit.it
          5 hours ago

          There have been some articles published positing that AI coding tools spell the end for FOSS because everybody is just going to do stuff independently and won’t need to share with each other anymore to get things done.

          I think those articles are short sighted, and missing the real phenomenon that the FOSS community needs each other now more than ever in order to tame the LLMs into being able to write stories more interesting than “See Spot run.” and the equivalent in software projects.

        • ipkpjersi@lemmy.ml
          7 hours ago

          And if the maintainer doesn’t agree to merge your changes, what do you do then?

          You have to build your own project, where you get to decide what gets added and what doesn’t.

        • lepinkainen@lemmy.world
          1 day ago

          These are the principles I follow:

          https://indieweb.org/make_what_you_need

          https://indieweb.org/use_what_you_make

          I don’t have time to argue with FOSS creators to get my stuff in their projects, nor do I have the energy to maintain a personal fork of someone else’s work.

          It’s much faster for me to start up Claude and code a very bespoke system just for my needs.

          I don’t like web UIs nor do I want to run stuff in a Docker container. I just want a scriptable CLI application.

          Like I just did a subtitle translation tool in 2-3 nights that produces much better quality than any of the ready made solutions I found on GitHub. One of which was an *arr stack web monstrosity and the other was a GUI application.

          Neither did what I needed at the level of quality I want, so I made my own. One I can automate like I want and have running on my own server.

          • MangoCats@feddit.it
            5 hours ago

            I don’t have time to argue with FOSS creators to get my stuff in their projects

            So much this. Over the years I have found various issues in FOSS and “done the right thing” submitting patches formatted just so into their own peculiar tracking systems according to all their own peculiar style and traditions, only to have the patches rejected for all kinds of arbitrary reasons - to which I say: “fine, I don’t really want our commercial competitors to have this anyway, I was just trying to be a good citizen in the community. I’ve done my part, you just go on publishing buggy junk - that’s fine.”

          • mjr@infosec.pub
            22 hours ago

            So the claim is it’s easier to Claudge a whole new app than to make a personal fork of one that works? Sounds unlikely.

              • mjr@infosec.pub
                5 hours ago

                Yeah, that’s fair. In a minority of cases, with a certain app and needs to modify it to do your task, it may be true. Still rare.

                • MangoCats@feddit.it
                  5 hours ago

                  I don’t know how rare it is today. What I do know is that it’s less rare today than it was 3 months ago, and 3 months ago it was less rare than it was 3 months before that…

    • jj4211@lemmy.world
      1 day ago

      So if it can be vibe coded, it’s pretty much certainly already a “thing”, but with some awkwardness.

      Maybe what you need is a combination of two utilities, maybe the interface is very awkward for your use case, maybe you have to make a tiny compromise because it doesn’t quite match.

      Maybe you want a little utility to do stuff with media. Now you could navigate your way through ffmpeg and mkvextract, which together handle what you want, with some scripting to keep you from having to remember the specific way to do things in the myriad of stuff those utilities do. An LLM could probably knock that script out for you quickly without you having to delve too deeply into the documentation for the projects.

      • MangoCats@feddit.it
        5 hours ago

        I’ll put it this way: LLMs have been getting pretty good at translation over the past 20 years. Sure, human translators still look down their noses at “automated translations” but, in the real world, an automated translation gets the job done well enough most of the time.

        LLMs are also pretty good at translating code, say from C++ to Rust. Not million line code bases, but the little concepts they can do pretty well.

        On a completely different tack, I’ve been pretty happy with LLM generated parsers. Like: I’ve got 1000 log files here, and I want to know how many times these lines appear. You’ve got grep for that. But, write me a utility that finds all occurrences of these lines, reads the time stamps, and then searches for any occurrences of these other lines within +/- 1 minute of the first ones… grep can’t really do that, but a 5 minute vibe coded parser can.
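For illustration, here is a minimal sketch of the kind of parser described above. The log format (a `YYYY-MM-DD HH:MM:SS` timestamp at the start of each line) and the names `parse_line` and `correlate` are assumptions made for the example, not anything from the thread.

```python
from datetime import datetime, timedelta

# Assumed (hypothetical) log format: "2024-01-01 12:00:00 message text"
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"
TIMESTAMP_LENGTH = 19


def parse_line(line):
    """Return (timestamp, message), or None if the line has no leading timestamp."""
    try:
        stamp = datetime.strptime(line[:TIMESTAMP_LENGTH], TIMESTAMP_FORMAT)
    except ValueError:
        return None
    return stamp, line[TIMESTAMP_LENGTH:].strip()


def correlate(lines, trigger, follow_up, window=timedelta(minutes=1)):
    """Pair each line containing `trigger` with the lines containing
    `follow_up` whose timestamps fall within +/- `window` of it."""
    parsed = [p for p in map(parse_line, lines) if p is not None]
    triggers = [(t, m) for t, m in parsed if trigger in m]
    others = [(t, m) for t, m in parsed if follow_up in m]
    results = []
    for t_time, t_msg in triggers:
        nearby = [m for o_time, m in others if abs(o_time - t_time) <= window]
        results.append((t_time, t_msg, nearby))
    return results
```

To run it over many files, read each file's lines and pass them to `correlate`. Lines without a parseable timestamp are skipped rather than treated as errors.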

        • jj4211@lemmy.world
          1 day ago

          It’s certainly a use case that LLM has a decent shot at.

          Of course, having said that I gave it a spin with Gemini 3 and it just hallucinated a bunch of crap that doesn’t exist instead of properly identifying capable libraries or frontending media tools…

          But in principle and upon occasion it can take care of little convenience utilities/functions like that. I continue to have no idea though why some people seem to claim to be able to ‘vibe code’ up anything of significance, even as I thought I was giving it an easy hit it completely screwed it up…

          • MangoCats@feddit.it
            5 hours ago

            I tried using Gemini 3 for OpenSCAD, and it couldn’t slice a solid properly to save its life, I gave up on it after about 6 attempts to put a 3:12 slope shed roof on four walls. Same job in Opus 4.5 and I’ve got a very nicely styled 600 square foot floor plan with radiused 3D concrete printed walls, windows, doors, shed roof with 1’ overhang, and a python script that translates the .scad to a good looking .svg 2D floorplan.

            I’m sure Gemini 3 is good for other things, but Opus 4.5 makes it look infantile in 3D modeling.

          • PoliteDudeInTheMood@lemmy.ca
            1 day ago

            Having used both Gemini and Claude… I use Gemini when I need to quickly find something I don’t want to waste time searching for, or I need a recipe found and then modified to fit what I have on hand.

            Every time I’ve used Gemini for coding, it has ended in failure. It constantly forgets things, forgets what version of a package you’re using so it tells you to do something that is deprecated; it was hell. I had to hold its hand the entire time and talk to it like it’s a stupid child.

            Claude just works. I use Claude for so many things both chat and API. I didn’t care for AI until I tried Claude. There’s a whole whack of novels by a Russian author I like but they stopped translating the series. Claude vibe coded an app to read the Russian ebooks, translate them by chapter in a way that prevented context bleed. I can read any book in any language for about $2.50 in API tokens.
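The commenter's app isn't shown, so the following is only a sketch of one way chapter-by-chapter translation could be structured. Each chapter goes to the translator in its own independent call, so nothing from one chapter leaks into the next. The heading pattern and the `translate_chapter` callable (a stand-in for whatever API call does the translation) are assumptions.

```python
import re

# Assumption: a chapter starts on a line beginning with "Глава" or "Chapter".
CHAPTER_HEADING = re.compile(r"^(?:Глава|Chapter)\s", re.MULTILINE)


def split_chapters(text):
    """Split a book into chapters, keeping any front matter as its own piece."""
    starts = [m.start() for m in CHAPTER_HEADING.finditer(text)]
    if not starts:
        return [text.strip()]
    bounds = starts + [len(text)]
    chapters = [text[a:b].strip() for a, b in zip(bounds, bounds[1:])]
    front = text[:starts[0]].strip()
    return ([front] if front else []) + chapters


def translate_book(text, translate_chapter):
    """Translate each chapter separately; `translate_chapter` is a
    caller-supplied function that receives only one chapter's text."""
    return [translate_chapter(chapter) for chapter in split_chapters(text)]
```

Passing the translator in as a function keeps the splitting logic testable without any network calls or API keys.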

            • jj4211@lemmy.world
              1 day ago

              I’ve been using Claude to mediocre results, so this time I used Gemini 3 because everyone in my company is screaming “this time it works, trust us bro”. Claude has not been working so great for me for my day job either.

              • PoliteDudeInTheMood@lemmy.ca
                20 hours ago

                I think it really depends on the user and how you communicate with the AI. People are different, and we communicate differently. But if you’re precise and you tell it what you want, and what your expected result should be it’s pretty good at filling in the blanks.

                I can pull really useful code out of Claude, but ask me to think up a prompt to feed into Gemini for video creation and they look like shit.

                • jj4211@lemmy.world
                  17 hours ago

                  The type of problem, in my experience, is the biggest source of different results.

                  Ask for something that is consistent with very well trodden territory, and it has a good shot. However, if you go off the beaten path and it really can’t credibly generate code, it generates anyway, making up function names, file paths, REST URLs and attributes, and whatever else would sound good and consistent with the prompt, but with no connection to real stuff.

                  It’s usually not that it does the wrong thing because it “misunderstood”; it usually produces very appropriate-looking code consistent with the request that has no link to reality, and there’s no recognition of when it has invented a nonexistent thing.
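One cheap guard against invented names, sketched here as an assumption rather than anything the commenter describes, is to check that each dotted name used by generated Python code actually resolves before trusting it. The helper names `resolve` and `exists` are hypothetical.

```python
import importlib

_MISSING = object()


def resolve(dotted_name):
    """Return the object named by e.g. 'os.path.join', or _MISSING.

    Note: this imports the module, which executes its top-level code,
    so only use it on packages you already trust.
    """
    parts = dotted_name.split(".")
    # Find the longest importable module prefix, then walk the attributes.
    for i in range(len(parts), 0, -1):
        try:
            obj = importlib.import_module(".".join(parts[:i]))
        except ImportError:
            continue
        for attr in parts[i:]:
            obj = getattr(obj, attr, _MISSING)
            if obj is _MISSING:
                return _MISSING
        return obj
    return _MISSING


def exists(dotted_name):
    """True if the dotted name resolves to a real module attribute."""
    return resolve(dotted_name) is not _MISSING
```

This only confirms that a name exists. It says nothing about whether the function takes the arguments the generated code passes or does what the code assumes.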

                  If it’s a fairly milquetoast web UI manipulating a SQL backend, it tends to chew through that more reasonably (though in various results that I’ve tried it screwed up a fundamental security principle; once I saw it suggest a weird custom certificate validation and disable default validation while transmitting sensitive data before trying to meaningfully execute the custom validation).
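For contrast with the certificate handling described above, here is a minimal sketch in Python's standard `ssl` module that leaves default validation switched on and only adds trust for an extra CA when one is supplied. The function name `make_client_context` and the `extra_ca_file` parameter are assumptions for illustration.

```python
import ssl


def make_client_context(extra_ca_file=None):
    """Build a TLS client context that keeps default verification enabled."""
    # create_default_context() turns on certificate verification and
    # hostname checking; nothing here switches either of them off.
    context = ssl.create_default_context()
    if extra_ca_file is not None:
        # Trust an additional CA (for example an internal one) on top of
        # the system defaults, instead of replacing validation.
        context.load_verify_locations(cafile=extra_ca_file)
    return context
```

Any custom check then runs in addition to the default validation, so sensitive data is never sent over a connection that has not been verified.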