• tal@lemmy.today · 47 points · 1 day ago (edited)

    Not the position Dell is taking, but I’ve been skeptical that building AI hardware directly into laptops specifically is a great idea unless people have a very concrete goal, like text-to-speech, and existing models (probably specialized ones) to run on it. This is not to diminish AI compute elsewhere.

    Several reasons.

    • Models for many useful things have been getting larger, and you have a bounded amount of memory in those laptops, which, at the moment, generally can’t be upgraded (though maybe CAMM2 will improve the situation and move us back away from soldered memory). Historically, most users did not upgrade memory in their laptop even if they could. Just throwing the compute hardware in there in the expectation that models will come is a bet that the models people might want to use won’t get a whole lot larger. This is especially true for the next year or two, since we expect high memory prices, which will probably price people out of putting very large amounts of memory in laptops.

    • Heat and power. The laptop form factor exists to be portable. They are not great at dissipating heat, and unless they’re plugged into wall power, they have sharp constraints on how much power they can usefully use.

    • The parallel compute field is rapidly evolving. People are probably not going to throw out and replace their laptops on a regular basis to keep up with AI stuff (much as laptop vendors might be enthusiastic about this).

    I think that a more-likely outcome, if people want local, generalized AI stuff on laptops, is that someone sells an eGPU-like box that plugs into wall power and connects to the laptop over USB or some wireless protocol, and the laptop uses it as an AI accelerator. That box can be replaced or upgraded independently of the laptop itself.

    When I do generative AI stuff on my laptop, for the applications I use, the bandwidth that I need to the compute box is very low, and latency requirements are very relaxed. I presently remotely use a Framework Desktop as a compute box, and can happily generate images or text or whatever over the cell network without problems. If I really wanted disconnected operation, I’d haul the box along with me.
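
    As an illustration of how thin that link can be, here’s a minimal sketch of what that kind of remote generation could look like, assuming the compute box runs something like llama.cpp’s server or Ollama exposing an OpenAI-compatible endpoint (the hostname, port, and model name here are placeholders, not anything I’m actually running):

        # Minimal sketch: generate text against a remote compute box over an
        # OpenAI-compatible HTTP API. Host, port, and model name are hypothetical.
        import json
        import urllib.request

        COMPUTE_BOX = "http://compute-box.local:8080"  # hypothetical address of the box

        payload = {
            "model": "some-local-model",  # whatever model the box has loaded
            "messages": [{"role": "user", "content": "Summarize this paragraph for me."}],
            "max_tokens": 256,
        }

        req = urllib.request.Request(
            f"{COMPUTE_BOX}/v1/chat/completions",
            data=json.dumps(payload).encode(),
            headers={"Content-Type": "application/json"},
        )

        with urllib.request.urlopen(req) as resp:
            reply = json.load(resp)

        # The request and response are a few kilobytes each; the heavy lifting
        # happens on the box, which is why even a cell connection is plenty.
        print(reply["choices"][0]["message"]["content"])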

    EDIT: I’d also add that all of this is also true for smartphones, which have the same constraints, and harder limitations on heat, power, and space. You can hook one up to an AI accelerator box via wired or wireless link if you want local compute, but it’s going to be much more difficult to deal with the limitations inherent to the phone form factor and do a lot of compute on the phone itself.

    EDIT2: If you use a high-bandwidth link to such a local, external box, there’s a bonus: you also potentially get substantially increased and upgradeable graphics capability on the laptop or smartphone if you can use the box as an eGPU, a case where having low-latency compute available actually is quite useful.

    • cmnybo@discuss.tchncs.de · 16 points · 1 day ago

      There are a number of NPUs that plug into an m.2 slot. If those aren’t powerful enough, you can just use an eGPU.
      I would rather not have to pay for an NPU that I’m probably not going to use.

    • Euphoma@lemmy.ml · 2 points · 17 hours ago (edited)

      Phones have already come with AI processors for a long time, specifically for speech recognition and camera features. It’s not advertised because it’s from before the bubble started.

    • Goodeye8@piefed.social · 5 points · 21 hours ago

      I’m not that concerned with the hardware limitations. Nobody is going to run a full-blown LLM on their laptop; running one on a desktop would already require building a PC with AI in mind. What you’re going to see being used locally are smaller models (something like 7B using INT8 or INT4). Factor in the efficiency of an NPU and you could get by with 16GB of memory (especially if the models are run in INT4) with little extra power draw and heat. The only hardware concern would be how fast NPU technology advances, but just don’t be an early adopter and you’ll probably be fine.
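
      To put rough numbers on that (back-of-the-envelope math, ignoring activations and runtime overhead):

          # Rough sizing for a "7B" model at different quantization levels.
          params = 7e9                   # 7 billion weights

          int8_gb = params * 1.0 / 1e9   # ~1 byte per weight at INT8
          int4_gb = params * 0.5 / 1e9   # ~0.5 bytes per weight at INT4

          print(f"INT8: ~{int8_gb:.1f} GB, INT4: ~{int4_gb:.1f} GB")
          # -> INT8: ~7.0 GB, INT4: ~3.5 GB, either of which fits in 16GB of RAM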

      But this is where Dell’s point comes in. Why should the consumer care? What benefit do consumers get from running a model locally? Outside of privacy and security reasons, you’re simply going to get a better result by using one of the online AI services, because you’d be using a proper model instead of the cheap one that runs on limited hardware. And even the privacy and security minded can just build their own AI server (maybe not today, but when hardware prices get back to normal) that they run from home and then expose to their laptop or smartphone. For consumers to want to run a local model (actually locally, and not in a selfhosting kind of way) there would have to be some problem that the local model solves and the over-the-internet solution can’t. No such problem exists today, and there doesn’t seem to be a suitable one on the horizon either.

      Dell is keeping their foot in the door by still putting NPUs in their laptops, so if by some miracle some magical problem is found that AI solves, they’re ready. But they realize that NPUs are not something they can actually use as a selling point, because as it stands NPUs solve no problems and there’s no benefit to running small models locally.

      • jj4211@lemmy.world · 6 points · 20 hours ago

        More to the point, the casual consumer isn’t going to dig into the nitty-gritty of running models locally, and not a single major player is eager to help them do it (they all want to lock users into their datacenters and subscription opportunities).

        On Dell keeping NPUs in their laptops: they don’t really have much of a choice if they want modern processors; Intel and AMD are still all-in on it.

        • Goodeye8@piefed.social · 2 points · 19 hours ago

          Setting up a local model was specifically about people who take privacy and security seriously, because that often requires sacrificing convenience, which in this case means having to build a suitable server and learning the know-how needed to set up your own local model. Casual consumers don’t really think about privacy, so they’re going to go with the most convenient option, which is whatever service the major players provide.

          As for Dell keeping the NPUs, I forgot they’re going to be bundled with the processors.

          • jj4211@lemmy.world · 2 points · 18 hours ago

            My general point is that discussing the intricacies of potential local AI model usage is way over the head of the people who would even in theory care about the facile “AI PC” marketing message. Since no one is making it trivial for the casual user to actually do anything with those NPUs, it’s all a moot point for this sort of marketing. Even if there were an enthusiast market that would use those embedded NPUs without a distinct, more capable infrastructure, they wouldn’t be swayed or satisfied by just ‘AI PC’ or ‘Copilot+’; they’d want to know specs rather than a boolean yes/no for ‘AI’.

    • Nighed@feddit.uk · 11 points (1 downvote) · 1 day ago

      I think part of the idea is: build it and they will come… If 10% of users have NPUs, then apps will find ‘useful’ ways to use them.

      Part of it is actually battery life: if you assume that over the life of the laptop it will be doing AI tasks (unlikely currently), an NPU will be wayyyy more efficient than running them on the CPU, or even a GPU.

      Mostly though, it’s because it’s an excuse to charge more for the laptop. If all the high-end players add NPUs, then customers have no choice but to shell out more. Most customers won’t realise that when they use ChatGPT or Copilot on one of these laptops, it’s still not running on their device.