juergen@feddit.org to Technology@lemmy.worldEnglish · 7 months ago

OpenAI's move to allow generating "Ghibli style" images isn't just a cute PR stunt. It is an expression of dominance and the will to reject and refuse democratic values. It is a display of power

tante.cc

Vulgar Display of Power
Hayao Miyazaki is the co-founder of Studio Ghibli, a Japanese animation studio known worldwide for their stunning, emotional, beautiful stories and movies. At the core of Studio Ghibli’s work is a deep engagement with questions of humanity. About what it means to be a human, about how to care for one another and the world […]
  • Terrasque@infosec.pub · 7 months ago · ↑2 ↓5

    “OpenAI is so lagging behind in terms of image generation it is comical at this point.”

    You’re the one lagging behind. OpenAI’s new image model is on a different level, way ahead of the competition.

    • YarHarSuperstar@lemmy.world · 7 months ago · ↑7

      How so?

      • Terrasque@infosec.pub · 7 months ago (edited) · ↑4 ↓5

        • Autoregressive model
        • Multimodal with the LLM
        • Can keep consistency between images
        • Much better at text rendering
        • Can combine images, like you have one image and you upload a picture of a jacket and say “put this on him” and it does it
        • Can upload a picture of yourself and say “put me on the beach”, and then for example if you don’t like it you can tell it to do a different type of beach, and then say “and put me on a white horse and give me some nice beach wear” for example.

        It understands what you’re telling it, and can generate images from vague descriptions, combine things from different images just by telling it, modify it and understand the context - like knowing that “me” is the person in the image, for example.

        Edit: From OpenAI - “4o image generation is an autoregressive model natively embedded within ChatGPT”

        • YarHarSuperstar@lemmy.world · 7 months ago · ↑4

          Okay, so how does that compare to whatever competition you’re referencing?

          • Terrasque@infosec.pub · 7 months ago (edited) · ↑2 ↓1

            No other model on the market can do anything like that. The closest is diffusion-based, where you could train a LoRA on a person’s look or a specific piece of clothing, then generate multiple times and/or use ControlNet to sort of control the output. That’s easily hours or days of work, plus it’s quite technical to set up and use.

            OpenAI’s new model is a paradigm shift in both what the model can do and how you use it, and it can easily and effortlessly produce things that were extremely difficult or impossible without complicated procedures and post-processing in Photoshop.

            Edit: Some examples. Try to make any of this in any of the existing image generators.

            • https://www.reddit.com/r/ChatGPT/comments/1jl36h6/gpt_was_also_able_to_help_me_make_a_comic_ive/
            • https://www.reddit.com/r/ChatGPT/comments/1jkl5m2/i_work_in_ecommerce_the_new_gpt_image_update_has/
            • https://www.reddit.com/r/ChatGPT/comments/1jlewya/by_god_what_have_i_done/
            • https://www.reddit.com/r/ChatGPT/comments/1jm8ddg/im_not_the_first_to_figure_this_trick_out_am_i/
            • https://www.reddit.com/r/ChatGPT/comments/1jjsfkb/starting_today_gpt4o_is_going_to_be_incredibly/
            • https://www.reddit.com/r/ChatGPT/comments/1jn2kpy/i_created_a_character_with_chatgpt_and_send_her/
            • https://www.reddit.com/r/ChatGPT/comments/1jkaaxh/gpt4o_image_generation_is_absolutely_insane/
            • FauxLiving@lemmy.world · 7 months ago · ↑4

              All diffusion and language models are autoregressive. That just means that the output is fed back in as input until the task is complete.
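The feedback loop described here (output fed back in as input until the task is complete) can be sketched roughly as below. This is only a toy illustration of that loop, not any real model's API: `generate`, `toy_step`, and the `BIGRAMS` table are hypothetical names made up for this example, and a real language model would replace the lookup table with a learned next-token predictor.

```python
from typing import Callable, Dict, List

# Hypothetical toy "model": maps the last token to the next one.
BIGRAMS: Dict[str, str] = {
    "put": "me",
    "me": "on",
    "on": "the",
    "the": "beach",
    "beach": "<eos>",
}


def toy_step(seq: List[str]) -> str:
    """Predict the next token from the sequence so far (stand-in for a real model)."""
    return BIGRAMS.get(seq[-1], "<eos>")


def generate(
    step: Callable[[List[str]], str],
    prompt: List[str],
    stop: str = "<eos>",
    max_steps: int = 20,
) -> List[str]:
    """Feed the growing output back in as input until a stop token or step limit."""
    seq = list(prompt)
    for _ in range(max_steps):
        nxt = step(seq)  # the model sees everything generated so far
        if nxt == stop:
            break
        seq.append(nxt)  # the output becomes part of the next input
    return seq


result = generate(toy_step, ["put"])
print(result)
```

The `max_steps` limit plays the role of a fixed stopping point in case the stop token never appears.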

              With diffusion models this means that it is fed an image that is 100% noise, it removes some small percentage of the noise, and then the denoised image is fed back in and another small percentage is removed. This is repeated until a defined stopping point (usually a set number of passes).
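The pass-by-pass denoising loop described above can be sketched as follows. This is a toy under loud assumptions: `denoise_step` is a hypothetical stand-in that simply nudges each pixel toward a known `target`, whereas a real diffusion model uses a trained network to estimate the noise and never sees the final image in advance. The names, the 1-D "image", and the `fraction` and `num_passes` values are all made up for illustration.

```python
import random
from typing import List


def denoise_step(image: List[float], target: List[float], fraction: float) -> List[float]:
    """Toy denoiser: remove a small fraction of the remaining noise per pass.

    Assumption: a real model would predict the noise with a neural network
    instead of being handed the target.
    """
    return [p + fraction * (t - p) for p, t in zip(image, target)]


def generate_image(
    target: List[float],
    num_passes: int = 50,
    fraction: float = 0.1,
    seed: int = 0,
) -> List[float]:
    rng = random.Random(seed)
    # Start from an "image" that is 100% noise.
    image = [rng.gauss(0.0, 1.0) for _ in target]
    # Feed the partially denoised image back in for a set number of passes.
    for _ in range(num_passes):
        image = denoise_step(image, target, fraction)
    return image


target = [0.0, 0.5, 1.0, 0.5]  # hypothetical 4-pixel "clean image"
result = generate_image(target)
error = max(abs(p - t) for p, t in zip(result, target))
print(result, error)
```

With these numbers, each pass leaves 90% of the remaining noise, so after 50 passes well under 1% of the original noise is left.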

              Combining images and using one image to control the generation of another has been available for quite a while. ControlNet and IP-Adapters let you do exactly that: ‘Put this coat on this person’ or ‘Take this picture and do it in this style’. Here’s an 11-month-old YouTube video explaining how to do this using open source models and software: https://www.youtube.com/watch?v=gmwZGC8UVHE

              It’s nice for non-technical people that OpenAI will sell you a subscription to access an agent that can perform these kinds of image generation tasks, but it’s not doing anything new in terms of image generation.

              • Terrasque@infosec.pub · 7 months ago · ↑1 ↓1

                I know them, and have used them a bit. I even mentioned them in an earlier comment. The capabilities of OpenAI’s new model are on a different level in my experience.

                https://www.reddit.com/r/StableDiffusion/comments/1jlj8me/4o_vs_flux/ - read the comments there. That’s a community dedicated to running local diffusion models. They’re familiar with all the tricks. They’re pretty damn impressed too.

                I can’t help but feel that people here either haven’t tried the new OpenAI image model, or have never actually used any of the existing AI image generators before.

                • ZeroOne@lemmy.world · 7 months ago · ↑1

                  I cannot take you seriously with all those Reddit comments.

                  But then why am I even surprised, you shill for proprietary AI.

                  • Terrasque@infosec.pub · 7 months ago · ↑1 ↓1

                    Ah yes, I forgot we live in a post-truth society where reality doesn’t matter and only your feelings are important. And since your feelings say AI bad, proprietary bad, and Reddit bad, you don’t have to actually think or take reality into consideration.

        • Ilixtze@lemm.ee · 7 months ago · ↑1

          It is really sad that the most advanced model can only aspire to make derivative shit for techbro losers.

        • mad_djinn@lemmy.world · 7 months ago · ↑3 ↓3

          you know enough about the model for me to immediately distrust your opinion on the matter. why don’t you head back to ycombinator or whatever hole you crawled out of
