A new tool lets artists add invisible changes to the pixels in their art before they upload it online so that if it’s scraped into an AI training set, it can cause the resulting model to break in chaotic and unpredictable ways.

The tool, called Nightshade, is intended as a way to fight back against AI companies that use artists’ work to train their models without the creator’s permission.
[…]
Zhao’s team also developed Glaze, a tool that allows artists to “mask” their own personal style to prevent it from being scraped by AI companies. It works in a similar way to Nightshade: by changing the pixels of images in subtle ways that are invisible to the human eye but manipulate machine-learning models to interpret the image as something different from what it actually shows.
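
The mechanism both tools describe is an adversarial perturbation: an optimization loop that nudges an image's pixels, within a tight per-pixel budget, until a feature extractor maps the image to a different concept. A minimal sketch of that general idea in PyTorch (the `extractor` model and `target_features` decoy are illustrative assumptions, not the tools' actual code):

```python
import torch

def cloak_image(image, target_features, extractor, eps=0.03, steps=200, lr=0.01):
    """Adversarial-perturbation sketch: push `image`'s features toward a
    decoy concept while keeping every pixel change within +/- eps, so the
    edit stays roughly invisible to humans. Illustrative only."""
    delta = torch.zeros_like(image, requires_grad=True)
    optimizer = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        perturbed = (image + delta).clamp(0.0, 1.0)
        # Distance between what the model "sees" and the decoy concept.
        loss = torch.nn.functional.mse_loss(extractor(perturbed), target_features)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        # Project the perturbation back into the imperceptibility budget.
        delta.data.clamp_(-eps, eps)
    return (image + delta).clamp(0.0, 1.0).detach()
```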

  • Margot Robbie@lemmy.world · 11 months ago

    It’s made by Ben Zhao? You mean the “anti AI plagiarism” UChicago professor who illegally stole GPLv3 code from an open source program called DiffusionBee for his proprietary Glaze software (reddit link), and when pressed, only released the code for the “front end” while still being in violation of the GPL?

    The Glaze tool that promised to be invisible to the naked eye, but contained obvious AI-generated artifacts? The same Glaze that reddit defeated in like a day after release?

    Don’t take anything this grifter says seriously; I’m surprised he hasn’t been suspended for academic integrity violations yet.

    • P03 Locke@lemmy.dbzer0.com · 11 months ago

      who illegally stole GPLv3 code from an open source program called DiffusionBee for his proprietary Glaze software (reddit link), and when pressed, only released the code for the “front end” while still being in violation of GPL?

      Oh, how I wish the FSF had more of their act together nowadays and were more like the EFF or ACLU.

    • 0xD@infosec.pub · 11 months ago

      I don’t see a problem with it training on all materials, fuck copyright. I see the problem in it infringing on everyone’s copyright and then being proprietary, monetized bullshit.

      If it trains on an open dataset, it must be completely and fully open. Everything else is peak capitalism.

    • ElectroVagrant@lemmy.world (OP) · 11 months ago

      Until the law catches up with the technology, people need ways of protecting themselves.

      I agree, and I wonder if the law might be kicked into catching up quicker as more companies try to adopt these tools and inadvertently infringe on other companies’ copyrighted material. 😅

    • 9thSun@midwest.social · 11 months ago

      How is training AI with art on the web different to a person studying art styles? I’d say if the AI is being monetized in some capacity, then sure maybe there should be laws in place. I’m just hard-pressed to believe that anyone can have sole control of anything once it gets on the Internet.

      • realharo@lemm.ee · 11 months ago

        How is training AI with art on the web different to a person studying art styles?

        Human brains clearly work differently than AI, how is this even a question?

        The term “learning” in machine learning is mainly a metaphor.

        Also, laws are written with a practical purpose in mind - they are not some universal, purely philosophical construct and never have been.

        • V H@lemmy.stad.social · 11 months ago

          Human brains clearly work differently than AI, how is this even a question?

          It’s not all that clear that those differences are qualitatively meaningful, but that is irrelevant to the question they asked, so this is entirely a strawman.

          Why does the difference in how AI and the brain learn make training AI with art different to a person studying art styles? Both learn to generalise features in a way that allows them to reproduce them. Both can do so without copying specific source material.

          The term “learning” in machine learning is mainly a metaphor.

          How does the way they learn differ from how humans learn? They generalise. They form “world models” of how information relates. They extrapolate.

          Also, laws are written with a practical purpose in mind - they are not some universal, purely philosophical construct and never have been.

          This is the only uncontroversial part of your answer. The main reason why courts will treat human and AI actions differently is simply that AIs are not human. For the foreseeable future, it will have little to do with whether the processes are similar enough to how humans do it.

          • realharo@lemm.ee · 11 months ago

            Now you’re just cherry picking some surface-level similarities.

            You can see the difference in the process in the results, for example in how some generated pictures will contain something like a signature in the corner, simply because it resembles the training data - even though there is no meaning to it. Or how it is at least possible to get the model to output something extremely close to the training data - https://gizmodo.com/ai-art-generators-ai-copyright-stable-diffusion-1850060656.

            That at least proves that the process is quite different to the process of human learning.

            The question is how much those differences matter, and which similarities you want to focus on.

            Human learning is similar in some ways, but greatly differs in other ways.

            The fact that you’re picking and choosing which similarities matter and which don’t is just your arbitrary choice.

            • V H@lemmy.stad.social · 11 months ago

              You can see the difference in the process in the results, for example in how some generated pictures will contain something like a signature in the corner

              If you were to train human children on an endless series of pictures with signatures in the corner, do you seriously think they’d not emulate signatures in the corner?

              If you think that, you haven’t seen many children’s drawings, because children also often pick up that it’s normal to put something in the corner, despite the fact that, to children, pictures with signatures are a tiny proportion of visual input.

              Or how it is at least possible to get the model to output something extremely close to the training data

              People also mimic. We often explicitly learn to mimic - e.g. I have my son’s art folder right here, full of examples of him being explicitly taught to make direct copies as a means to learn technique.

              We just don’t have very good memory. This is an argument for a difference in ability to retain and reproduce inputs, not an argument for a difference in methods.

              And again, this is a strawman. It doesn’t even begin to try to answer the questions I asked, or the one raised by the person you first responded to.

              That at least proves that the process is quite different to the process of human learning.

              Neither of those really suggests that at all (that diffusion differs from how humans learn to generalise images is likely true, but what you’ve described does not provide even the start of any evidence for it), and again, that is a strawman.

              There was no claim they work the same. The question raised was how the way they’re trained is different from how a human learns styles.

      • rhombus@sh.itjust.works · 11 months ago

        The real issue comes in ownership of the AI models and the vast amount of labor involved in the training data. It’s taking what is probably hundreds of thousands of hours of labor in the form of art and converting it into a proprietary machine, all without compensating the artists involved. Whether you can make a comparison to a human studying art is irrelevant, because a corporation can’t own an artist, but they can own an AI and not have to pay it.

  • TheWiseAlaundo@lemmy.whynotdrs.org · 11 months ago

    Lol… I just read the paper, and Dr Zhao effectively just wrote a research paper on why it’s legally OK to use images to train AI. Hear me out…

    He changes the ‘style’ of input images to corrupt the ability of image generators to mimic them, and even shows that the supermajority of artists can’t tell when this happens with his program, Glaze… Style is explicitly not copyrightable in US case law, so he just provided evidence that the way OpenAI and others use data to generate images is transformative, which would legally mean that it falls under fair use.

    No idea if this would actually get argued in court, but it certainly doesn’t support the idea that these image generators are stealing actual artwork.

    • Flambo@lemmy.world · 11 months ago

      So tl;dr he/his team did two things:

      1. argue the way AI uses content to train is legal
      2. provide artists a tool to prevent their content being used to train AI without their permission

      On the surface it sounds all good, but I can’t help but notice a future conflict of interest for Zhao should Glaze ever become monetized. If it were to be ruled illegal to train AI on content without permission, tools like Glaze would be essentially anti-theft devices, but while it remains legal to train AI this way, tools like Glaze stand to perhaps become necessary for artists to maintain the pre-AI status quo w/r/t how their work can be used and monetized.

  • Blaster M@lemmy.world · 11 months ago

    Oh no, another complicated way to jpeg an image that an AI training program will be able to just detect and discard in a week’s time.
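
    For illustration, the kind of trivial “jpeg it” countermeasure being alluded to is a few lines with Pillow; whether lossy re-encoding actually strips Nightshade’s perturbations is untested here, but it does discard the high-frequency detail pixel-level cloaks tend to live in:

    ```python
    from io import BytesIO

    from PIL import Image

    def jpeg_roundtrip(path, quality=75):
        """Re-encode an image through lossy JPEG, smearing out subtle
        pixel-level perturbations. Illustrative sketch only."""
        img = Image.open(path).convert("RGB")
        buf = BytesIO()
        img.save(buf, format="JPEG", quality=quality)
        buf.seek(0)
        return Image.open(buf)
    ```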

    • V H@lemmy.stad.social · 11 months ago

      They don’t even need to detect them - once they are common enough in training datasets, the training process will “just” learn that the noise they introduce is not a feature relevant to the desired output. If there are enough images like that, it might eventually generate images with the same noise features.

  • lloram239@feddit.de · 11 months ago

    “New snake oil to give artists a false sense of security” - The last of these tools I tried had absolutely zero effect on the AI, which is not exactly surprising given that there are hundreds of different ways to make use of image data, as well as lots of completely different models. You’ll never cover all that with some pixel twisting.

  • Uriel238 [all pronouns]@lemmy.blahaj.zone · 11 months ago

    I remember in the early 2010s reading an article like this one on openai.com talking about the dangers of using AI for image search engines to moderate against unwanted content. At the time the concern was CSAM salted to prevent its detection (along with other content salted with CSAM to generate false positives).

    My guess is that since we’re still training AI with pools of data-entry people who tag pictures with what they appear to be, the AI won’t read much more into images than its human trainers do (the proverbial man inside the Mechanical Turk).

    This is going to be an interesting technology war.

  • afraid_of_zombies@lemmy.world · 11 months ago

    I am waiting for the day that some obsessed person starts finding ways to do something like code injection in pictures.

  • gregorum@lemm.ee · 11 months ago

    Ooo, this is fascinating. It reminds me of that weird face paint that bugs out facial-recognition in CCTV cameras.

    • seaQueue@lemmy.world · 11 months ago

      Or the patterned vinyl wraps they use on test cars that interfere with autofocus in most cameras.

  • wizardbeard@lemmy.dbzer0.com · 11 months ago

    This is already a concept in the AI world and is often used while a model is being trained specifically to make it better. I believe it’s called adversarial training or something like that.
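
    For reference, a minimal sketch of adversarial training (here with FGSM-perturbed batches in PyTorch; the model, optimizer, and data names are placeholder assumptions):

    ```python
    import torch
    import torch.nn.functional as F

    def fgsm_example(model, x, y, eps=0.03):
        """Fast gradient sign method: nudge inputs in the direction that
        most increases the loss, producing an adversarial example."""
        x = x.clone().requires_grad_(True)
        F.cross_entropy(model(x), y).backward()
        return (x + eps * x.grad.sign()).clamp(0.0, 1.0).detach()

    def adversarial_step(model, optimizer, x, y, eps=0.03):
        """One training step on clean plus adversarial inputs, teaching
        the model to shrug off small hostile perturbations."""
        x_adv = fgsm_example(model, x, y, eps)
        optimizer.zero_grad()  # clear grads accumulated while crafting x_adv
        loss = F.cross_entropy(model(x), y) + F.cross_entropy(model(x_adv), y)
        loss.backward()
        optimizer.step()
        return loss.item()
    ```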

    • FaceDeer@kbin.social · 11 months ago

      There are trivial workarounds for Glaze, which this is based on, so I wouldn’t be surprised.

    • Meowoem@sh.itjust.works · 11 months ago

      It doesn’t even need a workaround; it’s not going to affect anything when training a model.

      It might make style transfer harder when using them as reference images on some models, but even that’s fairly doubtful - it’s just noise on an image, and everything is already full of all sorts of different types of noise.

    • Hannes@lemm.ee · 11 months ago

      The problem is identifying it. If it’s necessary to preprocess every image used for training instead of just feeding it into the model, that already makes training much more resource-intensive.
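
      As a purely illustrative sketch of that cost: even a crude screening pass, like flagging images with unusually high high-frequency residual, means fully decoding and filtering every image before training (the heuristic itself is hypothetical; reliable cloak detection is an open problem):

      ```python
      import numpy as np
      from PIL import Image, ImageFilter

      def high_freq_score(path):
          """Energy remaining after subtracting a blurred copy - a crude
          proxy for pixel-level noise. Hypothetical heuristic only."""
          gray = Image.open(path).convert("L")
          blurred = gray.filter(ImageFilter.GaussianBlur(radius=2))
          residual = np.asarray(gray, np.float32) - np.asarray(blurred, np.float32)
          return float(np.sqrt((residual ** 2).mean()))

      # Screening millions of images means one extra decode-and-filter
      # pass per image before any training even starts.
      ```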

    • ElectroVagrant@lemmy.world (OP) · 11 months ago

      I generally don’t believe in intellectual property, I think it creates artificial scarcity and limits creativity. Of course the real tragedies in this field have to do with medicine and other serious business.

      But still, artists claiming ownership of their style of painting is fundamentally no different. Why can’t I paint in your style? Do you really own it? Are you suggesting you didn’t base your idea mostly on the work of others, and that no one in turn can take your idea, be inspired by it, and do with it as they please? Do my means have to be a pencil - why can’t my means be a computer, why not an algorithm?

      Limitations, limitations, limitations. We need to reform our system and make the public domain the standard for ideas (in all their forms). Society doesn’t treat artists properly, I am well aware of that. Generally creative minds are often troubled because they fall outside norms. There are many tragic examples. Also money-wise many artists don’t get enough credit for their contributions to society, but making every idea a restricted area is not the solution.

      People should support the artists they like on a voluntary basis. Pirate the album but go to concerts, pirate the artwork but donate to the artist. And if that doesn’t make you enough money, that’s very unfortunate. But make no mistake: that’s how almost all artists live. Only the top 0.something% actually make enough money by selling their work, and that’s usually the percentile that’s best at marketing their art - in other words: it’s usually the industry. The others already depend upon donations or other sources of income.

      We can surely keep art alive while still removing all these artificial limitations; copying is not, was not, and never will be in any way similar to stealing. Let freedom rule. Join your local pirate party.

      Reformatted for easier readability.