Beta testing Stad.social

@vidarh@stad.social

  • 9 Posts
  • 31 Comments
Joined 9 months ago
Cake day: October 1st, 2023

  • You can see the difference in the process in the results, for example in how some generated pictures will contain something like a signature in the corner

    If you were to train human children on an endless series of pictures with signatures in the corner, do you seriously think they’d not emulate signatures in the corner?

    If you think that, you haven’t seen many children’s drawings, because children also often pick up that it’s normal to put something in the corner, despite the fact that, for children, pictures with signatures are a tiny proportion of their visual input.

    Or how it is at least possible to get the model to output something extremely close to the training data

    People also mimic. We often explicitly learn to mimic - e.g. I have my son’s art folder right here, full of examples of him being explicitly taught to make direct copies as a means to learn technique.

    We just don’t have very good memory. This is an argument for a difference in ability to retain and reproduce inputs, not an argument for a difference in methods.

    And again, this is a strawman. It doesn’t even begin to try to answer the questions I asked, or the one raised by the person you first responded to.

    That at least proves that the process is quite different to the process of human learning.

    Neither of those really suggests that at all (that diffusion is different from how humans learn to generalise images is likely true, but what you’ve described does not provide even the start of any evidence for it), and again, that is a strawman.

    There was no claim they work the same. The question raised was how the way they’re trained is different from how a human learns styles.


  • Human brains clearly work differently than AI, how is this even a question?

    It’s not all that clear that those differences are qualitatively meaningful, but that is irrelevant to the question they asked, so this is entirely a strawman.

    Why does the difference between how an AI and the brain learn make training an AI on art any different from a person studying art styles? Both learn to generalise features in a way that allows them to reproduce those features. Both can do so without copying specific source material.

    The term “learning” in machine learning is mainly a metaphor.

    How does the way they learn differ from how humans learn? They generalise. They form “world models” of how information relates. They extrapolate.

    Also, laws are written with a practical purpose in mind - they are not some universal, purely philosophical construct and never have been.

    This is the only uncontroversial part of your answer. The main reason courts will treat human and AI actions differently is simply that the latter are not human. For the foreseeable future it will have little to do with whether the processes are similar enough to how humans do it.


  • We actually had to step over still-wet seaweed at the bottom of the path (the darker bit at the end of the path) to even get onto the beach at this spot, so the normal tidal variations, even during a relatively calm summer weekend, still reached all the way to the foot of the walls. I would not particularly like to see the insurance costs, or the list of things that are not covered, for the closest properties. Most of the properties along that stretch are set above fairly sizable seawalls.

    Ryde faces the Solent, the strait between the Isle of Wight and “mainland” England, so I guess it’s more sheltered than some other parts of the island, but it must still get plenty of nasty weather during the winter.


  • The thing is, realistically it won’t make a difference at all, because there are vast amounts of public-domain data that remain untapped. The main “problematic” need for OpenAI is new content that reflects up-to-date language and up-to-date facts, and my point about the share price of Thomson Reuters is to illustrate that OpenAI is already getting large enough that they can afford to outright buy some of the largest channels of up-to-the-minute content in the world.

    As for authors, it might wipe a few works by a few famous authors from the dataset, but those contribute very little to the quality of an LLM, because the LLM can’t tell famous works from ordinary ones during training unless you intentionally reinforce specific works. There are several million books published every year. Most of them make <$100 in royalties for their authors (an average book sells ~200 copies). Want to bet how cheap it’d be to buy a fully licensed set of a few million books? You don’t need bestsellers; you need many books that are merely good enough to drag the overall quality of the total dataset up.

    The irony is that the largest beneficiaries of content sources taking a strict view of LLMs will be OpenAI, Google, Meta, and the few others large enough to simply buy datasets, or buy the companies that own them, because this creates a moat against everyone who can’t afford to obtain licensed datasets.

    The biggest problem won’t be for OpenAI, but for people trying to build open models on the cheap.


  • It won’t really matter, because there will continue to be other sources.

    Taken to an extreme: there are indications OpenAI’s market cap is already higher than Thomson Reuters’ ($80bn-$90bn vs. <$60bn), and it will go far higher. Getty, also mentioned, has a market cap of “only” $2.4bn. In other words: if enough important sources of content start blocking OpenAI, they will start buying access, up to and including, if necessary, buying the original content creators.

    As it is, while the BBC clearly is not, some of these other content providers are just playing hard to get, hoping for a big enough cash offer either for a license or to get bought out.

    The cat is out of the bag, whatever people think about it, and sources that block themselves off from AI entirely (to the point of being unwilling to sell licenses or sell themselves) will just lose influence accordingly.

    This also presumes OpenAI remains the only contender, which is clearly not the case in the long run. The alternative models, while mostly still not good enough, are good enough that it’s equally clearly just a matter of time before anyone (at least, for the time being, sufficiently rich instances of “anyone”, with the cost threshold dropping rapidly) can fine-tune their own models using their own scraped data.

    In other words, it may make them feel better, but in the long run it’s a meaningless move.

    EDIT: What a weird thing to downvote without replying to. I’ve taken no stance on whether the BBC’s decision is morally right or not, just pointed out that it’s unlikely to have any effect. You can dislike that it won’t have any effect, but thinking it will is naive.


  • Heh, yeah, that’s part of what’s currently keeping me on X. I use little more than a bunch of shells and Chrome, so there aren’t many incentives for me to switch. All of my Ruby X tools are very light on X11 API use, so they’ll eventually be fairly simple to migrate over, but the window manager vs. compositor situation is frustrating.

    I’m somewhat tempted to hack together some FrankenCompositor based on wlroots that implements the bare minimum of the X11 protocol to allow an X11 window manager to manage the windows. The X11 protocol itself is simple, and while making every WM run would be a ton of work, if you first have a Wayland compositor, making it possible to run simpler WMs wouldn’t actually necessarily be so bad. Not likely to happen anytime soon, though; it’s not exactly necessary and I’m not that much of a masochist :)

    A somewhat more sane variant might be FFI bindings for wlroots, so it can be used to build a compositor from Ruby, but that too seems like an awful lot more work than an X window manager.
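
    As a rough illustration of that FFI route (just a sketch, assuming the Ruby ffi gem; only stable libwayland-server entry points are shown, since the wlroots-specific calls change signatures between versions), the seed of such bindings would look something like:

        require 'ffi'

        # Minimal sketch: bind just enough of libwayland-server to create a
        # display and run its event loop. wlroots functions (backend, renderer,
        # output layout, ...) would be attached the same way once a specific
        # wlroots version is pinned.
        module WaylandServer
          extend FFI::Library
          ffi_lib 'wayland-server'

          attach_function :wl_display_create,          [],         :pointer
          attach_function :wl_display_add_socket_auto, [:pointer], :string
          attach_function :wl_display_run,             [:pointer], :void
          attach_function :wl_display_destroy,         [:pointer], :void
        end

        display = WaylandServer.wl_display_create
        socket  = WaylandServer.wl_display_add_socket_auto(display)
        puts "Wayland socket: #{socket}"
        # A real compositor would create and start a wlroots backend/renderer
        # here before entering the event loop.
        WaylandServer.wl_display_run(display)
        WaylandServer.wl_display_destroy(display)

    That much runs, but it just sits in the event loop with nothing attached; all the actual work would be on the wlroots side of the bindings.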



  • That’s an interesting one I’d missed. Thanks :)

    It might just tempt me to ditch bspwm, or at least experiment. I use little enough of bspwm’s capabilities that it might be feasible. I have also lightly toyed with the idea of writing my own, since I don’t use menu bars etc. even on my floating screen (the “menu bars” in my desktop manager are just client-rendered titles), so I really need very few capabilities: basically just a placement function similar-ish to bspwm’s, and the ability to move, resize and float windows.

    On the other hand, a truly minimalist WM is <100 lines, so I might consider writing one from scratch too (I’d need to update the Ruby X11 binding to handle StructureNotify events and add a few more calls, but that’s pretty trivial). Though at this point we’re quickly approaching zealotry :) It would be fun, though. Maybe when I’m done replacing the terminal fully…
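
    For a sense of the scale involved, here is a sketch of the core loop such a minimalist WM needs. It is written against raw Xlib through the Ruby ffi gem rather than against any particular Ruby X11 binding (so the names below are Xlib’s, not the binding’s), and it only maps new windows and grants resize requests; placement, moving and floating would hang off the same loop:

        require 'ffi'

        module Xlib
          extend FFI::Library
          ffi_lib 'X11'

          SUBSTRUCTURE_REDIRECT_MASK = 1 << 20   # from X.h
          MAP_REQUEST       = 20
          CONFIGURE_REQUEST = 23

          attach_function :XOpenDisplay,       [:string],  :pointer
          attach_function :XDefaultRootWindow, [:pointer], :ulong
          attach_function :XSelectInput,       [:pointer, :ulong, :long], :int
          attach_function :XNextEvent,         [:pointer, :pointer], :int
          attach_function :XMapWindow,         [:pointer, :ulong], :int
          attach_function :XMoveResizeWindow,  [:pointer, :ulong, :int, :int, :uint, :uint], :int

          # Prefix shared by XMapRequestEvent and XConfigureRequestEvent;
          # x/y/width/height are only meaningful for ConfigureRequest.
          class RequestEvent < FFI::Struct
            layout :type, :int, :serial, :ulong, :send_event, :int,
                   :display, :pointer, :parent, :ulong, :window, :ulong,
                   :x, :int, :y, :int, :width, :int, :height, :int
          end
        end

        dpy  = Xlib.XOpenDisplay(nil)
        root = Xlib.XDefaultRootWindow(dpy)
        # Become "the" window manager: redirect map/configure requests on the
        # root window (fails with BadAccess if another WM is already running).
        Xlib.XSelectInput(dpy, root, Xlib::SUBSTRUCTURE_REDIRECT_MASK)

        buf = FFI::MemoryPointer.new(:long, 24)   # XEvent is a union of 24 longs
        loop do
          Xlib.XNextEvent(dpy, buf)
          ev = Xlib::RequestEvent.new(buf)
          case ev[:type]
          when Xlib::MAP_REQUEST
            Xlib.XMapWindow(dpy, ev[:window])     # a placement function would pick geometry here
          when Xlib::CONFIGURE_REQUEST
            Xlib.XMoveResizeWindow(dpy, ev[:window], ev[:x], ev[:y], ev[:width], ev[:height])
          end
        end

    Everything else (StructureNotify handling, focus, moving/resizing with the mouse, floating) is additions to that loop, which is why the “<100 lines” figure isn’t as absurd as it sounds.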


  • When they say “can’t be blocked” I presume they mean “can’t be blocked with the block function in X/Twitter”. They also say it can’t be liked or retweeted.

    So far ads have been treated as sort-of regular posts that are just shown according to the ad rules rather than because they belong in the timeline under normal criteria, and you could like, retweet and block them just like any other post.

    So this is basically them treating ads as a fully separate thing rather than just a different post type.

    Though the article suggests they’ll still try to make them look mostly like posts, just without showing a handle etc., which is extra scummy.