• layer8 3 days ago
    > Over 190,000 copyrighted books obtained from pirated websites.

    While that’s a lot, the estimated number of just English published novels is roughly an order of magnitude above that [0], so “every novel that has ever been published” isn’t anywhere near correct, in all likelihood.

    [0] https://litlab.stanford.edu/how-many-novels-have-been-publis...

    • xvxvx 3 days ago
      [flagged]
      • layer8 3 days ago
        I feel that hyperbole like that in the present submission title works against the cause, because it sets up a strawman. Absolutely do fight the unethical practices, but do not misrepresent reality; otherwise you are undermining your own position. We need to be accurate about the facts to have credibility.
      • solarkraft 3 days ago
        That’s not what they are doing. The accuser made a false claim and they are refuting it.
  • pwdisswordfishy 3 days ago
    This tweet is misleading (shocker, I know; Twitter and misleading ragebait—who could have guessed?).

    It claims that "Up to 90%" (accurate, or at least plausible—but unsurprising) of "Every book you have ever read" (just untrue) is "sitting inside ChatGPT right now".

    Meanwhile, 100% of the books that have been scanned into Google Books's scanned books collection are sitting "inside" Google Books's scanned books collection. And 100% of the web pages that Google Search has crawled and indexed are sitting "inside" Google Search's index of the pages it has crawled and indexed.

  • avian 3 days ago
    • Vaslo 3 days ago
      Do they have something like this for Bluesky?
  • xvxvx 3 days ago
    I truly wish to see Google/Alphabet be absolutely annihilated by lawsuits. Bad enough that YouTube was built on pirated material, and still is to this day, but now this?

    Every single penny they’ve ever generated should be awarded to the authors they stole from, and Alphabet should be bankrupted into oblivion. The sheer number of people involved in this mass piracy event means it’s fully systemic. Shut them down!

    • fractallyte 3 days ago
      I don't really understand this. What was stolen?

      The computer doesn't "enjoy" books – it assimilates them as data. That data isn't stored verbatim; it's used to train a system. In return, ChatGPT is made freely available, and millions of people benefit from it.

      What is morally wrong about this?

  • PufPufPuf 2 days ago
    Is it just me or is this full of LLMisms? Threes, "not X, Y", etc.