“If your full-time, eight-hours-a-day, five-days-a-week job were to look at each image in the dataset for just one second, it would take you 781 years.”
– German data journalist
Christo Buschek and Canadian software artist
Jer Thorp, on the scale of generative AI datasets preventing human curation.
LAION-5B, for example, contains 5.8 billion image and text pairs that are selected automatically, and it shows: “It contains less about how humans see the world than it does about how search engines see the world. It is a dataset that is powerfully shaped by commercial logics.”