Back in 2014, we brought to your attention an image archive rivaling the largest of its kind on the web: the Internet Archive Book Images collection at Flickr. There, you’ll find millions of “public domain images, all extracted from books, magazines and newspapers published over a 500 year period.”
At the time, the collection contained 2.6 million public domain images, but “eventually,” we noted in a previous post, “this archive will grow to 14.6 million images.” Well, it has more than doubled in size since our first post, and it now features over 5.3 million images, thanks again to Kalev Leetaru, who headed the digitization project while on a Yahoo-sponsored fellowship at Georgetown University.
Rather than using optical character recognition (OCR), as most digitization software does to scan only the text of books, Leetaru’s code reversed the process, extracting the images the Internet Archive’s OCR typically ignores. Thousands of graphic illustrations and photographs await your discovery in the searchable database. Type in “records,” for example, and you’ll run into the 1917 ad in “Columbia Records for June” (top) or the creepy 1910 photograph above from “Records of big game: with their distribution, characteristics, dimensions, weights, and horn & tusk measurements.” These are two of many gems amidst utilitarian images from dull corporate and government record books.
Search “library” and you’ll arrive at a fascinating assemblage, from the fashionable room above from 1912’s “Book of Home Building and Decoration,” to the rotund, mournful, soon-to-be carved pig below from 1882’s “The American Farmer: A Complete Agricultural Library,” to the nifty Nautilus drawing further down from an 1869 British Museum of Natural History publication. To see more images from any of the sources, simply click on the title of the book that appears in the search results. The organization of the archive could use some improvement: as yet, millions of images have not been organized into thematic albums, which would greatly streamline browsing through them. But it’s a minor gripe given the number and variety of free, public domain images available for any kind of use.
Moreover, Leetaru has planned to offer his code to institutions, telling the BBC, “Any library could repeat this process. That’s actually my hope, that libraries around the world run this same process of their digitized books to constantly expand this universe of images.” Scholars and archivists of book and art history and visual culture will find such a “universe of images” invaluable, as will editors of Wikipedia. “What I want to see,” Leetaru also said, “is… Wikipedia have a national day of going through this [collection] to illustrate Wikipedia articles.”
Short of that, individual editors and users can sort through images of all kinds when they can’t find freely available pictures of their subject. And, of course, sites like Open Culture, which rely mainly on public domain and Creative Commons images, benefit greatly as well. So, thanks, Internet Archive Book Images Collection! We’ll check back later and let you know when it has grown even more.
Related Content:
Download for Free 2.6 Million Images from Books Published Over Last 500 Years on Flickr
The Getty Adds Another 77,000 Images to its Open Content Archive
Josh Jones is a writer and musician based in Durham, NC. Follow him at @jdmagness
Bibliophile Heaven. What a magnificent medium this is. Thank you to all who made this possible. Posterity owes you an un-repayable debt.