From the desk of Dustin Larmore via the Library of Congress…
The public can now explore more than 1.5 million historical newspaper images online and free of charge. The latest machine learning experience from Library of Congress Labs, Newspaper Navigator allows users to search visual content in American newspapers dating from 1789 to 1963.
The user begins by entering a keyword, which returns a selection of photos. The user can then choose photos to search against, surfacing related images that were previously undetectable by search engines.
For decades, partners across the United States have collaborated to digitize newspapers through the Library's Chronicling America website, a database of historical U.S. newspapers. Character recognition technology makes the text of the newspapers searchable, but until now users looking for specific images had to page through individual issues. Through the ingenuity of Innovator in Residence Benjamin Lee and advances in machine learning, Newspaper Navigator now lets users search the images in those newspapers by visual similarity.
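The article does not describe how Newspaper Navigator computes visual similarity. As a rough illustration of the general idea only, the sketch below assumes each image has already been reduced to a numeric feature vector and ranks a collection by cosine similarity to the photos a user selected. The function names, image names, and toy vectors are all hypothetical and are not taken from the project.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def rank_by_similarity(selected, collection, top_k=3):
    """Rank images in `collection` against the user's selected photos.

    `selected` is a list of feature vectors for the chosen photos;
    `collection` maps an image name to its feature vector.
    """
    # Average the selected vectors into a single query vector.
    dim = len(selected[0])
    query = [sum(vec[i] for vec in selected) / len(selected) for i in range(dim)]

    scored = [
        (name, cosine_similarity(query, vec)) for name, vec in collection.items()
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy, made-up feature vectors (assumption: real vectors would come from a model).
collection = {
    "baseball_photo": [0.9, 0.1, 0.0],
    "stadium_photo": [0.8, 0.2, 0.1],
    "weather_map": [0.0, 0.1, 0.9],
}

results = rank_by_similarity([[1.0, 0.0, 0.0]], collection, top_k=2)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

In this toy setup, the two sports-related vectors rank above the map because they point in nearly the same direction as the query vector; a real system would derive its vectors from the images themselves.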
To create Newspaper Navigator, Lee trained computer algorithms to sort through 16 million Chronicling America newspaper pages in search of photographs, illustrations, maps, cartoons, comics, headlines and advertisements. The idea for Lee's groundbreaking project began with Beyond Words, a Library crowdsourcing experiment by 2017 Innovator in Residence Tong Wang, which invited members of the public to help identify cartoons, illustrations, photographs and advertisements in World War I-era newspapers. Users could draw boxes around visual content on a page, transcribe captions or review other users' transcriptions.
For more information, see the Library of Congress's full press release.