The manifests of the images, which include the description and the books they were taken from, are already available on GitHub. This is a wonderful project that collaborates with the internet in its most basest form: recreation. I’m excited to see what will happen from this project and I think the British Library is awesome for having released this. The images came from only 60,000 volumes–merely a fraction of the millions of books the British Library currently holds.
There are maps, geological diagrams, beautiful illustrations, comical satire, illuminated and decorative letters, colourful illustrations, landscapes, wall-paintings and so much more that even we are not aware of.
We plan to launch a crowdsourcing application at the beginning of next year, to help describe what the images portray. Our intention is to use this data to train automated classifiers that will run against the whole of the content. The data from this will be as openly licensed as is sensible (given the nature of crowdsourcing) and the code, as always, will be under an open licence.
The manifests of images, with descriptions of the works that they were taken from, are available on github and are also released under a public-domain ‘licence’. This set of metadata being on github should indicate that we fully intend people to work with it, to adapt it, and to push back improvements that should help others work with this release.
There are very few datasets of this nature free for any use and by putting it online we hope to stimulate and support research concerning printed illustrations, maps and other material not currently studied. Given that the images are derived from just 65,000 volumes and that the library holds many millions of items.
If you need help or would like to collaborate with us, please contact us on email, or twitter (or me personally, on any technical aspects)