A course on digital libraries and building digital collections.
View the Project on GitHub jawalsh/z652-Digital-Libraries-FA23
We will learn about the digitization of text-based media, or “text objects,” such as books, manuscripts, comics, and notebooks. We will cover character encodings, optical character recognition (OCR), and typical workflows for text digitization.
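As a small preview of why character encodings matter when digitizing text, the sketch below encodes one short string two ways and then decodes it with the wrong encoding. The sample string and variable names are chosen only for illustration and are not taken from the course materials.

```python
# Illustrative sketch: the same text stored under two character encodings.
text = "caf\u00e9"  # the word "café"; sample string chosen for illustration

utf8_bytes = text.encode("utf-8")      # "é" becomes two bytes: 0xC3 0xA9
latin1_bytes = text.encode("latin-1")  # "é" becomes one byte: 0xE9

print(utf8_bytes)    # byte representation under UTF-8
print(latin1_bytes)  # byte representation under Latin-1

# Decoding UTF-8 bytes as Latin-1 does not fail; it silently yields
# garbled text ("mojibake"), a common problem in digitized collections.
garbled = utf8_bytes.decode("latin-1")

# Decoding with the encoding that was actually used restores the text.
restored = utf8_bytes.decode("utf-8")
```

The point for a digitization workflow is that bytes alone do not say which encoding produced them, so the encoding should be recorded alongside the text.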
brew install tesseract