OCR stands for Optical Character Recognition. It is software that can extract printed text from some documents.
When will OCR work well?
OCR does not work on handwriting. It only works for printed or typed text, meaning text created by a typewriter, printing press, or other mechanical means. OCR will do best on consistent and clear images of modern typefaces.
Do I still need to review pages started with OCR?
Yes! OCR is imperfect. It may not work well for some or all parts of a typed page, but it can be a great starting point. If you start a page with OCR, read the text closely before submitting. If you are reviewing an OCR-ed page, you still need to review it closely.
We always want to use volunteer time effectively. When the Library of Congress digitizes a large group of printed pages, it will usually OCR them. The materials in By the People campaigns are not good candidates for applying OCR at scale because they are handwritten, are a mix of handwritten and printed materials, or are printed on paper or in a typeface that does not produce accurate OCR results. However, OCR can still be a useful starting point for some typed pages. Use it if you like it or skip it if you don’t!
Are you sure?
Clicking "Transcribe with OCR" will remove all existing transcription text and replace it with automatically generated text. We recommend saving existing text in a separate document if you may want to revisit it.
Campaign Tips
Transcribing Teddy
Transcribe Theodore Roosevelt's incoming letters to discover what the writers had to say about the issues of their day.
Throughout this campaign you will see bleed-through, a common issue for thin pages. Ink seeps through the paper and appears as backward text.
Bleed-through can make deciphering more difficult, but try your best. Ignore the backward, mirror-image text on the page you are working on; go to the previous page to view and transcribe it there.
The example of backwards text above (from a Clara Barton letterbook) is bleed-through and should not be transcribed.
Shorthand
Many letters received by Teddy Roosevelt (especially during his first presidency, September 1901 to March 1905) contain secretarial notes in shorthand, likely based on TR’s dictated replies.
If you find shorthand, do not transcribe it. Instead type out [[shorthand]] where you see it appear.
You can also add a "shorthand" tag.
Illegible text
You may not be able to read some text in this collection due to blurred ink, cross outs, or poor image quality. In place of illegible words or letters, type a pair of square brackets around a question mark: [?]
If you can read any part of a word, transcribe the letters you can read and use question marks for the rest (Example: [L????n]).
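For anyone who wants to check their own transcription text for these markers before submitting, the bracket conventions above can be expressed as simple patterns. This is only an illustrative sketch and not part of the site's tooling; the names `ILLEGIBLE`, `SHORTHAND`, and `count_markers` are hypothetical, and the patterns assume markers are typed exactly as described ([?], [L????n], [[shorthand]]).

```python
import re

# Hypothetical patterns for the bracket conventions described above:
#   [?]            a fully illegible word
#   [L????n]       a partly legible word, with ? for unreadable letters
#   [[shorthand]]  a placeholder for shorthand notes
# The lookahead (?!\[) keeps the single-bracket pattern from starting
# on the outer bracket of a double-bracket marker.
ILLEGIBLE = re.compile(r"\[(?!\[)([^\[\]]*\?[^\[\]]*)\]")
SHORTHAND = re.compile(r"\[\[shorthand\]\]")


def count_markers(text):
    """Count illegible-text and shorthand markers in a transcription."""
    return {
        "illegible": len(ILLEGIBLE.findall(text)),
        "shorthand": len(SHORTHAND.findall(text)),
    }


sample = "Dear Mr. [?], I met [L????n] today. [[shorthand]]"
print(count_markers(sample))
```

A count like this might help a reviewer spot pages that still need a second look, but it does not replace reading the page against the image.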
Image filters can help!
This campaign contains some difficult-to-read pages. Our viewer filters may help you read light, dark, or blurry pages by allowing you to adjust the brightness and contrast.
Access the filters by clicking on the icon at the top of the image viewer (located between "flip horizontally" and "toggle full page").
The filters build upon each other, so you can apply more than one at a time.
Need more help? Check out the How-To Guide
You can access full instructions at any time while transcribing or reviewing. Just click the blue How-To Guide button above the transcription box.
The guide also includes campaign descriptions and other helpful context under "About This Campaign".
View or print instructions in a separate webpage by visiting How-To at the top of your screen on any page.