How to transcribe
There are many ways to transcribe documents, and different crowdsourcing projects ask volunteers to transcribe in different ways. This section explains the By the People transcription conventions.
Our primary goals are to improve the searchability, readability, and accessibility of these documents for people who use screen readers or other assistive technology. We also want to honor the creators' historical reality by preserving the original spelling, grammar, and punctuation of the documents. The instructions were created with the Library's website search functionality in mind, and with the intention of making these pages a pleasure to hear aloud.
These instructions address most issues you will encounter as you transcribe, but we can't cover everything! For questions or clarifications, post in our History Hub discussion forum or contact us directly.
Transcribe text in the order it appears on the page. If you're unsure of order, transcribe the text in the way that it would make sense to read aloud.
If there is more than one page in an image, transcribe all pages, one after the other, in the order they appear. You can use two hard returns (using the "enter" or "return" key) to leave space between pages. This may make it easier for another volunteer to review.
Some letters have "cross-writing" where the author has written text in two directions to save paper or the cost of postage. Transcribe these letters in the order they were written or would make the most sense to read. You can also add the tag "cross-writing."
Spelling and punctuation
Preserve the original spelling, punctuation, grammar, and word order, along with any page numbers or catalog letters or numbers. Do not paraphrase the original text; just type what you see. Some writers use an equals sign as a dash or to add emphasis. You can use an equals sign to represent this feature. Dashes and other punctuation can be a little unusual in documents from the early twentieth century and earlier, so just make your best guess on whether something is an en dash, em dash, period, or something else.
If a misspelling will impact the searchability of the document, use a tag to add the correct spelling. Only registered users can add tags and review documents, so create an account if you'd like to do either of those activities.
Don't leave notes in text
You may be tempted to leave notes about the document, styling, or research you have done in the text. Please don't! Only type the original text of the page into the transcription box. Helpful information or context you want to leave for others can be added as a tag or posted in our discussion forum on History Hub.
Use the "Nothing to Transcribe" button for blank pages, images or printed templates.
Preserve line breaks to make it easier for someone to review your transcription. To create a line break hit "enter" or "return" at the end of a line. Don't worry if your text in the transcription box spills over two lines before you hit enter. If you have not inserted a hard return, no line break will be recorded.
The exception is when words are broken over two lines or two pages. When a word is broken over two lines on the same page, type the whole word on the line where it first appears. When a word breaks across two pages, transcribe the whole word on the first page.
Do not expand abbreviations; just type what you see. You can use the tagging function to record the expanded text of an important abbreviation, such as a proper noun that otherwise does not appear in the text.
Formatting: Bold, underline, italic, indents, superscript, etc.
We ask that you not try to capture formatting, such as underlining. The Library website cannot search for bold, italic, underlined, superscript, or indented text. So even when you see these features, transcribe the words without any styling. Please also do not make note of the formatting in the text.
When text has been inserted over a line or otherwise added later, but should be read as part of a sentence, bring it down into the original text and type it in the order you would read it aloud. Do not use caret symbols or brackets to indicate that the text has been inserted.
Illegible or unclear text
Illegible text is anything you can’t read because a page is damaged, the text is crossed out, or you can’t tell what the author has written. If there is a word or a string of words you cannot read, transcribe it as a pair of square brackets around a question mark: [?]. Example:
- "I have [?] loved coffee ice cream"
If you can read any letters or parts of words, transcribe what you can and use question marks for the remaining letters or words. Example:
- "I have [a?????] loved coffee ice cream"
If you cannot read a word or phrase, that’s OK! Another volunteer may be able to read it and can update your transcription. If there is a lot of text you cannot read, consider saving your transcription and looking for another page you can better decipher.
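For reviewers who want to see how many unread spots remain in a saved transcription, here is a minimal Python sketch. It is not part of any By the People tool. The function name `find_illegible` and the pattern are assumptions for illustration: the pattern treats a marker as square brackets holding optional legible letters followed by one or more question marks, which covers both examples above.

```python
import re

# Assumed pattern (not an official definition): "[" + optional legible
# characters + one or more "?" + "]". Brackets that hold no question mark
# are not matched.
ILLEGIBLE = re.compile(r"\[[^\[\]?*]*\?+\]")


def find_illegible(text):
    """Return every illegible-text marker found in a transcription."""
    return ILLEGIBLE.findall(text)


sample = (
    "I have [?] loved coffee ice cream\n"
    "I have [a?????] loved coffee ice cream"
)
print(find_illegible(sample))  # ['[?]', '[a?????]']
```

A count of zero does not mean a page is finished; it only means nobody marked a word as unreadable.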
If you can read crossed out or otherwise deleted text, transcribe the deleted words within square brackets. Example:
- “I have always loved [vanilla] coffee ice cream.”
Marginalia is text written in the space around the main block of text. It is often a comment on the main body text but may also be unrelated. It differs from an insertion, because it cannot be directly inserted into the main text and still make sense when read aloud. Examples include notes on drafts and cataloging stamps on documents. Put a pair of square brackets and asterisks [* *] around marginalia text and order it within the transcription where it makes the most sense (or at the end of the transcription if it appears unrelated). Example:
- I have always loved coffee ice cream. Last summer I made my own. [*In 2017, Brazil was the largest coffee producing country*]
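If you work with finished transcriptions outside the site, the [* *] convention makes marginalia easy to separate from the main text. The Python sketch below shows one way this might be done. The function name `split_marginalia` is hypothetical, and it assumes each note opens with `[*` and closes with `*]` exactly as in the example above.

```python
import re

# Assumed pattern: "[*" ... "*]" with the shortest possible match, so two
# notes on the same page stay separate. DOTALL lets a note span line breaks.
MARGINALIA = re.compile(r"\[\*(.*?)\*\]", re.DOTALL)


def split_marginalia(text):
    """Return (main_text, notes) for a transcription using [* *] markers."""
    notes = [note.strip() for note in MARGINALIA.findall(text)]
    main_text = MARGINALIA.sub("", text).strip()
    return main_text, notes


page = (
    "I have always loved coffee ice cream. Last summer I made my own. "
    "[*In 2017, Brazil was the largest coffee producing country*]"
)
main_text, notes = split_marginalia(page)
print(main_text)
print(notes)
```

The sketch only reads the markers; it does not decide where marginalia belongs in the reading order, which is still a judgment call for the transcriber.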
Printed or typed text
Some material in By the People is typed or printed. This text still needs to be transcribed as it is not yet machine-readable. For various reasons, the Library has been unable to automate transcription using Optical Character Recognition (OCR) technology. If you would like to try using dictation software or OCR, you are welcome to do so, but please check the output for accuracy and insert line breaks. Read how other volunteers have used these technologies, and join in the conversation on History Hub. Please transcribe letterhead, including names, places, and any words that are in the letterhead.
When not to transcribe printed text
Some mass-produced calendars and diaries contain many pages of pre-printed almanacs or other text that should not be transcribed as part of this project. This is not the core text we are aiming to capture. However, if you want to transcribe it, feel free. Alternatively, if a page is blank other than pre-printed template text, you can click "Nothing to Transcribe".
Some documents will contain tables of data. Transcribe these in a way that will preserve the relationships between columns and rows, and reflect the meaning of the original documents. Try to make your transcription relatively easy for a reviewer to check, but don't try to capture the exact layout of the data. You can use spaces and hard returns, but please do not add any additional characters such as the pipe symbol or slashes to divide the data.
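If you prepare table text in another program before pasting it into the transcription box, the idea of "spaces and hard returns only" can be sketched in Python as follows. This is an illustration under assumptions, not a required layout: the function name `format_table`, the three-space gap, and the sample rows are all made up, and every row is assumed to have the same number of cells.

```python
def format_table(rows, gap=3):
    """Join table cells with spaces and rows with hard returns.

    No pipes, slashes, or other divider characters are added.
    Assumes every row has the same number of cells.
    """
    column_count = len(rows[0])
    widths = [max(len(row[i]) for row in rows) for i in range(column_count)]
    lines = []
    for row in rows:
        cells = [cell.ljust(width) for cell, width in zip(row, widths)]
        # Drop trailing padding so lines do not end in stray spaces.
        lines.append((" " * gap).join(cells).rstrip())
    return "\n".join(lines)


# Hypothetical table, typed exactly as the words appear on the page.
rows = [
    ["Item", "Price"],
    ["Coffee", "2.50"],
    ["Ice cream", "1.25"],
]
table_text = format_table(rows)
print(table_text)
```

Padding each column to a common width keeps rows easy for a reviewer to compare without trying to reproduce the exact layout of the original.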
Long s or "funny" f
Some historical handwriting and printing uses the "long s" form, which looks like a lowercase "f". Transcribe this as a lowercase "s".
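This matters mostly when text comes from OCR or is pasted from another source, where the long s may arrive as its own Unicode character (U+017F) instead of a plain "s". A minimal Python sketch of the replacement is shown below; the function name `normalize_long_s` is hypothetical. It only handles the true long s character and will not fix OCR output that misread a long s as the letter "f".

```python
LONG_S = "\u017f"  # LATIN SMALL LETTER LONG S


def normalize_long_s(text):
    """Replace each long s character with a lowercase "s"."""
    return text.replace(LONG_S, "s")


print(normalize_long_s("Congre\u017fs"))  # Congress
```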
Don't describe images or other visual elements within the transcription box. If you would like to describe images, watermarks, stamps, or any other non-text features, use the tagging function. Register for an account to tag!
Non-English languages, characters, and translation
If you can transcribe the original language of a document, please do so! Other languages can be found throughout our campaigns. We want to make sure these materials are also transcribed wholly and accurately.
Please use the correct characters when transcribing non-English text. You can change your language input settings in your browser, and may need to use a foreign language keyboard or shortcuts for non-English characters.
For our Herencia campaign we created guides and cheatsheets to help you transcribe Spanish and Latin. Find those resources here!
Please do not translate non-English text in the transcription space. If you have translated a By the People document, we would love for you to share it in History Hub!
Other special characters
Transcribe special characters when they are utilized in the original document. These include ampersands (&) and currency symbols. You can learn about British Colonial currency here.
We're often asked "can I do research?" -- of course! If you are stumped about a word such as the name of a person or place, it is often helpful to do a little research. We suggest starting by visiting the original document on the Library of Congress website. Do this by clicking the button "View original on www.loc.gov", located above the transcription interface. We've also linked helpful resources on each campaign page. Additional information or historical context can be found through general web searches, maps, books, and more.
Saving work in progress
Saving a transcription stores what is in the transcription box; it does not reserve that page for a user. Saved transcriptions move to the status of "In Progress" and can be edited by another user once you leave that page. Saving and remaining on a page for longer than 2 hours will also result in that page being released for editing by another user.
Key commands to flip or rotate images
You can use keyboard commands to manipulate the image viewer. Press the question mark button above the image to see the keystroke combinations or refer to this guide. If you click into the viewer and then type "f" or "r", the image may flip or rotate unexpectedly. Click the viewer and type the same letter again to right the image.