Text Transcription Issues: Difference between revisions

Latest revision as of 16:31, 17 January 2013

About Standards for Transcribing Text

Content here begins with resources put together by Jason Best (thank you, Jason) in an email sent to the AOCR working group on 19 December 2012.

In our last meeting (18 Dec 2012) we discussed some of the challenges of transcribing text with corrections, alterations, strikeouts, ambiguous letters, etc., and I [Jason Best] briefly mentioned some transcription projects that have dealt with similar issues. A hackathon participant, Ben Brumfield, has much more experience in this topic, so first I'll point you to some information he has compiled. His blog home page (http://manuscripttranscription.blogspot.com) currently has a transcription of his talk about the variety of formats that various projects are using. A worthwhile read.

If we decide to transcribe or preserve ambiguous or corrected/struck-out characters, then the Text Encoding Initiative (TEI) format might be a good start, though it would require the use of XML elements in angle brackets. A lighter approach might be to use a wiki-style markup format such as:
- Markdown (http://daringfireball.net/projects/markdown/syntax) or
- Textile (http://txstyle.org).
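
To make the trade-off concrete, here is a minimal sketch of how a lightweight notation could be converted to TEI elements after the fact. The three markers (`~~…~~` for struck-out text, `++…++` for an insertion, `[?…?]` for an uncertain reading) and the function name `to_tei` are assumptions made up for this illustration; they are not part of Markdown, Textile, or any guideline listed on this page. The target elements `<del>`, `<add>`, and `<unclear>` are TEI P5 element names.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical lightweight markers (assumed for illustration only, not an
# established standard), each paired with the TEI P5 element it maps to.
RULES = [
    (re.compile(r"~~(.+?)~~"), r"<del>\1</del>"),              # struck-out text
    (re.compile(r"\+\+(.+?)\+\+"), r"<add>\1</add>"),          # inserted text
    (re.compile(r"\[\?(.+?)\?\]"), r"<unclear>\1</unclear>"),  # uncertain reading
]


def to_tei(line: str) -> str:
    """Convert one line of lightly marked-up transcription to a TEI fragment."""
    # Escape &, < and > first so the transcribed text itself stays valid XML.
    result = escape(line)
    for pattern, replacement in RULES:
        result = pattern.sub(replacement, result)
    return result


if __name__ == "__main__":
    print(to_tei("Collected ~~June~~ ++July++ 1902 near [?Austin?] & vicinity"))
```

With a scheme like this, transcribers would type only the short markers, and the XML could be generated later. The sketch does not handle nested or overlapping markers, which a real project would need to decide on.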

Below I've listed some projects that either establish transcription or markup standards or have published guidelines or suggestions about how to transcribe text.
- TEI elements for representing primary documents (in particular, errors, corrections, alterations, ambiguity, etc.) - http://www.tei-c.org/release/doc/tei-p5-doc/en/html/PH.html#PHCH
- FreeReg, register transcription - http://www.freereg.org.uk/howto/transcribe.htm
- Transcribe Bentham Guidelines (seems to be based on TEI) - http://www.transcribe-bentham.da.ulcc.ac.uk/td/Help:Transcription_Guidelines
- New York Public Library Menu transcription guidelines - http://menus.nypl.org/help
- National Archives Transcription tips - http://transcribe.archives.gov/tips
- Leiden+ notation, used by classicists to mark damage and unclear readings in Greek papyri - http://papyri.info/editor/documentation?docotype=text (In use since the mid-1930s, updated and translated to TEI by the Integrating Digital Papyrology group.)

Projects that might have additional approaches to transcription
- http://scripto.org http://www.uscript.org
- http://transcriptorium.eu http://t-pen.org

Back to the 2013 AOCR Hackathon Wiki

@@ Line 1: / Line 1: @@
 == About Standards for Transcribing Text  ==
 <br>
-*In our last meeting (18 Dec 2012) we discussed some of the challenges of transcribing text with corrections, alterations, strikeouts, ambiguous letters, etc and I briefly mentioned some transcription projects that have dealt with similar issues. A hackathon participant, Ben Brumfeld, has much more experience in this topic so first I'll point you to some information he has compiled. His blog home page (http://manuscripttranscription.blogspot.com) currently has a transcription of his talk about the variety of formats that various projects are using. A worthwhile read.
+*Content here begins with resources put together by Jason Best (thank you Jason) in an email sent to the AOCR wg on 19 December 2012.
+*In our last meeting (18 Dec 2012) we discussed some of the challenges of transcribing text with corrections, alterations, strikeouts, ambiguous letters, etc and I [Jason Best] briefly mentioned some transcription projects that have dealt with similar issues. A hackathon participant, Ben Brumfeld, has much more experience in this topic so first I'll point you to some information he has compiled. His blog home page (http://manuscripttranscription.blogspot.com) currently has a transcription of his talk about the variety of formats that various projects are using. A worthwhile read.
 *If we decide to try to transcribe or preserve ambiguous or corrected/struckout characters, then the Text Encoding Initiative format might be a good start, though it would require the use of XML elements in brackets. A more lightweight approach might be to utilize some of the wiki markup formats like:
@@ Line 13: / Line 15: @@
 **New York Public Library Menu transcription guidelines - http://menus.nypl.org/help
 **National Archives Transcription tips - http://transcribe.archives.gov/tips
+**Leiden+ notation used by classicists for marking damage and unclear readings in Greek papyrus standards - http://papyri.info/editor/documentation?docotype=text (In use since the mid-1930s, updated and translated to TEI by the Integrating Digital Papyrology group.)
 *Projects that might have additional approaches to transcription
 **http://scripto.org http://www.uscript.org
 **http://transcriptorium.eu http://t-pen.org
+Back to the [[2013 AOCR Hackathon Wiki]]
