Customising the TEI for your project

Dr James Cummings

@jamescummings
http://slides.com/jamescummings/customisingtei

Customising the TEI

[Diagram: Possibilities of the TEI Framework. Labels: Project A, Project B, New Elements]

What does the underlying XML look like?

  • @include = never get any new elements
  • @except = get new elements if you regenerate schema
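
As a rough illustration of the two attributes above, here is a minimal sketch of a schema specification in a TEI ODD file. The identifier myProject and the particular element selections are made up for this example and are not taken from any real project.

```xml
<schemaSpec ident="myProject" start="TEI">
  <!-- Modules needed for a minimal TEI document -->
  <moduleRef key="tei"/>
  <moduleRef key="header"/>
  <moduleRef key="textstructure"/>
  <!-- @include: only the listed core elements are ever in the schema,
       so elements added to this module later are not picked up -->
  <moduleRef key="core" include="p title hi name date"/>
  <!-- @except: everything in namesdates apart from these two, so
       elements added to the module later appear on regeneration -->
  <moduleRef key="namesdates" except="climate terrain"/>
</schemaSpec>
```

Which attribute suits a module depends on whether you would rather review new TEI elements before they reach your encoders (@include) or receive them automatically (@except).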

Why Customise?

  • Enforce consistency across one or more encoders
  • Increase speed of encoding with set value lists and descriptions
  • Generate internationalised, project-specific documentation
  • Record decisions and relationship with the TEI in a machine-processable form
  • Create a local encoding manual with embedded schema specifications
  • Provide long-term archival documentation for your project's outputs
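
To make "set value lists and descriptions" concrete, the sketch below constrains the rend attribute on hi to a closed list of values, each with its own description. The values and the wording of the descriptions are assumptions made up for illustration; a real project would record its own.

```xml
<elementSpec ident="hi" module="core" mode="change">
  <attList>
    <!-- Make @rend required and limit it to a closed list of values -->
    <attDef ident="rend" mode="change" usage="req">
      <valList type="closed" mode="replace">
        <valItem ident="italic">
          <desc>Text rendered in italics in the source</desc>
        </valItem>
        <valItem ident="bold">
          <desc>Text rendered in bold in the source</desc>
        </valItem>
        <valItem ident="underline">
          <desc>Text underlined in the source</desc>
        </valItem>
      </valList>
    </attDef>
  </attList>
</elementSpec>
```

An XML editor reading a schema generated from this can offer the three values, with their descriptions, as a pick list, which is where both the speed and the consistency come from.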

Do we have to?

  • While customisation of the TEI is not required, it is a good idea as a form of documentation
  • Customisations help remove human error amongst a group of encoders, or even a single encoder over time
  • While the TEI-C provides off-the-shelf customisations like tei_all, tei_lite, and tei_simplePrint, it is very unlikely that these perfectly match any particular project
  • All good TEI projects customise the TEI, not only to reap the benefits of a local encoding manual and schema, but also as documentation for the future

TEI is standardisation not by saying
"Do what I do"
but instead by saying
"Do what you need to do, but tell me about it in a language I understand"

Freedom to Constrain

  • If you merely constrain the TEI so that it:
    • is smaller
    • is more precise
    • has specified attributes
    • has project-specific examples
    • has localised documentation
  • then interoperability is less of a problem
  • But there are still issues:
    • different practices in various communities
    • different element choices
    • variation in attribute values

These problems are multiplied if new elements are added

  • The ability to interchange many documents improves significantly with a common interchange format
  • Customisation can document the differences in a machine-processable format, so tools can compare different corpora
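
For reference, this is roughly what adding a new element looks like in an ODD. The element name recipe, the namespace URI, and the chosen classes and content model are all hypothetical. What matters is that the addition is declared in its own namespace and in a form that tools can read.

```xml
<elementSpec ident="recipe" ns="http://example.org/ns/myproject" mode="add">
  <desc>Contains a single recipe (hypothetical project-specific element)</desc>
  <classes>
    <!-- Allow the new element wherever paragraph-like content may appear -->
    <memberOf key="model.divPart"/>
    <!-- Give it the global TEI attributes -->
    <memberOf key="att.global"/>
  </classes>
  <content>
    <macroRef key="macro.paraContent"/>
  </content>
</elementSpec>
```

Because the new element sits in a non-TEI namespace and its definition is recorded in the customisation, a recipient can at least see exactly where the project departs from the TEI.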

Unmediated Interoperability Fantasy

  • True interoperability only happens because of mediating factors between resources (e.g. crosswalks, normalisation scripts, understanding of the differences)
  • If seamless interoperability happens without these, then it is lowest-common-denominator interchange instead, where:
    • the initial data structures are trivial, limited or of only structural granularity,
    • the method of interoperation or combined processing is superficial,
    • there has been a loss of intellectual content, or
    • the resulting interoperation is not significant.

TEI Processing Model

  • The TEI Processing Model documentation in TEI ODD customisations enables processing intentions to be recorded
  • Software developers can read a TEI ODD customisation file and generate processing streams based on the documented behaviours
  • Testing and early adopters show that this 'build a factory rather than a car' approach saves significantly on code length and complexity
  • The eXist-db native XML database has incorporated this to produce its TEI Publisher app
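
As a small sketch of what recording a processing intention can look like, the fragment below attaches two processing models to hi. The choice of element, the predicate, and the CSS are assumptions for illustration only, and how the declaration is turned into output is left to whatever software implements the Processing Model.

```xml
<elementSpec ident="hi" module="core" mode="change">
  <!-- Where @rend is 'italic', treat the content as inline and italicise it -->
  <model predicate="@rend='italic'" behaviour="inline">
    <outputRendition>font-style: italic;</outputRendition>
  </model>
  <!-- Otherwise, treat the content as plain inline text -->
  <model behaviour="inline"/>
</elementSpec>
```

The customisation states only the intended behaviour (inline, with a rendition hint), which is the 'factory rather than a car' idea: the same declaration can drive different output formats.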

Let's go customise the TEI!

 

http://www.tei-c.org/Roma/

An introductory workshop talk on customising the TEI.
