Textual editors should beware those talking about digital editions primarily in terms of their presentation and layout for consumption by readers
The power of digital editions comes not from their shiny front-ends (which should look different for different sorts of users) but from the data models and APIs that power their back-ends
These data models and APIs need to fully model the intellectual understanding of the text; the data model, with its documented API, is the real digital edition, and presentation layers are only views of it
Digital editors only need to understand those technologies which affect the intellectual content of the edition
What is the TEI?
An international consortium of institutions, projects and individual members; and a community of users and volunteers
A freely available manual containing a set of regularly maintained and updated recommendations: 'The Guidelines'
Definitions, examples, and discussion of over 540 markup distinctions for textual, image facsimile, genetic editing etc.
A mechanism for producing customized schemas for validating your project's digital texts
A set of free and openly licensed, customizable tools and stylesheets for transformations to many formats (e.g. HTML, Word, PDF, Databases, RDF/LinkedData, Slides, ePub, etc.)
A simple consensus-based way of organizing and structuring textual (and other) resources
A format for documenting your interpretation and understanding of a text (and how text functions)
Whatever you make it! It is a community-driven standard
Some myths about the TEI
The TEI is too big (or complicated)
There is no way to change the TEI
The TEI is too small (or doesn't have <mySpecialElement>)
The TEI is XML (and XML is broken or dead)
You can't get from TEI to $myPreferredFormat
You can't do stand-off markup in XML (or TEI)
XML (and TEI) can't handle overlapping hierarchies
There are no tools that understand the TEI
TEI is only for Anglo/Western works
Interoperability is impossible with the TEI
The TEI is only for a digital edition
If you do a TEI-based edition you must learn other $tech
"The TEI is too big"
The TEI is a modular framework that allows you, a project, or a sub-community to choose precisely what elements are available (cf. EpiDoc)
You customise the TEI in a TEI ODD customisation file where you include (and document) the choices you are making
This enforces consistency amongst a group of encoders (or just yourself), but also serves as machine processable documentation for long-term preservation
Your TEI ODD customisation is then a meta-schema source not only to generate your schema (to validate your documents) but also for your local encoding manual
Module element references by @include = you only ever get these elements
Module element references by @except = you also get any new elements when regenerating the schema
Although there are web-based tools to create TEI customisations for you, what they create is TEI XML underneath
In this case we are changing the 'name' element from the core module
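The difference between @include and @except can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ODD fragment uses the real <schemaSpec> and <moduleRef> elements, but the schema name, the shortened module contents, and the two "releases" (including the names newCoreElement and newStructElement) are invented for the example.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# A tiny ODD fragment; the schema name and element choices are illustrative.
ODD = f"""
<schemaSpec xmlns="{TEI_NS}" ident="myProject" start="TEI">
  <moduleRef key="core" include="p name note"/>
  <moduleRef key="textstructure" except="div1 div2"/>
</schemaSpec>
"""

def available_elements(odd_xml, modules):
    """Return {module: set of element names} selected by each <moduleRef>."""
    root = ET.fromstring(odd_xml)
    selected = {}
    for ref in root.iter(f"{{{TEI_NS}}}moduleRef"):
        key = ref.get("key")
        everything = set(modules.get(key, ()))
        if ref.get("include") is not None:
            # Allow-list: only the named elements, whatever the module contains.
            selected[key] = everything & set(ref.get("include").split())
        else:
            # Deny-list: the whole module minus the named elements.
            selected[key] = everything - set(ref.get("except", "").split())
    return selected

# Hypothetical, heavily shortened module contents for two releases;
# the second release adds one invented element to each module.
RELEASE_A = {"core": {"p", "name", "note", "hi"},
             "textstructure": {"TEI", "text", "body", "div", "div1", "div2"}}
RELEASE_B = {"core": RELEASE_A["core"] | {"newCoreElement"},
             "textstructure": RELEASE_A["textstructure"] | {"newStructElement"}}

before = available_elements(ODD, RELEASE_A)
after = available_elements(ODD, RELEASE_B)
print(sorted(after["core"] - before["core"]))                    # []
print(sorted(after["textstructure"] - before["textstructure"]))  # ['newStructElement']
```

The module selected by @include is unchanged after the upgrade, while the module selected by @except silently gains the new element, which is the trade-off to weigh when choosing between them.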
"There is no way to change the TEI"
The <constraintSpec> element enables us to provide additional constraints (e.g. in SchemaTron)
The <model> element enables us to record our intended processing model(s)
Adding project-specific examples and notes is easy
Your TEI ODD file is also able to contain as much prose description, examples, etc. as you want outside the schema specification
(And you can change the TEI in other ways of course!)
The TEI is an open source community-developed standard
You can submit bugs/feature-requests at http://github.com/TEIC/TEI/issues/
You may get (or give) free support on the TEI-L mailing list (often on textual editing as much as TEI)
You can join Special Interest Groups and lobby for your particular view on critical apparatus (or something else)
Although everything it makes is free, you can also get your projects or institutions to join as members and vote in elections, get discounts on software, archiving, etc.
"The TEI is too small"
The TEI has over 540 elements detailing various textual phenomena; although it does not have <mySpecialElement>, the chances are it can cope with what you need in a more general manner
But even if it can't -- unlike most other standards -- you can add new elements, and do so in a manner that fully integrates and documents them (your TEI ODD customisation file)
You can also ask the TEI to add <mySpecialElement> and the elected group of volunteers will debate it (on the issue or council mailing list, both openly visible)
Feature requests are usually accepted eventually
"The TEI is XML"
The TEI is not XML
Although it currently uses XML as a serialization format, previously it was SGML
When a better format arises (and so far in terms of clarity for long-term preservation, expressiveness, validation, integration, and mass adoption, nothing has come close), it may move away from XML
TEI conformance is governed by the TEI abstract model instantiated in the prose of the TEI Guidelines
If the prose and generated schemas differ, it is the prose that is considered normative
We have constraints in the prose that cannot be modelled in any existing schema language (hence development of Pure ODD)
"(And XML is Broken or Dead)"
The death of XML is highly over-forecast by those who fall victim to technology hype cycles and those who want to push $theirSpecialFormat or technology
There are limitations with XML, but usually these either don't matter, are solved, or are a misunderstanding
Preferring a different format doesn't mean you need to denigrate existing formats; this is not, and should not be, a religious war
You can use XML, JSON, RDF, LaTeX, DocX, Markdown, and many other formats (and generate them from your TEI if you wish)
Don't believe zealots: your choice of format should be about the appropriate format for rich encoding suitable to those particular circumstances not about technology fads (but for critical editing TEI is a very good choice)
"You can't get from TEI to $myPreferredFormat"
XML is easily processable with dozens of programming languages
The TEI Consortium provides XSLT stylesheets for transformations to/from around 40 other formats
Including, for example: bibtex, cocoa, csv, docbook, docx, dtd, epub, html(5), xsl-fo, json, InDesign, latex, markdown, mediawiki, nlm, odd, pdf, rdf, relaxng, slides, txt, wordpress, xlsx, xsd, and many more
Tools like OxGarage pipeline together these and other conversions
Rolling your own XSLT, or profiles of the TEIC XSLT, is fairly easy (compared with other academic skills)
The important thing is the granularity of the information encoded
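As a rough illustration of why granularity matters, the Python sketch below derives three outputs from one small TEI paragraph using only the standard library. It is not the TEIC Stylesheets or OxGarage; the sample text, the names in it, and the function names are all invented for the example.

```python
import json
import xml.etree.ElementTree as ET
from html import escape

TEI_NS = "http://www.tei-c.org/ns/1.0"
T = f"{{{TEI_NS}}}"

# Invented sample: a paragraph with tagged personal and place names.
SAMPLE = f"""
<p xmlns="{TEI_NS}">Letter from <persName ref="#AS">Ann Smith</persName>
to <persName ref="#JD">John Doe</persName>, written at
<placeName>Oxford</placeName>.</p>
"""

def to_plain_text(elem):
    """Drop all markup and normalise whitespace."""
    return " ".join("".join(elem.itertext()).split())

def to_html(elem):
    """Map the paragraph and its tagged names onto simple HTML spans."""
    parts = [escape(elem.text or "")]
    for child in elem:
        css = child.tag.replace(T, "")
        parts.append(f'<span class="{css}">{escape(to_plain_text(child))}</span>')
        parts.append(escape(child.tail or ""))
    return "<p>" + " ".join("".join(parts).split()) + "</p>"

def to_name_index(elem):
    """Emit a JSON index of people, possible only because names were tagged."""
    people = [{"ref": n.get("ref"), "name": to_plain_text(n)}
              for n in elem.iter(T + "persName")]
    return json.dumps(people)

root = ET.fromstring(SAMPLE)
print(to_plain_text(root))  # Letter from Ann Smith to John Doe, written at Oxford.
print(to_html(root))
print(to_name_index(root))
```

The plain-text output could have come from any transcription, but the name index exists only because the encoder distinguished personal names from the surrounding text.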
"You can't do stand-off markup in XML (or TEI)"
This myth shows a misunderstanding of XML and unfamiliarity with TEI
While lots of TEI users favour embedded markup, there are lots of elements in the TEI specifically designed for stand-off markup (cf. <link>, <join>)
Your edition could be a very flat text and you could point into it (using URIs, XPointers, etc.) to provide stand-off markup
A critical apparatus can be completely separate from a base text and point into it using many of the URI datatyped attributes
There could be more documentation and explanation of this in the TEI Guidelines, and there are proposals to improve it; more general tools are needed
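A minimal Python sketch of a stand-off apparatus pointing into a flat base text follows. It assumes word-level xml:id anchors and an <app> whose @from and @to attributes mark the affected span; the sample sentence, the witness sigla, and the same-document "#w2" pointers are simplifications invented for the example (a separate apparatus file would normally point at the base file's URI as well).

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
T = f"{{{TEI_NS}}}"

# Base text: flat, with only word-level anchors.
BASE = f"""
<ab xmlns="{TEI_NS}">
  <w xml:id="w1">The</w> <w xml:id="w2">quick</w>
  <w xml:id="w3">brown</w> <w xml:id="w4">fox</w>
</ab>
"""

# Apparatus kept apart from the base text; @from/@to point into it.
APPARATUS = f"""
<listApp xmlns="{TEI_NS}">
  <app from="#w2" to="#w3">
    <rdg wit="#B">slow red</rdg>
  </app>
</listApp>
"""

def witness_text(base_xml, apparatus_xml, witness):
    """Rebuild one witness's text by applying stand-off readings to the base."""
    words = list(ET.fromstring(base_xml).iter(T + "w"))
    ids = [w.get(XML_ID) for w in words]
    tokens = [w.text for w in words]
    replacements = []
    for app in ET.fromstring(apparatus_xml).iter(T + "app"):
        start = ids.index(app.get("from").lstrip("#"))
        end = ids.index(app.get("to", app.get("from")).lstrip("#"))
        for rdg in app.iter(T + "rdg"):
            if witness in rdg.get("wit", "").split():
                replacements.append((start, end, rdg.text or ""))
    # Apply from the end of the text so earlier indexes stay valid.
    for start, end, reading in sorted(replacements, reverse=True):
        tokens[start:end + 1] = reading.split()
    return " ".join(tokens)

print(witness_text(BASE, APPARATUS, "#A"))  # The quick brown fox
print(witness_text(BASE, APPARATUS, "#B"))  # The slow red fox
```

The base text never changes; each additional witness or layer of annotation is just another file of pointers.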
"XML (and TEI) can't handle overlapping hierarchies"
The TEI Guidelines have a whole chapter (#20) about how to handle non-hierarchical structures
While it is true that TEI users often prefer to privilege the intellectual content over the physical construct, there are ways to mark both of these (e.g. milestones)
Revisions to TEI's <app> element enable <lem> and <rdg> to allow paragraphs, divisions, and thus it isn't limited to phrase-level textual variance
Multiple hierarchies are handled with forms of stand-off or out-of-line markup, which can perfectly reasonably be done in XML (and TEI)
It would be good to have more tools (there are some) specifically for this kind of work though
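The milestone approach can be illustrated with a short Python sketch: paragraphs form the primary hierarchy, while empty <pb/> elements record where pages begin, so the physical view can be rebuilt even though it cuts across paragraph boundaries. The sample text and the function name are invented for the example, and a real project would need to handle more cases (nested elements, notes, text before the first page break).

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
T = f"{{{TEI_NS}}}"

# Paragraphs are the primary hierarchy; <pb/> milestones mark page starts.
SAMPLE = f"""
<body xmlns="{TEI_NS}">
  <p><pb n="1"/>First paragraph starts on page one
     <pb n="2"/>and ends on page two.</p>
  <p>Second paragraph is wholly on page two.</p>
</body>
"""

def text_by_page(body_xml):
    """Regroup the text by page, cutting across paragraph boundaries."""
    pages = {}
    current = None  # text before the first <pb/> is ignored in this sketch

    def walk(elem):
        nonlocal current
        if elem.tag == T + "pb":
            current = elem.get("n")
            pages.setdefault(current, [])
        elif elem.text and elem.text.strip() and current is not None:
            pages[current].append(" ".join(elem.text.split()))
        for child in elem:
            walk(child)
            if child.tail and child.tail.strip() and current is not None:
                pages[current].append(" ".join(child.tail.split()))

    walk(ET.fromstring(body_xml))
    return {n: " ".join(chunks) for n, chunks in pages.items()}

for number, text in text_by_page(SAMPLE).items():
    print(number, "->", text)
# 1 -> First paragraph starts on page one
# 2 -> and ends on page two. Second paragraph is wholly on page two.
```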
"There are no tools that understand the TEI"
(Of course, we'd be happy if there were more!)
"TEI is only for Anglo/Western works"
"Interoperability is impossible with the TEI"
The necessary ability to customise, constrain, extend the standard does pose a challenge for interoperability, but it is certainly possible
Usually people interoperate (rather than interchange) through lowest-common-denominator subsets or pre-existing TEI subsets (like TEI Lite or TEI Simple)
More complex forms of markup interoperability may need some mediating influence (e.g. someone to understand both uses of the TEI)
The solution is proper documentation (by which I mean machine-processable TEI ODD customisation files with lots of prose as well)
The ability to interchange many documents improves significantly with a common interchange format
Customisation can document the differences in a machine processable format so tools can compare different corpora
- @louburnard
"TEI is only for a digital edition"
The TEI is for many forms of output
There isn't a one-to-one relationship between a TEI file and 'The Digital Edition' -- if you are using the format to its potential then you can create many aspects of the edition, supplementary files, indices
From a well-encoded TEI file you can create not only a digital edition, but camera-ready print copy, interactive graphic visualisations of encoded information, and many other formats
A single encoded TEI file can be used to produce multiple forms of edition (e.g. for different audiences; or diplomatic, eclectic, etc.)
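One way to picture this is the Python sketch below, which derives a diplomatic and a reading text from the same encoded paragraph by choosing different children of each <choice> element. The sample sentence, the view names, and the function names are invented for the example; a real edition would involve many more decisions than these two.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
T = f"{{{TEI_NS}}}"

# Invented sample: original and regularised forms recorded side by side.
SAMPLE = f"""
<p xmlns="{TEI_NS}">I <choice><orig>haue</orig><reg>have</reg></choice> seen
<choice><abbr>ye</abbr><expan>the</expan></choice> book.</p>
"""

# Which children of <choice> each view of the edition prefers.
VIEWS = {
    "diplomatic": {"orig", "abbr"},
    "reading": {"reg", "expan"},
}

def render(elem, keep):
    """Flatten to text, taking only the preferred child of each <choice>."""
    out = [elem.text or ""]
    for child in elem:
        if child.tag == T + "choice":
            for option in child:
                if option.tag.replace(T, "") in keep:
                    out.append(render(option, keep))
        else:
            out.append(render(child, keep))
        out.append(child.tail or "")
    return "".join(out)

def edition(tei_xml, view):
    """Return one view of the text with whitespace normalised."""
    return " ".join(render(ET.fromstring(tei_xml), VIEWS[view]).split())

print(edition(SAMPLE, "diplomatic"))  # I haue seen ye book.
print(edition(SAMPLE, "reading"))     # I have seen the book.
```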
"If you do a TEI-based edition you must learn other $tech"
When people create digital editions they often take it on themselves to learn not only TEI, but the technologies to transform and manipulate this
This is great for those who can do so, or want to learn, but editors only need to understand those technologies which affect the intellectual content
Increasingly, tools like TEI Boilerplate and eXist-db's TEI Publisher, in addition to the TEIC Stylesheets, give editors more independent control
The recent introduction of TEI Processing Model documentation inside TEI ODD gives tool-makers a way to generate software based on implementation-agnostic instructions that an editor (or editorial assistant) could modify
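The idea behind the Processing Model can be sketched as a toy interpreter in Python: the ODD says which behaviour applies to which element, and a separate piece of software decides how each behaviour is realised. This is not how TEI Publisher or any existing tool works internally; the mapping from behaviours to HTML tags, the fallback to a div, and the sample document are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from html import escape

TEI_NS = "http://www.tei-c.org/ns/1.0"
T = f"{{{TEI_NS}}}"

# Editor-facing instructions: which behaviour applies to which element.
ODD = f"""
<schemaSpec xmlns="{TEI_NS}" ident="myProject">
  <elementSpec ident="head" mode="change"><model behaviour="heading"/></elementSpec>
  <elementSpec ident="p" mode="change"><model behaviour="paragraph"/></elementSpec>
  <elementSpec ident="hi" mode="change"><model behaviour="inline"/></elementSpec>
</schemaSpec>
"""

# One possible realisation of each behaviour (an assumption of this sketch).
HTML_TAGS = {"heading": "h2", "paragraph": "p", "inline": "em", "block": "div"}

def read_models(odd_xml):
    """Collect {element name: behaviour} from the <model> declarations."""
    models = {}
    for spec in ET.fromstring(odd_xml).iter(T + "elementSpec"):
        model = spec.find(T + "model")
        if model is not None:
            models[spec.get("ident")] = model.get("behaviour")
    return models

def to_html(elem, models):
    """Render an element with the tag chosen for its declared behaviour."""
    name = elem.tag.replace(T, "")
    tag = HTML_TAGS.get(models.get(name), "div")  # undeclared elements become divs
    inner = escape(elem.text or "")
    for child in elem:
        inner += to_html(child, models) + escape(child.tail or "")
    return f"<{tag}>{inner}</{tag}>"

DOC = f'<div xmlns="{TEI_NS}"><head>Preface</head><p>A <hi>short</hi> note.</p></div>'
print(to_html(ET.fromstring(DOC), read_models(ODD)))
# <div><h2>Preface</h2><p>A <em>short</em> note.</p></div>
```

An editor who wanted highlighted text rendered differently would change one attribute value in the ODD rather than touching the rendering code.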
Conclusions
The TEI is as big or small as you want it to be -- the community helps users, projects, disciplines to change it
XML, and the TEI, are alive and well
You can use stand-off markup in the TEI and it is one of the recommended ways of handling overlapping hierarchies
With good markup you can get to/from almost any format (and many conversions already exist)
There are tools that understand XML and TEI, but more generalised ones are always good
TEI is used for texts of any language, any time period, and any writing system
Interoperability is always a challenge, but easier when you converge on a format
The TEI is for many outputs, not just digital editions
What editors need to learn is TEI; other technologies depend on the project's needs