Prof. Eleanor Robson
Raquel Alegre
Dr. James Hetherington
Dr. Jens Nielsen
UCL History Department
UCL Research Software Development Group
Metadata: project info, lang, protocols...
Transliteration and lemmatization
Translation
Comments
Descriptions: rulings, blank, ...
Sections: object, parts, ...
Lex breaks the input text into a stream of tokens, matching each one against a regular expression (RE):
Input        Token       Regular expression
#            t_HASH      r'\#'
project      PROJECT
:            t_COLON     r'\:'
cams/gkab    t_ID        r'[a-zA-Z0-9]+[/]?[a-zA-Z0-9]+'
[new line]   t_NEWLINE   r'\n'
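The tokenising step can be sketched with Python's standard `re` module alone. This is a minimal illustration, not the project's lexer and not Lex itself: the `tokenize` function, the `RESERVED` keyword table and the whitespace-skipping `SKIP` rule are assumptions added for the example, and only the `#`, `:`, identifier and newline patterns come from the slide.

```python
import re

# Patterns for HASH, COLON, NEWLINE and ID follow the slide.
# SKIP (spaces and tabs) and the RESERVED table are assumptions for this sketch.
TOKEN_SPEC = [
    ("HASH", r"\#"),
    ("COLON", r":"),
    ("NEWLINE", r"\n"),
    ("ID", r"[a-zA-Z0-9]+[/]?[a-zA-Z0-9]+"),
    ("SKIP", r"[ \t]+"),
]
RESERVED = {"project": "PROJECT"}

# One combined pattern with a named group per token type.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (token_type, value) pairs for the input text."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"Illegal character {text[pos]!r} at position {pos}")
        kind, value = match.lastgroup, match.group()
        pos = match.end()
        if kind == "SKIP":
            continue  # whitespace separates tokens but is not a token itself
        if kind == "ID":
            # Keywords match the ID pattern first and are then reclassified.
            kind = RESERVED.get(value, "ID")
        tokens.append((kind, value))
    return tokens


tokens = tokenize("#project: cams/gkab\n")
```

Here `tokens` holds one pair per token: the hash, the `PROJECT` keyword, the colon, the identifier `cams/gkab` and the newline.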
Yacc parses the stream of tokens produced by Lex and applies semantic actions to it, following a grammar description:
expression : expression + term
           | expression - term
           | expression * term
           | expression / term
           | term
3 * 5 + 1
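To see what this grammar does with the example input, here is a small hand-written evaluator. It is only a sketch: it replaces Yacc's generated table-driven parser with a simple left-to-right loop, and it assumes that a `term` is a non-negative integer literal, which the slide does not specify. The names `tokenize`, `parse_term` and `evaluate` are illustrative. Because all four operators appear in the same left-recursive rule, they are applied strictly left to right with equal precedence.

```python
import operator
import re

OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def tokenize(text):
    # Integer literals, operators, or any other single non-space character
    # (kept as a token so the parser can reject it).
    return re.findall(r"\d+|[-+*/]|\S", text)


def parse_term(token):
    # Assumption: a term is a non-negative integer literal.
    if not token.isdigit():
        raise SyntaxError(f"expected a number, got {token!r}")
    return int(token)


def evaluate(text):
    """Evaluate 'expression : expression OP term | term' from left to right."""
    tokens = tokenize(text)
    if not tokens:
        raise SyntaxError("empty expression")
    value = parse_term(tokens[0])  # expression : term
    pos = 1
    while pos < len(tokens):
        op = tokens[pos]
        if op not in OPS:
            raise SyntaxError(f"expected an operator, got {op!r}")
        if pos + 1 >= len(tokens):
            raise SyntaxError("missing term after operator")
        # expression : expression OP term
        value = OPS[op](value, parse_term(tokens[pos + 1]))
        pos += 2
    return value


result = evaluate("3 * 5 + 1")  # (3 * 5) + 1
```

With this single-rule grammar, `2 + 3 * 4` is read as `(2 + 3) * 4`; giving multiplication priority would need separate grammar levels or precedence declarations.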
def p_document(self, p):
    """document : text
                | object
                | composite"""

def p_text_language(self, p):
    "text : text language_protocol"
    p[0] = Text()
    p[0].language = p[2]

def p_language_protocol(self, p):
    "language_protocol : ATF LANG ID newline"
    p[0] = p[3]
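In these rule functions, `p[1]` to `p[n]` hold the values of the symbols on the right-hand side of the rule, and whatever is assigned to `p[0]` becomes the value of the left-hand side. The sketch below imitates that convention with a plain Python list so it runs without a parser generator. The `apply_rule` helper, the stand-in `Text` class and the token values (`"atf"`, `"lang"`, `"akk"`) are assumptions made for illustration; the two rule bodies mirror the ones above, written as plain functions without `self`.

```python
class Text:
    """Hypothetical stand-in for the parser's Text model."""

    def __init__(self):
        self.language = None


def p_language_protocol(p):
    "language_protocol : ATF LANG ID newline"
    p[0] = p[3]  # keep only the value of the ID token


def p_text_language(p):
    "text : text language_protocol"
    p[0] = Text()
    p[0].language = p[2]  # value produced by language_protocol


def apply_rule(rule, *symbols):
    # Slot 0 receives the result; slots 1..n carry the right-hand-side values.
    p = [None, *symbols]
    rule(p)
    return p[0]


# First reduce the four tokens to a language_protocol value...
language = apply_rule(p_language_protocol, "atf", "lang", "akk", "\n")
# ...then combine an existing text with that value.
text = apply_rule(p_text_language, Text(), language)
```

After both steps, `language` is the identifier string and `text.language` carries the same value.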
Graphical user interface for editing ATF texts
Feedback and suggestions are welcome!
Researcher
Software Developer
Research Software Developer
@uclrcsoftdev
rc-softdev@ucl.ac.uk
r.alegre@ucl.ac.uk
Questions?