digital humanities technology specialist @ nyu it & libraries
dream lab x minimal computing
working on metadata & item data for your collection
item data
- i.e., collection assets
- comprises the digital files representing the source object
- e.g., a JPEG, PDF, MKV, CAD
- many file formats include their own technical metadata within the spec; boundaries are porous!
metadata
- i.e., collection records
- "data about data"
- generally text-based
- can be descriptive, structural, administrative
- e.g., label or keyword; page number or file path; ISSN or copyright
- can be stored and restored in many formats, e.g., XML, CSV, JSON, MARC
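A minimal sketch of "stored and restored": the same record written to CSV and JSON, then read back. The field names (`pid`, `label`, `keyword`) and values are hypothetical, chosen only for illustration.

```python
import csv
import io
import json

# Hypothetical record; the field names are illustrative, not a required schema.
record = {"pid": "obj1", "label": "Harbor map", "keyword": "maps; harbors"}

# Store as CSV text: one header row, one data row.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(record), lineterminator="\n")
writer.writeheader()
writer.writerow(record)
csv_text = buffer.getvalue()

# Store as JSON text.
json_text = json.dumps(record, indent=2)

# Restore from each format.
from_csv = next(csv.DictReader(io.StringIO(csv_text)))
from_json = json.loads(json_text)

print(csv_text)
print(json_text)
```

Both restored records should match the original, which is the point: the record is independent of the format that carries it.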
"collections as data"
Wax tries to follow a "collections as data" model by separating canonical data from prose and styling and by using open formats for re-usability.
See examples at
wax data & metadata
what does a CSV *actually* look like?
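One way to see for yourself, sketched in Python: print the raw characters of a small CSV, then parse it. The three columns and two items here are made up for the example.

```python
import csv
import io

# Hypothetical two-item collection; column names are illustrative only.
raw = (
    "pid,label,description\n"
    'obj1,Harbor map,"Hand-colored map, 1852"\n'
    "obj2,Café menu,Printed menu\n"
)

# repr() exposes what a spreadsheet hides: commas, quotation marks, newlines.
print(repr(raw))

# Parse the same text into rows of fields.
rows = list(csv.reader(io.StringIO(raw)))
for row in rows:
    print(row)
```

Note that the comma inside the quoted description stays part of one field instead of starting a new column.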
tips & gotchas
- Gotcha: Avoid Microsoft Excel! It uses custom, hidden newline characters that can break your character encoding
- Gotcha: Avoid styling within metadata as much as possible (e.g., a bold phrase within a description)
- Gotcha: Use UTF-8 whenever possible for your metadata and text editing. It'll support diacritics and special characters with less chance of breaking something
- Tip: Pay attention to quotation marks! They are functionally significant in many file formats
- Tip: Use linters like CSVLint, JSONLint, and YAMLlint to troubleshoot formatting problems
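A rough sketch of the kind of check a CSV linter performs, assuming only two rules: the bytes must decode as UTF-8, and every record must have as many fields as the header. The function name `check_csv_bytes` is hypothetical, and real linters such as CSVLint check much more.

```python
import csv
import io

def check_csv_bytes(data: bytes) -> list:
    """Return a list of human-readable problems found in CSV bytes."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError as err:
        return [f"not valid UTF-8: {err}"]

    rows = list(csv.reader(io.StringIO(text, newline="")))
    if not rows:
        return ["file is empty"]

    problems = []
    expected = len(rows[0])  # the header row sets the field count
    for number, row in enumerate(rows[1:], start=2):
        if len(row) != expected:
            problems.append(
                f"record {number}: expected {expected} fields, found {len(row)}"
            )
    return problems

# Quoted comma: fine. Unquoted comma: the record gains an extra field.
good = 'pid,label\nobj1,"Map, hand-colored"\n'.encode("utf-8")
bad = "pid,label\nobj1,Map, hand-colored\n".encode("utf-8")
# Same text saved in a non-UTF-8 encoding (Latin-1) fails to decode.
latin1 = "pid,label\nobj1,Café\n".encode("latin-1")

for sample in (good, bad, latin1):
    print(check_csv_bytes(sample))
```

The second sample shows why quotation marks are functionally significant: without them, the comma in the label is read as a column separator.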
preview: what will wax do with this data & metadata?
- Use your high-quality images and PDFs to create derivatives: i.e., copies and transformations of the files that are optimized for online viewing & sharing
- Use your metadata records to:
a. automatically generate 1 web page per item
b. add alt text where appropriate
c. populate the search index
d. and more
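To make the idea concrete, here is a loose Python illustration of turning metadata records into one page per item, alt text, and a search index. This is not Wax's implementation; the records, field names, and the `page_for` helper are all hypothetical.

```python
import json

# Hypothetical metadata records; Wax itself reads these from your metadata file.
records = [
    {"pid": "obj1", "label": "Harbor map", "description": "Hand-colored map"},
    {"pid": "obj2", "label": "Café menu", "description": "Printed menu"},
]

def page_for(record):
    """Build a (filename, page text) pair with a front-matter-style header."""
    front_matter = "\n".join(
        f"{key}: {json.dumps(value, ensure_ascii=False)}"
        for key, value in record.items()
    )
    return f"{record['pid']}.md", f"---\n{front_matter}\n---\n"

# a. one page per item
pages = dict(page_for(r) for r in records)

# b. alt text taken from a descriptive field (here, the label)
alt_text = {r["pid"]: r["label"] for r in records}

# c. a simple search index: one lowercase text entry per item
search_index = [
    {"id": r["pid"], "text": f"{r['label']} {r['description']}".lower()}
    for r in records
]

for name, body in pages.items():
    print(name)
    print(body)
```

Every output is derived from the same records, so correcting a value in the metadata corrects the page, the alt text, and the index together.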