Daina Bouquin
Harvard-Smithsonian Center for Astrophysics
daina.bouquin@cfa.harvard.edu
Harvard University
Smithsonian Institution
Some things that I work on:
The relationships between signifiers
and what they stand for in reality.
How we understand what something means.
Vocabulary of a person, language, or branch of knowledge.
(contains the signifiers)
Copernicus, N. (1543). Nicolai Copernici Torinensis De revolutionibus orbium cœlestium libri vi. Norimbergae: Apud Ioh. Petreium.
Galilei, G. (1610). Osservazioni e calcoli relativi ai Pianeti Medicei.
Galileo (67 years later)
Threatened with torture
Imprisoned for life
Burned his books
(Largely seen as the birth of observational astronomy and the scientific method)
(It was easy to dismiss)
Meaning is collective agreement about a specific thing at a specific time.
Humphrey, S.D. Multiple Exposures of the Moon: Nine Exposures, daguerreotype, 1849.
Sometimes it's more about privilege.
Earliest image of the moon extant.
There could have been other images of the moon.
Gift to the President of Harvard at the time.
(This is it on my desk.)
Provenance means context.
Daguerreotype "Recipe book"
Matters because of its relationship to the daguerreotype.
Provenance guides prioritization for curation.
Curation is work.
Everything will break.
Things need to be reformatted.
Entire fields are being developed in response:
Stabilizing and recovering data from digital media.
The creators of these objects did not need to care about the historic meaning of their work.
Provenance could be determined, so we gave these things meaning and prioritized them for curation.
We know what to call these things and
we know how to take care of them.
Knowledge is more than books and articles.
When does something like this matter?
Who decides?
How do we semantically link this to anything?
How would someone find it?
(What do I call it?)
Mechanisms for modeling relationships between the information gathered from provenancial sources.
Logical framework where
semantic metadata can be recorded.
I can describe this thing but give it little meaning.
Cultural norms prevent me from throwing this away.
(I would feel bad)
A paper could provide some provenance.
Our schema should definitely have a field
where we can identify a relevant paper.
Remember though:
Who didn't?
Is the "author" of the paper identical to
the "author" of this thing?
Who gets credit?
We need to be able to directly identify the object to distinguish between the object and our sources of provenance.
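One way to picture such a schema is sketched below. It is only an illustration: the class names, field names, and identifier strings are all hypothetical and come from no particular standard. The point is that the object carries its own identifier, separate from the identifiers of the papers that supply its provenance, and that the object's creators and the paper's authors are recorded separately.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProvenanceSource:
    """A source that gives an object context, such as a paper (hypothetical model)."""
    identifier: str               # identifier of the source, not of the object
    authors: List[str]            # who gets credit for the source
    relation: str = "describes"   # hypothetical label for how it relates to the object


@dataclass
class CuratedObject:
    """The thing itself, identified independently of its provenance sources."""
    identifier: str
    description: str
    creators: List[str] = field(default_factory=list)
    related_papers: List[ProvenanceSource] = field(default_factory=list)

    def creators_match(self, source: ProvenanceSource) -> bool:
        """Is the 'author' of the paper identical to the 'author' of this thing?"""
        return set(self.creators) == set(source.authors)


# All values below are placeholders for illustration only.
paper = ProvenanceSource(
    identifier="paper:hypothetical-0001",
    authors=["A. Author", "B. Author"],
)
thing = CuratedObject(
    identifier="object:hypothetical-0001",
    description="An object I can describe but not yet give much meaning.",
    creators=["C. Creator"],
    related_papers=[paper],
)

print(thing.identifier != paper.identifier)  # object and source stay distinct
print(thing.creators_match(paper))           # credit may differ between the two
```

Keeping the two identifiers apart means the paper can supply context without standing in for the object.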
Software will be the foundation on which future generations must build new knowledge.
Just means it's in a place right now.
Unambiguous way to point at a specific thing in a specific place at a specific time.
Where the thing you are pointing at is at a specific time.
Exists in many ways
in many places over time.
The daguerreotype is also on Pinterest.
This page doesn't exist there anymore.
It also didn't tell me where the real thing is.
Is it on my desk or in a vault?
URL
Uniform Resource Locator
Locations change.
Provenance changes.
Meaning changes.
Identification
attached to machine actionable metadata
Identifier
DOI
URI
Bibcode
arXiv ID
etc.
Locator
URL
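The split between identifiers and locators can be sketched as a small classifier. This is a rough illustration only: the regular expressions are simplified assumptions about the shape of each scheme (they do not validate anything), and the sample arXiv-style and bibcode-style strings are there only to show the shapes.

```python
import re

# Simplified shape checks, assumed for illustration; real schemes have more rules.
PATTERNS = [
    ("locator", "URL", re.compile(r"^https?://\S+$")),
    ("identifier", "DOI", re.compile(r"^10\.\d{4,9}/\S+$")),
    ("identifier", "arXiv ID", re.compile(r"^\d{4}\.\d{4,5}(v\d+)?$")),
    ("identifier", "Bibcode", re.compile(r"^\d{4}[\w.&]{14}[A-Z.]$")),
]


def classify(reference: str):
    """Return (role, kind) for a reference string, or ('unknown', 'unknown')."""
    text = reference.strip()
    for role, kind, pattern in PATTERNS:
        if pattern.match(text):
            return role, kind
    return "unknown", "unknown"


samples = [
    "https://github.com/dfm/corner.py",
    "10.5281/zenodo.1327325",
    "1234.56789",            # arXiv-style shape, placeholder value
    "2016JOSS....1...24F",   # bibcode-style shape (19 characters)
]
for sample in samples:
    print(sample, "->", classify(sample))
```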
https://github.com/dfm/corner.py
was
https://github.com/dfm/triangle.py
Changes over time.
The meaning you are trying to express now will be different from what will be located at this URL later.
This is not what you cite because this has no unambiguous meaning.
Cite the DOI for the specific version of the thing you want to cite.
You already do this with papers.
This page has a URL: https://zenodo.org/record/53155
This page is an interface where metadata is displayed.
The metadata is stored
with the identifier (DOI).
The URL is just another piece of metadata.
DOIs are resolvable.
They are bound to metadata.
Minted by a registry responsible for curating location metadata.
Resolves to a tombstone.
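Resolution and metadata binding can be sketched with the standard library. The sketch assumes the public resolver at https://doi.org/ and assumes the DOI's registration agency supports content negotiation for CSL JSON; the DOI used is the one from the Jackson (2018b) reference in this talk. Only the request is built here. The `fetch_metadata` helper needs network access and is not called.

```python
import json
import urllib.request

RESOLVER = "https://doi.org/"                          # public DOI resolver
CSL_JSON = "application/vnd.citationstyles.csl+json"   # assumed to be supported


def build_request(doi: str, accept: str = None) -> urllib.request.Request:
    """Build a request for a DOI; with an Accept header, ask for metadata."""
    headers = {"Accept": accept} if accept else {}
    return urllib.request.Request(RESOLVER + doi, headers=headers)


def fetch_metadata(doi: str) -> dict:
    """Fetch the metadata bound to a DOI (requires network access)."""
    with urllib.request.urlopen(build_request(doi, CSL_JSON)) as response:
        return json.load(response)


request = build_request("10.5281/zenodo.1327325", CSL_JSON)
print(request.full_url)
print(request.get_header("Accept"))
```

Without the `Accept` header the same request simply follows the identifier to whatever location the registry currently has on record, which may be a landing page or a tombstone.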
Summary: Identifiers let us point unambiguously at a thing and assign it semantic meaning through metadata.
They can only work with the metadata they are given.
When we enrich metadata new connections are possible.
Libraries and archives aren't the direct
stewards of your work anymore.
We need to be able to find your work though.
You need to be able to make informed choices about it.
Our bibliographies represent your work.
We need to work together.
We can give you tools but you need to make choices.
Two different papers.
(Not the code)
Software DOIs don't guarantee software citation
complicated / conflicting author instructions
Systems need to change.
People who write software
need to decide what matters.
But we have started to define our lexicon.
A human- and machine-readable file format that provides citation metadata for software.
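This description matches formats such as the Citation File Format (CFF), where a `CITATION.cff` file sits alongside the code. Assuming that is the kind of file meant here, the sketch below renders a minimal one as text. The helper `render_cff` is hypothetical, its quoting is naive (it does not escape quotation marks), and the title, author, version, and DOI are placeholders.

```python
def render_cff(title, authors, version=None, doi=None):
    """Render a minimal CITATION.cff-style document as a string.

    `authors` is a list of (family_names, given_names) tuples.
    Hypothetical helper with naive quoting; illustration only.
    """
    lines = [
        "cff-version: 1.2.0",
        'message: "If you use this software, please cite it as below."',
        f'title: "{title}"',
        "authors:",
    ]
    for family, given in authors:
        lines.append(f'  - family-names: "{family}"')
        lines.append(f'    given-names: "{given}"')
    if version:
        lines.append(f'version: "{version}"')
    if doi:
        lines.append(f'doi: "{doi}"')
    return "\n".join(lines) + "\n"


# Placeholder values; the DOI below is not a real identifier.
cff_text = render_cff(
    title="example-tool",
    authors=[("Doe", "Jane")],
    version="1.0.0",
    doi="10.5281/zenodo.0000000",
)
print(cff_text)
```

A person can read and edit the file by hand, and a tool can parse the same file to build a citation, which is the sense in which it is both human- and machine-readable.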
more than citation metadata
Lets us translate our lexicon from one schema to another.
Enables interoperability and further contextualization.
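Translating from one schema to another can be sketched as a crosswalk table. The mapping below is a small illustrative subset, assuming CFF-style keys on one side and CodeMeta-style property names on the other. It is not an authoritative crosswalk, and the record values are placeholders.

```python
# Illustrative subset only: assumed CFF-style key -> CodeMeta-style property.
CROSSWALK = {
    "title": "name",
    "authors": "author",
    "version": "version",
    "doi": "identifier",
    "repository-code": "codeRepository",
    "license": "license",
}


def translate(record, crosswalk=CROSSWALK):
    """Rename keys using the crosswalk; report keys that have no mapping."""
    translated, unmapped = {}, []
    for key, value in record.items():
        if key in crosswalk:
            translated[crosswalk[key]] = value
        else:
            unmapped.append(key)
    return translated, unmapped


# Placeholder record for illustration.
record = {
    "title": "example-tool",
    "version": "1.0.0",
    "repository-code": "https://example.org/example-tool",
    "message": "Please cite this software.",
}
translated, unmapped = translate(record)
print(translated)
print(unmapped)
```

Reporting the unmapped keys matters as much as the translation itself: they show where one lexicon has no word for something the other can say.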
Identifiers can be mapped to other identifiers.
e.g. SigMF (Signal Metadata Format)
Hardware is provenance
Jackson, M. (2018b). Software Deposit: What to deposit (Version 1.0). http://doi.org/10.5281/zenodo.1327325
Bouquin, D., Hou, S., Benzing, M., Wilson, L. (2019). Jupyter Notebooks: A Primer for Curators (Version v1.0).
Working on Guidance
(building discipline specific resources too)
We have a complete history of nothing.
Some things get a legacy and some things don't.
Your work matters.