Metadata, Meaning, and Erasure in Astronomy
Daina Bouquin
Semantics
The relationships between signifiers
and what they stand for in reality.
How we understand what something means.
Metadata
Mechanisms for modeling semantic meaning.
Copernicus, N. (1543). Nicolai Copernici Torinensis De revolutionibus orbium cœlestium libri vi. Norimbergae: Apud Ioh. Petreium.
Copernicus
-
Published towards the end of his life
-
Work was dedicated to Pope Paul III
- Theory was largely seen as a mathematical convenience
Galilei, G. (1610). Osservazioni e calcoli relativi ai Pianeti Medicei.
Galileo
(89 years later)
-
Tried for heresy by the Inquisition
-
Imprisoned for life
- Banned his books
Galileo didn't know his chicken scratch would be important.
(Largely seen as the birth of observational astronomy and the scientific method)
People didn't care that much about Copernicus' model.
(It was easy to dismiss)
Meaning is collective agreement about a specific thing at a specific time.
Semantic meaning is not static.
Humphrey, S.D. (1849). Multiple Exposures of the Moon: Nine Exposures, daguerreotype.
http://id.lib.harvard.edu/images/olvwork124646/catalog
Gift to the President of Harvard at the time.
(This is it on my desk.)
Now it's art.
The creators of these objects did not need to care
about the historic meaning of their work.
Norms allowed them to share their work in
ways that were directly attributable to them.
Over time, metadata was physically and
digitally recorded and the semantic network
around their work grew.
People understand what these things are
and why they should keep them.
What makes something citable?
Identification
Unambiguous way to point at a specific thing in a specific place at a specific time.
(DOI, URI, URN, USBN, Bibcode, arXiv ID, etc.)
Location
Where the thing you are pointing at is at a specific time.
(URL)
Identifiers aren't magic.
Memory Institutions mint identifiers
and curate metadata to ensure
that works are findable and have
meaning that can change over time.
What gets an identifier?
I can describe this thing but give it little meaning.
Norms prevent me from throwing it away.
(Data is important now and I would feel bad)
It gets an identifier because I've been told it's important.
Henrietta Swan Leavitt
First standard with which to measure the distance at a galactic scale.
Log of the period of the star
Apparent magnitude
Lines correspond to the min and max brightness
Mathematician Gösta Mittag-Leffler tried to nominate Leavitt for the Nobel Prize in 1925 only to find out she had died of cancer in 1921.
The Nobel is not awarded posthumously.
Hundreds of women worked with the plates.
Enabling Full-Text Searchability with over 20,000 volunteers
Observations don't stop being scientifically valuable because they're old.
Light curve of 3C273 (786 points) over 100 years
Normalizing plate numbers on the Zooniverse
We now know the names of 216 Women.
All of the plate numbers have been found and the
metadata has been recorded.
Software development and database refactoring
necessary to connect the data to the notebooks
is happening now.
This isn't just a historical problem.
Code is Speech.
Science relies on software, but who gets credit?
version 1.0
version 1.1
version 2.0
What are people citing if there are no identifiers?
How do they know if anyone cites their software?
The "citations" we found
fail to accomplish many of the functions of citation.
2003 paper with two authors
What's wrong with citing something else?
authorship ambiguity
Different software versions have different authors– how many papers would the software authors need to write?
can put open source documentation behind paywalls
Not all papers about software are OA
Software citations are made indistinguishable
from citations to for other purposes
Many journals don't take papers without scientific findings
makes locating software more difficult over time
Links break (if they exist at all)
Alias Locations
XML Tag Combinations
In total, there were 343 papers where software had been mentioned but the location could not be determined.
You need to archive software to
get an identifier for software.
Before
After
And yet it moves.
We have a complete history of nothing.
Some people get a legacy and some don't.
We need to care.
Metadata, Meaning, and Erasure in Astronomy
By Daina Bouquin
Metadata, Meaning, and Erasure in Astronomy
- 148