SOLUTIONS LOOKING FOR PROBLEMS LOOKING FOR SOLUTIONS

mike nason
@unb libraries
@pkp

introduction(s)

it is quite possible that i am not known to you! i am the open scholarship & publishing librarian at the university of new brunswick.

i currently serve as chair of the orcid-ca governing committee with crkn. i know all these lovely panelists from our time together on carl's orwg. plus many other things! i'm bad at saying no.

i also work for pkp as a member of their publishing services team, where i am the crossref/metadata liaison.

i am a [white, cis] settler from the unceded (aka, stolen) territory of the mi'kmaq-wolastoqey peoples just a short hop from the wolastoq river, a much cooler name than the settler-crowned “saint john river”, if you ask me.

narratives
Migrations
Metadata

in the beginning

i started my career as an academic librarian in 2013. at the time, repositories weren't exactly new.

setting aside the obvious activism present in the oa movement, i would categorize this era of repositories as solutions looking for problems.

for us, the utility was evident!

for researchers, maybe less so!

11 years after budapest oa initiative
unb had been running dspace
prior to tri-agency mandates
5 years out from plan s
fresh out of the mlis (go mustangs)
- not so fresh into libraries
- fresh onto the picket line
  - politics, baybee!
repositories were poppin'
imposter syndrome was also poppin'

folks were making a ton of really cool, bespoke collections and unique features

librarians were getting their hands dirty, writing new modules, learning to code, and contributing back to open source projects

the best of times

we had a new platform
we already had some materials
we had an experienced dev group
it was an institutional priority

i felt a bit like a wizard getting to talk shop and work on code and learn more about the guts of platforms and metadata.

it felt super cool to ride this cutting edge of repo work in canada while everyone spun up new, sexy websites with neat, sexy projects.

the blurst of times

institutional pressures to manage
building planes as we flew them
so (so, so, so) much customization
platform silos
schema silos
technical silos
enormous need to prove value
enormous need to feel valued

and, i don't think momentum for oa crested higher than "seems nice, but flawed".

on the other hand, it often felt a little like it was my job to change academic publishing and everyone else fit one of four categories:

an ally
indifferent
mad about change
annoyed that i hadn't done more

we were working hard to demonstrate value! this resulted in poor boundaries!

i was relatively confident that i was alone in this

Then, in 2015, the Tri-Agency OA policy for publications finally provided a problem for my solution

obviously, i don't think of oa as a problem! but, some people definitely saw the policy as one more thing. i could address a practical need!

fast forward

it wasn't too long after this that i really wanted my job not to be the repository itself, but the work of getting it populated and supported.

running a platform is a lot of work!

maintaining code is an actual commitment!

or, saas costs real money, and might only give you a manufactured illusion of flexibility!

unmaintained extensions/modules
custom code
technical debt
concessions
dependency eol
the downsides to software "stacks"
unique one-off solutions for political problems (usually metadata)

it turns out that a lot of cool modules and custom work developed by librarians weren't super likely to be maintained by them, because they are not software developers, and they have a ton of other competing responsibilities!

what i needed was a box that worked! i no longer had time (or, honestly, patience) for the box itself to be the project.

I am increasingly confident that i am not alone in this

also, i'm a little jealous that the rdm people get to just skip past all of this and agree that dataverse is good

(╯°□°）╯︵ ┻━┻

so, we have a lot of people looking at their old repos and asking, "what's next?" dspace 7? something else? a national ir?

clock's tickin'

a number of the platforms we've invested significant time and money into are at (or past) the end of the line.

drupal 7 eol in november 2022 means that islandora 7.x is in the wind
dspace 7 launch (finally) means dspace 5 and 6 platforms are now community maintained
sunk cost fallacy is a hell of a thing

maybe we should sink less costs?

borealis, frdr, openaire...

you don't have to go home but you can't stay here.

narratives
Migrations
Metadata

it feels eerily appropriate that i am drafting this deck on groundhog day

unb libraries is currently in the midst of migrating to our 3rd repository platform in 10 years. the difference between 2013 and now is that i know a lot more things and also i am tired, mad, and better at advocating for myself.

this means i have overseen two full repo migrations in the last ten years!

2012/2013 // dspace to islandora
2022/2023 // islandora to dspace

lol

we out here, migratin'

dspace?!

dspace! we investigated a few options and figured, you know... why fight it?

we moved from dspace because it was hard to customize.

we moved back so we could have a great excuse not to.

repos absolutely don't need to be sexy
repos absolutely do have to work
dspace is the danny devito of repository platforms. low centre of gravity. sturdy. hard to knock down. well known!
openaire compliant out of the box
huge community
less intense stack
we do not want customizations!

i am happy to answer any and all questions about this migration and why, specifically, we went with dspace. but, i'd like to talk a bit about what this migration has taught me and it's highly likely i am already over my allotted time

narratives
Migrations
Metadata

jeez louise have we taken some liberties in the metadata space, hey?

low standards

islandora used a schema called mods. mods is, basically, like marc21 xml. it is extensive, granular, and precise.

dspace uses a schema called "dublin core" and it is very old, frustrating, and kind of stupid.

enjoying a classic do more with less scenario for academia, hey? ha ha. lmao.

<mods:name type="personal">
	<mods:role>
		<mods:roleTerm authority="marcrelator" type="text">author</mods:roleTerm>
	</mods:role>
	<mods:namePart type="given">Crystal Lynn</mods:namePart>
	<mods:namePart type="family">Radtke</mods:namePart>
</mods:name>

<dc.contributor>Radtke, Crystal Lynn</dc.contributor>

<dc.contributor.author>Radtke, Crystal Lynn</dc.contributor.author>
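
to make the loss of detail concrete, here is a minimal sketch of flattening that mods name into a single dublin core value. it is an illustration, not the crosswalk we actually ran: the function name, the sample constant, and the rule "role term 'author' means dc.contributor.author, anything else means dc.contributor" are all assumptions.

```python
import xml.etree.ElementTree as ET

# The MODS v3 namespace; the slide snippet above leaves the declaration out.
MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}

SAMPLE_NAME = """<mods:name xmlns:mods="http://www.loc.gov/mods/v3" type="personal">
  <mods:role>
    <mods:roleTerm authority="marcrelator" type="text">author</mods:roleTerm>
  </mods:role>
  <mods:namePart type="given">Crystal Lynn</mods:namePart>
  <mods:namePart type="family">Radtke</mods:namePart>
</mods:name>"""


def mods_name_to_dc(name_xml):
    """Flatten one mods:name element into a (field, value) pair.

    Hypothetical helper: the field-selection rule is an assumption made
    for this example, not a documented crosswalk.
    """
    name = ET.fromstring(name_xml)

    # Collect the typed name parts, e.g. {"given": "...", "family": "..."}.
    parts = {
        part.get("type"): (part.text or "").strip()
        for part in name.findall("mods:namePart", MODS_NS)
    }

    role = name.findtext("mods:role/mods:roleTerm", default="", namespaces=MODS_NS)
    field = "dc.contributor.author" if role.strip().lower() == "author" else "dc.contributor"

    # "Family, Given" is the only structure that survives the trip.
    value = ", ".join(v for v in (parts.get("family"), parts.get("given")) if v)
    return field, value


print(mods_name_to_dc(SAMPLE_NAME))
```

the given/family split, the name type, and the relator authority all get thrown away on the way through. that is the "do more with less" part.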

low standards

so, i looked around at what everyone else was doing with dublin core.

i'll say this, for a "standard", people have really played fast and loose.

people have made some wildly divergent and incompatible decisions. and, lots of folks have inherited repositories or workflows full of questionable metadata.

¯\_(°⊱,°)_/¯

dc.rights

some schools record this as license information.

dc.rights

some schools record this as access rights information.

<dc.rights>Attribution-NonCommercial-NoDerivatives 4.0 International</dc.rights>
<dc.rights.uri>http://creativecommons.org/licenses/by-nc-nd/4.0/</dc.rights.uri>

<dc.rights>http://purl.org/coar/access_right/c_abf2</dc.rights>
<!-- this is the uri for COAR controlled vocab for access rights -->

<oaire:licenseCondition>http://creativecommons.org/licenses/by-nc-nd/4.0/</oaire:licenseCondition>

and they put the license in an openaire namespace

who cares?!

you, probably, when you find out that you want to push metadata to crossref, datacite, openaire, library & archives canada, or whatever else and you're neck-deep in custom crosswalks
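
here is a rough sketch of what one of those custom crosswalks ends up doing: guessing what a harvested dc.rights value is supposed to mean. the function name and the three labels are made up for this example, and the url prefixes are just the ones that appear in the snippets above.

```python
# Prefix of the COAR access rights vocabulary URIs shown above.
COAR_ACCESS_PREFIX = "http://purl.org/coar/access_right/"

# Substrings that mark a Creative Commons license or public domain URI.
CC_MARKERS = ("creativecommons.org/licenses/", "creativecommons.org/publicdomain/")


def classify_rights(value):
    """Guess whether a dc.rights value is a license, an access right, or neither.

    Hypothetical helper: returns "access", "license", or "unknown".
    """
    v = value.strip().lower()
    if v.startswith(COAR_ACCESS_PREFIX):
        return "access"
    if any(marker in v for marker in CC_MARKERS):
        return "license"
    # Free-text statements land here and need a human (or a lookup table).
    return "unknown"


for sample in (
    "Attribution-NonCommercial-NoDerivatives 4.0 International",
    "http://creativecommons.org/licenses/by-nc-nd/4.0/",
    "http://purl.org/coar/access_right/c_abf2",
):
    print(classify_rights(sample), "<-", sample)
```

note that the free-text license statement comes back as "unknown". every repository that records rights differently means another branch in somebody's crosswalk.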

dc.wrongs

dublin core adheres to this thing called the "dumb-down principle": anyone reading your metadata should be able to ignore a qualifier (custom or otherwise) and still be left with a value that is correct for the unqualified element.

there are a lot of custom qualifiers out there, and plenty of them were added without any thought for the downstream usability of that metadata.

dc.metadata.example

dc = namespace (dublin core)
metadata = element (from dc schema)
example = qualifier (can be custom)

anything that is recorded as: dc.metadata.example

must also work within:
dc.metadata
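
as a sketch of what "dumbing down" looks like to a harvester: drop everything after the element and keep the value. the function names and the dict-of-lists record shape are assumptions for this example, not any platform's api.

```python
def dumb_down(field):
    """Reduce 'namespace.element.qualifier' to 'namespace.element'."""
    return ".".join(field.split(".")[:2])


def dumb_down_record(record):
    """Merge every qualified field into its unqualified element.

    `record` maps field names to lists of values (a hypothetical shape).
    """
    simple = {}
    for field, values in record.items():
        simple.setdefault(dumb_down(field), []).extend(values)
    return simple


record = {
    "dc.contributor.author": ["Radtke, Crystal Lynn"],
    "dc.rights.uri": ["http://creativecommons.org/licenses/by-nc-nd/4.0/"],
}
print(dumb_down_record(record))
```

the test for any qualifier is whether the value still reads as true once the qualifier is gone. an author is still a contributor. a license uri is still a rights statement.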

dc.wrongs

and sometimes people just staple stuff to dublin core that doesn't exist at all!

the vast majority of this stuff is custom metadata to describe our organizational structures or student status and not the actual work. this is as understandable as it is actively bad.

dc.degree.campus
dc.year.graduated
dc.initial
dc.faculty.program

none of this describes a work! it is metadata about schools and students.
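
a quick way to spot this in your own repo, sketched under some assumptions: compare each dc field against the fifteen elements of simple dublin core. the function name is made up, and a real platform's metadata registry will hold more than these fifteen (qualified elements, other namespaces), so treat it as a smell test rather than a validator.

```python
# The fifteen elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = frozenset({
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
})


def invented_elements(fields):
    """Return the dc.* fields whose element is not one of the fifteen."""
    flagged = []
    for field in fields:
        parts = field.split(".")
        if parts[0] != "dc":
            continue  # other namespaces are out of scope for this check
        if len(parts) < 2 or parts[1] not in DC_ELEMENTS:
            flagged.append(field)
    return sorted(flagged)


fields = [
    "dc.degree.campus",
    "dc.year.graduated",
    "dc.initial",
    "dc.faculty.program",
    "dc.rights",
    "dc.contributor.author",
]
print(invented_elements(fields))
```

all four fields from the slide get flagged. dc.rights and dc.contributor.author pass, because their elements actually exist.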

why metadata shame?

we are moving into a space where the ability to distribute/disseminate our repo holdings to open infrastructure is vital

if your metadata is only legible or usable for your institution, you end up having to create workarounds and crosswalks. or, worse, this metadata may go completely unused.

you might as well not be using a standard at all! just use a custom namespace!

i am not convinced that recording irrelevant or illegible metadata is a good use of our finite time!

you can read about my path to madness here
https://www.notion.so/cdsunb/Comparing-Potential-Schema-994df4ef8330408f90d60d2c17921e6c

i probably should have just done a whole talk on metadata, but this stuff makes me feel crazy

anyway

the takeaway(s) here is that, in having to migrate, i've learned the following:

people have spent a profound amount of time mangling metadata standards in order to improve the browsability of repositories.

almost no one whose opinion you should care about browses repositories. in open scholarly infrastructure, the metadata is doing the heavy lifting.

metadata is discoverability

boring is fine. no one cares.

probably we'd all be a lot further ahead if the original pitches for repositories weren't so focused on, like, marketing-adjacent rhetoric

i think we're well past the era where we need to say "yes" just to make people happy

if you spend too much time staring at metadata schema, you will start to lose grip on reality

i'm sorry