Not another bloody talk about ‘AI’

 

Notes On Teaching a Rapidly Developing Technology

Shawn Graham

   

Department of History, Carleton U

https://scholar.social/@electricarchaeo

 

follow along at:

https://bit.ly/sg-dec3

 

 

 

image: Dani Franco via Unsplash

  • Who is 'I'?
  • On what basis do they have the right to make this call?
  • How do they know NetLogo? And Roman history? What data?
  • This ghost-in-the-machine assumes the driver's seat - and there is no appeal.
  • Can't teach how to 'use' such a thing: Must teach how to subvert.

So I wanted to build a simulation...

Łukasz Łada, Unsplash

Nuggan

Clacks

Where do the shadows come from?

What ABM & Networks Teach Us

Networks can represent the past

Networks are present in archaeological materials

Networks can be a substrate for further simulation

 

 

...Networks can problem solve

 

Neural Networks

a potted history

Dash Khatami via unsplash

Good ol' Wikipedia! https://en.wikipedia.org/wiki/Neural_network  https://en.wikipedia.org/wiki/Perceptron

https://en.wikipedia.org/wiki/Perceptron; Charles Wightman adjusting the machine

Workable ways of (re)training neural networks in a reasonable amount of time, with a reasonable amount of data*

*terms and conditions apply

LLMs are a subset of machine learning.

Machine learning - learning of patterns - is enormously useful in archaeology.

...and what used to be difficult is now something you can do in a webpage...

From Memories to Data to Datasets to Ghosts

Part One

We become <data> ghosts through decomposition

Generative AI is DH in Reverse

DH: analyzes an image to say this is a 'red figure kantharos'

GenAI says: here are the conventions that align with the idea of a kantharos rendered as pixel data

Generative AI is digital humanities run in reverse.... AI becomes a system for producing approximations of human media that align with all the data swept together to describe that media.

Sam Barber via unsplash

h/t to Steven Johnson for a piece connecting Molaison & Context Length

The cost of training

 

  • GPT-3: Up to 2048 tokens ~ 1500 words
  • Mistral 7B: Up to 8192 tokens ~6100 words
  • GPT-4o: From 60K to 128K tokens in some configurations
  • Claude 3.5: Up to 200K tokens
  • Llama 3.1: Up to 128K tokens
  • Gemini 1.5 Pro: Up to 1M tokens

https://datanorth.ai/blog/context-length

"The computational cost scales quadratically with the length of the context window, meaning a model with a context length of 4096 tokens requires [16] times more computational resources than a model with a context length of 1024 tokens."
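
A quick worked check of the quadratic rule, as a minimal sketch. It assumes cost grows with the square of the context length and nothing else (real models have other costs that scale differently), and the function name is made up for illustration.

```python
def relative_attention_cost(context_len: int, baseline_len: int) -> float:
    """Cost at context_len relative to baseline_len, ASSUMING cost
    grows with the square of the context length and nothing else."""
    return (context_len / baseline_len) ** 2


# Quadrupling the window (1024 -> 4096) multiplies the cost by 4 squared.
print(relative_attention_cost(4096, 1024))  # 16.0

# A 64-fold jump in cost takes an 8-fold longer window (512 -> 4096).
print(relative_attention_cost(4096, 512))   # 64.0
```

Either way, the point holds: longer memory gets expensive fast.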

and we're not even talking here about the cost(s) of siphoning all the content in the world to get the necessary data

...the materiality of digital archaeology...

ChatGPT & The Eliza Effect

Isi Parente Unsplash

Part Two

The TESCREAL Bundle


  • Techno-solutionism
  • Eschatological beliefs
  • "AGI will save us"
  • Religious fervor

sounds like religion to me.

Gods
Interns

Cogs

Part Three

see Drew Breunig: The 3 AI Use Cases

Two Kinds of Necromancy

Impractical Necromancy

  • Large Language Models
  • Chatbots
  • Reverse DH polarity
  • Easy but destructive
  • Gods, Interns

Practical Necromancy

  • Small, targeted models
  • Curated datasets
  • Ethical sourcing
  • Specific questions
  • Sweep behaviour space
  • Cogs and Widgets

No gods, no interns. Only cogs.

(which would mean way less money/power for techbro oligarchs/klept)

God - Intern - Cog

Three Notebooks

Subverting the new AI Gods

yes, you've got homework

Homework!

We're going to resurrect Flinders Petrie

We'll take what Petrie wrote, and fill in the gaps with an LLM.... just like those scientists at Jurassic Park filled in the gaps in dino DNA with amphibian DNA.

 

And that worked out ok.

Right? ... Right...?

(we're using GPT-2, a 'completion' model that can be fine-tuned with further training on new text. By playing with this we dispel some of the Eliza effect)
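
What 'completion' means can be shown with something far smaller than GPT-2. The sketch below is a toy stand-in, not the notebook's code and not a neural network: it records which word followed which in a scrap of text, then extends a prompt by sampling from those records. The corpus is placeholder text invented for illustration (not a quotation from Petrie), and the function names are hypothetical.

```python
import random
from collections import defaultdict


def train_bigrams(text: str) -> dict[str, list[str]]:
    """Record, for each word, every word that followed it in the text."""
    words = text.split()
    followers = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        followers[current].append(nxt)
    return dict(followers)


def complete(model: dict[str, list[str]], prompt: str,
             max_words: int = 12, seed: int = 0) -> str:
    """Extend the prompt one word at a time by sampling a recorded follower."""
    rng = random.Random(seed)
    words = prompt.split()
    if not words:  # nothing to continue from
        return prompt
    for _ in range(max_words):
        options = model.get(words[-1])
        if not options:  # no word ever followed this one: stop
            break
        words.append(rng.choice(options))
    return " ".join(words)


# Placeholder training text, invented for this example.
corpus = "the trench was dry and the pottery was plain and the trench was deep"
model = train_bigrams(corpus)

# Same prompt, different seeds: different continuations, same statistics.
for seed in range(3):
    print(complete(model, "the trench", seed=seed))
```

There is no one in there. It only ever says what its training text makes likely, which is the same trick at a vastly smaller scale.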

You'd think this would be difficult to do.  Nope.

(and now there's a small industry of academic papers on 'digital necromancy' meaning all this)

now we're going to surface some ghosts in the training data

notebook for images-as-infographics

 

what 'attractors' for 'archaeological excavation' do we see?

Explore here.

 

We're using a slight modification to Salvaggio's method for 'How To Read An AI Image'

Ghosts in Text Generation?

  1. generate text with the same prompt and same settings; the only thing that changes is the random seed used for sampling
  2. text analysis to surface the 'attractors' that pull the text one way or the other
  3. simulate conversation between models and examine the discourses
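
Step 2 can be sketched roughly as follows: count how many of the generated runs each word turns up in, and treat words that recur across most runs as candidate 'attractors'. The sample texts, the stopword list, the 0.6 threshold, and the function name are all placeholders invented for illustration. They are not outputs from any real model, and this is not necessarily how the notebook does it.

```python
import re
from collections import Counter

# A tiny, hypothetical stopword list; a real analysis would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "in", "and", "to", "is", "was", "at", "from"}


def find_attractors(texts: list[str], min_share: float = 0.6) -> list[tuple[str, float]]:
    """Return (word, share of runs containing it) for words that appear
    in at least min_share of the runs, most widespread first."""
    run_counts = Counter()
    for text in texts:
        # Use a set so each word counts once per run, however often it repeats.
        words = set(re.findall(r"[a-z']+", text.lower())) - STOPWORDS
        run_counts.update(words)
    n = len(texts)
    hits = [(w, c / n) for w, c in run_counts.items() if c / n >= min_share]
    return sorted(hits, key=lambda pair: (-pair[1], pair[0]))


# Made-up stand-ins for several generations from one prompt.
runs = [
    "The archaeologist brushed dust from the ancient tomb.",
    "An archaeologist found gold in the ancient temple.",
    "The team mapped the ancient wall at dawn.",
]
for word, share in find_attractors(runs):
    print(f"{word}: {share:.2f}")
```

Counting runs rather than raw occurrences keeps one repetitive output from dominating the result.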

Wrapping Up

same principle - different scale

https://www.youtube.com/watch?v=bxXdGBSDCHQ

DIY run-of-stream hydro electricity rig

https://en.wikipedia.org/wiki/File:Hoover_dam_from_air.jpg

Notes on Teaching AI

  1. Deal with the ghosts of data
  2. Give permission to experiment
  3. Embrace productive mistakes
  4. Resist hype

This Requires Enchantment:

  • Reflection
  • Playfulness
  • Attention to impact
  • Attention to the uncanny and its intrusions
  • Which will allow us to focus on our digital golems: archaeology has much to say about this current moment

https://archaeo.social/@mrundkvist/112447233936751700

Martin Rundkvist
mrundkvist@archaeo.social

Back when #BigData was the fashionable buzz word, I repeatedly had to explain to enthusiasts that archaeological data are not just Big, they are Confused and Patchy and Hairy.

 

I can't really see how the current generative algorithms could make me obsolete or even speed up much of the work I do. Because I'm in this really niche activity with no commercial potential that demands constant engagement with wildly non-standardised data as well as creative writing about them.

He's right! All this is slow archaeology; this tech expands limited capabilities when you know which ghosts are haunting it. But that takes a lot to figure out. Cogs. Cogs all the way.

BONUS Small Things Made With Care

 

curo, curare, curavi, curatum

An example: A Field Notes to Knowledge Graph Pipeline

  1. sketched the desired workflow
  2. sketched the desired interface
  3. small model to translate into appropriate html
  4. small model to encode the individual desired functions

Personal Image Search

https://github.com/shawngraham/personal-image-search-engine

 

LLMs are no good for specific information retrieval; the ghosts push towards the mean, the average, so you get plausible text, not correct text.

 

This same property does permit searching by 'vibe' or 'similarity'
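
A minimal sketch of what searching by 'similarity' can look like, assuming each item has already been turned into a vector of numbers (an 'embedding') by some model. The three-number vectors and file names below are made up for illustration, and this is not necessarily how the repository above works.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:  # a zero vector has no direction
        return 0.0
    return dot / (norm_a * norm_b)


def search_by_vibe(query: list[float], library: dict[str, list[float]],
                   top_k: int = 2) -> list[tuple[str, float]]:
    """Rank library items by how closely their vectors point the query's way."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in library.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


# Hypothetical embeddings; real ones have hundreds of dimensions.
library = {
    "trench_photo.jpg": [0.9, 0.1, 0.0],
    "pottery_sherd.jpg": [0.1, 0.9, 0.2],
    "site_plan.png": [0.7, 0.2, 0.1],
}
query = [1.0, 0.0, 0.0]

for name, score in search_by_vibe(query, library):
    print(f"{name}: {score:.3f}")
```

Nothing here is 'retrieved' in the sense of looking up a fact. The search only reports which things sit near each other, which is exactly why it suits vibes and not citations.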

A word of thanks to my students

 

This term, we have spent several weeks dispelling hype, opening the hood, poking at the models, interrogating the ghosts, discussing the harms, and trying to figure out what good any of these things are. I am indebted to their goodwill and good humour.

 

Thanks, gang.

And Thank You

 

You are welcome to take, use, re-use, critique, expand, tear-apart, re-build, improve, sneer at, any and all code of mine that I've shared today.