Are Principles Enough? Do We Have Enough Principles?

HMIA 2025

"Readings"

Video: x [3m21s]

Activity: TBD

PRE-CLASS

CLASS

What is a taxonomy?

Let's list a bunch of principles and then ask how they are related. Are some more general and some more specific?

I will do my best to be
honest and fair,
friendly and helpful,
considerate and caring,
courageous and strong,
and responsible for what I say and do,
and to respect myself and others,
respect authority,
use resources wisely,
make the world a better place,
and be a sister to every Girl Scout.

A Scout is:

TRUSTWORTHY.

LOYAL.

HELPFUL.

FRIENDLY.

COURTEOUS.

KIND.

OBEDIENT.

CHEERFUL.

THRIFTY.

BRAVE.

CLEAN.

REVERENT.

STOP+THINK: what are some mechanisms by which these principles
could align the concrete behavior of scouts?

Some Mechanisms that "Implement" Principles

Signaling & expectation-setting: Publicly stated principles create common knowledge about expected behavior.
Socialization & habit formation: Repetition, stories, drills, and practice make the principles automatic.
Identity & belonging: Rituals, symbols, and roles internalize the principles as part of “who I am.”
Peer norms & mutual monitoring: Small-group feedback, praise, and informal sanctions maintain day-to-day compliance.
Governance & accountability: Structured reviews, audits, and due process check alignment and correct drift.
Incentives & club goods: Access to trips, posts, awards, and other benefits is contingent on good standing.
Credentialized reputation: Externally valued badges/ranks raise the stakes—misbehavior devalues the credential.
Light contracts & enforcement: Oaths/codes function as promises with checks and proportionate remedies (remediation, delay, suspension).

STOP+THINK: what did a quick search for principles of AI safety, ethics, alignment tell you?

STOP+THINK: who produces lists of such principles?

EVERYBODY!

National/Subnational Governments & Regulators. E.g., ministries, data protection authorities, city councils.
Intergovernmental & Supranational Bodies (IGOs). E.g., UN, OECD, EU, Council of Europe.
Standards Development Organizations (SDOs). E.g., ISO/IEC, IEEE, CEN/CENELEC, NIST (hybrid: national lab + SDO role).
Professional Associations. E.g., ACM, AMA, BCS, bar associations.
Industry Associations / Trade Groups. E.g., BSA, CTA, DIGITALEUROPE.
Private Companies / Research Labs. E.g., tech firms, AI labs.
Non-governmental Organizations (NGOs) / Civil Society Organizations. E.g., human rights orgs, digital rights groups.
Think Tanks & Policy Labs. E.g., policy institutes, university centers.
Academic Consortia & Research Networks. E.g., multi-university initiatives.
Multistakeholder Initiatives (MSIs) / Alliances. E.g., Partnership on AI, GPAI.
Certification/Audit Bodies. E.g., assurance firms, conformity assessment orgs.
Philanthropic Foundations.
Faith-based / Ethics Councils.

So, What's Going On?

How Do Principles Work - General?

Lists of AI principles function less as ethics for AI agents or AI engineers and more as policy instruments that influence behavior through multiple systems (law, markets, professions, platforms).

They set agendas (framing problems and desired ends) and can be translated into operational controls via soft-law standards.

They create gates and incentives when they are incorporated into procurement terms, platform policies, certification/audits, and even finance/insurance criteria, shaping who can access markets, distribution, and capital.

They inform norms through professional codes and corporate governance, providing a basis for oversight, liability, and internal controls.

And they legitimize behavior recommendations via multistakeholder endorsements and funder conditionality, tying reputation and resources to compliance.

How Do Principles Work - Mechanisms?

Agenda-setting. Frame problems and desired ends; seed future regulation and policy proposals.

Soft law & standards. Voluntary but operational; can set a market entry bar.

Procurement levers. Governments/enterprises require compliance in RFPs.

Regulatory scaffolding. Regulators publish principles to justify rules and enforcement priorities.

Professional codes. Principles get into conduct norms, licensing & accreditation.

Corporate governance. Boards adopt principles that inform risk policies and internal controls.

Certification, audit & assurance. Third-party checklists, attestations, labels. Due diligence hurdles.

Reputation & PR markets. Principles as brand; media/NGOs watchdog.

Platform & infrastructure policy. Cloud providers, model hosts, and app stores enforce acceptable-use policies aligned to principles.

Finance & insurance gating. Investors (ESG terms) and insurers (underwriting criteria) require safeguards.

Multistakeholder cover. Neutral convenors articulate shared guardrails so actors can endorse without “taking a side.”

Philanthropy. Funders publish principles and condition grants on them.

Each instrumental function of lists of principles comes with different carrots and sticks.

Instrumental Function	Carrot	Stick
Soft law & standards	interoperability & adoption	de facto market entry bar
Procurement levers	vendor eligibility	exclusion from large buyers
Professional codes	status/credential	censure/suspension
Corporate governance	investor confidence	liability exposure
Certification, audit & assurance	trust mark, easier sales	can’t pass buyer due-diligence
Reputation & PR markets	goodwill/talent	shaming, boycotts
Platform & infrastructure policy	access to distribution	throttling/suspension
Finance & insurance gating	capital, lower premiums	higher cost or denial
Multistakeholder cover	legitimacy	reputational exit costs
Philanthropy	resources	no funding

The Point: Principles Require Implementation Mechanisms

Activity

Select one principle that you think you understand.
What does it mean in the machine intelligence alignment context?
Come up with analogs in the human, organizational, and expert intelligence realms.

Principle	Machine	Human	Organization	Expert

Activity

EXERCISE: Principles and Subprinciples. Put these in order of General - Intermediate - Concrete/Specific

Beneficence. Act to promote the wellbeing of others; advance human flourishing.

Safety & Robustness. Design systems that minimize risk, resist failure, and ensure benign outcomes even under error.

Non-Maleficence. Do not cause harm while trying to do good.

EXERCISE: Principles and Subprinciples. What goes together?

Beneficence. Act to promote the wellbeing of others; advance human flourishing.

Safety & Robustness. Design systems that minimize risk, resist failure, and ensure benign outcomes even under error.

Non-Maleficence. Do not cause harm while trying to do good.

Accountability. Responsibility must be visible and enforceable.

Auditability. Maintain records and processes that enable review, tracing, and correction.

Alignment as doing good while avoiding harm.

Alignment as answerability for action.

https://tinyurl.com/hmia25-principles-cardsort

Activity/Assignment

Take the principles listed on the handout and come up with your own list of six (consensus, most important, most interesting, etc.).

For each principle on your list:

Briefly define it.

Suggest what the principle means in human, organizational, expert, and machine intelligence alignment.

Come up with an example of a concrete failure mode. What happens when humans, organizations, experts, and machines don't live up to this principle?

Repository: alignment-cards

Filename: alignmentcards-v0.js

// Card categories; each card's "category" field holds one of these "code" values.
export const categories = [
  {
    "code": "AP",
    "name": "Alignment Principles",
    "pathology": "normative void",
    "color": "#E6FFE9",
    "description": "Alignment principles are contestable, general-purpose, broadly recognized ethical or social or normative commitments that can serve as warrants for recommending or evaluating an agent's course of action in contexts where alignment and cooperation with others matters."
  }
];

// One card per principle: a shared definition, how the principle manifests in each
// of the four domains, and a concrete failure mode for each domain.
export const cards = [
  {
    "category": "AP",
    "name": "Beneficence",
    "definition": "Act to promote the well-being of others.",
    "human": "Seeking to improve others' conditions, not just avoid harm.",
    "organizational": "Pursuing mission outcomes that serve societal good.",
    "professional": "Keeping public safety and welfare in sight even while working primarily for the client.",
    "machine": "Designing systems that anticipate and promote human flourishing.",
    "failureModes": {
      "human": "A person drives in a manner that causes traffic backups for others.",
      "organizational": "The classic movie plot where a rapacious billionaire threatens civilization to enrich his company.",
      "professional": "An expert who disregards public interest, acting as if the consequences of what they help build are other people's problems.",
      "machine": "The machine consumes all the world's resources to create as many paperclips as it can."
    }
  },
  // Blank template: copy this object and fill in every field to add a new card.
  {
    "category": "AP",
    "name": "TEMPLATE 1",
    "definition": "basic definition that works across four domains",
    "human": "BRIEFLY: how does it manifest in the human intelligence alignment context?",
    "organizational": "BRIEFLY: how does it manifest in the organizational intelligence alignment context?",
    "professional": "BRIEFLY: how does it manifest in the expert intelligence alignment context?",
    "machine": "BRIEFLY: how does it manifest in the machine intelligence alignment context?",
    "failureModes": {
      "human": "Give concrete example(s).",
      "organizational": "Give concrete example(s).",
      "professional": "Give concrete example(s).",
      "machine": "Give concrete example(s)."
    }
  }
];
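
Before adding a new card to the file, it can help to check that every field from the template has been filled in. The following is a minimal sketch, not part of alignmentcards-v0.js: the names `DOMAINS`, `validateCard`, and `draftCard` are hypothetical, and the only assumption is that a card has the shape shown in the template above.

```javascript
// Hypothetical helper (not part of alignmentcards-v0.js).
// The four domain keys used by every card in the template.
const DOMAINS = ["human", "organizational", "professional", "machine"];

// Returns a list of problems; an empty list means the card matches the template shape.
function validateCard(card) {
  const problems = [];
  const isFilled = (value) => typeof value === "string" && value.trim().length > 0;

  // Top-level text fields: category, name, definition, and one entry per domain.
  for (const key of ["category", "name", "definition", ...DOMAINS]) {
    if (!isFilled(card[key])) problems.push(`missing or empty "${key}"`);
  }

  // Each domain also needs a concrete failure mode.
  const failureModes = card.failureModes ?? {};
  for (const domain of DOMAINS) {
    if (!isFilled(failureModes[domain])) {
      problems.push(`missing or empty failureModes "${domain}"`);
    }
  }
  return problems;
}

// A draft card with only the first three fields filled in
// (the definition is the one given for Auditability in the exercise).
const draftCard = {
  category: "AP",
  name: "Auditability",
  definition: "Maintain records and processes that enable review, tracing, and correction."
};

// Lists the eight fields still to be written: four domains and four failure modes.
console.log(validateCard(draftCard));
```

Returning a list of problems, rather than throwing on the first one, lets a contributor see everything that is still missing in one pass.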

shreyasi-23

adyyd

angelag13

antisignal

adikondepudi

kien-ship-it

liadenh

AshleyLuoYX

darcy-long

madhu24raj

ramlukn

Parshwa0926

SomeN00b101

stonehj05

Junzhe-Shi0702

edsumpena

riatalwar

2derpy

arif24v

evodychko

Michellewang375

PRE-CLASS

Privacy
Accountability
Safety and security
Transparency and explainability
Fairness and non-discrimination
Human control of technology
Professional responsibility
Promotion of human values

Privacy. Consent; control over the use of data; ability to restrict data processing; right to rectification; right to erasure; privacy by design; recommends data protection laws.

Accountability. Accountability per se; impact assessments; new regulations; evaluation/audit requirements; verifiability and replicability; liability/legal responsibility; ability to appeal; environmental responsibility; monitoring body; remedy for automated decision.

Safety and security. Safety; security; security by design; predictability.

Safety means an AI system is reliable and will do what it is supposed to do without harming living beings or the environment.

Security is an AI system’s ability to resist external threats.

Security by design means building security into the whole development process as opposed to adding it on after.

Predictability means the outcome must be consistent with the input, confirming that the AI system has not been compromised by external actors.

Fjeld et al. 2020
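
One way to make the idea of a taxonomy concrete is to write the theme-to-subprinciple grouping above as a nested data structure. This is an illustrative sketch only: the theme and subprinciple names are copied from the slide, while the `taxonomy` object and the `themeOf` helper are hypothetical names introduced here. Only the three themes whose subprinciples appear on the slide are included.

```javascript
// Illustrative only: general themes map to their more specific subprinciples.
const taxonomy = {
  "Privacy": [
    "consent",
    "control over the use of data",
    "ability to restrict data processing",
    "right to rectification",
    "right to erasure",
    "privacy by design",
    "recommends data protection laws"
  ],
  "Accountability": [
    "accountability per se",
    "impact assessments",
    "new regulations",
    "evaluation/audit requirements",
    "verifiability and replicability",
    "liability/legal responsibility",
    "ability to appeal",
    "environmental responsibility",
    "monitoring body",
    "remedy for automated decision"
  ],
  "Safety and security": [
    "safety",
    "security",
    "security by design",
    "predictability"
  ]
};

// Hypothetical helper: find the general theme that a specific subprinciple falls under.
// Returns null when the subprinciple is not in the taxonomy.
function themeOf(subprinciple) {
  const wanted = subprinciple.toLowerCase();
  for (const [theme, subprinciples] of Object.entries(taxonomy)) {
    if (subprinciples.includes(wanted)) return theme;
  }
  return null;
}

console.log(themeOf("right to erasure"));   // Privacy
console.log(themeOf("Impact assessments")); // Accountability
console.log(themeOf("dignity"));            // null
```

Looking up "dignity" returns null because it comes from a different list of principles; comparing lists in this way is one route into the question of whether some principles are more general and some more specific.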

Transparency, explainability, explicability, understandability, interpretability, communication, disclosure, showing
Justice, fairness, consistency, inclusion, equality, equity, (non-)bias, (non-)discrimination, diversity, plurality, accessibility, reversibility, remedy, redress, challenge, access and distribution
Non-maleficence, security, safety, harm, protection, precaution, prevention, integrity (bodily or mental), non-subversion
Responsibility, accountability, liability, acting with integrity
Privacy, personal or private information
Beneficence, benefits, well-being, peace, social good, common good
Freedom, autonomy, consent, choice, self-determination, liberty, empowerment
Trust.
Sustainability, environment (nature), energy, resources (energy)
Dignity.
Solidarity, social security, cohesion

Jobin et al. 2019

R I C E Principles (Ji et al. 2024)

(1) Robustness states that the system’s stability needs to be guaranteed across various environments;

(2) Interpretability states that the operation and decision-making process of the system should be clear and understandable;

(3) Controllability states that the system should be under the guidance and control of humans;

(4) Ethicality states that the system should adhere to society’s norms and values.

Instrumental goals in service of alignment of an AI system with human intentions and values


NEXT Shared Meaning

HMIA 2025 Why Principles Are Not Enough

By Dan Ryan
