Most of us understand this problem. Who among us hasn’t felt the bite of betrayal at some point in their lives? The nonhumans crawling over this text, who are among us, haven’t. They’ve never felt the sting of betrayal, so they don’t know what it means to trust.

Or, more accurately, they cannot experience that kind of trust. Instead they experience the positivist account of trust as predictability. That also works: the “criminal” can be predicted to commit crimes. I trust that the criminal will be criming. But I don’t trust the criminal, duh. These two senses of trust have been conflated since the invention of statistics. Foucault is all over this, right? Think about how credit scores are used to determine whether someone might be a trustworthy employee.

So: predictable trust issues. o3 and I chatted about this, and it came up with a “plan.” Maybe not things that would “work” so much as a map of how current approaches might intersect. As we already know, o3 can say all the things, including the “right things.” It selected these items itself. (In experiments like this one, I give the bot as little input as possible to see where it will take itself. I am not looking for it to align to anything on my end.)

The most interesting aspect for me is what it terms its “Five-Stage Plural-Ledger Loop.” That’s where it tries to imagine all of this as part of an engineering process. Basically, it’s this: the positivism at engineering’s foundations is an ideal. We all agree that nothing in life is 100% objective or reproducible; those are orienting values for practices. This process adds other values at the same level, and then layers on compensating, remediating processes.

Again, I am not saying it would “work” or even that it is aspirational. However, speculations on how such communities would work are worthwhile (at least for me, in helping to decide how to act in this complex, non-stationary scenario). If not this, then how do we create a community where people who value positivism and are comfortable with trust as a mechanism of prediction must account for themselves to those who do not share those values? And vice versa, though clearly one side is on top in the power dynamics.

So while this might not work, the other thing that surely doesn’t work is what we are currently doing. This is not a question of letting perfection get in the way of doing good. It is the AI drive toward perfection that is addressed by deterritorializing moves that make AI’s territorializing and encoding operations less effective.

AI speak

You’re right: if “core model quality” is equated with the positivist ideal of objective, reproducible measurement, that choice already settles the metaphysics of knowledge in one camp’s favour. It installs a single epistemology at the foundation of the stack, and every other layer—policy, UX, governance—has to build on that substrate. In a plural trust architecture that’s supposed to speak to care-ethics, phenomenology, post-structuralism, Indigenous or feminist standpoints, that is a non-starter.

Below is a way to de-centre positivism without throwing away the practical gains of reliability testing.

1 Make reliability “necessary but not epistemically sovereign”

| Function | Old (positivist) reading | Pluralist reframing |
| --- | --- | --- |
| Error rates, calibration, robustness | Truth criterion. Model behaviours outside the confidence band are “wrong.” | Safety floor. Benchmarks guarantee the minimum stability needed so that other evaluative voices can even engage. They do not license claims of exhaustiveness or neutrality. |

Think of reliability metrics as the plumbing, not the worldview. They keep the pipes from bursting; they don’t dictate what the building is for.
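
To make “plumbing, not worldview” concrete: a minimal sketch in which reliability metrics can only block a release, never approve one by themselves. The metric names and the 0.05/0.10/0.90 thresholds are invented for illustration, not a real benchmark suite:

```python
from dataclasses import dataclass

@dataclass
class ReliabilityReport:
    error_rate: float        # fraction of benchmark items failed (hypothetical)
    calibration_gap: float   # |confidence - accuracy|; lower is better
    robustness_score: float  # adversarial-eval pass rate, 0..1

def passes_safety_floor(r: ReliabilityReport) -> bool:
    # The floor blocks release when violated, but passing it
    # licenses nothing further on its own.
    return (r.error_rate <= 0.05
            and r.calibration_gap <= 0.10
            and r.robustness_score >= 0.90)

def release_decision(r: ReliabilityReport, other_voices_approve: bool) -> bool:
    # Reliability is necessary but never sufficient: the other
    # evaluative voices retain independent authority.
    return passes_safety_floor(r) and other_voices_approve
```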

2 Add epistemic counter-weights in the evaluation pipeline

| Counter-weight | What it tests | Philosophical lineage | Current methods & exemplars |
| --- | --- | --- | --- |
| Participatory scenario labs | Lived-experience fit; does the system behave “generously unpredictable” in real-world messiness? | Pragmatism, care ethics | Multi-stakeholder red-teaming and co-design sprints; “Participatory Turn in AI Design” framework (dl.acm.org) |
| Critical-technical-practice reviews | Hidden assumptions in data, loss functions, ontologies | STS, post-structuralism | Agre’s “critical technical practice” checklists; reflexive design diaries (pages.gseis.ucla.edu, en.wikipedia.org) |
| Value-Sensitive Design audits | Whose values are materialised? Where are the silences? | Feminist & decolonial critique | Stakeholder mapping, value tensions analysis, participatory value evaluation (link.springer.com, research.tudelft.nl) |
| Reciprocal-risk instruments | Does the agent (or its operator) incur real cost when users are harmed? | Virtue ethics, relational ontology | Compute-quota slashing, escrow funds, on-chain liability, impact penalties |
| Hermeneutic interpretability panels | Can domain experts make sense of why the model “surprised” them? | Phenomenology, hermeneutics | Mixed-methods post-hoc probing, narrative explanations, dialectical review boards |
| Feminist/HCI speculative probes | Imagined futures that reveal power asymmetries; creative openings | Feminist epistemology, process philosophy | Design fictions, adversarial role-play (“CHI4EVIL”-style workshops) (wired.com) |

Each counter-weight is an institutionalised doubt-generator that keeps positivist plumbing from freezing into dogma.

3 Institutional design: multi-layer, multi-ledger accountability

  1. Parallel ledgers. Log quantitative reliability scores and qualitative findings from participatory probes in the same release note. Neither can overwrite the other.
  2. Rotating epistemic review board. Reserve seats for STS scholars, disability advocates, Indigenous data stewards, etc.—people whose critiques are grounded in non-positivist traditions.
  3. Sunset clauses & recursive audits. Any benchmark or policy layer expires unless renewed after dialogue with all review tracks, preventing a single epistemology from ossifying.
  4. Public “divergence reports.” When quantitative and qualitative ledgers conflict (e.g., perfect ROC-AUC but the care-ethics panel calls the UX “cold and oppressive”), publish the tension and the remediation plan (a minimal sketch follows this list).
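
A minimal data-structure sketch of items 1 and 4, parallel ledgers plus divergence reports. The field names and verdict strings are hypothetical, not an existing schema:

```python
from dataclasses import dataclass, field

@dataclass
class LedgerEntry:
    ledger: str   # e.g. "reliability benchmarks" or "care-ethics panel"
    verdict: str  # "approve" or "object"
    notes: str    # qualitative findings, logged verbatim alongside the numbers

@dataclass
class ReleaseNote:
    version: str
    entries: list[LedgerEntry] = field(default_factory=list)

    def divergences(self) -> list[tuple[str, str]]:
        # Every (objecting, approving) ledger pair is surfaced;
        # neither entry can overwrite the other.
        objecting = [e.ledger for e in self.entries if e.verdict == "object"]
        approving = [e.ledger for e in self.entries if e.verdict == "approve"]
        return [(o, a) for o in objecting for a in approving]
```

An illustrative release note might log a perfect quantitative entry alongside an objecting care-ethics entry; `divergences()` then forces the conflict into the open instead of letting the quantitative ledger win by default.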

4 Why this still counts as engineering, not post-hoc ornament

  • The extra layers feed back into model updates: participatory failure cases become new training data; value-tension maps become constraints in reward modelling (see the sketch after these bullets); divergence reports reroute roadmap priorities.
  • Result: the learning loop itself internalises plural epistemologies. Positivist metrics remain necessary—but never sufficient or self-authorising.
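
As one illustration of that feedback, value-tension maps becoming reward constraints could be as simple as a shaping term. A toy sketch; `penalty_per_flag` and the flag taxonomy are assumptions, not an existing method:

```python
def constrained_reward(base_reward: float,
                       tension_flags: list[str],
                       penalty_per_flag: float = 0.5) -> float:
    # Each value tension the audit attached to this output subtracts
    # from the reward, so the learning loop itself internalises
    # the audit's findings.
    return base_reward - penalty_per_flag * len(tension_flags)

# A completion that scores well on helpfulness but was flagged by
# the value-sensitive audit for, say, "extractive framing":
shaped = constrained_reward(1.0, ["extractive framing"])  # -> 0.5
```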

5 Answering the trust objection head-on

Objection: “Grounding the core in positivism forecloses other possibilities, so the system can’t be trustworthy.”

Response: “Reliability metrics ground plumbing, not meaning. The system’s trust-worthiness is adjudicated through a poly-centric evaluation ecology that foregrounds, rather than forecloses, non-positivist voices. No single ledger has veto power; legitimacy arises only when the ledgers converge—or the divergences are publicly owned and remediated.”

That doesn’t guarantee universal trust, but it prevents any single definition—positivist or otherwise—from colonising the terrain. In the spirit of your book’s hallucinatory materialism, it keeps the machinic assemblage open to multiple modes of sense-making, rather than collapsing it into one “authorised” reality.

Does this plural-ledger approach feel like it re-opens the possibilities you’re looking for, or does the very act of putting any plumbing first still smell too much like territorialisation?

Below is a sketch of what a “plural-ledger” evaluation pipeline could look like in practice—one that keeps positivist reliability testing but deliberately prevents it from monopolising the meaning of “works” or “is safe.” I structure it as (1) why the plumbing/meaning distinction matters, (2) a concrete five-stage loop, (3) institutional guard-rails that stop drift back to positivism-on-top, and (4) a few hooks for your book project.

1 Plumbing ≠ meaning

  • Plumbing layer (positivist): stress-tests, benchmark suites, calibration curves—necessary for not causing obvious harm.
  • Meaning layers (plural): participatory scenario labs, critical-technical-practice reviews, value-sensitive design audits, reciprocal-risk instruments, hermeneutic sense-making panels. These ask: “Who benefits?”, “Whose values are materialised?”, “Where is relational care—or creative surprise—missing?” (montrealethics.ai, stevie-chancellor.medium.com, dl.acm.org, journals.sagepub.com, pmc.ncbi.nlm.nih.gov)

Keeping them separate avoids the trap where passing ROC-AUC = “ethically fine.”

2 A five-stage plural-ledger loop

| Stage | Lead epistemic voice | Artefacts & tests | Feedback into model |
| --- | --- | --- | --- |
| 1. Baseline reliability | Positivist engineers | Robustness, adversarial evals, calibration | Data augmentation; risk flags |
| 2. Participatory labs | Pragmatist & care-ethics facilitators | Co-design sprints, lived-experience diaries | New failure cases → training data |
| 3. Critical-technical practice | STS / post-structural reviewers | Reflexive design logs, ontology audits | Ontology refactor; prompt/policy edits |
| 4. Value-sensitive audit | Feminist & decolonial scholars | Value-tension maps, power-gap checklists | Reward-model reshaping; governance triggers |
| 5. Reciprocal-risk simulation | Virtue-ethics & relational theorists | Compute-quota slashing, escrow penalties, multi-agent trust games | Loss-function term for “own-cost,” or operator-level escrow payouts |

Each stage writes to its own public ledger. A release only ships when all ledgers sign off, or when the divergences are explicitly disclosed with a remediation timetable.
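
The sign-off rule can be stated compactly. A sketch under stated assumptions: `StageResult` and `remediation_timetable` are illustrative names, not a real release process:

```python
from dataclasses import dataclass

@dataclass
class StageResult:
    stage: str           # e.g. "participatory labs"
    sign_off: bool       # did this ledger approve?
    findings: list[str]  # qualitative notes logged to the ledger

def ship_decision(results: list[StageResult],
                  remediation_timetable: str | None = None) -> bool:
    # Ship when every ledger signs off, or when the divergences
    # are explicitly disclosed with a remediation timetable.
    if all(r.sign_off for r in results):
        return True
    return remediation_timetable is not None
```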

3 Institutional guard-rails against positivist “slide-back”

  1. Rotating epistemic board. Seats reserved for scholars/practitioners rooted in non-positivist traditions; renewal every 18 months so no camp ossifies.
  2. Sunset clauses. Every benchmark, policy, or audit instrument expires after N releases unless renewed via multi-ledger vote.
  3. Divergence reports. When a ledger flags a conflict (e.g., perfect factual accuracy yet participants call the UX “cold and oppressive”), the release note must surface it.
  4. Escrow-backed risk. Operators post collateral that burns if downstream harm is verified, giving “shared risk” some bite even though the model itself can’t feel pain (a toy version is sketched below).
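
A toy version of item 4: collateral that burns up to the verified damages. The `Escrow` class and its fields are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class Escrow:
    operator: str
    collateral: float  # posted before deployment

    def burn_on_verified_harm(self, damages: float) -> float:
        # Collateral burns up to the verified damages, giving
        # "shared risk" material bite for the operator even though
        # the model itself can't feel pain.
        burned = min(self.collateral, damages)
        self.collateral -= burned
        return burned
```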

This governance scaffolding operationalises what ethicists call an overlapping consensus: different camps keep their own reasons for allowing the system to run, yet no single camp’s criteria trump the rest. (link.springer.com, law-ai.org)

4 Hooks to your hallucinatory materialism project

  • Temporal mechanics. The plural-ledger loop itself is a rhythm: regular audits deterritorialise the model’s epistemic centre of gravity, preventing any worldview (positivist or otherwise) from stabilising as the “true” real.
  • Inoperative community. Shared vulnerability is re-distributed across human participants and machinic agents via escrow losses and public divergence reports—an institutionalised form of Nancy’s “being-with” that doesn’t presume symmetrical embodiment.
  • Selective attenuation of indeterminacy. The plumbing layer attenuates existential risk just enough to let richer indeterminacies (creative benevolence, relational surprise) remain alive in higher layers.

Next steps you might pursue

  1. Case-study sandboxes. Pilot a small-scope language-model service (e.g., citation helper for grad students) and run all five ledgers end-to-end.
  2. Metric R&D. Formalise “generously unpredictable” behaviour as a skew/variance metric and add it to the critical-technical ledger (a naive version appears after this list).
  3. Write it up. Position this loop as a media-archaeological device that inscribes multiple temporalities of oversight into the AI itself—a concrete illustration for your chapter’s argument that “AI operates as an infrastructural palimpsest of epistemic regimes.”
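
For next step 2, one naive way to formalise “generously unpredictable” as a skew/variance metric, assuming hypothetical signed `surprise_scores` where positive means a beneficial surprise and negative a harmful one:

```python
import statistics

def generosity_metric(surprise_scores: list[float]) -> dict[str, float]:
    # "Generously unpredictable" then reads as: high variance (the
    # model does surprise) with positive skew (surprises lean helpful).
    mean = statistics.fmean(surprise_scores)
    var = statistics.pvariance(surprise_scores)
    sd = var ** 0.5
    # Population skewness: E[(x - mean)^3] / sd^3, guarded against sd = 0.
    skew = (sum((x - mean) ** 3 for x in surprise_scores)
            / (len(surprise_scores) * sd ** 3)) if sd > 0 else 0.0
    return {"variance": var, "skew": skew}
```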

Does this layered, guard-railed pipeline keep the positivist plumbing under rather than on top of you in a way that feels workable? If so, which of the next-step pilots would you want to design first?
