See also: paperclip optimizer

Scientific and technical work is made invisible by its own success. When a machine runs efficiently, one need focus only on its inputs and outputs, not on its internal complexity. Thus, paradoxically, the more science and technology succeed, the more opaque and obscure they become.

If we start to dissect the black box and understand that it

  • is made by people
  • substitutes their actions
  • is a permanent delegate of the work
  • shapes human action by prescribing what sorts of people can pass through it

then this is called “opening the black box”, or, for larger-scale infrastructures, “infrastructural inversion”

Jim Johnson: building and rebuilding a wall every time you want to pass through it is a waste; that's why we have doors, infrastructure that saves us a lot of this repetitive work.

Computational Reliabilism (CR)

Who is afraid of black box algorithms? On the epistemological and ethical basis of trust in medical AI by Juan Manuel Durán, Karin Rolanda Jongsma

On trust in black box algorithmic decision making systems

Black boxes are algorithms that humans cannot survey: they are epistemically opaque systems that no human or group of humans can closely examine in order to determine their inner states. Physicians have a hard time offering accounts of how such an algorithm came to its recommendation or diagnosis.


  • transparency: algorithmic procedures that make the inner workings of a black box algorithm interpretable to humans
    • transparency is an epistemic manoeuvre intended to offer reasons to believe that certain algorithmic procedures render a reliable output and that the output of the algorithm is interpretable by humans
  • opacity: the inherent impossibility for humans of surveying an algorithm, understood both as a script and as a computer process. Burrell proposes three types of opacity:
    1. Intentional corporate or state secrecy
    2. Technical illiteracy
    3. Opacity arising from the scale of machine learning algorithms

Claim: transparency will not solve opacity; therefore, more transparent algorithms are no guarantee of better explanations, predictions, or overall justification of our trust in an algorithm's results.

Computational reliabilism (CR)

  • offers epistemic justification for the belief that the algorithm is reliable and its results are trustworthy
  • main claim: researchers are justified in believing the results of AI systems because there is a reliable process that yields, most of the time, trustworthy results.
  • formal definition “the probability that the next set of results of a reliable (AI system) is trustworthy is greater than the probability that the next set of results is trustworthy given that the first set was produced by an unreliable process by mere luck”
    • in regular language: given two identical results, we should trust the one generated by a reliable system more than one generated by an unreliable process that happened to get it right by mere luck
  • reliability indicators
    1. verification and validation methods: building and measuring developer confidence in the computer system. Verification assesses accuracy by comparison with known solutions; validation assesses accuracy by comparison with experimental data
    2. robustness analysis: figure out whether results of a given model are an artefact of the model or related to the core features of the model
    3. a history of (un)successful implementations: scientific and engineering methodologies and practices related to designing, coding, and running algorithms
    4. expert knowledge: experts’ judgements, evaluations, and sanctioning
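The quoted formal definition above can be written symbolically (my notation, not Durán and Jongsma's): let M be the AI system, R(M) say that M is a reliable process, T(r_i) say that the i-th set of results is trustworthy, and r_1 be the first set of results.

```latex
\Pr\big(T(r_{n+1}) \mid R(M)\big) \;>\; \Pr\big(T(r_{n+1}) \mid \neg R(M) \wedge T(r_1)\big)
```

That is, a reliable process makes the *next* results more likely to be trustworthy than an unreliable process does, even when the unreliable process delivered a trustworthy first result by luck.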
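A minimal sketch of the first reliability indicator, using a toy model. Everything here is invented for illustration (the decay equation, the forward-Euler solver, the constants, and the "measured" value are assumptions, not anything from the paper); the point is only the distinction between checking the code against known mathematics (verification) and checking the model against the world (validation).

```python
import math

# Toy model: numerically integrate dx/dt = -k*x with forward Euler.
def simulate(k, x0, t_end, dt):
    x, t = x0, 0.0
    while t < t_end - 1e-12:
        x += dt * (-k * x)  # Euler step
        t += dt
    return x

numeric = simulate(k=0.5, x0=1.0, t_end=2.0, dt=1e-4)

# Verification: compare against the known analytic solution x(t) = x0 * exp(-k*t).
# A small error builds confidence that the solver computes the model correctly.
analytic = 1.0 * math.exp(-0.5 * 2.0)
verification_error = abs(numeric - analytic)

# Validation: compare against a (hypothetical) experimental measurement at t = 2.0.
# This error measures how well the model matches reality, not the code.
measured = 0.37  # assumed lab value, purely illustrative
validation_error = abs(numeric - measured)

print(verification_error)  # solver vs mathematics
print(validation_error)    # model vs world
```

A verified solver can still fail validation: the code may faithfully compute a model that does not describe the phenomenon.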

Responsibility gaps

  • a physician cannot be held responsible for the results of algorithms they do not understand; on the other hand, we generally do accept ex-post explanations from human actors in decision making and deem these sufficient
  • physicians typically operate other technologies and machinery whose inner workings they do not fully understand or cannot fully explain (e.g. MRI scanners)
    • Debatable: those technologies do not make decisions, they only present information. Moreover, an expert can in principle understand them; this is not the case for black box AI systems

Counterpoints raised:

  • Black box algorithms can hide normative assumptions: we often know nothing about the priors of the black box algorithm
  • Model and data drift: a computationally reliable black box algorithm can be reliable in one setting and at one time, but not everywhere and forever