<style type="text/css">
table.tableLayout{
margin: auto;
border: 1px solid;
border-collapse: collapse;
border-spacing: 1px
}
table.tableLayout tr{
border: 1px solid;
border-collapse: collapse;
padding: 10px;
}
table.tableLayout th{
border: 1px solid;
border-collapse: collapse;
padding: 5px;
}
table.tableLayout td{
border: 1px solid;
padding: 10px;
}
</style>
<h1 id="tail-events">4.7 Tail Events and Black Swans</h1>
<p>In the first few sections of this chapter, we discussed failure modes
and hazards, equations for understanding the risks they pose, and
principles for designing safer systems. We also looked at methods of
analyzing systems to model accidents and identify hazards, and explored
how different styles of analysis can be more helpful for complex
systems.<p>
The classic risk equation from the start of the chapter tells us that
the level of risk of a specific adverse event depends on both the
probability it will happen and its severity. Events in a particular
class, called <em>tail events</em>, have a very low probability of
occurrence but a very high impact when they do occur. Tail events pose some
unique challenges for assessing and reducing risk, but any competent
form of risk management must attempt to address them. We will now
explore these events and their implications in more detail.</p>
<h3 id="introduction-to-tail-events">4.7.1 Introduction to Tail Events</h3>
<p>Tail events are events that happen rarely, but have a considerable
impact when they do. Consider some examples of historical tail
events.<p>
<em>The 2007-2008 financial crisis</em>: Fluctuations happen continually
in financial markets, but crises of this scale are rare and have a large
impact, with knock-on effects for banks and the general
population.<p>
<em>The COVID-19 pandemic</em>: There are many outbreaks of infectious
diseases every year, but COVID-19 spread much more widely and killed
many more people than most. It is rare for an outbreak to become a
pandemic, but those that do will have a much larger impact than the
rest.<p>
<em>The Internet</em>: There are many technologies being developed all
the time, but very few become so widely used that they transform society
as much as the Internet has. This example illustrates that some tail
events happen more gradually than others; the development and global
adoption of the internet unfolded over decades, rather than happening as
suddenly as the financial crisis or the pandemic. However, “sudden” is a
relative term. If we look at history on the scale of centuries, then the
transition into the Internet age can also appear to have happened
suddenly.<p>
<em>ChatGPT</em>: After being released in November 2022, ChatGPT gained
100 million users in just two months <span class="citation"
data-cites="hu2023chatgpt">[1]</span>. There are many consumer
applications on the internet, but ChatGPT’s user base grew faster than
those of any others launched before it. Out of many deep learning
models, ChatGPT was the first to go viral in this way. Its release also
represented a watershed moment in the progress of generative AI, placing
the issue much more firmly in the public consciousness.</p>
<p><strong>We need to consider the possibility of harmful tail events in
risk management.</strong> The last two examples—the Internet and
ChatGPT—illustrate that the impacts of tail events are not always
strictly negative; they can also be positive or mixed. However, <em>tail
risks</em> are what we need to pay attention to when trying to engineer
safer systems.<p>
Since tail events are rare, it can be tempting to think that we do not
need to worry about them. Indeed, some tail events have not yet happened
once in human history, such as a meteorite strike large enough to cause
global devastation, or an intense solar storm knocking out the power
grid. Nonetheless, tail events have such a high impact that it would be
unwise to ignore the possibility that they could happen. As noted by the
political scientist Scott Sagan: “Things that have never happened before
happen all the time.” <span class="citation"
data-cites="sagan1993limit">[2]</span></p>
<p><strong>AI-related tail events could have a severe impact.</strong>
As AIs are increasingly deployed within society, some tail risks we
should consider include the possibility that an AI could be used to
develop a bioweapon, or that an AI might hack a bank and wipe its
financial records. Even if these eventualities have a low
probability of occurring, it would only take one such event to cause
widespread devastation; if one such event happened, it could largely
define the overall outcome of AI deployment. For this reason, competent
risk management must involve serious efforts to prevent tail events,
however rare we think they might be.<p>
</p>
<h2 id="tail-events-can-greatly-affect-the-average-risk">4.7.2 Tail Events Can
Greatly Affect the Average Risk</h2>
<div>
<figure>
<img
src="https://raw.githubusercontent.com/WilliamHodgkins/AISES/main/images/median_and_tail_v2.png"
style="width:70%" class="tb-img-full"/>
<p class="tb-caption">Figure 4.10: The occurrence of a tail event can dramatically
shift the mean but not the median of the event type’s impact.</p>
</figure>
</div>
<p><strong>A tail event often changes the mean but not the
median.</strong> Figure 4.10 can help us
visualize how tail events affect the wider risk landscape. The graphs
show data points representing individual events, with their placement
along the x-axis indicating their individual impact.<p>
In the first graph, we have numerous data points representing frequent,
low-impact events: these are all distributed between 0 and 100, and
mostly between 0 and 10. The mean impact and median impact of this
dataset have similar values, marked on the x-axis.<p>
In the second graph we have the same collection of events, but with the
addition of a single data point of much higher impact—a tail event with
an impact of 10,000. As indicated in the graph, the median impact of the
dataset is approximately the same as before, but the mean changes
substantially and is no longer representative of the general population
of events.</p>
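<p>To make this concrete, the following minimal sketch (in Python, with illustrative
numbers rather than the exact data behind Figure 4.10) shows how adding a single
high-impact event moves the mean far more than the median.</p>
<pre><code class="language-python">import statistics

# Frequent, low-impact events (illustrative values, mostly between 0 and 10)
events = [1, 2, 2, 3, 4, 5, 6, 8, 9, 60]

print(statistics.mean(events))    # 10.0
print(statistics.median(events))  # 4.5

# Add a single tail event with an impact of 10,000
events_with_tail = events + [10_000]

print(statistics.mean(events_with_tail))    # ~918.2: the mean jumps by roughly two orders of magnitude
print(statistics.median(events_with_tail))  # 5: the median barely moves
</code></pre>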
<p><strong>We can also think about tail events in terms of the classic
risk equation.</strong> Tail events have a low probability, but because
they are so severe, we nonetheless evaluate the risk they pose as being
large: <span
class="math display">Risk(hazard) = <em>P</em>(hazard) × severity(hazard).</span>
Depending on the exact values of probability and severity, we may find
that tail risks are just as large as—or even larger than—the risks
posed by much smaller events that happen all the time. In other words,
although they are rare, we cannot afford to ignore the possibility that
they might happen.</p>
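<p>As a rough illustration of this point, the sketch below plugs hypothetical
probabilities and severities into the risk equation; the hazard values are invented
for illustration only.</p>
<pre><code class="language-python">def risk(probability: float, severity: float) -> float:
    """Risk of a hazard, following Risk(hazard) = P(hazard) x severity(hazard)."""
    return probability * severity

# Hypothetical hazards: (probability per year, severity in arbitrary damage units)
frequent_minor = risk(probability=0.9, severity=10)     # 9.0
rare_severe = risk(probability=0.001, severity=50_000)  # 50.0

# Despite its low probability, the rare hazard contributes more risk.
print(frequent_minor, rare_severe)
</code></pre>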
<p><strong>It is difficult to plan for tail events because they are so
rare.</strong> Since we can hardly predict when tail events will happen,
or even if they will happen at all, it is much more challenging to plan
for them than it is for frequent, everyday events that we know we can
expect to encounter. It is often the case that we do not know exactly
what form they will take either.<p>
For these reasons, we cannot plan the specific details of our response
to tail events in advance. Instead, we must plan to plan: we can decide in advance
how relevant actors should convene to coordinate the most appropriate next steps,
whatever the precise details of the event turn out to be. Often, we first need to
determine whether a given phenomenon is even subject to tail events, and for that we
need to consider its frequency distribution. We turn to this concept next.</p>
<h2 id="frequency-distributions">4.7.3 Tail Events Can Be Identified From Frequency Distributions</h2>
<p><strong>Frequency distributions tell us how common instances of
different magnitude are.</strong> To understand the origins of tail
events, we need to understand frequency distributions. These
distributions tell us about the proportion of items in a dataset that
have each possible value of a given variable. Suppose we want to study
some quantity, such as the ages of buildings. We might plot a graph
showing how many buildings there are in the world of each age, and it
might look something like the generic graph shown in Figure 4.11.<p>
The x-axis would represent building age, while the y-axis would indicate
the number of buildings of each age—the frequency of a particular age
appearing in the dataset. If our graph looked like Figure 4.11, it would tell us that there are
many buildings that are relatively new, perhaps only a few decades or a
hundred years old, fewer buildings that are several hundred or a
thousand years old, and very few buildings, such as the Pyramids at
Giza, that are several thousand years old.<p>
</p>
<figure id="fig:exponential">
<embed src="https://raw.githubusercontent.com/WilliamHodgkins/AISES/main/images/head_and_tail_v2.png" class="tb-img-full" style="width: 70%"/>
<p class="tb-caption">Figure 4.11: Many distributions have a head (an area where most of the probability is concentrated)
and one or two tails (extreme regions of the distribution).</p>
<!--<figcaption>Head and tail of a distribution</figcaption>-->
</figure>
<p><strong>Many real-world frequency distributions have long
tails.</strong> We can plot graphs like this for countless variables,
from the size of different vertebrate species to the number of countries
different people have visited. Each variable will have its own unique
distribution, but many have the general property that there are lots of
occurrences of a low value and relatively few occurrences of a high
value. There are many vertebrate species with a mass of tens of
kilograms, and very few with a mass in the thousands of kilograms; there
are many people who have visited somewhere between 1-10 countries, and
few people who have visited more than 50.
<p>We can determine whether we are likely to observe tail events of a particular type by examining whether its frequency distribution has thin tails or long tails.
In thin-tailed distributions, tail events do not occur. Examples of thin-tailed distributions include human characteristics such as height, weight, and intelligence.
No one is over 100 meters tall, weighs over 10,000 kilograms, or has an IQ of 10,000. By contrast, in long-tailed distributions, tail events are possible. Examples of
long-tailed distributions include book sales, earthquake magnitude, and word frequency. While most books sell only a handful of copies, most earthquakes are relatively
harmless, and most words are rare and infrequently used, some books sell millions or even billions of copies, some earthquakes flatten cities, and some words (such as
“the” or “I”) are used extremely frequently. Of course, not all distributions fit neatly into a dichotomy of thin-tailed or long-tailed; some fall somewhere in between.</p>
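<p>One rough, empirical way to see the difference is to draw samples from a
thin-tailed distribution (a normal distribution, in this sketch) and a long-tailed
one (a Pareto distribution) and compare their largest observations with their typical
values. The distributions and parameters below are illustrative assumptions, not data
from the text.</p>
<pre><code class="language-python">import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Thin-tailed: human heights in centimeters, roughly normal
heights = rng.normal(loc=170, scale=10, size=n)

# Long-tailed: Pareto-distributed "impacts" (tail index 1.1, arbitrary scale)
impacts = (rng.pareto(1.1, size=n) + 1) * 10

for name, sample in [("heights", heights), ("impacts", impacts)]:
    print(name, "median:", round(float(np.median(sample)), 1),
          "max:", round(float(sample.max()), 1))
# The largest height is only modestly above the median height, whereas the
# largest impact typically exceeds the median impact by orders of magnitude.
</code></pre>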
<h2 id="a-caricature-of-long-tails-and-thin-tails">4.7.4 A Caricature of Long
Tails and Thin Tails</h2>
<p>To illustrate the difference between long-tailed and thin-tailed
distributions, we will now run through some comparisons between the two
categories. Note that, with these statements, we are describing
simplified caricatures of the two scenarios, for pedagogical
purposes.<p>
<strong>Under thin tails, the top few receive a roughly proportionate
share of the total.</strong> If we were to measure the heights of a
group of people, the total height of the tallest 10% would not be much
more than 10% of the total height of the whole group.<p>
<strong>Under long tails, the top few receive a disproportionately large share
of the total.</strong> For example, in the music industry, the revenue earned by
the most successful 1% of artists represents around 77% of the total
revenue earned by all artists.</p>
<p><strong>Under thin tails, the total is determined by the whole
group.</strong> The total height of the tallest 10% of people is not a
very good approximation of the total height of the whole group. Most
members need to be included to get a good measure of the total. This is
called “tyranny of the collective.”</p>
<p><strong>Under long tails, the total is determined by a few extreme
occurrences.</strong> As discussed above, the most successful 1% of
artists earn 77% of the total revenue earned by all artists. 77% is a
fairly good approximation of the total. In fact, it is a better
approximation than the revenue earned by the remaining 99% of artists
would be. This is called “tyranny of the accidental.”</p>
<p><strong>Under thin tails, the typical member of a group has an
average value.</strong> Almost no members are going to be much smaller
or much larger than the mean.</p>
<p><strong>Under long tails, the typical member is either a giant or a
dwarf.</strong> Members can generally be classified as being either
extreme and high-impact or relatively insignificant.<p>
Note that, under many real-world long-tailed distributions, there may be
occurrences that seem to fall between these two categories. There may be
no clear boundary dividing occurrences that count as insignificant from
those that count as extreme. However, with these statements, we are
describing a caricature of long tails, for pedagogical purposes.</p>
<p><strong>Under thin tails, the impact of an event is not
scalable.</strong> A single event cannot escalate to become much bigger
than the average.</p>
<p><strong>Under long tails, the impact of an event is
scalable.</strong> A single event can escalate to become much bigger
than many others put together.</p>
<p><strong>Under thin tails, individual data points vary within a small
range that is close to the mean.</strong> Even the data point that is
furthest from the mean does not change the mean of the whole group by
much.</p>
<p><strong>Under long tails, individual data points can vary across many
orders of magnitude.</strong> A single extreme data point can completely
change the mean of the sample.</p>
<p><strong>Under thin tails, we can predict roughly what value a single
instance will take.</strong> We can be confident that our prediction
will not be far off, since instances cannot stray too far from the
mean.</p>
<p><strong>Under long tails, it is much harder to predict even the rough
value that a single instance will take.</strong> Since data points can
vary much more widely, our best guesses can be much further off.</p>
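<p>To ground this caricature slightly, the sketch below simulates a thin-tailed sample
(heights from a normal distribution) and a long-tailed one (drawn from a Pareto
distribution), then compares the share of the total held by the top few and how much a
single extreme observation moves the mean. The distributions and parameters are
illustrative assumptions, not data from the text.</p>
<pre><code class="language-python">import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000

def top_share(sample: np.ndarray, fraction: float) -> float:
    """Fraction of the total contributed by the largest `fraction` of the sample."""
    k = int(len(sample) * fraction)
    return float(np.sort(sample)[-k:].sum() / sample.sum())

heights = rng.normal(170, 10, n)           # thin-tailed (illustrative)
incomes = (rng.pareto(1.1, n) + 1) * 100   # long-tailed (illustrative)

print(top_share(heights, 0.10))  # ~0.11: the tallest 10% account for about 11% of total height
print(top_share(incomes, 0.01))  # typically above 0.5: the top 1% account for most of the total

# Removing the single largest observation shifts the long-tailed mean far more,
# in relative terms, than it shifts the thin-tailed mean.
print(heights.mean(), np.delete(heights, heights.argmax()).mean())
print(incomes.mean(), np.delete(incomes, incomes.argmax()).mean())
</code></pre>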
<p>Having laid the foundations for understanding tail events in general, we will now consider an important subset of tail events: black swans.</p>
<h2 id="sec:black-swans">4.7.5 Introduction to Black Swans</h2>
<p>So far, we have discussed tail events in general. We will now look at
an important subset of tail events, called black swans. In addition to
being rare and high-impact, black swans are also unanticipated, seeming
to happen out of the blue. The term “black swan” originates from a
historical event that provides a useful analogy.<p>
It was long believed in Europe that all swans were white because all
swan sightings known to Europeans were of white swans. For this reason,
the term “black swan” came to denote something nonexistent, or even
impossible, much as today we say “pigs might fly.” The use of this
metaphor is documented as early as Roman times. However, in 1697, a
group of Dutch explorers encountered a black species of swan in
Australia. This single, unexpected discovery completely overturned the
long-held axiom that all swans were white.<p>
This story offers an analogy for how we can have a theory or an
assumption that seems correct for a long time, and then a single,
surprising observation can completely upend that model. Such an
observation can be classed as a tail event because it is rare and
high-impact. Additionally, the fact that the observation was unforeseen
shows us that our understanding is incorrect or incomplete.<p>
From here on we will use the following working definition of black
swans: A black swan is a tail event that was largely unpredictable to
most people before it happened. Note that not all tail events are black
swans; high-magnitude earthquakes, for example, are tail events, but we
know they are likely to happen in certain regions at some point—they
are on our radar.<p>
</p>
<h2 id="known-unknowns-and-unknown-unknowns">4.7.6 Known Unknowns and Unknown
Unknowns</h2>
<p><strong>Black swans are “unknown unknown” tail events <span
class="citation" data-cites="taleb2007blackswan">[5]</span>.</strong> We
can sort events into four categories, as shown in the table below.<p>
</p>
<br>
<table class="tableLayout">
<thead>
<tr class="header">
<td style="text-align: left;"><strong>Known knowns</strong>: things we
are aware of and understand.</td>
<td style="text-align: left;"><strong>Unknown knowns</strong>: things
that we do not realize we know (such as tacit knowledge).</td>
</tr>
</thead>
<tbody>
<tr class="odd">
<td style="text-align: left;"><strong>Known unknowns</strong>: things we
are aware of but which we don’t fully understand.</td>
<td style="text-align: left;"><strong>Unknown unknowns</strong>: things
that we do not understand, and which we are not even aware we do not
know.</td>
</tr>
</tbody>
</table>
<br>
<p>
In these category titles, the first word refers to our awareness, and
the second refers to our understanding. We can now consider these four
types of events in the context of a student preparing for an exam.</p>
<ol>
<li><p><strong>We know that we know.</strong> Known knowns are things we
are both aware of and understand. For the student, these would be the
types of questions that have come up regularly in previous papers and
that they know how to solve through recollection. They are aware that
they are likely to face these, and they know how to approach
them.</p></li>
<li><p><strong>We do not know what we know.</strong> Unknown knowns are
things we understand but may not be highly aware of. For the student,
these would be things they have not thought to prepare for but which
they understand and can do. For instance, there might be some questions
on topics they hadn’t reviewed; however, looking at these questions, the
student finds that they know the answer, although they cannot explain
why it is correct. This is sometimes called tacit knowledge or
unaccounted facts.</p></li>
<li><p><strong>We know that we do not know.</strong> Known unknowns are
things we are aware of but do not fully understand. For the student,
these would be the types of questions that have come up regularly in
previous papers but which they have not learned how to solve reliably.
The student is aware that they are likely to face these but is not sure
they will be able to answer them correctly. However, they are at least
aware that they need to do more work to prepare for them.</p></li>
<li><p><strong>We do not know that we do not know.</strong> Unknown
unknowns are things we are unaware of and do not understand. These
problems catch us completely off guard because we didn’t even know they
existed. For the student, unknown unknowns would be unexpectedly hard
questions on topics they have never encountered before and have no
knowledge or understanding of.</p></li>
</ol>
<p><strong>Unknown unknowns can also occur in AI safety and
risk.</strong> Researchers may be diligently studying various aspects of
AI and its potential risks, but new and unforeseen risks could emerge as
AI technology advances. These risks may be completely unknown and
unexpected, catching researchers off guard. It is important to
acknowledge the existence of unknown unknowns because they remind us
that there are limits to our knowledge and understanding. By being aware
of this, we can be more humble in our approach to problem-solving and
continuously strive to expand our knowledge and prepare for the
unexpected.</p>
<p><strong>We struggle to account for known unknowns and unknown
unknowns.</strong> We have included the first two categories—known
knowns and unknown knowns—for completeness. However, the most important
categories in risk analysis and management are the last two: known
unknowns and unknown unknowns. These categories pose risks because we do
not fully understand how best to respond to them, and we cannot be
perfectly confident that we will not suffer damage from them.</p>
<p><strong>Unknown unknowns are particularly concerning.</strong> If we
are aware that we might face a particular challenge, we can learn more
and prepare for it. However, if we are unaware that we will face a
challenge, we may be more vulnerable to harm. Black swans are the latter
type of event; they are not even on our radar before they happen.<p>
The difference between known unknowns and unknown unknowns is sometimes
also described as a distinction between conscious ignorance and
<em>meta-ignorance</em>. Conscious ignorance is when we see that we do
not know something, whereas meta-ignorance is when we are unaware of our
ignorance.</p>
<p><strong>Black swans occur in the real world.</strong> In an exam setting, it
might be unfair for someone to present us with an unknown unknown, such as
questions on topics irrelevant to the subject. The wider world,
however, is not a controlled environment; things do happen that we have
not thought to prepare for.</p>
<p><strong>Black swans indicate that our worldview is inaccurate or
incomplete.</strong> Consider a turkey being looked after by humans, who
provide plenty of food and a comfortable shelter safe from predators.
According to all the turkey’s evidence, the humans are benign and have
the turkey’s best interests at heart. Then, one day, the turkey is taken
to the slaughterhouse. This is very much an unknown unknown, or a black
swan, for the turkey, since nothing in its experience suggested that
this might happen <span class="citation"
data-cites="taleb2007blackswan">[5]</span>.<p>
This illustrates that we might have a model or worldview that does a
good job of explaining all our evidence to date, but then a black swan
can turn up and show us that our model was incorrect. The turkey’s
worldview of benign humans explained all the evidence until the
slaughterhouse trip. This event indicated a broader context that the
turkey was unaware of.<p>
Similarly, consider the 2008 financial crisis. Before this event, many
people, including many of those working in finance, assumed that housing
prices would always continue to increase. When the housing bubble burst,
it showed that this assumption was incorrect.</p>
<p><strong>Black swans are defined by our understanding.</strong> A
black swan is a black swan because our worldview is incorrect or
incomplete, which is why we fail to predict it. In hindsight, such
events often only make sense after we realize that our theory was
flawed. Seeing black swans makes us update our models to account for the
new phenomena we observe. When we have a new, more accurate model, we
can often look back in time and find the warning signs in the lead-up to
the event, which we did not recognize as such at the time.<p>
These examples also show that we cannot always reliably predict the
future from our experience; we cannot necessarily make an accurate
calculation of future risk based on long-running historical data.</p>
<p><strong>Only some tail events are black swans.</strong> As touched on
earlier, it is essential to note that black swans are a subset of tail
events, and not all tail events are black swans. For example, it is well
known that earthquakes happen in California and that a high-magnitude
one, often called “the big one,” will likely happen at some point. It is
not known exactly when—whether it will be the next earthquake or in
several decades. It might not be possible to prevent all damage from the
next “big one,” but there is an awareness of the need to prepare for it.
This represents a tail event, but not a black swan.</p>
<p><strong>Some people might be able to predict some black
swans.</strong> A restrictive definition of a black swan is an event
that is an absolute unknown unknown for everybody and is impossible to
anticipate. However, for our purposes, we are using the looser, more
practical working definition given earlier: a highly impactful event
that is largely unexpected for most people. For example, some
individuals with relevant knowledge of the financial sector did predict
the 2008 crisis, but it came out of the blue for most people. Even among
financial experts, the majority did not predict it. Therefore, we count
it as a black swan.<p>
Similarly, although pandemics have happened throughout history, and
smaller disease outbreaks occur yearly, the possibility of a pandemic
was not on most people’s radar before COVID-19. People with specific
expertise were more conscious of the risk, and epidemiologists had
warned several governments for years that they were inadequately
prepared for a pandemic. However, COVID-19 took most people by surprise
and therefore counts as a black swan under the looser definition.</p>
<p><strong>The development and rollout of AI technologies could be
subject to black swans.</strong> Within the field of AI, the consensus
view for a long time was that deep learning techniques were
fundamentally limited. Many people, even computer science professors,
did not take seriously the idea that deep learning technologies might
transform society in the near term—even if they thought this would be
possible over a timescale of centuries.<p>
Deep learning technologies have already begun to transform society, and
the rate of progress has outpaced most people’s predictions. We should,
therefore, seriously consider the possibility that the release of these
technologies could pose significant risks to society.<p>
There has been speculation about what these risks might be, such as a
flash war or an autonomous economy, both of which are discussed in the Collective Action Problems chapter.
These eventualities might be known to some people, but for many
potential risks, there is not widespread awareness in society; if they
happened today, they would be black swans. It is important that policymakers have some
knowledge of these risks. Furthermore, the expanding use of AI
technologies may entail risks of black swan scenarios that no one has
yet imagined.</p>
<h2
id="implications-of-tail-events-and-black-swans-for-risk-analysis">4.7.7 Implications
of Tail Events and Black Swans for Risk Analysis</h2>
<p>Tail events and black swans present problems for analyzing and
managing risks, because we do not know if or when they will happen. For
black swans, there is the additional challenge that we do not know what
form they will take.<p>
Since, by definition, we cannot predict the nature of black swans in
advance, we cannot put any specific defenses in place against them, as
we might for risks we have thought of. We can attempt to factor black
swans into our equations to some degree, by trying to estimate roughly
how likely they are and how much damage they would cause. However, they
add much more uncertainty into our calculations. We will now discuss
some common tendencies in thinking about risk, and why they can break
down in situations that are subject to tail events and black
swans.<p>
First, we consider how our typical risk estimation methods break down
under long tails because our standard arsenal of statistical tools is
rendered useless. Then, we consider how cost-benefit analysis is
strained when dealing with long-tailed events because of its sensitivity
to our highly uncertain estimates. After this, we discuss why we should
more explicitly consider extremes instead of averages, and look at
three common mistakes when dealing with long-tailed data: the delay
fallacy, misinterpreting an absence of evidence, and the preparedness
paradox.</p>
<p><strong>Tail events and black swans can substantially change the
average risk of a system.</strong> It is challenging to account for tail
events in the risk equation. Since tail events and black swans are
extremely severe, they significantly affect the average outcome. Recall
the equation for risk associated with a system:</p>
<p><span
class="math display">Risk = ∑<sub>hazard</sub><em>P</em>(hazard) × severity(hazard).</span>
Additionally, it is difficult to estimate their probability and severity
accurately. Yet, they would considerably change the evaluation of risk
because they are so severe. Furthermore, since we do not know what form
black swans will take, it may be even more difficult to factor them into
the equation accurately. This renders the usual statistical tools
useless in practice for analyzing risk in the face of potential black
swans.<p>
If the turkey in the previous example had tried to calculate the risk to
its wellbeing based on all its prior experiences, the estimated risk
would probably have been fairly low. It certainly would not have pointed
to the high risk of being taken to the slaughterhouse, because nothing
like that had ever happened to the turkey before.<p>
</p>
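<p>The sketch below illustrates this with a hypothetical set of hazards: the single
tail hazard dominates the sum, so any error in its highly uncertain probability
estimate swings the overall risk figure. All names and numbers are invented for
illustration.</p>
<pre><code class="language-python"># Hypothetical hazards: name -> (annual probability, severity in damage units)
hazards = {
    "frequent minor outage": (0.5, 10),
    "moderate failure": (0.05, 200),
    "catastrophic tail event": (0.001, 1_000_000),  # probability highly uncertain
}

total_risk = sum(p * severity for p, severity in hazards.values())
print(total_risk)  # 1015.0, dominated by the tail hazard's contribution of 1000

# If the tail hazard's probability estimate is off by a factor of ten in either
# direction, the total risk estimate changes by roughly a factor of ten too.
for p_tail in (0.0001, 0.001, 0.01):
    print(p_tail, 0.5 * 10 + 0.05 * 200 + p_tail * 1_000_000)
</code></pre>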
<div>
<figure>
<img
src="https://raw.githubusercontent.com/WilliamHodgkins/AISES/main/images/convergence_v2.png" class="tb-img-full"
style="width:70%" />
<p class="tb-caption">Figure 4.12: The mean of a long-tailed distribution is
slow to convergence, rendering the mean an inappropriate summary statistic. </p>
</figure>
</div>
<p><strong>We need a much larger dataset than usual.</strong> As we
increase the number of observations, we converge on an average value.
Suppose we are measuring heights and calculating a new average every
time we add a new data point. As shown in the first graph in the figure above, as we
increase our number of data points, we quickly converge on an average
that changes less and less with the addition of each new data point.
This is a result of the law of large numbers.<p>
Heights, however, are a thin-tailed variable. If we look instead at a
long-tailed variable, such as net worth, as shown in the second graph in
the figure above, a
single extreme observation can change the average by several orders of
magnitude. The law of large numbers still applies, in that we will still
eventually converge on an average value, but it will take much
longer.</p>
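<p>A minimal simulation sketch of this contrast, using a normal distribution for the
thin-tailed case and a Pareto distribution as a stand-in for a long-tailed variable
like net worth (illustrative assumptions only):</p>
<pre><code class="language-python">import numpy as np

rng = np.random.default_rng(2)
n = 100_000

heights = rng.normal(170, 10, n)                # thin-tailed
net_worths = (rng.pareto(1.1, n) + 1) * 10_000  # long-tailed (illustrative)

for name, sample in [("heights", heights), ("net worths", net_worths)]:
    running_mean = np.cumsum(sample) / np.arange(1, n + 1)
    print(name,
          "mean after 1,000 points:", round(float(running_mean[999]), 1),
          "mean after 100,000 points:", round(float(running_mean[-1]), 1))
# The running mean of heights settles almost immediately, while the running mean
# of the long-tailed sample can still jump substantially late in the sequence.
</code></pre>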
<p><strong>Linear regression is a standard prediction method but is less
useful for long-tailed data.</strong> Linear regression is a technique
widely used to develop predictive models based on historical data.
However, in situations where we are subject to the possibility of tail
events or black swans, we might not be sure that we have enough
historical data to converge on an accurate calculation of average risk.
Linear regression is, therefore, less helpful in assessing and
predicting risk for long-tailed scenarios.</p>
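<p>As a rough sketch of this fragility, the example below fits a line to twenty years
of mostly small, hypothetical losses and then repeats the fit after replacing the final
year with a tail-event loss; all numbers are made up for illustration.</p>
<pre><code class="language-python">import numpy as np

years = np.arange(2000, 2020)
losses = 10.0 + 0.5 * np.arange(20)  # mild upward trend: 10.0, 10.5, ..., 19.5

slope, intercept = np.polyfit(years, losses, 1)
print(round(slope, 2))  # 0.5: the fitted historical trend

# Replace the final year with a single tail-event loss
losses_with_tail = losses.copy()
losses_with_tail[-1] = 10_000.0

slope_tail, _ = np.polyfit(years, losses_with_tail, 1)
print(round(slope_tail, 2))  # roughly 143: hundreds of times the original slope
</code></pre>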
<p><strong>Cost-benefit analysis using long-tailed data often requires highly
accurate estimates.</strong> Traditional cost-benefit analysis weighs the
probability of different results and how much we would lose or gain in
each case. From this information, we can calculate whether we expect the
outcome of a situation to be positive or negative. For example, if we
bet on a 50/50 coin toss where we will either win $5 or lose $5, our
overall expected outcome is $0.<p>
<strong>Example: lotteries.</strong> Now, imagine that we are trying to
perform a cost-benefit analysis for a lottery where we have a high
probability of winning a small amount and a low probability of losing a
large amount. If we have a 99.9% chance of winning $15 and a 0.1% chance
of losing $10,000, then our expected outcome is:<p>
<span class="math display">(0.999×15) + (0.001×−10000) = 4.985.</span>
Since this number is positive, we might believe it is a good idea to bet
on the lottery. However, if the probabilities are only slightly
different, at 99.7% and 0.3%, then our expected outcome is:<p>
<span
class="math display">(0.997×15) + (0.003×−10000) = − 15.045.</span></p>
<p>This illustrates that just a tiny change in probabilities sometimes
makes a significant difference in whether we expect a positive or a
negative outcome. In situations like this, where the expected outcome is
highly sensitive to probabilities, using an estimate of probability that
is only slightly different from the actual value can completely change
the calculations. For this reason, relying on this type of cost-benefit
analysis does not make sense if we cannot be sure we have accurate
estimates of the probabilities in question.</p>
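<p>The two calculations above can be reproduced directly, and sweeping the uncertain
loss probability shows how sharply the expected outcome flips sign; this is only a
sketch of the lottery example in the text.</p>
<pre><code class="language-python">def expected_outcome(p_loss: float, win: float = 15.0, loss: float = -10_000.0) -> float:
    """Expected value of the bet: win with probability 1 - p_loss, lose otherwise."""
    return (1 - p_loss) * win + p_loss * loss

print(expected_outcome(0.001))  # 4.985, as in the first calculation
print(expected_outcome(0.003))  # -15.045, as in the second

# Sweeping the uncertain loss probability shows where the sign flips
for p in (0.0005, 0.001, 0.0015, 0.002, 0.003):
    print(p, round(expected_outcome(p), 3))
# The expected outcome changes sign near p = 0.0015, so a small error in the
# probability estimate can completely change the conclusion.
</code></pre>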
<p><strong>It is difficult to form accurate probability estimates for
black swans.</strong> Black swans happen rarely, so we do not have a lot
of historical data from which to calculate the exact probability that
they will occur. As we explored above, it takes a lot of data—often more
than is accessible—to make accurate judgments for long-tailed events
more generally. Therefore, we cannot be certain that we know their
probabilities accurately, rendering cost-benefit analysis unsuitable for
long-tailed data, especially for black swans.<p>
This consideration could be significant for deciding whether and how to
use AI technologies. We might have a high probability of benefitting
from the capabilities of deep learning models, and there might be only a
low probability of an associated black swan transpiring and causing
harm. However, we cannot calculate an accurate probability of a black
swan event, so we cannot evaluate our expected outcome precisely.</p>
<p><strong>It is unrealistic to estimate risk precisely when we could face black
swans.</strong> If we attempt to develop a detailed statistical model of
risk for a situation, we are making an implicit assumption that we have
a comprehensive understanding of all the possible failure modes and how
likely they are. However, as previous black swan events have
demonstrated, we cannot always assume we know all the
eventualities.<p>
Even for tail events that are known unknowns, we cannot assume we have
sufficiently accurate information about their probabilities and impacts.
Trying to precisely estimate risk when we might be subject to tail
events or black swans can be viewed as an “unscientific overestimation
of the reach of scientific knowledge” <span class="citation"
data-cites="taleb2012antifragile">[6]</span>.</p>
<p><strong>When making risk-related decisions, we should consider
extremes, not only the average.</strong> Aside from whether or not we
can calculate an accurate average outcome under the risk of tail events
and black swans, there is also a question of whether the average is what
we should be paying attention to in these situations anyway. This idea
is captured in the following adage commonly attributed to Milton
Friedman: “Never try to walk across a river just because it has an
average depth of four feet.” If a river is four feet deep on average,
that might mean that it has a constant depth of four feet and can be
waded across safely. It might also mean that it is two or
three feet deep near the banks and eight feet deep at some point in the
middle. If that were the case, then it would not be a good idea to
attempt to wade across.<p>
Focusing on averages rather than extremes is one example of
the mistakes people make when thinking about event types that might include
black swans. Next, we will explore three more: the delay fallacy,
misinterpreting an absence of evidence, and the preparedness
paradox.</p>
<p>If we do not have enough information to conduct a detailed risk
analysis, it might be tempting to gather more information before taking
action. A common excuse for delaying action is: “If we wait, we will
know more about the situation and be able to make a more informed
decision, so we should not make any decisions now.”</p>
<p><strong>In thin-tailed scenarios, waiting for more information is
often a good approach.</strong> Under thin tails, additional
observations will likely help us refine our knowledge and identify the
best course of action. Since there are no tail events, there is a limit
to how much damage a single event can do. There is, therefore, less
urgency to take action in these situations. The benefit of more
information can be considered to outweigh the delay in taking
action.</p>
<p><strong>In long-tailed scenarios, waiting for more information can
mean waiting until it is too late.</strong> Additional observations will
not necessarily improve our knowledge of the situation under long tails.
Most, if not all, additional observations will probably come from the
head of the distribution and will not tell us anything new about the
risk of tail events or black swans. The longer we wait before preparing,
the more we expose ourselves to the possibility of such an event
happening while we are unprepared. When tail events and black swans do
materialize, it is often too late to intervene and prevent harm.<p>
Governments failing to improve their pandemic preparedness might be
considered an example of this. Epidemiologists’ warnings were long
ignored, which seemed fine for a long time because pandemics are rare.
However, when COVID-19 struck, many governments tried to get hold of
personal protective equipment (PPE) simultaneously and found a shortage.
If they had stocked up on this before the pandemic, as experts had
advised, then the outcome might have been less severe.<p>
Furthermore, if a tail event or black swan is particularly destructive,
we can never observe it and use that information to help us make better
calculations. The turkey cannot use the event of being taken to the
slaughterhouse to make future risk estimations more accurate. With
respect to society, we cannot afford for events of this nature to happen
even once.</p>
<p><strong>We should be proactively investing in AI safety now.</strong>
Since the development and rollout of AI technologies could represent a
long-tailed scenario, entailing a risk of tail events and black swans,
it would not make sense to delay action with the excuse that we do not
have enough information. Instead, we should be proactive about safety by
investing in the three key research fields discussed earlier:
robustness, monitoring, and control. If we wait until we are certain
that an AI could pose an existential risk before working on AI safety,
we might be waiting until it is too late.</p>
<p>It can be hard to imagine a future that is significantly different
from our past and present experiences. Suppose a particular event has
never happened before. In that case, it can be tempting to interpret
that as an indication that we do not need to worry about it happening in
the future, but this is not necessarily a sound judgment.</p>
<p><strong>An absence of evidence is not strong evidence of
absence.</strong> Even if we have not found evidence that there is a
risk of black swan events, that is not evidence that there is no risk of
black swan events. In the context of AI safety, we may not have found
evidence that deep learning technologies could pose specific risks like
deceptive alignment, but that does not necessarily mean that they do not
pose such risks or that they will not at some point in the future.</p>
<p><strong>Safety measures that prevent harm can seem
redundant.</strong> Imagine that we enact safety measures to reduce the
risk of a potentially destructive event, and then the event does not
happen. Some might be tempted to say that the safety measures were
unnecessary or that implementing them was a waste of time and resources.
Even if the event does happen but is not very severe, some people might
still say that the safety measures were unnecessary because the event’s
consequences were not so bad.<p>
However, this conclusion ignores the possibility that the event did not
happen or was less severe because of the safety measures. We cannot run
through the same period of time twice and discover how things would have
unfolded without any safety measures. This is a cognitive bias known as
the preparedness paradox: efforts to prepare for potential disasters can
reduce harm from these events and, therefore, reduce the perceived need
for such preparation.</p>
<p><strong>The preparedness paradox can lead to self-defeating
prophecies.</strong> A related concept is the “self-defeating prophecy,”
where a forecast can lead to actions that prevent the forecast from
coming true. For example, suppose an epidemiologist predicts that there
will be a high death toll from a particular infectious disease. In that
case, this might prompt people to wash their hands more frequently and
avoid large gatherings to avoid infection. These behaviors are likely to
reduce infection rates and lead to a lower death toll than the
epidemiologist predicted.<p>
If we work proactively on reducing risks from global pandemics, and no
highly destructive pandemics come to pass, some people would believe
that the investment was unnecessary. However, it might be
<em>because</em> of those efforts that no destructive events happen.
Since we usually cannot run two parallel worlds—one with safety efforts
and one without—it might be difficult or impossible to prove that the
safety work prevented harm. Those who work in this area may never know
whether their efforts have prevented a catastrophe and have their work
vindicated. Nevertheless, preventing disasters is essential, especially
in cases like the development of AI, where we have good theoretical
reasons to believe that a black swan is on the cards.</p>
<h2 id="identifying-the-risk-of-tail-events-or-black-swans">4.7.8 Identifying
the Risk of Tail Events or Black Swans</h2>
<p>Since the possibility of tail events and black swans affects how we
approach risk management, we must consider whether we are facing a
long-tailed or thin-tailed scenario. We need to know whether we can rely
on standard statistical methods to estimate risk or whether we face the
possibility of rare, high-impact events. This can be difficult to
determine, especially in cases of low information, but there are some
valuable indicators we can look for.</p>
<p><strong>Highly connected systems often give rise to long-tailed
scenarios.</strong> As discussed earlier, multiplicative phenomena can
lead to long tails. We should ask ourselves: Can one part of the system
rapidly affect many others? Can a single event trigger a cascade? If the
answers to these questions are yes, then it is possible that an event
can escalate to become a tail event with an extreme impact.</p>
<p><strong>The use of AI in society could create a new, highly connected
system.</strong> If deep learning models become enmeshed within society
and are put in charge of various decisions, then we will have a highly
connected system where these agents regularly interact with humans and
each other. In these conditions, a single erroneous decision made by one
agent could trigger a cascade of harmful decisions by others, for
example, if they govern the deployment of weapons. This could leave us
vulnerable to sudden catastrophes such as flash wars or powerful rogue
AIs.</p>
<p><strong>Complex systems may be more likely to entail a risk of black
swans.</strong> Complex systems can evolve in unpredictable ways and
develop unanticipated behaviors. We cannot usually foresee every
possible way a complex system might unfold. For this reason, we might
expect that complex evolving systems present an inherent risk of black
swans.</p>
<p><strong>Deep learning models and the surrounding social systems are
all complex systems.</strong> It is unlikely that we will be able to
predict every single way AI might be used, just as, in the early days of
the internet, it would have been difficult to predict every way
technology would ultimately be used. This means that there might be a
risk of AI being used in harmful ways that we have not foreseen,
potentially leading to a destructive black swan event that we are
unprepared for. The idea that deep learning systems qualify as complex
systems is discussed in greater depth in the Complex Systems chapter.</p>
<p><strong>New systems may be more likely to present black
swans.</strong> Absence of evidence is only evidence of absence if we
expect that some evidence should have turned up in the timeframe that
has elapsed. For systems that have not been around for long, we would be
unlikely to have seen proof of tail events or black swans since these
are rare by definition.</p>
<p><strong>AI may not have existed for long enough for us to have
learned about all its risks.</strong> In the case of emerging
technology, it is reasonable to think that there might be a risk of tail
events or black swans, even if we do not have any evidence yet. The lack
of evidence might be explained simply by the fact that the technology
has not been around for long. Our meta-ignorance means that we should
take AI risk seriously. By definition, we can’t be sure there are no
unknown unknowns. Therefore, it would be overconfident to feel sure we
have eliminated all risks.</p>
<p><strong>Accelerating progress could increase the frequency of black
swan events.</strong> We have argued that black swan events should be
taken seriously, despite being rare. However, as technological progress
and economic growth advance at an increasing rate, such events may in
fact become more frequent, further compounding their relevance to risk
management. This is because the increasing pace of change also means
that we will more often face novel circumstances that could present
unknown unknowns. Moreover, within the globalized economy, social
systems are increasingly interconnected, raising the likelihood that
one failure could trigger a cascade and have an outsized impact.</p>
<p><strong>There are techniques for turning some black swans into known
unknowns.</strong> As discussed earlier, under our practical definition,
not all black swans are completely unpredictable, especially not for
people who have the relevant expertise. Ways of putting more black swans
on our radar include expanding our safety imagination, conducting stress
tests, and red-teaming <span class="citation"
data-cites="Marsden2017blackswan">[7]</span>.<p>
Expanding our “safety imagination” can help us envision a wider range of
possibilities. We can do this by playing a game of “what if” to increase
the range of possible scenarios we can imagine unfolding. Brainstorming
sessions can also help to rapidly generate lots of new ideas about
potential failure modes in a system. We can identify and question our
assumptions—about what the nature of a hazard will be, what might cause
it, and what procedures we will be able to follow to deal with it—in
order to imagine a richer set of eventualities.<p>
Some high reliability organizations (HROs) use a technique called horizon scanning, which involves
monitoring potential future threats and opportunities before they
arrive, to minimize the risk of unknown unknowns <span class="citation"
data-cites="boult2018horizon">[8]</span>. AI systems could be used to
enhance horizon-scanning capabilities by simulating situations that
mirror the real world with a high degree of complexity. The simulations
might generate data that reveal potential black swan risks to be aware
of when deploying a new system. As well as conducting horizon scanning,
HROs also contemplate near-misses and speculate about how they might
have turned into catastrophes, so that lessons can be learned.<p>
“Red teams” can find more black swans by adopting a mindset of malicious
intent. Red teams should try to think of as many ways as they can to
misuse or sabotage the system. They can then challenge the organization
on how it would respond to such attacks. Finally, stress tests, in which we
dry-run hypothetical scenarios, evaluate how well the system
copes with them, and consider how it could be improved, can strengthen
a system’s resilience to unexpected events.</p>
<br>
<br>
<h3>References</h3>
<div id="refs" class="references csl-bib-body" data-entry-spacing="0"
role="list">
<div id="ref-hu2023chatgpt" class="csl-entry" role="listitem">
<div class="csl-left-margin">[1] K.
Hu, <span>“ChatGPT sets record for fastest-growing user base - analyst
note.”</span> Accessed: Feb. 03, 2023. [Online]. Available: <a
href="https://www.reuters.com/technology/chatgpt-sets-record-fastest-growing-user-base-analyst-note-2023-02-01/">https://www.reuters.com/technology/chatgpt-sets-record-fastest-growing-user-base-analyst-note-2023-02-01/</a></div>
</div>
<div id="ref-sagan1993limit" class="csl-entry" role="listitem">
<div class="csl-left-margin">[2] S.
D. Sagan, <em>The limits of safety: Organizations, accidents, and
nuclear weapons</em>, vol. 177. Princeton University Press, 1993.
Accessed: Sep. 26, 2023. [Online]. Available: <a
href="http://www.jstor.org/stable/j.ctvzsmf8r">http://www.jstor.org/stable/j.ctvzsmf8r</a></div>
</div>
<div id="ref-Marsden2017" class="csl-entry" role="listitem">
<div class="csl-left-margin">[3] E.
Marsden, <span>“Designing for safety: Inherent safety, designed
in.”</span> Accessed: Jul. 31, 2017. [Online]. Available: <a
href="https://risk-engineering.org/safe-design/">https://risk-engineering.org/safe-design/</a></div>
</div>
<div id="ref-ord2023lindy" class="csl-entry" role="listitem">
<div class="csl-left-margin">[4] T.
Ord, <span>“The lindy effect.”</span> 2023. Available: <a
href="https://arxiv.org/abs/2308.09045">https://arxiv.org/abs/2308.09045</a></div>
<div id="ref-taleb2007blackswan" class="csl-entry" role="listitem">
<div class="csl-left-margin">[5] N.
N. Taleb, <em>The black swan: The impact of the highly improbable</em>,
vol. 2. Random House, 2007.</div>
</div>
<div id="ref-taleb2012antifragile" class="csl-entry" role="listitem">
<div class="csl-left-margin">[6] N.
N. Taleb, <em>Antifragile: Things that gain from disorder</em>. in
Incerto. Random House Publishing Group, 2012. Available: <a
href="https://books.google.com.au/books?id=5fqbz_qGi0AC">https://books.google.com.au/books?id=5fqbz_qGi0AC</a></div>
</div>
<div id="ref-Marsden2017blackswan" class="csl-entry" role="listitem">
<div class="csl-left-margin">[7] E.
Marsden, <span>“Black swans: The limits of probabilistic
modelling.”</span> Accessed: Jul. 31, 2017. [Online]. Available: <a
href="https://risk-engineering.org/black-swans/">https://risk-engineering.org/black-swans/</a></div>
</div>
<div id="ref-boult2018horizon" class="csl-entry" role="listitem">
<div class="csl-left-margin">[8] M.
Boult <em>et al.</em>, <em>Horizon scanning: A practitioner’s
guide</em>. IRM - Institute of Risk Management, 2018.</div>
</div>
<div id="ref-danzig2018technology" class="csl-entry" role="listitem">
<div class="csl-left-margin">[9] R.
Danzig, <span>“Technology roulette.”</span> Center for a new American
Security, 2018.</div>
</div>
</div>