-
Notifications
You must be signed in to change notification settings - Fork 5
/
comparisonError.tex
204 lines (178 loc) · 9.67 KB
/
comparisonError.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
\section{Batch comparison audits of a tolerable overstatement in votes}
\label{sec:comparisonError}
In this section we expand previous comparison auditing work to handle two new requirements.
The first relates to partitioning the permissible overstatement
through the parameters $\lambda_s$, as discussed in section~\ref{sec:hybrid}.
The second handles batch-level comparison audits.
To test whether the overstatement of any margin
(in votes) exceeds some fraction $\lambda$ of the overall margin $V_{w\ell}$ between
reported winner $w$ and reported loser $\ell$.
If the stratum contains all the ballots cast in the contest, then for $\lambda = 1$, this
would confirm the election outcome.
For stratified audits, we might want to test other values of $\lambda$, as described in Section~\ref{sec:hybrid}.
This section also addresses the second requirement by deriving a method for batches of arbitrary size, which might be useful
to audit contests that include CVR counties and legacy counties.
We keep the \emph{a priori} error bounds tighter than the ``super-simple''
method~\cite{stark10d}.
To keep the notation simpler, we consider only a single contest, but the
MACRO test statistic \cite{stark09c,stark10d} automatically extends the result to
auditing $C>1$ contests simultaneously.
The derivation is for plurality contests, including ``vote-for-$k$'' plurality contests.
Majority and super-majority contests are a minor
modification~\cite{stark08a}.\footnote{%
So are some forms of preferential and approval voting, such as Borda count, and
proportional representation contests, such as D'Hondt~\cite{starkTeague14}.
See \url{https://github.com/pbstark/S157F17/blob/master/audit.ipynb} for a derivation
of ballot-level comparison risk-limiting audits for super-majority contests. (Last visited 14 May 2018.)
Changes for IRV/STV are more complicated.
}
\subsection{Notation}
\begin{itemize}
\item $\mathcal{W}$: the set of reported winners of the contest
\item $\mathcal{L}$: the set of reported losers of the contest
\item $N_s$ ballots were cast in all in the stratum. (The contest might not appear on all $N_s$ ballots.)
\item $P$ ``batches'' of ballots are in stratum $s$. A batch contains one or more ballots. Every ballot in stratum $s$ is in exactly one batch.
\item $n_p$: number of ballots in batch $p$. $N_s = \sum_{p=1}^P n_p$.
\item $v_{pi} \in \{0, 1\}$: the reported votes for candidate $i$ in batch $p$
\item $a_{pi} \in \{0, 1\}$: actual votes for candidate $i$ in batch $p$.
If the contest does not appear on any ballot in batch $p$, then $a_{pi} = 0$.
\item $V_{w\ell,s} \equiv \sum_{p=1}^P (v_{pw} - v_{p\ell})$:
Reported margin in stratum $s$ of reported winner $w \in \mathcal{W}$ over reported loser
$\ell \in \mathcal{L}$, in votes.
\item $V_{w\ell}$:
Overall reported margin of reported winner $w \in \mathcal{W}$ over reported loser
$\ell \in \mathcal{L}$, in votes, for the entire contest (not just stratum $s$)
% If I'm not punch drunk, we really did manage to keep generalizations across multiple contests out of
% the subsequent exposition. However, we still use V
%
% \item $V_s$: smallest reported margin in the stratum among all $C$ contests audited using the same sample:
%$V_s \equiv \min_{w \in \mathcal{W}, \ell \in \mathcal{L}} V_{w \ell, s}$
%
% \item $V$: smallest reported overall margin among all $C$ contests audited using the same sample:
% $V \equiv \min_{w \in \mathcal{W}, \ell \in \mathcal{L}} V_{w \ell}$
\item $V$: smallest reported overall margin between any reported winner and reported loser:
$V \equiv \min_{w \in \mathcal{W}, \ell \in \mathcal{L}} V_{w \ell}$
\item $A_{w\ell,s} \equiv \sum_{p=1}^P (a_{pw} - a_{p\ell})$:
actual margin in the stratum of reported winner $w \in \mathcal{W}$ over reported loser
$\ell \in \mathcal{L}$, in votes
\item $A_{w\ell}$:
actual margin of reported winner $w \in \mathcal{W}$ over reported loser
$\ell \in \mathcal{L}$, in votes, for the entire contest (not just in stratum $s$)
\end{itemize}
\subsection{Reduction to maximum relative overstatement}
If the contest is entirely contained in stratum $s$, then
the reported winners of the contest are the actual winners if
$$
\min_{w \in \mathcal{W}, \ell \in \mathcal{L}} A_{w\ell,s} > 0.
$$
Here, we address the case that the contest may include a portion outside the stratum.
To combine independent samples in different strata, it is convenient
to be able to test whether the net overstatement error in a stratum exceeds a given threshold.
Instead of testing that condition directly, we will test a condition that is sufficient
but not necessary for the inequality to hold, to get a computationally simple test that
is still conservative (i.e., the risk is not larger than its nominal value).
For every winner, loser pair $(w, \ell)$, we want to test
whether the overstatement error exceeds some threshold, generally
one tied to the reported margin between $w$ and $\ell$.
For instance, for a stratified hybrid audit, we set the threshold to be
$\lambda_s V_{w\ell}$.
We want to test whether
$$
\sum_{p=1}^P (v_{pw}-a_{pw} - v_{p\ell} + a_{p\ell})/V_{w\ell} \ge \lambda_s.
$$
The maximum of sums is not larger than the sum of the maxima; that is,
$$
\max_{w \in \mathcal{W}, \ell \in \mathcal{L}}
\sum_{p=1}^P (v_{pw}-a_{pw} - v_{p\ell} + a_{p\ell})/V_{w\ell}
\le
\sum_{p=1}^P \max_{w \in \mathcal{W}, \ell \in \mathcal{L}}
(v_{pw}-a_{pw} - v_{p\ell} + a_{p\ell})/V_{w\ell}.
$$
Define
$$
e_p \equiv \max_{w \in \mathcal{W} \ell \in \mathcal{L}}
(v_{pw}-a_{pw} - v_{p\ell} + a_{p\ell})/V_{w\ell}.
$$
Then no reported margin is overstated by a fraction $\lambda_s$ or more
if
$$
E \equiv \sum_{p=1}^P e_p < \lambda_s.
$$
Thus if we can reject the hypothesis $E \ge \lambda_s$, we can conclude that
no pairwise margin was overstated by as much as a fraction $\lambda_s$.
Testing whether $E \ge \lambda_s$ would require a very large sample if we knew nothing at
all about $e_p$ without auditing batch $p$: a single large value of $e_p$ could make
$E$ arbitrarily large.
But there is an \emph{a priori} upper bound for $e_p$.
Whatever the reported votes $v_{pi}$ are in batch~$p$, we can find the
potential values of the actual votes $a_{pi}$ that would make the
error $e_p$ largest, because $a_{pi}$ must be between 0 and $n_p$,
the number of ballots in batch~$p$:
$$
\frac{v_{pw}-a_{pw} - v_{p\ell} + a_{p\ell}}{V_{w\ell}} \le
\frac{v_{pw}- 0 - v_{p\ell} + n_p}{V_{w\ell}}.
$$
Hence,
\begin{equation} \label{eq:uDef}
e_p \le \max_{w \in \mathcal{W}, \ell \in \mathcal{L}}
\frac{v_{pw} - v_{p\ell} + n_p}{V_{w\ell}} \equiv u_p.
\end{equation}
Knowing that $e_p \le u_p$ might let us conclude reliably that $E < \lambda_s$
by examining only a small number of batches---depending on the
values $\{ u_p\}_{p=1}^P$ and on the values of $\{e_p\}$ for the audited batches.
To make inferences about $E$, it is helpful to work with the \emph{taint}
$t_p \equiv \frac{e_p}{u_p} \le 1$.
Define $U \equiv \sum_{p=1}^P u_p$.
Suppose we draw batches at random with replacement, with probability $u_p/U$
of drawing batch $p$ in each draw, $p = 1, \ldots, P$.
(Since $u_p \ge 0$, these are all positive numbers, and they sum to 1,
so they define a probability distribution on the $P$ batches.)
Let $T_j$ be the value of $t_p$ for the batch $p$ selected in the $j$th draw.
Then $\{T_j\}_{j=1}^n$ are IID, $\mathbb{P} \{T_j \le 1\} = 1$, and
$$
\mathbb{E} T_1 = \sum_{p=1}^P \frac{u_p}{U} t_p =
\frac{1}{U}\sum_{p=1}^P u_p \frac{e_p}{u_p} =
\frac{1}{U} \sum_{p=1}^P e_p = E/U.
$$
Thus $E = U \mathbb{E} T_1$.
So, if we have strong evidence that
$\mathbb{E} T_1 < \lambda_s/U$, we have
strong evidence that $E < \lambda_s$.
This approach can be simplified even further by noting that $u_p$ has
a simple upper bound that does not depend on $v_{pi}$.
At worst, the reported result for batch $p$ shows $n_p$ votes for the
``least-winning'' apparent winner of the contest with the smallest margin,
but a hand interpretation would show that all $n_p$ ballots in the batch
had votes for the runner-up in that contest.
Since $V_{w\ell} \ge V\equiv \min_{w \in \mathcal{W}, \ell \in \mathcal{L}} V_{w \ell}$ and $0 \le v_{pi} \le n_p$,
$$
u_p = \max_{w \in \mathcal{W}, \ell \in \mathcal{L}}
\frac{v_{pw} - v_{p\ell} + n_p}{V_{w\ell}}
\le \max_{w \in \mathcal{W}, \ell \in \mathcal{L}}
\frac{n_p - 0 + n_p}{V_{w\ell}}
\le \frac{2n_p}{V}.
$$
Thus if we use $2n_p/V$ in lieu of $u_p$, we still get conservative results.
(We also need to re-define $U$ to be the sum of those upper bounds.)
An intermediate, still conservative approach would be to use this upper bound for
batches that consist of a single ballot, but use the sharper bound (\ref{eq:uDef})
when $n_p > 1$.
Regardless, for the new definition of $u_p$ and $U$,
$\{T_j\}_{j=1}^n$ are IID, $\mathbb{P} \{T_j \le 1\} = 1$,
and
$$
\mathbb{E} T_1 = \sum_{p=1}^P \frac{u_p}{U} t_p =
\frac{1}{U}\sum_{p=1}^P u_p \frac{e_p}{u_p} =
\frac{1}{U} \sum_{p=1}^P e_p = E/U.
$$
So, if we have evidence that $\mathbb{E} T_1 < \lambda_s/U$, we have evidence that
$E < \lambda_s$.
\subsection{Testing $\mathbb{E} T_1 \ge \lambda_s/U$}
A variety of methods are available to test whether $\mathbb{E} T_1 < \lambda_s/U$.
One particularly ``clean'' sequential method is based on Wald's Sequential Probability
Ratio Test (SPRT) (\cite{wald45}).
Harold Kaplan pointed out this method on a website that no longer exists.
A derivation of this ``Kaplan-Wald'' method is given in Appendix A of~\cite{starkTeague14};
to apply the method here, take $t = \lambda_s$ in their equation~18.
A different sequential method, the Kaplan-Markov method (also due to Harold Kaplan),
is given in~\cite{stark09b}.