<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Statistical Methods Supplement</title>
<style>
body {
font-family: Arial, sans-serif;
line-height: 1.6;
margin: 20px;
}
pre {
background: #f4f4f4;
padding: 15px;
border: 1px solid #ddd;
border-radius: 5px;
overflow-x: auto;
}
code {
font-family: Consolas, "Courier New", monospace;
font-size: 1rem;
}
h2 {
color: #333;
}
.equation {
font-family: 'Times New Roman', serif;
font-style: italic;
margin: 10px 0;
}
</style>
</head>
<body>
<h1>Mathematical Equations and Examples</h1>
<h2>1. Regression to the Mean (Francis Galton)</h2>
<p>Regression to the mean describes the tendency of extreme observations to be followed by values closer to the average. Galton formalized the principle while studying hereditary traits such as height.</p>
<p>Regression equation:</p>
<pre><code>y = β0 + β1x + ε</code></pre>
<p>Where:</p>
<ul>
<li><code>y</code>: Dependent variable (e.g., offspring’s trait)</li>
<li><code>x</code>: Independent variable (e.g., parent’s trait)</li>
<li><code>β0</code>: Intercept (baseline value when <code>x</code> is 0)</li>
<li><code>β1</code>: Slope (rate of change of <code>y</code> with respect to <code>x</code>)</li>
<li><code>ε</code>: Error term (captures variation not explained by the model)</li>
</ul>
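<p>A minimal Python sketch of estimating <code>β0</code> and <code>β1</code> by ordinary least squares; the parent/offspring values below are illustrative, not Galton's actual measurements:</p>

```python
# Sketch: ordinary least squares on a tiny hypothetical parent/offspring
# data set (illustrative values only).
parent = [64, 66, 68, 70, 72]      # x: parent's trait (e.g., height in inches)
offspring = [66, 67, 68, 69, 70]   # y: offspring's trait

n = len(parent)
mean_x = sum(parent) / n
mean_y = sum(offspring) / n

# beta1 = Cov(x, y) / Var(x); beta0 = mean_y - beta1 * mean_x
cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(parent, offspring)) / (n - 1)
var_x = sum((x - mean_x) ** 2 for x in parent) / (n - 1)

beta1 = cov_xy / var_x
beta0 = mean_y - beta1 * mean_x

print(beta1, beta0)  # 0.5 34.0
```

<p>A fitted slope below 1 is the hallmark of regression to the mean: parents far from the average predict offspring closer to it.</p>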
<p>Example of Categorization and Subjective Weights:</p>
<pre><code>
Categorized Groups:
Group A: High parental education (x₁ = 90), high income (x₂ = 80)
Group B: Moderate parental education (x₁ = 60), moderate income (x₂ = 50)
Group C: Low parental education (x₁ = 30), low income (x₂ = 20)
Regression Model with Subjective Weights:
y = β0 + β1(x₁) + β2(x₂) + ε
Weights: β1 = 0.6, β2 = 0.4
Predictions:
Group A: y = β0 + (0.6 × 90) + (0.4 × 80)
y = β0 + 54 + 32 = β0 + 86
Group B: y = β0 + (0.6 × 60) + (0.4 × 50)
y = β0 + 36 + 20 = β0 + 56
Group C: y = β0 + (0.6 × 30) + (0.4 × 20)
y = β0 + 18 + 8 = β0 + 26
This demonstrates how subjective weights (e.g., prioritizing education over income) can drastically alter predicted outcomes, reinforcing pre-existing biases.
</code></pre>
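<p>The role of the weights can be made explicit by recomputing the predictions with the priorities reversed. A short sketch using the group values above (<code>β0</code> is set to 0, since it shifts every group equally):</p>

```python
# Sketch: how swapping the subjective weights beta1 and beta2 changes
# predicted outcomes for the three groups (beta0 omitted for clarity).
groups = {"A": (90, 80), "B": (60, 50), "C": (30, 20)}  # (education, income)

def predict(x1, x2, b1, b2):
    return b1 * x1 + b2 * x2

education_first = {g: predict(x1, x2, 0.6, 0.4) for g, (x1, x2) in groups.items()}
income_first = {g: predict(x1, x2, 0.4, 0.6) for g, (x1, x2) in groups.items()}

print(education_first)  # A: 86.0, B: 56.0, C: 26.0
print(income_first)     # A: 84.0, B: 54.0, C: 24.0
```

<p>Every group's predicted score moves purely because of the analyst's choice of weights, with no change in the underlying data.</p>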
<h2>2. Correlation Coefficient (Karl Pearson)</h2>
<p>The correlation coefficient quantifies the strength and direction of a linear relationship between two variables.</p>
<p>Formula:</p>
<pre><code>r = Cov(X, Y) / (σₓ × σᵧ)</code></pre>
<p>Where:</p>
<ul>
<li><code>r</code>: Correlation coefficient (ranges from -1 to 1)</li>
<li><code>Cov(X, Y)</code>: Covariance between variables X and Y</li>
<li><code>σₓ</code>, <code>σᵧ</code>: Standard deviations of X and Y</li>
</ul>
<p>Example of Categorization and Subjective Weights:</p>
<pre><code>
Variables:
X = Productivity (high, moderate, low)
Y = Societal worth (subjective score)
Subjective Weights:
Group A: High productivity (X = 90), high societal worth (Y = 85)
Group B: Moderate productivity (X = 70), moderate societal worth (Y = 65)
Group C: Low productivity (X = 40), low societal worth (Y = 30)
Covariance Calculation:
Cov(X, Y) = Σ((Xᵢ - MeanX) * (Yᵢ - MeanY)) / (n - 1)
MeanX = (90 + 70 + 40) / 3 ≈ 66.67
MeanY = (85 + 65 + 30) / 3 = 60
Cov(X, Y) = ((90-66.67)*(85-60) + (70-66.67)*(65-60) + (40-66.67)*(30-60)) / 2
≈ (583.33 + 16.67 + 800) / 2
= 1400 / 2
= 700
Correlation Coefficient:
r = Cov(X, Y) / (σₓ × σᵧ)
Sample standard deviations from the data: σₓ ≈ 25.2, σᵧ ≈ 27.8
r = 700 / (25.2 × 27.8)
r ≈ 1.0
Because the "societal worth" scores were assigned to track productivity almost exactly, the correlation comes out near its maximum of 1: the subjective scoring manufactures the very relationship it claims to measure.
</code></pre>
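<p>As a check, the covariance and correlation above can be recomputed in a short script. Note that <code>|r|</code> can never exceed 1, so the standard deviations must be computed from the data rather than assumed:</p>

```python
import math

# Sketch: Pearson correlation for the productivity (X) and
# societal-worth (Y) scores from the worked example.
X = [90, 70, 40]
Y = [85, 65, 30]

n = len(X)
mean_x = sum(X) / n
mean_y = sum(Y) / n

cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y)) / (n - 1)
sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in X) / (n - 1))
sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in Y) / (n - 1))

r = cov_xy / (sd_x * sd_y)
# r comes out just under 1: the hand-picked scores are almost
# perfectly correlated by construction.
print(round(r, 3))  # 0.999
```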
<h2>3. Analysis of Variance (ANOVA) (Ronald Fisher)</h2>
<p>ANOVA compares means across multiple groups to determine if observed differences are statistically significant.</p>
<p>Formula:</p>
<pre><code>F = (SSB / (k - 1)) / (SSW / (N - k))</code></pre>
<p>Where:</p>
<ul>
<li><code>F</code>: F-statistic</li>
<li><code>SSB</code>: Sum of squares between groups (variance explained by group differences)</li>
<li><code>SSW</code>: Sum of squares within groups (variance within each group)</li>
<li><code>k</code>: Number of groups</li>
<li><code>N</code>: Total number of observations</li>
</ul>
<p>Example of Categorization and Subjective Weights:</p>
<pre><code>
Groups:
Group A: Productivity = 90, Education = 85
Group B: Productivity = 70, Education = 65
Group C: Productivity = 40, Education = 45
Weights:
Productivity weight = 0.6
Education weight = 0.4
Weighted Scores:
Group A: R = (0.6 × 90) + (0.4 × 85) = 54 + 34 = 88
Group B: R = (0.6 × 70) + (0.4 × 65) = 42 + 26 = 68
Group C: R = (0.6 × 40) + (0.4 × 45) = 24 + 18 = 42
SSB Calculation:
Overall Mean = (88 + 68 + 42) / 3 = 66
SSB = Σ(GroupSize × (GroupMean - OverallMean)²)
= (1 × (88 - 66)²) + (1 × (68 - 66)²) + (1 × (42 - 66)²)
= 484 + 4 + 576 = 1064
This demonstrates how subjective weights influence group rankings and justifications for resource allocation.
</code></pre>
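<p>A complete F-statistic also needs the within-group term: with only one observation per group, SSW is zero and F is undefined. A minimal sketch with three hypothetical observations per group (values are illustrative):</p>

```python
# Sketch: one-way ANOVA F-statistic, F = (SSB / (k-1)) / (SSW / (N-k)),
# for three small hypothetical groups.
groups = [
    [88, 92, 90],   # Group A
    [66, 70, 68],   # Group B
    [40, 44, 42],   # Group C
]

k = len(groups)                       # number of groups
N = sum(len(g) for g in groups)       # total observations
grand_mean = sum(sum(g) for g in groups) / N

# Between-group sum of squares: group size times squared distance
# of each group mean from the grand mean.
ssb = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)

# Within-group sum of squares: squared deviations from each group's mean.
ssw = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

f_stat = (ssb / (k - 1)) / (ssw / (N - k))
print(round(f_stat, 1))  # 433.0
```

<p>A large F means the between-group spread dwarfs the within-group spread; whether that spread is meaningful still depends on how the group scores were constructed.</p>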
<h2>4. Math Behind the World Happiness Report (WHR)</h2>
<p>The World Happiness Report uses the following regression equation to model happiness:</p>
<pre><code>
Happiness_Score = β0 + β1(Log GDP) + β2(Social Support) + β3(Health) + β4(Freedom) + β5(Generosity) + β6(Corruption) + β7(Positive Affect) + β8(Negative Affect) + ε
</code></pre>
<p>The WHR reports these regression coefficients:</p>
<ul>
<li>Log GDP per capita: 0.359</li>
<li>Social support: 2.526</li>
<li>Healthy life expectancy: 0.027</li>
<li>Freedom to make life choices: 1.331</li>
<li>Generosity: 0.537</li>
<li>Perceptions of corruption: -0.716</li>
<li>Positive affect: 2.285</li>
<li>Negative affect: 0.185</li>
</ul>
<p>Breaking Down the Percentages:</p>
<pre><code>
Total Impact = |0.359| + |2.526| + |0.027| + |1.331| + |0.537| + |0.716| + |2.285| + |0.185| = 7.966
Percentage Weights:
Log GDP per capita: (0.359 / 7.966) × 100 = 4.51%
Social support: (2.526 / 7.966) × 100 = 31.71%
Healthy life expectancy: (0.027 / 7.966) × 100 = 0.34%
Freedom: (1.331 / 7.966) × 100 = 16.71%
Generosity: (0.537 / 7.966) × 100 = 6.74%
Corruption: (0.716 / 7.966) × 100 = 8.99%
Positive affect: (2.285 / 7.966) × 100 = 28.68%
Negative affect: (0.185 / 7.966) × 100 = 2.32%
</code></pre>
<p><strong>What These Percentages Reveal:</strong></p>
<ul>
<li>Social support and positive affect dominate, together accounting for just over 60% of the total weighting.</li>
<li>Health, a critical factor for well-being, is given minimal weight (0.34%).</li>
</ul>
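<p>The percentage breakdown above can be reproduced by normalizing the absolute coefficients, as this short sketch shows:</p>

```python
# Sketch: converting the WHR coefficients listed above into
# percentage weights (absolute values, normalized to 100%).
coefficients = {
    "Log GDP per capita": 0.359,
    "Social support": 2.526,
    "Healthy life expectancy": 0.027,
    "Freedom to make life choices": 1.331,
    "Generosity": 0.537,
    "Perceptions of corruption": -0.716,
    "Positive affect": 2.285,
    "Negative affect": 0.185,
}

total = sum(abs(v) for v in coefficients.values())          # 7.966
weights = {k: round(abs(v) / total * 100, 2) for k, v in coefficients.items()}

print(weights["Social support"])           # 31.71
print(weights["Healthy life expectancy"])  # 0.34
```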
<h2>5. AI and the Risk of Amplified Divisions</h2>
<p>As artificial intelligence becomes more integral to decision-making, the ethical implications of how these systems are trained cannot be overstated. AI systems are often hailed as neutral tools capable of analyzing vast amounts of data without human bias. However, these models are only as objective as the data they are trained on and the priorities encoded by their creators.</p>
<p>Example of a Social Media Feedback Loop:</p>
<pre><code>Engagement_Score = β0 + β1(Clicks) + β2(Shares) + β3(Watch_Time) + ε</code></pre>
<p>Suppose an AI model assigns weights to engagement variables:</p>
<ul>
<li>Clicks (β1): 0.4</li>
<li>Shares (β2): 0.3</li>
<li>Watch Time (β3): 0.3</li>
</ul>
<p>Calculations:</p>
<pre><code>
Sensational Post:
Clicks = 100, Shares = 50, Watch Time = 60
Engagement_Score = (0.4 × 100) + (0.3 × 50) + (0.3 × 60)
= 40 + 15 + 18 = 73
Balanced Post:
Clicks = 70, Shares = 30, Watch Time = 40
Engagement_Score = (0.4 × 70) + (0.3 × 30) + (0.3 × 40)
= 28 + 9 + 12 = 49
</code></pre>
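<p>The worked example above can be run as a short script that scores and ranks the two posts (<code>β0</code> omitted, since it shifts every score equally; post names and values are the hypothetical ones from the example):</p>

```python
# Sketch: scoring and ranking posts with the hypothetical
# engagement weights from the example.
WEIGHTS = {"clicks": 0.4, "shares": 0.3, "watch_time": 0.3}

posts = {
    "sensational": {"clicks": 100, "shares": 50, "watch_time": 60},
    "balanced": {"clicks": 70, "shares": 30, "watch_time": 40},
}

def engagement_score(post):
    return sum(WEIGHTS[k] * post[k] for k in WEIGHTS)

scores = {name: engagement_score(p) for name, p in posts.items()}
ranked = sorted(scores, key=scores.get, reverse=True)

print(scores)   # sensational ~73, balanced ~49
print(ranked)   # sensational content is promoted first
```

<p>Because the higher-scoring post gets more distribution, it generates more clicks and shares, which raises its score further; the loop rewards whatever the weights were chosen to reward.</p>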
<p>This feedback loop prioritizes sensational content, amplifying biases and deepening divisions.</p>
<p><strong>Implications:</strong></p>
<ul>
<li><strong>Bias Amplification:</strong> Embedding subjective values perpetuates biases.</li>
<li><strong>Exacerbating Divisions:</strong> Feedback loops polarize societies.</li>
<li><strong>The Illusion of Objectivity:</strong> Users may trust AI without recognizing the biases in its training.</li>
</ul>
</body>
</html>