<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8"/>
<link type="text/css" rel="stylesheet" href="shared/css/style.css"/>
<script type="text/x-mathjax-config">
MathJax.Hub.Config({ TeX: { equationNumbers: {autoNumber: "all"} } });
</script>
<script type="text/javascript" src="shared/js/MathJax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
</head>
<body>
<div style="display:none">
<span class="math">\(\newcommand{\A}{\mat{A}}\)</span>
<span class="math">\(\newcommand{\B}{\mat{B}}\)</span>
<span class="math">\(\newcommand{\C}{\mat{C}}\)</span>
<span class="math">\(\newcommand{\D}{\mat{D}}\)</span>
<span class="math">\(\newcommand{\E}{\mat{E}}\)</span>
<span class="math">\(\newcommand{\F}{\mat{F}}\)</span>
<span class="math">\(\newcommand{\G}{\mat{G}}\)</span>
<span class="math">\(\newcommand{\H}{\mat{H}}\)</span>
<span class="math">\(\newcommand{\I}{\mat{I}}\)</span>
<span class="math">\(\newcommand{\K}{\mat{K}}\)</span>
<span class="math">\(\newcommand{\L}{\mat{L}}\)</span>
<span class="math">\(\newcommand{\M}{\mat{M}}\)</span>
<span class="math">\(\newcommand{\N}{\mat{N}}\)</span>
<span class="math">\(\newcommand{\One}{\mathbf{1}}\)</span>
<span class="math">\(\newcommand{\P}{\mat{P}}\)</span>
<span class="math">\(\newcommand{\Q}{\mat{Q}}\)</span>
<span class="math">\(\newcommand{\Rot}{\mat{R}}\)</span>
<span class="math">\(\newcommand{\R}{\mathbb{R}}\)</span>
<span class="math">\(\newcommand{\S}{\mathcal{S}}\)</span>
<span class="math">\(\newcommand{\T}{\mat{T}}\)</span>
<span class="math">\(\newcommand{\U}{\mat{U}}\)</span>
<span class="math">\(\newcommand{\V}{\mat{V}}\)</span>
<span class="math">\(\newcommand{\W}{\mat{W}}\)</span>
<span class="math">\(\newcommand{\X}{\mat{X}}\)</span>
<span class="math">\(\newcommand{\Y}{\mat{Y}}\)</span>
<span class="math">\(\newcommand{\argmax}{\mathop{\text{argmax}}}\)</span>
<span class="math">\(\newcommand{\argmin}{\mathop{\text{argmin}}}\)</span>
<span class="math">\(\newcommand{\c}{\vec{c}}\)</span>
<span class="math">\(\newcommand{\d}{\vec{d}}\)</span>
<span class="math">\(\newcommand{\e}{\vec{e}}\)</span>
<span class="math">\(\newcommand{\f}{\vec{f}}\)</span>
<span class="math">\(\newcommand{\g}{\vec{g}}\)</span>
<span class="math">\(\newcommand{\mat}[1]{\mathbf{#1}}\)</span>
<span class="math">\(\newcommand{\min}{\mathop{\text{min}}}\)</span>
<span class="math">\(\newcommand{\n}{\vec{n}}\)</span>
<span class="math">\(\newcommand{\p}{\vec{p}}\)</span>
<span class="math">\(\newcommand{\q}{\vec{q}}\)</span>
<span class="math">\(\newcommand{\r}{\vec{r}}\)</span>
<span class="math">\(\newcommand{\transpose}{{\mathsf T}}\)</span>
<span class="math">\(\newcommand{\tr}[1]{\mathop{\text{tr}}{\left(#1\right)}}\)</span>
<span class="math">\(\newcommand{\t}{\vec{t}}\)</span>
<span class="math">\(\newcommand{\u}{\vec{u}}\)</span>
<span class="math">\(\newcommand{\vec}[1]{\mathbf{#1}}\)</span>
<span class="math">\(\newcommand{\x}{\vec{x}}\)</span>
<span class="math">\(\newcommand{\y}{\vec{y}}\)</span>
<span class="math">\(\newcommand{\z}{\vec{z}}\)</span>
<span class="math">\(\renewcommand{\v}{\vec{v}}\)</span>
</div>
<h1 id="geometryprocessing–registration">Geometry Processing – Registration</h1>
<blockquote>
<p><strong>To get started:</strong> Fork this repository then issue</p>
<pre><code>git clone --recursive http://github.com/[username]/geometry-processing-registration.git
</code></pre>
</blockquote>
<h2 id="installationlayoutandcompilation">Installation, Layout, and Compilation</h2>
<p>See
<a href="http://github.com/alecjacobson/geometry-processing-introduction">introduction</a>.</p>
<h2 id="execution">Execution</h2>
<p>Once built, you can execute the assignment from inside the <code>build/</code> directory using</p>
<pre><code>./registration [path to mesh1.obj] [path to mesh2.obj]
</code></pre>
<h2 id="background">Background</h2>
<p>In this assignment, we will be implementing a version of the <a href="https://en.wikipedia.org/wiki/Iterative_closest_point">iterative closest
point (ICP)</a> algorithm, not to be
confused with <a href="https://en.wikipedia.org/wiki/Insane_Clown_Posse">Insane Clown Posse</a>.</p>
<p>Rather than <a href="https://en.wikipedia.org/wiki/Point_set_registration">registering multiple point
clouds</a>, we will register
multiple triangle mesh surfaces. </p>
<p>This <em>algorithm</em> and its many variants have been used for quite some time to
align discrete shapes. One of the first descriptions is given in “A Method for
Registration of 3-D Shapes” by Besl & McKay 1992. However, the award-winning
PhD thesis of Sofien Bouaziz (“Realtime Face Tracking and Animation” 2015,
section 3.2–3.3) contains a more modern view that unifies many of the variants
with respect to how they impact the same core optimization problem. </p>
<p>For our assignment, we will assume that we have a triangle mesh representing a
complete scan of the surface <span class="math">\(Y\)</span> of some <a href="https://en.wikipedia.org/wiki/Rigid_body">rigid
object</a> and a new partial scan of
that surface <span class="math">\(X\)</span>.</p>
<figure>
<img src="images/max-inputs.jpg" alt="Example input" />
<figcaption>Example input: a partial scan mesh surface <span class="math">\(X\)</span> is misaligned with the
mesh of the complete surface <span class="math">\(Y\)</span>.</figcaption>
</figure>
<p>These meshes will not have the same number of vertices or even the same
topology. We will first explore different ways to <em>measure</em> how well aligned
two surfaces are and then how to optimize the <em>rigid</em> alignment of the partial
surface <span class="math">\(X\)</span> to the complete surface <span class="math">\(Y\)</span>.</p>
<h2 id="hausdorffdistance">Hausdorff distance</h2>
<p>We would like to compute a single scalar number that measures how poorly two
surfaces are matched. In other words, we would like to measure the <em>distance</em>
between two surfaces. Let’s start by reviewing more familiar distances:</p>
<h4 id="point-to-pointdistance">Point-to-point distance</h4>
<p>The usual Euclidean distance between <em>two points</em> <span class="math">\(\x\)</span> and <span class="math">\(\y\)</span> is the <span class="math">\(L²\)</span>
norm of their difference:</p>
<p><span class="math">\[
d(\x,\y) = ‖\x - \y‖.
\]</span></p>
<h4 id="point-to-projectiondistance">Point-to-projection distance</h4>
<p>When we consider the distance between a point <span class="math">\(\x\)</span> and some <em>larger</em> object <span class="math">\(Y\)</span> (a line,
a circle, a surface), the natural extension is to take the distance to the
closest point <span class="math">\(\y\)</span> on <span class="math">\(Y\)</span>:</p>
<p><span class="math">\[
d(\x,Y) = \inf_{\y ∈ Y} d(\x,\y).
\]</span></p>
<p>Written in this way, the
<a href="https://en.wikipedia.org/wiki/Infimum_and_supremum">infimum</a> considers all
possible points <span class="math">\(\y\)</span> and keeps the minimum distance. We may equivalently write
this distance instead as simply the point-to-point distance between <span class="math">\(\x\)</span> and
the <em>closest-point projection</em> <span class="math">\(P_Y(\x)\)</span>:</p>
<p><span class="math">\[
d(\x,Y) = d(\x,P_Y(\x)) = ‖\x - P_Y(\x)‖.
\]</span></p>
<p>If <span class="math">\(Y\)</span> is a smooth surface, this projection will also be an <a href="https://en.wikipedia.org/wiki/Projection_(linear_algebra)#Orthogonal_projections">orthogonal
projection</a>.</p>
<figure>
<img src="images/max-point-mesh.gif" alt="The distance between a surface $Y$ (light blue) and a point $\x$ (orange) is
determined by the closest point $P_Y(\x)$ (blue)" />
<figcaption>The distance between a surface <span class="math">\(Y\)</span> (light blue) and a point <span class="math">\(\x\)</span> (orange) is
determined by the closest point <span class="math">\(P_Y(\x)\)</span> (blue)</figcaption>
</figure>
<h3 id="directedhausdorffdistance">Directed Hausdorff distance</h3>
<p>We might be tempted to define the distance from surface <span class="math">\(X\)</span> to <span class="math">\(Y\)</span> as the
<em>infimum</em> of <em>point-to-projection</em> distances over all points <span class="math">\(\x\)</span> on <span class="math">\(X\)</span>:</p>
<p><span class="math">\[
D_\text{inf}(X,Y) = \inf_{\x ∈ X} ‖\x - P_Y(\x)‖,
\]</span></p>
<p>but this will not be useful for registering two surfaces: it will measure zero
if even just a single point <span class="math">\(\x\)</span> of <span class="math">\(X\)</span> happens to lie on <span class="math">\(Y\)</span>. Imagine the noses of
two faces touching at their tips.</p>
<p>Instead, we should take the <em>supremum</em> of <em>point-to-projection</em> distances over
all points <span class="math">\(\x\)</span> on <span class="math">\(X\)</span>:</p>
<p><span class="math">\[
D_{\overrightarrow{H}}(X,Y) = \sup_{\x ∈ X} ‖\x - P_Y(\x)‖.
\]</span></p>
<p>This surface-to-surface distance measure is called the <em>directed</em> <a href="https://en.wikipedia.org/wiki/Hausdorff_distance">Hausdorff
distance</a>. We may interpret
this as taking the worst of the best: we
let each point <span class="math">\(\x\)</span> on <span class="math">\(X\)</span> declare its shortest distance to <span class="math">\(Y\)</span> and then keep
the longest of those.</p>
<figure>
<img src="images/max-point-mesh-farthest.jpg" alt="The directed Hausdorff distance between from surface $X$ (light orange) to
another surface $Y$ (light blue) is determined by the point on $X$ (orange)
whose closest point on $Y$ (blue) is the farthest
away." />
<figcaption>The directed Hausdorff distance between from surface <span class="math">\(X\)</span> (light orange) to
another surface <span class="math">\(Y\)</span> (light blue) is determined by the point on <span class="math">\(X\)</span> (orange)
whose closest point on <span class="math">\(Y\)</span> (blue) is the farthest
away.</figcaption>
</figure>
<p>It is easy to verify that <span class="math">\(D_{\overrightarrow{H}}\)</span> will only equal zero if all
points on <span class="math">\(X\)</span> also lie exactly on <span class="math">\(Y\)</span>. </p>
<p>The converse is not true: if <span class="math">\(D_{\overrightarrow{H}}=0\)</span> there may still be
points on <span class="math">\(Y\)</span> that do not lie on <span class="math">\(X\)</span>. In other words, <em>in general</em> the directed
Hausdorff distance from surface <span class="math">\(X\)</span> to surface <span class="math">\(Y\)</span> will not equal the Hausdorff
distance from surface <span class="math">\(Y\)</span> to surface <span class="math">\(X\)</span>:</p>
<p><span class="math">\[
D_{\overrightarrow{H}}(X,Y) ≠ D_{\overrightarrow{H}}(Y,X).
\]</span></p>
<h4 id="directedhausdorffdistancebetweentrianglemeshes">directed Hausdorff distance between triangle meshes</h4>
<p>We can approximate a <em>lower bound</em> on the Hausdorff distance between two meshes
by densely sampling surfaces <span class="math">\(X\)</span> and <span class="math">\(Y\)</span>. We will discuss sampling methods
later. For now, consider that we have chosen a set <span class="math">\(\P_X\)</span> of <span class="math">\(k\)</span> points on <span class="math">\(X\)</span>
(each point might lie at a vertex, along an edge, or inside a triangle). The
directed Hausdorff distance from <span class="math">\(X\)</span> to another triangle mesh <span class="math">\(Y\)</span> must be
<em>at least</em> the directed Hausdorff distance from this <a href="https://en.wikipedia.org/wiki/Point_cloud">point
cloud</a> <span class="math">\(\P_X\)</span> to <span class="math">\(Y\)</span>:</p>
<p><span class="math">\[
D_{\overrightarrow{H}}(X,Y) ≥
D_{\overrightarrow{H}}(\P_X,Y) = \max_{i=1}^k ‖\p_i - P_Y(\p_i)‖,
\]</span></p>
<p>where we should be careful to account for the fact that the projection <span class="math">\(P_Y(\p_i)\)</span> of the
point <span class="math">\(\p_i\)</span> onto the triangle mesh <span class="math">\(Y\)</span> might lie at a vertex, along an edge, or
inside a triangle.</p>
<p>As our sampling <span class="math">\(\P_X\)</span> becomes denser and denser on <span class="math">\(X\)</span> this lower bound will
approach the true directed Hausdorff distance. Unfortunately, an efficient
<em>upper bound</em> is significantly more difficult to design.</p>
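<p>For concreteness, here is a minimal C++ (Eigen) sketch of this sampled lower bound. It assumes helpers behaving like the <code>random_points_on_mesh</code> and <code>point_mesh_distance</code> routines described in the Tasks section below; the exact signatures are an assumption, not a prescription:</p>
<pre><code>// Sketch only: lower bound on the directed Hausdorff distance via sampling.
// Assumes random_points_on_mesh and point_mesh_distance behave as described
// in the Tasks section below (signatures assumed).
#include <Eigen/Core>

double hausdorff_lower_bound_sketch(
  const Eigen::MatrixXd & VX, const Eigen::MatrixXi & FX,
  const Eigen::MatrixXd & VY, const Eigen::MatrixXi & FY,
  const int n)
{
  Eigen::MatrixXd X;
  random_points_on_mesh(n, VX, FX, X);     // n random samples on X
  Eigen::VectorXd D;
  Eigen::MatrixXd P, N;
  point_mesh_distance(X, VY, FY, D, P, N); // distance to closest point on Y
  return D.maxCoeff();                     // worst sample: a lower bound
}
</code></pre>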
<h4 id="hausdorffdistanceforalignmentoptimization">Hausdorff distance for alignment optimization</h4>
<p>Even if it <em>were</em> cheap to compute, the Hausdorff distance is difficult to
<em>optimize</em> when aligning two surfaces. If we treat the Hausdorff distance
between surfaces <span class="math">\(X\)</span> and <span class="math">\(Y\)</span> as an energy to be minimized, then the only change to
the surfaces that will decrease the energy is moving the (in general)
isolated point on <span class="math">\(X\)</span> and isolated point on <span class="math">\(Y\)</span> generating the maximum-minimum
distance. In effect, the rest of the surface does not even affect the
Hausdorff distance. This, or any type of <span class="math">\(L^∞\)</span> norm, will be much more
difficult to optimize.</p>
<p>Hausdorff distance can serve as a validation measure, while we turn to <span class="math">\(L²\)</span>
norms for optimization.</p>
<h2 id="integratedclosest-pointdistance">Integrated closest-point distance</h2>
<p>We would like a distance measure between two surfaces that—like Hausdorff
distance—does not require a shared parameterization. Unlike Hausdorff
distance, we would like this distance to <em>diffuse</em> the measurement over the
entire surfaces rather than generate it from the sole <em>worst offender</em>. We can
accomplish this by replacing the <em>supremum</em> in the Hausdorff distance (<span class="math">\(L^∞\)</span>)
with an integral of squared distances (<span class="math">\(L²\)</span>). Let us first define a directed
<em>closest-point distance</em> from a surface <span class="math">\(X\)</span> to another surface <span class="math">\(Y\)</span>, as the
integral of the squared distance from every point <span class="math">\(\x\)</span> on <span class="math">\(X\)</span> to its
closest-point projection <span class="math">\(P_Y(\x)\)</span> on the surface <span class="math">\(Y\)</span>:</p>
<p><span class="math">\[
D_{\overrightarrow{C}}(X,Y) = \sqrt{\ ∫\limits_{\x∈X} ‖\x - P_Y(\x) ‖² \;dA }.
\]</span></p>
<p>This distance will only be zero if all points on <span class="math">\(X\)</span> also lie on <span class="math">\(Y\)</span>, but when
it is non-zero it is summing/averaging/diffusing the distance measures of all
of the points.</p>
<p>This distance is suitable to define a matching energy, but is not necessarily
welcoming for optimization: the function inside the square is non-linear. Let’s
dig into it a bit. We’ll define a directed <em>matching energy</em>
<span class="math">\(E_{\overrightarrow{C}}(Z,Y)\)</span> from <span class="math">\(Z\)</span> to <span class="math">\(Y\)</span> to be the squared directed
closest-point distance from <span class="math">\(Z\)</span> to <span class="math">\(Y\)</span>:</p>
<p><span class="math">\[
E_{\overrightarrow{C}}(Z,Y) = ∫\limits_{\z∈Z} ‖\z - P_Y(\z) ‖² \;dA =
∫\limits_{\z∈Z} ‖\f_Y(\z)‖² \;dA
\]</span></p>
<p>where we introduce the proximity function <span class="math">\(\f_Y:\R³→\R³\)</span> defined simply as the
vector from a point <span class="math">\(\z\)</span> to its closest-point projection onto <span class="math">\(Y\)</span>:</p>
<p><span class="math">\[
\f(\z) = \z - P_Y(\z).
\]</span></p>
<p>Suppose <span class="math">\(Y\)</span> was not a surface, but just a single point <span class="math">\(Y = \{\y\}\)</span>. In this
case, <span class="math">\(\f(\z) = \z - \y\)</span> is clearly linear in <span class="math">\(\z\)</span>.</p>
<p>Similarly, suppose <span class="math">\(Y\)</span> was an <a href="https://en.wikipedia.org/wiki/Plane_(geometry)">infinite
plane</a> <span class="math">\(Y = \{\y | (\y-\p)⋅\n =
0\}\)</span> defined by some point <span class="math">\(\p\)</span> on the plane and the plane’s unit normal vector
<span class="math">\(\n\)</span>. Then <span class="math">\(\f(\z) = ((\z-\p)⋅\n)\n)\)</span> is also linear in <span class="math">\(\z\)</span>.</p>
<p>But in general, if <span class="math">\(Y\)</span> is an interesting surface <span class="math">\(\f(\z)\)</span> will be non-linear; it
might not even be a continuous function.</p>
<figure>
<img src="images/closest-point-discontinuous.png" alt="" />
</figure>
<p>In optimization, a common successful strategy to minimize energies composed of
squaring a non-linear function <span class="math">\(\f\)</span> is to
<a href="https://en.wikipedia.org/wiki/Linearization">linearize</a> the function about a
current input value (i.e., a current guess <span class="math">\(\z₀\)</span>), minimize the energy built
from this linearization, then re-linearize around that solution, and then
repeat. </p>
<p>This is the core idea behind <a href="https://en.wikipedia.org/wiki/Gradient_descent">gradient
descent</a> and the
<a href="https://en.wikipedia.org/wiki/Gauss–Newton_algorithm">Gauss-Newton</a> methods:</p>
<pre><code>minimize f(z)²
z₀ ← initial guess
repeat until convergence
f₀ ← linearize f(z) around z₀
z₀ ← minimize f₀(z)²
</code></pre>
<p>Since our <span class="math">\(\f\)</span> is a geometric function, we can derive its linearizations
<em>geometrically</em>.</p>
<h3 id="constantfunctionapproximation">Constant function approximation</h3>
<p>If we make the convenient—however unrealistic—assumption that in the
neighborhood of the closest-point projection <span class="math">\(P_Y(\z₀)\)</span> of the current guess
<span class="math">\(\z₀\)</span> the surface <span class="math">\(Y\)</span> is simply the point <span class="math">\(P_Y(\z₀)\)</span> (perhaps imagine that <span class="math">\(Y\)</span>
is makes a sharp needle-like point at <span class="math">\(P_Y(\z₀)\)</span> or that <span class="math">\(Y\)</span> is very far away
from <span class="math">\(\x\)</span>), then we can approximate <span class="math">\(\f(\z)\)</span> in the proximity of our current
guess <span class="math">\(\z₀\)</span> as the vector between the input point <span class="math">\(\z\)</span> and <span class="math">\(P_Y(\z₀)\)</span>:</p>
<p><span class="math">\[
\f(\z) \approx \f_\text{point}(\z) = \z-P_Y(\z₀)
\]</span></p>
<p>In effect, we are assuming that the surface <span class="math">\(Y\)</span> is a <em>constant</em> function of its
parameterization: <span class="math">\(\y(u,v) = P_Y(\z₀)\)</span>.</p>
<p>Minimizing <span class="math">\(E_{\overrightarrow{C}}\)</span> iteratively using this linearization (or
rather <em>constantization</em>) of <span class="math">\(\f\)</span> is equivalent to <a href="https://en.wikipedia.org/wiki/Gradient_descent">gradient
descent</a>. We have simply
derived our gradients geometrically.</p>
<h3 id="linearfunctionapproximation">Linear function approximation</h3>
<p>If we make a slightly more appropriate assumption that in the neighborhood
of <span class="math">\(P_Y(\z₀)\)</span> the surface <span class="math">\(Y\)</span> is a plane, then we can improve this
approximation while keeping <span class="math">\(\f\)</span> linear in <span class="math">\(\z\)</span>:</p>
<p><span class="math">\[
\f(\z) \approx \f_\text{plane}(\z) = ((\z-P_Y(\z₀))⋅\n) \n,
\]</span></p>
<p>where the plane that <em>best</em> approximates <span class="math">\(Y\)</span> locally near <span class="math">\(P_Y(\z₀)\)</span> is the
<a href="https://en.wikipedia.org/wiki/Tangent_space">tangent plane</a> defined by the
<a href="https://en.wikipedia.org/wiki/Normal_(geometry)">normal vector</a> <span class="math">\(\n\)</span> at
<span class="math">\(P_Y(\z₀)\)</span>.</p>
<p>Minimizing <span class="math">\(E_{\overrightarrow{C}}\)</span> iteratively using this linearization of
<span class="math">\(\f\)</span> is equivalent to the
<a href="https://en.wikipedia.org/wiki/Gauss–Newton_algorithm">Gauss-Newton</a> method. We
have simply derived our linear approximation geometrically.</p>
<p>Equipped with these linearizations, we may now describe an <a href="https://en.wikipedia.org/wiki/Mathematical_optimization#Optimization_algorithms">optimization
algorithm</a>
for minimizing the matching energy between a surface <span class="math">\(Z\)</span> and another surface
<span class="math">\(Y\)</span>.</p>
<h2 id="iterativeclosestpointalgorithm">Iterative closest point algorithm</h2>
<p>So far we have derived distances between a surface <span class="math">\(Z\)</span> and another surface <span class="math">\(Y\)</span>.
In our <em>rigid</em> alignment and registration problem, we would like to
<a href="https://en.wikipedia.org/wiki/Transformation_(function)">transform</a> one
surface <span class="math">\(X\)</span> into a new surface <span class="math">\(T(X) = Z\)</span> so that it best aligns with/matches
the other surface <span class="math">\(Y\)</span>. Further we require that <span class="math">\(T\)</span> is a rigid transformation:
<span class="math">\(T(\x) = \Rot \x + \t\)</span> for some rotation matrix <span class="math">\(\Rot ∈ SO(3) ⊂ \R^{3×3}\)</span>
(i.e., an <a href="https://en.wikipedia.org/wiki/Rotation_group_SO(3)">orthogonal matrix with determinant
1</a>) and translation vector
<span class="math">\(\t∈\R³\)</span>.</p>
<p>Our matching problem can be written as an optimization problem to find the best
possible rotation <span class="math">\(\Rot\)</span> and translation <span class="math">\(\t\)</span> that match surface <span class="math">\(X\)</span> to surface
<span class="math">\(Y\)</span>:</p>
<p><span class="math">\[
\mathop{\text{minimize}}_{\t∈\R³,\ \Rot ∈ SO(3)}
∫\limits_{\x∈X} ‖\Rot \x + \t - P_Y(T(\x)) ‖² \;dA
\]</span></p>
<p>Even if <span class="math">\(X\)</span> is a triangle mesh, it is difficult to <em>integrate</em> over <em>all</em>
points on the surface of <span class="math">\(X\)</span>. Instead, we can approximate this energy by
<em>summing</em> over a point-sampling of <span class="math">\(X\)</span>:</p>
<p><span class="math">\[
\mathop{\text{minimize}}_{\t∈\R³,\ \Rot ∈ SO(3)}
∑_{i=1}^k ‖\Rot \x_i + \t - P_Y(T(\x_i)) ‖²,
\]</span></p>
<p>where <span class="math">\(\X ∈ \R^{k×3}\)</span> is a set of <span class="math">\(k\)</span> points on <span class="math">\(X\)</span> so that each point <span class="math">\(\x_i\)</span>
might lie at a vertex, along an edge, or inside a triangle. We defer discussion
of <em>how</em> to sample a triangle mesh surface.</p>
<h3 id="pseudocode">Pseudocode</h3>
<p>As the name implies, the method proceeds by iteratively finding the closest
point on <span class="math">\(Y\)</span> to the current rigid transformation <span class="math">\(\Rot \x + \t\)</span> of each sample
point <span class="math">\(\x\)</span> in <span class="math">\(\X\)</span> and then minimizing the <em>linearized</em> energy to update the
rotation <span class="math">\(\Rot\)</span> and translation <span class="math">\(\t\)</span>. </p>
<p>If <span class="math">\(V_X\)</span> and <span class="math">\(F_X\)</span> are the vertices and faces of a triangle mesh surface <span class="math">\(X\)</span>
(and correspondingly for <span class="math">\(Y\)</span>), then we can summarize a generic ICP algorithm in
pseudocode:</p>
<pre><code>icp V_X, F_X, V_Y, F_Y
R,t ← initialize (e.g., set to identity transformation)
repeat until convergence
X ← sample source mesh (V_X,F_X)
P0 ← project all X onto target mesh (V_Y,F_Y)
R,t ← update rigid transform to best match X and P0
V_X ← rigidly transform original source mesh by R and t
</code></pre>
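<p>As a concrete illustration, one iteration of this loop might look like the following C++ sketch. The helper <code>update_rigid_transform</code> is hypothetical, standing in for either of the two minimizers derived below:</p>
<pre><code>// Sketch only: a single ICP iteration. update_rigid_transform is a
// hypothetical stand-in for the point-to-point or point-to-plane update.
#include <Eigen/Core>

void icp_single_iteration_sketch(
  Eigen::MatrixXd & VX, const Eigen::MatrixXi & FX,
  const Eigen::MatrixXd & VY, const Eigen::MatrixXi & FY,
  const int num_samples)
{
  Eigen::MatrixXd X;
  random_points_on_mesh(num_samples, VX, FX, X); // sample source mesh
  Eigen::VectorXd D;
  Eigen::MatrixXd P, N;
  point_mesh_distance(X, VY, FY, D, P, N);       // project samples onto Y
  Eigen::Matrix3d R;
  Eigen::RowVector3d t;
  update_rigid_transform(X, P, N, R, t);         // best-fit rigid motion
  VX = (VX * R.transpose()).rowwise() + t;       // rigidly transform source
}
</code></pre>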
<h3 id="updatingtherigidtransformation">Updating the rigid transformation</h3>
<p>We would like to find the rotation matrix <span class="math">\(\Rot ∈ SO(3) ⊂ \R^{3×3}\)</span> and
translation vector <span class="math">\(\t∈\R³\)</span> that <em>best</em> aligns a given set of points <span class="math">\(\X ∈
\R^{k×3}\)</span> on the source mesh and their current closest points <span class="math">\(\P ∈ \R^{k×3}\)</span>
on the target mesh. We have two choices for <em>linearizing</em> our matching energy:
point-to-point (gradient descent) and point-to-plane (Gauss-Newton).</p>
<figure>
<img src="images/max-point-to-point.gif" alt="ICP using the point-to-point matching energy linearization is slow to
converge." />
<figcaption>ICP using the point-to-point matching energy linearization is slow to
converge.</figcaption>
</figure>
<figure>
<img src="images/max-point-to-plane.gif" alt="ICP using the point-to-plane matching energy linearization is
faster." />
<figcaption>ICP using the point-to-plane matching energy linearization is
faster.</figcaption>
</figure>
<p>In either case, this is still a non-linear optimization problem, this time due
to the <a href="https://en.wikipedia.org/wiki/Constrained_optimization">constraints</a>
rather than the energy term.</p>
<p>We require that <span class="math">\(\Rot\)</span> is not just any 3×3 matrix, but a rotation matrix. We
can <em>linearize</em> this constraint by assuming that the rotation in <span class="math">\(\Rot\)</span> will
be very small and thus well approximated by the identity matrix <span class="math">\(\I\)</span> plus a
skew-symmetric matrix:</p>
<p><span class="math">\[
\Rot \approx \I +
\left(\begin{array}{ccc}
0 & -γ & β \\
γ & 0 & -α \\
-β & α & 0 \\
\end{array}\right)
\]</span></p>
<p>where we can now work directly with the three scalar unknowns <span class="math">\(α\)</span>, <span class="math">\(β\)</span> and <span class="math">\(γ\)</span>.</p>
<h3 id="approximatepoint-to-pointminimizer">Approximate point-to-point minimizer</h3>
<p>If we apply our linearization of <span class="math">\(\Rot\)</span> to the <strong>point-to-point</strong> distance
linearization of the matching energy, our minimization becomes:</p>
<p><span class="math">\[
\mathop{\text{minimize}}_{\t∈\R³, α, β, γ}
∑_{i=1}^k \left\|
\left(\begin{array}{ccc}
0 & -γ & β \\
γ & 0 & -α \\
-β & α & 0 \\
\end{array}\right)
\x_i + \x_i + \t - \p_i \right\|^2.
\]</span></p>
<p>This energy is quadratic in the translation vector <span class="math">\(\t\)</span> and the linearized
rotation angles <span class="math">\(α\)</span>, <span class="math">\(β\)</span> and <span class="math">\(γ\)</span>. Let’s gather these degrees of freedom into a
vector of unknowns: <span class="math">\(\u = [α β γ \t^\transpose] ∈ \R⁶\)</span>. Then we can write our
problem in summation form as:</p>
<p><span class="math">\[
\mathop{\text{minimize}}_{\u∈\R⁶}
∑_{i=1}^k \left\|
\left(\begin{array}{cccccc}
0 & x_{i,3} & -x_{i,2} & 1 & 0 & 0 \\
-x_{i,3} & 0 & x_{i,1} & 0 & 1 & 0 \\
x_{i,2} & -x_{i,1} & 0 & 0 & 0 & 1
\end{array}\right) \u +
\x_i - \p_i \right\|^2.
\]</span></p>
<p>This can be written compactly in matrix form as:</p>
<p><span class="math">\[
\mathop{\text{minimize}}_{\u∈\R⁶}
\left\|
\underbrace{
\left(\begin{array}{cccccc}
0 & \X_3 & -\X_2 & \One & 0 & 0 \\
-\X_3 & 0 & \X_1 & 0 & \One & 0 \\
\X_2 & -\X_1 & 0 & 0 & 0 & \One
\end{array}\right)
}_{\A}
\u +
\left[\begin{array}{c}
\X_1-\P_1 \\
\X_2-\P_2 \\
\X_3-\P_3
\end{array}\right]
\right\|_F^2,
\]</span>
where we introduce the matrix <span class="math">\(\A ∈ \R^{3k × 6}\)</span> that gathers the columns
<span class="math">\(\X_i\)</span> of <span class="math">\(\X\)</span> and columns of ones <span class="math">\(\One ∈ \R^k\)</span>.</p>
<p>This quadratic energy is minimized when its partial derivatives with respect to
the entries in <span class="math">\(\u\)</span> are all zero:</p>
<p><span class="math">\[
\begin{align}
\A^\transpose \A \u & = -\A^\transpose
\left[\begin{array}{c}
\X_1-\P_1 \\
\X_2-\P_2 \\
\X_3-\P_3
\end{array}\right]
, \\
\u & = \left(\A^\transpose \A\right)^{-1} \left(-\A^\transpose
\left[\begin{array}{c}
\X_1-\P_1 \\
\X_2-\P_2 \\
\X_3-\P_3
\end{array}\right]
\right).
\end{align}
\]</span></p>
<p>Solving this small 6×6 system gives us our translation vector <span class="math">\(\t\)</span> and the
linearized rotation angles <span class="math">\(α\)</span>, <span class="math">\(β\)</span> and <span class="math">\(γ\)</span>. If we simply assign </p>
<p><span class="math">\[
\Rot ← \M := \I +
\left(\begin{array}{ccc}
0 & -γ & β \\
γ & 0 & -α \\
-β & α & 0 \\
\end{array}\right)
\]</span></p>
<p>then our transformation will <em>not</em> be rigid. Instead, we should project <span class="math">\(\M\)</span>
onto the space of rotation matrices.</p>
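<p>As a minimal Eigen sketch (function name and interface assumed, with <code>X</code> and <code>P</code> as <span class="math">\(k×3\)</span> matrices of samples and closest points), assembling <span class="math">\(\A\)</span> and solving the normal equations might look like:</p>
<pre><code>// Sketch only: solve the linearized point-to-point problem for
// u = [α β γ t]. X and P are k×3 matrices of samples and closest points.
#include <Eigen/Dense>

void point_to_point_update_sketch(
  const Eigen::MatrixXd & X, const Eigen::MatrixXd & P,
  Eigen::Matrix3d & M, Eigen::RowVector3d & t)
{
  const int k = X.rows();
  Eigen::MatrixXd A = Eigen::MatrixXd::Zero(3*k, 6);
  const Eigen::VectorXd ones = Eigen::VectorXd::Ones(k);
  // Fill the 3k×6 matrix A block-by-block, following the derivation above.
  A.block(0,   1, k, 1) =  X.col(2); A.block(0,   2, k, 1) = -X.col(1);
  A.block(k,   0, k, 1) = -X.col(2); A.block(k,   2, k, 1) =  X.col(0);
  A.block(2*k, 0, k, 1) =  X.col(1); A.block(2*k, 1, k, 1) = -X.col(0);
  A.block(0, 3, k, 1) = ones;
  A.block(k, 4, k, 1) = ones;
  A.block(2*k, 5, k, 1) = ones;
  Eigen::VectorXd b(3*k);
  b << X.col(0)-P.col(0), X.col(1)-P.col(1), X.col(2)-P.col(2);
  // Normal equations: AᵀA u = -Aᵀ b.
  Eigen::VectorXd u = (A.transpose()*A).ldlt().solve(-A.transpose()*b);
  M << 1, -u(2), u(1),
       u(2), 1, -u(0),
      -u(1), u(0), 1;               // M = I + skew(α,β,γ)
  t = u.tail<3>().transpose();
  // M is not yet a rotation; project it onto SO(3) as described next.
}
</code></pre>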
<h4 id="recoveringapurerotationfromitslinearization">Recovering a pure rotation from its linearization</h4>
<blockquote>
<p>In an effort to provide an alternative to “Least-Squares Rigid Motion Using
SVD” [Sorkine 2009], this derivation purposefully <em>avoids</em> the <a href="https://en.wikipedia.org/wiki/Trace_(linear_algebra)">trace
operator</a> and its
various nice properties.</p>
</blockquote>
<p>If <span class="math">\(α\)</span>, <span class="math">\(β\)</span> and <span class="math">\(γ\)</span> are all small, then it may be safe to <em>interpret</em> these
values as rotation angles about the <span class="math">\(x\)</span>, <span class="math">\(y\)</span>, and <span class="math">\(z\)</span> axes respectively.</p>
<p>In general, it is better to find the closest rotation matrix to <span class="math">\(\M\)</span>. In other
words, we’d like to solve the small optimization problem:</p>
<p><span class="math">\[
\Rot^* = \argmin_{\Rot ∈ SO(3)} \left\| \Rot - \M \right\|_F^2,
\]</span>
where <span class="math">\(\|\X\|_F^2\)</span> computes the squared <a href="https://en.wikipedia.org/wiki/Matrix_norm#Frobenius_norm">Frobenius
norm</a> of the matrix
<span class="math">\(\X\)</span> (i.e., the sum of all squared element values. In MATLAB syntax:
<code>sum(sum(A.^2))</code>). We can expand the norm by taking advantage of the
bilinearity of the underlying Frobenius inner
product:
<span class="math">\[
\Rot^* = \argmin_{\Rot ∈ SO(3)} \left\| \M \right\|_F^2 + \left\| \Rot
\right\|_F^2 - 2 \left<\Rot, \M \right>_F,
\]</span>
where <span class="math">\(\left<\A, \B \right>_F\)</span> is the
<a href="https://en.wikipedia.org/wiki/Frobenius_inner_product">Frobenius inner
product</a> of <span class="math">\(\A\)</span> and
<span class="math">\(\B\)</span> (i.e., the sum of all per-element products. In MATLAB syntax:
<code>sum(sum(A.*B))</code>). We can drop the Frobenius norm
of <span class="math">\(\M\)</span> term (<span class="math">\(\left\| \M
\right\|_F^2\)</span>) because it is constant with respect to the unknown rotation
matrix <span class="math">\(\Rot\)</span>. We can also drop the Frobenius norm of <span class="math">\(\Rot\)</span> term because it
is constant (<span class="math">\(\left\|
\Rot\right\|_F^2 = 3\)</span>, since each of the three rows of <span class="math">\(\Rot\)</span> has unit norm) given that <span class="math">\(\Rot\)</span> is required to be an orthonormal matrix
(<span class="math">\(\Rot ∈ SO(3)\)</span>). We can drop the factor of <span class="math">\(2\)</span> and flip the minus sign to
change our <em>minimization</em> problem into a <em>maximization</em> problem:
<span class="math">\[
\Rot^* = \argmax_{\Rot ∈ SO(3)} \left<\Rot, \M \right>_F
\]</span></p>
<p>We now take advantage of the <a href="https://en.wikipedia.org/wiki/Singular_value_decomposition">singular value
decomposition</a> of
<span class="math">\(\M = \U Σ \V^\transpose\)</span>, where <span class="math">\(\U, \V ∈ \R^{3×3}\)</span> are orthonormal matrices
and <span class="math">\(Σ∈\R^{3×3}\)</span> is a non-negative diagonal matrix:</p>
<p><span class="math">\[
\Rot^* = \argmax_{\Rot ∈ SO(3)} \left<\Rot,\U Σ \V^\transpose \right>_F.
\]</span></p>
<p>The Frobenius inner product will let us move the products by <span class="math">\(\V\)</span> and <span class="math">\(\U\)</span> from
the right argument to the left argument:</p>
<blockquote>
<p>Recall some linear algebra properties:</p>
<ol>
<li>Matrix multiplication (on the left) can be understood as <em>acting</em> on each
column: <span class="math">\(\A \B = \A [\B_1 \ \B_2 \ … \ \B_n] = [\A \B_1 \ \A \B_2 \ … \
\A \B_n]\)</span>,</li>
<li>The <a href="https://en.wikipedia.org/wiki/Kronecker_product">Kronecker product</a>
<span class="math">\(\I ꕕ \A\)</span> of the identity matrix <span class="math">\(\I\)</span> of size <span class="math">\(k\)</span> and a matrix <span class="math">\(\A\)</span> simply
repeats <span class="math">\(\A\)</span> along the diagonal k times. In MATLAB, <code>repdiag(A,k)</code>,</li>
<li>Properties 1. and 2. imply that the vectorization of a matrix product
<span class="math">\(\B\C\)</span> can be written as the Kronecker product of the #-columns-in-<span class="math">\(\C\)</span>
identity matrix and <span class="math">\(\B\)</span> times the vectorization of <span class="math">\(\C\)</span>:
<span class="math">\(\text{vec}(\B\C) = (\I ꕕ \B)\text{vec}(\C)\)</span>,</li>
<li>The transpose of a Kronecker product is the Kronecker product of
transposes: <span class="math">\((\A ⊗ \B)^\transpose = \A^\transpose ⊗ \B^\transpose\)</span>,</li>
<li>The Frobenius inner product can be written as a <a href="https://en.wikipedia.org/wiki/Dot_product">dot
product</a> of
<a href="https://en.wikipedia.org/wiki/Vectorization_(mathematics)">vectorized</a>
matrices: <span class="math">\(<\A,\B>_F = \text{vec}(\A) ⋅ \text{vec}(\B) =
\text{vec}(\A)^\transpose \text{vec}(\B)\)</span>,</li>
<li>Properties 3., 4., and 5. imply that Frobenius inner product of a matrix
<span class="math">\(\A\)</span> and the matrix product of matrix <span class="math">\(\B\)</span> and <span class="math">\(\C\)</span> is equal to the
Frobenius inner product of the matrix product of the transpose of <span class="math">\(\B\)</span> and
<span class="math">\(\A\)</span> and the matrix <span class="math">\(\C\)</span>:
<span class="math">\(<\A,\B\C>_F = \text{vec}(\A)^\transpose \text{vec}(\B\C) =
\text{vec}(\A)^\transpose (\I ⊗ \B)\text{vec}(\C) =
\text{vec}(\A)^\transpose (\I ⊗ \B^\transpose)^\transpose \text{vec}(\C) =
\text{vec}(\B^\transpose\A)^\transpose \text{vec}(\C) =
<\B^\transpose \A,\C>_F\)</span>.</li>
</ol>
</blockquote>
<p><span class="math">\[
\Rot^* = \argmax_{\Rot ∈ SO(3)} \left<\U^\transpose \Rot \V, Σ \right>_F.
\]</span></p>
<p>Now, <span class="math">\(\U\)</span> and <span class="math">\(\V\)</span> are both
<a href="https://en.wikipedia.org/wiki/Orthogonal_matrix">orthonormal</a>, so multiplying
them against a rotation matrix <span class="math">\(\Rot\)</span> does not change its orthonormality. We can
pull them out of the maximization if we account for the reflection they <em>might</em>
incur: introduce <span class="math">\(Ω = \U^\transpose\Rot\V ∈ O(3)\)</span> with <span class="math">\(\det{Ω} = \det{\U\V^\transpose}\)</span>.
This implies that the optimal rotation for the original problem is recovered
via <span class="math">\(\Rot^* = \U Ω^* \V^\transpose\)</span>. When we move the <span class="math">\(\argmax\)</span> inside, we now
look for an orthonormal matrix <span class="math">\(Ω ∈ O(3)\)</span> that is a reflection (if
<span class="math">\(\det{\U\V^\transpose} = -1\)</span>) or a rotation (if <span class="math">\(\det{\U\V^\transpose} = 1\)</span>):</p>
<p><span class="math">\[
\Rot^* = \U \left( \argmax_{Ω ∈ O(3),\ \det{Ω} = \det{\U\V^\transpose}} \left<Ω, Σ \right>_F \right) \V^\transpose.
\]</span></p>
<p>This ensures that as a result <span class="math">\(\Rot^*\)</span> will be a rotation: <span class="math">\(\det{\Rot^*} = 1\)</span>.</p>
<blockquote>
<p>Recall that <span class="math">\(Σ∈\R^{3×3}\)</span> is a non-negative diagonal matrix of singular values
sorted so that the smallest value is in the bottom right corner.</p>
</blockquote>
<p>Because <span class="math">\(Ω\)</span> is orthonormal, each column (or row) of <span class="math">\(Ω\)</span> must have unit norm.
Placing a non-zero on the off-diagonal will get “killed” when multiplied by the
corresponding zero in <span class="math">\(Σ\)</span>. So the optimal choice of <span class="math">\(Ω\)</span> is to set all values to
zero except on the diagonal. If <span class="math">\(\det{\U\V^\transpose} = -1\)</span>, then we should set
one (and only one) of these values to <span class="math">\(-1\)</span>. The best choice is the bottom right
corner since that will multiply against the smallest singular value in <span class="math">\(Σ\)</span> (and
negatively affect the maximization the least):</p>
<p><span class="math">\[
Ω^*_{ij} = \begin{cases}
1 & \text{ if $i=j\lt3$} \\\\
\det{\U\V^\transpose} & \text{ if $i=j=3$} \\\\
0 & \text{ otherwise.}
\end{cases}
\]</span></p>
<p>Finally, we have a formula for our optimal rotation:</p>
<p><span class="math">\[
\Rot = \U Ω^* \V^\transpose.
\]</span></p>
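<p>In code, this projection only takes a few lines with an off-the-shelf SVD. A minimal Eigen sketch (not using the blacklisted <code>igl::polar_svd</code> helpers; function name assumed):</p>
<pre><code>// Sketch only: project M onto SO(3). Eigen sorts singular values in
// decreasing order, so the smallest one sits in the bottom-right corner.
#include <Eigen/Dense>

Eigen::Matrix3d closest_rotation_sketch(const Eigen::Matrix3d & M)
{
  Eigen::JacobiSVD<Eigen::Matrix3d> svd(
    M, Eigen::ComputeFullU | Eigen::ComputeFullV);
  const Eigen::Matrix3d U = svd.matrixU();
  const Eigen::Matrix3d V = svd.matrixV();
  Eigen::Matrix3d Omega = Eigen::Matrix3d::Identity();
  Omega(2,2) = (U * V.transpose()).determinant(); // ±1
  return U * Omega * V.transpose();               // det = +1 by construction
}
</code></pre>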
<blockquote>
<h3 id="closed-formpoint-to-pointminimizer">Closed-form point-to-point minimizer</h3>
<p><em>Interestingly</em>, despite the non-linear constraint on <span class="math">\(\Rot\)</span> there is actually
a closed-form solution to the point-to-point matching problem:</p>
<p><span class="math">\[
\mathop{\text{minimize}}_{\t∈\R³,\ \Rot ∈ SO(3)} ∑_{i=1}^k ‖\Rot \x_i + \t - \p_i‖²,
\]</span></p>
<p>This is a variant of what’s known as a <a href="https://en.wikipedia.org/wiki/Orthogonal_Procrustes_problem">Procrustes
problem</a>, named
after a <a href="https://en.wikipedia.org/wiki/Procrustes">mythical psychopath</a> who
would kidnap people and force them to fit in his bed by stretching them or
cutting off their legs. In our case, we are forcing <span class="math">\(\Rot\)</span> to be perfectly
orthogonal (no “longer”, no “shorter”).</p>
<h4 id="substitutingoutthetranslationterms">Substituting out the translation terms</h4>
<p>This energy is <em>quadratic</em> in <span class="math">\(\t\)</span> and there are no other constraints on
<span class="math">\(\t\)</span>. We can immediately solve for the optimal <span class="math">\(\t^*\)</span>—leaving <span class="math">\(\Rot\)</span> as an unknown—by
setting all derivatives with respect to unknowns in <span class="math">\(\t\)</span> to zero:</p>
<p><span class="math">\[
\begin{align}
\t^*
&= \argmin_{\t} ∑_{i=1}^k ‖\Rot \x_i + \t - \p_i‖² \\\\
&= \argmin_\t \left\|\Rot \X^\transpose + \t \One^\transpose - \P^\transpose\right\|^2_F,
\end{align}
\]</span>
where <span class="math">\(\One ∈ \R^{k}\)</span> is a vector of ones. Setting the partial derivative with
respect to <span class="math">\(\t\)</span> of this
quadratic energy to zero finds the minimum:
<span class="math">\[
\begin{align}
0
&= \frac{∂}{∂\t} \left\|\Rot \X^\transpose + \t \One^\transpose - \P^\transpose\right\|^2_F \\\\
&= \One^\transpose \One \t + \Rot \X^\transpose \One - \P^\transpose \One,
\end{align}
\]</span></p>
<p>Rearranging terms above reveals that the optimal <span class="math">\(\t\)</span> is the vector aligning
the <a href="https://en.wikipedia.org/wiki/Centroid">centroids</a> of the points in <span class="math">\(\P\)</span>
and the points in <span class="math">\(\X\)</span> rotated by the—yet-unknown—<span class="math">\(\Rot\)</span>. Introducing
variables for the respective centroids <span class="math">\(\hat{\p} = \tfrac{1}{k} ∑_{i=1}^k
\p_i\)</span> and <span class="math">\(\hat{\x} = \tfrac{1}{k} ∑_{i=1}^k \x_i\)</span>, we can write the
formula for the optimal <span class="math">\(\t\)</span>:</p>
<p><span class="math">\[
\begin{align}
\t
&= \frac{\P^\transpose \One - \Rot \X^\transpose \One}{ \One^\transpose \One} \\\\
&= \hat{\p} - \Rot \hat{\x}.
\end{align}
\]</span></p>
<p>Now we have a formula for the optimal translation vector <span class="math">\(\t\)</span> in terms of the
unknown rotation <span class="math">\(\Rot\)</span>. Let us
<a href="https://en.wikipedia.org/wiki/Substitution_(algebra)">substitute</a> this formula
for all occurrences of <span class="math">\(\t\)</span> in our energy written in its original summation
form:</p>
<p><span class="math">\[
\mathop{\text{minimize}}_{\Rot ∈ SO(3)} ∑\limits_{i=1}^k \left\| \Rot \x_i + ( \hat{\p} - \Rot\hat{\x}) - \p_i \right\|^2 \\\\
\mathop{\text{minimize}}_{\Rot ∈ SO(3)} ∑\limits_{i=1}^k \left\| \Rot (\x_i - \hat{\x}) - (\p_i - \hat{\p}) \right\|^2 \\\\
\mathop{\text{minimize}}_{\Rot ∈ SO(3)} ∑\limits_{i=1}^k \left\| \Rot \overline{\x}_i - \overline{\p}_i \right\|^2 \\\\
\mathop{\text{minimize}}_{\Rot ∈ SO(3)} \left\| \Rot \overline{\X}^\transpose - \overline{\P}^\transpose \right\|_F^2,
\]</span></p>
<p>where we introduce <span class="math">\(\overline{\X} ∈ \R^{k × 3}\)</span> where the ith row contains the
<em>relative position</em> of the ith point to the centroid <span class="math">\(\hat{\x}\)</span>: i.e.,
<span class="math">\(\overline{\x}_i = (\x_i - \hat{\x})\)</span> (and analagously for <span class="math">\(\overline{\P}\)</span>).</p>
<p>Now we have the canonical form of the <a href="https://en.wikipedia.org/wiki/Orthogonal_Procrustes_problem">orthogonal Procrustes
problem</a>. To
find the optimal rotation matrix <span class="math">\(\Rot^*\)</span> we will massage the terms in the
<em>minimization</em> until we have a <em>maximization</em> problem involving the <a href="https://en.wikipedia.org/wiki/Frobenius_inner_product">Frobenius
inner-product</a> of the
unknown rotation <span class="math">\(\Rot\)</span> and <a href="https://en.wikipedia.org/wiki/Covariance_matrix">covariance
matrix</a> of <span class="math">\(\X\)</span> and <span class="math">\(\P\)</span>:</p>
<p><span class="math">\[
\begin{align}
\Rot^*
&= \argmin_{\Rot ∈ SO(3)} \left\| \Rot \overline{\X}^\transpose - \overline{\P}^\transpose \right\|_F^2 \\\\
&= \argmin_{\Rot ∈ SO(3)} \left<\Rot \overline{\X}^\transpose - \overline{\P}^\transpose , \Rot \overline{\X}^\transpose - \overline{\P}^\transpose \right>_F\\\\
&= \argmin_{\Rot ∈ SO(3)} \left\| \overline{\X} \right\|_F^2 + \left\| \overline{\P} \right\|_F^2 - 2 \left<\Rot \overline{\X}^\transpose , \overline{\P}^\transpose \right>_F\\\\
&= \argmax_{\Rot ∈ SO(3)} \left<\Rot,\overline{\P}^\transpose\,\overline{\X}\right>_F\\\\
&= \argmax_{\Rot ∈ SO(3)} \left<\Rot,\M\right>_F\\
\end{align}
\]</span></p>
<p>Letting <span class="math">\(\M = \overline{\P}^\transpose\,\overline{\X}\)</span> we can now follow the
steps above using <a href="https://en.wikipedia.org/wiki/Singular_value_decomposition">singular value
decomposition</a> to
find the optimal <span class="math">\(\Rot\)</span>.</p>
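<p>A minimal Eigen sketch of this closed-form recipe (names assumed; <code>closest_rotation_sketch</code> refers to the SVD projection sketched above):</p>
<pre><code>// Sketch only: closed-form point-to-point alignment (Procrustes).
#include <Eigen/Dense>

void closed_form_point_to_point_sketch(
  const Eigen::MatrixXd & X, const Eigen::MatrixXd & P,
  Eigen::Matrix3d & R, Eigen::RowVector3d & t)
{
  const Eigen::RowVector3d xhat = X.colwise().mean(); // centroid of X
  const Eigen::RowVector3d phat = P.colwise().mean(); // centroid of P
  const Eigen::MatrixXd Xbar = X.rowwise() - xhat;
  const Eigen::MatrixXd Pbar = P.rowwise() - phat;
  const Eigen::Matrix3d M = Pbar.transpose() * Xbar;  // covariance matrix
  R = closest_rotation_sketch(M);   // argmax of the Frobenius inner product
  t = phat - (R * xhat.transpose()).transpose();      // t = phat - R*xhat
}
</code></pre>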
</blockquote>
<h3 id="approximatepoint-to-planeminimizer">Approximate point-to-plane minimizer</h3>
<p>If we apply our linearization of <span class="math">\(\Rot\)</span> to the <strong>point-to-plane</strong> distance
linearization of the matching energy, our minimization is:</p>
<p><span class="math">\[
\mathop{\text{minimize}}_{\t∈\R³, α, β, γ}
∑_{i=1}^k
\left(
\left(
\left(\begin{array}{ccc}
0 & -γ & β \\
γ & 0 & -α \\
-β & α & 0 \\
\end{array}\right)\x_i +
\x_i + \t - \p_i
\right)⋅\n_i
\right)^2.
\]</span></p>
<p>We can follow similar steps as above. Let’s gather a vector of unknowns: <span class="math">\(\u =
[α β γ \t^\transpose] ∈ \R⁶\)</span>. Then we can write our problem in summation form
as:</p>
<p><span class="math">\[
\mathop{\text{minimize}}_{\u∈\R⁶}
∑_{i=1}^k \left(\n_i^\transpose
\left(\begin{array}{cccccc}
0 & x_{i,3} & -x_{i,2} & 1 & 0 & 0 \\
-x_{i,3} & 0 & x_{i,1} & 0 & 1 & 0 \\
x_{i,2} & -x_{i,1} & 0 & 0 & 0 & 1
\end{array}\right) \u +
\n_i^\transpose(\x_i - \p_i) \right)^2.
\]</span></p>
<p>This can be written compactly in matrix form as:</p>
<p><span class="math">\[
\mathop{\text{minimize}}_{\u∈\R⁶}
\left\|
\left[\begin{array}{ccc} \text{diag}(\N_1) & \text{diag}(\N_2) & \text{diag}(\N_3)\end{array}\right]
\left(
\A
\u +
\left[\begin{array}{c}
\X_1-\P_1 \\
\X_2-\P_2 \\
\X_3-\P_3
\end{array}\right]\right)
\right\|^2,
\]</span></p>
<p>where <span class="math">\(\N_i\)</span> is the ith column from the matrix of normals <span class="math">\(\N ∈ \R^{k × 3}\)</span>,
<span class="math">\(\text{diag}(\v)\)</span> <a href="https://en.wikipedia.org/wiki/Diagonal_matrix#Matrix_operations">creates a diagonal
matrix</a> from a
vector, and <span class="math">\(\A ∈ \R^{3k × 6}\)</span> is the same as above.</p>
<p>This energy is quadratic in <span class="math">\(\u\)</span> and can be solved by setting all partial
derivatives with respect to <span class="math">\(\u\)</span> to zero.</p>
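<p>A minimal Eigen sketch of this solve (names assumed; <code>N</code> stores the normal <span class="math">\(\n_i\)</span> at each closest point as its ith row):</p>
<pre><code>// Sketch only: linearized point-to-plane update. Each sample contributes
// one scalar equation: row i of the k×6 system is nᵢᵀ times the 3×6 block.
#include <Eigen/Dense>

void point_to_plane_update_sketch(
  const Eigen::MatrixXd & X, const Eigen::MatrixXd & P,
  const Eigen::MatrixXd & N,
  Eigen::Matrix3d & M, Eigen::RowVector3d & t)
{
  const int k = X.rows();
  Eigen::MatrixXd A(k, 6);
  Eigen::VectorXd b(k);
  for(int i = 0; i < k; i++)
  {
    const Eigen::RowVector3d x = X.row(i);
    const Eigen::RowVector3d n = N.row(i);
    Eigen::Matrix<double,3,6> Ai;
    Ai <<     0,  x(2), -x(1), 1, 0, 0,
          -x(2),     0,  x(0), 0, 1, 0,
           x(1), -x(0),     0, 0, 0, 1;
    A.row(i) = n * Ai;
    const Eigen::RowVector3d d = X.row(i) - P.row(i);
    b(i) = n.dot(d);
  }
  // Same normal equations as the point-to-point case: AᵀA u = -Aᵀ b.
  Eigen::VectorXd u = (A.transpose()*A).ldlt().solve(-A.transpose()*b);
  M << 1, -u(2), u(1),
       u(2), 1, -u(0),
      -u(1), u(0), 1;              // I + skew; project onto SO(3) as above
  t = u.tail<3>().transpose();
}
</code></pre>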
<blockquote>
<h3 id="closed-formpoint-to-pointminimizer">Closed-form point-to-point minimizer</h3>
<p>To the best of my knowledge, no known closed-form solution exists. I am not
sure whether it <strong><em>cannot</em></strong> exist or whether no one has figured it out
(or they did and I just do not know about it).</p>
</blockquote>
<h2 id="uniformrandomsamplingofatrianglemesh">Uniform random sampling of a triangle mesh</h2>
<p>Our last missing piece is to sample the surface of a triangle mesh <span class="math">\(X\)</span> with <span class="math">\(m\)</span>
faces uniformly randomly. This allows us to approximate <em>continuous</em> integrals
over the surface <span class="math">\(X\)</span> with a summation of the integrand evaluated at a finite
number of randomly selected points. This type of <a href="https://en.wikipedia.org/wiki/Numerical_integration">numerical
integration</a> is called the
<a href="https://en.wikipedia.org/wiki/Monte_Carlo_method">Monte Carlo method</a>.</p>
<p>We would like our <a href="https://en.wikipedia.org/wiki/Random_variable">random
variable</a> <span class="math">\(\x ∈ X\)</span> to have a
uniform <a href="https://en.wikipedia.org/wiki/Probability_density_function">probability density
function</a> <span class="math">\(f(\x) =
1/A_X\)</span>, where <span class="math">\(A_X\)</span> is the <a href="https://en.wikipedia.org/wiki/Surface_area">surface
area</a> of the triangle mesh <span class="math">\(X\)</span>. We
can achieve this by breaking the problem into two steps: uniformly sampling in
a single triangle and sampling triangles non-uniformly according to their
area.</p>
<p>Suppose we have a way to evaluate a continuous random point <span class="math">\(\x\)</span> in a triangle
<span class="math">\(T\)</span> with uniform probability density function <span class="math">\(g_T(\x) = 1/A_T\)</span> <em>and</em> we have a
way to evaluate a discrete random triangle index <span class="math">\(T ∈ \{1,2,…,m\}\)</span> with <a href="https://en.wikipedia.org/wiki/Probability_distribution#Discrete_probability_distribution">discrete
probability
distribution</a>
<span class="math">\(h(T) = A_T/A_X\)</span>, then the joint probability of evaluating a certain triangle
index <span class="math">\(T\)</span> and then a uniformly random point <span class="math">\(\x\)</span> in that triangle is indeed
uniform over the surface:</p>
<p><span class="math">\[
h(T) g_T(\x) = \frac{A_T}{A_X} \frac{1}{A_T} = \frac{1}{A_X} = f(\x).
\]</span></p>
<h3 id="uniformrandomsamplingofasingletriangle">Uniform random sampling of a single triangle</h3>
<p>In order to pick a point uniformly randomly in a triangle with corners <span class="math">\(\v_1,
\v_2, \v_3 ∈ \R^3\)</span> we will <em>first</em> pick a point uniformly randomly in the
<a href="https://en.wikipedia.org/wiki/Parallelogram">parallelogram</a> formed by
reflecting <span class="math">\(\v_1\)</span> across the line <span class="math">\(\overline{\v_2\v_3}\)</span>:</p>
<p><span class="math">\[
\x = \v_1 + α (\v_2-\v_1) + β (\v_3 - \v_1)
\]</span></p>
<p>where <span class="math">\(α,β\)</span> are uniformly sampled from the unit interval <span class="math">\([0,1]\)</span>. If <span class="math">\(α+β > 1\)</span>
then the point <span class="math">\(\x\)</span> above will lie in the reflected triangle rather than the
original one. In this case, preprocess <span class="math">\(α\)</span> and <span class="math">\(β\)</span> by setting <span class="math">\(α←1-α\)</span> and
<span class="math">\(β←1-β\)</span> to reflect the point <span class="math">\(\x\)</span> back into the original triangle.</p>
<h3 id="area-weightedrandomsamplingoftriangles">Area-weighted random sampling of triangles</h3>
<p>Assuming we know how to draw a <em>continuous</em> uniform random variable <span class="math">\(γ\)</span> from
the unit interval <span class="math">\([0,1]\)</span>, we would now like to draw a <em>discrete</em> random
triangle index <span class="math">\(T\)</span> from the sequence <span class="math">\(\{1,…,m\}\)</span> with likelihood proportional to
the relative area of each triangle in the mesh.</p>
<p>We can achieve this by first computing the <a href="https://en.wikipedia.org/wiki/Running_total">cumulative
sum</a> <span class="math">\(\C ∈ \R^{m}\)</span> of the relative
areas:</p>
<p><span class="math">\[
C_i = ∑_{j=1}^i \frac{A_j}{A_X}.
\]</span></p>
<p>Then our random index is found by identifying the first entry in <span class="math">\(\C\)</span> whose
value is greater than a uniform random variable <span class="math">\(γ\)</span>. Since <span class="math">\(\C\)</span> is sorted,
locating this entry can be done in <span class="math">\(O(\log m)\)</span>
<a href="https://en.wikipedia.org/wiki/Big_O_notation">time</a>.</p>
<h3 id="whyismycodesoslow">Why is my code so slow?</h3>
<p>Try profiling your code. Where is most of the computation time spent?</p>
<p>If you have done things right, the majority of time is spent computing
point-to-mesh distances. For each query point, the <a href="https://en.wikipedia.org/wiki/Computational_complexity_theory">computational
complexity</a> of
computing its distance to a mesh with <span class="math">\(m\)</span> faces is <span class="math">\(O(m)\)</span>.</p>
<p>This can be <em>dramatically</em> improved (e.g., to <span class="math">\(O(\log m)\)</span> on average) using a
<a href="https://en.wikipedia.org/wiki/Space_partitioning">space partitioning</a> data
structure such as a <a href="https://en.wikipedia.org/wiki/K-d_tree">kd tree</a>, a
<a href="https://en.wikipedia.org/wiki/Bounding_volume_hierarchy">bounding volume
hierarchy</a>, or a
<a href="https://en.wikipedia.org/wiki/Bin_(computational_geometry)">spatial hash</a>.</p>
<h2 id="tasks">Tasks</h2>
<h3 id="readbouaziz2015">Read [Bouaziz 2015]</h3>
<p>This reading task is not directly graded, but it’s expected that you read and
understand sections 3.2–3.3 of Sofien Bouaziz’s PhD thesis “Realtime Face
Tracking and Animation” 2015. <em>Understanding</em> this may require digging into
wikipedia, other online resources or other papers.</p>
<h3 id="blacklist">Blacklist</h3>
<p>You may not use the following libigl functions:</p>
<ul>
<li><code>igl::AABB</code></li>
<li><code>igl::fit_rotations</code></li>
<li><code>igl::hausdorff</code></li>
<li><code>igl::point_mesh_squared_distance</code></li>
<li><code>igl::point_simplex_squared_distance</code></li>
<li><code>igl::polar_dec</code></li>
<li><code>igl::polar_svd3x3</code></li>
<li><code>igl::polar_svd</code></li>
<li><code>igl::random_points_on_mesh</code></li>
</ul>
<h3 id="whitelist">Whitelist</h3>
<p>You are encouraged to use the following libigl functions:</p>
<ul>
<li><code>igl::cumsum</code> computes cumulative sum</li>
<li><code>igl::doublearea</code> computes triangle areas</li>
<li><code>igl::per_face_normals</code> computes normal vectors for each triangle face</li>
</ul>
<h3 id="srcrandom_points_on_mesh.cpp"><code>src/random_points_on_mesh.cpp</code></h3>
<p>Generate <code>n</code> random points uniformly sampled <em>on</em> a given triangle mesh with
vertex positions <code>VX</code> and face indices <code>FX</code>. </p>
<h3 id="srcpoint_triangle_distance.cpp"><code>src/point_triangle_distance.cpp</code></h3>
<p>Compute the distance <code>d</code> between a given point <code>x</code> and the closest point <code>p</code> on
a given triangle with corners <code>a</code>, <code>b</code>, and <code>c</code>.</p>
<h3 id="srcpoint_mesh_distance.cpp"><code>src/point_mesh_distance.cpp</code></h3>
<p>Compute the distances <code>D</code> between a set of given points <code>X</code> and their closest
points <code>P</code> on a given mesh with vertex positions <code>VY</code> and face indices <code>FY</code>.
For each point in <code>P</code> also output a corresponding normal in <code>N</code>.</p>
<blockquote>
<p>It is OK to assume that all points in <code>P</code> lie inside triangles (rather than
exactly at vertices or exactly along edges) for the purposes of normal
computation in <code>N</code>.</p>
</blockquote>
<h3 id="srchausdorff_lower_bound.cpp"><code>src/hausdorff_lower_bound.cpp</code></h3>