\documentclass{article}
\usepackage[margin=40px]{geometry}
\usepackage{color}
\usepackage{float}
\usepackage{pagecolor,lipsum}
\usepackage[export]{adjustbox}
\usepackage{array}
\definecolor{pwnable-purple}{RGB}{97,40,86}
\definecolor{sky}{RGB}{146,165,241}
\definecolor{ubuntuback}{RGB}{45,9,34}
\definecolor{string}{RGB}{230,219,116}
\definecolor{comment}{RGB}{117, 113, 94}
\definecolor{normal}{RGB}{248, 248, 242}
\definecolor{identifier}{RGB}{166, 226, 46}
\definecolor{periwinkle}{RGB}{121, 119, 184}
\renewcommand{\familydefault}{\sfdefault}
\pagecolor{pwnable-purple}
\color{white}
\usepackage{graphicx}
\usepackage[hidelinks]{hyperref}
\usepackage{etoolbox}
\usepackage{atbegshi,ifthen}
\usepackage{listings}
\usepackage{wrapfig}
\usepackage{tikz}
\usetikzlibrary{backgrounds}
\hypersetup{
colorlinks,
linkcolor={sky},
citecolor={white},
urlcolor={sky}
}
\lstset{
aboveskip=3mm,
belowskip=3mm,
breaklines=true,
breakatwhitespace=true,
showstringspaces=false,
columns=fullflexible,
numbers=none,
numberstyle=\color{gray}\ttfamily,
basicstyle=\color{normal}\ttfamily,
keywordstyle=\color{magenta}\ttfamily,
commentstyle=\color{comment}\ttfamily,
stringstyle=\color{string}\ttfamily,
emph={format_string, eff_ana_bf, permute, eff_ana_btr},
emphstyle=\color{identifier}\ttfamily,
tabsize=4,
backgroundcolor=\color{ubuntuback}\ttfamily,
linewidth=\textwidth,
frame=tb,
}
\makeatletter
\tikzset{%
fancy quotes/.style={
text width=\fq@width pt,
align=justify,
inner sep=1em,
anchor=north west,
minimum width=\linewidth,
},
fancy quotes width/.initial={.8\linewidth},
fancy quotes marks/.style={
scale=8,
text=white,
inner sep=0pt,
},
fancy quotes opening/.style={
fancy quotes marks,
},
fancy quotes closing/.style={
fancy quotes marks,
},
fancy quotes background/.style={
show background rectangle,
inner frame xsep=0pt,
background rectangle/.style={
fill=periwinkle,
rounded corners,
},
}
}
\newenvironment{fancyquotes}[1][]{%
\noindent
\tikzpicture[fancy quotes background]
\node[fancy quotes opening,anchor=north west] (fq@ul) at (0,0) {``};
\tikz@scan@one@point\pgfutil@firstofone(fq@ul.east)
\pgfmathsetmacro{\fq@width}{\linewidth - 2*\pgf@x}
\node[fancy quotes,#1] (fq@txt) at (fq@ul.north west) \bgroup}
{\egroup;
\node[overlay,fancy quotes closing,anchor=east] at (fq@txt.south east) {''};
\endtikzpicture}
\newcommand{\quotestart}[0] {
\begin{fancyquotes}
}
\newcommand{\quoteend}[0] {
\end{fancyquotes}
}
\newcommand{\displayimage}[1] {
\begin{figure}[H]
\centering
\includegraphics[max size={\textwidth}{0.3\textheight}]{#1}
\end{figure}
}
\newcommand{\displayimagecap}[2] {
\begin{figure}
\centering
\includegraphics[max size={\textwidth}{0.5\textheight}]{#1}
\caption{#2}
\end{figure}
}
\newcommand{\displayimagecaphere}[2] {
\begin{figure}[H]
\centering
\includegraphics[max size={\textwidth}{0.5\textheight}]{#1}
\caption{#2}
\end{figure}
}
\newcommand{\wrapimagerightcap}[2] {
\begin{wrapfigure}{r}{0.3\textwidth}\begin{center}\includegraphics[max size={0.3\textwidth}{\textheight}]{#1}\end{center}\caption{#2}\end{wrapfigure}}
\newcommand{\wrapimageleftcap}[2] {\begin{wrapfigure}{l}{0.3\textwidth}\begin{center}\includegraphics[max size={0.3\textwidth}{\textheight}]{#1}\end{center}\caption{#2}\end{wrapfigure}
}
\newcommand{\wrapimageright}[1] {
\begin{wrapfigure}{r}{0.3\textwidth}
\begin{center}
\includegraphics[max size={0.3\textwidth}{\textheight}]{#1}
\end{center}
\end{wrapfigure}
}
\newcommand{\wrapimageleft}[1] {
\begin{wrapfigure}{l}{0.3\textwidth}
\begin{center}
\includegraphics[max size={0.3\textwidth}{\textheight}]{#1}
\end{center}
\end{wrapfigure}
}
\newcommand{\xcode}[2]{\colorbox{ubuntuback}{\lstinline[language=#1]|#2|}}
\newcommand{\asm}[1]{\xcode{{[x86masm]assembler}}{#1}}
\newcommand{\code}[1]{\colorbox{ubuntuback}{\texttt{#1}}}
\newcommand{\arm}[1]{\code{#1}}
\newcommand{\gdb}[1]{\xcode{C}{#1}}
\newcommand{\exerciseopen}[2]{
\begin{tabular}{c p{0.9\textwidth}}
\includegraphics[max size={0.1\textwidth}{\textheight}]{#1} & \quotestart #2 \quoteend
\end{tabular}
}
\title{%
A First Introduction to System Exploitation \\
\large With Georgia Tech's "pwnable" challenges \\
\large Ben Herzog ([email protected])
}
\begin{document}
\date{}
\maketitle
\begin{figure}[H]
\centering
\includegraphics[max size={\textwidth}{\textheight}]{./images/cpr.png}
\end{figure}
\begin{figure}[H]
\centering
\includegraphics[max size={\textwidth}{\textheight}]{./images/pwnable_splash.png}
\end{figure}
\begin{figure}[H]
\centering
\includegraphics[max size={\textwidth}{\textheight}]{./images/all_challenges.png}
\end{figure}
\newpage
\tableofcontents
\newpage
\parskip 1.5ex
\renewcommand{\baselinestretch}{1.33}
\section{What is this?}
\wrapimagerightcap{./images/pizza_hack.jpg}{You're not "supposed to" do that.}
It's an introduction to that part of information security that your parents warned you about.
The field doesn't have a proper name, exactly, but we know it when we see it. Systems are understood in terms of naked primitives; convenient abstractions are stripped away, or are unavailable to begin with. The narrative about \href{https://en.wikipedia.org/wiki/functional_fixedness}{how the system is "supposed to" behave} is ignored with prejudice. These systems are then understood in more detail than before, and may even be made to behave in ways that they shouldn't. Terms like "reverse engineering", "exploitation", and the by-now-kitschy "hacking" seem to figure into it.
Unfortunately, abstractions are intuitive and legible, whereas the primitives they abstract away are neither of these things. This means that looking past abstractions is a terrible experience all around. Still, every now and then an excited newbie hears of the above and says, "that sounds great! Where do I get started?". An embarrassed expert then answers that there is no royal road, and they should probably follow this and that person on Twitter, and "go practice, like with CTFs or something idk".
This is sound advice, but we've seen people who follow it have a bad time. There's a pervasive mentality in the field that the tao that can be taught is not the true tao, and that the only way to learn is to \href{https://www.offensive-security.com/offsec/say-try-harder/}{Try Harder\textsuperscript{\textregistered}}. As a result, exercises challenge, but don't educate. They demand that would-be solvers summon a grab-bag of disparate knowledge, surmount minor technical gotchas that a beginner won't recognize as such, and know how to deal with pure caprice. Solutions -- if they exist -- are provided by third parties, are unbearably terse, and are devoid of any connection to a larger picture. Most of all, they fail to answer the most pertinent question: "How was \textit{I} supposed to think of that?". The student's only recourse is to search for an easier problem and pray vigorously that, working through it, they will finally grok the general principle. It's Try Harder\textsuperscript{\textregistered} all the way down.
\wrapimageleftcap{./images/complications.png}{Barrier to entry. Also, that $\left(\frac{N}{m}\right)$ should be ${N \choose m}$.}
The upside of this is that it's realistic. Reality is not a learning opportunity; it does demand disparate information, it does frustrate with minor technical gotchas and pure caprice, and it does often leave the student no choice but to Try Harder\textsuperscript{\textregistered}. Students should be ready to deal with problems in these harsh terms, which is why we have the ageless academic tradition of final exams. Still, imagine a course composed entirely of final exams. No theory, no guided solutions, not even proper homework problems -- just the hapless student vs. their own ignorance. People would run for the hills at such a proposal, and for good reason. No one likes being told to run before they've walked.
The buck has to stop somewhere, with a teaching moment that doesn't assume that deep down the student already knows all the answers. Georgia Tech's "Toddler's Bottle" exercises are the closest thing we've found to the missing homework problems: exercises which distil a concept, simplify its presentation, and filter out distractions. This guide, then, is an attempt to complete the puzzle: the missing guided solutions and lecture notes that walk the reader through the challenges, and try to provide context and perspective.
The buck stops here. Hopefully.
\section{What do I Need to Know Coming in?}
\wrapimagerightcap{./images/ubuntu.png}{Ubuntu Linux Desktop}
We tried to trim the list of prerequisites as much as possible and as much as time allowed. Still, some pieces of knowledge turned out to be too fundamental to route around, and too hefty to be transmitted in a digression. Throughout this document, we assume that:
\begin{itemize}
\item You have a working \textbf{Virtual Machine} with a working \textbf{Linux Distribution}, such as \href{https://ubuntu.com}{Ubuntu Linux}, installed
\item You know the \textbf{C language} at the 101 level -- enough to know when to use \code{\&var} instead of \code{var} and how two's complement works
\item You know the \textbf{C++ language} at the 101 level -- enough to know what polymorphism is, what inheritance is and what virtual functions are for
\item You know enough \textbf{Python} to comfortably read it and write in it
\item You know about \textbf{binary and hexadecimal representation}, and how to convert between those and decimal
\end{itemize}
Probably the biggest hurdle not on this list is knowing how to use a debugger and a disassembler. We tried, we \textit{really} tried, to put together a proper tutorial to bring the reader up to speed on how to use both; but these are very hefty subjects, and if you've had zero experience with a disassembler or a debugger up until now, some of the exercises may get somewhat frustrating. If mid-exercise you feel that this is the bottleneck holding you back, you probably want to put aside the problem and first complete a dedicated tutorial on these subjects.
\section{Basic Linux Commands}
In our Linux VM, let's open a new terminal (\code{ctrl+alt+t} on Ubuntu), then try out the commands below and get a feel for how they work.
\wrapimagerightcap{./images/sandwich.png}{\href{https://xkcd.com/149/}{xkcd \#149, "sandwich"}}
\begin{itemize}
\item \xcode{bash}{pwd} - print the current working directory.
\item \xcode{bash}{ls} - list files and directories in the current directory.
\item \xcode{bash}{cd dirname} - "enter" the directory \code{dirname}, so it becomes the new current directory. To go back up in the directory structure, use the command \xcode{bash}{cd ..}. It's also possible to \xcode{bash}{cd} directly to a completely different path, e.g. \xcode{bash}{cd /tmp}; to go back to the home directory, do \xcode{bash}{cd \~}.
\item \xcode{bash}{cat filename} - print the contents of the file \code{filename} to the terminal.
Can be given several files (\xcode{bash}{cat file1 file2 file3...}) and will print all of them in succession.
\item \xcode{bash}{cp filename1 filename2} - create a copy of \code{filename1}; the copy will have the name \code{filename2}.
\item \xcode{bash}{mv filename1 filename2} - move the file \code{filename1} to a new location \code{filename2}.
\item \xcode{bash}{mkdir dirname} - create a new empty directory with the name \code{dirname}. The directory will be created in the current working directory.
\item \xcode{bash}{rm filename} - delete the file \code{filename}.
\item \xcode{bash}{vim} - a powerful text editor with an "insert mode" for typing text and a "normal mode" for issuing commands. To start typing, press \xcode{bash}{i}; this enters insert mode. To use commands (such as save, quit, etc.), go back to normal mode by pressing \code{esc}. From there, to save do: \xcode{bash}{:w} + return; to quit do: \xcode{bash}{:q} + return, or \xcode{bash}{:q!} + return to quit and discard unsaved changes. If \xcode{bash}{vim} is a bit too much, try \xcode{bash}{nano} instead.
\item \xcode{bash}{chmod a+x filename} - add execution privileges to a file for everyone, so that any user can execute the file. \xcode{bash}{chmod} can also be used to add/remove read privileges (+r, -r) and write privileges (+w, -w); and can be used to modify permissions only for the file owner or file group (with u+ or g+ instead of a+). More on this below, under "access control".
\item \xcode{bash}{sudo} - execute a command with administrator privileges. Using this command causes the OS to prompt for the current user's account password.
\item \xcode{bash}{groups} - print the list of groups the current user belongs to.
\item \xcode{bash}{sudo apt install python3} - installs Python 3 on the machine (chances are it's installed already, and the command will quit with a note explaining this). Other programs can be installed similarly, by specifying their name instead of \code{python3}. Since it invokes \xcode{bash}{sudo}, this command requires the current user to be admin on the machine, and will prompt for the account password. \code{apt} is the package manager for Ubuntu, Linux Mint and Debian; users of other Linux distributions (such as Arch Linux) should use whichever package manager is included with them.
\item \xcode{bash}{python3} - starts a python shell. Try \code{2+2} and see how the shell responds. It's possible to exit the shell by typing \code{quit()}. It's also possible to do \xcode{bash}{python3 filename.py}; this will run all the commands in \code{filename.py} through the Python interpreter.
\end{itemize}
Before we're done, one neat trick that's useful to know is backtick substitution. If a bash command is placed in backticks (\xcode{bash}{`}), bash first runs that command on its own and then replaces the backticked text with the command's output. So, for example, in the command \xcode{bash}{cp /bin/cat `pwd`}, bash will expand \xcode{bash}{`pwd`} to the actual current directory, and create a copy of \xcode{bash}{/bin/cat} there.
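To make the mechanics concrete, here is a minimal Python sketch of what bash does with a backticked command: run it, capture what it prints, drop the trailing newline, and splice the result into the outer command line. The helper name \code{substitute\_backticks} is our own invention for illustration, and the sketch deliberately ignores quoting, nesting and word splitting, all of which real bash handles.

\begin{lstlisting}[language=Python]
import re
import subprocess

def substitute_backticks(command_line):
    """Replace each `inner` span with the output of running inner in a shell.

    Simplified model: no nesting, no quoting rules, no word splitting.
    """
    def run_inner(match):
        inner = match.group(1)
        result = subprocess.run(inner, shell=True,
                                capture_output=True, text=True)
        # Like bash, drop trailing newlines from the captured output.
        return result.stdout.rstrip("\n")
    return re.sub(r"`([^`]*)`", run_inner, command_line)

# Only expands the command line; nothing is copied.
expanded = substitute_backticks("cp /bin/cat `echo /tmp`")
print(expanded)
\end{lstlisting}

The sketch prints the expanded command line instead of executing it, which makes it easy to see what the shell would end up running.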
\section{SSH and SCP}
It is possible to connect from a Linux machine $M_1$ to another Linux machine $M_2$, and run commands on $M_2$ as if sitting at the keyboard of $M_2$ in person. To do this, one must know $M_2$'s IP address, know which port its SSH server is running on, and have valid credentials for an $M_2$ user account.
This is done by going to $M_1$ and executing: \xcode{bash}{ssh user@1.2.3.4 -p 1001} where \code{1.2.3.4} should be $M_2$'s IP address, \code{1001} the SSH port and \code{user} the username at $M_2$. The remote machine will issue a password prompt for \code{user}. If verification is successful, an SSH session is established and the user at $M_1$ can now issue commands remotely to $M_2$. To stop issuing commands to $M_2$ and go back to the $M_1$ command line, one should use the command \xcode{bash}{exit}.
Apart from starting an SSH session, it is also possible to copy files from $M_1$ to $M_2$ and back, by using $M_1$'s command line. This is done using the \xcode{bash}{scp} command. To copy the file \code{/home/bob/grocery\_list.txt} from $M_2$ to $M_1$, execute the command \xcode{bash}{scp -P 1001 1.2.3.4:/home/bob/grocery\_list.txt ./grocery\_list.txt}. To copy the file back to the remote $M_2$, execute: \xcode{bash}{scp -P 1001 ./grocery\_list.txt 1.2.3.4:/home/bob/grocery\_list.txt}.
\wrapimageleft{./images/ssh.png}
The server at \code{pwnable.kr} runs an SSH server on port 2222; one of the accounts on that machine has username \code{fd} and password \code{guest}. Try to establish an SSH session using that server and that account:
\xcode{bash}{ssh fd@pwnable.kr -p 2222}
When prompted for a password, write "guest" and hit return (the password will not appear on screen). Verify that the SSH session has been successfully established. Create a directory under \code{/tmp/}:
\xcode{bash}{mkdir /tmp/an\_original\_dir\_name}
(use something original instead of \code{an\_original\_dir\_name}; the command will fail if someone else has already created a directory by that name)
Exit the session with \xcode{bash}{exit}.
Now, try to copy the file \code{/home/fd/fd.c} from the remote server to the current directory:
\xcode{bash}{scp -P 2222 fd@pwnable.kr:/home/fd/fd.c .}
Another password prompt will appear (it's still \code{guest}). Verify that a copy of \code{fd.c} is now present on the local machine.
Try to send a file back to the pwnable server.
\begin{lstlisting}[language=bash]
echo "testing" > test.txt
scp -P 2222 ./test.txt fd@pwnable.kr:/tmp/an_original_dir_name/test.txt
\end{lstlisting}
Start another SSH session and verify that a copy of \code{test.txt} is really there.
We wish we could just breezily explain how to troubleshoot network issues, \textit{just in case} there are any. Alas, if we started, we'd get to the actual material on page 50 or so. If an SSH connection fails and you've never resolved a similar issue on your own before, go ask someone for help.
\section{Access Control}
Nearly every challenge on pwnable.kr is of the form: "here's a program; get clever with it and make it access the flag". If you honestly don't care why you can't just read the flag directly on your own, and you're willing to deal with plenty of trial and error when trying to read/create files and directories, then in theory you can go ahead and skip this section. In practice, we suggest you don't.
\subsection{File Permissions}
\wrapimageright{./images/file_permissions.png}
The concept of \textit{boundaries} is deeply interwoven into digital system best practices. By default, Alice should not have access to documents created by Bob. By default, if Bob visits a website the website should not have the ability to meddle with Bob's \code{My Documents} folder, and his web browser should not have the ability to install another operating system on his machine. In its most idealized form, this is called the \textbf{Principle of Least Privilege} (PLP): entities should have exactly the privileges necessary to carry out their duties, and no more.
We live in the real, non-idealized world, where violations of the PLP are a fact of life. Still, most digital systems do have a form of access control -- a system for determining who has the right to do what. In the real world, entities do not always have the least possible privilege, but they are typically subject to just enough limitations to prevent anything outright insane.
Linux, in particular, has a certain system in place that limits access to files. This may not sound like much, except in Linux everything is a file, so it's really more accurate to say that this system limits access to everything.
To see this system in action, on the Linux machine, execute the following mysterious commands. Each invocation of \xcode{bash}{sudo} might require your admin account password, which you picked when you installed the OS.
\begin{lstlisting}[language=bash]
cd ~
mkdir ac_test
cd ac_test
sudo groupadd characters
sudo useradd alice
sudo usermod -a -G characters alice
sudo passwd alice
#when prompted for password
drinkme
#when prompted again
drinkme
sudo useradd bob
sudo usermod -a -G characters bob
sudo passwd bob
#when prompted for password
fixit
#when prompted again
fixit
touch jabberwocky.txt
echo "twas brillig etc" > jabberwocky.txt
sudo chown alice jabberwocky.txt
sudo chgrp characters jabberwocky.txt
touch collab_diary.txt
echo "today was a great day" > collab_diary.txt
sudo chown alice collab_diary.txt
sudo chgrp characters collab_diary.txt
sudo chmod g+w collab_diary.txt
touch yes_we_can.txt
sudo chown bob yes_we_can.txt
sudo chgrp characters yes_we_can.txt
sudo chmod og+w yes_we_can.txt
\end{lstlisting}
When done, execute \xcode{bash}{ls -l}. The output should look like this:
\displayimage{./images/access_control_file_list.png}
\wrapimageright{./images/file_permissions_how_to_read.png}
There's some amount of information here to unpack. First, we got quite a lot more output than with a simple \xcode{bash}{ls}; this is because we specified the \xcode{bash}{-l} flag, which causes \xcode{bash}{ls} to output additional information. This additional information includes each file's access permissions, which we are interested in.
The access permissions for each file are the very first blob of characters on the line that ends with the file name. So, in the example above, \code{collab\_diary.txt} has access permissions \xcode{bash}{-rw-rw-r--}.
Here's how to read these permissions:
(Ignore the first - for the time being.)
\begin{itemize}
\item The \textbf{owner} of the file can...
\begin{itemize}
\item \xcode{bash}{r}ead it
\item \xcode{bash}{w}rite to it
\item \xcode{bash}{-} but not execute it
\end{itemize}
\item The \textbf{group} associated with the file can...
\begin{itemize}
\item \xcode{bash}{r}ead it
\item \xcode{bash}{w}rite to it
\item \xcode{bash}{-} but not execute it
\end{itemize}
\item Anyone \textbf{else} can...
\begin{itemize}
\item \xcode{bash}{r}ead the file
\item \xcode{bash}{-} but not write to it
\item \xcode{bash}{-} or execute it
\end{itemize}
\end{itemize}
The access permissions are followed by the mysterious number 1 (leave that alone for the time being, too). Following \textit{that}, one can see the user who owns the file and the group associated with the file.
For instance, in the example above:
\begin{itemize}
\item Alice, and any member of "characters", can read \code{collab\_diary.txt} and write to it. Anyone else can only read the diary, but not write to it.
\item Alice can read \code{jabberwocky.txt} and write to it; anyone else can only read it, whether they are a member of "characters" or not.
\item Anyone can read \code{yes\_we\_can.txt} or write to it.
\end{itemize}
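The reading rules above can be condensed into a short Python sketch. It is a simplified model, not the kernel's actual code: it ignores the root user, ACLs and capabilities, and the numeric ids standing in for Alice, Bob and the "characters" group are made up for illustration. Note that exactly one of the three permission triplets is consulted, chosen by who is asking.

\begin{lstlisting}[language=Python]
import stat

def allowed(mode, file_uid, file_gid, uid, gids, want):
    """Simplified model of the owner/group/other check.

    want is one of "r", "w", "x". Ignores root, ACLs and capabilities.
    """
    shift = {"r": 2, "w": 1, "x": 0}[want]
    if uid == file_uid:        # owner: first permission triplet
        triplet = (mode >> 6) & 0o7
    elif file_gid in gids:     # group member: second triplet
        triplet = (mode >> 3) & 0o7
    else:                      # everyone else: third triplet
        triplet = mode & 0o7
    return bool(triplet & (1 << shift))

# Hypothetical numeric ids for alice, bob, an outsider and "characters".
ALICE, BOB, STRANGER, CHARACTERS = 1001, 1002, 1003, 2000

collab_diary = stat.S_IFREG | 0o664   # regular file, -rw-rw-r--
print(stat.filemode(collab_diary))
print("bob may write:",
      allowed(collab_diary, ALICE, CHARACTERS, BOB, {CHARACTERS}, "w"))
print("stranger may write:",
      allowed(collab_diary, ALICE, CHARACTERS, STRANGER, set(), "w"))
print("stranger may read:",
      allowed(collab_diary, ALICE, CHARACTERS, STRANGER, set(), "r"))
\end{lstlisting}

Changing \code{0o664} to \code{0o644} models \code{jabberwocky.txt}, where group membership no longer grants write access.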
Feel free to experiment with the various files and permissions. It's possible to switch users with the \xcode{bash}{su} command - for example, \xcode{bash}{su alice}. This will prompt for Alice's password, which is \code{drinkme}; Bob's password is similarly \code{fixit}. To resume using the main user account, use the command \xcode{bash}{exit}. Try to read and modify the various files while acting as the main account, then as Alice and as Bob, and see whether the results match your expectations. Note that we've made both Alice and Bob members of the "characters" group.
Just as files have read, write and execute permissions, so do directories. A user having "read" permission for a directory means they are allowed to see its contents; "write" permission means they are allowed to create and remove files from it; and "execute" permission means they are allowed to \xcode{bash}{cd} into it.
\subsection{SUID bit}
\wrapimageright{./images/baton.png}
Here's a trick question. Suppose Alice \textit{executes} a file owned by Bob; which permissions should the program have -- Alice's or Bob's?
If we answer "Alice's" then we have a problem. Suppose the file was the command \xcode{bash}{passwd}; in this case, Alice is trying to change her account password, and Bob is the system administrator. Now \xcode{bash}{passwd} will run with Alice's permissions. But all account passwords are stored in the same file. Alice can't read it, or write to it (if she can, that's a serious security breach). Therefore, if \xcode{bash}{passwd} is run with Alice's privileges, it can't do its job of changing Alice's password.
But if we answer "Bob's", then we \textit{also} have a problem. Suppose that Bob has created a simple text editor. He owns the file for the text editor executable. If Alice tries to use the text editor, she'll find that instead of her own files, she can only edit Bob's files! This is not good news for either Alice's productivity or Bob's privacy.
Because of the above issues, the answer is not "Alice" or "Bob"; the answer is "Bob should get to decide, on a per-file basis". This is implemented via a feature called "suid". Executable files will, by default, run with the permissions of whoever executed them. But if suid is on for that file, and it is a \textbf{binary} file, then it will run with the permissions of the file owner. We therefore expect that \xcode{bash}{passwd} should have suid on, and in fact, it does:
\displayimage{./images/suid_passwd.png}
The \xcode{bash}{s} (instead of \xcode{bash}{x}) in the permissions for the file owner indicates that suid is on for this file.
suid can be turned on or off for a file using \xcode{bash}{chmod} -- with \xcode{bash}{u+s} or \xcode{bash}{u-s} respectively. Again, this only applies to binary files - not scripts!
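For readers who prefer to see the bit itself: suid is one more flag in the file's mode, next to the nine read/write/execute bits. The Python sketch below only manipulates mode numbers and touches no real files; it shows the flag being set and how that changes the permission string. The helper name \code{has\_suid} is ours.

\begin{lstlisting}[language=Python]
import stat

def has_suid(mode):
    """True if the set-user-ID bit is set in a st_mode value."""
    return bool(mode & stat.S_ISUID)

plain = stat.S_IFREG | 0o755        # regular file, rwxr-xr-x
with_suid = plain | stat.S_ISUID    # the same mode after chmod u+s

print(stat.filemode(plain), has_suid(plain))
print(stat.filemode(with_suid), has_suid(with_suid))
\end{lstlisting}

To inspect a real file, pass \code{os.stat(path).st\_mode} to \code{has\_suid}.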
Let's test out the way SUID works. Execute the following commands:
\begin{lstlisting}[language=bash]
cd ~/ac_test
echo "Alice's secret" > ./alice_secret.txt
sudo chown alice ./alice_secret.txt
sudo chgrp characters ./alice_secret.txt
sudo chmod og-rwx ./alice_secret.txt
cp /bin/cat .
sudo chown alice ./cat
sudo chgrp characters ./cat
sudo chmod a+x ./cat
cp /bin/cat ./cat_suid
sudo chown alice ./cat_suid
sudo chgrp characters ./cat_suid
sudo chmod a+x ./cat_suid
sudo chmod u+s ./cat_suid
ls -l
\end{lstlisting}
The output should look something like the below:
\displayimage{./images/alice_bob_suid.png}
Take a moment to guess the output of the following commands:
\begin{lstlisting}[language=bash]
su bob
fixit #when prompted for password
./cat alice_secret.txt
./cat_suid alice_secret.txt
\end{lstlisting}
The call using \xcode{bash}{cat} fails, because it runs with Bob's permissions and he does not have read permissions for \code{alice\_secret.txt}. The call with \code{cat\_suid} succeeds, because due to suid being on, it runs with Alice's permissions.
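The outcome of the experiment can be replayed as a toy model in Python. This is a sketch under simplifying assumptions: the numeric user ids are invented, root is ignored, and the secret file is treated as readable by its owner only, which is what \code{chmod og-rwx} left us with.

\begin{lstlisting}[language=Python]
def effective_uid(caller_uid, owner_uid, suid_on):
    """Which user a freshly executed binary acts as (simplified)."""
    return owner_uid if suid_on else caller_uid

def can_read_owner_only(file_owner_uid, euid):
    """Read check for a file with mode -rw------- (root ignored)."""
    return euid == file_owner_uid

ALICE, BOB = 1001, 1002   # hypothetical numeric user ids

# Bob runs both binaries; both are owned by Alice, as is the secret.
for name, suid_on in [("cat", False), ("cat_suid", True)]:
    euid = effective_uid(caller_uid=BOB, owner_uid=ALICE, suid_on=suid_on)
    print(name, "can read the secret:", can_read_owner_only(ALICE, euid))
\end{lstlisting}

The only thing that differs between the two runs is the suid flag, and that alone decides whose permissions the read is checked against.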
\section{Linux File Descriptors}
A lot of things in Linux are files. Directories are files. Hard disks are files. Processes are files. Internet connections are files. \textit{Files are files}.
\wrapimageright{./images/file_descriptors.png}
We've seen how to work with files on Linux: creating files, modifying the contents of files, copying and moving files. It's a pretty intuitive API, so of course it's the result of an abstraction on top of an abstraction on top of thirty other abstractions. Let's ask the question bluntly: The command line shell we are using, \code{bash}, was written in C -- so how did its author create \code{bash} without already having \code{bash} to handle all the file operations?
The answer is that C (and assembly) programs use a filesystem API which is a layer of abstraction down from \code{bash}. Consider the following C program:
\lstinputlisting[language=C]{./code/fd_demo.c}
Every process in Linux (and Windows, too) has a number which is its process id (pid). This number uniquely identifies the process. The above program, when launched, will display the pid of its own process. Launch the program, then on a different terminal execute the following command: \xcode{bash}{ls -la /proc/<pid>/fd} where \code{<pid>} should be replaced with the actual pid that the process reported. The output should look something like this:
\displayimage{./exercises/00_fd/fd_before_open.png}
On the original program terminal, press return. Now on the other terminal run \xcode{bash}{ls -la /proc/<pid>/fd} again. The specific details will vary, but there should be something in the output that wasn't there before:
\displayimage{./exercises/00_fd/fd_after_open.png}
The reader might wonder what file descriptors 0, 1 and 2 are; these are the process' \textit{standard input}, \textit{standard output} and \textit{standard error} streams, respectively. These three file descriptor numbers are always the same for every process (so, for example, the standard input is always descriptor number 0). The standard output is where program output is written, and the standard input is where input is taken from. Right now, both are bound to the same value: \code{/dev/pts/7}, which is the terminal that launched the C program (your value will probably be different). Let's do an experiment:
\xcode{bash}{echo "hi" > /dev/pts/7}
Go look at the terminal that spawned the C program; the word \code{hi} should appear there.
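The same descriptor-level API is reachable from Python through the \code{os} module, which makes for a quick experiment. The sketch below is our own illustration and not a translation of the C program above; the file name \code{demo.txt} is arbitrary. It writes directly to descriptor 1 without opening anything, then opens a file and reads from whatever descriptor number the OS hands back.

\begin{lstlisting}[language=Python]
import os
import tempfile

# Descriptor 1 is standard output; writing to it needs no open() call.
os.write(1, b"written straight to descriptor 1\n")

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "demo.txt")
    with open(path, "w") as handle:
        handle.write("hello from a file\n")

    # os.open returns the lowest descriptor number not currently in use.
    fd = os.open(path, os.O_RDONLY)
    data = os.read(fd, 32)    # read up to 32 bytes from that descriptor
    os.close(fd)

print("descriptor number:", fd)
print("bytes read:", data)
\end{lstlisting}

The descriptor number printed depends on what the process already has open, so it may differ between runs and environments.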
\section{Challenge 0x00: fd}
\exerciseopen{./images/00_fd.png}{Mommy! What is a file descriptor in Linux?}
Visit pwnable.kr, create an account and click "play" at the top menu. Choose the first challenge in the "toddler's bottle" category -- \code{fd}. We're given the IP address, port and credentials for an SSH session.
After establishing an SSH session with the correct parameters, the following message should appear:
\displayimage{./exercises/00_fd/fd_welcome_message.png}
Following the logon, the session sets the current directory to the home folder of the remote user whose credentials were used (in this case, the user is \code{fd} and so the current directory is set initially to \code{/home/fd}). Let's look at the files in that directory and their permissions:
\displayimage{./exercises/00_fd/fd_permissions.png}
Since we're playing Capture The Flag, we turn our interest to the file labeled "flag". Unfortunately, based on its permissions, owner and group, to read it we need to be logged in as \code{fd\_pwn} or belong to the \code{root} group.
Or maybe there's another option? We have execute permissions for the file \code{fd}, which belongs to the user \code{fd\_pwn} \textit{and} has suid turned on. In other words, if we run \code{fd} and convince it to read and output the flag for us, we win. This is a pretty standard setup for CTF exercises; the whole challenge revolves around cajoling \code{fd} to do our bidding in some way -- which may be easy, difficult, head-against-the-wall difficult, or even impossible (though in that last case, it's less of a proper exercise and more of an exercise in futility).
How do we get \code{fd} to print the contents of "flag" for us, then? Conveniently, we have been given the source code for the \code{fd} program in the file \code{fd.c}:
\lstinputlisting[language=C]{./exercises/00_fd/fd.c}
The program interprets the first parameter as a number, subtracts \code{0x1234} from it and reads up to 32 bytes from the file descriptor with the resulting number. If the bytes read spell \code{LETMEWIN}, the program prints the flag.
When CTFing (and solving problems in general), a precise concept of what we \textit{can't} do and what we \textit{don't} know can be a valuable asset. For example, when the \code{fd} program runs on the remote server, we can't:
\begin{enumerate}
\item access \code{fd}'s list of open file descriptors; we don't have read access to \code{/proc/}
\item directly interfere with the list of open file descriptors; we don't have write access to \code{/proc/}, either
\item modify our choice of \code{argv[1]} retroactively after the program has been run
\item get the \code{fd} process to open a file for us, thus adding an entry to its file descriptor table; the program source code mandates no such action
\end{enumerate}
Problems 1 and 3, in themselves, are solvable -- somewhat. We've seen that when a process opens a new file, that file is assigned the next available free file descriptor. Therefore, typically the first file opened has descriptor 3, the next one 4, and so on. This implies that if the program source had contained an extra line:
\begin{lstlisting}[language=C]
fd = open("read_from_here.txt",O_RDONLY);
\end{lstlisting}
We could guess that \code{read\_from\_here.txt} would be assigned a file descriptor of 3. We could then solve the challenge the following way: create the file \code{read\_from\_here.txt} in advance with the contents \code{LETMEWIN}, and then execute \code{fd 4663}; 4663 is chosen because \xcode{bash}{4663 = 0x1234 + 3}. The program \code{fd} would compute \xcode{bash}{4663 - 0x1234 = 3}, read 32 bytes from the file associated with file descriptor 3 (that's \code{read\_from\_here.txt}), see that it is the correct value \code{LETMEWIN} and print the flag.
There are two problems with the above solution draft. First, we will run into permission issues when trying to write to \code{read\_from\_here.txt}; we don't have write permissions for the \code{/home/fd/} directory on the remote server. Second of all and more importantly, this is all a hypothetical situation. In the actual program, there is no \code{read\_from\_here.txt}. Our idle dream bubble goes poof, and we must face the problem in its original, actual form.
What \textit{do} we know for sure about the state of the \code{fd} process' file descriptor list, then? Not much. All we know for sure is that the first three descriptors (0, 1 and 2), representing the standard input, output and error, will all be bound to the \code{/dev/pts} terminal that spawned the program. We are effectively forced to choose an \code{argv[1]} value of \code{0x1234+0}, \code{0x1234+1} or \code{0x1234+2}; any other value will be a futile shot in the dark. We've seen that by default, processes don't \textit{have} open file descriptors other than these three.
Translated into English, this means we can tell the program to read the 32 bytes either from the standard input, the standard output, or the standard error stream. And, once we've said that out loud, that first option should sound pretty attractive to us. "Reading from the standard input" is a very common C idiom; it's what happens, for example, when a C program calls \code{getchar()} or \code{scanf()}. If we can get the program to "read 32 bytes from the standard input", what this means in practice is that the program will halt and wait for us to input those 32 bytes manually from the terminal!
This insight yields the solution; simply run \code{fd 4660} (since \code{4660 = 0x1234}). Instead of chastising us to learn about file IO, the program will seem to halt and wait for our input. Write \code{LETMEWIN}, press return and the program will print the flag.
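To double-check the arithmetic, here is a small Python model of the logic described above. It is a sketch based on the prose description rather than a transcription of \code{fd.c}: the names \code{argument\_for} and \code{model\_fd\_program} are invented here, and the comparison ignores a trailing newline for simplicity. A pipe stands in for the terminal so that the model can be run without typing anything.
\begin{lstlisting}[language=python]
import os

OFFSET = 0x1234  # the constant the program subtracts from argv[1]

def argument_for(descriptor):
    """Return the decimal argv[1] that makes the program read `descriptor`."""
    return str(OFFSET + descriptor)

def model_fd_program(argv1):
    """Hypothetical model: read up to 32 bytes from descriptor argv1 - 0x1234."""
    descriptor = int(argv1) - OFFSET
    data = os.read(descriptor, 32)
    return data.rstrip(b"\n") == b"LETMEWIN"

if __name__ == "__main__":
    print(argument_for(0))               # the value to pass for standard input
    read_end, write_end = os.pipe()      # stand-in for a terminal
    os.write(write_end, b"LETMEWIN\n")
    print(model_fd_program(argument_for(read_end)))
    os.close(read_end)
    os.close(write_end)
\end{lstlisting}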
Generally speaking, when a CTF challenge is doing something strange and apparently meaningless with its given input, it \textit{may} be the case that the answer is just \textit{very simple} and the author didn't want anyone to stumble upon it blindly. Without the artificial factor of \code{0x1234} introduced here, it's very feasible to imagine people just trying to run \xcode{bash}{fd 0} to see what happens.
In this sort of situation, if we're looking to get the flag and do zero learning, we can try to send input that \textit{after} the meaningless transformation becomes exactly the sort of thing that someone might mindlessly type. It's a shot in the dark, but it's very worth it if and when it pays off (it works for one of the later exercises in the sequence, so stay tuned).
\section{Hexadecimal Representation, Special Characters and the xxd program}
There are 256 possible bytes, all of which we may need to provide to various programs as input, and fewer than half of which appear on our keyboard. This is a problem.
One way of getting around the problem is inserting special characters into the terminal by pressing \code{ctrl+shift+u}, followed by the desired unicode hex value, and then return (try this in the terminal right now with the value \code{41}, and verify that this inserts an \code{A} into the command line). But we don't generally recommend this approach. First of all, most of the time, we'll need values to be ascii-encoded and not 2-byte unicode values. Second, imagine inputting a sequence of 300 special characters into the terminal with this method, again and again, in order to debug an issue with how a program reacts to the input! It's a sure way to go insane.
The reader might ask, "can't I just construct the input with a script and send it to the process via, I don't know, some Python module or another?". That's an excellent question; this "some module or another" is called \code{pexpect}, and yes, it would solve the immediate problem. But considering where we are right now in the challenge sequence, \code{pexpect} is overkill. It's bad form to reach for complex tools when dealing with simple problems.
Instead, we're going to tackle the issue with backtick substitution (we already saw that trick under "basic linux commands"), a bash feature called \textit{IO redirection} and a tool called \code{xxd}.
\code{xxd} converts input from raw bytes to hexadecimal representation. The easiest way to understand how it works is to see an example. Create a new file, named \code{xxd\_demo}, with the following contents:
\lstinputlisting{./code/xxd_demo}
Now run:
\xcode{bash}{xxd xxd\_demo > xxd\_demo.hex}
This should create a new file with the name \code{xxd\_demo.hex}. Open it in a text editor:
\lstinputlisting{./code/xxd_demo.hex}
Now run:
\xcode{bash}{xxd -r xxd\_demo.hex > xxd\_demo\_2}
Open \code{xxd\_demo\_2}. It should be identical to \code{xxd\_demo}. As we've just demonstrated, \xcode{bash}{xxd -r} converts from hexadecimal representation back to raw bytes.
One thing that's important to note is that in the hexadecimal representation used by \code{xxd}, the first number in each line corresponds to the offset in the file, and each line specifies at most 16 bytes. So we can't, for example, just write \code{41} 80 times, and have \xcode{bash}{xxd -r} convert it into 80 "A"s. To see how we \textit{do} get 80 "A"s, simply create a file containing 80 "A"s and run \xcode{bash}{xxd} on it.
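The layout just described (an offset, then at most 16 bytes per line) can be imitated in a few lines of Python. This is only an approximation written for illustration, not \code{xxd} itself; the helper names \code{dump} and \code{undump} are made up, and the output is meant to resemble the default \code{xxd} format rather than match it in every detail.
\begin{lstlisting}[language=python]
def dump(data):
    """Render bytes as xxd-style lines: offset, hex pairs, printable text."""
    lines = []
    for offset in range(0, len(data), 16):
        chunk = data[offset:offset + 16]
        hexed = chunk.hex()
        # group the hex digits four at a time (two bytes per group)
        groups = " ".join(hexed[i:i + 4] for i in range(0, len(hexed), 4))
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}: {groups:<39}  {text}")
    return lines

def undump(lines):
    """Rebuild the raw bytes from lines produced by dump()."""
    out = bytearray()
    for line in lines:
        hex_part = line.split(": ", 1)[1][:39]   # skip offset, drop text column
        out += bytes.fromhex(hex_part.replace(" ", ""))
    return bytes(out)

if __name__ == "__main__":
    for line in dump(b"A" * 80):
        print(line)
\end{lstlisting}
Running the sketch on 80 "A"s prints five lines, each starting with the offset of its first byte, which is why a bare run of \code{41}s is not valid input for \xcode{bash}{xxd -r}.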
\wrapimageleft{./images/ascii_table.png}
\code{xxd} is not the only way to feed special characters to linux programs. For small use cases, one can also use \code{printf}, which is a shell built-in rather than a program (this means we can still use it if we accidentally wipe \code{/usr/bin}; this is actually relevant later in one of the exercises). Try \xcode{bash}{printf "\\x41\\x42\\x43\\x44"}. If one must specify characters directly from the terminal and does not have access to \code{xxd} or \code{printf}, they can also use character substitution with the \code{\$} sigil. Try: \xcode{bash}{echo \$'\\x68\\x65\\x6c\\x6c\\x6f'}; this should echo \xcode{bash}{hello} to the terminal. Both these methods can be similarly used to specify special characters directly from the terminal to other programs. The \code{\$} sigil can also be used to an effect similar to backtick substitution; try \xcode{bash}{\$(echo ls)}.
As for IO redirection -- it is a fanciful name for a really simple feature. We can have a program read from a file instead of the terminal, or write to a file instead of the terminal, or both. To have a program write to a file instead of the standard output, do \xcode{bash}{program > file}; to have it read from a file instead of the standard input, do \xcode{bash}{program < file}. To do both: \xcode{bash}{program < in\_file > out\_file}. Actually, a lot of commands we've typed so far involve IO redirection.
An important caveat is that when redirecting stdin, once the input file is exhausted, it is not possible to interact any further with the target process. The program will simply assume there is no more input, and react accordingly.
How do \code{xxd}, IO redirection and backtick substitution solve our problem? Well, if we want to feed a program a certain complicated input full of strange characters, we can first create a file (let's name it \code{input.hex}) that contains the input in hexadecimal representation. Then run \xcode{bash}{xxd -r input.hex > input.dat} to get our input in raw bytes form sitting in the file \code{input.dat}; then, finally, to feed the input to the program, execute \xcode{bash}{program < input.dat}. To redirect to or from the standard error channel, we can use \code{2>} or \code{2<} (this is esoteric and not used very often, but it crops up in one of the exercises). If instead we want to give our input as a command line argument, we can use backtick substitution: \xcode{bash}{program `cat input.dat`}.
Are we done? Almost. Even now, some gotchas remain that we should be aware of. Not all bytes were created equal, and some bytes may cause our specially crafted input to not carry over into the program. In particular, command line parameters typically don't play well with the null byte (\code{0x00}), the tab (\code{0x09}), the newline (\code{0x0A}) and the space character (\code{0x20}). Methods that read from the standard input are even more picky, and apart from the bytes already mentioned, might also have trouble with the end-of-transmission byte (\code{0x04}), the vertical tab (\code{0x0B}), the form feed (\code{0x0C}) and the carriage return (\code{0x0D}). If a targeted program is acting up because of problematic bytes, try to think of an alternative input which does not contain these bytes, but achieves the same goals.
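A quick way to catch such bytes before they cause trouble is to scan the payload up front. The sketch below encodes the two lists from the paragraph above; the names \code{ARGV\_UNSAFE}, \code{STDIN\_UNSAFE} and \code{unsafe\_offsets} are illustrative, and whether a given byte really causes trouble depends on how the target program reads its input.
\begin{lstlisting}[language=python]
# Bytes the text flags as troublesome for command line parameters.
ARGV_UNSAFE = {0x00, 0x09, 0x0A, 0x20}
# Methods reading standard input may additionally choke on these.
STDIN_UNSAFE = ARGV_UNSAFE | {0x04, 0x0B, 0x0C, 0x0D}

def unsafe_offsets(payload, unsafe):
    """Return (offset, byte) pairs for every flagged byte in `payload`."""
    return [(i, b) for i, b in enumerate(payload) if b in unsafe]

if __name__ == "__main__":
    payload = b"AAAA\x04BBBB CCCC"
    print(unsafe_offsets(payload, ARGV_UNSAFE))
    print(unsafe_offsets(payload, STDIN_UNSAFE))
\end{lstlisting}
An empty list means the payload contains none of the flagged bytes for that input channel.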
\section{Hash Functions}
A \textbf{hash function} is a function $h$ that fulfils two conditions.
First, it maps input of arbitrary length to output of a fixed length of $n$ bits. For example, the hash function \code{sha256} can take any input, and will produce an output which is 256 bits long.
\wrapimageright{./images/hash_functions.jpg}
Second, for $h$ to be considered a \textit{really} proper hash function, $h$'s output needs to have been produced by a certified ancient deity at the dawn of time. The deity must have gone over every possible input out of infinitely many, and assigned each input a single $n$-bit output, perfectly at random out of all possible strings of $n$ bits. Once this ritual is complete, $h$ is ready for use. Every time $h$ is computed for some input $x$, the deity must be consulted directly for the correct value of $h(x)$.
If this story sounds somewhat suspicious to you, we'll further say that all the hash functions you've heard about, like \code{sha1} and \code{md5}, are actually knock-offs created by mortals. These are mere human-written functions, cleverly designed to create output "random enough" to seem like the real deal. Some of the knockoffs are pretty good; \code{sha256} can be treated, for all practical purposes, as if it were the real thing. \code{sha1} and \code{md5} are okay-ish, but best avoided. Most importantly of all, if we try and write our own hash function from scratch right now, it will end badly.
\wrapimageleftcap{./images/hash_function_creation.jpg}{the original, perfect hash function being bestowed unto man. It has since been lost.}
How so? Hash functions are useful, for example, for dealing with account passwords. We can pick a hash function $h$ and then, instead of storing Alice's password $p$ on a server, we can store $h(p)$. Every time Alice logs in, she provides a password $p'$ and we verify that $h(p')=h(p)$ to approve the login. Now if the database leaks, instead of $p$ an attacker only has $h(p)$. If the hash function is really proper, it's not clear how an attacker can proceed from there. They can try asking the relevant deity to "un-compute" the hash and recover $p$, but the deity will certainly refuse to answer and probably smite them for their insolence.
But what if we're using our own weak knock-off hash function? Well, it might be weak enough that it's actually possible to recover $p$ back from $h(p)$. This renders the "hash passwords" protection completely useless. Because of this, if one can efficiently find \textit{preimages} for $h$ -- that is, given $h(p)$ easily find a $p'$ with $h(p')=h(p)$ -- then $h$ is officially declared a lousy knockoff, and unfit for use.
(As an aside, a hash function knockoff $h$ is also considered lousy if one can efficiently find \textit{collisions} for it: that is, pairs $p_1, p_2$ where $p_1 \neq p_2$ but $h(p_1)=h(p_2)$. But we'd rather not get into that right now.)
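The login scheme and the notion of a preimage can both be sketched in Python. The code below is illustrative only: \code{weak\_hash} is a deliberately bad toy function invented for this example, and \code{store}, \code{login\_ok} and \code{find\_preimage} are hypothetical helper names. It stores $h(p)$ instead of $p$, then shows that for the toy function a brute-force search over short inputs quickly finds some $p'$ with $h(p')=h(p)$, while the same search gets nowhere against \code{sha256}.
\begin{lstlisting}[language=python]
import hashlib
from itertools import product

def sha256_hash(password):
    return hashlib.sha256(password).digest()

def weak_hash(password):
    """Toy knock-off: sum of the bytes modulo 256 (only 256 possible outputs)."""
    return bytes([sum(password) % 256])

def store(password, h):
    """What the server keeps: h(p), never p itself."""
    return h(password)

def login_ok(attempt, stored, h):
    """Approve the login when h(p') equals the stored h(p)."""
    return h(attempt) == stored

def find_preimage(stored, h, alphabet=b"abcdefghijklmnopqrstuvwxyz", max_len=3):
    """Brute-force short inputs until one hashes to `stored`; None if no luck."""
    for length in range(1, max_len + 1):
        for candidate in product(alphabet, repeat=length):
            if h(bytes(candidate)) == stored:
                return bytes(candidate)
    return None

if __name__ == "__main__":
    leaked = store(b"hunter2", weak_hash)
    forged = find_preimage(leaked, weak_hash)
    print(forged, login_ok(forged, leaked, weak_hash))
    print(find_preimage(store(b"hunter2", sha256_hash), sha256_hash))
\end{lstlisting}
Note that the forged password need not equal the original one; any input with the same hash value passes the login check.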
Just to get a feel of how hash functions operate, execute \xcode{bash}{python3} and inside the python shell do:
\lstinputlisting[language=python]{./code/hash_demo.py}
This should output:
\lstinputlisting{./code/hash_demo.out}
Which is the \code{sha256} value of \code{hello world} in hexadecimal notation.
A rule of thumb is that if a hash function is at least okay-ishly proper then its implementation will be complex, full of redundancy, complicated, ugly and full of redundancy. It'll be called from some third-party library and have its own Wikipedia entry. If a function is purporting to be a hash function but it's simple, elegant, and seems like it was invented on the spot by someone who's never heard of this whole deity business -- it might be possible for an attacker to find preimages and wreak all kinds of mischief.
\section{Challenge 0x01: collision}
\exerciseopen{./images/01_collision}{Daddy told me about cool MD5 hash collision today. I want to do something like that too!}
This exercise should really be called "preimage", but we understand how that's less catchy.
As usual, we're given the IP address, port and credentials for an SSH session. We are again provided with a program \code{col} which has permission to read \code{flag}, and will print it for us if we give it the correct input. Also similarly to the previous challenge, the remote folder has \code{col}'s source code, \code{col.c}:
\lstinputlisting[language=C]{./exercises/01_collision/col.c}
The function \code{check\_password} is something akin to a hash function: it takes input and generates a 4-byte output. We say "something akin" for two reasons:
\begin{enumerate}
\item \code{check\_password} only takes an input length of exactly 20 bytes. A proper hash function can take input of arbitrary length.
\item more importantly, it's a simple and elegant function with no Wikipedia entry. Therefore, as we've learned, it's broken.
\end{enumerate}
The challenge requires that we find a preimage for \code{check\_password}, which is exactly the sort of thing an attacker can do when a hash function is broken. We're given the value \xcode{C}{hashcode = 0x21DD09EC} and need to provide a password such that \xcode{C}{check_password(password) == hashcode}.
Looking at \code{check\_password}, we quickly conclude that it simply computes the 2's complement sum of the 5 dwords in the 20-byte password (with each dword interpreted as a little-endian integer). Since 2's complement addition of 32-bit integers is equivalent to addition modulo $2^{32}$, we simply need to find 5 integers that sum to \code{0x21DD09EC}.
Or do we? Consider that one such solution is \code{0x21DD09EC, 0, 0, 0, 0}; but if we try that, all the \code{0} values will be encoded as strings of null bytes. As we've seen above, in the section on special characters, if we try to give the program such an input, it will simply assume that the input terminated on the first null byte. So, to be more precise, we need to find 5 integers which sum to \code{0x21DD09EC} modulo $2^{32}$ -- and, in their hexadecimal representation, don't contain any problematic special characters that don't play well with \code{argv} (\code{00}, \code{09}, \code{0A} and \code{20}).
Even with this limitation, the number of possible solutions is staggering; ironically, there are so many possible solutions that one might have difficulty pinning down a concrete way forward. One insight that can mitigate this issue is that for any four numbers $a, b, c, d$, there is exactly one $e$ that will result in the correct sum (\code{e = 0x21DD09EC - (a+b+c+d)} modulo $2^{32}$). So, one possible approach is to randomly pick some values of $a, b, c, d$ that lack any special characters, and hope that we get an $e$ that contains no special characters, either. If we fail, we can just try again with different values for $a, b, c, d$.
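This search is easy to script. The following Python sketch models \code{check\_password} as described above (the sum of five little-endian dwords modulo $2^{32}$); it is a model of that description, not the challenge's C source, and the names \code{model\_check\_password}, \code{last\_dword}, \code{is\_argv\_safe} and \code{try\_filler} are invented here. Given a single filler value used for all of $a, b, c, d$, it computes $e$ and reports whether the resulting 20 bytes avoid the problematic characters.
\begin{lstlisting}[language=python]
import struct

TARGET = 0x21DD09EC
ARGV_UNSAFE = {0x00, 0x09, 0x0A, 0x20}   # bytes that do not survive argv

def model_check_password(password):
    """Sum the five little-endian dwords of a 20-byte input, modulo 2**32."""
    return sum(struct.unpack("<5I", password)) % 2**32

def last_dword(a, b, c, d, target=TARGET):
    """The unique e with a + b + c + d + e == target (mod 2**32)."""
    return (target - (a + b + c + d)) % 2**32

def is_argv_safe(password):
    return not any(byte in ARGV_UNSAFE for byte in password)

def try_filler(filler):
    """Use the same value for a, b, c, d; return the password or None."""
    e = last_dword(filler, filler, filler, filler)
    password = struct.pack("<5I", filler, filler, filler, filler, e)
    return password if is_argv_safe(password) else None
\end{lstlisting}
If \code{try\_filler} returns \code{None}, the candidate contained a problematic byte and we simply call it again with a different filler.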
Let's try something banal: $a = b = c = d =$\code{0x41414141}. We then use Python to compute:
\xcode{python}{e = (0x21DD09EC - 4*0x41414141) \% 2**32}
This returns the number \code{483919080}, or \code{0x1cd804e8} in hexadecimal (we can ask Python to compute this for us by issuing the command \xcode{python}{hex(e)}). We got lucky on our first try: the resulting value of $e$ does not contain any problematic bytes. We now compose our crafted password in hexadecimal:
\lstinputlisting{./exercises/01_collision/pass.hex}
And issue the command:
\xcode{bash}{xxd -r pass.hex > pass}
This writes the password to the file \code{pass}. Run \code{col} with the contents of \code{pass} as the first parameter:
\xcode{bash}{./col `cat pass`}
And this should get the program to display the flag. Now, take a deep breath. The next exercise is somewhat of a milestone, and we're in for a journey before we get there.
\section{Computation at the Machine Level}
\subsection{Machine Code \& Assembly Language}
\wrapimageright{./images/simple_assembly.png}
We can write a program in C, but when we run the program, our machine doesn't actually \textit{run} the C. The part of our machine responsible for running programs only understands machine code.
There are many different machine codes out there, and our machine only understands one. For instance, a standard laptop likely understands "x86" machine code, and the average cell phone likely understands "ARM" machine code. When we compile a C program, what actually happens is that our compiler converts the program into machine code (plus a bunch of metadata to help our machine run it). The machine code implements the functionality we specified in our C source.
At this point the reader might protest, "why do we need both C and machine code, then? This is confusing". That's a good question with a complicated answer. The short of it is that humans are decent at reading and writing C, but less adept at reading and writing machine code; whereas machines can be easily designed to read, write and execute machine code -- but are much more difficult to design to directly read, write or execute C.
Historically, machine code came first. For a while, humans programmed by directly writing machine code, and they were miserable. The C language and its compiler were invented in the 1970s to alleviate this pain somewhat.
Let's take a look at some assembly language. Download the \href{https://www.hex-rays.com/products/ida/support/download_freeware.shtml}{free version of IDA pro disassembler 7.0} and install it on a Windows machine. Copy the below C code to a file named \code{hello\_assembly.c}. Compile it with:
\xcode{bash}{gcc -m32 -o hello_assembly hello_assembly.c}
Copy the resulting program to the windows machine.
\lstinputlisting[language=C]{./code/hello_assembly.c}
Open the \code{hello\_assembly} program in IDA pro. After some grinding and churning, it should display something like the below:
\displayimage{./images/hello_assembly_x86.png}
This is x86 machine code. (Well, fine; machine code is actually ones and zeros -- this is x86 assembly language, which is equivalent but marginally more human-readable. It's possible to see the machine code by choosing \code{options->general} and changing \code{number of opcode bytes} to 10; we recommend it.)
Most people remember the first day they see, with their own eyes, that a humble "hello world" program is secretly this bunch of insufferable nonsense -- a pile of \asm{push}es and \asm{mov}s and \asm{lea}s and what-have-you. We inflicted this trauma on you for good reason; it's a necessary rite of passage.
How does a machine (such as a laptop) run the machine code? Let's say it picks the first instruction after the label \asm{main} (in the picture, that's \asm{lea ecx, [esp+4]}) and starts executing the instructions, one by one. If it runs out of instructions, it screams and dies painfully. This is not exactly true, but close enough to the truth for our purposes. So, when this program is run, first \asm{lea ecx, [esp+4]} is executed; then \asm{and esp, 0xFFFFFFF0}; then \asm{push dword ptr [ecx-4]}; and so forth.
\wrapimageleft{./images/process_memory.jpg}
Every byte in the memory space has an \textit{address}. This includes all the data, all the code for execution, the stack and heap (which we'll learn about later) - everything. It's possible to see address numbers in the IDA disassembler by pressing space; to go back to graph view, press space again.
x86 machine code can run on our laptop because the laptop has an x86-compatible processor. This processor contains a bunch of \textit{registers} - memory stores, each 4 bytes long; and supports a bunch of \textit{instructions}. We've already seen the instructions; these are the \asm{push}es, \asm{mov}s and \asm{lea}s. They typically manipulate the content of registers, or manipulate the control flow so that the next instruction that gets executed is some other instruction, instead of the next one directly below.
To have a passable knowledge of x86 architecture, one should at least know a shortlist of registers and what they are for, as well as a shortlist of instructions and what they are for. We dutifully include both. Don't get too worked up over all the references to the "stack" -- pushing onto it, popping off it, its top and bottom and so on. We'll get into that in a short moment. Right now, it's enough to know that the "stack" is some region of memory, and that it is used for handling function calls and local function variables. The more important thing is to become familiar with these registers and instructions.
\begin{figure}[H]
\centering
\begin{tabular}{|c|c|c|}
\hline
\textbf{register} & \textbf{what it's for} & \textbf{often seen} \\
\hline
\asm{eax} & general purpose & carrying function return values \\
\hline
\asm{ebx} & general purpose & storing values when \asm{eax} is already taken \\
\hline
\asm{ecx} & general purpose & doing the counting for a \xcode{C}{for} loop \\
\hline
\asm{edx} & general purpose & joining with \asm{eax} to store 64-bit numbers \\
\hline
\asm{esi} & general purpose & holding an address to copy bytes from \\
\hline
\asm{edi} & general purpose & holding an address to copy bytes to \\
\hline
\asm{esp} & stack pointer & holding the address of the stack frame top \\
\hline
\asm{ebp} & frame pointer & holding the address of the stack frame bottom \\
\hline
\asm{eip} & instruction pointer & holding the address of the current instruction \\
\hline
\asm{flags} & flags & keeping results of comparisons \\
\hline
\end{tabular}
\caption{Table of choice x86 registers}
\end{figure}
\begin{figure}[H]
\centering
\begin{tabular}{|c|c|c|p{0.5\textwidth}|}
\hline
\textbf{instruction} & \textbf{english} & \textbf{example} & \textbf{english} \\
\hline
\asm{mov} & move & \asm{mov eax, ebx} & copy the value in \asm{ebx} into \asm{eax} \\
\hline
\asm{inc} & increment & \asm{inc ecx} & increase the value of \asm{ecx} by 1 \\
\hline
\asm{dec} & decrement & \asm{dec ecx} & decrease the value of \asm{ecx} by 1 \\
\hline
\asm{cmp} & compare & \asm{cmp eax, ecx} & compare \asm{eax} with \asm{ecx} and remember which is larger \\
\hline
\asm{add} & add & \asm{add eax, ecx} & add the values of \asm{eax} and \asm{ecx} then put the result in \asm{eax} \\
\hline
\asm{sub} & subtract & \asm{sub eax, ecx} & subtract the value of \asm{ecx} from that of \asm{eax}, and then put the result in \asm{eax} \\
\hline
\asm{mul} & multiply & \asm{mul ebx} & multiply the value of \asm{eax} by the value of \asm{ebx}, and then put the 64-bit result in \asm{edx} (most significant 32 bits) and \asm{eax} (least significant 32 bits)\\
\hline
\asm{div} & divide & \asm{div ebx} & treat \asm{edx} as the most significant 32 bits and \asm{eax} as the least significant 32 bits of a 64-bit number; divide that number by the value of \asm{ebx}; put the quotient in \asm{eax} and the remainder in \asm{edx} (phew!) \\
\hline
\asm{and} & and & \asm{and eax, edx} & compute the bitwise and of \asm{eax} and \asm{edx}; put the result in \asm{eax} \\
\hline
\asm{xor} & exclusive or & \asm{xor eax, ecx} & compute the bitwise exclusive-or of \asm{eax} and \asm{ecx}; put the result in \asm{eax} \\
\hline
\asm{lea} & load effective address & \asm{lea edx, [eax+4]} & move the \textit{address} of the second operand to the first operand; puts \asm{eax+4} into \asm{edx}. \\
\hline
\asm{jmp} & jump & \asm{jmp 0x401000} & transfer control flow to the address \code{0x401000} \\
\hline
\asm{jge} & jump if greater or equal & \asm{jge 0x401000} & same as \asm{jmp}, but only if in last comparison, first term was greater or equal \\
\hline
\asm{push} & push & \asm{push ebx} & push the value in \asm{ebx} on top of the stack \\
\hline
\asm{pop}& pop & \asm{pop ebx} & pop a value off the top of the stack and put it in \asm{ebx} \\
\hline
\asm{call} & call & \asm{call 0x401000} & push address of next instruction to the stack, transfer control to \code{0x401000} \\
\hline
\asm{ret} & return & \asm{ret} & pop address off the stack, transfer control there \\
\hline
\end{tabular}
\caption{Table of choice x86 instructions}
\end{figure}
Brackets are used to refer to values by their memory address. For example, \asm{mov ebx, [eax]} means "interpret the value of \asm{eax} as a memory address; copy the 4 bytes there into \asm{ebx} as its new value".
There are some more registers beside the ones mentioned here, and \textit{many, many} more instructions -- too many to list here. Reading programs at the assembly level is a skill that requires a lot of experience and a lot of web searching. This is something you should get comfortable with, but do expect the amount of web searching to decrease with time.
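The \asm{mul} and \asm{div} rows are the densest in the table, so a worked model may help. The Python sketch below mimics the unsigned 32-bit forms described there; the function names are illustrative, registers are passed as plain integers, and the real \asm{div} instruction also faults when the quotient does not fit in 32 bits, which the sketch only signals with an exception.
\begin{lstlisting}[language=python]
MASK32 = 0xFFFFFFFF

def mul(eax, operand):
    """Model of `mul operand`: returns the new (edx, eax)."""
    product = (eax & MASK32) * (operand & MASK32)   # up to 64 bits
    return product >> 32, product & MASK32          # high half, low half

def div(edx, eax, operand):
    """Model of `div operand`: returns the new (edx, eax) = (remainder, quotient)."""
    dividend = ((edx & MASK32) << 32) | (eax & MASK32)   # edx:eax as one number
    quotient, remainder = divmod(dividend, operand & MASK32)
    if quotient > MASK32:
        raise OverflowError("quotient does not fit in eax")
    return remainder, quotient

if __name__ == "__main__":
    edx, eax = mul(0x80000000, 4)      # the product overflows 32 bits
    print(hex(edx), hex(eax))
    edx, eax = div(edx, eax, 4)        # dividing edx:eax by 4 undoes it
    print(hex(edx), hex(eax))
\end{lstlisting}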
\subsection{Thread Stack and Stack Frames}
\wrapimageright{./images/if_statement_assembly.png}
One thing we maybe haven't emphasized enough is how most of the convenient concepts we're used to when programming in C just don't exist in assembly-land. We can't simply write an \xcode{C}{if} statement; instead, we spell out a conditional jump followed by a block of instructions, so that the block is executed only if the jump is not triggered. We can't simply write a \xcode{C}{switch} statement; instead, we have to spell out comparison after comparison of the switch parameter against each relevant value, with every comparison followed by a conditional jump.
Finally, and perhaps most frustratingly, we can't just call a function with a bunch of parameters - and we can't declare local variables, either. Instead, we have to do a complicated dance that's enough of a hassle to get its own subsection, which you're reading right now. To support the implementation of function parameters and local variables, every execution thread in every process has a special memory region called \textit{the stack}.
\wrapimageright{./images/stack.jpg}
A stack, in general, is not a concept unique to assembly language. It's a simple data structure that supports two main operations: \code{push} and \code{pop}. One might imagine it as a, well, stack of numbers. \code{push} puts a new number on top, and \code{pop} removes the number at the top. So, for instance: if we take an empty stack and issue the commands \code{push 2}, \code{push 17} and then \code{push 9}, the stack will now read, from top to bottom: \code{9 17 2}. Issuing the command \code{pop} on this stack will yield the value \code{9}; the stack will then read \code{17 2} (again, from top to bottom). The number last pushed onto the stack is the first to be popped out; for this reason, stacks are said to implement Last-In-First-Out logic (LIFO). Many things in real life operate on LIFO logic, and are therefore modeled well by stacks (e.g. assault rifle cartridges; milk cartons at the grocery store).
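The example above translates directly into Python, where a plain list can play the role of a stack: \code{append} acts as \code{push}, and \code{pop} removes the most recently added element. This is just an illustration of the abstract data structure, with variable names chosen for this example.
\begin{lstlisting}[language=python]
stack = []            # an empty stack; the end of the list is the top

stack.append(2)       # push 2
stack.append(17)      # push 17
stack.append(9)       # push 9

top_to_bottom = list(reversed(stack))   # [9, 17, 2]
popped = stack.pop()                    # removes and returns 9
remaining = list(reversed(stack))       # [17, 2]

print(top_to_bottom, popped, remaining)
\end{lstlisting}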
The stack used by process threads is not much different than that. There are some differences between it and the "standard stack" we have just described, but none of them are too dramatic.
\wrapimageleftcap{./images/stack_frame.png}{schema of stack frame layout}
First, since the stack lives in process memory, it is associated with a range of memory addresses. On x86, the stack grows towards the lower addresses when a value is \code{push}ed onto it: a push first decreases the stack top address and then writes the value there. For example, if the current stack top is at the address \code{0xFFFF4044}, and we push a new 4-byte value onto the stack, the new value will be written into the address \code{0xFFFF4040}. For this reason, stack diagrams are usually drawn with the lower addresses at the top (we'll see such a diagram very soon).
Second, there is a dedicated register that is used to keep track of the current stack top - the register \asm{esp}. This makes possible all sorts of manipulations which aren't possible in the simplified stack that we discussed above. For example, the instruction \asm{sub esp, 0x10} can be interpreted as pushing \code{0x10} bytes onto the top of the stack. Their initial value is undefined, and is effectively whatever happened to already be in memory there. They can be used as a temporary memory store, and when done we can dismiss the store with \asm{add esp, 0x10}.
At this point the reader might object, "why can't we just use \asm{esp-0x8} as a memory store directly, without subtracting anything from \asm{esp}?". The answer is that actually, we can -- but we'd rather not, at the moment. The stack is an intuitive abstraction; if we just treat it as a chunk of memory where we can do as we please, the magic is lost and everything gets much more confusing. Yes, a lot of what we do here is about losing the magic and analyzing the resulting confusion -- but this particular section is about how to properly use the stack. We'll get to abusing it later.
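The two differences described so far can be captured in a toy model. The Python class below is a sketch for intuition only: \code{ToyStack} and its method names are invented here, memory is a dictionary keyed by address, and every stack slot is assumed to hold a 4-byte value. It shows \asm{push} writing just below the old top, and \asm{sub esp, 0x10} reserving scratch space without writing anything.
\begin{lstlisting}[language=python]
class ToyStack:
    """Toy model of a 32-bit x86 stack that grows towards lower addresses."""

    def __init__(self, esp):
        self.esp = esp        # address of the current stack top
        self.memory = {}      # address -> 4-byte value stored there

    def push(self, value):
        self.esp -= 4                     # the top moves to a lower address
        self.memory[self.esp] = value & 0xFFFFFFFF

    def pop(self):
        value = self.memory[self.esp]     # read the value at the top
        self.esp += 4                     # the top moves back up
        return value

    def sub_esp(self, size):
        self.esp -= size                  # reserve space; nothing is written

    def add_esp(self, size):
        self.esp += size                  # dismiss the reserved space

if __name__ == "__main__":
    stack = ToyStack(esp=0xFFFF4044)
    stack.push(0xDEADBEEF)
    print(hex(stack.esp))      # the new value lives at this address
    stack.sub_esp(0x10)
    print(hex(stack.esp))
    stack.add_esp(0x10)
    print(hex(stack.pop()), hex(stack.esp))
\end{lstlisting}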
So, how \textit{is} the stack properly used? For one thing, it's used to implement nearly every convenient C language abstraction that relates to functions: calling, returning, arguments and local variables. The exact details can vary, but the caller and the callee must be in agreement about what protocol to follow when calling and returning. Such a protocol is called a \textit{calling convention}.
One example of a popular calling convention is \textbf{stdcall}. We'll now explain - in broad strokes - how it works. The explanation will assume a certain convenient state of affairs when the program starts; then it'll show how every function does its part to preserve this state of affairs across function calls and returns.
The "convenient state" is as follows:
\begin{itemize}
\item The stack is divided into "frames", starting at the top of the stack. Each frame is associated with a function. The top frame is associated with the function currently executing, $f_0$. The frame below it is associated with $f_1$, the function that called $f_0$. The frame below \textit{that} is associated with $f_2$, the function that called $f_1$. And so on.
\item \asm{esp} has the address of the top of the current stack frame (that's the top of the stack in general); \asm{ebp} has the address of the bottom of the current stack frame.
\item From the top of the current stack frame going down, we have: a space for local variables; a backup of the \asm{ebp} value for $f_1$; a backup of where to resume execution in $f_1$ once $f_0$ returns; the arguments that $f_1$ supplied to $f_0$, in the order that they appear in $f_0$'s signature.
\end{itemize}
Assume that when the program starts, it soon enters the convenient state (everything in this section can be understood perfectly well without understanding how this happens; for now, just take it on faith). From then on, functions make sure to preserve the convenient state. When a function call occurs, the following happens on the assembly level:
\begin{enumerate}
\item $f$ decides to call $g$.
\item $f$ pushes the parameters for $g$ onto the stack, one after the other. The last parameter in $g$'s signature is pushed first, so that when done, the first parameter signature-wise is at the top of the stack.
\item $f$ executes a \asm{call} instruction. The address of where execution should resume in $f$ after $g$ returns is pushed to the stack (this is the address of the instruction immediately following the \asm{call} instruction in memory). Execution is transferred to the first instruction of $g$.
\item $g$ has the bottom of its stack frame already in place -- the parameters and the address to resume execution in $f$; but the top part is missing: from bottom to top there's supposed to be a backed-up value of \asm{ebp} for $f$'s stack frame and a space for local variables. So $g$ completes the frame - it performs a \textit{function prologue}:
\begin{enumerate}
\item $g$ pushes the current value of \asm{ebp} onto the top of the stack.
\item $g$ sets the value of \asm{ebp} to the current value of \asm{esp}, basically declaring the current location of \asm{esp} as the bottom of its own frame.
\item $g$ subtracts a value $S$ from \asm{esp} to make room on the stack for local variables.
\end{enumerate}
\item $g$ has reached the convenient state. It now performs its actual, material duties: computes the sum of an array, displays a dialogue box, or whatever else. Some of the stack arguments may be manipulated, and a return value is put somewhere for $f$ to see (typically in \asm{eax}).
\item Once that's done, $g$ sets out to dismantle its stack frame and return to a convenient state with $f$'s stack frame at the top:
\begin{enumerate}
\item It adds the value $S$ back to \asm{esp}, relinquishing the room on the stack.
\item Now \asm{esp} points to $f$'s value of \asm{ebp}; $g$ pops that value off the stack and puts it back into \asm{ebp}.
\item Now \asm{esp} points to the address where execution should be resumed in $f$. $g$ executes a \asm{ret} instruction; this instruction pops the address off the top of the stack and transfers execution to that address. In stdcall, the \asm{ret} instruction is given an operand (for example, \asm{ret 8} for two 4-byte arguments) that is also added to \asm{esp}, pushing it down past the rest of $g$'s stack frame, which contains the arguments. \asm{esp} is now back where it was before $f$ started pushing $g$'s arguments onto the stack, and execution is back at $f$.
\end{enumerate}
\end{enumerate}
As we said, that's one calling convention, stdcall; but it includes all the core concepts that play out in other calling conventions. For instance, the \textbf{cdecl} convention is virtually the same, except that after $g$ returns, it's $f$ - not $g$ - that is responsible for pushing \asm{esp} down past $g$'s arguments. With \textbf{fastcall}, in contrast, the first few parameters are passed via registers instead of the stack.
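To see the bookkeeping play out, here is a rough simulation of a single call from $f$ to $g$. It is a sketch under loud assumptions: the function \code{simulate\_call}, its parameters, and every address and value in it are hypothetical, and memory is modeled as a Python dictionary. The comments name the assembly instruction that each step stands in for.

\begin{lstlisting}[language=python]
def simulate_call(args, local_bytes, callee_cleans_up=True):
    """Trace esp/ebp through one call from f to g on a toy 32-bit stack.

    callee_cleans_up=True models stdcall, False models cdecl.
    Every address and value below is made up for illustration.
    """
    memory = {}                               # address -> 4-byte value
    regs = {"esp": 0xFFFF4044, "ebp": 0xFFFF4064}
    return_address = 0x08041234               # instruction after the call in f

    def push(value):
        regs["esp"] -= 4
        memory[regs["esp"]] = value

    def pop():
        value = memory[regs["esp"]]
        regs["esp"] += 4
        return value

    # f: push the arguments, last one first, then execute "call g".
    for arg in reversed(args):
        push(arg)
    push(return_address)

    # g: function prologue.
    push(regs["ebp"])                         # push ebp
    regs["ebp"] = regs["esp"]                 # mov ebp, esp
    regs["esp"] -= local_bytes                # sub esp, S

    # g: body. The first argument sits at a fixed offset from ebp.
    first_arg = memory[regs["ebp"] + 8]

    # g: function epilogue.
    regs["esp"] += local_bytes                # add esp, S
    regs["ebp"] = pop()                       # pop ebp
    resume_at = pop()                         # ret pops the return address
    if callee_cleans_up:
        regs["esp"] += 4 * len(args)          # stdcall: ret N skips the arguments
    esp_on_return = regs["esp"]               # what f sees right after the call
    if not callee_cleans_up:
        regs["esp"] += 4 * len(args)          # cdecl: f runs add esp, N itself

    return {"first_arg": first_arg, "resume_at": resume_at,
            "ebp": regs["ebp"], "esp_on_return": esp_on_return,
            "esp_final": regs["esp"]}

stdcall = simulate_call([1, 2, 3], 0x10, callee_cleans_up=True)
cdecl = simulate_call([1, 2, 3], 0x10, callee_cleans_up=False)
print(hex(stdcall["esp_on_return"]), hex(cdecl["esp_on_return"]))
\end{lstlisting}

In both cases \code{esp} and \code{ebp} end up exactly where they started; the only difference is whether the arguments have already been skipped at the moment control returns to $f$.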
Being familiar with all calling conventions can come in handy, but isn't really the point. Someone could write their own assembly or implement their own compiler, with a new calling convention that no one else has seen before. When reading a function's assembly, a key step is understanding how the calling convention works -- even if it's, well, unconventional.
Use IDA Pro to read the assembly of this \code{hello\_assembly} program until it mostly makes sense to you. Pay special attention to the function calls and the loops. A handy feature is that double-clicking on a function name jumps to the address of that function; press \code{esc} to go back.
\subsection{Dynamic Analysis and the Debugger}
Staring at a program's disassembly, or at the program file itself, is a form of \textit{static analysis}. The program does not run; at best, it runs in your head.
In theory it should be possible to answer any question about the code by static analysis alone. In practice, some pieces of assembly will give you an aneurysm if you try to do that. When doing static analysis, the slightest wrong idea about how a function works can lead to a long, fruitless exploration -- and even without making a single mistake, it's easy to spend hours analyzing a huge pile of assembly from first principles, only to realize that it's just the assembly implementation that's so complicated, and the basic concept of what's happening boils down to 3 lines of code.
\wrapimagerightcap{./images/rube_goldberg.jpg}{Better just push it.}
In order to avoid all of that, we need to be familiar with the complementary skill of \textit{dynamic analysis}. Dynamic analysis is the art of grounding oneself in what's actually happening in the program as it runs, as opposed to any belief about what \textit{should} be happening based on the assembly. What value does that register \textit{really} take? What value does that function \textit{really} return? With answers to these questions in hand, we can weed out misconceptions and bypass hours of work. Sometimes, a single pair of function input and output are worth a hundred hours of staring at the assembly.
Dynamic analysis is a whole discipline that has various tools at its disposal, but we're going straight for the crown jewel - the debugger. A debugger is a program that can execute other programs in a controlled environment. Using a debugger, we can step through a program - instruction by instruction - and see in real time what values are being returned from functions, and what memory gets written where. We can even modify the program in real time and see how it responds to the new conditions we've imposed.
The debugger we're going to be using is called \code{gdb}. Edit the text file \code{\textasciitilde /.gdbinit} (if it doesn't exist, create it) and add the following lines to it:
\lstinputlisting{./code/gdbinit_sample}
Now from the terminal, go to the directory that has the \code{hello\_assembly} program and execute the command \xcode{bash}{gdb hello\_assembly}. In the \code{(gdb)} prompt, execute the command \gdb{b *main} and then \gdb{r}.
\displayimagecap{./images/hello_assembly_debugger.png}{debugger state after the breakpoint is hit.}
The \code{hello\_assembly} program is now running, but in stasis - frozen right before the first instruction of the \code{main} function. It's waiting for directions from the debugger. The top window is a "registers window", which displays the current value of every register. For example, \code{eip} is pointing at the address of the \asm{lea ecx, [esp+0x4]} instruction. From this display, one can determine the current position of the stack, as the address of the top of the stack is the value of \code{esp}.
Below the registers window lies the disassembly window. This window displays the part of the program that is currently being executed. It's possible to scroll with the up and down arrows to look around the surrounding assembly code. The next instruction for execution is highlighted in white.
Execute the command \gdb{stepi} ("step to next instruction"). It will execute the current instruction and move to the next. The white highlighted line in the disassembly window will move to the next instruction, and registers that had their values changed by the most recent instruction will be highlighted in white as well in the registers window (these are \asm{eip} and \asm{ecx}; make sure you understand why these were changed, and not others).
In theory, we now have everything we need to use a debugger: we can just step through the whole program, instruction after instruction, and see what happens. But that's not very convenient. We are probably interested in some parts of execution, and not others. For one thing, we're bound to run across some "utility functions" inserted by the compiler that don't even encapsulate any proper program logic; if we step into those, we'll be spending quite a long while inside with no new insight to show for our trouble. Surely the debugger offers some tools to make the task easier.
The most important such tool is \textit{breakpoints}. We can set a breakpoint on a specific address and then just let the program run, instead of stepping again and again. When execution reaches the breakpoint's address, the program will freeze and control will be handed back to the debugger. From there, we can examine registers, single step and so on, just like we could earlier.
Breakpoints are set with the \gdb{b} ("break") command -- \gdb{b *addr} (symbols and addresses should be preceded with asterisks; otherwise, gdb thinks we're talking about a source line and not an instruction address, and gets confused). Just a few paragraphs ago we set a breakpoint on the \code{main} function with the command \gdb{b *main}; while \gdb{main} is not an address, it is a symbol associated with an address, so that works too (we could have also written \gdb{main}'s address explicitly: \gdb{break *0x565556a2}, but be aware that this value may be different on your end). The command \gdb{i b} ("info breakpoints") displays a list of active breakpoints; it's also possible to delete breakpoints with the \gdb{d} ("delete") command by the breakpoint number, which appears in the output of \gdb{i b}. For example, to delete breakpoint 1, do \gdb{d 1}. Now, find the breakpoint number of the breakpoint we set on \gdb{main} -- then delete it, and create it again.
When \gdb{gdb} first starts, the program is not running yet. So, after we've set the breakpoints that we want, we should get the program running with the \gdb{r} ("run") command (we did this earlier, too). If we hit a breakpoint while the program is already running, and want to continue running the program, we can use the \gdb{c} ("continue") command. Take note that the "run" command supports command-line arguments, and even IO redirection! Try loading \gdb{hello\_assembly} from scratch and issuing the command \gdb{r > test\_output.txt}. When the process exits, issue the command \gdb{q} ("quit"). The output of the \gdb{hello\_assembly} executable should appear in a new file named \gdb{test\_output.txt}.
The \gdb{ni} ("next instruction") command is similar to \gdb{stepi}, but will elegantly skip \asm{call} instructions instead of iterating into the function being called (in many debuggers, this is called "step over"). This is implemented by letting the program run until the \code{call}ed function returns, so be careful if you suspect that the function behind the \code{call} is malformed and doesn't return properly! The command \gdb{finish} will continue executing until the \textit{current} function returns, and execution is transferred back to the caller. Try stepping over a function call in the "hello assembly" program with \gdb{ni}; then try stepping into a function with \gdb{stepi} and immediately back out with \gdb{finish}.
One other important tool to be familiar with is the \gdb{x} ("examine") command. This command can be used to look around the program and answer questions like "what instructions are at that address?" and "what values are on the stack right now?". It's kind of a swiss army knife and may take time to get the hang of. The command takes the form of \gdb{x/nfu addr} where:
\begin{itemize}
\item \gdb{n} is the number of units to display (how large a unit is depends on \gdb{u}, below)
\item \gdb{f} is a single letter that specifies what format to use when displaying the data: \gdb{s}tring, \gdb{i}nstructions, he\gdb{x}adecimal, \gdb{a}ddress, \gdb{c}haracter
\item \gdb{u} is the unit size for aggregating data: \gdb{b}yte, \gdb{h}alfword (2 bytes), \gdb{w}ord (4 bytes), \gdb{g}iantword (8 bytes). This is not relevant for all formats; e.g. it makes little sense to use it with the \gdb{i}nstructions format.
\item \gdb{addr} is the address of the data to be displayed.
\end{itemize}
(You may be used to "word" meaning "2 bytes" and "dword", or "double word", meaning "4 bytes". This is kind of like how the text editor \xcode{bash}{vim} and the window manager \xcode{bash}{i3} use the exact same navigation scheme, except \xcode{bash}{vim} uses the keys \code{hjkl} whereas \xcode{bash}{i3} uses the keys \code{jkl;}. Just take a deep breath, count to ten and drink a cold glass of water.)
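To get a feel for how the count and the unit size interact, here is a small stdlib-only imitation of the hexadecimal format. It is not how \code{gdb} is implemented; the helper \code{examine\_hex} and the 16 sample bytes are made up, and little-endian byte order is assumed, as on x86.

\begin{lstlisting}[language=python]
import struct

UNIT_SIZES = {"b": 1, "h": 2, "w": 4, "g": 8}
UNIT_CODES = {"b": "B", "h": "H", "w": "I", "g": "Q"}

def examine_hex(memory, count, unit):
    """Mimic the shape of gdb's x/<count>x<unit> on a bytes object.

    Returns `count` little-endian units rendered as hex strings.
    """
    size = UNIT_SIZES[unit]
    fmt = "<%d%s" % (count, UNIT_CODES[unit])
    values = struct.unpack(fmt, memory[:count * size])
    return ["0x%0*x" % (size * 2, v) for v in values]

memory = bytes(range(0x10, 0x20))            # 16 made-up bytes: 10 11 12 ... 1f
as_bytes = examine_hex(memory, 4, "b")       # like x/4xb: four 1-byte units
as_words = examine_hex(memory, 4, "w")       # like x/4xw: four 4-byte units
print(as_bytes)
print(as_words)
\end{lstlisting}

The same count of 4 covers 4 bytes in the first call and 16 bytes in the second, and within each 4-byte word the bytes appear reversed because of the little-endian order.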
Try these out:
\begin{itemize}
\item \gdb{x/32xw $esp} - show the 32 top 4-byte values starting from the top of the stack going down. (It's \gdb{$esp} and not just \gdb{esp}; it's necessary to prepend a \gdb{$} when referring to registers.)
\item \gdb{x/5i *draw\_triangle} - show the first 5 assembly instructions of the \gdb{draw\_triangle} function
\item Get execution to just before the first \asm{call} to \xcode{C}{puts} is executed, so that the \asm{call} instruction is highlighted. Then issue the commands \gdb{x/13xb $eax} and \gdb{x/s $eax} to see the input argument to \gdb{puts} (we can't 100\% guarantee this will work).
\end{itemize}
Other than \gdb{$eip}, \gdb{$ebx} and other registers, there are other \textit{pseudo-registers} that exist for our convenience. \gdb{$\_} resolves to the last address we have examined; \gdb{$\_\_} resolves to the \textit{value} in the last address we have examined. (Try \gdb{x/xw $esp}, and then \gdb{x/32xw $\_} immediately afterwards). This may also be a good place to mention that \gdb{gdb} can be run from the command line as-is without specifying which executable to debug; it's possible to load a file from inside \gdb{gdb} with the \gdb{file} command. (Try this now: start \gdb{gdb} and do \gdb{file hello\_assembly}.)
\gdb{gdb} supports scripting. This is done via the \gdb{source} command; if we create a file with \xcode{bash}{gdb} commands in it named \code{script.gdb}, then inside \xcode{bash}{gdb} execute \gdb{source script.gdb}, all the commands in the script will be executed. We can even specify a script for \xcode{bash}{gdb} to run from the command line: \xcode{bash}{gdb -x script.gdb}. Try it out -- create a file named \xcode{bash}{hello\_assembly.gdb} with the following contents:
\lstinputlisting{./code/hello_assembly.gdb}
Then run:
\xcode{bash}{gdb -x hello\_assembly.gdb}
When prompted to press return to continue, do so. When done, the debugging session should display an active prompt after it's hit the breakpoint in the first call to the function \xcode{C}{fib\_recursion}, and output all the register values at that point to the terminal.
\subsection{x64 assembly}
In x86 assembly, the size of an address pointer is 4 bytes -- enough to be able to address $2^{32}$ bytes or 4GB of memory. As demands on computation grew, this 4GB gradually became more and more crowded until processor companies intervened and introduced 64 bit processors. These have their own assembly, which uses 8-byte pointers and 8-byte registers. Register names start with \code{r} -- so it's \asm{rax} instead of \asm{eax}, \asm{rip} instead of \asm{eip}, and so on. x64 assembly is typically used with a calling convention where parameters are passed via registers, and not the stack.
Of course there are other differences, but that's all the extra knowledge necessary to tackle x64 assembly where it crops up in these exercises.
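The arithmetic behind the pointer sizes can be checked quickly. In this sketch the sample address is the one that appeared in the breakpoint example earlier; little-endian byte order is assumed.

\begin{lstlisting}[language=python]
import struct

addressable_32 = 2 ** 32                      # distinct addresses with 4-byte pointers
gib = addressable_32 // 2 ** 30               # the same amount, counted in GiB

address = 0x565556A2                          # sample address from the gdb section
as_x86_pointer = struct.pack("<I", address)   # 4 bytes, little-endian
as_x64_pointer = struct.pack("<Q", address)   # 8 bytes, little-endian
print(gib, as_x86_pointer.hex(), as_x64_pointer.hex())
\end{lstlisting}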
\subsection{Final Word on Assembly}
People new to assembly see a soup of instructions, and think in terms of what the processor is doing. Those more experienced see \textit{patterns}, and think in terms of what the \textit{compiler} was trying to accomplish when it generated that assembly. Being comfortable with reading assembly, most of all, takes experience. If from this point on you find that static and dynamic analysis of assembly code are the bottleneck holding you back, you may want to study the subject more thoroughly before continuing (we personally recommend the textbook \textit{Practical Malware Analysis}).
\section{Exploitation Basics: Buffer Overflow}
\wrapimageright{./images/buffer_overflow.png}
We're going to take another look at that clever "function calls using the stack" convention that we talked about a while back -- because unfortunately, it's not just clever but also fundamentally broken. How broken? Suppose we run an innocent service on our machine that takes an MS-word document, converts it to \code{pdf} format, and sends the result back. Further suppose the server program was compiled using that "stack handles function calls" trick we discussed, and takes user input via simple functions off the C standard library such as \xcode{C}{gets}. Then a malicious user can come up to our server and say "hi" in a very specific way, such that our server is compelled to send 70,000 spam messages to everyone in our contact list, and then wipe clean all the data on it.
This shouldn't sound right to you. Most programs \textit{do} use stack-based calling conventions, and yet the internet is \textit{not} the Wild West where anyone who talks to a server can commandeer it. That's because once the danger became clear, people figured out all sorts of defenses and mitigations that can be used to prevent the attack. We'll get to those later; let's first understand the basic attack and how it works.
The attack is called a \textbf{buffer overflow}, and we already know everything we need to know to understand how it works. Suppose a program calls a function $f$ and suppose that one of $f$'s local variables is a string that lives on the stack. Further suppose that at some point, $f$ calls \xcode{C}{gets} to consult with the user and get a value for this variable. \xcode{C}{gets} just copies user input into the variable address blindly; it doesn't care about stacks, allocations or common sense, and will keep going on and rewriting memory until it's out of input.
From an attacker's point of view, this is an invitation to party. They can forge an input long enough to overwrite everything on the stack, starting at the variable they were supposed to give a value for and going down, wiping out all the other variables lower on the stack until they finally reach the backed-up "return here later" value, then overwrite that value with any value they wish. When $f$ returns, \asm{eip} will take that value. The attacker, therefore, can gain control of the program execution and divert it anywhere.
The attack isn't over. A proper attack has two stages: controlling execution and running code. To get to the second stage, the attacker has to write their code somewhere and reliably produce an \asm{eip} value that will result in the code being executed. But let's not get ahead of ourselves; the upcoming exercise will focus on the overwriting of stack values with precision.
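The overwrite itself can be illustrated without touching a real process. The sketch below models a single stack frame as a \code{bytearray}; the 16-byte buffer size, the saved values and the helper names are all hypothetical, and a real compiled frame may contain padding or other variables between the buffer and the return address.

\begin{lstlisting}[language=python]
import struct

BUF_SIZE = 16

def build_frame():
    """Toy frame, lowest address first: buffer, saved ebp, return address."""
    frame = bytearray(BUF_SIZE)
    frame += struct.pack("<I", 0xFFFF4064)    # saved ebp (made-up value)
    frame += struct.pack("<I", 0x08041234)    # return address (made-up value)
    return frame

def unbounded_copy(frame, user_input):
    """Copy input into the buffer with no length check, the way gets would."""
    frame[0:len(user_input)] = user_input

def return_address(frame):
    (value,) = struct.unpack("<I", frame[BUF_SIZE + 4:BUF_SIZE + 8])
    return value

frame = build_frame()
unbounded_copy(frame, b"hello")
benign = return_address(frame)                # short input: untouched

# 16 bytes fill the buffer, 4 more cover the saved ebp, the last 4 land
# exactly on the return address.
payload = b"A" * BUF_SIZE + b"B" * 4 + struct.pack("<I", 0xCAFEBABE)
unbounded_copy(frame, payload)
hijacked = return_address(frame)              # long input: attacker's value
print(hex(benign), hex(hijacked))
\end{lstlisting}

The "precision" mentioned above is visible in the payload: the attacker needs to know how many filler bytes separate the start of the buffer from the return address.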
\section{Scripted Process Interaction}
\wrapimageright{./images/robot_laptop.jpg}
Often, we'll want to provide input to a program that depends on program behavior at run-time. For example, suppose we're playing a game and we want to respond to challenges being posed to us, which vary with each playthrough. We can't pre-compile all our input with \xcode{bash}{xxd} or such and use IO redirection; before the program runs, we don't know what our input is going to be. Without a tool suited to this obstacle, we're back at square 1, dealing with exactly the same problem of providing complex input to a program without losing our mind in the process.
Our first thought might (or might not) be to do something with Linux IO redirection and pipes. Pipe the program output to a file, then pipe that file to our script, and have the script output pipe somehow back to the program input. This would be the textbook solution to this issue, except it doesn't work due to wonky buffering optimizations that kick into gear the moment a program is interacting with a pipe, instead of the terminal. This really isn't the place for a deep dive into that subject, but the bottom line is that the program writes "hello! Please give me your input" and the standard I/O library underneath it sits there, smugly saying to itself "ha ha, there's no human at the other end of this pipe so I'm just going to procrastinate until I feel that I absolutely have to write this to stdout before someone gets angry". The program does have the ability to declare "I am getting angry, now go and do your job" (by flushing its output buffer), but if we don't have the permissions to modify the program for it to actually say that, well, tough luck.
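The root of the behavior is that a program can tell whether its output goes to a terminal or to a pipe, and standard I/O libraries typically pick their buffering mode accordingly. The following sketch only demonstrates the detection part: it spawns a child Python interpreter (our own stand-in for "the program") with its output attached to a pipe and asks it what it sees.

\begin{lstlisting}[language=python]
import subprocess
import sys

# Child program: report whether its stdout is attached to a terminal.
CHILD = "import sys; print(sys.stdout.isatty())"

result = subprocess.run(
    [sys.executable, "-c", CHILD],
    stdout=subprocess.PIPE,          # the child's stdout is now a pipe
    universal_newlines=True,         # decode the captured output as text
)
child_saw_terminal = result.stdout.strip()
print(child_saw_terminal)
\end{lstlisting}

Running the same one-liner directly in a terminal reports the opposite, which is exactly the difference that the buffering logic keys on.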
Due to the above, the existing tools for scripted process interaction are all either kludgy work-arounds, or are built on top of kludgy work-arounds, and contain a big chunk of non-trivial hideous code. You should \textit{never ever} try to implement process interaction by yourself from primitive shell features if you're not looking for a big, fat, time-consuming, put-a-hold-on-everything-else learning opportunity. Use a ready-made solution and be glad that some other poor soul had already gone to the trouble of creating it.
Our personal tool of choice for process interaction is \xcode{python}{pexpect}. This is a pure-Python library modeled after a unix utility called \xcode{bash}{expect}; since \xcode{bash}{expect} has its own dedicated syntax, we believe that skipping it and going straight for the Python library is a mentally healthier approach.
To use \xcode{python}{pexpect}, like any other python library, we first have to install it via \xcode{bash}{pip} and then import it. We can then create a new process by \xcode{python}{p = pexpect.spawn(cmd)}, which is very convenient because \xcode{python}{cmd} can be, let's say, \xcode{python}{nc <some_server> <some_port>}, and pexpect will seamlessly interact with the remote server just like it would with a local process. One way or the other, once the \xcode{python}{spawn} call goes through, the variable \xcode{python}{p} refers to a "process" object. The three main methods supported by that object are:
\begin{itemize}
\item \xcode{python}{setecho} -- Sets whether input sent to the process will be echoed to the terminal or not. Takes one parameter of either \xcode{python}{True} or \xcode{python}{False}.
\item \xcode{python}{expect} -- Takes a list of strings as a parameter (regular expressions are also supported, if that helps). This method lets the process run until its output contains at least one of the strings, and then returns the index of that string in the list. For instance, if the command \xcode{python}{p.expect(["hello", "goodbye"])} is issued and the process prints \xcode{bash}{hello world!} then the call to \xcode{python}{expect} will return 0. Some built-in constants are also supported in addition to the strings: for example, \xcode{python}{pexpect.EOF} will trigger if there is no more process output and none of the other strings were found.
\item \xcode{python}{send} -- takes a string and passes that string as input to the spawned program. To add an automatic line feed at the end of the input (as a human typically would), use \xcode{python}{sendline} instead.
\end{itemize}
To actually examine spawned program output and react accordingly, familiarity is required with these two fields of the "process" object:
\begin{itemize}
\item \xcode{python}{p.before} -- holds a slice of the spawned program output, starting right after the end of the previous \xcode{python}{expect} match and ending right before the first character of the current \xcode{python}{expect} match.
\item \xcode{python}{p.after} -- holds the contents of the current \xcode{python}{expect} match.
\end{itemize}
For example, if the spawned program outputs "lorem ipsum dolor sit amet" and we call \xcode{python}{p.expect("ipsum")} and then \xcode{python}{p.expect("sit")}, then \xcode{python}{p.before} will hold \xcode{python}{" dolor "} and \xcode{python}{p.after} will hold \xcode{python}{"sit"}. You can do some of your own experimentation by calling \xcode{python}{pexpect.spawn("echo <choose some wacky text here>")} and then trying various combinations of \xcode{python}{p.expect(...)}, \xcode{python}{p.before} and \xcode{python}{p.after} to see how \xcode{python}{pexpect}'s internal states behave and what output is generated.
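If the bookkeeping is still fuzzy, the following stdlib-only imitation may help. It is emphatically not \code{pexpect}'s implementation: the class \code{ExpectEmulator} is our own invention, and it works on a fixed string instead of live process output. (With the real library under Python 3, \code{before} and \code{after} hold \code{bytes} unless an encoding is passed to \code{spawn}.)

\begin{lstlisting}[language=python]
import re

class ExpectEmulator:
    """Stdlib-only imitation of how before/after relate to a match.

    This is not pexpect's implementation; it only mirrors the bookkeeping
    on a fixed string instead of live process output.
    """

    def __init__(self, output):
        self.remaining = output      # output not yet consumed by a match
        self.before = None
        self.after = None

    def expect(self, pattern):
        match = re.search(pattern, self.remaining)
        if match is None:
            raise ValueError("pattern not found: %r" % pattern)
        self.before = self.remaining[:match.start()]   # text up to the match
        self.after = match.group(0)                    # the match itself
        self.remaining = self.remaining[match.end():]  # consume through the match

p = ExpectEmulator("lorem ipsum dolor sit amet")
p.expect("ipsum")
first = (p.before, p.after)
p.expect("sit")
second = (p.before, p.after)
print(first, second)
\end{lstlisting}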
Let's give a more meaty example containing a toy use case for \xcode{python}{pexpect}. Consider the following simple Python script, which implements a calculator that takes an integer $i$ as input, picks a random integer $j$, computes $i+j$ and reports the result:
\lstinputlisting[language=python]{./code/lousy_calc.py}
This script is maybe not the best possible implementation of its stated goal (try to use it a few times to really get a feel for it). Suppose that we cannot make any changes to this script, but we need to use it anyway on a regular basis. We can create a pexpect-based wrapper around it, like so:
\lstinputlisting[language=python]{./code/pexpect_demo.py}
Try to use this script a few times to get a feel for it, too. Try to toy around with it and tweak the wrapper behavior. Write a simple \code{pexpect} wrapper for a program of your choice; let the wrapper do some meaningful processing of the program output.
An alternative to \xcode{python}{pexpect} is \xcode{python}{pwntools}, which offers process interaction faculties (among many other features). We personally prefer \xcode{python}{pexpect}, due to \code{pwntools}' lack of support for Python 3 at the time of writing (support was added later, in pwntools 4.0) as well as its violation of the unix philosophy (do one thing and do it well). But be our opinion as it may, \code{pwntools} is installed on the pwnable servers and \code{pexpect} is not, which means that under certain plausible circumstances, we'll be using \code{pwntools} whether we like it or not (more on this directly below, in "Mock vs. Target Environments").
The pwntools API is very similar to that of pexpect: it's \xcode{python}{process} instead of \xcode{python}{spawn} and \xcode{python}{recvuntil} instead of \xcode{python}{expect}. \xcode{python}{process} receives a list of arguments (as in the more mainstream \xcode{python}{subprocess} module); \xcode{python}{recvuntil} doesn't modify fields in the process object, and instead returns the equivalent of \xcode{python}{p.before+p.after}. If that sounds kind of hand-wavy, in some of the later exercises we'll present some sample code that makes use of \code{pexpect}.
\section{Mock vs. Target Environments}
This is the final note before we actually reach the next exercise, and it's not very technical, so bear with us!
\displayimage{./images/bearwithme.png}
When CTFing, and in general when trying to manipulate a target environment, there is a certain trade-off involved.
\wrapimagerightcap{./images/expectation_vs_reality.jpg}{Typical experience when developing exploit in mock environment first (illustration).}
On the one end of the spectrum, we set up our mock environment from scratch on our own machine. We compile the source code ourselves. We fabricate unknowns: we don't know the contents of \code{key.txt} in the target environment, so we create a file in our mock environment with that name and the contents \code{test\_key\_here}. In this environment we can debug every issue, pin-point exactly where our attack is going wrong, remove distractions, add convenient \xcode{C}{printf}s. We perfect our attack, launch it on the target environment -- and the attack fails miserably. This is because it implicitly relies on 521 different features of our mock environment, 77 of which we don't even realize exist and 4 of which are Python dependencies.
On the other end of the spectrum, we attack the actual target environment from the get-go, confident in the knowledge that if we succeed, we are immediately done. We don't succeed. There are 31 different issues with our first draft of a solution: bugs in the implementation, subtle errors in the logic, flaws in the entire approach. To untangle these, all we have to go on are ominous error messages and segmentation faults -- and, if we're lucky, a laggy debugging session fraught with \code{SIGALRM}s and other roadblocks, annoyances and surprises baked into the program by the person who compiled it originally, and hates us on a personal level. Hours later, we are zero percent closer to understanding what any of the 31 issues even are.
\wrapimageleftcap{./images/target_environment.png}{Typical experience when developing exploit directly in target environment (illustration).}
It's not at all clear how to deal with this dilemma, and there is probably no right answer. We've found that what works best for us is starting with a completely controlled mock environment, but doing some thorough thinking instead of rushing ahead with a solution. Suppose for the sake of the argument that we have a working solution for the mock environment. Is it \textit{salvageable} for the target environment? How? With how much work? What constraints from the target environment are game-changers, and should clearly be considered straight away?
Making these decisions in an informed, deliberate manner will save us a lot of wasted time, and will help manage our expectations for when the toy solution is done. To make such informed decisions, it's a great help to know what (usually) varies between compilations, what (usually) varies between machines and what (usually) varies between different runs of the same program. Most of all, if possible, it's a great help to compare and contrast the mock and target environment behavior dynamically. Run the program on both environments separately; use a debugger if necessary; map out an attack plan and verify in detail that it will survive contact with the target environment.
A rule of thumb is "what fails in the mock environment will also fail in the target machine" -- meaning that until we have a working solution in our controlled environment, we shouldn't even bother contacting the remote server. While this is almost always true, it \textit{might} be the case that a subtle misconfiguration on our end is foiling an otherwise perfectly good attack. Keep this remote possibility in mind; it might make for a good Hail Mary when all else fails.
Perhaps the biggest issue we have to take into account when considering toy vs real solutions is that of dependencies. Take a good look at the set of dependencies available on the remote server before constructing, or even designing, a solution. If a dependency we were eyeing is missing on the remote server, we can try one or more of the following solutions:
\begin{itemize}
\item \textbf{install the missing dependencies}. We probably don't have permissions to do this, but it is worth a shot.
\item \textbf{use a static, compiled language} such as C, C++, Rust or Golang that we feel comfortable and productive with (that's relatively speaking, of course; nothing is as comfortable and productive as Python). Compile for the target machine architecture and simply run the executable in the target environment. This may be a handful of work, but will eliminate the vast majority of dependency issues.
\item \textbf{use Python with pyinstaller}. \xcode{bash}{pyinstaller} can convert Python scripts into executable programs:
\xcode{bash}{pyinstaller --onefile script.py}.
\item \textbf{force executables to be linked statically}. Some run-time libraries we're relying on might not be available on the target environment, which will cause executables to crash and complain about missing libraries. If the programming language being used supports static compilation, consult the compiler documentation on how to perform one, as this will resolve the issue. Otherwise, another solution is to use \xcode{bash}{staticx} on a dynamically-linked executable to obtain a statically-linked version: e.g. \xcode{bash}{staticx input_dynamic_executable output_static_executable}.
\item \textbf{look for an alternative on the target system}. For instance, as we mentioned before, the pwnable servers don't have the \code{pexpect} Python library installed, but do have \code{pwntools}, which offers analogous functionality. \textbf{Note} that older pwntools releases are only implemented for Python 2 and not Python 3, so check which version is installed on the target.
\item \textbf{sigh and implement the functionality ourselves}. We should proceed with caution. If we are not super comfortable with the problem domain, chances are this will end badly.
\end{itemize}
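Before picking one of these options, it helps to find out what the target actually has. The following is a small survey script that can be run in both environments and compared; the function \code{survey} and the particular lists of module and tool names are only examples (\code{pwn} is the import name of \code{pwntools}).

\begin{lstlisting}[language=python]
import importlib.util
import shutil

def survey(modules, tools):
    """Report which Python modules and command-line tools are available."""
    report = {}
    for name in modules:
        # find_spec returns None when a top-level module cannot be found.
        report["module:" + name] = importlib.util.find_spec(name) is not None
    for name in tools:
        # which returns None when the executable is not on the PATH.
        report["tool:" + name] = shutil.which(name) is not None
    return report

report = survey(["pexpect", "pwn", "struct"], ["gdb", "nc", "python3"])
for item, available in sorted(report.items()):
    print("%-16s %s" % (item, "yes" if available else "MISSING"))
\end{lstlisting}

The script itself relies only on the Python 3 standard library, so it should not add a dependency problem of its own.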
\section{Challenge 0x02: bof}
\exerciseopen{./images/02_bof.png}{Mama told me that the buffer overflow is one of the most common software vulnerabilities. Is it true?
Download: http://pwnable.kr/bin/bof
Download: http://pwnable.kr/bin/bof.c
running at: nc pwnable.kr 9000}
We are given the following program: