\chapter[The CHERI-x86-64 ISA (Sketch)]{The CHERI-x86-64 Instruction-Set Architecture (Sketch)}
\label{chap:cheri-x86-64}
\rwnote{New introduction is required, and some change of pitch.}
In this chapter, we explore models for applying CHERI protection to the x86
architecture.
The x86 architecture is a widely deployed CPU architecture used in a
variety of applications ranging from mobile to high-performance computing.
The architecture has evolved over time from 16-bit processors without
MMUs to present-day systems with 64-bit processors supporting virtual
memory via a combination of segmentation and paging.
The x86 architecture has spanned three register sizes (16, 32, and
64 bits) and multiple memory management models. We choose to define
CHERI solely for the 64-bit x86 architecture for a variety of reasons
including its more mature virtual-memory model, as well as its larger
general-purpose integer register file.
\section{CHERI-x86-64 Approach}
In applying CHERI to the 64-bit x86 architecture, we aim to provide a
model similar to CHERI-RISC-V and Morello. This model should have the
following properties:
\begin{itemize}
\item A new capability hardware type that is usable for C language
pointers.
\item Capability values should be intentionally used. Instructions
should explicitly specify whether a register operand should be used as a
capability or an integer scalar. Specifically, the presence (or
lack) of a tag should not determine if a value is treated as a
capability rather than an integer.
\item While new instructions will be required to manipulate
capabilities, common code patterns for pure-capability C such as
function prologues and epilogues should use similar opcode density
to 64-bit x86.
\end{itemize}
\subsection{Capability Registers versus Segments}
The x86 architecture first added virtual memory support via
relocatable and variable-sized segments. Each segment was assigned a
mask of permissions. Memory references were resolved with respect to a
specific segment including relocation to a base address, bounds
checking, and access checks. Special segment types permitted transitions
to and from different protection domains.
These features are similar to features in CHERI capabilities.
However, there are also some key differences.
First, x86 addresses are stored as a combination of an offset and a
segment spanning two different registers. General-purpose registers
are used to hold offsets, and dedicated segment selector registers are
used to hold information about a single segment. The x86 architecture
provides six segment selector registers -- three of which are reserved
for code, stack, and general data accesses. A fourth register is
typically used to define the location of thread-local storage (TLS).
This leaves two segment registers to use for fine-grained segments
such as separate segments for individual stack variables. These
registers do not load a segment descriptor from arbitrary locations in
memory. Instead, each register selects a segment descriptor from a
descriptor table with a limited number of entries. One could treat
the segment descriptor tables (or portions of these tables) as a cache
of active segments.
Second, more fine-grained segments are not derived from existing
segments. Instead, each entry in a descriptor table is independent.
Write access to a descriptor table permits construction of arbitrary
segments (including special segments that permit privilege
transitions). Restricting descriptor-table write access to kernel
mode does not protect against construction of arbitrary segments in
kernel mode due to bugs or vulnerabilities. As a result, segment
descriptors are not able to provide the same provenance guarantees as
tagged capabilities.
Third, existing segment descriptors do not have available bits for
storing types or permissions more expressive than the existing
read, write, and execute.
Finally, x86 segmentation is typically not used in modern operating
systems. On the 32-bit x86 architecture, systems generally create
segments with infinite bounds and use a non-zero base address only
for a single segment that provides TLS. The 64-bit x86 architecture
codifies this by removing segment bounds entirely and supporting non-zero-base
addresses only for two segment registers.
Software for x86 systems stores only the offset portion of virtual
addresses in pointer variables. Segment registers are set to fixed
values at program startup, never change, and are largely ignored.
One approach for providing a similar set of features to CHERI
capabilities on x86 would be to extend the existing segment primitives
to accommodate some of these differences. For example, descriptor-table
entries could be tagged, whereby loading an untagged segment would trigger
an exception. However, some other potential changes are broader in
scope (e.g., whether segment selectors should contain an index into a
table, versus a logical address of a segment descriptor). Extending
segments would also result in a very different model compared to CHERI
capabilities on other architectures, limiting the ability to share code
and algorithms. Instead, we propose to add CHERI capabilities to 64-bit
x86 by extending existing general-purpose integer registers.
\subsection{Common Architectural Features}
CHERI-x86-64 shares the following features with other CHERI
architectures:
\begin{itemize}
\item Tagged memory with capability-width tag granularity and alignment.
\item Registers able to hold capabilities are tagged.
\item \CIP{} controls program-counter-relative fetches.
\item \DDC{} controls memory operands using integer addresses.
\item Floating point is fully supported, including capability-relative
floating-point load and store instructions.
\item General-purpose registers are extended to hold capabilities.
\item It is never left ambiguous as to whether a register operand used
as the base address of a memory operand or branch target
is a capability and therefore must have a tag set.
\item \cappermASR limits privileged ISA operations when within
privileged rings.
\end{itemize}
\subsection{Unique Architectural Features}
The following changes are specific to CHERI-x86-64:
\begin{itemize}
\item CHERI-x86-64 makes use of opcode prefixes to permit altering
the addressing mode and operand size of individual instructions,
both in 64-bit mode and capability mode.
\item \RIP{} is the full integer value (virtual address) of \CIP{}
and not \CIP{}.offset.
\item Integer addresses are treated as absolute virtual addresses
bounded by \DDC{}, and are not treated as offsets to \DDC{}.base.
\item x86 exception handling is extended to support capabilities
including a new architectural stack frame for exception entry and
return.
\item A new exception code is used to report CHERI-related
exceptions.
\item New PTE bits and page-fault exception code bits are defined for
loading and storing capabilities in memory.
\item The \FSBASE{} and \GSBASE{} registers are extended as
capabilities.
\item As with CHERI-RISC-V, the \cflags{} field contains a single bit
used to enable capability mode in code capabilities installed into
\CIP{}.
\item Operations on capabilities can set bits in the \RFLAGS{}
register.
\end{itemize}
\section{CHERI-x86-64 Specification}
\subsection{Tagged Capabilities and Memory}
As with CHERI-RISC-V, we recommend that both memory and
registers contain tagged 128-bit capabilities.
Since capabilities require 16-byte alignment in memory, attempts to
load or store capabilities at misaligned addresses should raise a
General Protection Fault with an error code of zero, similar to
misaligned loads and stores of SSE registers.
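
The alignment rule above can be sketched as a small executable model.
This is only an illustration under stated assumptions: the exception class
and the function name are hypothetical and are not part of any proposed
interface.

```python
CAP_SIZE = 16  # bytes occupied by one 128-bit capability in memory


class GeneralProtectionFault(Exception):
    """Hypothetical stand-in for a General Protection Fault with an error code."""

    def __init__(self, error_code):
        super().__init__(f"#GP({error_code})")
        self.error_code = error_code


def check_capability_alignment(address):
    """Return the address unchanged, or raise #GP(0) if it is not 16-byte aligned."""
    if address % CAP_SIZE != 0:
        raise GeneralProtectionFault(0)
    return address
```

The model covers only the alignment check; tag handling and capability
bounds are not represented.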
\subsection{General-Purpose Capability Registers}
The x86 architecture has expanded its general-purpose integer registers multiple
times. Thus, the 16-bit \AX{} register has been extended to 32-bit \EAX{}
and 64-bit \RAX{}.
We propose extending each general-purpose integer register to a tagged, 128-bit register
able to contain a single capability.
The capability-sized registers would be named with a `C' prefix in place
of the `R' prefix used for 64-bit registers
(\CAX{}, \CBX{}, etc.).
As with CHERI-RISC-V,
we recommend that the bottom 64 bits of capability registers contain
the integer value (virtual address) and the upper 64 bits contain
capability metadata.
Reads of capability registers as integers return the integer value.
Integer writes to capability registers
should clear the tag and upper 64 bits of capability metadata, storing the
desired integer value in the bottom 64 bits.
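
As a minimal sketch of the register behaviour described above, the
following model treats a capability register as a tag plus two 64-bit
halves. The class and method names are hypothetical, and only full 64-bit
integer writes are modelled; narrower writes are outside the scope of the
sketch.

```python
from dataclasses import dataclass

MASK64 = (1 << 64) - 1


@dataclass
class CapRegister:
    """Toy model of a tagged 128-bit register such as CAX (names are illustrative)."""

    tag: bool = False
    metadata: int = 0  # upper 64 bits: capability metadata
    address: int = 0   # lower 64 bits: integer value (virtual address)

    def read_integer(self):
        # Reading the register as an integer returns the low 64 bits.
        return self.address

    def write_integer(self, value):
        # An integer write clears the tag and the metadata half,
        # then stores the value in the low 64 bits.
        self.tag = False
        self.metadata = 0
        self.address = value & MASK64
```

For instance, a register holding a tagged capability loses its tag and
metadata after \texttt{write\_integer}, which is what prevents integer
arithmetic from forging a valid capability.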
The \RIP{} register (which contains the address of the current
instruction in the existing x86 architecture)
would also be extended into a \CIP{} capability. This would function as
the equivalent of \PCC{}.
\subsection{Additional Capability Registers}
\label{sec:x86:additional-caps}
Additional capability registers beyond those present in the general-purpose
integer
register set will also be required.
A new register will be required to hold \DDC{} for controlling
non-capability-aware memory accesses.
The x86 architecture currently uses the \FS{} and \GS{} segment selector registers
to provide thread-local storage (TLS). In the 64-bit x86 architecture,
these selectors are mostly reduced to holding an alternate base address
that is added as an offset to the virtual address of existing instructions.
For CHERI-x86-64 we recommend replacing these segment registers with two
new capability registers: \CFS{} and \CGS{}.
In addition, new capability control registers will be required to
manage user-to-kernel transitions as described in
Section~\ref{sec:x86:capability-control-registers}.
These additional registers will be stored as a separate bank of
capability registers. As with other x86 register banks such as
control registers and debug registers, additional capability registers
cannot be used
as operands (with a limited exception for \CFS{} and \CGS{} described
below) in existing instructions.
\subsection{Capability Mode}
As with other CHERI architectures, CHERI-x86-64 should support running existing
x86-64 code, capability-aware code, and hybrid code. This
requires the architecture to support an additional addressing mode
using capabilities as well as a new operand size for instructions
that use capabilities as operands.
The x86 architecture has supported similar extensions in the past when it was
extended to support 32-bit operation.
When x86 was extended from 16 bits to 32 bits, the architecture
included the ability to run existing 16-bit code without modification
as well as execute individual 16-bit or 32-bit instructions within a
32-bit or 16-bit codebase. The support for 16-bit versus 32-bit
operation was
split into two categories: operand size and addressing modes. The
code segment descriptor contains a single-bit `D' flag, which sets the
default operand size and addressing mode. These attributes can then
be toggled to the non-default setting via opcode prefixes. The 0x66
prefix is used to toggle the operand size, and the 0x67 prefix is used
to toggle the addressing mode.
In 64-bit (``long'') mode, the `D' flag is always set to
0 to indicate 32-bit operands and 64-bit addressing. A value of
1 for `D' is reserved. The 0x67 opcode prefix is used to toggle
between 32-bit and 64-bit addresses. In addition, a few single-byte opcodes
are invalid in 64-bit mode and could be repurposed as prefixes.
For CHERI support, we propose a similar scheme of using a default
execution mode along with prefixes to toggle the individual addressing
mode and operand size of individual instructions. We define a new
\textbf{capability mode}. As with CHERI-RISC-V, this mode is enabled
by setting the low bit of the \cflags{} field in \CIP{}. This mode is
valid only in 64-bit mode. A far call or jump that uses a 32-bit
code segment along with a target code capability with this flag set
will raise a General Protection Fault with the error code set to the
target segment selector.
In capability mode, instructions will use capability-aware addressing
(Section~\ref{sec:x86:capability-addressing}) by default. Some existing
opcodes will also assume a capability-sized operand in this mode.
Finally, instructions which work with the stack would use \CSP{} as
the implicit stack pointer.
\subsubsection{Removed Instructions in Capability Mode}
In capability mode, the following 64-bit mode instructions would no
longer be valid:
\begin{itemize}
\item \insnnoref{PUSH FS}
\item \insnnoref{POP FS}
\item \insnnoref{PUSH GS}
\item \insnnoref{POP GS}
\item \insnnoref{LFS}
\item \insnnoref{LGS}
\item \insnnoref{LSS}
\item \insnnoref{LAR}
\item \insnnoref{LSL}
\item Direct memory-offset \insnnoref{MOV}
\item Far branches (\insnnoref{CALL}, \insnnoref{JMP}, and \insnnoref{RET})
\item Segment Prefixes for \CS{}, \DS{}, \ES{}, and \SSreg{}
\end{itemize}
\subsection{Using Capabilities with Memory Address Operands}
\label{sec:x86:capability-addressing}
We propose a new capability-aware addressing mode that can be
toggled via a new 0x07
opcode prefix. (In 32-bit x86, the 0x07 opcode is the
\insnnoref{POP ES} instruction, which is invalid in 64-bit mode.)
In capability mode, instructions will use
the capability-aware addressing mode by default. Individual
instructions can toggle between capability-aware and ``plain''
64-bit addressing via the 0x07 opcode prefix. Addresses using the
``plain'' 32-bit or 64-bit addressing will be constrained by \DDC{}
(for example, bounds and permissions).
Instructions using capability-aware addressing
would always use 64-bit virtual addresses.
The 0x07 prefix would be a Group 4 prefix meaning that a single
instruction would not be permitted to use both 0x67 and 0x07 prefixes.
In addition, the use of the 0x67 prefix in capability mode would not
be permitted.
\subsubsection{Capability-Aware Addressing}
For instructions with register-based memory operands, capability-aware
addressing would use the capability version of the register, rather
than treating the integer register as a virtual address constrained by \DDC{}.
For example:
\begin{verbatim}
mov 0x8(%cbp),%rax
\end{verbatim}
would read the 64-bit value at offset 8 from the capability described
by the \CBP{} register.
On the other hand,
\begin{verbatim}
mov 0x8(%rbp),%rax
\end{verbatim}
would read the 64-bit value at the address \RBP{}+8 constraining the
memory access to the bounds and permissions of the \DDC{} capability.
Both instructions would use the same opcode aside from the addition of
an 0x07 opcode prefix. In capability mode, the second
instruction would require the prefix. In plain 64-bit mode,
the first instruction would require the prefix.
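
The interaction between the default mode and the 0x07 prefix amounts to
an exclusive-or, which can be sketched as follows. The function name and
the representation of prefixes as a collection of byte values are
assumptions made for illustration.

```python
CAP_ADDR_PREFIX = 0x07  # proposed capability-addressing toggle prefix


def uses_capability_addressing(capability_mode, prefixes):
    """Return True if a memory operand uses capability-aware addressing.

    capability_mode: True when executing in capability mode,
                     False in plain 64-bit mode.
    prefixes:        iterable of prefix byte values on the instruction.
    """
    toggled = CAP_ADDR_PREFIX in prefixes
    # The prefix flips whichever addressing mode is the default.
    return capability_mode != toggled
```

The sketch does not model the restriction that 0x67 and 0x07 cannot be
combined on a single instruction.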
\subsubsection{Scaled-Index Base Addressing}
x86 also supports scaled-index base addressing, an addressing mode that
combines the values of two registers to construct a virtual address.
These addresses use one register, the \emph{base}, and a
second register, the \emph{index}, multiplied by a scaling factor of 1, 2,
4, or 8. For these addresses, capability-aware addresses would select
a capability for the base register, but the index register would use
the integer value of the register. For example:
\begin{verbatim}
mov (%rax,%rbx,4),%rcx
\end{verbatim}
This computes an effective address of \RAX{} + \RBX{} * 4 and loads the value
at that address into \RCX{}. The capability-aware version would be:
\begin{verbatim}
mov (%cax,%rbx,4),%rcx
\end{verbatim}
That is, starting with the \CAX{} capability, \RBX{} * 4 would be added to the
offset, and the resulting address validated against the \CAX{} capability.
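
The capability-aware form can be sketched as an effective-address
computation followed by a check against the base capability. This is a
simplified model: the \texttt{Capability} fields, the exception class, and
the function names are hypothetical, and permissions and sealing are not
modelled.

```python
from dataclasses import dataclass

MASK64 = (1 << 64) - 1


class CapabilityFault(Exception):
    """Hypothetical stand-in for a capability violation."""


@dataclass(frozen=True)
class Capability:
    tag: bool
    base: int     # lowest address the capability authorizes
    length: int   # number of bytes authorized starting at base
    address: int  # current integer value (virtual address)


def effective_address(base_cap, index, scale, disp=0):
    """Base capability address + integer index * scale + displacement."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    return (base_cap.address + index * scale + disp) & MASK64


def checked_address(base_cap, index, scale, size, disp=0):
    """Compute the address and validate it against the base capability."""
    addr = effective_address(base_cap, index, scale, disp)
    if not base_cap.tag:
        raise CapabilityFault("untagged base capability")
    if addr < base_cap.base or addr + size > base_cap.base + base_cap.length:
        raise CapabilityFault("access outside capability bounds")
    return addr
```

Only the base register contributes a capability; the index is always read
as an integer, so its tag and metadata play no part in the check.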
\subsubsection{RIP-Relative Addressing}
The 64-bit x86 architecture added a new addressing mode to support
Position-Independent Code (PIC) more efficiently.
This addressing mode uses an immediate offset
relative to the current value of the instruction
pointer. These addresses are known as \RIP{}-relative addresses.
To support existing code, \RIP{}-relative addresses should be constrained
by \DDC{} when using ``plain'' 64-bit addressing.
When capability-aware addressing is used, \RIP{}-relative addresses
would instead be treated as \CIP{}-relative addresses
constrained by the bounds and permissions of \CIP{}.
\subsubsection{Absolute Addresses}
Memory operands can be encoded without a base register, either as an
absolute address, or an absolute address added to a scaled index
register. If these addresses are not used as offsets relative to
\CFS{} or \CGS{} as described below in Section~\ref{sec:x86:cfs-cgs},
they are always constrained by
\DDC{}, including in capability mode.
\subsubsection{Direct Memory-Offset MOVs}
The direct memory-offset \insnnoref{MOV} instructions store the
absolute address of a memory operand as an immediate operand.
Extending these instructions to support capability immediates would
require padding nops to align the capability immediate as well as text
relocations (even for position-dependent code). However, we do not
anticipate wide use of these instructions so instead choose to
remove them in capability mode and restrict them to using integer
operands and integer addressing in 64-bit mode. Encodings that combine these instructions
with capability-aware addressing would be reserved and would raise a \#UD
exception.
\subsubsection{Addresses Relative to CFS and CGS}
\label{sec:x86:cfs-cgs}
Capability-aware addressing must also permit addresses defined as
offsets relative to \CFS{} and \CGS{} to support TLS with
capability-aware addresses. When an instruction uses the \FS{} or
\GS{} segment prefix with capability-aware addressing, the memory
operand (registers and displacement) is interpreted as an integer
offset relative to the \CFS{} or \CGS{} capability register,
respectively.
Other segment prefixes are not permitted in capability-aware
addressing. Attempting to use a segment prefix other than \FS{} or
\GS{} with a capability-aware address should raise an illegal
instruction exception.
\subsubsection{Instructions with Implicit Memory Operands}
Some x86 instructions have implicit memory operands addressed by a
register. These instructions should support addressing memory with
capabilities.
The ``string''
instructions use \RSI{} as a source address and \RDI{} as a destination address.
For example, the
\insnxesref{STOS} instruction stores the value in \AL{}/\AX{}/\EAX{}/\RAX{} to the address in
\RDI{}, and then either increments or decrements the destination
index register (depending on the Direction Flag). When capability
addressing mode is enabled,
these string instructions should use \CSI{} instead of \RSI{} and \CDI{} instead of
\RDI{}.
\insnnoref{XLAT} should use \CBX{} as the implicit table address when
using capability-aware addressing.
\subsubsection{Stack Address Size}
Instructions that work with the stack such as \insnxesref{PUSH} or
\insnxesref{CALL} use the stack pointer as an implicit operand. In
32-bit x86, the `B' flag of the stack segment selector determines if
the 16-bit or 32-bit stack pointer register is used. In 64-bit long
mode, \RSP{} is always used as the stack pointer. In capability mode,
\CSP{} would always be used as the stack pointer.
Code that needs to use the alternate stack pointer
interpretation would simulate these instructions using \insnxesref{MOV}
instructions and adjusting the desired stack pointer using
instructions such as \insnxesref{ADD} or \insnxesref{SUB}. Emulation of
\insnxesref{CALL} or \insnxesref{RET} would use \insnxesref{JMP} to
adjust the instruction pointer.
\subsection{Capability-Aware Instructions}
CHERI-x86-64 will require new instructions to examine and modify
capabilities. Many of these new instructions can be implemented as
new variants of existing instructions that use an opcode that
specifies a capability operation rather than an integer operation.
Existing x86 toolchains already use instruction suffixes such as
\texttt{b}, \texttt{w}, \texttt{l}, and \texttt{q} to explicitly state
the operand size. We recommend that the \texttt{c} suffix be used to
explicitly state a capability operand size.
\subsubsection{Capability Operands for Existing Opcodes}
Previous extensions to the x86 architecture have relied on opcode
prefixes combined with the `D' and `L' flags of the current code
segment to determine the operand size. We propose a similar
scheme for supporting capability-sized operands with existing
opcodes.
First, we propose reusing a single-byte opcode declared invalid in
64-bit mode such as 0x06 (\insnnoref{PUSH ES}) as an opcode prefix
(\textbf{capability operand prefix}). This prefix would be classified
as a Group 3 prefix meaning that a single instruction would not be
permitted to use both 0x66 and 0x06 prefixes.
When not executing in capability mode, existing instructions will
follow the existing rules for 64-bit long mode as defined by the
0x66 prefix and \texttt{REX.W} flag to set the operand size. If an
instruction supports capability-sized operands, the capability operand
prefix can be used to select a capability-sized operand instead. This
prefix would have higher precedence than \texttt{REX.W}.
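
The precedence rules for ordinary instructions can be sketched as below.
The sketch assumes the 0x06 prefix value suggested above and the standard
64-bit long-mode rules for 0x66 and \texttt{REX.W}; the function name is
hypothetical. Near branches and stack instructions, which have different
defaults, are deliberately not modelled.

```python
CAP_OPERAND_PREFIX = 0x06  # proposed capability operand prefix
OPSIZE_PREFIX = 0x66       # existing operand-size override prefix


def operand_size_bits(prefixes, rex_w):
    """Operand size in bits for an instruction that follows the general rule."""
    has_cap = CAP_OPERAND_PREFIX in prefixes
    has_66 = OPSIZE_PREFIX in prefixes
    if has_cap and has_66:
        # Both are Group 3 prefixes, so they cannot appear together.
        raise ValueError("0x06 and 0x66 cannot be combined")
    if has_cap:
        return 128  # capability operand prefix outranks REX.W
    if rex_w:
        return 64   # REX.W outranks 0x66 in long mode
    if has_66:
        return 16
    return 32       # long-mode default operand size
```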
In capability mode, most instructions that can operate on either
integer or capability-sized values would follow the same logic in the
previous paragraph to determine the operand size. However, two groups
of existing instructions would default to using a capability-sized
operand when executed in capability mode:
\begin{itemize}
\item Near branches.
\item Instructions that implicitly reference
the stack pointer (\CSP{}).
\end{itemize}
This matches the approach used to select a default operand size of 64
bits in 64-bit long mode. For some of these instructions, the
capability operand prefix could be used to revert to a smaller operand
size. The effective operand size would then be determined by \texttt{REX.W}.
\subsubsection{Extending Existing Instructions to Support Capability Operands}
Several existing instructions should be extended to support
capability operands:
\begin{itemize}
\item \insnxesref[mov]{MOVC} would handle loads and stores of
capabilities similar to \insnriscvref{CLC} and \insnriscvref{CSC} as well as
copying capabilities between registers similar to \insnriscvref{CMove}.
To permit moving the contents of an additional capability register
to a general-purpose register or vice versa, two new
\insnxesref[movcap]{MOV} opcodes would be
used. These opcodes would permit access to \CFS{}, \CGS{}, and
\DDC{} in all privilege levels. Access to other additional
capability registers would be permitted only in privilege level 0.
\item \insnxesref[movnti]{MOVNTIC} would store a single capability to memory
using a non-temporal hint.
\item The string instructions \insnxesref{LODS}, \insnxesref{MOVS},
and \insnxesref{STOS} would be extended to support capability
operands.
We do not currently foresee a need to extend \insnnoref{CMPS} and
\insnnoref{SCAS} with support for capability operands. If that
did prove necessary, they could be extended.
\item \insnxesref[cmov]{CMOVC} would handle conditional loads and stores of
capabilities.
\item \insnxesref[add]{ADDC} and \insnxesref[sub]{SUBC} would be used to adjust
the \textbf{address} field of a capability similar to \insnriscvref{CIncOffset}. Note
that for these instructions, the source operand would either be a
sign-extended immediate or a 64-bit integer register whose value
is either added to or subtracted from the \textbf{address} field of the
capability-sized destination operand.
For example:
\begin{verbatim}
add $16,%csp
\end{verbatim}
would move the capability stack pointer up by 16 bytes.
We do not anticipate a need for capability-sized variants of
\insnnoref{ADC} or \insnnoref{SBB}.
\item \insnxesref[inc]{INCC} and \insnxesref[dec]{DECC} would permit
simple increments and decrements of the \textbf{address} field of
capabilities.
\item \insnxesref[and]{ANDC}, \insnxesref[or]{ORC}, and \insnxesref[xor]{XORC} would
permit bit manipulation of the \textbf{address} field of a capability. As
with \insnxesref[add]{ADDC}, the second operand would always be an
integer operand.
\item \insnxesref[cmp]{CMPC} would permit comparison of capability values
including the functionality of both \insnref{CSetEqualExact} (via
\texttt{ZF}) and \insnref{CTestSubset} (via \texttt{SF}). This is
somewhat different from the existing variants of \insnnoref{CMP}
that perform the equivalent \insnnoref{SUB} instruction and then
discard the result as in this case the flags set would not be
identical to the flags set as a result of \insnxesref[sub]{SUBC}.
We do not anticipate a need for a capability-sized variant of
\insnnoref{TEST}.
\item \insnxesref[cmpxchg]{CMPXCHGC} will be required to support atomic
operations on capabilities. (Note that \insnnoref{CMPXCHG16B}'s
existing semantics are not suitable for capabilities as it divides
the values into register pairs.)
\item \insnxesref{CMPXCHG2C} will be required to support atomic
operations on pairs of capabilities.
\item \insnxesref[xchg]{XCHGC} will also be required to support atomic
operations on capabilities.
\item It may also be desirable to support \insnxesref[xadd]{XADDC}. For
this instruction, only the integer portion of the second (source)
operand would be added to the first
(destination) operand to determine the value stored to the
destination. Any tag or capability metadata in the second operand
would be ignored and would be overwritten with the original value
of the first operand.
\item \insnxesref[push]{PUSHC} and \insnxesref[pop]{POPC} would be used to save
and restore capability registers on the stack.
\item \insnxesref[lea]{LEAC} would store the resulting address in a
destination capability register.
\insnxesref{LEA} would not support the 0x07 opcode prefix. The
address size would always match the operand size. Storing an
integer address in a capability register would have the same
effect as the equivalent version of \insnxesref{LEA} storing the
integer address to the integer alias register. Using a
capability-aware address with an integer \insnxesref{LEA} would
also be identical in effect to using ``plain'' addressing.
\item \insnnoref{ENTER} and \insnnoref{LEAVE} could be extended to
support implicit capability operands, or they could be deprecated
and remain as integer-only instructions.
If these instructions were extended to support capability
operands, the capability-sized versions would operate on \CSP{}
and \CBP{} rather than \RSP{} and \RBP{}. These instructions
would also default to capability operands in capability mode
if extended.
If these instructions were deprecated then they would be
removed in capability mode.
\end{itemize}
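
The address-only arithmetic described in the list above can be sketched
as follows. The \texttt{Capability} type and the function names are
hypothetical. The sketch models only what the text states: the integer
source changes the \textbf{address} field, and the remaining fields of the
destination are carried over. Any bounds or representability checks an
implementation might perform are not modelled.

```python
from dataclasses import dataclass, replace

MASK64 = (1 << 64) - 1


@dataclass(frozen=True)
class Capability:
    tag: bool
    metadata: int  # upper 64 bits, treated here as opaque
    address: int   # lower 64 bits


def addc(dest, src_int):
    """ADDC: add an integer to the address field of the destination capability."""
    return replace(dest, address=(dest.address + src_int) & MASK64)


def subc(dest, src_int):
    """SUBC: subtract an integer from the address field of the destination."""
    return replace(dest, address=(dest.address - src_int) & MASK64)


def xaddc(dest, src):
    """XADDC: return (new destination, new source).

    Only the integer portion of the source is added to the destination;
    the source then receives the original destination capability.
    """
    new_dest = replace(dest, address=(dest.address + src.address) & MASK64)
    new_src = dest
    return new_dest, new_src
```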
\subsection{Control-Flow Instructions}
Absolute near branches would be extended to support capability operands.
In 64-bit long mode, a capability operand prefix would select a
capability operand size. In capability mode, absolute near branches would
support only capability operands.
Absolute near branches that use an integer operand would set the
\textbf{address} field of the
\CIP{} capability while absolute near branches using a capability operand would
load a new capability into \CIP{}.
Relative near branches would always modify the \textbf{address} field of the \CIP{}
capability and would not support the capability operand prefix.
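
The effect of near branches on \CIP{} can be sketched as below. The
\texttt{Capability} type and the function names are hypothetical, and the
validity check applied to the resulting \CIP{} is not modelled.

```python
from dataclasses import dataclass, replace

MASK64 = (1 << 64) - 1


@dataclass(frozen=True)
class Capability:
    tag: bool
    metadata: int  # upper 64 bits, treated here as opaque
    address: int   # lower 64 bits


def absolute_near_branch(cip, target):
    """Return the new CIP after an absolute near branch.

    A capability target replaces CIP entirely; an integer target
    only updates the address field of the existing CIP.
    """
    if isinstance(target, Capability):
        return target
    return replace(cip, address=target & MASK64)


def relative_near_branch(cip, next_ip, displacement):
    """Return the new CIP after a relative near branch.

    next_ip is the address of the instruction following the branch.
    Only the address field of CIP changes.
    """
    return replace(cip, address=(next_ip + displacement) & MASK64)
```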
The size of return addresses pushed to and popped from the
stack for near branches would be determined by the operand size.
Capability-sized branches would save and restore a full capability on
the stack while integer-sized branches would save and restore an
integer address.
Far calls, jumps, and returns would not support capability operands
and would be invalid in capability mode.
Far branches would
set the \textbf{address} field of \CIP{}.
If the resulting value of \CIP{} after any branch
is invalid, a capability violation fault would be raised on the branch
instruction (see Section~\ref{sec:x86:capability-fault}).
\insnnoref{IRETC} should pop a capability exception frame (see
Section~\ref{sec:x86:interrupt-exception}) from the stack loading
capabilities into \CIP{} and \CSP{}. This instruction would require
the capability operand prefix. An attempt to restore a 32-bit code
segment paired with a \CIP{} that uses capability mode should raise a
General Protection fault with the error code set to the destination
code segment.
Note that attempting to push or pop a misaligned capability will raise
an exception. The stack pointer must be suitably aligned before the
use of \insnxesref[call]{CALLC}, \insnnoref{IRETC}, and \insnxesref[ret]{RETC}.
\subsection{New CHERI Instructions}
For other capability operations we
propose adding new CHERI-specific instructions.
Existing general-purpose x86 instructions support two operands rather
than three operands. To avoid requiring a \VEX{} prefix for all new
CHERI instructions, most instructions are defined with two operands
rather than three. New instructions that require three operands must
be encoded using a \VEX{} prefix.
Note that all of these instructions would only be valid in 64-bit mode
and capability mode.
\subsubsection{Capability-Inspection Instructions}
These instructions fetch a single field from a capability.
\begin{itemize}
\item \insnxesref{GCPERM} -- Get Capability Permissions
\item \insnxesref{GCTYPE} -- Get Capability Object Type
\item \insnxesref{GCBASE} -- Get Capability Base
\item \insnxesref{GCLEN} -- Get Capability Length
\item \insnxesref{GCTAG} -- Get Capability Tag
\item \insnxesref{GCOFF} -- Get Capability Offset
\item \insnxesref{GCHI} -- Get Capability High Half
\item \insnxesref{GCLIM} -- Get Capability Limit
\item \insnxesref{GCFLAGS} -- Get Capability Flags
\end{itemize}
\subsubsection{Capability-Modification Instructions}
If these instructions fail, they should clear the tag in the resulting
capability.
\begin{itemize}
\item \insnxesref{SEAL} -- Seal Capability
\item \insnxesref{UNSEAL} -- Unseal Capability
\item \insnxesref{ANDCPERM} -- Mask Capability Permissions
\item \insnxesref{SCOFF} -- Set Capability Offset
\item \insnxesref{SCADDR} -- Set Capability Address
\item \insnxesref{SCBND} -- Set Capability Bounds
\item \insnxesref{SCBNDE} -- Set Exact Capability Bounds
\item \insnxesref{SCHI} -- Set Capability High Half
\item \insnxesref{SCFLAGS} -- Set Capability Flags
\item \insnxesref{CLCTAG} -- Clear Capability Tag
\item \insnxesref{BUILDCAP} -- Construct Capability
\item \insnxesref{CPYTYPE} -- Construct Sealing Capability
\item \insnxesref{CSEAL} -- Conditional Capability Seal
\item \insnxesref{SENTRY} -- Seal Capability as a Sentry
\end{itemize}
\subsubsection{Control-Flow Instructions}
\begin{itemize}
\item \insnxesref{CINVOKE} -- Invoke sealed capability pair
\end{itemize}
\subsubsection{Adjusting to Compressed Capability Precision
Instructions}
\begin{itemize}
\item \insnxesref{CRRL} -- Round Representable Length
\item \insnxesref{CRAM} -- Representable Alignment Mask
\end{itemize}
\subsubsection{Tag-Memory Access Instructions}
These instructions permit bulk access to a set of in-memory tags.
Each instruction accesses the tags in a ``stride'' of capabilities.
The size of a stride is implementation dependent. It must be a power
of two, and it is suggested that a stride contain the number of tags
in a single cache line. The stride size should either be reported in
a new \insnnoref{CPUID} leaf or be defined as equal to the value
returned by an existing \insnnoref{CPUID} leaf.
\begin{itemize}
\item \insnxesref{LCTAGS} -- Load Capability Tags
\item \insnxesref{CLCTAGS} -- Clear Capability Tags
\end{itemize}
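
As a rough illustration of stride-based tag access, the Python sketch below
gathers the tags of one stride into a bit mask. It assumes 16-byte
capabilities, that the address is rounded down to the start of its stride, and
that bit $i$ of the result holds the tag of the $i$-th capability in the
stride. None of these details is fixed by the text above, and the function
names are hypothetical.

```python
CAP_SIZE = 16  # bytes per capability (assumed 128-bit capabilities)


def is_valid_stride(stride):
    """A stride must hold a power-of-two number of tags."""
    return stride > 0 and (stride & (stride - 1)) == 0


def load_tags(tagged_addrs, addr, stride):
    """Return a bit mask of the tags in the stride containing addr.

    tagged_addrs is a set of capability-aligned addresses whose tag is set.
    Assumption: addr is rounded down to the start of its stride, and bit i
    of the result is the tag of the i-th capability in the stride.
    """
    if not is_valid_stride(stride):
        raise ValueError("stride must be a power of two")
    span = stride * CAP_SIZE
    start = addr & ~(span - 1)
    mask = 0
    for i in range(stride):
        if start + i * CAP_SIZE in tagged_addrs:
            mask |= 1 << i
    return mask
```

With a 64-byte cache line and 16-byte capabilities, the suggested stride
would contain four tags.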
\subsection{Interactions with Vector Extensions}
CHERI should have minimal impact on existing vector extensions to the
x86 architecture including MMX, SSE, AVX, and AVX-512.
\subsubsection{Vector Registers and Memory Tags}
We propose that vector registers should not contain tags. Loads of
vector registers should ignore tags in memory, and stores of vector
registers to memory should always clear tags. Existing vector
instructions that manipulate vector register contents do not make
sense for tagged capability values. However, vector extensions are
also used to perform certain classes of memory loads and stores, which
may require additional care.
\subsubsection{Memory Copies}
Vector loads and stores are often used to implement \ccode{memcpy()}.
In CHERI C, \ccode{memcpy()} must preserve tags. A \ccode{memcpy()}
implementation that uses \insnxesref[mov]{MOVC} will operate at the same
width as existing memory copies implemented using SSE, which may
mitigate some of the cost. Another option may be to support an
optimized \insnxesref[movs]{REP MOVSC}, similar to the existing optimization
for \insnnoref{REP MOVSB}. Unlike \insnnoref{REP MOVSB}, the new instruction
would preserve tags during a copy.
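
The difference between a capability-width copy and a vector copy can be
sketched with a toy tagged-memory model. The Python below is illustrative
only: the \texttt{TaggedMemory} class and both copy helpers are hypothetical,
one tag per 16-byte granule is assumed, and the copies are restricted to
aligned, non-overlapping regions.

```python
CAP_SIZE = 16  # bytes per capability and per tag granule (assumed)


class TaggedMemory:
    """Byte-addressed memory with one tag bit per 16-byte granule."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.tags = [False] * (size // CAP_SIZE)

    def store_cap(self, addr, payload, tag):
        """Capability store: addr must be capability-aligned."""
        assert addr % CAP_SIZE == 0 and len(payload) == CAP_SIZE
        self.data[addr:addr + CAP_SIZE] = payload
        self.tags[addr // CAP_SIZE] = tag

    def load_cap(self, addr):
        """Capability load: returns the 16 data bytes and the tag."""
        assert addr % CAP_SIZE == 0
        return bytes(self.data[addr:addr + CAP_SIZE]), self.tags[addr // CAP_SIZE]

    def store_vector(self, addr, payload):
        """Vector store: writes data and clears every tag it touches."""
        self.data[addr:addr + len(payload)] = payload
        first = addr // CAP_SIZE
        last = (addr + len(payload) - 1) // CAP_SIZE
        for granule in range(first, last + 1):
            self.tags[granule] = False


def copy_with_caps(mem, dst, src, n):
    """Tag-preserving copy of n bytes (aligned, multiple of CAP_SIZE)."""
    for off in range(0, n, CAP_SIZE):
        payload, tag = mem.load_cap(src + off)
        mem.store_cap(dst + off, payload, tag)


def copy_with_vectors(mem, dst, src, n):
    """The same copy using 16-byte vector accesses: tags are lost."""
    for off in range(0, n, CAP_SIZE):
        chunk = bytes(mem.data[src + off:src + off + CAP_SIZE])
        mem.store_vector(dst + off, chunk)
```

Both helpers move the same number of bytes per iteration; only the
capability-width copy carries the tag along with the data.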
\subsubsection{Non-Temporal Stores}
Blocks of data stored to memory mapped with write-combining (WC) are
often written via non-temporal vector register stores. However, such
data is generally consumed by an I/O device via DMA and rarely
contains pointers. We believe that permitting a non-temporal store of
a single capability via \insnxesref[movnti]{MOVNTIC} is sufficient for cases
requiring non-temporal stores of tagged capabilities.
\subsubsection{Memory Addressing}
Vector instructions with memory operands would support
capability-aware addressing in the same manner as general-purpose
register instructions. For scatter/gather instructions using VSIB,
the base address register would use a capability register instead of
an integer address when using capability-aware addressing.
\subsection{Capability Violation Faults}
\label{sec:x86:capability-fault}
For reporting capability violations, we propose reserving a new
exception vector. This new exception would report an error code
pushed as part of the exception frame similar to GP\# and PF\# faults.
This error code would contain the capability exception code as
described in Table~\ref{table:x86:capability-cause} to indicate
the specific violation.
\begin{table}
\begin{center}
\begin{tabular}{ll}
\toprule
Value & Description \\
\midrule
0x0 & Tag Violation \\
0x1 & Length Violation \\
0x2 & Seal Violation \\
0x3 & Type Violation \\
0x4 & Software-defined Permission Violation \\
0x5 & \cappermG Violation \\
0x6 & \cappermX Violation \\
0x7 & \cappermL Violation \\
0x8 & \cappermS Violation \\
0x9 & \cappermLC Violation \\
0xa & \cappermSC Violation \\
0xb & \cappermSLC Violation \\
0xc & \cappermASR Violation \\
0xd & \cappermInvoke Violation \\
0xe & \cappermCid Violation \\
\bottomrule
\end{tabular}
\end{center}
\caption{CHERI-x86-64 Capability Exception Error Codes}
\label{table:x86:capability-cause}
\end{table}
If an instruction could raise more than one capability exception,
the capability exception error code is set to that of the highest-priority exception (numerically lowest
priority value) as shown in Table~\ref{table:x86:exception-priority}.
\begin{table}
\begin{center}
\begin{tabular}{ll}
\toprule
Priority & Description \\
\midrule
1 & \cappermASR Violation \\
2 & Tag Violation \\
3 & Seal Violation \\
4 & Type Violation \\
5 & \cappermInvoke Violation \\
& \cappermCid Violation \\
6 & \cappermX Violation \\
7 & \cappermL Violation \\
& \cappermS Violation \\
8 & \cappermLC Violation \\
& \cappermSC Violation \\
9 & \cappermSLC Violation \\
10 & \cappermG Violation \\
11 & Length Violation \\
12 & Software-defined Permission Violation \\
\bottomrule
\end{tabular}
\end{center}
\caption{CHERI-x86-64 Capability Exception Priority}
\label{table:x86:exception-priority}
\end{table}
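
The selection rule can be expressed compactly. The following Python sketch
transcribes the two tables above into dictionaries keyed by short violation
names (the key strings and the function name are choices of this example)
and returns the error code that would be reported for a set of simultaneous
violations. Ties between violations of equal priority are resolved by
iteration order here, which is an arbitrary choice of the sketch.

```python
# Error codes, transcribed from the capability exception error code table.
ERROR_CODE = {
    "Tag": 0x0, "Length": 0x1, "Seal": 0x2, "Type": 0x3, "Software": 0x4,
    "G": 0x5, "X": 0x6, "L": 0x7, "S": 0x8, "LC": 0x9, "SC": 0xA,
    "SLC": 0xB, "ASR": 0xC, "Invoke": 0xD, "Cid": 0xE,
}

# Priorities, transcribed from the exception priority table.
# A numerically lower value means a higher priority.
PRIORITY = {
    "ASR": 1, "Tag": 2, "Seal": 3, "Type": 4, "Invoke": 5, "Cid": 5,
    "X": 6, "L": 7, "S": 7, "LC": 8, "SC": 8, "SLC": 9, "G": 10,
    "Length": 11, "Software": 12,
}


def reported_error_code(violations):
    """Return the error code of the highest-priority violation.

    violations is a non-empty iterable of keys of ERROR_CODE.
    """
    winner = min(violations, key=PRIORITY.__getitem__)
    return ERROR_CODE[winner]
```

For example, an access through an untagged capability that is also out of
bounds would be reported as a Tag Violation rather than a Length Violation.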
CHERI-RISC-V reports the name of the register that
triggered a capability violation. It is not feasible to provide a
direct analog of this on x86. Indirect jumps and calls may raise an
exception while loading a capability from memory that is not present
in any register at the start of the instruction. However, unlike page
faults, capability violation faults are not generally restartable and
the register name's primary use is for debugging convenience rather than
correctness. There are a few possible options for providing similar
information:
\begin{enumerate}
\item Provide a copy of the faulting capability via a new capability
control register similar to the PF\# virtual address stored in
\CRTWO{}. This faulting capability would include the result of any
offset adjustments from immediates or scaled indices. If the result
of offset adjustments made the capability unrepresentable, the
faulting capability would have its tag cleared.
\item Similar to the above, but ignore offset adjustments and provide
only the base capability value.
\item Provide the virtual address from the faulting capability in
\CRTWO{} similar to PF\#. A debugger could examine the faulting
instruction's operands to determine which capability triggered the fault.
\item Do nothing as the prior approaches may be too expensive to
implement.
\end{enumerate}
Like Morello and CHERI-RISC-V, CHERI-x86-64 would
raise capability violation faults when an invalid memory access is
performed such as an out-of-bounds access or access via an untagged
capability. Instructions which modify
capabilities should not raise capability violation faults (for
example, when a capability becomes unrepresentable) but should instead
clear the tag of the resulting capability. This permits compilers to
speculatively reorder these instructions without raising spurious
faults during execution.
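
The split between tag-clearing modification and faulting access can be
illustrated as follows. This Python sketch is an informal model, not a
definition of any instruction: \texttt{set\_bounds} loosely follows the role
of \insnxesref{SCBND}, \texttt{check\_access} stands in for the check made by
a memory access, and sealing, permissions, and compressed-bounds
representability are all ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cap:
    """Simplified capability: tag, bounds [base, limit), and address."""
    tag: bool
    base: int
    limit: int
    address: int


class CapabilityFault(Exception):
    """Stands in for a capability violation fault with an error code."""

    def __init__(self, code):
        super().__init__(hex(code))
        self.code = code


TAG_VIOLATION = 0x0
LENGTH_VIOLATION = 0x1


def set_bounds(cap, length):
    """Narrow bounds to [address, address + length); never faults.

    If the request is not within the original bounds, or the input is
    untagged, the result simply has its tag cleared.
    """
    new_base, new_limit = cap.address, cap.address + length
    ok = cap.tag and cap.base <= new_base and new_limit <= cap.limit
    return Cap(ok, new_base, new_limit, cap.address)


def check_access(cap, size):
    """Memory access check: this is where a violation finally faults."""
    if not cap.tag:
        raise CapabilityFault(TAG_VIOLATION)
    if not (cap.base <= cap.address and cap.address + size <= cap.limit):
        raise CapabilityFault(LENGTH_VIOLATION)
```

Because \texttt{set\_bounds} has no side effect other than its result, a
compiler may hoist it above a branch; a fault can occur only if the untagged
result is later used for an access.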
\subsection{Call Gates}
We do not recommend extending call gates to support capabilities.
Supporting capabilities with call gates would likely require the
following changes:
\begin{itemize}
\item Extending the global and local descriptor table format to
support a new capability call gate that stores a full capability
rather than a 64-bit address. This will be more invasive than the
64-bit call gate that depends on the ability to force a number
of reserved bits in the fourth double word to zero as a sentinel
type for the second half of a 64-bit call gate.
\item As with 64-bit call gates, capability call gates would not support
parameter copying.
\item Calls to a capability call gate would need to push a modified
call frame containing both a code segment and code capability that
would be returned from via \insnnoref{RETFC}.
\end{itemize}
\subsection{Interrupt and Exception Handling}
\label{sec:x86:interrupt-exception}
For interrupt and exception handling, we propose a new overall CPU
mode that enables the use of capabilities. The availability of this
mode would be indicated by a new \insnnoref{CPUID} flag. The mode
would be enabled by setting a new bit in \CRFOUR{}. When this mode is
enabled, exceptions would push a new type of interrupt frame. As with
exceptions in long mode, the stack pointer would be 16-byte aligned
prior to pushing the exception frame to ensure capabilities are
aligned. The \RIP{} and \RSP{} fields in the exception frame would be
replaced with the full \CIP{} and \CSP{} capabilities. Other fields
in this frame would be padded to 16 bytes. To minimize padding, it
may be desirable to pack multiple smaller registers into a single
16-byte slot; for example, \SSreg{}, \CS{}, and \RFLAGS{} could be stored
in a single slot. However, this would result in a frame layout
inconsistent with far calls. \insnnoref{IRETC} would be used in
interrupt service routines to unwind this frame.
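
To compare the padded and packed layouts, the Python sketch below computes
slot addresses for a frame. The push order mirrors the long-mode exception
frame (\SSreg{}, stack pointer, \RFLAGS{}, \CS{}, instruction pointer) with
the error code omitted. The exact field order of the capability frame, the
position of the packed slot, and all names in the code are assumptions of
this example. Offsets of packed fields within their shared slot are not
modelled.

```python
SLOT = 16  # bytes; capability size and required alignment (assumed)


def align_down(sp):
    """Round the stack pointer down to a 16-byte boundary."""
    return sp & ~(SLOT - 1)


def frame_layout(slots, sp):
    """Return ({field: slot address}, new_sp) for a frame pushed at sp.

    slots is ordered from first pushed (highest address) to last pushed;
    each entry is a tuple of the fields sharing one 16-byte slot.
    """
    sp = align_down(sp)
    layout = {}
    for fields in slots:
        sp -= SLOT
        for field in fields:
            layout[field] = sp
    return layout, sp


# One field per slot, smaller fields padded to 16 bytes.
PADDED = [("SS",), ("CSP",), ("RFLAGS",), ("CS",), ("CIP",)]

# SS, CS, and RFLAGS packed into a single slot.
PACKED = [("CSP",), ("SS", "CS", "RFLAGS"), ("CIP",)]
```

Under these assumptions the padded frame occupies 80 bytes and the packed
frame 48 bytes, and every capability lands on a 16-byte boundary.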
\subsubsection{Capability Control Registers}
\label{sec:x86:capability-control-registers}
Interrupt and exception handlers require new capabilities for the
program counter (\CIP{}) and stack pointer (\CSP{}) registers. These
values must be derived from valid, privileged capabilities. To
support this, we propose the addition of a new class of capability
registers: capability control registers.
Capability control registers are capability-sized control registers.
As with other control registers such as \CRFOUR, direct access to
capability control registers would be restricted to supervisor mode as
well as requiring \cappermASR{} in \CIP{}. Unlike other control
registers, however, capability control registers would not be accessed
via the \texttt{0F 20} and \texttt{0F 22} opcodes of \insnnoref{MOV}.
Instead, capability control registers would be named as additional
capability registers as described in
Section~\ref{sec:x86:additional-caps}.
We consider two possible approaches for deriving \CIP{} and \CSP{} at
the start of an interrupt or exception.
\subsubsection{Kernel Code and Stack Capabilities}
The first approach would add two new capability control registers: the Kernel
Code Capability (\KCC{}) and Kernel Stack Capability (\KSC{}). Transitions into
supervisor mode would load new addresses from
existing data structures and tables to derive the new \CIP{} and \CSP{}
register values. For example, the current virtual address stored in
each Interrupt Descriptor Table (\IDT{}) entry would be used as an
address to derive a new \CIP{} from \KCC{}, and the address stored in the Interrupt
Stack Table (\IST{}) entry in the current Task-State Segment (\TSS{}) would
be used as an address to derive a new \CSP{} from \KSC{}. Transitions via
the \insnnoref{SYSCALL} instruction would use the address from \LSTAR{} to
construct the new \CIP{}.
This approach does require broad capabilities
for \KCC{} and \KSC{} that can accommodate any desired entry point or stack
location. However, it will require minimal changes to existing systems
code such as operating-system kernels.
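
The derivation in this first approach amounts to replacing the address of a
root capability. The Python sketch below is an informal model: the
\texttt{Cap} class and the function names are hypothetical, representability
is ignored, and the example bounds are invented for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Cap:
    """Simplified capability: tag, bounds [base, limit), and address."""
    tag: bool
    base: int
    limit: int
    address: int


def derive(root, address):
    """Copy a root capability, replacing only its address field."""
    return replace(root, address=address)


def enter_handler(kcc, ksc, idt_handler_addr, ist_stack_addr):
    """Return the (CIP, CSP) pair for an interrupt or exception entry.

    idt_handler_addr is the virtual address from the IDT entry and
    ist_stack_addr is the address from the selected IST entry.
    """
    return derive(kcc, idt_handler_addr), derive(ksc, ist_stack_addr)


def in_bounds(cap):
    """True if the capability is tagged and its address is in bounds."""
    return cap.tag and cap.base <= cap.address < cap.limit
```

A handler address outside the bounds of \KCC{} would yield a \CIP{} that
fails \texttt{in\_bounds}, which is why \KCC{} and \KSC{} must cover every
entry point and stack the kernel intends to use.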
\subsubsection{Capabilities in Entry Points}
The second approach would replace virtual addresses stored in
existing entry points with complete capabilities. This is a more
invasive change, requiring larger changes to existing systems code, but
it enables the use of more fine-grained capabilities for each entry
point.
Setting the desired kernel stack pointers in \CSP{} would require a new
\TSS{} layout that expanded the existing \RSP{} and \IST{} entries to
capabilities.
For \insnnoref{SYSCALL}, a new capability control register \CSTAR{} would be
added to hold the target instruction pointer.
Entries in the \IDT{} would be expanded to 32 bytes, appending a capability
code pointer in the last 16 bytes. This would double the size of the
\IDT{}, and most of the bytes would be unused. However, it would
ensure that all of the information currently stored in an \IDT{} entry
(such as the segment selector, \IST{} index, and descriptor type) would
be configurable.
\subsubsection{\insnnoref{SWAPGS} and Capabilities}
The \insnnoref{SWAPGS} instruction is used in user-to-kernel
transitions for the 64-bit x86 architecture to permit separate TLS
pointers for user and kernel mode. We recommend defining a new
capability control register \KGS{}. \insnnoref{SWAPGS} in capability
mode would swap the \CGS{} and \KGS{} registers.
\subsection{FS and GS Aliases}
The \FS{} and \GS{} segment descriptors have grown several related
aliases over time such as the \FSBASE{} and \GSBASE{} MSRs and
\insnnoref{RDFSBASE} family of instructions. These aliases should be
implemented as the addresses of the appropriate capability register.
Reads of the \FSBASE{}, \GSBASE{}, and \KGSBASE{} MSRs should return
the \textbf{address} field of the \CFS{}, \CGS{}, and \KGS{} capabilities,
respectively. Writes to these MSRs should set the \textbf{address} field of the
respective capability as if by \insnxesref{SCADDR}. Similarly,
the \insnnoref{RDFSBASE} and \insnnoref{RDGSBASE} instructions should
return the \textbf{address} field of the \CFS{} and \CGS{} capabilities,
respectively. The \insnnoref{WRFSBASE} and \insnnoref{WRGSBASE}