Hook not taking effect #351

Open
eVen-p opened this issue Dec 22, 2023 · 15 comments

eVen-p commented Dec 22, 2023

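Hi,

On Linux I ran into what looks like a hook that does not take effect: some system APIs block the thread, and after hooking they still block the thread instead of blocking only the coroutine. I wrote a small demo to verify this. When the demo links against the static library libco.a, the hook works as expected; when it links against the shared library libco.so, the hook does not appear to take effect.

Here is the demo code: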

// test.cpp
#include "co/co.h"
#include "co/os.h"
#include <iostream>

void f() {
    int sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd == -1) {
        std::cout << "invalid socket" << std::endl;
        return;
    }        

    struct sockaddr_in serverAddr;
    serverAddr.sin_family = AF_INET;
    serverAddr.sin_port = htons(5064); // some port
    serverAddr.sin_addr.s_addr = inet_addr("xx.xx.xx.xx"); // some IP address

    // blocking API 1
    if (connect(sockfd, (struct sockaddr*)&serverAddr, sizeof(serverAddr)) == -1) {
        std::cout << "connect failed!" << std::endl;
        return;
    }

    std::cout << "connected" << std::endl;

    char buffer[1024];
    ssize_t numBytes = recv(sockfd, buffer, 1023, 0); // blocking API 2; recv returns ssize_t (-1 on error)
    if (numBytes == -1) {
        std::cout << "recv failed" << std::endl;
    } else {
        std::cout << "good" << std::endl;
    }

    close(sockfd);
}

int main() {
    co::wait_group wp(1);

    for(int i = 0; i < os::cpunum(); i++) {
        go(f);
    }

    go([&wp] {
        // If every thread is blocked, this never runs
        std::cout << "hi" << std::endl;
        wp.done();
    });

    std::cout << "start to wait" << std::endl;

    wp.wait();
    std::cout << "ok" << std::endl;

    co::sleep(1000);

    return 0;
}

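The result: when linked against libco.a, the program exits normally; when linked against the shared library libco.so, the program blocks and never exits.

I tried both 3.0.0 and 3.0.1 and saw the same behavior. Both my libco.a and libco.so were built with xmake.

I also tried building 3.0.0 with cmake, and in that case the program fails to exit with either the static or the shared library. Am I using it incorrectly?

Here is the CMakeLists.txt for my demo: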

cmake_minimum_required(VERSION 3.13)
project(test)

# aux_source_directory(<dir> <variable>): find all source files in the given directory and store the list in the given variable.
aux_source_directory(${CMAKE_CURRENT_SOURCE_DIR}/../src SOURCE_SRC)

set(EXECUTABLE_OUTPUT_PATH ../out)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++11 -ldl -lpthread")

include_directories(${CMAKE_CURRENT_SOURCE_DIR}/../include/)

MESSAGE(STATUS "\n\n==== Starting ${PROJECT_NAME} CMAKE Build ====")

add_executable(${PROJECT_NAME} ${SOURCE_SRC})

target_link_libraries(${PROJECT_NAME}
                            ${CMAKE_CURRENT_SOURCE_DIR}/../lib/libco.so
                            )

install(TARGETS ${PROJECT_NAME} DESTINATION bin)

Demo directory layout:

├── build
│ └── CMakeLists.txt
├── include
│ └── co...
├── src
│ └── test.cpp
├── lib
│ └── libco.a
└── out

System: CentOS 7, gcc 4.8.5

idealvin (Owner) commented Dec 22, 2023

@eVen-p
For the shared library, put libco.so first when linking.
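A note on why link order can matter here: on Linux the dynamic linker resolves an undefined symbol by walking the loaded objects in order, and the first object that defines the symbol wins. The following is a minimal simulation of that rule, not the real linker. The names LoadedObject, resolve, and demo_scope are hypothetical, and the export sets are illustrative assumptions rather than data read from real libraries.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// One loaded shared object: its name and the symbols it exports.
struct LoadedObject {
    std::string name;
    std::unordered_set<std::string> exports;
};

// Return the name of the first object in search order that exports `symbol`,
// or an empty string if none does. This mimics "first definition wins".
std::string resolve(const std::vector<LoadedObject>& search_order,
                    const std::string& symbol) {
    for (const auto& obj : search_order) {
        if (obj.exports.count(symbol) != 0) return obj.name;
    }
    return "";
}

// Build a search order with illustrative export sets (assumptions only).
std::vector<LoadedObject> demo_scope(bool libco_first) {
    LoadedObject co{"libco.so", {"socket", "connect", "recv"}};
    LoadedObject pthread{"libpthread.so.0", {"connect", "recv"}};
    LoadedObject libc{"libc.so.6", {"socket", "connect", "recv"}};
    if (libco_first) return {co, pthread, libc};
    return {pthread, co, libc};
}
```

Under these assumptions, resolve(demo_scope(true), "connect") picks libco.so, while moving another library that also defines connect ahead of it changes the result.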

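eVen-p (Author) commented Dec 25, 2023

> @eVen-p For the shared library, put libco.so first when linking.

@idealvin
I tried skipping cmake and compiling the demo directly on the command line, making sure libco comes first among the -l options: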

g++ -std=c++11 test.cpp -o ../out/test -I/my_path_to_libco/include/ -L/my_path_to_libco/lib -lco -ldl -lpthread

Running the demo then gives the same result as before.

idealvin (Owner) commented

    co::wait_group wp(os::cpunum());      // initial value should match the number of coroutines that call done()

    for(int i = 0; i < os::cpunum(); i++) {
        go(f);
    }

    go([wp] {   // -------------- capture by value
        // If every thread is blocked, this never runs
        std::cout << "hi" << std::endl;
        wp.done();
    });

    std::cout << "start to wait" << std::endl;

    wp.wait();

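eVen-p (Author) commented Dec 26, 2023

Thanks for the correction. I have changed wp to be captured by value.

As for the initial value of wait_group: the original value of 1 already matches the number of done() calls. The go([wp] ...) below calls done() only once, and the go calls inside the for loop do not touch the wait_group.

I tried both changes and the demo behaves the same as before; it still looks like the thread is what gets blocked.
I also added prints to the functions in hook.cc, and the demo's two calls, connect and recv, never reach them, so they are not being hooked.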

idealvin (Owner) commented Dec 26, 2023

Is the peer sending any data? If not, "recv" will simply hang.

Also, starting the program with -co_hook_log prints hook-related logs, so you can check whether the hook is working.

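eVen-p (Author) commented Dec 26, 2023

> Is the peer sending any data? If not, "recv" will simply hang.

The peer sends no data. The idea behind the demo is to make this API block: if the thread is blocked, main gets stuck in wp.wait() and cannot exit; if only the coroutine is blocked, main exits normally. With the static library libco.a the program exits normally, but with the shared library libco.so it never exits and just hangs.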

idealvin (Owner) commented Dec 27, 2023

It should not block the thread.

Do you have the hook logs?

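eVen-p (Author) commented Jan 2, 2024

I tried a different system: on macOS everything works and the demo behaves as expected.

As for the hook logs, I ran with -co_hook_log but saw no extra output in the terminal. I had already added flag::parse(argc, argv); at the start of main. I also manually changed DEF_bool(co_hook_log, false, ... to true and still saw no additional output.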

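idealvin (Owner) commented Jan 2, 2024

> I tried a different system: on macOS everything works and the demo behaves as expected.
>
> As for the hook logs, I ran with -co_hook_log but saw no extra output in the terminal. I had already added flag::parse(argc, argv); at the start of main. I also manually changed DEF_bool(co_hook_log, false, ... to true and still saw no additional output.

The hook mechanism on macOS differs from the one on Linux.

Logs are written to a file by default; add the -cout flag to print them to the terminal.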

eVen-p (Author) commented Jan 4, 2024

On Linux with the static library, the demo prints the following (os::cpunum() == 2):

start to wait
hi
ok
connected
connected
D0104 09:15:58.877 2413 hook.cc:209] hook socket, sock: 9, non_block: false
D0104 09:15:58.877 2414 hook.cc:209] hook socket, sock: 10, non_block: false
D0104 09:15:58.877 2414 hook.cc:471] hook connect, fd: 10, r: 0
D0104 09:15:58.877 2413 hook.cc:471] hook connect, fd: 9, r: 0
D0104 09:15:59.890 2412 hook.cc:906] hook nanosleep, ms: 1000, r: 0

When linking against the shared library, it prints:

start to wait
connected
connected
// blocked here
^C

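idealvin (Owner) commented Jan 4, 2024

@eVen-p

Sorry for the late reply, I have been busy lately.

It looks like the symbols are not exported from the shared library. Try changing the _hook macro definition to the following: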

#define _hook(f) __coapi f
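For readers unfamiliar with export macros: a shared library built with hidden default visibility only exposes functions that are explicitly marked for export, and a hook function that is not exported cannot override the system one at run time. The sketch below shows the general pattern with a hypothetical macro DEMO_API and a hypothetical function demo_exported_add. It assumes that __coapi plays a similar role in the library; that is an assumption, not something confirmed in this thread.

```cpp
// Hypothetical export macro (not the library's own definition).
#if defined(_WIN32)
#define DEMO_API __declspec(dllexport)
#else
// With GCC/Clang, "default" visibility keeps the symbol in the shared
// library's dynamic symbol table even when built with -fvisibility=hidden.
#define DEMO_API __attribute__((visibility("default")))
#endif

// extern "C" keeps the name unmangled, as a libc-style hook would need.
extern "C" DEMO_API int demo_exported_add(int a, int b) {
    return a + b;
}
```

After building such code into a shared library, nm -D on the .so should list demo_exported_add with type T.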

eVen-p (Author) commented Jan 5, 2024

I tried the shared library again; the output is:

[root@localhost build]# ../out/test -co_hook_log -cout
start to wait
connected
connected
D0105 09:28:17.811 3393 hook.cc:209] hook socket, sock: 9, non_block: false
D0105 09:28:17.811 3392 hook.cc:209] hook socket, sock: 10, non_block: false
^C

The hook log for socket now appears, but there are no further log lines after it, and the program still blocks.

I also checked the symbol table:

[root@localhost out]# nm test | grep socket
                 U socket
[root@localhost out]# nm test | grep connect
                 U connect@@GLIBC_2.2.5
[root@localhost out]# nm test | grep recv
                 U recv@@GLIBC_2.2.5

idealvin (Owner) commented Jan 6, 2024

connect and recv are bound to the system versions, so they are not hooked.

Check the symbol table of libco.so to see whether it contains connect...
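Besides inspecting symbol tables, one can also ask the running process which loaded object provides a symbol. The following is a minimal sketch using dlsym and dladdr; the helper name defining_object is hypothetical. It assumes a glibc-style system where RTLD_DEFAULT and dladdr are available (g++ defines _GNU_SOURCE by default), and on older glibc it needs -ldl at link time.

```cpp
#include <dlfcn.h>
#include <string>

// Return the path of the loaded object that provides the first definition of
// `symbol` in the global lookup scope, or "" if the symbol cannot be found.
std::string defining_object(const char* symbol) {
    // RTLD_DEFAULT searches loaded objects in the default lookup order.
    void* addr = dlsym(RTLD_DEFAULT, symbol);
    if (addr == nullptr) return "";

    // dladdr maps an address back to the object that contains it.
    Dl_info info;
    if (dladdr(addr, &info) == 0 || info.dli_fname == nullptr) return "";
    return info.dli_fname;
}
```

Printing defining_object("connect") and defining_object("recv") at the start of main would show whether those names come from libco.so or from a system library in a given build.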

eVen-p (Author) commented Jan 8, 2024

The shared library's symbols:

[root@localhost lib]# nm libco.so | grep connect
0000000000025a50 T connect
00000000002a1dc0 b _sys_connect
0000000000041990 T _ZN2co7connectEiPKvii
000000000006a0e0 t _ZN3rpc10ClientImpl7connectEv
0000000000066c40 t _ZN3rpc10ServerImpl13on_connectionEN3tcp10ConnectionE
0000000000075e40 T _ZN3ssl7connectEPvi
0000000000074290 t _ZN3tcp10ServerImpl17on_ssl_connectionEi
0000000000074130 t _ZN3tcp10ServerImpl17on_tcp_connectionEi
00000000000727f0 T _ZN3tcp6Client10disconnectEv
0000000000072860 T _ZN3tcp6Client7connectEi
00000000000717e0 T _ZN3tcp6Server13on_connectionEOSt8functionIFvNS_10ConnectionEEE
0000000000079840 t _ZN4http10ServerImpl13on_connectionEN3tcp10ConnectionE
[root@localhost lib]# 
[root@localhost lib]# nm libco.so | grep recv
00000000002a2130 b FLG_http_recv_timeout
00000000002a20fc b FLG_rpc_recv_timeout
000000000002a8f0 T recv
000000000002ac80 T recvfrom
000000000002b040 T recvmsg
00000000002a1da0 b _sys_recv
00000000002a1d98 b _sys_recvfrom
00000000002a1d90 b _sys_recvmsg
0000000000041df0 T _ZN2co4recvEiPvii
00000000000421d0 T _ZN2co5recvnEiPvii
00000000000422d0 T _ZN2co8recvfromEiPviS0_Pii
0000000000075e50 T _ZN3ssl4recvEPvS0_ii
0000000000075e60 T _ZN3ssl5recvnEPvS0_ii
000000000006c0e0 T _ZN3tcp10Connection4recvEPvii
000000000006c0f0 T _ZN3tcp10Connection5recvnEPvii
0000000000071a60 T _ZN3tcp6Client4recvEPvii
0000000000071a80 T _ZN3tcp6Client5recvnEPvii
0000000000075140 t _ZN3tcp7SSLConn4recvEPvii
0000000000075150 t _ZN3tcp7SSLConn5recvnEPvii
0000000000075070 t _ZN3tcp7TcpConn4recvEPvii
0000000000075080 t _ZN3tcp7TcpConn5recvnEPvii
[root@localhost lib]# 
[root@localhost lib]# nm libco.so | grep socket
0000000000022600 T socket
0000000000022d90 T socketpair
00000000002a1e20 b _sys_socket
00000000002a1e18 b _sys_socketpair
000000000003ff00 T _ZN2co6socketEiii
0000000000075170 t _ZN3tcp7SSLConn6socketEv
0000000000074ff0 t _ZN3tcp7TcpConn6socketEv
000000000006c0d0 T _ZNK3tcp10Connection6socketEv

And the static library's symbol table:

[root@localhost lib]# nm libco.a | grep connect
0000000000003390 T connect
00000000000000c0 B _sys_connect
                 U _ZN2co7connectEiPKvii
                 U _sys_connect
0000000000001ae0 T _ZN2co7connectEiPKvii
0000000000003d10 T _ZN3rpc10ClientImpl7connectEv
0000000000000760 T _ZN3rpc10ServerImpl13on_connectionEN3tcp10ConnectionE
                 U _ZN3tcp6Client10disconnectEv
                 U _ZN3tcp6Client7connectEi
                 U _ZN3tcp6Server13on_connectionEOSt8functionIFvNS_10ConnectionEEE
                 U _ZN2co7connectEiPKvii
                 U _ZN3ssl7connectEPvi
0000000000008050 T _ZN3tcp10ServerImpl17on_ssl_connectionEi
0000000000007ef0 T _ZN3tcp10ServerImpl17on_tcp_connectionEi
00000000000065a0 T _ZN3tcp6Client10disconnectEv
0000000000006610 T _ZN3tcp6Client7connectEi
0000000000005590 T _ZN3tcp6Server13on_connectionEOSt8functionIFvNS_10ConnectionEEE
00000000000006a0 T _ZN3ssl7connectEPvi
                 U _ZN3tcp6Server13on_connectionEOSt8functionIFvNS_10ConnectionEEE
0000000000003870 T _ZN4http10ServerImpl13on_connectionEN3tcp10ConnectionE
[root@localhost lib]# 
[root@localhost lib]# nm libco.a | grep recv
0000000000008000 T recv
0000000000008380 T recvfrom
0000000000008730 T recvmsg
00000000000000a0 B _sys_recv
0000000000000098 B _sys_recvfrom
0000000000000090 B _sys_recvmsg
                 U _sys_recv
                 U _sys_recvfrom
0000000000001f40 T _ZN2co4recvEiPvii
0000000000002320 T _ZN2co5recvnEiPvii
00000000000023f0 T _ZN2co8recvfromEiPviS0_Pii
0000000000000014 B FLG_rpc_recv_timeout
                 U _ZN3tcp10Connection4recvEPvii
                 U _ZN3tcp10Connection5recvnEPvii
                 U _ZN3tcp6Client5recvnEPvii
                 U _ZN2co4recvEiPvii
                 U _ZN2co5recvnEiPvii
                 U _ZN3ssl4recvEPvS0_ii
                 U _ZN3ssl5recvnEPvS0_ii
0000000000000070 T _ZN3tcp10Connection4recvEPvii
0000000000000080 T _ZN3tcp10Connection5recvnEPvii
0000000000005810 T _ZN3tcp6Client4recvEPvii
0000000000005830 T _ZN3tcp6Client5recvnEPvii
0000000000000000 W _ZN3tcp7SSLConn4recvEPvii
0000000000000000 W _ZN3tcp7SSLConn5recvnEPvii
0000000000000000 W _ZN3tcp7TcpConn4recvEPvii
0000000000000000 W _ZN3tcp7TcpConn5recvnEPvii
00000000000006b0 T _ZN3ssl4recvEPvS0_ii
00000000000006c0 T _ZN3ssl5recvnEPvS0_ii
0000000000000010 B FLG_http_recv_timeout
                 U _ZN3tcp10Connection4recvEPvii
                 U _ZN3tcp10Connection5recvnEPvii
[root@localhost lib]# 
[root@localhost lib]# nm libco.a | grep socket
0000000000000000 T socket
00000000000007c0 T socketpair
0000000000000120 B _sys_socket
0000000000000118 B _sys_socketpair
                 U _sys_socket
0000000000000080 T _ZN2co6socketEiii
                 U _ZNK3tcp10Connection6socketEv
                 U _ZN2co6socketEiii
0000000000000000 W _ZN3tcp7SSLConn6socketEv
0000000000000000 W _ZN3tcp7TcpConn6socketEv
0000000000000060 T _ZNK3tcp10Connection6socketEv
                 U _ZNK3tcp10Connection6socketEv

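idealvin (Owner) commented

libco.so already exports connect, recv, and the other functions, so this is a linking problem.
Try building with xmake; xmake -v prints the detailed build commands so you can check the link order.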
