multithreaded access to properties is slower than serial access #1084

Open
veber-alex opened this issue Jul 20, 2024 · 4 comments

veber-alex commented Jul 20, 2024

Describe the bug

I noticed that accessing host properties from multiple threads is slower than doing so serially.
I wrote a script to reproduce the issue:

# ruff: noqa

import ssl
from threading import Thread
import time

from pyVim.connect import SmartConnect
from pyVmomi import vim

NUM_THREADS = 8
HOST = ""
PASSWORD = ""

context = ssl._create_unverified_context()
con = SmartConnect(host=HOST, pwd=PASSWORD, sslContext=context)

host = con.content.viewManager.CreateContainerView(con.content.rootFolder, [vim.HostSystem], True).view[0]


threads = []
for i in range(NUM_THREADS):

    def print_driver(i):
        print(host.config.network.pnic[i].driver)

    t = Thread(target=print_driver, args=(i,))
    threads.append(t)

start = time.time()
for t in threads:
    t.start()
for t in threads:
    t.join()
end = time.time()
print(f"multi threaded: {end - start}")


start = time.time()
for i in range(NUM_THREADS):
    print(host.config.network.pnic[i].driver)
end = time.time()
print(f"single threaded: {end - start}")

On my host with 8 vmnics I get:

multi threaded: 11.908450603485107
single threaded: 3.76969313621521

The single-threaded time is stable at around 4 seconds, but the multithreaded time varies between 6 and 12 seconds from run to run.
The script can be changed to always access pnic[0] with the same result.
The more threads run at the same time, the slower it gets.
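
A possible workaround, sketched under an assumption this thread does not confirm: in pyVmomi, reading a managed-object property such as `host.config` appears to issue a request to the server on every access, so each `host.config.network.pnic[i].driver` expression would fetch and parse the host configuration again. Reading the property once and reusing the result avoids the repeated fetches. `pnic_drivers` is a hypothetical helper name; `host` is the `vim.HostSystem` object from the script above.

```python
def pnic_drivers(host):
    """Return the driver name of every physical NIC on `host`.

    Assumption: `host.config` is fetched from the server each time it is
    read, so it is read exactly once here and the result is reused.
    """
    pnics = host.config.network.pnic  # single read of host.config
    return [pnic.driver for pnic in pnics]
```

Threads can then work on the returned list of plain strings without touching the shared connection at all.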

Reproduction steps

  1. Set NUM_THREADS, HOST, PASSWORD
  2. Run the repro script

Expected behavior

I expect multithreaded performance to be better or equal to serial performance.

Additional context

No response

veber-alex added the bug label Jul 20, 2024
@veber-alex
Author

I did another test where I connect to 2 different hosts.
Here is the code:

# ruff: noqa

import ssl
from threading import Thread
import time

from pyVim.connect import SmartConnect
from pyVmomi import vim

NUM_VMNICS = 4
HOST = ""
HOST2 = ""
PASSWORD = ""

context = ssl._create_unverified_context()

start = time.time()
for host in [HOST, HOST2]:
    con = SmartConnect(host=host, pwd=PASSWORD, sslContext=context)
    host = con.content.viewManager.CreateContainerView(con.content.rootFolder, [vim.HostSystem], True).view[0]
    for i in range(NUM_VMNICS + 1):
        print(f"host {host.name} - {host.config.network.pnic[i].driver}")

end = time.time()
print(f"single threaded: {end - start}")

threads = []
for host in [HOST, HOST2]:

    def print_driver(host):
        con = SmartConnect(host=host, pwd=PASSWORD, sslContext=context)
        host = con.content.viewManager.CreateContainerView(con.content.rootFolder, [vim.HostSystem], True).view[0]
        for i in range(NUM_VMNICS + 1):
            print(f"host {host.name} - {host.config.network.pnic[i].driver}")

    t = Thread(target=print_driver, args=(host,))
    threads.append(t)

start = time.time()
for t in threads:
    t.start()
for t in threads:
    t.join()
end = time.time()
print(f"multi threaded: {end - start}")

My results are:

single threaded: 5.33910870552063
multi threaded: 5.063408136367798

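Two threads talking to two different hosts are barely faster than one thread doing the same work serially. This tells me there is a bottleneck in pyvmomi itself and not in the ESXi host.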
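
To put numbers on this, the two timings reported above can be turned into a speedup and a parallel efficiency. This is plain arithmetic on the measurements; `speedup` and `efficiency` are illustrative helper names, and two workers (one thread per host) are assumed.

```python
def speedup(serial_seconds, parallel_seconds):
    """How many times faster the parallel run is than the serial run."""
    return serial_seconds / parallel_seconds


def efficiency(serial_seconds, parallel_seconds, workers):
    """Fraction of the ideal linear speedup that was actually achieved."""
    return speedup(serial_seconds, parallel_seconds) / workers


# Timings copied from the two-host results above.
two_host_serial = 5.33910870552063
two_host_threaded = 5.063408136367798

s = speedup(two_host_serial, two_host_threaded)
e = efficiency(two_host_serial, two_host_threaded, workers=2)  # assumed: 2 threads
print(f"speedup: {s:.2f}x, efficiency: {e:.0%}")
```

The speedup comes out only slightly above 1x, so roughly half of the ideal two-thread efficiency is achieved.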

@veber-alex
Author

I did more tests and it looks like the performance issues are caused by Python 3.7.
With Python 3.11 and 3.12, the performance is much better.

@prziborowski

Also you could consider using multiprocessing instead of threading, and create a service instance (i.e. SmartConnect) in each of them, so you aren't leveraging the same connection.
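
A minimal sketch of that suggestion, not a tested recipe: each worker process opens its own connection with `SmartConnect`, reads the NIC drivers, and returns plain strings so the result can be pickled back to the parent. `fetch_pnic_drivers` and `collect` are hypothetical names, and the `worker` and `executor_cls` parameters are an added assumption that lets the plumbing be exercised without a real host.

```python
from concurrent.futures import ProcessPoolExecutor


def fetch_pnic_drivers(hostname, password):
    """Runs inside a worker: connect, read the drivers, disconnect."""
    # Imported here so each worker process sets up pyVmomi on its own.
    import ssl
    from pyVim.connect import Disconnect, SmartConnect
    from pyVmomi import vim

    context = ssl._create_unverified_context()
    si = SmartConnect(host=hostname, pwd=password, sslContext=context)
    try:
        view = si.content.viewManager.CreateContainerView(
            si.content.rootFolder, [vim.HostSystem], True
        )
        host = view.view[0]
        # Return plain strings so the result pickles cleanly.
        return [pnic.driver for pnic in host.config.network.pnic]
    finally:
        Disconnect(si)


def collect(hostnames, password, worker=fetch_pnic_drivers,
            executor_cls=ProcessPoolExecutor):
    """Run `worker` once per hostname, one pool worker per host."""
    with executor_cls(max_workers=max(1, len(hostnames))) as pool:
        futures = {name: pool.submit(worker, name, password) for name in hostnames}
        return {name: future.result() for name, future in futures.items()}
```

In a script, the call would look like `collect([HOST, HOST2], PASSWORD)` placed under an `if __name__ == "__main__":` guard, which process pools need on platforms that spawn fresh interpreters.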

@veber-alex
Author

I decided to reopen the issue after further testing.

While the performance numbers with newer versions of Python are better, the trend is still the same.
Connecting to the same host from multiple threads is slower than running the same code serially in one thread. When connecting to two different hosts, the improvement from using two threads is tiny, when in theory it should be almost linear with the number of hosts.
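
One possible reason the scaling falls short of linear, which nothing in this thread confirms, is that part of each property read is CPU-bound Python work (for example, parsing the response), and CPython's global interpreter lock lets only one thread execute Python bytecode at a time. The standalone sketch below does not use pyVmomi; `busy_parse` is a hypothetical stand-in for such work. It runs the same CPU-bound function serially and in threads so the two timings can be compared.

```python
import time
from threading import Thread


def busy_parse(n):
    """Hypothetical stand-in for CPU-bound parsing: pure-Python arithmetic."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_serial(jobs, n):
    """Run the work `jobs` times, one after another."""
    return [busy_parse(n) for _ in range(jobs)]


def run_threaded(jobs, n):
    """Run the work `jobs` times, each in its own thread."""
    results = [None] * jobs

    def worker(slot):
        results[slot] = busy_parse(n)

    threads = [Thread(target=worker, args=(slot,)) for slot in range(jobs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def timed(fn, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


serial_result, serial_time = timed(run_serial, 4, 200_000)
threaded_result, threaded_time = timed(run_threaded, 4, 200_000)
print(f"serial:   {serial_time:.3f}s")
print(f"threaded: {threaded_time:.3f}s")
```

On a standard GIL-enabled CPython build the threaded time is typically no better than the serial one; free-threaded builds may behave differently.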

> Also you could consider using multiprocessing instead of threading, and create a service instance (i.e. SmartConnect) in each of them, so you aren't leveraging the same connection.

Thanks, but that's not an option in my codebase.

veber-alex reopened this Jul 22, 2024