Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Config directories don't mount inside VM #45

Open
iron-udjin opened this issue Feb 22, 2023 · 23 comments
Open

Config directories don't mount inside VM #45

iron-udjin opened this issue Feb 22, 2023 · 23 comments
Labels
base system FreeBSD base system issue, report to upstream bug Something isn't working

Comments

@iron-udjin
Copy link

Hello,

I'm trying to configure wifibox as AP. I added passthru device and started wifibox:
service wireguard start

wifibox console:

localhost:~# ifconfig 
localhost:~# 
localhost:~# mount
/dev/root on / type squashfs (ro,relatime,errors=continue)
devtmpfs on /dev type devtmpfs (rw,nosuid,noexec,relatime,size=10240k,nr_inodes=13653,mode=755)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
tmpfs on /run type tmpfs (rw,nosuid,nodev,size=22496k,nr_inodes=819200,mode=755)
mqueue on /dev/mqueue type mqueue (rw,nosuid,nodev,noexec,relatime)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
shm on /dev/shm type tmpfs (rw,nosuid,nodev,noexec,relatime)
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
debugfs on /sys/kernel/debug type debugfs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /tmp type tmpfs (rw,relatime,size=128k)
localhost:~# cat /etc/fstab 
tmpfs      /tmp       tmpfs    size=128K                                     0 0
config     /media/etc 9p       trans=virtio,ro,noatime,nodiratime,norelatime 0 0
var        /var       9p       trans=virtio,rw                               0 0
app_config /etc/hostapd 9p trans=virtio,rw 0 0
localhost:~# mount app_config
mount: permission denied (are you root?)
localhost:~# id
uid=0(root) gid=0(root) groups=0(root),1(bin),2(daemon),3(sys),4(adm),6(disk),10(wheel),11(floppy),20(dialout),26(tape),27(video)

As you see, VM doesn't mount config directories. And as a result, doesn't bringing up network interfaces.

Debug logs:

2023-02-22T11:38:23+0200 DEBUG Program started as /usr/local/sbin/wifibox, with arguments: start
2023-02-22T11:38:23+0200 INFO  Begin: wifibox start
2023-02-22T11:38:23+0200 DEBUG start=GN
2023-02-22T11:38:23+0200 INFO  Creating bridge interface: wifibox0
2023-02-22T11:38:23+0200 DEBUG [ifconfig] wifibox0
2023-02-22T11:38:23+0200 INFO  Linking tap interface to wifibox0: tap0
2023-02-22T11:38:23+0200 INFO  vmm.ko is expected at path: /boot/kernel/vmm.ko
2023-02-22T11:38:23+0200 INFO  vmm.ko is found at path: /boot/kernel/vmm.ko
2023-02-22T11:38:23+0200 DEBUG assert loaded: kmod=vmm, kmod_file=/boot/kernel/vmm.ko
2023-02-22T11:38:24+0200 DEBUG assert hardware support present: iommu=1, amdvi=0
2023-02-22T11:38:24+0200 DEBUG Backends reported by bhyve:
2023-02-22T11:38:24+0200 DEBUG [bhyve] ahci
2023-02-22T11:38:24+0200 DEBUG [bhyve] ahci-hd
2023-02-22T11:38:24+0200 DEBUG [bhyve] ahci-cd
2023-02-22T11:38:24+0200 DEBUG [bhyve] e1000
2023-02-22T11:38:24+0200 DEBUG [bhyve] dummy
2023-02-22T11:38:24+0200 DEBUG [bhyve] hda
2023-02-22T11:38:24+0200 DEBUG [bhyve] fbuf
2023-02-22T11:38:24+0200 DEBUG [bhyve] amd_hostbridge
2023-02-22T11:38:24+0200 DEBUG [bhyve] hostbridge
2023-02-22T11:38:24+0200 DEBUG [bhyve] lpc
2023-02-22T11:38:24+0200 DEBUG [bhyve] nvme
2023-02-22T11:38:24+0200 DEBUG [bhyve] passthru
2023-02-22T11:38:24+0200 DEBUG [bhyve] virtio-9p
2023-02-22T11:38:24+0200 DEBUG [bhyve] virtio-blk
2023-02-22T11:38:24+0200 DEBUG [bhyve] virtio-console
2023-02-22T11:38:24+0200 DEBUG [bhyve] virtio-input
2023-02-22T11:38:24+0200 DEBUG [bhyve] virtio-net
2023-02-22T11:38:24+0200 DEBUG [bhyve] virtio-rnd
2023-02-22T11:38:24+0200 DEBUG [bhyve] virtio-scsi
2023-02-22T11:38:24+0200 DEBUG [bhyve] uart
2023-02-22T11:38:24+0200 DEBUG [bhyve] xhci
2023-02-22T11:38:24+0200 INFO  Bringing up null-modem devices for console connection
2023-02-22T11:38:24+0200 DEBUG assert loaded: kmod=nmdm, kmod_file=nmdm
2023-02-22T11:38:24+0200 INFO  Null-modem devices are available
2023-02-22T11:38:24+0200 INFO  Waiting for bhyve to start up
2023-02-22T11:38:24+0200 DEBUG Program started as /usr/local/sbin/wifibox, with arguments: _manage_vm
2023-02-22T11:38:24+0200 DEBUG assert daemonized: parent=daemon
2023-02-22T11:38:24+0200 INFO  VM manager launched
2023-02-22T11:38:24+0200 INFO  Gathering necessary configuration files for launching the guest
2023-02-22T11:38:24+0200 INFO  Pulling bhyve options from configuration file
2023-02-22T11:38:24+0200 DEBUG cpus=1, memory=128M, passthru=0/20/3, console=yes
2023-02-22T11:38:24+0200 INFO  Guest console is configured to use
2023-02-22T11:38:24+0200 INFO  Passthru device is configured: 0/20/3
2023-02-22T11:38:24+0200 INFO  PPT driver is configured for pci0:20:3 device
2023-02-22T11:38:24+0200 INFO  tap interface is configured: tap0
2023-02-22T11:38:24+0200 INFO  Launching guest wifibox from /usr/local/share/wifibox with grub-bhyve
2023-02-22T11:38:24+0200 INFO  Application config is found at /usr/local/etc/wifibox/hostapd
2023-02-22T11:38:24+0200 INFO  Application config will be mounted writeable
2023-02-22T11:38:24+0200 DEBUG Devices: virtio-blk,/usr/local/share/wifibox/disk.img virtio-9p,config=/usr/local/etc/wifibox/appliance,ro virtio-9p,var=/var/run/wifibox/appliance virtio-9p,app_config=/usr/local/etc/wifibox/hostapd
2023-02-22T11:38:24+0200 DEBUG Arguments:  -S -M 128M -r host -c /dev/nmdm-wifibox.1A -d /usr/local/share/wifibox wifibox
2023-02-22T11:38:24+0200 INFO  Launching guest wifibox from /usr/local/share/wifibox with bhyve
2023-02-22T11:38:24+0200 DEBUG Arguments:  -c 1 -m 128M -AHP -u -S -l com1,/dev/nmdm-wifibox.1A -s 0,hostbridge -s 31,lpc -s 4:0,virtio-blk,/usr/local/share/wifibox/disk.img -s 4:1,virtio-9p,config=/usr/local/etc/wifibox/appliance,ro -s 4:2,virtio-9p,var=/var/run/wifibox/appliance -s 4:3,virtio-9p,app_config=/usr/local/etc/wifibox/hostapd -s 5:0,e1000,tap0 -s 6:0,passthru,0/20/3 wifibox
2023-02-22T11:38:24+0200 DEBUG [bhyve] wrmsr to register 0x140(0) on vcpu 0
2023-02-22T11:38:24+0200 DEBUG [bhyve] rdmsr to register 0x64d on vcpu 0
2023-02-22T11:38:24+0200 DEBUG [bhyve] rdmsr to register 0x64e on vcpu 0
2023-02-22T11:38:24+0200 DEBUG [bhyve] rdmsr to register 0x34 on vcpu 0
2023-02-22T11:38:25+0200 INFO  Guest wifibox has started up
2023-02-22T11:38:25+0200 INFO  Guest is managed by PID 7301
2023-02-22T11:38:25+0200 INFO  Bringing up Unix Domain Socket pass-through
2023-02-22T11:38:25+0200 INFO  Found guest IP address: 10.0.0.2
2023-02-22T11:38:25+0200 INFO  Configured sockets: [path=/var/run/hostapd/wlan0,user=root,group=0,mode=770,port=1200]
2023-02-22T11:38:25+0200 INFO  Hooking up 10.0.0.2:1200 as /var/run/hostapd/wlan0 (root:0@770)
2023-02-22T11:38:25+0200 INFO  End: wifibox start
2023-02-22T11:38:28+0200 DEBUG Program started as /usr/local/sbin/wifibox, with arguments: console
2023-02-22T11:38:28+0200 INFO  Begin: wifibox console

Don't have any idea where should I dig to solve this problem. Could you please advise me?

Also please add sample configuration about creating AP. man 8 wifibox has nice samples but nothing regarding hostapd config or setup AP in general. It would be great to have hostapd/hostapd.conf with commented sample configuration. Users could just uncomment sample config and edit it according to their needs instead of reading man hostapd.

Thank you!

@iron-udjin
Copy link
Author

Additional info:
OS: FreeBSD 13.1-STABLE

wifibox-core-0.11.0:

Options        :
	BHYVE_PLUS     : off
	RECOVER_NONE   : off
	RECOVER_RESTART_VMM: on
	RECOVER_SUSPEND_GUEST: off

wifibox-alpine-20230213:

Options        :
	APP_HOSTAPD    : on
	APP_WPA_SUPPLICANT: off
	COMP_GZIP      : off
	COMP_LZ4       : off
	COMP_LZO       : off
	COMP_XZ        : off
	COMP_ZSTD      : on
	FW_ATH10K      : off
	FW_ATH11K      : off
	FW_ATMEL       : off
	FW_B43         : off
	FW_B43LEGACY   : off
	FW_BRCM        : off
	FW_IPW2100     : off
	FW_IPW2200     : off
	FW_IWL3945     : off
	FW_IWL4965     : off
	FW_IWLWIFI     : on
	FW_MARVELL     : off
	FW_MEDIATEK    : off
	FW_RT61        : off
	FW_RTLWIFI     : off
	FW_RTW88       : off
	FW_RTW89       : off
	FW_TI          : off
	UDS_PASSTHRU   : on
	XX_DRIVER_WL   : off
	XX_FORWARDING  : off
	XX_MDNS        : off
	XX_TCPDUMP     : off

Intel(R) Wireless-AC 9560 160MHz, REV=0x318

@pgj pgj self-assigned this Feb 24, 2023
@pgj pgj added the bug Something isn't working label Feb 24, 2023
@pgj
Copy link
Owner

pgj commented Feb 24, 2023

Hi @iron-udjin, thanks for the report!

I could not yet reproduce this on 12.4-RELEASE but I have not yet given it up... Based on the log, everything goes as expected. Could you please include the VM's logs as well? You can find this at /var/run/wifibox/appliance/log/, these are the dmesg and messages files.

Regarding the AP sample configuration, I am not sure this is the responsibility of Wifibox. Similarly to wpa_supplicant.conf(5), the hostapd.conf(5) might be different by each use case, there is no generic recipe for either of them. What Wifibox currently does is that it points to the authoritative sources: the "Configuration" section of wifibox(8) says that "check the sample configuration files provided in the %%ETCDIR%%/wifibox directory and follow the instructions to create a working configuration". There you shall find a hostapd/hostapd.conf.sample file, which will then point to /usr/share/examples/hostapd/hostapd.conf on your local system, the file you were asking for. I admit the level of indirection may be too deep here, but it is flexible.

Of course, I can include here the hostapd.conf(5) that I used for testing if you wish, but that is not so generic.

@iron-udjin
Copy link
Author

Could you please include the VM's logs as well? You can find this at /var/run/wifibox/appliance/log/

This directory is empty:

IRON ~ # wifibox status
wifibox is run by PID 54913.
IRON ~ # ls -la /var/run/wifibox/appliance/log
IRON ~ #

Of course, I can include here the hostapd.conf(5) that I used for testing if you wish, but that is not so generic.

Generic config could be something like https://wiki.gentoo.org/wiki/Hostapd:

# the interface used by the AP
interface=wlan0
# "g" simply means 2.4GHz band
hw_mode=g
# the channel to use
channel=10
# limit the frequencies used to those allowed in the country
ieee80211d=1
# the country code
country_code=FR
# 802.11n support
ieee80211n=1
# QoS support, also required for full speed on 802.11n/ac/ax
wmm_enabled=1
# the name of the AP
ssid=somename
# 1=wpa, 2=wep, 3=both
auth_algs=1
# WPA2 only
wpa=2
wpa_key_mgmt=WPA-PSK
rsn_pairwise=CCMP
wpa_passphrase=somepassword

@pgj
Copy link
Owner

pgj commented Feb 25, 2023

Looks like the VirtFS/9P mounts do not work at all, although the messages in the log do not indicate any related problems.

@pgj
Copy link
Owner

pgj commented Feb 25, 2023

Is not there anything unusual present the dmesg(8) output of the host itself? Could you please try to grab the VM dmesg output from the console for me? I am mostly interested in the potential error messages, i.e. missed mounts and such.

@iron-udjin
Copy link
Author

dmesg from host system: https://pastebin.com/QRkCkG5A
dmesg from VM: https://pastebin.com/87AmxRrE

@pgj
Copy link
Owner

pgj commented Feb 25, 2023

Thanks! No indication of any mount errors in either of the logs, though -- that is even stranger. I should really try to reproduce this locally. By the way, does wpa_supplicant work for you on the same configuration?

@iron-udjin
Copy link
Author

By the way, does wpa_supplicant work for you on the same configuration?

No, it doesn't work. The same problem with config direstories mount.

@iron-udjin
Copy link
Author

Where is VirtFS functionality is located? Possibly there is some module missing or something like that... Here is my kernel config: https://pastebin.com/A4B401Wy

@iron-udjin
Copy link
Author

Here is my /etc/src.conf: https://pastebin.com/E7senNkT

@pgj
Copy link
Owner

pgj commented Feb 25, 2023

If I recall correctly, the implementation of VirtFS/9p does not require any special support from the side of the kernel and the base system. It is a shipped as a library (lib9p.so) which is then linked to the bhyve binary. If you can see virtio-9p in the output of bhyve -s help you shall be set. That is confirmed in your case, because Wifibox checks this on startup so it would refuse to work anyway, but I could see that in the output of the debug log too.

However, something else might be broken that affects the lib9p.so library and causes this kind of trouble.

@iron-udjin
Copy link
Author

I'm afraid it could be bug in bhyve which is in -STABLE. How can I debug it? I guess I need to send bug report before -STABLE become 13.2-RELEASE.

@pgj
Copy link
Owner

pgj commented Feb 25, 2023

Unfortunately I do not have any direct experience with tracking down issues like that one. It might be a good idea to report this problem in the FreeBSD Project right away (given the timing) where developers will tell you what to do or even quickly realize what to revert.

@iron-udjin
Copy link
Author

Ok, thank you. Let's close this issue as the problem is not within wifibox itself.

@pgj
Copy link
Owner

pgj commented Feb 25, 2023

What about posting the link to the upstream bug report and leave it open until resolved? This may help others to learn about the problem. We have done something similar in the past.

@pgj pgj added base system FreeBSD base system issue, report to upstream and removed bug Something isn't working labels Feb 25, 2023
@iron-udjin
Copy link
Author

Ok, let's leave it as is.

@iron-udjin
Copy link
Author

iron-udjin commented Feb 25, 2023

To track the issue with bhyve: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=269827

@pgj
Copy link
Owner

pgj commented Feb 26, 2023

I tried to reproduce the problem with 13.2-STABLE, version n254667-bc68c55166ad (snapshot from 2023-02-23), I could not. This version might not be recent enough or this issue has something to do with your custom configuration.

@iron-udjin
Copy link
Author

I tried to rebuild kernel with GENERIC config or even deploy kernel image from here - still have no luck. It seems the problem somewhere inside bhyve. Trying to figure out where exactly.

@pgj
Copy link
Owner

pgj commented Feb 26, 2023

Here is how you can investigate the behavior of the virtio-9p backend in bhyve. First create and install a version of the bhyve binary that includes debug symbols and not optimized.

# cd /usr/src/usr.sbin/bhyve
# make DEBUG_FLAGS="-O0 -g"
# make install

In result, debug symbols for bhyve will be installed as /usr/lib/debug/sbin/bhyve.debug, along with an updated version of /usr/sbin/bhyve (this is for 12.4-RELEASE but it should be the same on other major versions):

# file /usr/lib/debug/usr/sbin/bhyve.debug
/usr/lib/debug/usr/sbin/bhyve.debug: ELF 64-bit LSB executable, x86-64, version 1 (FreeBSD), no program header, for FreeBSD 12.4, FreeBSD-style, with debug_info, not stripped

Then create a dummy bhyve VM, which will be called test here, that mimics the execution of wifibox. Start with preparing the launch by grub-bhyve:

# /usr/local/sbin/grub-bhyve -S -M 64M -r host -d /usr/local/share/wifibox test

And then it becomes possible to start bhyve through lldb, the system debugger:

# lldb -- /usr/sbin/bhyve -c 1 -m 64M -AHP -u -S -s 0,hostbridge -s 31,lpc -s 4:0,virtio-blk,/usr/local/share/wifibox/disk.img -s 4:1,virtio-9p,config=/usr/local/etc/wifibox/appliance,ro -s 4:2,virtio-9p,var=/var/run/wifibox/appliance -s 4:3,virtio-9p,app_config=/usr/local/etc/wifibox/wpa_supplicant test
(lldb) target create "/usr/sbin/bhyve"
Current executable set to '/usr/sbin/bhyve' (x86_64).
(lldb) settings set -- target.run-args  "-c" "1" "-m" "64M" "-AHP" "-u" "-S" "-s" "0,hostbridge" "-s" "31,lpc" "-s" "4:0,virtio-blk,/usr/local/share/wifibox/disk.img" "-s" "4:1,virtio-9p,config=/usr/local/etc/wifibox/appliance,ro" "-s" "4:2,virtio-9p,var=/var/run/wifibox/appliance" "-s" "4:3,virtio-9p,app_config=/usr/local/etc/wifibox/wpa_supplicant" "test"

Use lldb commands to check how the arguments of the pci_vt9p_init function are flowing through. This C function is responsible for making the 9P protocol work.

(lldb) b pci_vt9p_init
Breakpoint 1: where = bhyve`pci_vt9p_init + 35 at pci_virtio_9p.c:242:11, address = 0x000000000023e663

Once the breakpoint is set, start the application, which should immediately be stopped in response to that:

(lldb) r
Process 19852 launched: '/usr/sbin/bhyve' (x86_64)
Process 19852 stopped
* thread #1, name = 'bhyve', stop reason = breakpoint 1.1
    frame #0: 0x000000000023e663 bhyve`pci_vt9p_init(ctx=0x000000080029c180, pi=0x0000000800b2c500, opts="config=/usr/local/etc/wifibox/appliance,ro") at pci_virtio_9p.c:242:11
   239           bool ro = false;
   240           cap_rights_t rootcap;
   241   
-> 242           if (opts == NULL) {
   243                   printf("virtio-9p: share name and path required\n");
   244                   return (1);
   245           }

The execution of the function can be stepped by the s (step) command and the state of the variables in scope can be investigated by the var command:

(lldb) var
(vmctx *) ctx = 0x000000080029c180
(pci_devinst *) pi = 0x0000000800b2c500
(char *) opts = 0x000000080029c10e "config=/usr/local/etc/wifibox/appliance,ro"
(cap_rights_t) rootcap = {
  cr_rights = ([0] = 140737488347616, [1] = 0)
}
(bool) ro = false
(char *) rootpath = 0x0000000000000000
(char *) sharename = 0x0000000000000000
(char *) opt = <variable not available>

(int) rootfd = <variable not available>

(pci_vt9p_softc *) sc = <variable not available>

That way, for example, it can be tracked how the sharename variable gets its value:

[..]
(lldb) s
Process 19852 stopped
* thread #1, name = 'bhyve', stop reason = step in
    frame #0: 0x000000000023e6db bhyve`pci_vt9p_init(ctx=<unavailable>, pi=0x0000000800b2c500, opts="ro") at pci_virtio_9p.c:255:15
   252                           }
   253   
   254                           sharename = strsep(&opt, "=");
-> 255                           rootpath = opt;
   256                           continue;
   257                   }
   258   
(lldb) p sharename
(char *) $0 = 0x000000080029c10e "config"

Ideally, this should give us some ideas about why the things are not going right.

After finished working with lldb (use q to leave the debugger) , remember to destroy the VM:

#  bhyvectl --destroy --vm=test

@Marietto2008
Copy link

Marietto2008 commented Feb 26, 2023 via email

@iron-udjin
Copy link
Author

Here is how you can investigate

Oh... thanks for such detailed instructions! Using debugger I found that the problem is in lack of CAPSUIM support. I've rebuilded world + kernel and everything works as expected:

wifibox:~# mount | grep config
config on /media/etc type 9p (ro,sync,dirsync,noatime,nodiratime,access=client,trans=virtio)
app_config on /etc/wpa_supplicant type 9p (rw,sync,dirsync,relatime,access=client,trans=virtio)

That's really weird that bhyve doesn't allow to mount VirtFS/9p share without any error or warning in logs.

Thanks a lot for your help!

@pgj
Copy link
Owner

pgj commented Mar 4, 2023

I am glad that finally we could find the root cause. I have posted the conclusion to the FreeBSD ticket in case folks do not want to read through the comments here. I may even submit a patch for bhyve to address the issue, if time permits. Until it is resolved, I would leave this ticket open.

@pgj pgj added the bug Something isn't working label Mar 18, 2023
@pgj pgj removed their assignment Sep 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
base system FreeBSD base system issue, report to upstream bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants