Refactor "Pod Sandbox" to use Virtualization #436

Open
krisnova opened this issue Feb 27, 2023 · 10 comments

Comments

@krisnova
Contributor

We need to form an opinion on which virtualization library to use, as mentioned in #433.

Options that I am aware of:

After we establish a way of running a virtualized workload we need to replace the current pod sandbox implementation detail with two things:

  1. A switching mechanism similar to our init crate that allows us to detect if virtualization is possible at runtime.
  2. Implementation detail for running pods as VMs with a spawned auraed.
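
The runtime switch in item 1 could be sketched roughly as follows. This is only an illustration, assuming KVM is the virtualization backend and that being able to open its device node read/write is a good enough signal. The `SandboxBackend` enum and `detect_sandbox_backend` function are hypothetical names and are not taken from the auraed codebase.

```rust
use std::fs::OpenOptions;
use std::path::Path;

/// Hypothetical backends for the pod sandbox; the names are illustrative only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SandboxBackend {
    /// Run the pod as a VM with a spawned (nested) auraed.
    Virtualized,
    /// Fall back to the existing, non-virtualized sandbox.
    Container,
}

/// Returns `Virtualized` only if the given KVM device node can be opened
/// read/write by the current process; any error selects the fallback.
pub fn detect_sandbox_backend(kvm_device: &Path) -> SandboxBackend {
    match OpenOptions::new().read(true).write(true).open(kvm_device) {
        Ok(_) => SandboxBackend::Virtualized,
        Err(_) => SandboxBackend::Container,
    }
}

fn main() {
    // Assumption: /dev/kvm is where the host exposes KVM.
    let backend = detect_sandbox_backend(Path::new("/dev/kvm"));
    println!("selected sandbox backend: {backend:?}");
}
```

Taking the device path as a parameter keeps the check testable on machines without KVM.
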
@krisnova
Contributor Author

I think this example should give us what we need to run a simple Linux kernel and schedule auraed as /bin/init:

https://github.com/rust-vmm/linux-loader

@krisnova
Contributor Author

krisnova commented Feb 27, 2023

So here is where I think we start.

  1. Check out the try_from function here

It looks like we can pass boot arguments and init arguments to the linux-loader crate, which lets us define our init process the same way any bootloader would.

We can hook in here and generate the string to boot a nested auraed as a guest for a pod.
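
A minimal sketch of generating such a string, using plain string handling rather than the linux-loader API (its types are deliberately not used here). The `guest_cmdline` helper and the `--nested` flag are hypothetical. It relies on the Linux convention that everything after `--` on the kernel command line is passed to init as arguments.

```rust
/// Builds a kernel command line of the form
/// `<boot args> init=<init path> -- <init args>`.
/// Hypothetical helper; not part of linux-loader or auraed.
pub fn guest_cmdline(boot_args: &[&str], init_path: &str, init_args: &[&str]) -> String {
    let mut parts: Vec<String> = boot_args.iter().map(|arg| arg.to_string()).collect();
    parts.push(format!("init={init_path}"));
    if !init_args.is_empty() {
        // The kernel hands everything after "--" to the init process.
        parts.push("--".to_string());
        parts.extend(init_args.iter().map(|arg| arg.to_string()));
    }
    parts.join(" ")
}

fn main() {
    // "--nested" is a made-up flag used purely for illustration.
    let cmdline = guest_cmdline(&["console=ttyS0", "panic=1"], "/bin/init", &["--nested"]);
    println!("{cmdline}");
}
```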

@JeroenSoeters
Contributor

I was going to take a shot at this. Wondering, though, if it makes sense to just implement the VmsService and then build the PodSandbox stuff on top. This keeps the scope somewhat contained and we need it anyways. Happy to create a new issue for that work, and link that issue here. Thoughts?

@JeroenSoeters
Contributor

JeroenSoeters commented Feb 27, 2023

Issue for VmsService which we can then leverage for the "Pod Sandbox": #439

@MalteJ
Contributor

MalteJ commented Feb 28, 2023

Can we maybe create a good abstraction so we can replace the virtualization implementation later on?
I have great sympathy for Firecracker as this is used in production by AWS. When I look at the current state of the aurae project, I think we should try to not get distracted by implementing/extending a hypervisor.

@krisnova
Contributor Author

krisnova commented Mar 1, 2023

I think staying out of the hypervisor details is a good move for right now. I do think it should remain compiled into the auraed binary, but ideally we should be able to consider other hypervisor implementations at compile time.
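
One way to picture that kind of abstraction is sketched below: a trait that the sandbox code programs against, with the concrete backend fixed at compile time through a type alias. Every name here (`VmSpec`, `Hypervisor`, `FakeHypervisor`, `PodSandbox`) is hypothetical and not drawn from the aurae codebase, and the fake backend boots nothing; it exists only so the sketch runs.

```rust
/// Hypothetical description of a guest; the fields are illustrative only.
#[derive(Debug, Clone)]
pub struct VmSpec {
    pub kernel_path: String,
    pub cmdline: String,
    pub vcpus: u8,
    pub memory_mib: u32,
}

/// Hypothetical interface that the sandbox code would depend on.
pub trait Hypervisor {
    fn name(&self) -> &'static str;
    fn start(&mut self, spec: &VmSpec) -> Result<u32, String>;
    fn stop(&mut self, vm_id: u32) -> Result<(), String>;
}

/// In-memory stand-in that only tracks ids; it does not boot anything.
#[derive(Default)]
pub struct FakeHypervisor {
    next_id: u32,
    running: Vec<u32>,
}

impl Hypervisor for FakeHypervisor {
    fn name(&self) -> &'static str {
        "fake"
    }

    fn start(&mut self, spec: &VmSpec) -> Result<u32, String> {
        if spec.vcpus == 0 {
            return Err("a VM needs at least one vCPU".to_string());
        }
        self.next_id += 1;
        self.running.push(self.next_id);
        Ok(self.next_id)
    }

    fn stop(&mut self, vm_id: u32) -> Result<(), String> {
        match self.running.iter().position(|id| *id == vm_id) {
            Some(index) => {
                self.running.remove(index);
                Ok(())
            }
            None => Err(format!("unknown VM id {vm_id}")),
        }
    }
}

/// Pointing this alias at another type (for example behind a Cargo feature)
/// would swap the backend at compile time without touching `PodSandbox`.
pub type DefaultHypervisor = FakeHypervisor;

/// The sandbox is generic over the backend, so dispatch is static.
pub struct PodSandbox<H: Hypervisor = DefaultHypervisor> {
    hypervisor: H,
}

impl<H: Hypervisor> PodSandbox<H> {
    pub fn new(hypervisor: H) -> Self {
        Self { hypervisor }
    }

    pub fn backend(&self) -> &'static str {
        self.hypervisor.name()
    }

    pub fn run_pod(&mut self, spec: &VmSpec) -> Result<u32, String> {
        self.hypervisor.start(spec)
    }

    pub fn stop_pod(&mut self, vm_id: u32) -> Result<(), String> {
        self.hypervisor.stop(vm_id)
    }
}

fn main() {
    let spec = VmSpec {
        kernel_path: "vmlinux".to_string(),
        cmdline: "init=/bin/init".to_string(),
        vcpus: 1,
        memory_mib: 128,
    };
    let mut sandbox = PodSandbox::new(DefaultHypervisor::default());
    match sandbox.run_pod(&spec) {
        Ok(id) => println!("started VM {id} on the {} backend", sandbox.backend()),
        Err(err) => eprintln!("failed to start VM: {err}"),
    }
}
```

Because `PodSandbox` is generic rather than holding a trait object, the chosen backend is still compiled directly into the binary.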

@JeroenSoeters
Contributor

JeroenSoeters commented Mar 1, 2023

The more I look at the FC code, the more I do not want to implement our own hypervisor :) I will create an RFC once I have better organized my thoughts around this topic. I'm currently exploring Dragonball, which might or might not suit our needs better. https://github.com/kata-containers/kata-containers/tree/main/src/dragonball

> Can we maybe create a good abstraction so we can replace the virtualization implementation later on?

This is what Kata Containers does as well: they abstract the hypervisor and make it pluggable.

@MalteJ
Contributor

MalteJ commented Jan 23, 2024

@JeroenSoeters what do you think about using cloud-hypervisor for this?
I think we should create a nice interface and then write an implementation that leverages cloud-hypervisor underneath. This way we could replace cloud-hypervisor with something else later on. Also, I'd like to have support for classical VMs, which would be a problem with Firecracker, as it only supports a very limited set of (virtual) hardware.

@JeroenSoeters
Contributor

Last time I looked at this, cloud-hypervisor seemed like the best choice, yeah, both because of what you mention and because of its vhost-net support. I had started some of that work around an interface; I believe the next step was creating TUN/TAP devices from our networking code.

@dmah42
Contributor

dmah42 commented Jun 21, 2024

looks like we've started landing on cloud-hypervisor (which is good).

once that's in place we should circle back to the Pod service per the original issue.
