Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add optional installation of Samba and CTDB from git sources #94

Merged
merged 5 commits into from
May 28, 2024

Conversation

anoopcs9
Copy link
Collaborator

For smooth and better development workflow a new mechanism to optionally install Samba and CTDB components from git sources is introduced. Following format can be used in settings to indicate and specify the git repository details:

install:
  samba:
    git:
      Indicates installation from git repository.

      repo: (optional, default: git://git.samba.org/samba.git)
        Git repository url to be cloned for installation.

      dest: (optional, default: /tmp/samba.git)
        The path where the repository should be checked out.

      version: (optional, default: HEAD)
        Branch, tag or commit to be fetched.

      mr: (optional)
        GitLab merge request number to be fetched.

      pr: (optional)
        GitHub pull request number to be fetched.

      refspec: (optional)
        Additional refspec to specify branch or tag containing a commit.

Examples:

install:
  samba:
    git:
      repo: https://gitlab.com/samba-team/samba.git
      version: v4-20-test
      dest: /opt/samba.git
install:
  samba:
    git:
      repo: https://gitlab.com/samba-team/samba.git
      mr: 3636
install:
  samba:
    git:
      repo: https://gitlab.com/samba-team/devel/samba.git
      version: 612488146c943e83f58247414c6bc678364fca93
      refspec: heads/anoopcs-vfs-gluster-async-dos-mode

@anoopcs9 anoopcs9 marked this pull request as ready for review April 26, 2024 10:50
@anoopcs9 anoopcs9 force-pushed the samba-git-install branch from 97e20d6 to 7391cd7 Compare April 27, 2024 16:59
@anoopcs9 anoopcs9 requested a review from xhernandez April 28, 2024 04:18
@anoopcs9
Copy link
Collaborator Author

It takes ~10m more than the current rpm based installation method to complete a full test run. View temporary GitLab jobs from jenkins to see it in action.

Copy link
Collaborator

@xhernandez xhernandez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of having several conditional blocks to install from a repo or install from git, maybe having different tasks files for each option could help separate things and make them cleaner.

@xhernandez
Copy link
Collaborator

Another thing to consider: if we are installing from git for development/debugging purposes, why we don't install support for all supported backends and build all vfs modules ?

@anoopcs9
Copy link
Collaborator Author

Another thing to consider: if we are installing from git for development/debugging purposes, why we don't install support for all supported backends and build all vfs modules ?

Now as you mention it I think this can get rid of backend specific tasks making it even more simpler. But we won't be having necessary repositories enabled to install matching devel packages for other backends for which the environment is not invoked for setup.

@anoopcs9
Copy link
Collaborator Author

Instead of having several conditional blocks to install from a repo or install from git, maybe having different tasks files for each option could help separate things and make them cleaner.

I'll consider defining the tasks separate for git and repo installation.

@xhernandez
Copy link
Collaborator

Another thing to consider: if we are installing from git for development/debugging purposes, why we don't install support for all supported backends and build all vfs modules ?

Now as you mention it I think this can get rid of backend specific tasks making it even more simpler. But we won't be having necessary repositories enabled to install matching devel packages for other backends for which the environment is not invoked for setup.

If we install from source and using the bootstrap.sh script, shouldn't it already install all required dependencies for all modules, independently of the selected backend ?

@anoopcs9 anoopcs9 force-pushed the samba-git-install branch 2 times, most recently from 837ba4c to 44c400c Compare May 2, 2024 13:45
@anoopcs9
Copy link
Collaborator Author

anoopcs9 commented May 2, 2024

Another thing to consider: if we are installing from git for development/debugging purposes, why we don't install support for all supported backends and build all vfs modules ?

Now as you mention it I think this can get rid of backend specific tasks making it even more simpler. But we won't be having necessary repositories enabled to install matching devel packages for other backends for which the environment is not invoked for setup.

If we install from source and using the bootstrap.sh script, shouldn't it already install all required dependencies for all modules, independently of the selected backend ?

As replied earlier, since we use the very same role to install from git sources for both client and server based on a particular backend I have added a generic bootstrap-centos.sh to install common dependencies.

@anoopcs9 anoopcs9 requested a review from xhernandez May 2, 2024 15:32
@spuiuk
Copy link
Collaborator

spuiuk commented May 3, 2024

I am not in favour of this approach.
There are a few problems here -
a) We will be installing development packages on the cluster machines.
b) Are we building samba on each separate cluster machine?
c) We will basically repeat the same build for each backend filesystem we are testing.
d) We will not have these test samba packages available for testing on other environments outside of sit-environment.

I propose instead a different approach. We go through a pipeline where we first build a lite version of samba rpm packages from a git repo and then make these available to the environments generated by sit-environment.
This be far easier to maintain and will be more useful.

@anoopcs9
Copy link
Collaborator Author

anoopcs9 commented May 4, 2024

I am not in favour of this approach.

The primary use case for the proposed optional installation of samba and ctdb from git sources is to have a pipeline setup for upstream samba GitLab instance so that the merge requests can be tested before any breaking changes gets merged. But that's not the only intention as this can help with development where changes can be tested in a quick fashion once such an environment is configured. Please keep this in mind while reading through my responses below.

There are a few problems here

I find it difficult to digest the following as problems. Still..

a) We will be installing development packages on the cluster machines.

Yes, can you explain how it can be a real problem on servers while running tests?

b) Are we building samba on each separate cluster machine?

Yes, and again how does it affect the test runs?

c) We will basically repeat the same build for each backend filesystem we are testing.

Yes, as stated earlier we don't have a plan to switch the installation method from rpms to git sources for our currently configured nightly runs or for PR triggered test runs in case you were anticipating such a change. If then where do we foresee a problem?

d) We will not have these test samba packages available for testing on other environments outside of sit-environment.

I am not sure about the concern here. We don't have packages and that's the whole point. Even otherwise, at present, how do you expect other environments(different platform etc) to replicate a particular test run? Considering your below suggested approach how the temporarily built packages are made available for consumption outside sit-environment?

I propose instead a different approach. We go through a pipeline where we first build a lite version of samba rpm packages from a git repo and then make these available to the environments generated by sit-environment. This be far easier to maintain and will be more useful.

Sorry, I have to disagree. We are looking at considerable amount of time in building samba and ctdb rpms every night from master(and other branches) which ultimately adds to the overall test setup time even if its built once per test run. Whereas installation from git sources, in parallel, on cluster nodes can finish within 3-4 minutes. Despite all these facts I don't think we have an option to build a "lite version of samba"(whatever you had in mind) packages suitable for our preferred test environment. Or are you suggesting to build and store each and every build created as part of individual backend specific test run?

@spuiuk
Copy link
Collaborator

spuiuk commented May 4, 2024

a) We will be installing development packages on the cluster machines.

Yes, can you explain how it can be a real problem on servers while running tests?

You would not install devel packages on a production machine. Aren't we trying to replicate a production environment here? This is not a major problem but can lead to unforseen problems like dependency issues later.

b) Are we building samba on each separate cluster machine?
Yes, and again how does it affect the test runs?
build time on a machine x number of machines on cluster x number of configurations we are testing it on.
This is a waste of resources.

c) We will basically repeat the same build for each backend filesystem we are testing.

Yes, as stated earlier we don't have a plan to switch the installation method from rpms to git sources for our currently configured nightly runs or for PR triggered test runs in case you were anticipating such a change. If then where do we foresee a problem?
Same answer as b.

d) We will not have these test samba packages available for testing on other environments outside of sit-environment.

I am not sure about the concern here. We don't have packages and that's the whole point. Even otherwise, at present, how do you expect other environments(different platform etc) to replicate a particular test run? Considering your below suggested approach how the temporarily built packages are made available for consumption outside sit-environment?

I propose instead a different approach. We go through a pipeline where we first build a lite version of samba rpm packages from a git repo and then make these available to the environments generated by sit-environment. This be far easier to maintain and will be more useful.

Sorry, I have to disagree. We are looking at considerable amount of time in building samba and ctdb rpms every night from master(and other branches) which ultimately adds to the overall test setup time even if its built once per test run. Whereas installation from git sources, in parallel, on cluster nodes can finish within 3-4 minutes. Despite all these facts I don't think we have an option to build a "lite version of samba"(whatever you had in mind) packages suitable for our preferred test environment. Or are you suggesting to build and store each and every build created as part of individual backend specific test run?

Maybe we need to look at the build process then and see why it takes time. Build only for the actual environment being used(centos 9?). Maybe skip packaging devel packages, debuginfo packages, etc? we don't need to rebuild ctdb and just reuse those? We could try with a simpler spec file with a narrower scope?

By separating the rpm build process from the actual sit-environment repos, we reduce maintanance costs on sit-environment as well. Sit-environment is primarily used to build a Samba test environment with a clustered backend to test against.

In my opinion, we should keep the samba build processes out of the scope of this repository.

@anoopcs9
Copy link
Collaborator Author

anoopcs9 commented May 5, 2024

a) We will be installing development packages on the cluster machines.

Yes, can you explain how it can be a real problem on servers while running tests?

You would not install devel packages on a production machine. Aren't we trying to replicate a production environment here? This is not a major problem but can lead to unforseen problems like dependency issues later.

If not devel packages our current process of setting up the environment already calls for installation of many dependencies(mostly for ansible and python) and that's a fact. From the top of my mind I can only assume the frequency of such issues w.r.t devel packages installation to be way less than what we normally encounter to successfully maintain error free ansible based setup. Moreover ansible and related issues are found to be more complex than just adding to the list of devel packages for installation from git sources.

I propose instead a different approach. We go through a pipeline where we first build a lite version of samba rpm packages from a git repo and then make these available to the environments generated by sit-environment. This be far easier to maintain and will be more useful.

Sorry, I have to disagree. We are looking at considerable amount of time in building samba and ctdb rpms every night from master(and other branches) which ultimately adds to the overall test setup time even if its built once per test run. Whereas installation from git sources, in parallel, on cluster nodes can finish within 3-4 minutes. Despite all these facts I don't think we have an option to build a "lite version of samba"(whatever you had in mind) packages suitable for our preferred test environment. Or are you suggesting to build and store each and every build created as part of individual backend specific test run?

Maybe we need to look at the build process then and see why it takes time. Build only for the actual environment being used(centos 9?).

Even with your suggested approach I don't think we had an intention to build for multiple platforms as it is pointless to create packages for rest other than the target platform. My previous response was based on just CentOS 9(Stream).

Maybe skip packaging devel packages, debuginfo packages, etc?

You really want to skip debuginfo packages? In my opinion these play a vital role in debugging process while investigating a failure/crash in samba/ctdb. We will require matching debug packages while replicating the environment.

we don't need to rebuild ctdb and just reuse those?

Reuse ctdb for what? I don't get it. Looks like you are over complicating things at the cost of additional 3 minutes of installing from sources.

We could try with a simpler spec file with a narrower scope?

So you are asking for another spec file which can drastically reduce the build time(if at all possible) to be maintained separately, huh?

By separating the rpm build process from the actual sit-environment repos, we reduce maintanance costs on sit-environment as well.

Other than the addition of required dependencies we are not anticipating any complex setup issues compared to what we have currently with ansible and python.

Sit-environment is primarily used to build a Samba test environment with a clustered backend to test against.

I agree and installation of components is an important part of this project. Proposed change is an optional method for the source of installation which can be considered valid. On top of that the requirement for this change, a CI/CD pipeline for upstream merge requests, calls for a quick installation using git sources which is perfectly normal in a testing automation workflow.

In my opinion, we should keep the samba build processes out of the scope of this repository.

I tend to disagree. My strong request is to have an installation method from git sources. If not here then where?

@spuiuk
Copy link
Collaborator

spuiuk commented May 5, 2024

I think there could be a better way to proceed with this issue. Let's discuss this at the team call.

anoopcs9 added 5 commits May 14, 2024 19:42
Put the rpm installation of samba and ctdb components behind a
configuration object to make way for optional installation from
git sources.

Signed-off-by: Anoop C S <[email protected]>
Introduce a new role "samba.install" dedicated for handling the
installation of samba/ctdb components.

Signed-off-by: Anoop C S <[email protected]>
Here we add new configuration objects to settings file which can be used
to specify the git source details for installation of samba/ctdb.

Signed-off-by: Anoop C S <[email protected]>
Performing the task of ceph repo creation from common.prep role makes it
possible to include it as part of preparing client machines.

Signed-off-by: Anoop C S <[email protected]>
Add necessary tasks to get samba and ctdb components installed from git
sources.

Signed-off-by: Anoop C S <[email protected]>
@anoopcs9 anoopcs9 force-pushed the samba-git-install branch from 44c400c to 8c6fe8d Compare May 14, 2024 14:12
@anoopcs9
Copy link
Collaborator Author

Let me know if the current version is ready to be taken in.

@anoopcs9
Copy link
Collaborator Author

Let me know if the current version is ready to be taken in.

@xhernandez @spuiuk

@anoopcs9 anoopcs9 merged commit a05bd46 into samba-in-kubernetes:main May 28, 2024
9 checks passed
@anoopcs9 anoopcs9 deleted the samba-git-install branch May 28, 2024 15:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants