Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

replace source with POSIX-compliant . #88

Closed
wants to merge 1 commit into from

Conversation

Jamesits
Copy link

. is a POSIX-compliant command, but source is not. Debian 10's sh does not recognize source, which causes some error messages to be printed out during probe startup.

@michel-stam
Copy link
Contributor

@Jamesits can you verify that other than this everything works with the Debian shell, or is it better to require /bin/bash to be installed on Debian? (bash/ash is what we test on, I have seen dash behaving differently).

@AtlasMUC
Copy link

AtlasMUC commented Oct 24, 2023

@michel-stam, I think the point was not that another shell should be used on Debian. Rather the other way round: Using the POSIX-compliant . command instead of the functionally equivalent, but shell-specific source command, would work with a broader range of shells, i.e., be more portable.

Apart from that, this would lead to a more consistent use of commands/a common style across all scripts. Vs. the current mixture of . and source within the ATLAS script, and the ATLAS script being the only one to use source, as far as I can see.

As far as things working is concerned: I just raised PR #93, which is required to make the ATLAS script work for deb packages built with the make-deb script if the changes proposed in the present PR are implemented (which I support).

That is because the lack of the source command itself simply triggers an error message from the shell, but the calling ATLAS script continues, thus hiding the fact that the helper scripts that were supposed to be called do not exist (seems they are also not essential to core probe functionality as far as I can see). Whereas trying to source another, non-existing script with the . command seems to lead to complete abortion of the calling ATLAS script (as would trying to source with the source command in shells that support it, e.g., OpenWrt's BusyBox sh, if the respective helper scripts were not available there).

@michel-stam
Copy link
Contributor

Hey @AtlasMUC,

I've recently been testing on Debian 11 / 12 using a new package system built on dpkg-buildpackage. In this case the source vs . does not seem to make a difference (not completely finished testing yet). This will complement the existing spec files for rpmbuild, as well as an OpenWRT package (which is also needed for the hardware probes).

With Debian 10 marked oldoldstable, what are your thoughts on this for the future?

@Jamesits
Copy link
Author

Jamesits commented Oct 25, 2023

This script having a hashbang of /bin/sh should remain POSIX-compliant. Bash can execute a POSIX-compliant script with no issue, I've tested it. If the script is bash-only, we should mark it as bash-only.

Plus I'm still maintaining Debian 10 based packaging script due to an incompatiblity in libseccomp (Jamesits/docker-ripe-atlas#19), likely I'm going to keep it until Debian 10 is out of support.

@michel-stam
Copy link
Contributor

michel-stam commented Oct 26, 2023

Hello @Jamesits ,

I agree that POSIX compliancy should be pursued if this can be had without requiring extensive testing. However, parts of the code may use bash-isms, I cannot guarantee its compliancy without reviewing the code base in its entirety. Can your tell me if you have performed any such review?

With the limited resources available to me I am currently looking at FHS compliancy. The goal of this is to get DPKG/RPM and IPKG packages in the public 'contrib' repositories of Debian/RedHat and OpenWRT, which will allow me to focus resources on other matters than release management. It may also help towards your initiatives to provide Docker images.

I have looked at Debian 10 and did not manage to get systemd-tmpfiles / systemd-sysusers working properly, most likely because this version of systemd does not support that yet. I would welcome a discussion on this if this can add Debian 10 support without introducing too much extra complexity. Please reach out on email as this goes beyond the scope of this issue.

@AtlasMUC
Copy link

AtlasMUC commented Oct 28, 2023

Hello @michel-stam,

POSIX compliancy should be pursued if this can be had without requiring extensive testing

I think the point is that the scripts currently are mostly POSIX compliant (*), and using source breaks that.

I have not done any systematic review of the code base. But from practical perspective, as @Jamesits points out, the use of the /bin/sh shebang hints at that fact, and the intention of the existing codebase (prior to the introduction of the source statements).

If the scripts weren't POSIX compliant, there would have been a lot more breakage in more crucial places, and the need to fix that. I.e., "extensive testing" has already been done (**).

So from practical perspective, I think the only thing needed for now is to replace the use of the source statement, as per the present PR. (Which anyhow makes sense from a perspective of consistency/style. I.e., do not mix use of functionally equivalent . and source within a single script, or even within the same code base.)

Regarding Debian versions, I am running a probe built with the make-deb script from ripe-atlas-software-probe master, ripe-atlas-probe-measurements tag 2.6.3, and the tweaks from PRs #88, #93, and RIPE-NCC/ripe-atlas-probe-measurements#12 on Debian 11 (actually, the armv6l Raspberry Pi OS derivative thereof), without issues that I can see at least.

* I am writing "mostly" because from a practical perspective, I don't think strict compliance to the POSIX text is required, but what the common shells that are called via the /bin/sh shebang support in POSIX mode is good enough.

** I think the fact that the use of the non-POSIX source statement wasn't noticed earlier is that various shells that are called on the different distributions via the /bin/sh shebang support this extension (e.g., OpenWrt's BusyBox ash, bash, e.g., on CentOS). And that it isn't used in a critical place. I.e., the probe essentially works also on Debian, and if one doesn't look closely at the logs, the error messages by Debian's dash are easily missed (especially with perd's current log verbosity). But they don't seem to impact the core functionality of the probe, as far as I can see.

@AtlasMUC
Copy link

The goal of this is to get [...] IPKG packages in the public 'contrib' repositories of [...] OpenWRT

Hmm, with regards to OpenWrt, I believe this is already there:

https://git.openwrt.org/?p=feed/packages.git;a=tree;f=net/atlas-probe
https://git.openwrt.org/?p=feed/packages.git;a=tree;f=net/atlas-sw-probe

And working well, as far as I can see from one probe I run based on this.

@Jamesits
Copy link
Author

@michel-stam

I agree that POSIX compliancy should be pursued if this can be had without requiring extensive testing. However, parts of the code may use bash-isms, I cannot guarantee its compliancy without reviewing the code base in its entirety. Can your tell me if you have performed any such review?

I stand behind my PRs. I've tested this specific change and has confirmed that it is compatible with either bash or a POSIX-compliant shell. If we have any scripts that must be run under bash, we should really change the shebang. After all, on the verious distros in the wild, /bin/sh can literally be anything (though they are mostly POSIX-compliant).

Another advice is that we should run shellcheck againsta all the shell scripts we have. It discovers a lot incompatible shell usages automatically, and really helps a lot based on my previous experience.

@michel-stam
Copy link
Contributor

Hello @Jamesits,

To conclude this, I will consider this patch when current work on restructuring the probe has been completed. If you have any further POSIX compliancy additions, please send pull requests. This includes any contributions to keep support for Debian 10 going.

Thanks,

Michel

@michel-stam
Copy link
Contributor

The goal of this is to get [...] IPKG packages in the public 'contrib' repositories of [...] OpenWRT

Hmm, with regards to OpenWrt, I believe this is already there:

https://git.openwrt.org/?p=feed/packages.git;a=tree;f=net/atlas-probe https://git.openwrt.org/?p=feed/packages.git;a=tree;f=net/atlas-sw-probe

And working well, as far as I can see from one probe I run based on this.

Hi @AtlasMUC,

Correct, there is another Atlas package, but this package reengineered the scripts and is not supported by RIPE NCC. I've tried to reach out to the author to collaborate, to no avail. Because I need a working software probe package for OpenWRT on the hardware probes (with a few minor differences to upgrade firmware or to drive LEDs), I am forced to create a "fork", or rather a merge of sorts. This one will follow the exact same structure used for RPM and DEB so that users can expect to find the same files in the same paths. This has been different in the past.

Cheers,

Michel

@AtlasMUC
Copy link

AtlasMUC commented Oct 30, 2023

Hello Michel,

will follow the exact same structure used for RPM and DEB so that users can expect to find the same files in the same paths

Indeed, this is a bit confusing. Even the structure of and dependencies between the scripts on CentOS and Debian are rather intricate (to me anyways).

At the same time, having worked with OpenWrt for some time now, I am not sure whether the goal to fully have the same structure as the RPM or the deb will be feasible. Being targeted at embedded systems, it has certain constraints it is adapted to, which cause various differences to the "big" desktop/server distributions. I think that is what lead to the "re-engineering" that you refer to.

Though the scripting itself seems to be taken as is from upstream, i.e., you guys, including some OpenWrt-specific aspects, I guess from previous OpenWrt-based hardware probes. It mostly seems that its "just" the way its integrated into the system that differs. That is due to, e.g., a proprietary init mechanism being used, configuration being handled differently (primary mechanism unifying configuration across all packages), extensive use of volatile filesystems to avoid wearing out flash memory (same goal as with the Debian package, but not specific to a single package only), packages typically being built by cross-compilation rather than building on system itself (reflected, e.g., in software not being built locally and installed into /usr/local, but compiled elsewhere and installed via package management), ...

But anyway, moving it at least a bit closer to the structure of the packages on OpenWrt's big siblings wouldn't hurt. Good luck for that!

@michel-stam
Copy link
Contributor

Hi @AtlasMUC,

Glad you agree. And yes, UCI integration is key so there will be tiny differences here and there.
On OpenWRT, of course, as you said, procd is required, whereas on Debian and RedHat systems this will be systemd. I'm trying to create functional compatibility here.
Luckily OpenWRT 22 (my target platform) seems to use overlayfs to map RAM systems (or a flash filesystem) on top of the root, such that writes go to the RAM/flash filesystem. I think /var is a tmpfs which would work very well with all the temporary files created.

I'm still a ways out of having a solution in the field. Right now I'm trying to have 3 platforms supported in the software probe. Next, I would like to upgrade the core OpenWRT for the probe v3/v4/v5 to OpenWRT 22 (the latest OpenWRT to have support for all 3 platforms). When that is done, the scripts will need some TLC as both you and @Jamesits very well highlighted. Ideally, systemd/procd would handle starting the daemons, instead of the ripe-atlas script. That, however, is for later, because as soon as I have the software probe working properly I will have to conduct thorough testing to see that I did not break anything.

Currently supported in the development version:

  • Oracle EL9 (RedHat and RedHat-ish)
  • Debian 11, 12 (I cannot get 10 to work because of lack of systemd-tmpfs / systemd-sysusers support)
  • OpenWRT (currently 22.03)

Cheers,

Michel

@grische
Copy link

grische commented Jun 13, 2024

I just ran into the same problem and after reading through this PR, I am not sure I understand the hesitation to merge it.

. works on all shell implementations, source works only on a subset of implementations and already causes problems with some specific versions of Debian and Ubuntu.

Why can't this PR be merge as-is?

@michel-stam
Copy link
Contributor

Hello @grische,

Thank you for your suggestion. Let me explain.

The probe has been written and tested to work with a select combination of platforms using /bin/bash or /bin/ash (busybox). This should work. Any other shell, notably dash, does not (necessarily) work. there may be other bash-isms that prevent a fully working probe. To guarantee this PR to work means reviewing the code base and testing it against other desired shells, which is currently not scheduled for development.

Im willing to put this in as an audit ticket if people are keen on having this change, with the added comment that this does not mean dash is supported. With that in mind I'm wondering what the use case is, other than having dash support?

Regards,

Michel

@grische
Copy link

grische commented Jun 15, 2024

Thanks for the explanation.

We are currently using the beforementioned Docker container of the probe to easily deploy it without having to worry about the host OS.
This container throws error about not understanding source that could easily be fixed by this PR. Apart from this error, I do not see any other problems with the script.

If you want to get an more in-depth understanding if this script compatible with other POSIX shells, I highly recommend shellcheck. This has support for bash and busybox-sh.
By the way, busybox-sh in all distros known to me is way closer to bash than it is to ash or dash. If you want to see some examples, see koalaman/shellcheck#2865 .

@michel-stam
Copy link
Contributor

Hey @Jamesits,

I've had a good few questions on getting an officially supported docker for the probe. Problem is that IPv6 requires a bit of tinkering on the docker system itself, rather than the probe. This is why it hasn't received the attention it probably deserves. I see you've done a good bit of work on documenting this, can we get together at some point to discuss?

Word of warning, the next release (which we are testing right now for RPM, next month for Debian, OpenWRT after that) will have a significant restructure of all the files and directories. Documentation will need to be updated.

Regards,

Michel

@Jamesits
Copy link
Author

@michel-stam Sure, anytime. It's nice to have a officially supported Docker image.

@michel-stam
Copy link
Contributor

Hi @Jamesits

It seems that your patch got pulled in but for some reason it was squashed. I'm truly sorry about that. Normally I do try to attribute contributions to the author.

I've just reviewed and merged in a commit that will allow you to change the shebang on shell scripts to a value of your choosing. This is going to be used in the debian package (upcoming in the next release) to fix the shebang to /bin/bash, so that users that prefer to use dash do not have to change their default shell.

Hope this helps.

Regards,

Michel

@Jamesits
Copy link
Author

@michel-stam Commit 6970d59 looks good to me, so I think this PR can be closed safely. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants