[Regression] security.json is not uploaded during the first initialization of SolrCloud #720
Comments
Hi there! We get the same issue when initializing a new SolrCloud cluster with version 9.7.0. Solr is then exposed to the internet with no basic authentication (or any authentication at all), consistent with security.json not being loaded. We cannot work around this by restarting Solr, as the file is never uploaded. A workaround that I've found is to initialize the cluster with Solr 9.6, then upgrade in place to 9.7. However, I am not sure about this approach and its implications. Thank you! |
Just chiming in; we're also unable to leverage |
This is also blocking an upgrade for our team. Per @erwanval , it does seem like the zkcli returns error code |
I tagged it as a bug, feel free to work on a Pull Request trying to fix the issue. I did a quick test of the first command line on a 9.7 docker:
ZK_SECURITY_JSON=$(/opt/solr/server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:9983 -cmd get /security.json || echo 'failed-to-get-security.json');
I have an empty security.json file, so I'd expect the variable to be
When we deprecated the script in v9.7 we added an This will fix the case where we have a security.json file which is empty |
I guess it would be possible to patch solr 9.7 docker container by mounting a fixed script as a volume on the POD at |
Committed the fix to solr repo. It will (likely) make things work again with Solr 9.8, just like it worked with 9.6. Unfortunately there was also a need for a java level fix, so the proposed workaround mounting a patched script won't work since there was more deprecation noise ending up in the output, see SOLR-17586 for details. If you want to test this right now, you'd need to build a custom docker image for branch_9x and use that as your solr image. Easiest way is to wait for Jenkins to build the next nightly version of |
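To illustrate why extra output from the CLI matters here, the sketch below uses a stand-in function in place of the real script. The function name `noisy_get` and the notice text are placeholders, not Solr's actual output; the only point is that command substitution captures everything written to stdout, so a check for an empty security.json no longer sees an empty value.

```shell
#!/bin/sh
# Stand-in for a CLI that prints a notice on stdout before its payload.
# Both the function and the notice text are hypothetical placeholders.
noisy_get() {
  echo "WARNING: placeholder deprecation notice"
  printf '%s' "$1"
}

# Command substitution captures all of stdout, notice included,
# even though the payload itself (an empty string here) is empty.
captured=$(noisy_get '')

if [ -z "$captured" ]; then
  echo "empty"
else
  echo "not empty"
fi
```

With an empty payload the captured value still contains the notice, so a test such as `[ -z "$captured" ]` takes the wrong branch.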
Another way to fix the issue is to change the operator code so that it uses:
solr zk cp zk:/security.json /tmp/current_security.json >/dev/null 2>&1
if [ $? -eq 0 ]; then
  if [ ! -s /tmp/current_security.json ] || grep -q '^{}$' /tmp/current_security.json; then
    echo $SECURITY_JSON > /tmp/security.json
    solr zk cp /tmp/security.json zk:/security.json >/dev/null 2>&1
    echo "put security.json in ZK"
  fi
fi
I think perhaps that if /security.json does NOT exist in ZK, we should perform the bootstrap instead of silently continuing? |
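The suggestion to bootstrap when /security.json does not exist could look roughly like the sketch below. It is not the operator's actual script: the `ZK_CP` variable and the `bootstrap_security_json` function are hypothetical names introduced so the logic can be tried without a live ZooKeeper, and the sketch assumes that a failed `solr zk cp` on a fresh cluster means the node is missing.

```shell
#!/bin/sh
# Hypothetical hook: the command used to copy to and from ZooKeeper.
# In the init container this would be `solr zk cp`.
ZK_CP="${ZK_CP:-solr zk cp}"

bootstrap_security_json() {
  current=/tmp/current_security.json
  rm -f "$current"
  # Upload when the copy fails, or when the fetched file is empty or "{}".
  # Assumption: a failed copy means /security.json is not in ZK yet.
  if ! $ZK_CP zk:/security.json "$current" >/dev/null 2>&1 \
     || [ ! -s "$current" ] || grep -q '^{}$' "$current"; then
    printf '%s\n' "$SECURITY_JSON" > /tmp/security.json
    $ZK_CP /tmp/security.json zk:/security.json >/dev/null 2>&1 || return 1
    echo "put security.json in ZK"
  fi
}
```

Treating every failed copy as a missing node is the simplest choice, but it also uploads after unrelated failures such as a connection error, which may conflict with whatever the earlier change was meant to guard against.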
I created #731 to track the need for a change to support Solr 10.0. |
Thanks @janhoy. I'll move forward with the |
+1 for the |
If you replace the usage of We are currently overwriting |
Maybe for this, we have flags in our extra volume specification that says |
This is a regression caused by #660.
During the first initialization of SolrCloud, security.json doesn't exist in ZooKeeper, which causes an exception during the setup-zk initContainer zkcli.sh command.
With the change introduced in the above PR, when there is an error, it now completely skips the upload. Solr then just starts with no security.json at all.
It seems an empty one is created during the main container initialization, so upon manual restart (or if another Solr pod takes longer to start), setup-zk runs as expected and the real security.json is uploaded.
I'm not sure how to solve this while retaining the reason why this change was introduced in the first place. Maybe zkcli.sh could return a different error code depending on the kind of exception it encounters (missing security.json, or everything else)?
setup-zk logs during first init:
ERROR: KeeperErrorCode = NoNode for /solr/solr
Creating ZooKeeper path /solr/solr on ZooKeeper at zookeeper.solr.svc:2181
INFO - 2024-08-23 08:25:26.748; org.apache.solr.common.cloud.ConnectionManager; Waiting for client to connect to ZooKeeper
INFO - 2024-08-23 08:25:26.786; org.apache.solr.common.cloud.ConnectionManager; zkClient has connected
INFO - 2024-08-23 08:25:26.786; org.apache.solr.common.cloud.ConnectionManager; Client is connected to ZooKeeper
Exception in thread "main" org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /security.json
at org.apache.zookeeper.KeeperException.create(KeeperException.java:118)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:2358)
at org.apache.solr.common.cloud.SolrZkClient.lambda$getData$6(SolrZkClient.java:349)
at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:79)
at org.apache.solr.common.cloud.SolrZkClient.getData(SolrZkClient.java:349)
at org.apache.solr.cloud.ZkCLI.main(ZkCLI.java:328)
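The idea of telling a missing security.json apart from every other failure could also be approximated on the caller's side, without changing zkcli.sh. The sketch below is only an illustration: `classify_zk_get` is a hypothetical helper, the exit codes 0/1/2 are an arbitrary convention, and it assumes the wrapped command exits non-zero and prints the `KeeperErrorCode = NoNode for /security.json` message seen in the log above.

```shell
#!/bin/sh
# classify_zk_get CMD [ARGS...]
# Runs the given "get" command and maps the outcome to an exit code:
#   0 = node was read, 2 = node is missing (NoNode), 1 = any other failure.
classify_zk_get() {
  # Capture stdout and stderr together so the exception text can be inspected.
  out=$("$@" 2>&1)
  rc=$?
  if [ "$rc" -eq 0 ]; then
    return 0
  fi
  case "$out" in
    *"KeeperErrorCode = NoNode for /security.json"*) return 2 ;;
    *) return 1 ;;
  esac
}
```

A caller could then run, for example, `classify_zk_get /opt/solr/server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:9983 -cmd get /security.json` and upload the file only when the result is 2, while still aborting on 1.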