Deployment 2021-1.0 checklist #18

Closed
20 of 22 tasks
mrakitin opened this issue Jan 11, 2021 · 10 comments

Comments

@mrakitin
Member

mrakitin commented Jan 11, 2021

Previsit

  • Make sure that the latest environments are pushed to all the BL machines
  • Skim the IPython startup files in profile_collection
    • Check if it's safe to start bsui remotely (does it touch any hardware?) - use git grep .put or git grep caput
  • Check for outstanding PRs and issues on beamline repos
  • Fork and branch https://github.com/NSLS-II/playbooks for the purpose of updating the SRX current_env_tag in the production file
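
For the "is it safe to start bsui remotely" check above, here is a minimal sketch of the same search done in Python rather than with git grep. It only flags lines that look like EPICS writes; it does not prove a file is safe. The function name find_hardware_writes and the "startup" directory path are assumptions for illustration, not part of profile_collection.

```python
import re
from pathlib import Path

# Patterns that suggest a startup file writes to hardware when it is loaded.
# These mirror the `git grep .put` / `git grep caput` checks in the list above.
WRITE_PATTERNS = re.compile(r"\.put\(|\bcaput\b")


def find_hardware_writes(startup_dir):
    """Return (file name, line number, line text) for every suspicious line."""
    hits = []
    for path in sorted(Path(startup_dir).glob("*.py")):
        text = path.read_text(errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if WRITE_PATTERNS.search(line):
                hits.append((path.name, lineno, line.strip()))
    return hits


if __name__ == "__main__":
    # "startup" is an assumed path relative to the profile checkout.
    for name, lineno, line in find_hardware_writes("startup"):
        print(f"{name}:{lineno}: {line}")
```

Any hit still needs a human look: a .put() inside a function body is harmless at load time, while one at module level runs as soon as bsui starts.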

Housekeeping

  • Update conda

    $ su - <your-controls-account>
    $ sudo su -
    # conda activate base  # should be activated by default, but just in case.
    # conda install conda -c https://repo.anaconda.com/pkgs/main/
  • Add BL staff to the BL GitHub organization as owners

  • Work with BL staff to commit any un-committed changes to their profiles

  • Tag the profile as-found as 2020C2.1

    $ git tag -a 2020C2.1

    Enter a message such as Before 2021C1.0 deployment.

  • Discuss with BL staff which conda envs they want to keep/delete and perform the cleanup

  • Check/update the beamline's inventory with the BL staff (in https://github.com/NSLS-II/playbooks/blob/master/production)

  • In ~/.bashrc, if necessary, update the logging environment variables to use directory /var/log/bluesky/... and add umask 0002.

    umask u=rwx,g=rwx,o=rx  # 0002
    export BLUESKY_LOG_FILE=/var/log/bluesky/bluesky.log
    export BLUESKY_IPYTHON_LOG_FILE=/var/log/bluesky/bluesky_ipython.log

    and (if needed) create that directory with the following permissions and ownership:

    sudo mkdir /var/log/bluesky
    sudo chown -Rv xf05id1:srx /var/log/bluesky
    sudo chmod -Rv g+rws /var/log/bluesky

    Remember to source ~/.bashrc.

  • Remove explicit setting of ophyd logging level from the first startup script

  • Double-check that if there is an open PR removing handlers, it is merged (and tested).
    Migrate databroker handlers #9 needs a rebase.

  • Update vendored copy of PersistentDict to bug-fixed version (see the updated gist snippet: https://gist.github.com/jklynch/a4366b8900ec0c03883403455ae711b2).

  • Make sure all repos have a BSD-3-Clause license (see Add a BSD-3-Clause license #16 as an example)
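
For the logging step above, a small sketch of how the result of the mkdir/chown/chmod commands could be verified from Python after sourcing ~/.bashrc. The helper names check_log_dir and log_dirs_from_env are made up for this example; only the two BLUESKY_* variable names come from the checklist.

```python
import os
import stat


def check_log_dir(path):
    """Report whether `path` is a directory that is group-writable and setgid."""
    if not os.path.isdir(path):
        return {"exists": False, "group_writable": False, "setgid": False}
    mode = os.stat(path).st_mode
    return {
        "exists": True,
        "group_writable": bool(mode & stat.S_IWGRP),  # the "g+w" part of g+rws
        "setgid": bool(mode & stat.S_ISGID),          # the "g+s" part of g+rws
    }


def log_dirs_from_env(environ=os.environ):
    """Collect the parent directories of the two Bluesky log files, if set."""
    names = ("BLUESKY_LOG_FILE", "BLUESKY_IPYTHON_LOG_FILE")
    return sorted({os.path.dirname(environ[n]) for n in names if n in environ})


if __name__ == "__main__":
    for directory in log_dirs_from_env():
        print(directory, check_log_dir(directory))
```

This does not check ownership (xf05id1:srx in the checklist), only the permission bits that let the whole group write log files.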

Test deployment

  • Run profile against new environment
    • BS_ENV=collection-2021-1.0 bsui
  • Update the profile as needed to run
  • Test databroker v2
    from databroker import catalog
    cat = catalog['beamline_name']
    cat[-1].primary.read()
  • Run acceptance tests (if any) and run a representative set of scans for the beamline

Finish

@andrewmkiss
Contributor

@mrakitin I ran my acceptance tests at the beamline. Overall, there is nothing major and I think we can switch to this deployment. If you go into that branch, you will see one change in 90-user, which is simply changing the user cycle. Can be committed or skipped. I did not test databroker v2.

However, I do have a few comments:

  • While loading bsui, we see a message [TerminalIPythonApp] Running file in user namespace...[filepath + name]. And then I have my own (much shorter), Loading file {filename}.... Can we remove this TerminalIPython message?
  • We were doing the FlyingMono testing where we enable those PVs. When I did my test of energy.move(), it failed because I could not control the mono/undulator. Is it possible to disable FlyingMono in order to complete an energy.move()? Or maybe we need to look at how we stage/unstage FlyingMono so that it defaults to a disabled stage?
  • Our xs3 has too many read attributes. Right now, we have a 4-element detector, and on the IOC there are 16 ROIs for each element. We only allow the user to set 3 ROIs, so we have an additional 13x4 = 52 ROIs that are useless (plus the sum for each ROI). Can we decrease this so we only have the first 3 ROIs for each element?
  • @dmgav I did a nano_xrf scan from 61-xrf, and make_hdf said "Xspress3 detector #1 (three channels)". This detector has 4 channels. I confirmed that it only reads the first 3 with create_each_det=True. I have some changes I would like to make to our metadata structure and also to make_hdf, so this isn't a critical fix, but it is not currently pulling all the data.
  • @dmgav The new metadata structure and make_hdf I mentioned above will also affect this, but we do not have position data for our slow axis for fly scans (nano_scan_and_fly)

@dmgav
Contributor

dmgav commented Jan 15, 2021

@andrewmkiss As far as I remember, we didn't complete the work on loading data from the nanostage because the start document did not contain sufficient metadata. In the current implementation, the 'slow' axis is always vertical and the 'fast' axis is always horizontal. Also, position data is not loaded if one of the scan axes is Z, so maps can be plotted only in pixel coordinates. PyXRF will currently load data from only 3 detector channels. It can work with an arbitrary number of channels, but there is some work that needs to be done. We may also need additional metadata that will allow us to consistently identify the number of detector channels.

I would also like to mention that the sum is computed using data from only 3 channels. We need to implement support for more detector channels in order to be able to use them.

@mrakitin
Member Author

> @mrakitin I ran my acceptance tests at the beamline. Overall, there is nothing major and I think we can switch to this deployment. If you go into that branch, you will see one change in 90-user, which is simply changing the user cycle. Can be committed or skipped. I did not test databroker v2.

Thanks for the tests, @andrewmkiss. Please commit & push that change.

> However, I do have a few comments:

> * While loading bsui, we see a message [TerminalIPythonApp] Running file in user namespace...[filepath + name]. And then I have my own (much shorter), Loading file {filename}.... Can we remove this TerminalIPython message?

I think it can be disabled, but we need to experiment to find out how. I've created an issue: NSLS-II/nslsii#115.

> * We were doing the FlyingMono testing where we enable those PVs. When I did my test of energy.move(), it failed because I could not control the mono/undulator. Is it possible to disable FlyingMono in order to complete an energy.move()? Or maybe we need to look at how we stage/unstage FlyingMono so that it defaults to a disabled stage?

Can you point me to the code? I guess the failure is legitimate because the PVs are not writable at that time. I expect it would fail with the older conda env too, so it's not directly relevant to the deployment process. Please create an issue in this repo, so we can spend some development/debugging time on it later this cycle.

> * Our xs3 has too many read attributes. Right now, we have a 4-element detector, and on the IOC there are 16 ROIs for each element. We only allow the user to set 3 ROIs, so we have an additional 13x4 = 52 ROIs that are useless (plus the sum for each ROI). Can we decrease this so we only have the first 3 ROIs for each element?

Yes, I think the unused ROIs can be disabled. What does xs.channel1.rois.read() show? Some components can be set as "omitted", so they are not captured by databroker. Something like:

for i, cpt in enumerate(xs.channel1.rois.component_names):
    if i < 3:
        getattr(xs.channel1.rois, cpt).kind = 'normal'
    else:
        getattr(xs.channel1.rois, cpt).kind = 'omitted'

(repeat the same for all channels)
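
A rough sketch of the "repeat for all channels" step, wrapped in a helper so the loop is written once. To keep it runnable without the detector, it uses SimpleNamespace stand-ins instead of real ophyd objects. The helper names, the roi01..roi16 names, and the channel2..channel4 attribute names are assumptions; only xs.channel1.rois, component_names and the 'normal'/'omitted' kinds come from the snippet above.

```python
from types import SimpleNamespace


def keep_first_rois(rois, n_keep=3):
    """Mark the first `n_keep` ROI components 'normal' and the rest 'omitted'."""
    for i, name in enumerate(rois.component_names):
        getattr(rois, name).kind = "normal" if i < n_keep else "omitted"


def make_fake_rois(n_rois=16):
    """Stand-in for `xs.channelN.rois`; NOT an ophyd object, illustration only."""
    names = tuple(f"roi{i:02d}" for i in range(1, n_rois + 1))
    rois = SimpleNamespace(component_names=names)
    for name in names:
        setattr(rois, name, SimpleNamespace(kind="normal"))
    return rois


# Four stand-in channels; the channelN attribute names are assumed.
xs = SimpleNamespace(
    **{f"channel{ch}": SimpleNamespace(rois=make_fake_rois()) for ch in range(1, 5)}
)

# Apply the same rule to every channel instead of copy-pasting the loop.
for ch in range(1, 5):
    keep_first_rois(getattr(xs, f"channel{ch}").rois)
```

On the real detector only the final loop would be needed, with xs being the actual device, provided its channels are really exposed under those attribute names.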

@mrakitin
Member Author

@andrewmkiss, did my reply answer your questions?

@andrewmkiss
Contributor

> @andrewmkiss, did my reply answer your questions?

I'll look at this right now. I got distracted with other work.

@andrewmkiss
Contributor

@mrakitin Okay, so changes were pushed to deploy-2021-1.0.

With the ROIs, the code was actually in there, but it would only be applied if TOUCHBEAMLINE == 1. I have changed it so these settings are applied whether or not TOUCHBEAMLINE is 1. (The TOUCHBEAMLINE variable was set up so that multiple instances of bluesky could be opened without affecting each other. Basically, we needed bluesky open on a separate machine to get Merlin image data because it had the correct handler configured.)

I'll create an issue for the energy/undulator issue.

@andrewmkiss
Contributor

#25 Undulator Fly-scanning issue created here.

@mrakitin
Member Author

@andrewmkiss, thanks for the update. I don't see the aforementioned commit in https://github.com/NSLS-II-SRX/profile_collection/pull/24/commits. If I read it right, the changes were pushed to the branch which has already been merged into master and deleted from GitHub, and then recreated from a local push. I will create a new PR, so that we merge it to master, tag, and use it from there.

@andrewmkiss
Contributor

@mrakitin Okay, I pushed my latest changes to GitHub and tagged them as described above. When I click on the playbooks link, I get a "page not found" error.

Also, @mrakitin, I saw a step in there with Ansible. Can you deploy the collection-2021-1.0 profile to xf05id1-ws4? This is our backup data collection computer.

@mrakitin
Member Author

Thanks, @andrewmkiss! This is done. The new collection conda environment is also on ws4 now.

3 participants