Make API info available - CLI via Hub/proxy and/or UIS #396

Open
dwsutherland wants to merge 4 commits into master

Conversation

dwsutherland
Member

@dwsutherland dwsutherland commented Nov 22, 2022

These changes partially address cylc/cylc-flow#5235
Sibling to cylc/cylc-flow#5267

On spawning the UI Server (Jupyter server), the hub negotiates an API token and puts it, along with other variables, into the environment as JUPYTERHUB_API_TOKEN. If it is not spawned by the hub, the UI Server generates a token itself.

Using the server app's information (self.serverapp.server_info()) and overwriting it with HUB info, we can make this available by writing it to a (user read-only) file:

$ pwd
/home/sutherlander/.cylc/uiserver
$ ls -larth api_info.json 
-rw------- 1 sutherlander sutherlander 288 Nov 22 11:25 api_info.json
$ cat api_info.json
{"url": "http://127.0.0.1:39409/user/sutherlander/", "hostname": "127.0.0.1", "port": 39409, "sock": "", "secure": false, "base_url": "/user/sutherlander/", "token": "2f2abf2e4bcd43589711c1554ce2cc7e", "root_dir": "/home/sutherlander", "password": false, "pid": 6610, "version": "1.21.0"}

(this token changes on each UIS spawn)

This can be used to access the UIS via the UIS URL (as above), but also via the hub proxy (http://localhost:8000/user/sutherlander/).
It could also be copied (manually only?) to other machines on the network for API access from there.
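
As a rough illustration of the "user read-only file" idea above, the sketch below writes a dict as JSON with mode 0600 on a POSIX system. The function name write_private_json is hypothetical and is not taken from this PR; the real write_api_info method may do this differently.

```python
import json
import os
from pathlib import Path


def write_private_json(path: Path, info: dict) -> None:
    """Write ``info`` as JSON to ``path``, readable and writable by the owner only.

    Hypothetical helper for illustration; not the PR's implementation.
    """
    path.parent.mkdir(parents=True, exist_ok=True)
    # Create the file with restrictive permissions from the outset so the
    # token is not briefly readable by other users.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        # The mode given to os.open() only applies to newly created files,
        # so reset it explicitly in case the file already existed.
        os.chmod(path, 0o600)
        json.dump(info, handle)
```

Calling write_private_json(Path.home() / ".cylc/uiserver/api_info.json", info) would give a file with the -rw------- permissions shown in the listing above.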

Tasks:

  • Write API info, including hub token if available, to a user read-only file.
  • Clean up file on UI Server shutdown (use absence as indicator that UIS isn't running?).
    • Hubless clean-up.
  • Find out hub proxy URL (if available), and use that instead of or in addition to UIS URL.
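
One possible shape for the clean-up task, sketched with hypothetical helper names (none of these exist in the PR). It removes the file at interpreter exit and treats the file's presence as a hint that a UIS is running. Note that atexit handlers do not run when the process is killed with SIGKILL, so absence of the file is a stronger signal than presence.

```python
import atexit
from pathlib import Path


def remove_api_info(path: Path) -> None:
    """Delete the info file; do nothing if it has already gone."""
    path.unlink(missing_ok=True)


def uis_appears_running(path: Path) -> bool:
    """Treat the presence of the info file as a hint that a UIS is up."""
    return path.is_file()


def register_cleanup(path: Path) -> None:
    """Ask the interpreter to remove the info file on normal shutdown."""
    atexit.register(remove_api_info, path)
```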

Sample:

#!/usr/bin/env python3

import json
import requests

from cylc.uiserver.app import API_INFO_FILE

with open(API_INFO_FILE, "r") as f:
    api_info = json.load(f)

query = '''
query {
  workflows {
    id
    stateTotals
  }
}
'''

r = requests.post(api_info["url"] + 'cylc/graphql',
    headers={
        'Authorization': f'token {api_info["token"]}',
    },
    json={'query': query}
)


r.raise_for_status()
data = r.json()

print(json.dumps(data, indent=4))

$ ./uis_api.py
{
    "data": {
        "workflows": [
            {
                "id": "~sutherlander/linear/run1",
                "stateTotals": {
                    "waiting": 2,
                    "expired": 0,
                    "preparing": 0,
                    "submit-failed": 1,
                    "submitted": 0,
                    "running": 0,
                    "failed": 0,
                    "succeeded": 0
                }
            }
        ]
    }
}

Requirements check-list

  • I have read CONTRIBUTING.md and added my name as a Code Contributor.

  • Contains logically grouped changes (else tidy your branch by rebase).

  • Does not contain off-topic changes (use other PRs for other changes).

  • Appropriate tests are included (test for token use should be included).

  • Already covered by existing tests (file is written on UIS start).

  • Appropriate change log entry included.

  • I have opened a documentation PR at cylc/cylc-doc/pull/XXXX.

@dwsutherland dwsutherland self-assigned this Nov 22, 2022
@dwsutherland dwsutherland changed the title api info file written Make API info available - CLI via Hub/proxy and/or UIS Nov 22, 2022
@codecov-commenter

codecov-commenter commented Nov 22, 2022

Codecov Report

Base: 80.00% // Head: 79.84% // Decreases project coverage by -0.15% ⚠️

Coverage data is based on head (6765e1e) compared to base (c204b88).
Patch coverage: 80.95% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #396      +/-   ##
==========================================
- Coverage   80.00%   79.84%   -0.16%     
==========================================
  Files          12       12              
  Lines        1160     1181      +21     
  Branches      197      199       +2     
==========================================
+ Hits          928      943      +15     
- Misses        195      198       +3     
- Partials       37       40       +3     
Impacted Files Coverage Δ
cylc/uiserver/app.py 87.40% <80.95%> (-1.19%) ⬇️
cylc/uiserver/authorise.py 88.64% <0.00%> (-0.88%) ⬇️

@@ -515,6 +524,22 @@ def set_auth(self):
    def initialize_templates(self):
        """Change the jinja templating environment."""

    def write_api_info(self):
Contributor

I think we can get away without this write.
Apologies if this is stuff you already know, not sure how familiar you are with #370

In ~/.cylc/uiserver/info_files, two files are created for every instance: a jpserver-<pid>-open.html file, which we use for opening the GUI with an existing instance (see the code in scripts/gui.py), and a jpserver-<pid>.json file. It is the latter that contains the same info as the api_info file.
I've compared the files and the info is the same, until you open a new instance using cylc gui --new (sorry, I think that PR has complicated matters a little). At that point we have one api_info.json file for two jpserver-<pid>.json files, and the api_info file, for me, has recorded a different port. I don't think this is a major problem, but I think we can select one of these existing jpserver files and open that in the async request? This should get around the duplication.

The only potential problem I can think of at the moment is in the selection of the file. At the minute, we select a random file for re-using guis. I suspect we may want to select the particular file for the instance that we are interacting with, although I am not 100% sure about this. If so, the file name will be of the format jpserver-<pid>.json so we only need the process id to get the correct file.
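
A minimal sketch of that lookup, assuming only the jpserver-<pid>.json naming described above. The helper name info_file_for_pid is hypothetical.

```python
import json
from pathlib import Path
from typing import Optional


def info_file_for_pid(info_dir: Path, pid: int) -> Optional[Path]:
    """Return the jpserver-<pid>.json file for ``pid``, or None if absent."""
    candidate = info_dir / f"jpserver-{pid}.json"
    return candidate if candidate.is_file() else None


def load_info_for_pid(info_dir: Path, pid: int) -> Optional[dict]:
    """Load the server info for one specific instance, if its file exists."""
    path = info_file_for_pid(info_dir, pid)
    if path is None:
        return None
    return json.loads(path.read_text())
```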

Contributor

Sorry, I think it is a bit more complicated.

The above scenario was for running the cylc gui command. For cylc hub I didn't change the JUPYTER_RUNTIME_DIR variable, so the files are still generated per instance, but they are saved to the default .local/share/jupyter directory; see: https://docs.jupyter.org/en/latest/use/jupyter-directories.html#envvar-JUPYTER_RUNTIME_DIR

The info contained looks the same. Not sure what goes on with different users attaching though. Might need some investigation.

@dwsutherland
Member Author

Thanks for looking at this @datamel !

Oh that's handy to know, found the hub created one.. a whole pile of them (~600 or so)

We should use it, it worked setting os.environ["JUPYTER_RUNTIME_DIR"] in the hubapp.py ..

$ ls .cylc/uiserver/info_files/
jpserver-7002.json  jpserver-7002-open.html  jupyter_cookie_secret

However, a couple of things:

  • How do we set it so it uses a relative path (i.e. ~/.cylc...), or evaluates the user's home directory each time?
  • How do we get it to clean up the files when stopped by the hub? (at the moment they just collect in that .local/share/jupyter and are never cleaned up)

We could possibly just do something like:

list_of_paths = folder_path.glob('*.json')
latest_path = max(list_of_paths, key=lambda p: p.stat().st_ctime)

(found on some substack)
to use the latest UIS.
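
A slightly more defensive variant of the snippet above, as a sketch. It swaps in st_mtime for st_ctime (on Unix, st_ctime is the inode change time rather than the creation time), restricts the glob to the jpserver-*.json naming mentioned earlier in the thread, and returns None for an empty directory instead of raising. The function name is hypothetical.

```python
from pathlib import Path
from typing import Optional


def latest_info_file(folder_path: Path) -> Optional[Path]:
    """Return the most recently modified jpserver-*.json file, or None."""
    return max(
        folder_path.glob("jpserver-*.json"),
        key=lambda p: p.stat().st_mtime,
        # Without a default, max() raises ValueError when no files match.
        default=None,
    )
```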

Another thing is, I want to be able to use the hub proxy URL instead of UIS directly (as that may be exposed to a different network).. But I need to be able to scrape the URL from somewhere, and probably add it to this file.. Any ideas?

@datamel
Contributor

datamel commented Jan 31, 2023

Oh that's handy to know, found the hub created one.. a whole pile of them (~600 or so)

Yes, Jupyter does clean them, but I think we need to add our own clean-up logic as it seems patchy (this was discussed recently on Element). They don't get cleaned when cylc gui/hub is not shut down cleanly, e.g. killed with a kill -9.
I think when we reuse these files, it may be necessary to check that the process is still running, clean up the files if it isn't, and reselect until we get a working one. That should be a quick PR to do, as we have the process ID in the file name. Some investigation is still needed into which host the process is on and where the file is: is a file also generated for each user connecting to the hub?
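
The check described above could look roughly like the following sketch. It assumes a POSIX system and that the server process runs on the same host as the files, which is exactly the open question raised here, so treat it as illustrative only. All function names are hypothetical.

```python
import os
import re
from pathlib import Path
from typing import List

# File names are assumed to follow the jpserver-<pid>.json pattern.
PID_PATTERN = re.compile(r"^jpserver-(\d+)\.json$")


def pid_is_alive(pid: int) -> bool:
    """Check whether a process with this PID exists on the local host."""
    try:
        os.kill(pid, 0)  # Signal 0 performs the check without signalling.
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # The process exists but belongs to another user.
    return True


def live_info_files(info_dir: Path, remove_stale: bool = False) -> List[Path]:
    """List info files whose server process is still running.

    With ``remove_stale=True``, also delete the JSON/HTML pair left behind
    by processes that are no longer running.
    """
    live = []
    for path in sorted(info_dir.glob("jpserver-*.json")):
        match = PID_PATTERN.match(path.name)
        if match is None:
            continue
        pid = int(match.group(1))
        if pid_is_alive(pid):
            live.append(path)
        elif remove_stale:
            path.unlink(missing_ok=True)
            html = path.with_name(f"jpserver-{pid}-open.html")
            html.unlink(missing_ok=True)
    return live
```

A PID can be reused by an unrelated process after the original server exits, so a "live" result is only a hint; a request to the recorded URL would be a stronger check.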

We could possibly just do something like:

list_of_paths = folder_path.glob('*.json')
latest_path = max(list_of_paths, key=lambda p: p.stat().st_ctime)
(found on some substack)
to use the latest UIS.

At the moment we do:

existing_guis = glob(os.path.join(INFO_FILES_DIR, "*open.html"))

and then randomly select one, rather than using the latest file (note that this file is the html open one, not the json one that you need - they are generated in a pair).
I don't fully understand the issue of implementing this comms method - would any uiserver info be suitable, or do we need the particular uiserver that the workflow will be going through?

@datamel
Contributor

datamel commented Jan 31, 2023

We should use it, it worked setting os.environ["JUPYTER_RUNTIME_DIR"] in the hubapp.py ..

Just thinking now that we may want to put them in a separate dir to the one used for cylc gui files? Since we are reusing guis (and updating the html file with a new url) now I think it best to keep these separate. This will stop us selecting a hub file when we want to open the gui (specifically not going through the hub). Perhaps we could redesign the structure of the dir so we keep gui and hub files in their own sub dirs? ~/.cylc/uiserver/info_files/hub/ and ../gui/?
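
A sketch of how the suggested sub-directory layout might be wired up, with the home directory expanded at call time rather than hard-coded. JUPYTER_RUNTIME_DIR is the variable discussed in this thread; the helper names and the gui/hub split are assumptions based on the suggestion above, not existing code.

```python
import os
from pathlib import Path
from typing import MutableMapping, Optional

# Assumed layout: ~/.cylc/uiserver/info_files/{gui,hub}
INFO_FILES_ROOT = "~/.cylc/uiserver/info_files"


def runtime_dir_for(kind: str, root: str = INFO_FILES_ROOT) -> Path:
    """Return the info-file directory for 'gui' or 'hub' instances."""
    if kind not in ("gui", "hub"):
        raise ValueError(f"unknown instance kind: {kind!r}")
    # expanduser() resolves "~" for whichever user is running the process.
    return Path(root).expanduser() / kind


def set_runtime_dir(
    kind: str, environ: Optional[MutableMapping[str, str]] = None
) -> Path:
    """Point JUPYTER_RUNTIME_DIR at the directory for this kind of instance."""
    environ = os.environ if environ is None else environ
    path = runtime_dir_for(kind)
    environ["JUPYTER_RUNTIME_DIR"] = str(path)
    return path
```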

Another thing is, I want to be able to use the hub proxy URL instead of UIS directly (as that may be exposed to a different network).. But I need to be able to scrape the URL from somewhere, and probably add it to this file.. Any ideas?

I'll read up about this and get back to you. I'm not fully sure how to do this at the moment!

@dwsutherland
Member Author

dwsutherland commented Feb 1, 2023

Just thinking now that we may want to put them in a separate dir to the one used for cylc gui files? Since we are reusing guis (and updating the html file with a new url) now I think it best to keep these separate.

The hub-located files and the other files look to have identical information (not hub specific, though it might pay to check), so I think it might be safe to locate these in the same place.

However, for the CLI (and maybe the gui), I would like to have the hub proxy URL available to use (maybe preferentially). It would also help if we didn't have to read two files (i.e. have the same file contain a hub_url field).
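
One way a single file could carry both pieces of information is sketched below. The hub_url field name comes from the comment above. JUPYTERHUB_API_TOKEN is the variable named in the PR description; JUPYTERHUB_SERVICE_PREFIX is, to my knowledge, also exported by JupyterHub to spawned servers, but treat that as an assumption. The proxy_origin parameter is hypothetical: the thread has not established where the proxy's public address can be read from, so here the caller must supply it.

```python
from typing import Mapping, Optional
from urllib.parse import urljoin


def with_hub_info(
    server_info: dict,
    environ: Mapping[str, str],
    proxy_origin: Optional[str] = None,
) -> dict:
    """Return a copy of ``server_info`` overlaid with hub details, if any."""
    info = dict(server_info)
    token = environ.get("JUPYTERHUB_API_TOKEN")
    if token:
        # When spawned by the hub, the hub-negotiated token takes precedence.
        info["token"] = token
    prefix = environ.get("JUPYTERHUB_SERVICE_PREFIX")
    if proxy_origin and prefix:
        # e.g. "http://localhost:8000" + "/user/<name>/"
        info["hub_url"] = urljoin(proxy_origin, prefix)
    return info
```

Outside the hub (no JUPYTERHUB_* variables set) the function returns the server info unchanged, so a reader of the file can test for the presence of hub_url.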

I would go further and say that, instead of starting up a new UIS for the gui, we should reuse an existing one, no? (regardless of whether it was started by the hub or not)

@dwsutherland
Member Author

I don't fully understand the issue of implementing this comms method - would any uiserver info be suitable, or do we need the particular uiserver that the workflow will be going through?

Any UIS works for the CLI via UIS, and we can also use the hub proxy url (which may be preferable, as it might be intentionally exposed to a wider network)
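
As a sketch of that preference, a client could pick the endpoint like this. It assumes the info file may carry an optional hub_url field (proposed earlier in this thread, not yet implemented) alongside the url field shown in the PR description; the cylc/graphql path is taken from the sample script. The function name is hypothetical.

```python
def graphql_endpoint(api_info: dict, prefer_hub: bool = True) -> str:
    """Choose the GraphQL endpoint, preferring the hub proxy when known."""
    base = api_info["url"]
    if prefer_hub and api_info.get("hub_url"):
        base = api_info["hub_url"]
    # Tolerate a base URL with or without a trailing slash.
    return base.rstrip("/") + "/cylc/graphql"
```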

@dwsutherland
Member Author

They don't get cleaned when cylc gui/hub is not shut down cleanly, e.g. killed with a kill -9.

Also, they are not cleaned up when stopping the UIS via the hub interface...

@oliver-sanders oliver-sanders modified the milestones: 1.3.0, 1.4.0 Jul 17, 2023
@MetRonnie MetRonnie modified the milestones: 1.4.0, 1.5.0 Aug 17, 2023
@oliver-sanders oliver-sanders modified the milestones: 1.5.0, 1.6.0 Apr 23, 2024
@oliver-sanders oliver-sanders modified the milestones: 1.6.0, 1.7.0 Dec 3, 2024
5 participants