Move GCD uncompress to server-side #150

Open
mlincett opened this issue Mar 8, 2023 · 6 comments

@mlincett
Collaborator

mlincett commented Mar 8, 2023

Currently, the Skymap Scanner implements frame object diff logic for GCD information.

Alerts coming from the pole carry so-called "GCD diff" frames, which reduce bandwidth compared to sending the whole GCD package North. When a GCD diff is present, the Skymap Scanner fetches the baseline GCD and rebuilds the full GCD information (uncompress).

The current functionality is as follows:

  • the GCD is uncompressed server-side to pre-process the event (prepare_frames.py), but the uncompressed information is then deleted from the framepacket that is passed to the client;
  • the client, in turn, applies the same logic to rebuild the GCD to work on.

I see no benefit in performing this operation both server-side and client-side. It should be possible to move the uncompress stage to the server side and have the client work on the full GCD only.
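To make the proposal concrete, here is a minimal sketch of what "uncompress" amounts to conceptually: a baseline plus a diff yields the full GCD, and the server could do this once and hand the result to the client. This is only an illustration under stated assumptions: the dict-based representation, the `REMOVED` marker and the `uncompress_gcd` name are all hypothetical and do not reflect the actual frame object diff format or the Skymap Scanner code.

```python
from typing import Any, Dict

# Hypothetical sentinel: a diff entry set to REMOVED means "drop this key".
REMOVED = object()


def uncompress_gcd(baseline: Dict[str, Any], diff: Dict[str, Any]) -> Dict[str, Any]:
    """Rebuild a full GCD from a baseline and a diff (illustrative only).

    Both arguments are plain dicts here; the real GCD frames are not.
    The baseline is left untouched so it can be reused for other events.
    """
    full = dict(baseline)  # shallow copy of the baseline
    for key, value in diff.items():
        if value is REMOVED:
            full.pop(key, None)  # key existed in the baseline but not in the event GCD
        else:
            full[key] = value  # key is new or changed relative to the baseline
    return full


# Toy data standing in for geometry / calibration / detector status.
baseline = {"geometry": "geo-v1", "calibration": "cal-v1", "status": "status-v1"}
diff = {"calibration": "cal-v2"}

# Server-side only: rebuild once, then pass `full_gcd` on to the client.
full_gcd = uncompress_gcd(baseline, diff)
```

With this split, the client would never need access to the baseline or to the diff logic.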

@mlincett mlincett self-assigned this Mar 8, 2023
@dsschult
Member

dsschult commented Mar 8, 2023

There is one benefit: sending a smaller GCD file to each client. I'm not sure if there are any performance implications of sending the full GCD to each client, since 30MB x 1000 clients is a bit large if they all start at the same time. We can of course work around that with some cleverness (host the GCD on a fast server).
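For scale, the arithmetic behind that concern can be written out as below. The 30 MB and 1000 clients figures are the ones quoted above; the 10 Gbit/s link speed is purely an assumed number for illustration, and the helper name `transfer_seconds` is hypothetical.

```python
GCD_SIZE_MB = 30   # approximate full GCD size quoted in the comment
N_CLIENTS = 1000   # number of clients starting at the same time

total_mb = GCD_SIZE_MB * N_CLIENTS
total_gb = total_mb / 1000  # decimal gigabytes


def transfer_seconds(size_gb: float, link_gbit_per_s: float) -> float:
    """Idealized time to push `size_gb` through a single shared link."""
    return size_gb * 8 / link_gbit_per_s


print(f"aggregate transfer: {total_gb:.0f} GB")
# Assumed 10 Gbit/s link, ignoring protocol overhead and contention.
print(f"idealized time at 10 Gbit/s: {transfer_seconds(total_gb, 10.0):.0f} s")
```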

@mlincett
Collaborator Author

mlincett commented Mar 8, 2023

Thanks for pointing this out, I did not consider this aspect.

If this is really a concern, then we should approach the problem from the opposite direction. There are situations in which the full GCD transfer is currently forced and could potentially be avoided:

  • for GCD-less events, we can replicate the server logic on the client side (look up the "closest" GCD based on run number and copy over the GCD frames)
  • for simulated events it is a bit more complicated, as the set of possible GCDs is not defined a priori. For batch processing of large datasets, we could encourage embedding the GCD in the image.
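The run-number lookup in the first point could be sketched as follows. This assumes "closest" means the most recent baseline whose first valid run is not later than the event's run; that interpretation, the `closest_baseline` helper, the run numbers and the file names are all hypothetical and not taken from the existing server logic.

```python
import bisect
from typing import List, Tuple

# Hypothetical catalogue: (first run covered, baseline GCD file), sorted by run.
BASELINES: List[Tuple[int, str]] = [
    (130000, "baseline_gcd_130000.i3"),
    (132000, "baseline_gcd_132000.i3"),
    (135000, "baseline_gcd_135000.i3"),
]


def closest_baseline(run_number: int, baselines: List[Tuple[int, str]]) -> str:
    """Return the baseline with the largest first run <= run_number."""
    first_runs = [first_run for first_run, _ in baselines]
    # bisect_right gives the insertion point after any equal entry,
    # so subtracting one yields the last baseline starting at or before the run.
    idx = bisect.bisect_right(first_runs, run_number) - 1
    if idx < 0:
        raise LookupError(f"no baseline GCD covers run {run_number}")
    return baselines[idx][1]


gcd_file = closest_baseline(133500, BASELINES)
```

The client would then copy the GCD frames from `gcd_file` into the event, mirroring what the server does today.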

@dsschult
Member

I did some basic testing, and I think that now that we're using S3 to transfer input files with SkyDriver, compression isn't an issue. We probably do want to separate the GCD from the JSON (otherwise that's a huge base64 blob in there).

@mlincett
Collaborator Author

> I did some basic testing, and I think now that we're using S3 to transfer input files with skydriver compression isn't an issue. We probably do want to separate the GCD from the json (otherwise that's a huge base64 blob in there).

So the idea is that SkyDriver would write the GCD to an S3 bucket and the server would pass the object URL to the client?

@dsschult
Member

> > I did some basic testing, and I think now that we're using S3 to transfer input files with skydriver compression isn't an issue. We probably do want to separate the GCD from the json (otherwise that's a huge base64 blob in there).
>
> So the idea is that SkyDriver would write the GCD to an S3 bucket and the server would pass the object URL to the client?

Close. It can actually pass the object URL to HTCondor, which will download it and put it in the directory the client starts in. So for the client, it looks like the GCD file is in $PWD.
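From the client's point of view, that could look roughly like the sketch below: given the object URL that was handed to HTCondor, the client only needs the file name and looks for it in its working directory. This assumes the transferred file keeps the last path component of the URL as its name; the `local_gcd_path` helper and the example URL are hypothetical.

```python
from pathlib import Path, PurePosixPath
from typing import Optional
from urllib.parse import urlparse


def local_gcd_path(object_url: str, workdir: Optional[Path] = None) -> Path:
    """Map an object URL to the path where the client expects the GCD file.

    Assumes the file lands in the job's starting directory under the
    URL's base name (query strings such as signatures are ignored).
    """
    name = PurePosixPath(urlparse(object_url).path).name
    if not name:
        raise ValueError(f"cannot derive a file name from {object_url!r}")
    return (workdir or Path.cwd()) / name


# Hypothetical URL; the client never downloads anything itself.
gcd_path = local_gcd_path("https://example.org/bucket/event123_gcd.i3?sig=abc")
if not gcd_path.exists():
    print(f"GCD file not found at {gcd_path}; was it transferred?")
```

The same lookup would work when the file is placed there by any other transfer method, since the client only depends on the local path.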

@dsschult
Member

I should note that if running this manually, you can also transfer the GCD directly via condor file transfer or any other method. So this would work outside SkyDriver.
