[Cron] could you generate data for OpenGalaxy? #1208

Open
tyn1998 opened this issue Feb 22, 2023 · 13 comments · May be fixed by #1209
Labels
waiting for repliers need other's feedback

Comments

@tyn1998
Member

tyn1998 commented Feb 22, 2023

Description

Reference: X-lab2017/open-galaxy#34

Hi guys, may I request a cron task that generates data for OpenGalaxy? If you can help, please describe in the PR how you filter nodes and links to reduce the size of the graph data.

Cron Expression

every month

@github-actions github-actions bot added the waiting for repliers need other's feedback label Feb 22, 2023
@frank-zsy
Contributor

/self-assign

I can do this for you. To generate OpenGalaxy global data for every month, I think we can reuse the condition from the basic metrics export task, which is openrank > e.

We can export all the repo and user nodes with a monthly OpenRank value larger than e and activity larger than 2, to avoid too many edges.

With the thresholds above, we get a graph with 94,789 nodes and 133,960 edges for 2023-01, which is a desirable graph size.

I will also normalize the edge length into the range 10 - 30 based on the activity score, so the graph renders properly in OpenGalaxy.

A 3,000-iteration layout calculation will be used to generate positions in 3D space. I will try my best to produce a continuous layout result.
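As a rough illustration of the filtering described above, here is a minimal Python sketch. It is not the OpenDigger export code: the record layout and field names (`id`, `openrank`, `source`, `target`, `activity`) are hypothetical, and it reads the activity threshold as applying to edges, which is one possible interpretation of the comment.

```python
import math

# Hypothetical sample records; field names are assumptions for illustration.
nodes = [
    {"id": "repo/a", "openrank": 5.2},
    {"id": "repo/b", "openrank": 1.1},
    {"id": "user/c", "openrank": 3.0},
]
edges = [
    {"source": "user/c", "target": "repo/a", "activity": 4.5},
    {"source": "user/c", "target": "repo/b", "activity": 9.0},
    {"source": "user/c", "target": "repo/a", "activity": 1.0},
]


def filter_graph(nodes, edges, min_openrank=math.e, min_activity=2.0):
    """Keep nodes with openrank > e, then edges with activity > 2
    whose two endpoints both survived the node filter."""
    kept_nodes = [n for n in nodes if n["openrank"] > min_openrank]
    kept_ids = {n["id"] for n in kept_nodes}
    kept_edges = [
        e for e in edges
        if e["activity"] > min_activity
        and e["source"] in kept_ids
        and e["target"] in kept_ids
    ]
    return kept_nodes, kept_edges


kept_nodes, kept_edges = filter_graph(nodes, edges)
print(len(kept_nodes), "nodes,", len(kept_edges), "edges")
```

Dropping edges whose endpoints were filtered out keeps the exported graph self-consistent, so the renderer never sees a link to a missing node.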
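One way to map activity scores into the 10 - 30 edge length range is a clamped linear (min-max) rescale. The sketch below is an assumption, not the method used in the PR. In particular, it assumes that higher activity should give a shorter edge, which the comment does not state.

```python
def edge_length(activity, act_min, act_max, lo=10.0, hi=30.0):
    """Linearly map activity into [lo, hi].

    Assumption: the most active pair gets the shortest edge (lo)
    and the least active pair gets the longest edge (hi).
    """
    if act_max == act_min:
        # All activities are equal, so use the middle of the range.
        return (lo + hi) / 2
    t = (activity - act_min) / (act_max - act_min)
    t = min(1.0, max(0.0, t))  # clamp values outside the observed range
    return hi - t * (hi - lo)


# Hypothetical activity scores for three edges.
activities = [2.5, 10.0, 40.0]
lengths = [edge_length(a, min(activities), max(activities)) for a in activities]
print(lengths)
```

Because activity distributions are often heavily skewed, a logarithmic rescale may spread the lengths more evenly. That is a design choice to evaluate against the real data.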

@github-actions github-actions bot added waiting for author need issue author's feedback and removed waiting for repliers need other's feedback labels Feb 22, 2023
@frank-zsy
Contributor

I think we can also export label data to a new file, so you can use it in OpenGalaxy to render the nodes in different colors.

@tyn1998
Member Author

tyn1998 commented Feb 22, 2023

Hi @frank-zsy, thank you for the explanation on export details.

By "also export label data", what do you mean? Is it another file different from https://oss.x-lab.info/open_galaxy/v2/labels.json?

@github-actions github-actions bot added waiting for repliers need other's feedback and removed waiting for author need issue author's feedback labels Feb 22, 2023
@frank-zsy
Contributor

The label data here means the label data in OpenDigger, such as whether a repo belongs to a company or a foundation. Currently we have more than 10,000 repos and 380 orgs with labels, so it will cover many of the repo nodes and work well for color rendering.
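To make the coloring idea concrete, here is a small Python sketch of a label lookup with a fallback color. The label type names, the color values, the node ids, and the shape of the label file are all hypothetical. The actual OpenDigger label export may look quite different.

```python
# Hypothetical mapping from label type to a render color.
LABEL_COLORS = {
    "company": "#1f77b4",
    "foundation": "#2ca02c",
}
DEFAULT_COLOR = "#999999"  # used for nodes without any label


def node_color(node_id, labels):
    """Return the color for a node, falling back to DEFAULT_COLOR."""
    label_type = labels.get(node_id)
    return LABEL_COLORS.get(label_type, DEFAULT_COLOR)


# Hypothetical label data keyed by node id.
labels = {"org-a/repo-1": "company", "org-b/repo-2": "foundation"}

for node_id in ("org-a/repo-1", "org-b/repo-2", "org-c/repo-3"):
    print(node_id, node_color(node_id, labels))
```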

@github-actions github-actions bot added waiting for author need issue author's feedback and removed waiting for repliers need other's feedback labels Feb 22, 2023
@tyn1998
Member Author

tyn1998 commented Feb 22, 2023

That sounds great!

@github-actions github-actions bot added waiting for repliers need other's feedback and removed waiting for author need issue author's feedback labels Feb 22, 2023
@frank-zsy frank-zsy linked a pull request Feb 22, 2023 that will close this issue
4 tasks
@frank-zsy
Contributor

frank-zsy commented Feb 23, 2023

I would like to generate all the data for OpenGalaxy by month from 201501 to 202301 with continuous layout positions.

I will upload the data to OSS under folders named 201501 through 202301, so you can set ?v=201812 in the URL to load the data for 201812. The default version would be the latest month, 202301 for now.

Does this make sense to you? @tyn1998
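The versioning scheme proposed above could be resolved on the client roughly as follows. This is a sketch under stated assumptions: the function names and the example.org URLs are placeholders, and falling back to the latest month for an unknown `v` value is an assumed behavior, not something the thread specifies.

```python
from urllib.parse import urlparse, parse_qs


def month_range(start, end):
    """Return yyyymm strings from start to end, inclusive."""
    year, month = int(start[:4]), int(start[4:])
    end_year, end_month = int(end[:4]), int(end[4:])
    months = []
    while (year, month) <= (end_year, end_month):
        months.append(f"{year:04d}{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return months


VERSIONS = month_range("201501", "202301")


def resolve_version(url, versions=VERSIONS):
    """Pick the data folder from the ?v= parameter.

    Assumption: a missing or unknown version falls back to the latest month.
    """
    query = parse_qs(urlparse(url).query)
    requested = query.get("v", [None])[0]
    return requested if requested in versions else versions[-1]


print(resolve_version("https://example.org/galaxy?v=201812"))
print(resolve_version("https://example.org/galaxy"))
```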

@github-actions github-actions bot added waiting for author need issue author's feedback and removed waiting for repliers need other's feedback labels Feb 23, 2023
@tyn1998
Member Author

tyn1998 commented Feb 23, 2023

Is yyyy-mm a valid folder name and a valid url param? If so, I prefer yyyy-mm.

@github-actions github-actions bot added waiting for repliers need other's feedback and removed waiting for author need issue author's feedback labels Feb 23, 2023
@frank-zsy
Contributor

OK, I think it is
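A quick check supports this: the hyphen is an unreserved character in URLs, so a yyyy-mm value passes through percent-encoding unchanged. The Python sketch below validates the format and converts from the compact yyyymm form. The helper names are hypothetical.

```python
import re
from urllib.parse import quote

# Four digits, a hyphen, and a month from 01 to 12.
MONTH_RE = re.compile(r"\d{4}-(0[1-9]|1[0-2])")


def is_valid_month(value):
    """True if value looks like yyyy-mm with a month in 01..12."""
    return MONTH_RE.fullmatch(value) is not None


def to_hyphenated(compact):
    """Convert a yyyymm string such as '201812' to '2018-12'."""
    return f"{compact[:4]}-{compact[4:]}"


version = to_hyphenated("201812")
print(version, is_valid_month(version))
# quote() leaves the value untouched because '-' never needs escaping.
print(quote(version, safe="") == version)
```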

@github-actions github-actions bot added waiting for author need issue author's feedback and removed waiting for repliers need other's feedback labels Feb 23, 2023
@frank-zsy
Contributor

frank-zsy commented Feb 26, 2023

@tyn1998 Do you plan to put more effort into OpenGalaxy? I found it really hard to produce a continuous layout for the 3D galaxy. Is the data for 2023-01 enough for now? I cannot find a proper way to generate the data.

The parameters we should consider are:

  • How to set the bounds based on the current node count or total OpenRank. With different node counts, the galaxy size should be different.
  • How to determine whether the layout calculation has converged. Currently we use ngraph.offline.layout, an offline tool that calculates the layout for a fixed number of iterations, but we cannot tell whether the layout has converged.
  • How to check whether the layouts are continuous. Even if I use last month's layout positions as the initial positions for the next month's calculation, I still cannot tell whether the generated layouts are continuous. In the ECharts force layout graph component, the graph may rotate several times before it converges, so the layouts may not be continuous, but I cannot tell.
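For the convergence and continuity questions in the list above, one possible starting point is to measure how far nodes move between two sets of positions. The Python sketch below is only an illustration under assumptions: positions are taken to be a dict from node id to an (x, y, z) tuple, the tolerance value is arbitrary, and the check presumes that positions from two consecutive iterations (or two consecutive months) are available to compare. It does not factor out a global rotation of the whole graph, so a rotated but otherwise identical layout would still score as a large displacement.

```python
import math


def max_displacement(prev, curr):
    """Largest distance moved by any node present in both layouts."""
    shared = prev.keys() & curr.keys()
    return max((math.dist(prev[n], curr[n]) for n in shared), default=0.0)


def mean_displacement(prev, curr):
    """Average distance moved by nodes present in both layouts."""
    shared = prev.keys() & curr.keys()
    if not shared:
        return 0.0
    return sum(math.dist(prev[n], curr[n]) for n in shared) / len(shared)


def has_converged(prev, curr, tol=0.01):
    """Treat the layout as converged when no node moved more than tol.

    Assumption: tol is an arbitrary threshold to tune per data set.
    """
    return max_displacement(prev, curr) < tol


# Hypothetical positions for two consecutive months.
jan = {"a": (0.0, 0.0, 0.0), "b": (3.0, 4.0, 0.0)}
feb = {"a": (0.0, 0.0, 1.0), "b": (3.0, 4.0, 0.0), "c": (9.0, 9.0, 9.0)}

print(mean_displacement(jan, feb), max_displacement(jan, feb))
print(has_converged(jan, feb), has_converged(jan, jan))
```

Only nodes present in both layouts are compared, so newly added nodes do not inflate the continuity score. Handling rotation would need a rigid alignment step before measuring, which this sketch leaves out.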

So I think generating layouts for a timeline will be a long-term task.

@tyn1998
Member Author

tyn1998 commented Feb 26, 2023

Hi, @frank-zsy, thanks a lot for your effort!

The knowledge and skills needed to generate continuous layouts for OpenGalaxy are indeed complex. I agree that more time and energy are needed to meet this challenge.

For now, the data of 2023-01 is enough for building a demo application.

@github-actions github-actions bot added waiting for repliers need other's feedback and removed waiting for author need issue author's feedback labels Feb 26, 2023
@frank-zsy
Contributor

@tyn1998 Thanks, I will look into the details in the future.

@tyn1998
Member Author

tyn1998 commented Mar 18, 2023

Hi @frank-zsy, could you export OpenGalaxy data of 2023-02 and set it as the default data?

@github-actions github-actions bot added waiting for repliers need other's feedback and removed waiting for author need issue author's feedback labels Mar 18, 2023
@tyn1998
Member Author

tyn1998 commented Oct 4, 2023

> Hi @frank-zsy, could you export OpenGalaxy data of 2023-02 and set it as the default data?

Hello @frank-zsy, could you export the data of 2023-09 and make it default?

2 participants