Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: ieee-rawbib fetcher #24

Merged
merged 8 commits into from
Nov 4, 2023
Merged

Conversation

stefanomunarini
Copy link
Contributor

No description provided.

@stefanomunarini
Copy link
Contributor Author

@ronaldtse (cc @andrew2net)
As I do not have access to this repository secrets, can you please check (or point me to someone I can ask to) if amongst the secrets of this repository exists a IETF_BIB_BOT_PAT token? And, in particular, if this token is valid to clone the private repository https://github.com/ietf-ribose/ieee-rawbib ?

As can be seen in the GHA logs for this PR, the token currently used in the Run bundle exec ruby crawler.rb step seems like to not have enough permissions to clone the above mentioned repository.

@stefanomunarini
Copy link
Contributor Author

#20

crawler.rb Outdated Show resolved Hide resolved
@stefanomunarini
Copy link
Contributor Author

stefanomunarini commented Oct 13, 2023

It seems like this secret has been removed from this project.

I added a "test" step to check if secrets.IETF_BIB_BOT_PAT is present among the repository secrets (which replicate previous implementation behaviour, see https://github.com/ietf-tools/relaton-data-ieee/pull/24/files#diff-b2195247a7dfa266381b77e7cbef41ca7cbf143fd06128db55a71de7043507d0R13-R22).

@CAMOBAP we might not need to use username:password (as I was suggesting here), but simply provide this secret and clone the repository using the canonical system("git clone https://oauth2:#{token}@github.com/ietf-ribose/ieee-rawbib.git ieee-rawbib").

Alex, could you point me at someone who has access to this repository settings?

@CAMOBAP
Copy link

CAMOBAP commented Oct 13, 2023

@stefanomunarini IDK, @ronaldtse maybe you can suggest a right person

@stefanomunarini
Copy link
Contributor Author

stefanomunarini commented Oct 25, 2023

Hi @kesara ,

It seems like this secret has been removed from this project.

I added a "test" step to check if secrets.IETF_BIB_BOT_PAT is present among the repository secrets (which replicate previous implementation behaviour, see https://github.com/ietf-tools/relaton-data-ieee/pull/24/files#diff-b2195247a7dfa266381b77e7cbef41ca7cbf143fd06128db55a71de7043507d0R13-R22).

@CAMOBAP we might not need to use username:password (as I was suggesting here), but simply provide this secret and clone the repository using the canonical system("git clone https://oauth2:#{token}@github.com/ietf-ribose/ieee-rawbib.git ieee-rawbib").

Alex, could you point me at someone who has access to this repository settings?

@kesara as outlined above, this repository is missing a secret, AKA secrets.IETF_BIB_BOT_PAT. Could you please check the settings or point me at someone that has access to this repo settings, to verify that this secret is set? And, if not, help providing it? Thanks

@kesara
Copy link
Member

kesara commented Oct 26, 2023

Hi @kesara ,

It seems like this secret has been removed from this project.
I added a "test" step to check if secrets.IETF_BIB_BOT_PAT is present among the repository secrets (which replicate previous implementation behaviour, see https://github.com/ietf-tools/relaton-data-ieee/pull/24/files#diff-b2195247a7dfa266381b77e7cbef41ca7cbf143fd06128db55a71de7043507d0R13-R22).
@CAMOBAP we might not need to use username:password (as I was suggesting here), but simply provide this secret and clone the repository using the canonical system("git clone https://oauth2:#{token}@github.com/ietf-ribose/ieee-rawbib.git ieee-rawbib").
Alex, could you point me at someone who has access to this repository settings?

@kesara as outlined above, this repository is missing a secret, AKA secrets.IETF_BIB_BOT_PAT. Could you please check the settings or point me at someone that has access to this repo settings, to verify that this secret is set? And, if not, help providing it? Thanks

This is set. It's a personal access token for a bot account that has access to a private repository. Earlier versions of the GHA used this to access the ieee-rawbib repository.
I guess the best way to test this is just mock it with a different private repository.
See https://github.com/ietf-tools/relaton-data-ieee/blob/5a30bfed1905c73ddcf61136f6413bac6d904d8f/.github/workflows/generate.yml

@kesara
Copy link
Member

kesara commented Oct 26, 2023

The secrets are not passed to workflows that are triggered by a pull request from a fork. So the PR tests will continue to fail.

@stefanomunarini
Copy link
Contributor Author

The secrets are not passed to workflows that are triggered by a pull request from a fork. So the PR tests will continue to fail.

I see, this was the issue then!

Would it be possible to enable passing secrets to workflows that are triggered by a PR? This is only to test the implementation before merging the PR, it could be disabled right after testing is completed.

@kesara

@ronaldtse
Copy link
Contributor

@stefanomunarini I don't think GitHub supports passing secrets to forks due to security concerns, for example, if a public user can create a PR, that means they can send the secret elsewhere eg via a remote call.

@stefanomunarini
Copy link
Contributor Author

stefanomunarini commented Oct 29, 2023

@stefanomunarini I don't think GitHub supports passing secrets to forks due to security concerns, for example, if a public user can create a PR, that means they can send the secret elsewhere eg via a remote call.

Right, makes total sense. I wonder if the only way to test this implementation is merging it (in this case, can you @CAMOBAP review this PR, please?)? Is there no other way otherwise?

@ronaldtse
Copy link
Contributor

I don't think there's a common alternative. @CAMOBAP can you please help check, thanks!

@kesara
Copy link
Member

kesara commented Oct 31, 2023

When this PR is ready, I'm happy to merge and I can revert the change if it doesn't work.

@stefanomunarini
Copy link
Contributor Author

Hi @kesara , we are waiting for final review from @CAMOBAP . After he's done with it, this PR can be merged. Thanks

@CAMOBAP
Copy link

CAMOBAP commented Nov 1, 2023

Hi @kesara , we are waiting for final review from @CAMOBAP . After he's done with it, this PR can be merged. Thanks

@stefanomunarini if PR is ready for review please press "Ready for review" because usually "draft" PRs mean that the work is in progress and it doesn't ready for review

BTW there is no objections from my side, once workflow actually works

@stefanomunarini stefanomunarini marked this pull request as ready for review November 1, 2023 17:32
@stefanomunarini
Copy link
Contributor Author

Hi @kesara , we are waiting for final review from @CAMOBAP . After he's done with it, this PR can be merged. Thanks

@stefanomunarini if PR is ready for review please press "Ready for review" because usually "draft" PRs mean that the work is in progress and it doesn't ready for review

You're right, that was an oversight from my side. Sorry about that

@kesara feel free to merge the PR, thanks

@CAMOBAP
Copy link

CAMOBAP commented Nov 1, 2023

@stefanomunarini should we rerun the CI because last message was "fatal: Authentication failed for 'https://github.com/ietf-ribose/ieee-rawbib.git/'"

@stefanomunarini
Copy link
Contributor Author

stefanomunarini commented Nov 1, 2023

@stefanomunarini should we rerun the CI because last message was "fatal: Authentication failed for 'https://github.com/ietf-ribose/ieee-rawbib.git/'"

@CAMOBAP This issue has been discussed above, in particular Ronal mentioned that:

@stefanomunarini I don't think GitHub supports passing secrets to forks due to security concerns, for example, if a public user can create a PR, that means they can send the secret elsewhere eg via a remote call.

thus the authentication failed message.

Hence, we need to merge this PR in order to be able to test it. Kesara offered to roll it back immediately in the event of a CI failure.

crawler.rb Outdated Show resolved Hide resolved
@kesara kesara merged commit f3e6747 into ietf-tools:main Nov 4, 2023
1 check passed
@kesara
Copy link
Member

kesara commented Nov 4, 2023

GHA is running here: https://github.com/ietf-tools/relaton-data-ieee/actions/runs/6754213519/job/18361482188
There are errors but GHA is not complaining yet.

@kesara
Copy link
Member

kesara commented Nov 4, 2023

GHA is running here: https://github.com/ietf-tools/relaton-data-ieee/actions/runs/6754213519/job/18361482188 There are errors but GHA is not complaining yet.

And success. Can someone confirm the errors in Run bundle exec ruby crawler.rb *** are expected?

@stefanomunarini
Copy link
Contributor Author

@CAMOBAP @andrew2net can you have a look at the comment above? Is this expected or is there anything that needs to be fixed?

@CAMOBAP
Copy link

CAMOBAP commented Nov 4, 2023

@stefanomunarini crawler.rb looks right to me (probaby we can remove duration calculation logic out of it because fetch should already do this)

@andrew2net to me it looks like we have some issue in relaton-ieee (it looks like `country' is missing) could you please have a look?

File: ieee-rawbib/updates.20230831/IEEEUpdates_IEEEstd/week27b/3185.zip
undefined method `text' for nil:NilClass
/opt/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/relaton-ieee-1.14.8/lib/relaton_ieee/data_parser.rb:160:in `block (2 levels) in parse_contributor'
/opt/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/nokogiri-1.14.5-x86_64-linux/lib/nokogiri/xml/node_set.rb:235:in `block in each'
/opt/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/nokogiri-1.14.5-x86_64-linux/lib/nokogiri/xml/node_set.rb:234:in `upto'
/opt/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/nokogiri-1.14.5-x86_64-linux/lib/nokogiri/xml/node_set.rb:234:in `each'
/opt/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/relaton-ieee-1.14.8/lib/relaton_ieee/data_parser.rb:155:in `each_with_object'
/opt/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/relaton-ieee-1.14.8/lib/relaton_ieee/data_parser.rb:155:in `block in parse_contributor'
/opt/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/nokogiri-1.14.5-x86_64-linux/lib/nokogiri/xml/node_set.rb:235:in `block in each'
/opt/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/nokogiri-1.14.5-x86_64-linux/lib/nokogiri/xml/node_set.rb:234:in `upto'
/opt/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/nokogiri-1.14.5-x86_64-linux/lib/nokogiri/xml/node_set.rb:234:in `each'
/opt/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/relaton-ieee-1.14.8/lib/relaton_ieee/data_parser.rb:153:in `map'
/opt/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/relaton-ieee-1.14.8/lib/relaton_ieee/data_parser.rb:153:in `parse_contributor'
/opt/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/relaton-ieee-1.14.8/lib/relaton_ieee/data_parser.rb:43:in `block in parse'
/opt/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/relaton-ieee-1.14.8/lib/relaton_ieee/data_parser.rb:43:in `each'
/opt/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/relaton-ieee-1.14.8/lib/relaton_ieee/data_parser.rb:43:in `parse'
/opt/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/relaton-ieee-1.14.8/lib/relaton_ieee/data_fetcher.rb:102:in `fetch_doc'
/opt/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/relaton-ieee-1.14.8/lib/relaton_ieee/data_fetcher.rb:62:in `block in fetch'
/opt/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/relaton-ieee-1.14.8/lib/relaton_ieee/data_fetcher.rb:57:in `each'
/opt/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/relaton-ieee-1.14.8/lib/relaton_ieee/data_fetcher.rb:57:in `fetch'
/opt/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/relaton-ieee-1.14.8/lib/relaton_ieee/data_fetcher.rb:47:in `fetch'
crawler.rb:13:in `<main>'

@andrew2net
Copy link
Contributor

File: ieee-rawbib/updates.20230831/IEEEUpdates_IEEEstd/week27b/3185.zip

@CAMOBAP The last files from 2023 caused the error because address country not found in them. But I'm unable to get and explore these files from the ietf-tools/ieee-rawbib repo. Is there any way to get one of the files using GHA?

@CAMOBAP
Copy link

CAMOBAP commented Nov 5, 2023

@andrew2net I think the right way will be to ask @stefanomunarini @kesara to provide required files directly (because they are in the private repo)

@stefanomunarini @kesara is it possible?

@kesara
Copy link
Member

kesara commented Nov 6, 2023

@andrew2net I think the right way will be to ask @stefanomunarini @kesara to provide required files directly (because they are in the private repo)

@stefanomunarini @kesara is it possible?

I just emailed the requested file.

@ronaldtse
Copy link
Contributor

@kesara could you please help provide us access to the ieee-rawbib repository so that we can address the newly introduced parsing errors from the new files? Thanks!

@andrew2net
Copy link
Contributor

File: ieee-rawbib/updates.20230831/IEEEUpdates_IEEEstd/week27b/3185.zip

The file contains an address without country:

    ...
    <address>
        <city>New York</city>
    </address>
    ...

@kesara @ronaldtse For the NY city we can add the USA country. All the other addresses also have US cities. It seems we can use USA as a default country, right?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants