Extend benchmarks with computation of scores for NR and DB metrics #270

Merged: 56 commits into master, Apr 19, 2022

Conversation

@snk4tr (Contributor) commented Sep 5, 2021

Closes #265

Proposed Changes

  • Add computation of correlation scores for no-reference metrics on all datasets
  • Add computation of correlation scores for distribution-based (DB) metrics on all datasets
  • Add description of how DB metrics are computed + reference to the paper on IQA for MRI

Signed-off-by: Sergey Kastryulin <[email protected]>
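For illustration, here is a minimal sketch of how a correlation score for a no-reference metric could be computed on a single dataset. It is not this PR's actual benchmark code: `iter_dataset` is a hypothetical loader yielding (distorted image, MOS) pairs, while `piq.brisque` and the SciPy rank correlations are real APIs.

```python
# Minimal sketch: rank-correlate NR metric scores with subjective (MOS) scores.
# `iter_dataset` is a hypothetical helper yielding (distorted_image, mos) pairs,
# where distorted_image is a (1, C, H, W) float tensor in [0, 1].
import torch
import piq
from scipy import stats


def nr_metric_correlations(iter_dataset):
    scores, mos_values = [], []
    for distorted, mos in iter_dataset():
        with torch.no_grad():
            scores.append(piq.brisque(distorted, data_range=1.).item())
        mos_values.append(float(mos))
    srcc, _ = stats.spearmanr(scores, mos_values)
    krcc, _ = stats.kendalltau(scores, mos_values)
    return srcc, krcc
```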
@snk4tr requested a review from @zakajd on September 5, 2021 17:52
@snk4tr self-assigned this on Sep 5, 2021
@snk4tr added the "enhancement" label (Making some part of the codebase better without introduction of new features) on Sep 5, 2021
codecov bot commented Sep 5, 2021

Codecov Report

Merging #270 (797502a) into master (de38340) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master     #270   +/-   ##
=======================================
  Coverage   94.04%   94.04%           
=======================================
  Files          34       34           
  Lines        2485     2485           
=======================================
  Hits         2337     2337           
  Misses        148      148           
Flag        Coverage Δ
unittests   94.04% <ø> (ø)

Flags with carried forward coverage won't be shown.

@snk4tr (Contributor, Author) commented Sep 5, 2021

Ready for review.

sonarcloud bot commented Sep 6, 2021

SonarCloud Quality Gate passed: 0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 0 Code Smells, no coverage information, 0.0% duplication.

@zakajd (Collaborator) left a comment

I like the implementation, thanks @snk4tr.
Let's add the benchmark results to the README table, along with an explanation of how the values for distribution-based metrics are computed.
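For illustration, here is a sketch of one possible scheme (my assumptions, not necessarily what this PR implements): compute the distribution-based metric between the reference images' features and the features of each distortion group, then rank-correlate those distances with the groups' mean MOS. The `groups` mapping below is hypothetical; `piq.FID` and `scipy.stats.spearmanr` are real APIs.

```python
# Sketch: correlation score for a distribution-based metric such as FID.
# `groups` is a hypothetical dict: distortion name -> (feature tensor, mean MOS),
# where feature tensors have shape (N, D).
import piq
from scipy import stats


def db_metric_correlation(reference_feats, groups):
    fid = piq.FID()
    distances, mean_mos = [], []
    for distorted_feats, group_mos in groups.values():
        distances.append(fid(distorted_feats, reference_feats).item())
        mean_mos.append(group_mos)
    srcc, _ = stats.spearmanr(distances, mean_mos)
    return srcc
```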

@snk4tr (Contributor, Author) commented Oct 12, 2021

@zakajd, about including PIPAL results in the table: the PIPAL paper reports the following table:
[image: results table from the PIPAL paper]

The problem with this table is that there is no single column that evaluates the metrics on all types of artefacts. Hence, we need to either

  • Select only one type of artefact and use it to compare the reference numbers with our results, or
  • Compute the reference score for all types of artefacts, average it (see the toy sketch below), and provide it in the table without any reference, or
  • Not add PIPAL to the comparison table at all.

At this point I prefer the first option, but then the question arises of which distortion type to choose. Personally, I think the so-called "Traditional distortions" (Gaussian noise, blur, etc.) are the most logical choice, but I would like to know what you think.
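A toy illustration of the averaging in the second option, under the assumption that per-distortion-type metric scores and MOS values are already available (`scores_by_type` and `mos_by_type` are hypothetical names):

```python
# Toy sketch: compute SRCC separately per distortion type, then report the mean.
from scipy import stats


def averaged_srcc(scores_by_type, mos_by_type):
    per_type = [
        stats.spearmanr(scores_by_type[t], mos_by_type[t])[0]
        for t in scores_by_type
    ]
    return sum(per_type) / len(per_type)
```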

@snk4tr (Contributor, Author) commented Oct 12, 2021

I also think that we should not add values for distribution-based metrics to the table, even though the benchmarking tool lets us compute them. The main purpose of the table in the README is to show how close our implementations of the metrics are to the original ones, and at this point there are no reference values for distribution-based metrics, so there is nothing to compare against.

@zakajd (Collaborator) commented Oct 12, 2021

@snk4tr The PIPAL paper is a bit confusing. They actually do evaluate metrics on all distortions (see Fig. 4), but don't publish the numbers. In their next work, on the SWDN metric (https://arxiv.org/pdf/2011.15002.pdf), they report some results in Table 5.

So we can either:

  1. Ask the authors to provide the data, or
  2. Use the available data from the SWDN paper.

@snk4tr (Contributor, Author) commented Oct 13, 2021

@zakajd I sent the authors a request for the data. I see that the values from the SWDN metric paper are already there, so I will add our values for all metrics in this PR. Hopefully the authors will kindly provide the data soon.

@snk4tr requested a review from @zakajd on March 26, 2022 08:49
@denproc (Collaborator) left a comment

Great work. Just some minor changes.

Line 200: I would change it to "Feature Based".

Review comments on README.rst and tests/results_benchmark.py (all resolved).
Signed-off-by: Sergey Kastryulin <[email protected]>
@snk4tr requested a review from @denproc on April 17, 2022 14:33
@snk4tr (Contributor, Author) commented Apr 17, 2022

@denproc @zakajd ready for review.

@denproc (Collaborator) left a comment

LGTM, just a minor change to make the documentation consistent.

Review comment on tests/results_benchmark.py (resolved).
@denproc previously approved these changes Apr 18, 2022
@snk4tr merged commit fad3bc3 into master on Apr 19, 2022
@snk4tr deleted the feature/extend_bench branch on April 19, 2022 08:20
Labels: enhancement
Linked issue: Extend results benchmark (#265)