Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable AuditSource during dotnet list package command #6206

Open
wants to merge 8 commits into
base: dev
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@ internal class ListPackageArgs
public ILogger Logger { get; }
public string Path { get; }
public List<PackageSource> PackageSources { get; }
public List<PackageSource> AuditSources { get; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pattern that most NuGet.CommandLine and NuGet.CommandLine.XPlat commands use is that these *Args classes hold the inputs, and business logic goes in the *Runner classes.

Since no CLI argument/option was added to dotnet list package for audit sources (should there be?), I feel like this AuditSources property doesn't belong here.

I had a quick look, and I see that ListCommand reads nuget.config and gets the sources at the same time as it's parsing the other CLI arguments and options into the ListPakcageArgs class. But I think this is not good software design. Following the Single Responsibility Principal, I think easier to understand code if ListPackageArgs is only parses CLI commands and arguments and puts them in the ListPackageArgs class, and then ListCommandRunner has all business logic regarding whether CLI sources are added to nuget.config sources, or used as a replacement instead, as well all the other work that list package does.

public List<string> Frameworks { get; }
public ReportType ReportType { get; }
public IReportRenderer Renderer { get; }
Expand All @@ -41,6 +42,7 @@ internal class ListPackageArgs
/// <param name="prerelease"> Bool for --include-prerelease present </param>
/// <param name="highestPatch"> Bool for --highest-patch present </param>
/// <param name="highestMinor"> Bool for --highest-minor present </param>
/// <param name="auditSources"> A list of sources for performing vulnerability auditing</param>
/// <param name="logger"></param>
/// <param name="cancellationToken"></param>
public ListPackageArgs(
Expand All @@ -53,6 +55,7 @@ public ListPackageArgs(
bool prerelease,
bool highestPatch,
bool highestMinor,
List<PackageSource> auditSources,
ILogger logger,
CancellationToken cancellationToken)
{
Expand All @@ -65,11 +68,27 @@ public ListPackageArgs(
Prerelease = prerelease;
HighestPatch = highestPatch;
HighestMinor = highestMinor;
AuditSources = auditSources;
Logger = logger ?? throw new ArgumentNullException(nameof(logger));
CancellationToken = cancellationToken;
ArgumentText = GetReportParameters();
}

public ListPackageArgs(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comment isn't as relevant given my other suggestion that AuditSources shouldnt be defined in this class (unless a CLI option is added), but this class isn't public, so it's not part of the public API, and therefore we can modify the existing constructor rather than adding a new overload.

If this was already understood, and this overload was added instead to avoid needing to modify existing tests that call the existing constructor, a trade-off needs to be made between ease of finishing your current work (this PR) vs ongoing tech debt in having more methods, risks of bugs because tests are using the production class differently to how production code is being used.

There will be times where adding an overload is the better choice, and other times when making the effort to update tests is the better choice to reduce risk. I haven't looked in detail at how many tests create instances of this class, but my gut feeling is that I would expect it not to be too difficult to update all tests, so we probably don't need a constructor overload.

string path,
List<PackageSource> packageSources,
List<string> frameworks,
ReportType reportType,
IReportRenderer renderer,
bool includeTransitive,
bool prerelease,
bool highestPatch,
bool highestMinor,
ILogger logger,
CancellationToken cancellationToken) : this(path, packageSources, frameworks, reportType, renderer, includeTransitive, prerelease, highestPatch, highestMinor, new List<PackageSource>(), logger, cancellationToken)
{
}

private string GetReportParameters()
{
StringBuilder sb = new StringBuilder();
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -129,7 +129,7 @@ public static void Register(
isVulnerable: vulnerableReport.HasValue());

IReportRenderer reportRenderer = GetOutputType(outputFormat.Value(), outputVersionOption: outputVersion.Value());

var provider = new PackageSourceProvider(settings);
var packageRefArgs = new ListPackageArgs(
path.Value,
packageSources,
Expand All @@ -140,6 +140,7 @@ public static void Register(
prerelease.HasValue(),
highestPatch.HasValue(),
highestMinor.HasValue(),
provider.LoadAuditSources().ToList(),
logger,
CancellationToken.None);

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,9 @@
using NuGet.ProjectModel;
using NuGet.Protocol;
using NuGet.Protocol.Core.Types;
using NuGet.Protocol.Model;
using NuGet.Protocol.Providers;
using NuGet.Protocol.Resources;
using NuGet.Versioning;

namespace NuGet.CommandLine.XPlat
Expand Down Expand Up @@ -176,6 +179,18 @@ private static void WarnForHttpSources(ListPackageArgs listPackageArgs, ListPack
}
}

foreach (PackageSource packageSource in listPackageArgs.AuditSources)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would have liked the PR description to have addressed my question, but from https://learn.microsoft.com/en-us/nuget/concepts/auditing-packages#audit-sources

If auditSources is not defined or is cleared without adding any sources, then packageSources will be used and warning NU1905 is suppressed.

So, for restore, when nuget.config defines one or more audit sources, then vulnerability information is not retrieved from the package sources, only from the audit sources.

This PR does not appear to follow this behaviour for dotnet list package.

In theory, if a package source lists a package as vulnerable, but the audit source does not, this will cause dotnet list package --vulnerable to have different results to dotnet restore.

If it's intentional that dotnet list package --vulnerable will work differently, I'm curious why

{
if (packageSource.IsHttp && !packageSource.IsHttps && !packageSource.AllowInsecureConnections)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no else block, so if any sources defined in nuget.config are not used because it's using insecure http, it will be silently ignored. I believe all or other commands report an error, so this should as well.

{
if (httpPackageSources == null)
{
httpPackageSources = new();
}
httpPackageSources.Add(packageSource);
}
}

if (httpPackageSources != null && httpPackageSources.Count != 0)
{
if (httpPackageSources.Count == 1)
Expand Down Expand Up @@ -276,6 +291,32 @@ private async Task<Dictionary<string, List<IPackageSearchMetadata>>> GetPackageM
List<FrameworkPackages> targetFrameworks,
ListPackageArgs listPackageArgs)
{
var vulnerabilityInfo = new List<IReadOnlyDictionary<string, IReadOnlyList<PackageVulnerabilityInfo>>>();

if (listPackageArgs.ReportType == ReportType.Vulnerable && listPackageArgs.AuditSources.Count > 0)
{
foreach (var source in listPackageArgs.AuditSources)
{
var repository = Repository.Factory.GetCoreV3(source);
var vulnerabilityProvider = new VulnerabilityInfoResourceV3Provider();
var result = await vulnerabilityProvider.TryCreate(repository, listPackageArgs.CancellationToken);
Comment on lines +298 to +302
Copy link
Member

@zivkan zivkan Dec 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Package sources can provide the VulnerabilityInfoResource as well as audit sources, and that's actually how audit during restore works. An issue was created for this before auditSources was implemented, so maybe some people would consider it scope creep because the issue to add support for auditSources didn't explicitly call it out, but it's very related to the work currently being done, so I feel like it's a good time to implement it at the same time:

Implementing this will give dotnet list package --vulnerable a significant performance boost when there are many packages to check, because NuGet can download the database once and then do a whole lot of lookups, instead of making an HTTP request to every source for every package id.

var vulnerabilityResource = result.Item2 as VulnerabilityInfoResourceV3;

if (vulnerabilityResource != null)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing error handling when result.Item1 contains exceptions, or when vulnerabilityResource is null (the source doesn't support the vulnerability info resource)

{
var vulnerabilityInfoResult = await vulnerabilityResource.GetVulnerabilityInfoAsync(
new SourceCacheContext(),
listPackageArgs.Logger,
listPackageArgs.CancellationToken);

if (vulnerabilityInfoResult?.KnownVulnerabilities != null)
{
vulnerabilityInfo.AddRange(vulnerabilityInfoResult.KnownVulnerabilities);
}
}
}
}

List<string> allPackages = GetAllPackageIdentifiers(targetFrameworks, listPackageArgs.IncludeTransitive);
var packageMetadataById = new Dictionary<string, List<IPackageSearchMetadata>>(capacity: allPackages.Count);

Expand All @@ -284,7 +325,7 @@ private async Task<Dictionary<string, List<IPackageSearchMetadata>>> GetPackageM
: (Environment.ProcessorCount / listPackageArgs.PackageSources.Count) + 1;

await ThrottledForEachAsync(allPackages,
async (packageId, cancellationToken) => await GetPackageVersionsAsync(packageId, listPackageArgs, cancellationToken),
async (packageId, cancellationToken) => await GetPackageVersionsAsync(packageId, listPackageArgs, vulnerabilityInfo, cancellationToken),
packageMetadata => packageMetadataById[packageMetadata.Key] = packageMetadata.Value,
maxParallel,
listPackageArgs.CancellationToken);
Expand Down Expand Up @@ -514,70 +555,41 @@ private UpdateLevel GetUpdateLevel(NuGetVersion resolvedVersion, NuGetVersion la
/// </summary>
/// <param name="package">The package to get the latest version for</param>
/// <param name="listPackageArgs">List args for the token and source provider></param>
/// <param name="vulnerabilityInfo">Vulnerability information database</param>
/// <param name="cancellationToken"></param>
/// <returns>A list of tasks for all latest versions for packages from all sources</returns>
private async Task<KeyValuePair<string, List<IPackageSearchMetadata>>> GetPackageVersionsAsync(
string package,
ListPackageArgs listPackageArgs,
IReadOnlyList<IReadOnlyDictionary<string, IReadOnlyList<PackageVulnerabilityInfo>>> vulnerabilityInfo,
CancellationToken cancellationToken)
{
var results = new List<IPackageSearchMetadata>();
var sources = listPackageArgs.PackageSources;

await ThrottledForEachAsync(sources,
async (source, innerCancellationToken) => await GetLatestVersionPerSourceAsync(source, listPackageArgs, package, innerCancellationToken),
async (source, innerCancellationToken) => await GetLatestVersionPerSourceAsync(source, listPackageArgs, package, vulnerabilityInfo, innerCancellationToken),
continuation: results.AddRange,
maxParallel: listPackageArgs.PackageSources.Count,
cancellationToken);

return new KeyValuePair<string, List<IPackageSearchMetadata>>(package, results);
}

/// <summary>
/// Prepares the calls to sources for current versions and updates
/// the list of tasks with the requests
/// </summary>
/// <param name="packageId">The package ID to get the current version metadata for</param>
/// <param name="requestedVersion">The version of the requested package</param>
/// <param name="listPackageArgs">List args for the token and source provider></param>
/// <param name="packagesVersionsDict">A reference to the unique packages in the project
/// to be able to handle different sources having different latest versions</param>
/// <returns>A list of tasks for all current versions for packages from all sources</returns>
private IList<Task> PrepareCurrentVersionsRequests(
string packageId,
NuGetVersion requestedVersion,
ListPackageArgs listPackageArgs,
Dictionary<string, IList<IPackageSearchMetadata>> packagesVersionsDict)
{
var requests = new List<Task>();
var sources = listPackageArgs.PackageSources;

foreach (var packageSource in sources)
{
requests.Add(
GetPackageMetadataFromSourceAsync(
packageSource,
listPackageArgs,
packageId,
requestedVersion,
packagesVersionsDict));
}

return requests;
}

/// <summary>
/// Gets the highest version of a package from a specific source
/// </summary>
/// <param name="packageSource">The source to look for packages at</param>
/// <param name="listPackageArgs">The list args for the cancellation token</param>
/// <param name="package">Package to look for updates for</param>
/// <param name="vulnerabilityInfo">Vulnerability information database</param>
/// <param name="cancellationToken"></param>
/// <returns>An updated package with the highest version at a single source</returns>
private async Task<IEnumerable<IPackageSearchMetadata>> GetLatestVersionPerSourceAsync(
PackageSource packageSource,
ListPackageArgs listPackageArgs,
string package,
IReadOnlyList<IReadOnlyDictionary<string, IReadOnlyList<PackageVulnerabilityInfo>>> vulnerabilityInfo,
CancellationToken cancellationToken)
{
SourceRepository sourceRepository = _sourceRepositoryCache[packageSource];
Expand All @@ -593,50 +605,57 @@ await packageMetadataResource.GetMetadataAsync(
log: listPackageArgs.Logger,
token: listPackageArgs.CancellationToken);

if (listPackageArgs.ReportType == ReportType.Vulnerable && listPackageArgs.AuditSources.Count > 0)
{
if (vulnerabilityInfo?.Count > 0)
{
var updatedPackages = new List<IPackageSearchMetadata>();

foreach (var pkg in packages)
{
var pkgVulnerabilities = PackageVulnerabilities(vulnerabilityInfo, pkg.Identity.Id, pkg.Identity.Version.ToNormalizedString());

if (pkgVulnerabilities?.Any() == true)
{
// Update the package metadata with vulnerability information
var updatedPackage = PackageSearchMetadataBuilder.FromMetadata(pkg)
.WithVulnerabilities(pkgVulnerabilities)
.Build();

updatedPackages.Add(updatedPackage);
}
else
{
updatedPackages.Add(pkg);
}
}

packages = updatedPackages;
}
}

return packages;
}

/// <summary>
/// Gets the requested version of a package from a specific source
/// </summary>
/// <param name="packageSource">The source to look for packages at</param>
/// <param name="listPackageArgs">The list args for the cancellation token</param>
/// <param name="packageId">Package to look for</param>
/// <param name="requestedVersion">Requested package version</param>
/// <param name="packagesVersionsDict">A reference to the unique packages in the project
/// to be able to handle different sources having different latest versions</param>
/// <returns>An updated package with the resolved version metadata from a single source</returns>
private async Task GetPackageMetadataFromSourceAsync(
PackageSource packageSource,
ListPackageArgs listPackageArgs,
string packageId,
NuGetVersion requestedVersion,
Dictionary<string, IList<IPackageSearchMetadata>> packagesVersionsDict)
private static IEnumerable<PackageVulnerabilityMetadata> PackageVulnerabilities(IEnumerable<IReadOnlyDictionary<string, IReadOnlyList<PackageVulnerabilityInfo>>> vulnerabilities, string id, string version)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In .NET the general naming convention is that methods should have names that include verbs that explains what it does. Properties, fields, and variables should have names that are nouns.

{
SourceRepository sourceRepository = _sourceRepositoryCache[packageSource];
var packageMetadataResource = await sourceRepository
.GetResourceAsync<PackageMetadataResource>(listPackageArgs.CancellationToken);

using var sourceCacheContext = new SourceCacheContext();
var packages = await packageMetadataResource.GetMetadataAsync(
packageId,
includePrerelease: true,
includeUnlisted: true, // Include unlisted because deprecated packages may be unlisted.
sourceCacheContext: sourceCacheContext,
log: listPackageArgs.Logger,
token: listPackageArgs.CancellationToken);

var resolvedVersionsForPackage = packagesVersionsDict
.Where(p => p.Key.Equals(packageId, StringComparison.OrdinalIgnoreCase))
.Single()
.Value;

var resolvedPackageVersionMetadata = packages.SingleOrDefault(p => p.Identity.Version.Equals(requestedVersion));
if (resolvedPackageVersionMetadata != null)
var packageVulnerabilities = new List<PackageVulnerabilityMetadata>().AsEnumerable();
if (vulnerabilities == null)
{
// Package version metadata found on source
resolvedVersionsForPackage.Add(resolvedPackageVersionMetadata);
return packageVulnerabilities;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no need to allocate an empty list on the heap when returning an empty IEnumerable. While dotnet list package isn't a perf critical command (the HTTP requests to get the vulnerability data will be much, much slower than unnecessary GC pauses), by practising these easy perf wins where it doesn't matter, you're more likely to do the right thing when it does.

}
foreach (var vulnFile in vulnerabilities)
{
vulnFile.TryGetValue(id, out IReadOnlyList<PackageVulnerabilityInfo> vulnPackages);
if (vulnPackages != null)
{
// The package has vulnerabilities
packageVulnerabilities = vulnPackages.Where(
package => package.Versions.Satisfies(new NuGetVersion(version))).Select(v => JsonExtensions.FromJson<PackageVulnerabilityMetadata>($"{{ \"AdvisoryUrl\": \"{v.Url}\", \"Severity\": \"{(int)v.Severity}\" }}"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the topic of easy perf wins that should be practised even in non-perf critical code, the new NuGetVersion is happening inside a LINQ method, which means it will happen for every item in the vulnPackages collection, even though the value will be the same each time. Parsing the version once outside the LINQ expression will reduce the allocation of NugetVeresion, not to mention all the allocations that happen to parse the string, when vulnPackages has more than one item.

return packageVulnerabilities;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Returning LINQ expressions without immediately materializing it (or put another way, deferred execution) has uses in certain perf scenarios, but it makes debugging harder and I think makes reading and understand the code more difficult.

It doesn't make sense for the caller to exit the enumeration early, so that perf advantage shouldn't be taken advantage of. The other advantage that might exist is avoiding materialization of a large collection to store the result, and just allocate the hopefully smaller state machine. If this was the goal, I think it should be commented, to make it clear that a tradeoff against debugging is being made. However, I wonder if such an optimization is needed for dotnet list package, which is a short lived process, and won't be running at the same time as other code within the same process.

}
}
return packageVulnerabilities;
}

/// <summary>
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -73,6 +73,14 @@ private static void WriteSources(TextWriter consoleOut, ListPackageArgs listPack
consoleOut.WriteLine();
consoleOut.WriteLine(Strings.ListPkg_SourcesUsedDescription);
PrintSources(consoleOut, listPackageArgs.PackageSources);

if (listPackageArgs.AuditSources.Count > 0)
{
consoleOut.WriteLine();
consoleOut.WriteLine(Strings.ListPkg_AuditSourcesUsedDescription);
PrintSources(consoleOut, listPackageArgs.AuditSources);
Copy link
Member

@zivkan zivkan Dec 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The json renderer (for when --format json is used) should be updated as well.

But see my other comment about whether audit sources should be used in addition to, or instead of, package sources. Because if audit sources prevent package sources from being used, maybe we can re-use the existing sources message/property instead of introducing a new one.

}

consoleOut.WriteLine();
}
}
Expand Down
9 changes: 9 additions & 0 deletions src/NuGet.Core/NuGet.CommandLine.XPlat/Strings.Designer.cs

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 3 additions & 0 deletions src/NuGet.Core/NuGet.CommandLine.XPlat/Strings.resx
Original file line number Diff line number Diff line change
Expand Up @@ -978,4 +978,7 @@ Non-HTTPS access will be removed in a future version. Consider migrating to 'HTT
<value>Project '{0}' does not have MSBuild property ProjectAssetsFile defined. This may indicate that this project does not support NuGet PackageReference, or that project customization has prevented the .NET SDK setting default values.</value>
<comment>{0} - Path to the project with the error</comment>
</data>
<data name="ListPkg_AuditSourcesUsedDescription" xml:space="preserve">
<value>The following vulnerability audit sources were used:</value>
</data>
</root>
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@ public class PackageSearchMetadataBuilder
private readonly IPackageSearchMetadata _metadata;
private AsyncLazy<IEnumerable<VersionInfo>> _lazyVersionsFactory;
private AsyncLazy<PackageDeprecationMetadata> _lazyDeprecationFactory;
private IEnumerable<PackageVulnerabilityMetadata> _packageVulnerabilities;

public class ClonedPackageSearchMetadata : IPackageSearchMetadata
{
Expand Down Expand Up @@ -111,7 +112,7 @@ public IPackageSearchMetadata Build()
PrefixReserved = _metadata.PrefixReserved,
LicenseMetadata = _metadata.LicenseMetadata,
LazyDeprecationFactory = _lazyDeprecationFactory ?? AsyncLazy.New(_metadata.GetDeprecationMetadataAsync),
Vulnerabilities = _metadata.Vulnerabilities,
Vulnerabilities = _packageVulnerabilities ?? _metadata.Vulnerabilities,
#pragma warning disable CS0618 // Type or member is obsolete
PackageReader =
(_metadata as LocalPackageSearchMetadata)?.PackageReader ??
Expand Down Expand Up @@ -139,6 +140,12 @@ public static PackageSearchMetadataBuilder FromIdentity(PackageIdentity identity
};
return FromMetadata(metadata);
}

public PackageSearchMetadataBuilder WithVulnerabilities(IEnumerable<PackageVulnerabilityMetadata> pkgVulnerabilities)
{
_packageVulnerabilities = pkgVulnerabilities;
return this;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretending to be a public API design reviewer, I don't like this as a public API.

If I didn't know a lot about the NuGet protocol, I'd assume that the package search and package details resources don't return vulnerability data, and that it's mandatory to inject vulnerability info this way. And in that case, I'd wonder why the IPackageSearchMetadata type even has a vulnerabilities property. If this was the design, I'd expect the caller to create its own type to associate vulnerabilities with an IPackageSearchMetadata instance.

However, I know that nuget.org does return vulnerabilities from its package metadata (registration) resource, Also, nuget.org doesn't (or didn't for many years) return vulnerability info from the search endpoint. However, PM UI handled this without modifying this package builder class, which is what I think list package should do as well.

I'm not sure what the equivalent of a leaky abstraction is when it's in reverse, when an API forces a public API on a dependency to implement something specific to the higher level code. But I feel like it's something list package should do internally, without affecting public APIs.

}
}

/// <summary>
Expand Down
Loading