RFC: Add Precision::AtLeast and Precision::AtMost for more Statistics… precision #13293

Draft
wants to merge 2 commits into
base: main
53 changes: 52 additions & 1 deletion datafusion/common/src/stats.rs
@@ -31,6 +31,14 @@ pub enum Precision<T: Debug + Clone + PartialEq + Eq + PartialOrd> {
Exact(T),
/// The value is not known exactly, but is likely close to this value
Inexact(T),
/// The value is known to be at most this value (inclusive).
///
/// The actual value may be smaller, but it is never greater.
AtMost(T),
Review comment (Contributor, Author): This is the main change -- adding these variants

/// The value is known to be at least this value (inclusive).
///
/// The actual value may be greater, but it is never smaller.
AtLeast(T),
/// Nothing is known about the value
#[default]
Absent,
@@ -41,7 +49,40 @@ impl<T: Debug + Clone + PartialEq + Eq + PartialOrd> Precision<T> {
/// Otherwise, it returns `None`.
pub fn get_value(&self) -> Option<&T> {
match self {
- Precision::Exact(value) | Precision::Inexact(value) => Some(value),
+ Precision::Exact(value)
+ | Precision::Inexact(value)
+ | Precision::AtLeast(value)
+ | Precision::AtMost(value) => Some(value),
Precision::Absent => None,
}
}

/// Returns the minimum possible value.
///
/// # Return Value
/// - `Some(value)`: actual value is at least `value` and may be greater.
/// It will never be less than the returned value.
///
/// - `None`: means the minimum value is unknown.
pub fn min_value(&self) -> Option<&T> {
match self {
Precision::Exact(value) | Precision::AtLeast(value) => Some(value),
Precision::Inexact(_) | Precision::AtMost(_) => None,
Precision::Absent => None,
}
}

/// Returns the maximum possible value.
///
/// # Return Value
/// - `Some(value)`: actual value is at most `value` and may be less.
/// It will never be greater than the returned value.
///
/// - `None`: means the maximum value is unknown.
pub fn max_value(&self) -> Option<&T> {
match self {
Precision::Exact(value) | Precision::AtMost(value) => Some(value),
Precision::Inexact(_) | Precision::AtLeast(_) => None,
Precision::Absent => None,
}
}
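The bound semantics above can be sketched in isolation. The enum below is a simplified stand-in that mirrors the variants in this diff (the trait bounds and the `Default` derive are dropped); it is an illustration, not the DataFusion type itself.

```rust
// Standalone sketch: mirrors the variants in this diff, but is NOT the
// DataFusion `Precision` type (bounds and derives are simplified).
#[derive(Debug, Clone, PartialEq, Eq)]
enum Precision<T> {
    Exact(T),
    Inexact(T),
    AtMost(T),
    AtLeast(T),
    Absent,
}

impl<T> Precision<T> {
    /// Guaranteed lower bound: only `Exact` and `AtLeast` provide one.
    fn min_value(&self) -> Option<&T> {
        match self {
            Precision::Exact(v) | Precision::AtLeast(v) => Some(v),
            Precision::Inexact(_) | Precision::AtMost(_) | Precision::Absent => None,
        }
    }

    /// Guaranteed upper bound: only `Exact` and `AtMost` provide one.
    fn max_value(&self) -> Option<&T> {
        match self {
            Precision::Exact(v) | Precision::AtMost(v) => Some(v),
            Precision::Inexact(_) | Precision::AtLeast(_) | Precision::Absent => None,
        }
    }
}

fn main() {
    // An upper bound says nothing about the lower bound, and vice versa.
    let at_most: Precision<usize> = Precision::AtMost(100);
    assert_eq!(at_most.max_value(), Some(&100));
    assert_eq!(at_most.min_value(), None);

    let at_least: Precision<usize> = Precision::AtLeast(5);
    assert_eq!(at_least.min_value(), Some(&5));
    assert_eq!(at_least.max_value(), None);

    // An exact value bounds both sides; an estimate bounds neither.
    assert_eq!(Precision::Exact(7).min_value(), Some(&7));
    assert_eq!(Precision::Exact(7).max_value(), Some(&7));
    assert_eq!(Precision::Inexact(7).min_value(), None);
    assert_eq!(Precision::<usize>::Absent.max_value(), None);
}
```

Note that `Inexact` yields `None` from both accessors: an estimate is not a guarantee, so it cannot serve as a bound.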
@@ -462,6 +503,16 @@ impl ColumnStatistics {
}
}

/// Returns the minimum value this column can have, if known.
pub fn min_value(&self) -> Option<&ScalarValue> {
self.min_value.get_value()
}

/// Returns the maximum value this column can have, if known.
pub fn max_value(&self) -> Option<&ScalarValue> {
self.max_value.get_value()
}
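As a rough illustration of why a guaranteed bound is useful to a caller, the sketch below uses a column's upper bound to decide whether a `col > literal` predicate can match any row. `ColumnStats` and `never_greater_than` are hypothetical names, `i64` stands in for `ScalarValue`, and the `Precision` enum is a simplified copy rather than the DataFusion type.

```rust
// Simplified stand-in for the enum in this diff (not the DataFusion type).
#[derive(Debug, Clone, PartialEq)]
enum Precision<T> {
    Exact(T),
    Inexact(T),
    AtMost(T),
    AtLeast(T),
    Absent,
}

impl<T> Precision<T> {
    /// Guaranteed upper bound, if one is known.
    fn max_value(&self) -> Option<&T> {
        match self {
            Precision::Exact(v) | Precision::AtMost(v) => Some(v),
            Precision::Inexact(_) | Precision::AtLeast(_) | Precision::Absent => None,
        }
    }
}

/// Hypothetical stand-in for `ColumnStatistics`, using `i64` instead of `ScalarValue`.
struct ColumnStats {
    max_value: Precision<i64>,
}

/// Hypothetical helper: returns true only when `col > literal` provably
/// matches no rows, i.e. when a guaranteed upper bound is <= `literal`.
fn never_greater_than(stats: &ColumnStats, literal: i64) -> bool {
    match stats.max_value.max_value() {
        Some(max) => *max <= literal,
        // Without a guaranteed bound, nothing can be ruled out.
        None => false,
    }
}

fn main() {
    let bounded = ColumnStats { max_value: Precision::AtMost(10) };
    let exact = ColumnStats { max_value: Precision::Exact(10) };
    let estimated = ColumnStats { max_value: Precision::Inexact(10) };
    let lower_only = ColumnStats { max_value: Precision::AtLeast(10) };
    let unknown = ColumnStats { max_value: Precision::Absent };

    assert!(never_greater_than(&bounded, 10)); // all values <= 10
    assert!(!never_greater_than(&bounded, 9)); // a value of 10 is possible
    assert!(never_greater_than(&exact, 10));
    assert!(!never_greater_than(&estimated, 100)); // estimates prove nothing
    assert!(!never_greater_than(&lower_only, 100));
    assert!(!never_greater_than(&unknown, 100));
}
```

The helper deliberately relies on the bound-aware accessor rather than on any stored value, so that an `Inexact` estimate is never treated as proof.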

/// If the exactness of a [`ColumnStatistics`] instance is lost, this
/// function relaxes the exactness of all information by converting it to
/// [`Precision::Inexact`].