Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enumerate Function #306

Open
mpscholten opened this issue Nov 16, 2024 · 34 comments
Open

enumerate Function #306

mpscholten opened this issue Nov 16, 2024 · 34 comments
Labels
vote-in-progress The CLC vote is in progress

Comments

@mpscholten
Copy link

It would be nice to have a function to retrieve all values from an enum:

data Color = Yellow | Red | Blue deriving (Enum)

let allColors = allEnumValues @Color
-- allColors is [Yellow, Red, Blue]

In IHP we have a helper function for this called allEnumValues:

allEnumValues :: forall enumType. Enum enumType => [enumType]
allEnumValues = enumFrom (toEnum 0)

E.g. here is an example I just randomly picked from one of our apps: Rendering all timezones:

    action TimeZonesAction = do
        let timezones = allEnumValues @TZ.TZLabel
        let
            renderTimeZone :: TZ.TZLabel -> Text
            renderTimeZone = cs . TZ.toTZName
        renderJson (renderTimeZone <$> timezones)

Here is a GitHub Search showing tons of adhoc implementations of this function across many haskell projects: https://github.com/search?q=allEnumValues+language%3AHaskell+&type=code

There is already a PR for this proposal via https://gitlab.haskell.org/ghc/ghc/-/merge_requests/11321 (there it's called enumerate instead of allEnumValues)

@Kleidukos
Copy link
Member

Relude has universe :: (Bounded a, Enum a) => [a] as well, although of course a compact array would be better than a linked list, if you need to materialise everything.

@mixphix
Copy link
Collaborator

mixphix commented Nov 16, 2024

allEnumValues @Int produces no negative numbers. What's wrong with [minBound .. maxBound]?

@mpscholten
Copy link
Author

Yes good point, should be [minBound .. maxBound] 👍

@mpscholten
Copy link
Author

kmett noted on X that [minBound..maxBound] doesn't necessarily return all values. So I propose we stick with the name suggested by ben in the original PR and call the function enumerate

@mixphix
Copy link
Collaborator

mixphix commented Nov 17, 2024

Such a function already exists in the extra package defined precisely like so. We recently added compareLength to base (#257) which caused compatibility issues with this library. Since we'd essentially be redefining it ourselves, perhaps it's best to contact the maintainers of that library, or simply to use it in your own project? It has a very small dependency footprint and several other useful functions.

@mpscholten
Copy link
Author

One of the issues with compareLength is that the type signature in both implementation was different. So In our case the function is exactly the same as in extra. So an updated extra package that would reexport our function would not be a breaking change.

@mpscholten mpscholten changed the title allEnumValues Function enumerate Function Nov 17, 2024
@hasufell
Copy link
Member

Such a function already exists in the extra package defined precisely like so. We recently added compareLength to base (#257) which caused compatibility issues with this library. Since we'd essentially be redefining it ourselves, perhaps it's best to contact the maintainers of that library, or simply to use it in your own project? It has a very small dependency footprint and several other useful functions.

I find it a bit of an odd argument.

On the one hand, the CLC constantly requests proposers to demonstrate that new API is actually useful, which is done by showing its use in an actual package.

But then we complain that it's already defined in some other package and will cause compatibility issues?

@tomjaguarpaw
Copy link
Member

kmett noted on X that [minBound..maxBound] doesn't necessarily return all values

For which instances? I can think of theoretical types where this would happen, say a type representing a bounded range of floating point numbers. If the bounded range was between -1 and 1 inclusive then [minBound..maxBound] would be [-1, 0, 1], missing every non-integer value. But I can't think of any type that actually exists in practice that hits that problem. Does anyone know one?

If there is one then I think the Enum+Bounded constraint is not enough. There is a package that exists to address this: universe. Perhaps if people want this kind of thing we should integrate that into base instead. It seems like a more principled way of achieving the same end.

@mixphix
Copy link
Collaborator

mixphix commented Nov 29, 2024

I find it a bit of an odd argument.

It's no argument for or against, just a gentle suggestion if the proposal author wishes to proceed, that the libraries exposing an identical or similar function be notified of this change, extra in particular due to other recent compatibility issues.

@Bodigrim
Copy link
Collaborator

See #110 (comment) for Neil's opinion on name clashes with extra.

@tomjaguarpaw
Copy link
Member

To save people a click:

I'm happy to add the necessary bounds if this gets added (both in a new release and in a revision) - I really wouldn't worry about extra to count as a point against this proposal.

@Bodigrim
Copy link
Collaborator

I'm supportive of the change. Bikeshedding:

  • Shall it go into Data.Enum and/or Data.Bounded and/or anything else?
  • Shall it be called enumerate or universe? There is prior art for both. I'm personally more inclined to call it enumerate.

Does anyone know one?

I cannot come up with a practical example.

@endgame
Copy link

endgame commented Dec 14, 2024

I wasn't previously aware that [minBound..maxBound] would not enumerate all values, and that seems like a mark against it. If it is only truly pathological types that can do this, then maybe it's not so bad?

universe has a better design in that it at least explicitly says that every element is returned. I think finitary:Data.Finitary has an even better design than universe, by surfacing more useful information about the type's cardinality, etc., at the type level. However, the last time I tried it I struggled to get it to build with some of the other libraries I was using, and I missed the ability to talk about enumerable-but-not-finite types. An "enumerable" superclass for Finitary could return a typelevel Maybe for its cardinality and provide conversion functions from Integer, and then Finitary could add a constraint (Cardinality a ~ Just n) to the class.

Unfortunately, all the machinery around Finite n types exists in external packages, so I don't see that coming into base any time soon, unless we really want to take the time to work through this (and possibly adjust Enum to make it a good superclass for Finitary).

Bringing things into base pretty much freezes them forever (cf. Python's "dead batteries" problem). Therefore, IMHO, we should be bringing the best known forms of an idea in rather than ones with known deficiencies. I have no problem with libraries like relude shipping "basic but incomplete" versions of certain ideas — it's part of their declared design sensibilities and I find relude pleasant to program with as a result (see also their basic lenses) — but I think it's worth taking the time to ensure base is as solid as possible.

My first preference then is to encourage [minBound..maxBound] to become a more widely-understood Haskell idiom (mumble mumble Fairbairn Threshold, perhaps?) and defer putting something in base until we can nail down a really good design for it. I think Finitary in particular is a much more versatile tool than universe, and shouldn't need that much more effort to define lawful instances (Use Enum/Bounded for the default, and a generics-based instance via the Generically newtype).


But if people want to push on with bringing a function like this into base, I believe:

  • [minBound .. maxBound] is better than enumFrom (toEnum 0)
  • enumerate is a more accurate name than universe since the latter sounds like it encompasses everything (which apparently might not be true).
  • enumerate sounds like an imperative operation, however, like a control operation you might find in a collections library
  • -XRequiredTypeArguments might make this read better: enumerate :: forall a -> (Enum a, Bounded a) => [a] would have use sites read like enumerate MyEnumType
  • If people aren't hot on -XRequiredTypeArguments, enumValues might be a better name; allEnumValues being a bit of a lie in the same way as universe
  • Data.Enum sounds like a good home for it, since it's primarily "about" enumeration and Bounded is merely an implementation detail.

@phadej
Copy link

phadej commented Dec 14, 2024

@endgame Enum and Bounded are IMO just bad design (to mention one issue: Bounded has "laws" if the type has also Enum instance, but these classes are not formally related, neither is superclass of other). Enum doesn't even have instances for Either or (a, b) (because you need to know if you have reached last element, succ is partial function: succ '\1114111' throws). And that prevents us from having

data Color = Yellow | Red | Blue deriving (Enum)
data MoreColors = OldColor Color | Green | Magenta | Cyan deriving (Enum)

And as the groundwork is shaky, building on top is not very smart.

The Enum is just good enough, that pushing for something else to be built-in in GHC / base is not worth it. universe is good but it's not "the best" for every use case. I think it's simple Haskell98 definition is nice; Data.Finitary design forcing to have explicit (subset of) Natural` ~ a isomorphism is another design point (and sometimes you want that isomorphism). E.g. ℚ ~ ℕ: it's quite easy to enumerate rational numbers (using a clever technique https://www2.math.upenn.edu/~wilf/website/recounting.pdf), but constructing an efficient direct isomorphism between rationals and natural numbers is tricky. So there's at least two different approaches to how to enumerate values (and there's more, https://hackage.haskell.org/package/size-based, SmallCheck, ...)

@Bodigrim
Copy link
Collaborator

One of “better” Enums is hidden in random: https://hackage.haskell.org/package/random-1.2.1.2/docs/System-Random.html#t:Finite

Another design (which extends towards infinite, but countable sets) is https://hackage.haskell.org/package/cantor-pairing-0.2.0.2/docs/Cantor.html

All in all, I think getting a better Enum in base is a huge topic, which is unlikely to be settled in short term.

@bcardiff
Copy link

When declaring a simple enumeration of values it's common to make it just an instance of Enum.

Once you learn about/remember Bounded you can go an add it to use [minBound .. maxBound].

It's a short enough but is unfortunate (as in less joy) that a simple enumeration of values requires that.

Having a enumerate function it would be a good thing IMO because likely it's the name someone would look to enumerate all values. I don't think universe shares that.

Having enumerate require Bounded would push user to add that typeclass also in the type. But at least it's guided from the need to use the enumerate function. Rather than knowing how to use Bounded in combination of Enum.

@phadej
Copy link

phadej commented Dec 14, 2024

@bcardiff you highlight the problem with Enum: the class alone is not enough to enumerate. You must somehow know a value of that type (to use enumFrom :: a -> [a]) or know how it maps to Int, so you call enumFrom (toEnum x) for some x :: Int (as e.g. toEnum 0 might throw!)

Arguably, enumerate :: [a] could be a member of Enum a directly.

That won't solve all the problems with Enum (you still couldn't define instances for Either, e.g.), but it will make it better.

@endgame
Copy link

endgame commented Dec 14, 2024

All in all, I think getting a better Enum in base is a huge topic, which is unlikely to be settled in short term.

Agreed. That seems sufficient reason to me to encourage people to write [minBound .. maxBound] more often while it gets figured out.

An alternative colour for the bikeshed: if everyone else thinks this should go ahead, a name like enumerated :: (Enum a, Bounded a) => [a] or a required type argument like enumerate :: forall a -> (Enum a, Bounded a) => [a] might make code read more nicely than with enumerate :: (Enum a, Bounded a) => [a]. enumerate scans strangely to me, in situations where GHC is supplying the type argument.

@brandonchinn178
Copy link

brandonchinn178 commented Dec 14, 2024

That won't solve all the problems with Enum (you still couldn't define instances for Either, e.g.), but it will make it better

@phadej Is that true? Couldn't you do

instance (Enum a, Enum b) => Enum (Either a b) where
    enumerate = enumerate @a ++ enumerate @b
    toEnum i = enumerate !! i
    fromEnum x = fromJust $ elemIndex x enumerate

@phadej
Copy link

phadej commented Dec 15, 2024

@brandonchinn178 that's terrible instance. Technically correct; enumerate works, but every other member is terrible. And given how [x .. y] desugars, it will be very bad.

@Bodigrim
Copy link
Collaborator

An alternative colour for the bikeshed: if everyone else thinks this should go ahead, a name like enumerated :: (Enum a, Bounded a) => [a] or a required type argument like enumerate :: forall a -> (Enum a, Bounded a) => [a] might make code read more nicely than with enumerate :: (Enum a, Bounded a) => [a].

I like the one with required type argument, but it will be out of line with minBound / maxBound, which do not require one.

As a non-native speaker I don't really have an opinion on enumerate vs. enumerated. Prior art seems to be in favor of the first one.

@mpscholten how would you like to proceed? Is there enough feedback for you to converge on a final proposal?

@Bodigrim Bodigrim added the awaits-proposal Discussion has not resulted in a formal proposal yet label Dec 16, 2024
@mpscholten
Copy link
Author

An alternative colour for the bikeshed: if everyone else thinks this should go ahead, a name like enumerated :: (Enum a, Bounded a) => [a] or a required type argument like enumerate :: forall a -> (Enum a, Bounded a) => [a] might make code read more nicely than with enumerate :: (Enum a, Bounded a) => [a].

I'd love to use a required type argument here as well, but as mentioned earlier already it feels inconsistent with the existing base enum functions.

--

Thanks everyone for the input!

@Bodigrim yes I think we have enough feedback for the final proposal.

Final Proposal

Add a function enumerate to Data.Enum:

-- | Returns a list of all values of an enum type
--
-- 'enumerate' is often used to list all values of a custom enum data structure, such as a custom Color enum below:
--
-- @
-- data Color = Yellow | Red | Blue
--     deriving (Enum, Bounded, Show)
--
-- allColors :: [Color]
-- allColors = enumerate
-- -- Result: [Yellow, Red, Blue]
-- @
--
-- Note that you need to derive the 'Bounded' type class as well, only 'Enum' is not enough.
-- 'Enum' allows for sequential enumeration, while 'Bounded' provides the 'minBound' and 'maxBound' values.
--
-- 'enumerate' is commonly used together with the TypeApplications syntax. Here is an example of using 'enumerate' to retrieve all values of the 'Ordering' type:
--
-- >> enumerate @Ordering
-- [LT, EQ, GT]
--
-- The '@' symbol here is provided by the TypeApplications language extension.
--
enumerate :: (Enum a, Bounded a) => [a]
enumerate = [minBound .. maxBound]

@ysangkok
Copy link
Member

Is there a rationale for including maxBound? I have been using [minBound..].

@tbidne
Copy link

tbidne commented Dec 18, 2024

Is there a rationale for including maxBound? I have been using [minBound..].

The "correct" behavior here is probably debatable, but non-termination could be a nasty surprise for seemingly reasonable instances.

enumerateMax :: (Enum a, Bounded a) => [a]
enumerateMax = [minBound .. maxBound]

enumerateNoMax :: (Enum a, Bounded a) => [a]
enumerateNoMax = [minBound ..]

newtype Mod5 = MkMod5 Int
  deriving stock (Eq, Show)

instance Bounded Mod5 where
  minBound = MkMod5 0
  maxBound = MkMod5 4

instance Enum Mod5 where
  toEnum x = MkMod5 (x `mod` 5)
  fromEnum (MkMod5 x) = x
λ. enumerateMax :: [Mod5]
[MkMod5 0,MkMod5 1,MkMod5 2,MkMod5 3,MkMod5 4]

λ. enumerateNoMax :: [Mod5]
[MkMod5 0,MkMod5 1,MkMod5 2,MkMod5 3,MkMod5 4,MkMod5 0,MkMod5 1, ...]

@Bodigrim Bodigrim added awaits-MR No GHC MR (https://gitlab.haskell.org/ghc/ghc/-/merge_requests) has been raised yet and removed awaits-proposal Discussion has not resulted in a formal proposal yet labels Dec 22, 2024
@Bodigrim
Copy link
Collaborator

Dear CLC members, any non-binding opinions on #306 (comment)? I'm in favor.

@tomjaguarpaw @mixphix @hasufell @velveteer @angerman @parsonsmatt


@mpscholten the next formal step is to prepare a GHC MR. Please follow the steps at the end of "how" section at https://github.com/haskell/core-libraries-committee/blob/main/PROPOSALS.md#the-how.

@mpscholten
Copy link
Author

Thanks! Created the GHC MR at https://gitlab.haskell.org/ghc/ghc/-/merge_requests/13755

@Bodigrim Bodigrim removed the awaits-MR No GHC MR (https://gitlab.haskell.org/ghc/ghc/-/merge_requests) has been raised yet label Dec 29, 2024
@mpscholten
Copy link
Author

@Bodigrim what are the next steps to move this forward after opening the GHC MR?

@Bodigrim
Copy link
Collaborator

Sorry @mpscholten, this fall through the cracks.

Dear CLC members, let's vote on the proposal to add Data.Enum.enumerate = [minBound .. maxBound] :: (Enum a, Bounded a) => [a], as implemented in https://gitlab.haskell.org/ghc/ghc/-/merge_requests/13755/diffs.

Such function is well-attested in prior art in packages such as extra and relude, but also universe-base, quickcheck-monoid-subclasses, board-games.

This is a purely additive change. Data.Enum is a very new module, just added in GHC 9.10 / base-4.20, so there are unlikely to be any name clashes in existing programs.

@tomjaguarpaw @hasufell @mixphix @velveteer @parsonsmatt @angerman


+1 from me. I have defined such function in private project preludes more than once.

@Bodigrim Bodigrim added the vote-in-progress The CLC vote is in progress label Jan 22, 2025
@hasufell
Copy link
Member

+1

@tomjaguarpaw
Copy link
Member

-1


Bounded and Enum are especially unprincipled type classes, so I imagine enumerate in terms of them to have all sorts of weird edge cases in practice. Writing out [minBound .. maxBound] in full is hardly more difficult and much more explicit and clear, so I continue to recommend that approach.

@mixphix
Copy link
Collaborator

mixphix commented Jan 22, 2025

+1

@angerman
Copy link

While I can see @tomjaguarpaw's reasoning, I don't see the reductive argument to be strong enough to sway me into -1. As such weak, but +1.

@Bodigrim
Copy link
Collaborator

@parsonsmatt @velveteer just a gentle reminder to vote.

@parsonsmatt
Copy link

+1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
vote-in-progress The CLC vote is in progress
Projects
None yet
Development

No branches or pull requests