Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't use error for validation #8

Open
jml opened this issue Oct 21, 2016 · 3 comments
Open

Don't use error for validation #8

jml opened this issue Oct 21, 2016 · 3 comments

Comments

@jml
Copy link
Collaborator

jml commented Oct 21, 2016

I started doing this as part of #7 but the scope creep & compatibility changes got a bit too much so I thought I'd file a ticket to get an opinion before I wasted too much time.

Currently, validation of metric names doesn't actually happen (although there is a checkInfo function) and validation of label names happens at vector construction, but causes the program to error out, which is not ideal in library code, because it robs the caller of the ability to make a decision about how to handle errors.

A good way of implementing this is to have dedicated newtypes that encode validity, e.g.

-- | A metric name guaranteed to be valid.
newtype MetricName = MetricName { formatMetricName :: Text } deriving (Read, Show, Eq, Ord, Monoid)

-- | Get a metric
toMetricName :: (MonadError MetricNameError m) => Text -> m MetricName
toMetricName name
    | Text.null name = throwError Empty
    | "__" `Text.isPrefixOf` name = throwError (StartsWithUnderscores name)
    | not (validStart (Text.head name)) = throwError (InvalidCharacters name)
    | Just _ <- Text.find (not . validRest) (Text.tail name) = throwError (InvalidCharacters name)
    | otherwise = return (MetricName name)
    where
        validStart c =  ('a' <= c && c <= 'z')
                     || ('A' <= c && c <= 'Z')
                     || c == '_'
                     || c == ':'

        validRest c =  validStart c || ('0' <= c && c <= '9')

-- | Possible errors when parsing metric names.
data MetricNameError
    = InvalidCharacters Text
    | StartsWithUnderscores Text
    | Empty
    deriving (Eq, Show)

-- | User-friendly error messages for invalid metric names.
formatMetricNameError :: MetricNameError -> Text
formatMetricNameError (InvalidCharacters name) = "The metric '" <> name <> "' contains invalid characters."
formatMetricNameError (StartsWithUnderscores name) = "The metric '" <> name <> "' cannot start with '__'."
formatMetricNameError Empty = "Empty metric names are not allowed."

Here, the MetricName type and the formatMetricName and toMetricName functions would be exported but not the MetricName constructor. This guarantees that any value with the MetricName type has a valid metric name.

(Apologies if this is obvious to you, but since roundtrips are moderately expensive I thought it better to over-communicate.)

Anyway, this works for metric names but not so well for label names because the existing type class more or less assumes that the names of labels have the same type as the values of labels.

I have a bunch of half-formed ideas about what to do about that, but pretty much everything would rather seriously break backwards compatibility.

Thoughts?

@fimad
Copy link
Owner

fimad commented Dec 22, 2016

Sorry for the terrible response latency :(

I think this makes sense. I think I'd still like an escape hatch for creating metric names that would work when defining top level metrics:

{-# NOINLINE c #-}
let c = unsafeRegisterIO $ do
    name <- toMetricName "my_counter"
    counter (Info name "An example metric")

or something like

{-# NOINLINE c #-}
let c = unsafeRegisterIO $ toInfo "my_counter" "An example metric" >>= counter 

Would it be possible to make IO an instance of MonadError MetricNameError IO? I think that this would just work out of the box with your proposal then.

If we can come up with a safer API that doesn't make the existing use cases much more verbose I'd be fine breaking backwards compatibility with a corresponding version bump.

@jml
Copy link
Collaborator Author

jml commented Dec 22, 2016

Sorry for the terrible response latency :(

No worries at all.

I'd still like an escape hatch for creating metric names that would work when defining top level metrics

Yeah, I can see why that'd be nice. I've made a similar change to another library I work on and we're currently writing unsafeMakeName "literal_name" a fair amount.

I'll think about it & get back to you.

Would it be possible to make IO an instance of MonadError MetricNameError IO?

Possible, yes. Probably inadvisable. Because we don't own MonadError or IO, that would be an orphan instance, which would trigger an orphan instance warning, which is there for a reason.

If we can come up with a safer API that doesn't make the existing use cases much more verbose I'd be fine breaking backwards compatibility with a corresponding version bump.

Great, thanks. I'll go away & think about it (although I've switched focus to other projects so it will be some time before I come back to this).

I also filed https://ghc.haskell.org/trac/ghc/ticket/13028#ticket about this, which describes what I think would be the best solution--to be able to write IsString instances that do validation.

@ocharles
Copy link
Collaborator

ocharles commented Dec 4, 2017

I'd still like an escape hatch for creating metric names that would work when defining top level metrics

These days I'm thinking this escape hatch is really a quasiquoter so you can do the checking at compile time rather than runtime.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants