[Builtins] Add support for pattern matching builtins #5486

Closed

effectfully
Contributor

@effectfully effectfully commented Aug 21, 2023

The main change is replacing

data BuiltinRuntime val
    = BuiltinCostedResult ExBudgetStream ~(BuiltinResult val)
    | <...>

with

data BuiltinRuntime val
    = BuiltinCostedResult ExBudgetStream ~(BuiltinResult (HeadSpine val))
    | <...>

where HeadSpine is a fancy way of saying NonEmpty:

-- | A non-empty spine. Isomorphic to 'NonEmpty', except is strict and is defined as a single
-- recursive data type.
data Spine a
    = SpineLast a
    | SpineCons a (Spine a)

-- | The head-spine form of an iterated application. Provides O(1) access to the head of the
-- application. Isomorphic to @NonEmpty@, except is strict and the no-spine case is made a separate
-- constructor for performance reasons (it only takes a single pattern match to access the head when
-- there's no spine this way, while otherwise we'd also need to match on the spine to ensure that
-- it's empty -- and the no-spine case is by far the most common one, hence we want to optimize it).
data HeadSpine a
    = HeadOnly a
    | HeadSpine a (Spine a)

(we define a separate type, because we want strictness, and you don't see any bangs, because it's in a module with StrictData enabled).
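To make the "isomorphic to NonEmpty" remark concrete, here is a minimal standalone sketch. The two data types are copied from the description; the conversion functions (`spineToList`, `headSpineToNonEmpty`, `nonEmptyToHeadSpine`) and the derived instances are hypothetical helpers added for illustration and are not claimed to be part of the PR.

```haskell
{-# LANGUAGE StrictData #-}

import Data.List.NonEmpty (NonEmpty (..))

-- Same shape as in the description; all fields are strict because of StrictData.
data Spine a
    = SpineLast a
    | SpineCons a (Spine a)
    deriving (Show, Eq)

data HeadSpine a
    = HeadOnly a
    | HeadSpine a (Spine a)
    deriving (Show, Eq)

-- | Hypothetical helper: list the elements of a spine in order.
spineToList :: Spine a -> [a]
spineToList (SpineLast x)    = [x]
spineToList (SpineCons x xs) = x : spineToList xs

-- | Hypothetical helper: one direction of the isomorphism.
headSpineToNonEmpty :: HeadSpine a -> NonEmpty a
headSpineToNonEmpty (HeadOnly f)     = f :| []
headSpineToNonEmpty (HeadSpine f xs) = f :| spineToList xs

-- | Hypothetical helper: the other direction of the isomorphism.
nonEmptyToHeadSpine :: NonEmpty a -> HeadSpine a
nonEmptyToHeadSpine (f :| rest) = case rest of
    []     -> HeadOnly f
    x : xs -> HeadSpine f (toSpine x xs)
  where
    toSpine y []       = SpineLast y
    toSpine y (z : zs) = SpineCons y (toSpine z zs)
```

Note how `HeadOnly` is reached with a single pattern match, which is the performance argument made in the Haddock comment above.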

The idea is that a builtin application can return a function applied to a bunch of arguments, which is exactly what we need to be able to express caseList

caseList xs0 f z = case xs0 of
   []   -> z
   x:xs -> f x xs

as a builtin:

-- | Take a function and a list of arguments and apply the former to the latter.
headSpine :: Opaque val asToB -> [val] -> Opaque (HeadSpine val) b
headSpine (Opaque f) = Opaque . \case
    []      -> HeadOnly f
    x0 : xs ->
        -- It's critical to use 'foldr' here, so that deforestation kicks in.
        -- See Note [Definition of foldl'] in "GHC.List" and related Notes around for an explanation
        -- of the trick.
        HeadSpine f $ foldr (\x2 r x1 -> SpineCons x1 $ r x2) SpineLast xs x0

instance uni ~ DefaultUni => ToBuiltinMeaning uni DefaultFun where
    <...>
    toBuiltinMeaning _ver CaseList =
        let caseListDenotation
                :: Opaque val (LastArg a b)
                -> Opaque val (a -> [a] -> b)
                -> SomeConstant uni [a]
                -> BuiltinResult (Opaque (HeadSpine val) b)
            caseListDenotation z f (SomeConstant (Some (ValueOf uniListA xs0))) = do
                case uniListA of
                    DefaultUniList uniA -> pure $ case xs0 of
                        []     -> headSpine z []                                             -- [1]
                        x : xs -> headSpine f [fromValueOf uniA x, fromValueOf uniListA xs]  -- [2]
                    _ ->
                        -- See Note [Structural vs operational errors within builtins].
                        throwing _StructuralUnliftingError "Expected a list but got something else"
            {-# INLINE caseListDenotation #-}
        in makeBuiltinMeaning
            caseListDenotation
            (runCostingFunThreeArguments . unimplementedCostingFun)

Being able to express [1] (representing z) and [2] (representing f x xs) is precisely what this PR enables.
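For readers who want to play with the idea outside of the builtins machinery, here is a simplified sketch under stated assumptions: `Val` is a made-up toy value type, `headSpine` drops the `Opaque` wrapper, and `caseListToy` stands in for `caseListDenotation` (same argument order as the denotation above: `z`, `f`, then the list). None of this is the actual PR code.

```haskell
{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE StrictData #-}

data Spine a = SpineLast a | SpineCons a (Spine a)
    deriving (Show, Eq)

data HeadSpine a = HeadOnly a | HeadSpine a (Spine a)
    deriving (Show, Eq)

-- | Toy stand-in for machine values (hypothetical, for illustration only).
data Val
    = VInt Integer
    | VList [Integer]
    | VFun String  -- ^ An opaque function, identified only by a name.
    deriving (Show, Eq)

-- | Same 'foldr' trick as in the description, minus the 'Opaque' wrapper.
headSpine :: a -> [a] -> HeadSpine a
headSpine f = \case
    []      -> HeadOnly f
    x0 : xs -> HeadSpine f $ foldr (\x2 r x1 -> SpineCons x1 $ r x2) SpineLast xs x0

-- | Toy version of the denotation: returns either @z@ or @f x xs@ unevaluated.
caseListToy :: Val -> Val -> [Integer] -> HeadSpine Val
caseListToy z f = \case
    []     -> headSpine z []                  -- corresponds to [1]
    x : xs -> headSpine f [VInt x, VList xs]  -- corresponds to [2]
```

The builtin itself never applies `f`; it only describes the application and leaves the evaluation to the machine.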

Adding support for the new functionality to the CEK machine is trivial. All we need is a way to push a Spine of arguments onto the context:

    -- | Push arguments onto the stack. The first argument will be the most recent entry.
    pushArgs
        :: Spine (CekValue uni fun ann)
        -> Context uni fun ann
        -> Context uni fun ann
    pushArgs args ctx = foldr FrameAwaitFunValue ctx args

and a HeadSpine version of returnCek:

    -- | Evaluate a 'HeadSpine' by pushing the arguments (if any) onto the stack and proceeding with
    -- the returning phase of the CEK machine.
    returnCekHeadSpine
        :: Context uni fun ann
        -> HeadSpine (CekValue uni fun ann)
        -> CekM uni fun s (Term NamedDeBruijn uni fun ())
    returnCekHeadSpine ctx (HeadOnly  x)    = returnCek ctx x
    returnCekHeadSpine ctx (HeadSpine f xs) = returnCek (pushArgs xs ctx) f

Then replacing

                BuiltinSuccess x ->
                    returnCek ctx x

with

                BuiltinSuccess fXs ->
                    returnCekHeadSpine ctx fXs

(and similarly for BuiltinSuccessWithLogs) will do the trick.
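The following toy model may help to see why pushing the spine onto the context yields `f x1 x2 ...` in the right order. It is only a sketch: `Val`, `Context`, `returnToy`, `returnToyHeadSpine` and `minus` are simplified, hypothetical stand-ins for the CEK machine's types and functions, and the `Foldable` instance for `Spine` is written by hand here.

```haskell
{-# LANGUAGE StrictData #-}

data Spine a = SpineLast a | SpineCons a (Spine a)

data HeadSpine a = HeadOnly a | HeadSpine a (Spine a)

instance Foldable Spine where
    foldr f z (SpineLast x)    = f x z
    foldr f z (SpineCons x xs) = f x (foldr f z xs)

-- | Toy values: integers and (curried) functions.
data Val = VInt Integer | VFun (Val -> Val)

-- | Toy context: a stack of arguments waiting for a function value.
data Context
    = NoFrame
    | FrameAwaitFunValue Val Context

-- | As in the description: the first argument ends up on top of the stack.
pushArgs :: Spine Val -> Context -> Context
pushArgs args ctx = foldr FrameAwaitFunValue ctx args

-- | Toy returning phase: feed the returned value to the topmost frame, if any.
returnToy :: Context -> Val -> Either String Val
returnToy NoFrame v                             = Right v
returnToy (FrameAwaitFunValue arg ctx) (VFun f) = returnToy ctx (f arg)
returnToy (FrameAwaitFunValue _ _)     (VInt _) = Left "cannot apply a non-function"

returnToyHeadSpine :: Context -> HeadSpine Val -> Either String Val
returnToyHeadSpine ctx (HeadOnly x)     = returnToy ctx x
returnToyHeadSpine ctx (HeadSpine f xs) = returnToy (pushArgs xs ctx) f

-- | Demo function; subtraction makes the argument order observable.
minus :: Val
minus = VFun $ \a -> VFun $ \b -> case (a, b) of
    (VInt m, VInt n) -> VInt (m - n)
    _                -> error "minus: expected two integers"
```

With these definitions, returning `HeadSpine minus (SpineCons (VInt 10) (SpineLast (VInt 3)))` in an empty context computes `10 - 3`, not `3 - 10`.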

We used to define caseList in terms of IfThenElse, NullList and either HeadList or TailList depending on the result of NullList, i.e. three builtin calls in both the worst and the best case. Then we introduced ChooseList, which replaced both IfThenElse and NullList in caseList, thus bringing the total number of builtin calls down to 2 in all cases. This turned out to have a substantial impact on performance. This PR allows us to bring the total number of builtin calls per caseList invocation down to 1 -- the CaseList builtin itself.

@effectfully effectfully added Do not merge Builtins EXPERIMENT Experiments that we probably don't want to merge labels Aug 21, 2023
@effectfully effectfully force-pushed the effectfully/builtins/pattern-matching-builtins branch from 97f7e94 to dfdd042 Compare August 21, 2023 22:26
@mjaskelioff
Contributor

I'm a bit confused about what this would entail for the PLC language.
Would there be a new builtin function? Would case be a new language construct?

@effectfully
Contributor Author

> I'm a bit confused about what this would entail for the PLC language.
> Would there be a new builtin function? Would case be a new language construct?

No new language constructs. New built-in functions (caseList in the description of the PR, plus caseData in the PR), plus a few changes to the builtins machinery and the CEK machine (see the PR; it's a surprisingly small number of changes).

@mjaskelioff
Contributor

I saw the PR, but it's about code and I don't fully understand the consequences of the code for PLC. What would be the type of caseList?

(\(u : unit) -> z)
(\(u : unit) -> f (headList {a} x) (tailList {a} x))
(\(x : a) (xs' : list a) (u : unit) -> f x xs')
Contributor

If I understand correctly, we would need to allow higher-order builtins?

Contributor

That is, is caseList : all a b. list a -> (unit -> b) -> (a -> list a -> unit -> b) ?
Why is the unit in the last argument needed?

Contributor Author

> If I understand correctly, we would need to allow higher-order builtins?

Yes, but only in the metatheory; the implementation supports arbitrary Plutus types in type signatures of builtins (i.e. the metatheory does it better).

> is caseList : all a b. list a -> (unit -> b) -> (a -> list a -> unit -> b) ?

No, the type of caseList doesn't have any units, even though they're used inside. They probably shouldn't be used inside, as in the caseData definition discussion below. Note that this is the nonsensical stuff from the stdlib, so don't think too hard about it: the actual builtin (confusingly named the same) is defined elsewhere; here it's only tested.

Comment on lines 11 to 15
(\(i : integer) (ds : list data) (u : unit) -> fConstr i ds)
(\(es : list (pair data data)) (u : unit) -> fMap es)
(\(ds : list data) (u : unit) -> fList ds)
(\(i : integer) (u : unit) -> fI i)
(\(b : bytestring) (u : unit) -> fB b)
Contributor

I don't understand why the units are needed if computation is already under a function type.

Contributor Author

Right, I think I just mechanically updated the previous definition which was in terms of chooseData where units were needed without realizing that they're no longer needed. Thank you, will fix.

Contributor Author

@effectfully effectfully Nov 10, 2023

In fact, we don't need anything more complicated than

-- | Pattern matching over 'Data' inside PLC.
--
-- > \(d : data) -> /\(r :: *) -> caseData {r} d
caseData :: TermLike term TyName Name DefaultUni DefaultFun => term ()
caseData = runQuote $ do
    r <- freshTyName "r"
    d <- freshName "d"
    return
        . lamAbs () d dataTy
        . tyAbs () r (Type ())
        . apply () (tyInst () (builtin () CaseData) $ TyVar () r)
        $ var () d

    toBuiltinMeaning _ver CaseList =
        makeBuiltinMeaning
            caseListPlc
            (\_ _ _ _ -> ExBudgetLast mempty) -- TODO.
Contributor

I was wondering what happens to the cost...

Contributor Author

We always do a constant amount of work, so the cost should be constant and not very hard to measure. One part that may be tricky is distinguishing (costing-wise) between returning a function application from a builtin and evaluating that application in the CEK machine. But we can probably adapt Kenneth's current trick of using dedicated no-op builtins. Anyway, we should only discuss it if we think pattern matching builtins make sense, otherwise we'll just waste a bunch of time. I do believe costing shouldn't be an issue, but I might be wrong about that.

Contributor

That's good to know. I assumed so, but when I asked why there is no higher-order builtin (especially for cases like the ones here where higher-order is the most natural formulation) I was told that it was because of the costs, so I was wondering if there is any problem here.

Contributor Author

> That's good to know. I assumed so, but when I asked why there is no higher-order builtin (especially for cases like the ones here where higher-order is the most natural formulation) I was told that it was because of the costs

We can define higher-order builtins already and we can cost the definable ones properly. E.g. \(f : integer -> integer) -> f is a higher-order builtin, but it's definable and is clearly constant-cost, so there's nothing fancy about costing it.

I think the person giving you comments on higher-order builtins was likely thinking about applying functions and actually reducing them from within a builtin, which we do not support indeed, but

  1. we did support that in the past
  2. we'd never costed that functionality, but given that it was simply continuing evaluation before returning from the builtin using the same evaluator without any state reinitialization or discarding, I don't think there were any costing issues either, it just didn't matter if the CEK machine (or any other evaluator) execution was invoked directly or continued from within a builtin

> so I was wondering if there is any problem here.

Anyhow, take my words with a grain of salt, I might be failing to spot an issue with costing pattern matching builtins.

@effectfully
Contributor Author

> What would be the type of caseList?

It's in the PR, the plutus-core/plutus-core/test/TypeSynthesis/Golden/CaseList.plc.golden file (there's a golden file with a type signature for every builtin, including the proposed ones):

(all
  a
  (type)
  (all
    b
    (type)
    (fun [ (con list) a ] (fun b (fun (fun a (fun [ (con list) a ] b)) b)))
  )
)

or if you're not a Cylon:

all a b. list a -> b -> (a -> list a -> b) -> b

This is the type of the built-in function, i.e. what this PR proposed to implement support for.
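As a rough Haskell analogue of that signature (a sketch only: `caseListH` and `sumVia` are hypothetical names, and Haskell lists stand in for the built-in `list` type), the argument order "list, nil case, cons case" looks like this:

```haskell
-- | Haskell analogue of @all a b. list a -> b -> (a -> list a -> b) -> b@.
caseListH :: [a] -> b -> (a -> [a] -> b) -> b
caseListH xs0 z f = case xs0 of
    []     -> z
    x : xs -> f x xs

-- | Example client: a right-fold-style sum with one 'caseListH' call per element.
sumVia :: [Integer] -> Integer
sumVia xs = caseListH xs 0 (\y ys -> y + sumVia ys)
```

Neither branch needs a `unit` argument to delay evaluation: the nil case is an already evaluated value and the cons case is already a function.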

We also have a function that is confusingly named the same (the one in stdlib), but has all b and the subsequent list a swapped, because it's some extremely outdated legacy code that was introduced in order to make it easy to replace unwrap-based code with code that uses pattern matching. Basically, if you see StdLib in the name of a file in this PR, ignore it, it's not really useful for anything and it's not used other than for tests. I apologize for the confusion, I should've specified that stdlib stuff should be ignored, it was just for me to test that the proposed builtins type check and evaluate fine.

@mjaskelioff mjaskelioff self-requested a review September 11, 2023 11:10
Contributor

@mjaskelioff mjaskelioff left a comment

LGTM. The cases written like this are the way they should be.
We will need to decide what to do with the spec and metatheory regarding the types of builtins, but I think the benefits are worth it.

@effectfully
Contributor Author

> We will need to decide what to do with the spec and metatheory regarding the types of builtins

I think types aren't complicated? You just add -> to the type signatures of builtins and that's it. Or am I missing something?

Specifying the operational semantics of builtins does sound somewhat complicated. "So if f is an argument to a built-in function, then you can return f x, but not f (f x)" -- pretty stupid.

Which is why my original plan was to allow arbitrary trees of application (in the sense of Data.Tree.Tree), not just a head and a spine-as-a-list. That is harder to cost (I still think it's not a big deal and is much easier than, say, equalsData for the kind of builtins that we may want to represent that way), requires extensive discussion on why I believe this is useful (in particular, Michael doesn't trust me that I can define much faster unsafeFromBuiltinData this way and maybe he's right) and is generally more complicated, so I thought it would be worth starting with a simpler but still useful thing first, hence this PR.
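To illustrate the difference between the two representations, here is a small sketch. `AppTree` is a hypothetical rose tree written out by hand to mirror the shape of `Data.Tree.Tree`, and `toHeadSpine` is an illustrative helper; neither is proposed code. It shows that `f x` fits into a head and a spine while `f (f x)` does not.

```haskell
{-# LANGUAGE StrictData #-}

data Spine a = SpineLast a | SpineCons a (Spine a)
    deriving (Show, Eq)

data HeadSpine a = HeadOnly a | HeadSpine a (Spine a)
    deriving (Show, Eq)

-- | Hypothetical tree of applications: a head applied to argument subtrees.
data AppTree a = AppTree a [AppTree a]
    deriving (Show, Eq)

-- | Succeeds only when every argument is a leaf, i.e. when the tree is flat.
toHeadSpine :: AppTree a -> Maybe (HeadSpine a)
toHeadSpine (AppTree f args) = do
    leaves <- traverse leaf args
    pure $ case leaves of
        []     -> HeadOnly f
        x : xs -> HeadSpine f (toSpine x xs)
  where
    leaf (AppTree x []) = Just x
    leaf _              = Nothing

    toSpine y []       = SpineLast y
    toSpine y (z : zs) = SpineCons y (toSpine z zs)
```

Here `AppTree "f" [AppTree "x" []]` converts to a `HeadSpine`, whereas `AppTree "f" [AppTree "f" [AppTree "x" []]]` yields `Nothing`, matching the "`f x`, but not `f (f x)`" restriction mentioned above.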

@michaelpj
Contributor

> Which is why my original plan was to allow arbitrary trees of application (in the sense of Data.Tree.Tree), not just a head and a spine-as-a-list.

Is the "real" answer here that we can return any term that introduces no new variable binders? e.g. introducing force/delay/constr all also seem fine, because it's adding new binders that really throws us off (because then we'd need to shift any binders in the input terms depending on where we put them, etc.).

Of course, our arguments are values and not terms, so we can't actually construct a term, instead we have to construct an evaluation continuation or something, so that does make things more complicated. And maybe the most useful and straightforward evaluation continuation is just a spine of applications. Although even then... can I do this?

\h a -> h 1 a

or this?

\h a -> h (addInteger 1 2) a

@effectfully effectfully force-pushed the effectfully/builtins/pattern-matching-builtins branch from dc78710 to a551cb4 Compare November 13, 2023 12:22
…to effectfully/builtins/pattern-matching-builtins
@effectfully
Contributor Author

/benchmark plutus-benchmark:nofib

@effectfully
Contributor Author

/benchmark plutus-benchmark:lists

Contributor

Comparing benchmark results of 'plutus-benchmark:nofib' on '159e5bd8e' (base) and '26b546339' (PR)

Results table
Script 159e5bd 26b5463 Change
clausify/formula1 4.687 ms 4.846 ms +3.4%
clausify/formula2 6.178 ms 6.361 ms +3.0%
clausify/formula3 16.79 ms 17.27 ms +2.9%
clausify/formula4 35.17 ms 36.24 ms +3.0%
clausify/formula5 81.37 ms 83.75 ms +2.9%
knights/4x4 28.01 ms 28.78 ms +2.7%
knights/6x6 78.50 ms 80.69 ms +2.8%
knights/8x8 158.5 ms 162.7 ms +2.6%
primetest/05digits 14.66 ms 15.10 ms +3.0%
primetest/08digits 24.28 ms 25.22 ms +3.9%
primetest/10digits 29.90 ms 31.08 ms +3.9%
primetest/20digits 67.11 ms 69.09 ms +3.0%
primetest/30digits 105.1 ms 108.0 ms +2.8%
primetest/40digits 145.3 ms 150.4 ms +3.5%
primetest/50digits 155.9 ms 161.5 ms +3.6%
queens4x4/bt 6.925 ms 7.112 ms +2.7%
queens4x4/bm 8.811 ms 9.065 ms +2.9%
queens4x4/bjbt1 8.481 ms 8.738 ms +3.0%
queens4x4/bjbt2 7.864 ms 8.116 ms +3.2%
queens4x4/fc 18.42 ms 19.03 ms +3.3%
queens5x5/bt 92.60 ms 95.18 ms +2.8%
queens5x5/bm 98.97 ms 101.6 ms +2.7%
queens5x5/bjbt1 109.1 ms 112.0 ms +2.7%
queens5x5/bjbt2 105.2 ms 108.3 ms +2.9%
queens5x5/fc 235.0 ms 243.0 ms +3.4%

@effectfully
Contributor Author

+3% on average on the builtins-heavy nofib benchmarks. Not too bad I guess and maybe I could squeeze some performance out of it.
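For anyone who wants to double-check such numbers, here is a small sketch. It assumes the "Change" column is the relative difference `(PR - base) / base`, which agrees with the rows shown but is not confirmed by the bot output itself; `percentChange`, `mean` and `nofibSample` are hypothetical names, and the sample contains only the first three nofib rows.

```haskell
-- | Relative change of the PR timing with respect to the base timing, in percent.
percentChange :: Double -> Double -> Double
percentChange base pr = (pr - base) / base * 100

-- | Arithmetic mean of a non-empty list.
mean :: [Double] -> Double
mean xs = sum xs / fromIntegral (length xs)

-- | (script, base in ms, PR in ms), copied from the first three nofib rows.
nofibSample :: [(String, Double, Double)]
nofibSample =
    [ ("clausify/formula1", 4.687, 4.846)
    , ("clausify/formula2", 6.178, 6.361)
    , ("clausify/formula3", 16.79, 17.27)
    ]

-- | Mean percentage change over the sample rows.
sampleMeanChange :: Double
sampleMeanChange = mean [percentChange base pr | (_, base, pr) <- nofibSample]
```

Evaluating `sampleMeanChange` in GHCi gives a value close to 3, in line with the estimate above.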

Contributor

Comparing benchmark results of 'plutus-benchmark:lists' on '159e5bd8e' (base) and '26b546339' (PR)

Results table
Script 159e5bd 26b5463 Change
sort/ghcSort/10 47.86 μs 47.92 μs +0.1%
sort/ghcSort/20 110.5 μs 109.3 μs -1.1%
sort/ghcSort/30 170.0 μs 170.3 μs +0.2%
sort/ghcSort/40 249.9 μs 250.5 μs +0.2%
sort/ghcSort/50 311.3 μs 308.4 μs -0.9%
sort/ghcSort/60 388.2 μs 387.4 μs -0.2%
sort/ghcSort/70 505.8 μs 503.8 μs -0.4%
sort/ghcSort/80 571.8 μs 568.5 μs -0.6%
sort/ghcSort/90 633.0 μs 629.9 μs -0.5%
sort/ghcSort/100 712.1 μs 710.6 μs -0.2%
sort/ghcSort/110 783.1 μs 778.3 μs -0.6%
sort/ghcSort/120 885.6 μs 882.3 μs -0.4%
sort/ghcSort/130 1.086 ms 1.086 ms 0.0%
sort/ghcSort/140 1.152 ms 1.149 ms -0.3%
sort/ghcSort/150 1.232 ms 1.232 ms 0.0%
sort/ghcSort/160 1.311 ms 1.317 ms +0.5%
sort/ghcSort/170 1.392 ms 1.388 ms -0.3%
sort/ghcSort/180 1.461 ms 1.463 ms +0.1%
sort/ghcSort/190 1.551 ms 1.548 ms -0.2%
sort/ghcSort/200 1.661 ms 1.656 ms -0.3%
sort/ghcSort/210 1.730 ms 1.731 ms +0.1%
sort/ghcSort/220 1.813 ms 1.809 ms -0.2%
sort/ghcSort/230 1.945 ms 1.938 ms -0.4%
sort/ghcSort/240 2.052 ms 2.053 ms +0.0%
sort/ghcSort/250 2.151 ms 2.149 ms -0.1%
sort/ghcSort/260 2.477 ms 2.470 ms -0.3%
sort/ghcSort/270 2.555 ms 2.538 ms -0.7%
sort/ghcSort/280 2.621 ms 2.616 ms -0.2%
sort/ghcSort/290 2.713 ms 2.713 ms 0.0%
sort/ghcSort/300 2.807 ms 2.804 ms -0.1%
sort/ghcSort/310 2.909 ms 2.896 ms -0.4%
sort/ghcSort/320 2.983 ms 2.996 ms +0.4%
sort/ghcSort/330 3.087 ms 3.075 ms -0.4%
sort/ghcSort/340 3.172 ms 3.165 ms -0.2%
sort/ghcSort/350 3.244 ms 3.244 ms 0.0%
sort/ghcSort/360 3.336 ms 3.330 ms -0.2%
sort/ghcSort/370 3.437 ms 3.425 ms -0.3%
sort/ghcSort/380 3.549 ms 3.536 ms -0.4%
sort/ghcSort/390 3.686 ms 3.706 ms +0.5%
sort/ghcSort/400 3.786 ms 3.802 ms +0.4%
sort/ghcSort/410 3.845 ms 3.842 ms -0.1%
sort/ghcSort/420 3.972 ms 3.982 ms +0.3%
sort/ghcSort/430 4.038 ms 4.039 ms +0.0%
sort/ghcSort/440 4.153 ms 4.152 ms -0.0%
sort/ghcSort/450 4.381 ms 4.378 ms -0.1%
sort/ghcSort/460 4.434 ms 4.440 ms +0.1%
sort/ghcSort/470 4.549 ms 4.541 ms -0.2%
sort/ghcSort/480 4.691 ms 4.687 ms -0.1%
sort/ghcSort/490 4.785 ms 4.792 ms +0.1%
sort/ghcSort/500 4.922 ms 4.920 ms -0.0%
sort/insertionSort/10 44.37 μs 44.51 μs +0.3%
sort/insertionSort/20 160.6 μs 160.0 μs -0.4%
sort/insertionSort/30 351.9 μs 349.2 μs -0.8%
sort/insertionSort/40 614.9 μs 609.2 μs -0.9%
sort/insertionSort/50 951.2 μs 943.6 μs -0.8%
sort/insertionSort/60 1.373 ms 1.360 ms -0.9%
sort/insertionSort/70 1.854 ms 1.840 ms -0.8%
sort/insertionSort/80 2.414 ms 2.399 ms -0.6%
sort/insertionSort/90 3.060 ms 3.025 ms -1.1%
sort/insertionSort/100 3.772 ms 3.725 ms -1.2%
sort/insertionSort/110 4.532 ms 4.505 ms -0.6%
sort/insertionSort/120 5.401 ms 5.361 ms -0.7%
sort/insertionSort/130 6.370 ms 6.289 ms -1.3%
sort/insertionSort/140 7.401 ms 7.267 ms -1.8%
sort/insertionSort/150 8.421 ms 8.380 ms -0.5%
sort/insertionSort/160 9.591 ms 9.520 ms -0.7%
sort/insertionSort/170 10.87 ms 10.77 ms -0.9%
sort/insertionSort/180 12.21 ms 12.02 ms -1.6%
sort/insertionSort/190 13.59 ms 13.41 ms -1.3%
sort/insertionSort/200 15.06 ms 14.94 ms -0.8%
sort/insertionSort/210 16.58 ms 16.44 ms -0.8%
sort/insertionSort/220 18.11 ms 18.06 ms -0.3%
sort/insertionSort/230 19.86 ms 19.83 ms -0.2%
sort/insertionSort/240 21.64 ms 21.50 ms -0.6%
sort/insertionSort/250 23.46 ms 23.42 ms -0.2%
sort/insertionSort/260 25.47 ms 25.22 ms -1.0%
sort/insertionSort/270 27.50 ms 27.29 ms -0.8%
sort/insertionSort/280 29.54 ms 29.32 ms -0.7%
sort/insertionSort/290 31.80 ms 31.63 ms -0.5%
sort/insertionSort/300 34.07 ms 33.71 ms -1.1%
sort/insertionSort/310 36.26 ms 35.99 ms -0.7%
sort/insertionSort/320 38.87 ms 38.63 ms -0.6%
sort/insertionSort/330 41.25 ms 40.94 ms -0.8%
sort/insertionSort/340 43.74 ms 43.50 ms -0.5%
sort/insertionSort/350 46.44 ms 46.04 ms -0.9%
sort/insertionSort/360 49.12 ms 49.00 ms -0.2%
sort/insertionSort/370 52.09 ms 51.63 ms -0.9%
sort/insertionSort/380 54.98 ms 54.85 ms -0.2%
sort/insertionSort/390 58.09 ms 57.84 ms -0.4%
sort/insertionSort/400 60.70 ms 60.81 ms +0.2%
sort/insertionSort/410 64.34 ms 63.85 ms -0.8%
sort/insertionSort/420 67.28 ms 66.90 ms -0.6%
sort/insertionSort/430 70.62 ms 70.72 ms +0.1%
sort/insertionSort/440 73.90 ms 73.61 ms -0.4%
sort/insertionSort/450 77.43 ms 76.98 ms -0.6%
sort/insertionSort/460 81.24 ms 81.12 ms -0.1%
sort/insertionSort/470 84.72 ms 83.99 ms -0.9%
sort/insertionSort/480 88.27 ms 87.73 ms -0.6%
sort/insertionSort/490 92.17 ms 91.94 ms -0.2%
sort/insertionSort/500 95.69 ms 96.09 ms +0.4%
sort/mergeSort/10 115.4 μs 114.3 μs -1.0%
sort/mergeSort/20 277.6 μs 275.5 μs -0.8%
sort/mergeSort/30 457.4 μs 456.3 μs -0.2%
sort/mergeSort/40 650.0 μs 650.7 μs +0.1%
sort/mergeSort/50 843.0 μs 843.6 μs +0.1%
sort/mergeSort/60 1.062 ms 1.067 ms +0.5%
sort/mergeSort/70 1.275 ms 1.282 ms +0.5%
sort/mergeSort/80 1.504 ms 1.498 ms -0.4%
sort/mergeSort/90 1.721 ms 1.733 ms +0.7%
sort/mergeSort/100 1.945 ms 1.956 ms +0.6%
sort/mergeSort/110 2.175 ms 2.181 ms +0.3%
sort/mergeSort/120 2.430 ms 2.429 ms -0.0%
sort/mergeSort/130 2.697 ms 2.712 ms +0.6%
sort/mergeSort/140 2.904 ms 2.917 ms +0.4%
sort/mergeSort/150 3.131 ms 3.142 ms +0.4%
sort/mergeSort/160 3.411 ms 3.418 ms +0.2%
sort/mergeSort/170 3.638 ms 3.644 ms +0.2%
sort/mergeSort/180 3.893 ms 3.910 ms +0.4%
sort/mergeSort/190 4.153 ms 4.176 ms +0.6%
sort/mergeSort/200 4.391 ms 4.420 ms +0.7%
sort/mergeSort/210 4.667 ms 4.695 ms +0.6%
sort/mergeSort/220 4.914 ms 4.928 ms +0.3%
sort/mergeSort/230 5.181 ms 5.224 ms +0.8%
sort/mergeSort/240 5.447 ms 5.475 ms +0.5%
sort/mergeSort/250 5.763 ms 5.782 ms +0.3%
sort/mergeSort/260 6.020 ms 6.042 ms +0.4%
sort/mergeSort/270 6.249 ms 6.265 ms +0.3%
sort/mergeSort/280 6.494 ms 6.555 ms +0.9%
sort/mergeSort/290 6.746 ms 6.773 ms +0.4%
sort/mergeSort/300 6.991 ms 7.031 ms +0.6%
sort/mergeSort/310 7.261 ms 7.319 ms +0.8%
sort/mergeSort/320 7.603 ms 7.657 ms +0.7%
sort/mergeSort/330 7.833 ms 7.845 ms +0.2%
sort/mergeSort/340 8.095 ms 8.148 ms +0.7%
sort/mergeSort/350 8.392 ms 8.477 ms +1.0%
sort/mergeSort/360 8.700 ms 8.720 ms +0.2%
sort/mergeSort/370 8.970 ms 9.012 ms +0.5%
sort/mergeSort/380 9.291 ms 9.336 ms +0.5%
sort/mergeSort/390 9.572 ms 9.610 ms +0.4%
sort/mergeSort/400 9.767 ms 9.833 ms +0.7%
sort/mergeSort/410 10.11 ms 10.13 ms +0.2%
sort/mergeSort/420 10.39 ms 10.43 ms +0.4%
sort/mergeSort/430 10.71 ms 10.75 ms +0.4%
sort/mergeSort/440 10.92 ms 10.97 ms +0.5%
sort/mergeSort/450 11.17 ms 11.28 ms +1.0%
sort/mergeSort/460 11.50 ms 11.62 ms +1.0%
sort/mergeSort/470 11.79 ms 11.91 ms +1.0%
sort/mergeSort/480 12.12 ms 12.19 ms +0.6%
sort/mergeSort/490 12.46 ms 12.53 ms +0.6%
sort/mergeSort/500 12.78 ms 12.87 ms +0.7%
sort/quickSort/10 106.6 μs 102.7 μs -3.7%
sort/quickSort/20 390.7 μs 380.6 μs -2.6%
sort/quickSort/30 853.0 μs 835.3 μs -2.1%
sort/quickSort/40 1.518 ms 1.485 ms -2.2%
sort/quickSort/50 2.374 ms 2.342 ms -1.3%
sort/quickSort/60 3.415 ms 3.370 ms -1.3%
sort/quickSort/70 4.658 ms 4.571 ms -1.9%
sort/quickSort/80 6.100 ms 5.990 ms -1.8%
sort/quickSort/90 7.739 ms 7.589 ms -1.9%
sort/quickSort/100 9.455 ms 9.276 ms -1.9%
sort/quickSort/110 11.48 ms 11.30 ms -1.6%
sort/quickSort/120 13.62 ms 13.45 ms -1.2%
sort/quickSort/130 16.03 ms 15.77 ms -1.6%
sort/quickSort/140 18.41 ms 18.13 ms -1.5%
sort/quickSort/150 21.23 ms 20.90 ms -1.6%
sort/quickSort/160 24.05 ms 23.67 ms -1.6%
sort/quickSort/170 27.15 ms 26.76 ms -1.4%
sort/quickSort/180 30.44 ms 30.07 ms -1.2%
sort/quickSort/190 33.90 ms 33.43 ms -1.4%
sort/quickSort/200 37.55 ms 36.90 ms -1.7%
sort/quickSort/210 41.40 ms 40.83 ms -1.4%
sort/quickSort/220 45.40 ms 44.64 ms -1.7%
sort/quickSort/230 49.61 ms 49.00 ms -1.2%
sort/quickSort/240 54.10 ms 53.38 ms -1.3%
sort/quickSort/250 58.68 ms 57.60 ms -1.8%
sort/quickSort/260 63.36 ms 62.56 ms -1.3%
sort/quickSort/270 68.31 ms 67.42 ms -1.3%
sort/quickSort/280 73.48 ms 72.49 ms -1.3%
sort/quickSort/290 78.92 ms 77.64 ms -1.6%
sort/quickSort/300 84.46 ms 83.00 ms -1.7%
sort/quickSort/310 90.40 ms 88.97 ms -1.6%
sort/quickSort/320 96.27 ms 94.73 ms -1.6%
sort/quickSort/330 102.3 ms 100.8 ms -1.5%
sort/quickSort/340 109.3 ms 107.2 ms -1.9%
sort/quickSort/350 115.6 ms 113.6 ms -1.7%
sort/quickSort/360 121.8 ms 120.1 ms -1.4%
sort/quickSort/370 129.4 ms 127.0 ms -1.9%
sort/quickSort/380 136.2 ms 134.1 ms -1.5%
sort/quickSort/390 143.1 ms 141.5 ms -1.1%
sort/quickSort/400 151.1 ms 148.6 ms -1.7%
sort/quickSort/410 159.0 ms 156.6 ms -1.5%
sort/quickSort/420 166.8 ms 164.3 ms -1.5%
sort/quickSort/430 174.9 ms 173.0 ms -1.1%
sort/quickSort/440 182.9 ms 180.5 ms -1.3%
sort/quickSort/450 192.3 ms 190.0 ms -1.2%
sort/quickSort/460 200.7 ms 198.2 ms -1.2%
sort/quickSort/470 209.9 ms 207.1 ms -1.3%
sort/quickSort/480 218.5 ms 215.9 ms -1.2%
sort/quickSort/490 228.6 ms 225.5 ms -1.4%
sort/quickSort/500 237.5 ms 234.2 ms -1.4%
sum/compiled-from-Haskell/sum-right-builtin/10 12.51 μs 9.214 μs -26.3%
sum/compiled-from-Haskell/sum-right-builtin/50 57.72 μs 42.59 μs -26.2%
sum/compiled-from-Haskell/sum-right-builtin/100 116.5 μs 84.57 μs -27.4%
sum/compiled-from-Haskell/sum-right-builtin/500 607.9 μs 450.0 μs -26.0%
sum/compiled-from-Haskell/sum-right-builtin/1000 1.317 ms 979.5 μs -25.6%
sum/compiled-from-Haskell/sum-right-builtin/5000 8.578 ms 6.866 ms -20.0%
sum/compiled-from-Haskell/sum-right-builtin/10000 18.58 ms 15.77 ms -15.1%
sum/compiled-from-Haskell/sum-right-Scott/10 9.632 μs 9.467 μs -1.7%
sum/compiled-from-Haskell/sum-right-Scott/50 44.39 μs 43.47 μs -2.1%
sum/compiled-from-Haskell/sum-right-Scott/100 87.75 μs 86.14 μs -1.8%
sum/compiled-from-Haskell/sum-right-Scott/500 461.3 μs 451.1 μs -2.2%
sum/compiled-from-Haskell/sum-right-Scott/1000 994.0 μs 978.2 μs -1.6%
sum/compiled-from-Haskell/sum-right-Scott/5000 7.075 ms 7.108 ms +0.5%
sum/compiled-from-Haskell/sum-right-Scott/10000 16.39 ms 16.68 ms +1.8%
sum/compiled-from-Haskell/sum-right-data/10 27.13 μs 21.44 μs -21.0%
sum/compiled-from-Haskell/sum-right-data/50 130.4 μs 102.8 μs -21.2%
sum/compiled-from-Haskell/sum-right-data/100 261.8 μs 206.4 μs -21.2%
sum/compiled-from-Haskell/sum-right-data/500 1.437 ms 1.102 ms -23.3%
sum/compiled-from-Haskell/sum-right-data/1000 3.256 ms 2.477 ms -23.9%
sum/compiled-from-Haskell/sum-right-data/5000 18.35 ms 14.01 ms -23.7%
sum/compiled-from-Haskell/sum-right-data/10000 38.45 ms 29.27 ms -23.9%
sum/compiled-from-Haskell/sum-left-builtin/10 11.97 μs 9.103 μs -24.0%
sum/compiled-from-Haskell/sum-left-builtin/50 56.78 μs 42.14 μs -25.8%
sum/compiled-from-Haskell/sum-left-builtin/100 114.2 μs 83.71 μs -26.7%
sum/compiled-from-Haskell/sum-left-builtin/500 595.5 μs 442.5 μs -25.7%
sum/compiled-from-Haskell/sum-left-builtin/1000 1.285 ms 961.1 μs -25.2%
sum/compiled-from-Haskell/sum-left-builtin/5000 8.402 ms 6.963 ms -17.1%
sum/compiled-from-Haskell/sum-left-builtin/10000 17.98 ms 15.10 ms -16.0%
sum/compiled-from-Haskell/sum-left-Scott/10 9.593 μs 9.195 μs -4.1%
sum/compiled-from-Haskell/sum-left-Scott/50 44.19 μs 42.22 μs -4.5%
sum/compiled-from-Haskell/sum-left-Scott/100 87.30 μs 83.83 μs -4.0%
sum/compiled-from-Haskell/sum-left-Scott/500 455.6 μs 437.7 μs -3.9%
sum/compiled-from-Haskell/sum-left-Scott/1000 972.4 μs 937.9 μs -3.5%
sum/compiled-from-Haskell/sum-left-Scott/5000 7.083 ms 6.908 ms -2.5%
sum/compiled-from-Haskell/sum-left-Scott/10000 16.05 ms 15.77 ms -1.7%
sum/compiled-from-Haskell/sum-left-data/10 27.90 μs 21.60 μs -22.6%
sum/compiled-from-Haskell/sum-left-data/50 131.6 μs 103.0 μs -21.7%
sum/compiled-from-Haskell/sum-left-data/100 266.1 μs 206.5 μs -22.4%
sum/compiled-from-Haskell/sum-left-data/500 1.455 ms 1.101 ms -24.3%
sum/compiled-from-Haskell/sum-left-data/1000 3.264 ms 2.448 ms -25.0%
sum/compiled-from-Haskell/sum-left-data/5000 18.29 ms 13.94 ms -23.8%
sum/compiled-from-Haskell/sum-left-data/10000 37.77 ms 28.84 ms -23.6%
sum/hand-written-PLC/sum-right-builtin/10 12.10 μs 9.780 μs -19.2%
sum/hand-written-PLC/sum-right-builtin/50 55.14 μs 44.91 μs -18.6%
sum/hand-written-PLC/sum-right-builtin/100 109.1 μs 88.54 μs -18.8%
sum/hand-written-PLC/sum-right-builtin/500 551.7 μs 450.2 μs -18.4%
sum/hand-written-PLC/sum-right-builtin/1000 1.127 ms 920.1 μs -18.4%
sum/hand-written-PLC/sum-right-builtin/5000 6.327 ms 5.326 ms -15.8%
sum/hand-written-PLC/sum-right-builtin/10000 12.88 ms 10.98 ms -14.8%
sum/hand-written-PLC/sum-right-Scott/10 8.318 μs 8.365 μs +0.6%
sum/hand-written-PLC/sum-right-Scott/50 36.50 μs 35.55 μs -2.6%
sum/hand-written-PLC/sum-right-Scott/100 70.72 μs 70.22 μs -0.7%
sum/hand-written-PLC/sum-right-Scott/500 360.1 μs 358.6 μs -0.4%
sum/hand-written-PLC/sum-right-Scott/1000 739.6 μs 733.7 μs -0.8%
sum/hand-written-PLC/sum-right-Scott/5000 4.825 ms 4.798 ms -0.6%
sum/hand-written-PLC/sum-right-Scott/10000 10.52 ms 10.54 ms +0.2%
sum/hand-written-PLC/sum-left-builtin/10 12.91 μs 10.99 μs -14.9%
sum/hand-written-PLC/sum-left-builtin/50 58.87 μs 49.32 μs -16.2%
sum/hand-written-PLC/sum-left-builtin/100 115.1 μs 96.85 μs -15.9%
sum/hand-written-PLC/sum-left-builtin/500 569.1 μs 476.7 μs -16.2%
sum/hand-written-PLC/sum-left-builtin/1000 1.135 ms 949.7 μs -16.3%
sum/hand-written-PLC/sum-left-builtin/5000 5.665 ms 4.701 ms -17.0%
sum/hand-written-PLC/sum-left-builtin/10000 11.24 ms 9.424 ms -16.2%
sum/hand-written-PLC/sum-left-Scott/10 8.890 μs 8.836 μs -0.6%
sum/hand-written-PLC/sum-left-Scott/50 39.90 μs 39.05 μs -2.1%
sum/hand-written-PLC/sum-left-Scott/100 78.83 μs 77.47 μs -1.7%
sum/hand-written-PLC/sum-left-Scott/500 395.1 μs 388.0 μs -1.8%
sum/hand-written-PLC/sum-left-Scott/1000 795.8 μs 783.2 μs -1.6%
sum/hand-written-PLC/sum-left-Scott/5000 4.603 ms 4.528 ms -1.6%
sum/hand-written-PLC/sum-left-Scott/10000 9.482 ms 9.340 ms -1.5%

@effectfully
Contributor Author

@michaelpj seems like I forgot to hit "comment" last time I wrote a response to your latest comment. Here's another attempt.

> Is the "real" answer here that we can return any term that introduces no new variable binders? e.g. introducing force/delay/constr all also seem fine, because it's adding new binders that really throws us off (because then we'd need to shift any binders in the input terms depending on where we put them, etc.).

Yeah, I guess, although we don't seem to need force or delay. Yet anyway.

Although we could allow constructing arbitrary terms as long as it's impossible to embed values supplied as arguments into them, so that there's no chance of mixing up original and created-by-the-builtin variables, but I'm not even sure what we could use that for, so probably not worth discussing it.

Although even then... can I do this?

\h a -> h 1 a

Yep, the machinery implemented in this PR is enough for that, just checked:

this :: Opaque val (Integer -> a -> b) -> Opaque val a -> Opaque (HeadSpine val) b
this h (Opaque a) = headSpine h [fromValue (1 :: Integer), a]
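To make the shape of that result concrete, here is a minimal, self-contained sketch. It is a toy model, not the PR's code: `Val`, `headSpineToy` and `thisToy` are hypothetical names, and the `Opaque` wrapper and type-level machinery are left out entirely. It only shows that the builtin's result is a head (the function `h`) followed by a flat spine of already-evaluated arguments.

```haskell
-- Toy copies of the two types from the PR description (no strictness here).
data Spine a = SpineLast a | SpineCons a (Spine a)
  deriving (Show, Eq)

data HeadSpine a = HeadOnly a | HeadSpine a (Spine a)
  deriving (Show, Eq)

-- Hypothetical value type, just enough to display the result.
data Val = VInt Integer | VFun String
  deriving (Show, Eq)

-- Pair a head with a list of arguments; an empty list yields 'HeadOnly'.
headSpineToy :: a -> [a] -> HeadSpine a
headSpineToy f []       = HeadOnly f
headSpineToy f (x : xs) = HeadSpine f (go x xs)
  where
    go y []       = SpineLast y
    go y (z : zs) = SpineCons y (go z zs)

-- Toy counterpart of the builtin denoting \h a -> h 1 a.
thisToy :: Val -> Val -> HeadSpine Val
thisToy h a = headSpineToy h [VInt 1, a]
```

In this model, `thisToy (VFun "h") (VInt 7)` evaluates to a `HeadSpine` whose head is `VFun "h"` and whose spine holds `VInt 1` followed by `VInt 7`; it is then up to the evaluator to perform the application.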

or this?

\h a -> h (addInteger 1 2) a

Not with the machinery implemented in this PR (neither nested application nor builtin calls are supported within the denotation of a builtin), but we can add support for those later if we want to.
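One way to see the limitation is that a spine is flat: every argument has to be a value already, so there is no place to put an argument that is itself an application. The sketch below illustrates this with a hypothetical toy term language (`Term`, `unwind`, `atomOnly` and `toHeadSpine` are made-up names, not part of plutus-core): a left-nested application of atoms converts to a head-spine, while a term with a nested application as an argument does not.

```haskell
-- Toy copies of the two types from the PR description (no strictness here).
data Spine a = SpineLast a | SpineCons a (Spine a)
  deriving (Show, Eq)

data HeadSpine a = HeadOnly a | HeadSpine a (Spine a)
  deriving (Show, Eq)

-- A tiny term language: atoms stand in for values, App is binary application.
data Term = Atom String | App Term Term
  deriving (Show, Eq)

-- Unwind left-nested applications into a head and its arguments, in order.
unwind :: Term -> (Term, [Term])
unwind = go []
  where
    go args (App f x) = go (x : args) f
    go args t         = (t, args)

-- An argument fits into a flat spine only if it is an atom.
atomOnly :: Term -> Maybe String
atomOnly (Atom v) = Just v
atomOnly _        = Nothing

-- Succeeds only for terms of the form `head atom ... atom`.
toHeadSpine :: Term -> Maybe (HeadSpine String)
toHeadSpine t = do
  let (hd, args) = unwind t
  h  <- atomOnly hd
  vs <- traverse atomOnly args
  pure $ case vs of
    []     -> HeadOnly h
    x : xs -> HeadSpine h (mkSpine x xs)
  where
    mkSpine y []       = SpineLast y
    mkSpine y (z : zs) = SpineCons y (mkSpine z zs)
```

Here `h 1 a` (as `App (App (Atom "h") (Atom "1")) (Atom "a")`) converts successfully, whereas `h (addInteger 1 2) a` yields `Nothing`, because its first argument is an application rather than an atom.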

Contributor

github-actions bot commented Sep 4, 2024

Comparing benchmark results of 'lists' on '4b8e137e1' (base) and '490183b6c' (PR)

Results table
Script 4b8e137 490183b Change
sort/ghcSort/50 244.2 μs 236.2 μs -3.3%
sort/ghcSort/100 570.9 μs 560.1 μs -1.9%
sort/ghcSort/150 990.4 μs 981.0 μs -0.9%
sort/ghcSort/200 1.330 ms 1.306 ms -1.8%
sort/ghcSort/250 1.720 ms 1.687 ms -1.9%
sort/ghcSort/300 2.258 ms 2.218 ms -1.8%
sort/insertionSort/50 828.6 μs 799.7 μs -3.5%
sort/insertionSort/100 3.305 ms 3.199 ms -3.2%
sort/insertionSort/150 7.452 ms 7.204 ms -3.3%
sort/insertionSort/200 13.33 ms 12.84 ms -3.7%
sort/insertionSort/250 20.89 ms 20.38 ms -2.4%
sort/insertionSort/300 30.12 ms 29.24 ms -2.9%
sort/mergeSort/50 713.6 μs 703.8 μs -1.4%
sort/mergeSort/100 1.647 ms 1.614 ms -2.0%
sort/mergeSort/150 2.643 ms 2.569 ms -2.8%
sort/mergeSort/200 3.739 ms 3.636 ms -2.8%
sort/mergeSort/250 4.879 ms 4.714 ms -3.4%
sort/mergeSort/300 5.942 ms 5.831 ms -1.9%
sort/quickSort/50 1.997 ms 1.951 ms -2.3%
sort/quickSort/100 8.065 ms 7.884 ms -2.2%
sort/quickSort/150 18.16 ms 18.15 ms -0.1%
sort/quickSort/200 32.30 ms 31.51 ms -2.4%
sort/quickSort/250 50.46 ms 49.25 ms -2.4%
sort/quickSort/300 72.65 ms 71.37 ms -1.8%
sum/compiled-from-Haskell/sum-right-builtin/100 98.08 μs 68.94 μs -29.7%
sum/compiled-from-Haskell/sum-right-builtin/500 512.0 μs 360.6 μs -29.6%
sum/compiled-from-Haskell/sum-right-builtin/1000 1.108 ms 769.2 μs -30.6%
sum/compiled-from-Haskell/sum-right-builtin/2500 3.380 ms 2.425 ms -28.3%
sum/compiled-from-Haskell/sum-right-builtin/5000 7.247 ms 5.454 ms -24.7%
sum/compiled-from-Haskell/sum-right-Scott/100 65.14 μs 64.15 μs -1.5%
sum/compiled-from-Haskell/sum-right-Scott/500 340.2 μs 336.6 μs -1.1%
sum/compiled-from-Haskell/sum-right-Scott/1000 730.6 μs 727.7 μs -0.4%
sum/compiled-from-Haskell/sum-right-Scott/2500 2.439 ms 2.456 ms +0.7%
sum/compiled-from-Haskell/sum-right-Scott/5000 5.480 ms 5.453 ms -0.5%
sum/compiled-from-Haskell/sum-right-data/100 261.0 μs 208.5 μs -20.1%
sum/compiled-from-Haskell/sum-right-data/500 1.393 ms 1.144 ms -17.9%
sum/compiled-from-Haskell/sum-right-data/1000 3.132 ms 2.601 ms -17.0%
sum/compiled-from-Haskell/sum-right-data/2500 8.455 ms 7.068 ms -16.4%
sum/compiled-from-Haskell/sum-right-data/5000 17.55 ms 14.84 ms -15.4%
sum/compiled-from-Haskell/sum-left-builtin/100 94.02 μs 67.58 μs -28.1%
sum/compiled-from-Haskell/sum-left-builtin/500 490.6 μs 353.6 μs -27.9%
sum/compiled-from-Haskell/sum-left-builtin/1000 1.048 ms 752.1 μs -28.2%
sum/compiled-from-Haskell/sum-left-builtin/2500 3.202 ms 2.366 ms -26.1%
sum/compiled-from-Haskell/sum-left-builtin/5000 6.973 ms 5.284 ms -24.2%
sum/compiled-from-Haskell/sum-left-Scott/100 62.02 μs 62.50 μs +0.8%
sum/compiled-from-Haskell/sum-left-Scott/500 323.4 μs 329.1 μs +1.8%
sum/compiled-from-Haskell/sum-left-Scott/1000 694.9 μs 708.2 μs +1.9%
sum/compiled-from-Haskell/sum-left-Scott/2500 2.302 ms 2.283 ms -0.8%
sum/compiled-from-Haskell/sum-left-Scott/5000 5.203 ms 5.232 ms +0.6%
sum/compiled-from-Haskell/sum-left-data/100 264.3 μs 210.8 μs -20.2%
sum/compiled-from-Haskell/sum-left-data/500 1.430 ms 1.149 ms -19.7%
sum/compiled-from-Haskell/sum-left-data/1000 3.203 ms 2.596 ms -19.0%
sum/compiled-from-Haskell/sum-left-data/2500 8.624 ms 7.147 ms -17.1%
sum/compiled-from-Haskell/sum-left-data/5000 17.79 ms 14.89 ms -16.3%
sum/hand-written-PLC/sum-right-builtin/100 93.30 μs 66.24 μs -29.0%
sum/hand-written-PLC/sum-right-builtin/500 472.0 μs 333.1 μs -29.4%
sum/hand-written-PLC/sum-right-builtin/1000 978.9 μs 685.9 μs -29.9%
sum/hand-written-PLC/sum-right-builtin/2500 2.729 ms 1.962 ms -28.1%
sum/hand-written-PLC/sum-right-builtin/5000 5.730 ms 4.253 ms -25.8%
sum/hand-written-PLC/sum-right-Scott/100 51.12 μs 52.96 μs +3.6%
sum/hand-written-PLC/sum-right-Scott/500 266.9 μs 263.8 μs -1.2%
sum/hand-written-PLC/sum-right-Scott/1000 556.5 μs 545.9 μs -1.9%
sum/hand-written-PLC/sum-right-Scott/2500 1.694 ms 1.704 ms +0.6%
sum/hand-written-PLC/sum-right-Scott/5000 4.036 ms 3.991 ms -1.1%
sum/hand-written-PLC/sum-left-builtin/100 97.92 μs 70.47 μs -28.0%
sum/hand-written-PLC/sum-left-builtin/500 484.6 μs 349.0 μs -28.0%
sum/hand-written-PLC/sum-left-builtin/1000 962.4 μs 689.2 μs -28.4%
sum/hand-written-PLC/sum-left-builtin/2500 2.387 ms 1.720 ms -27.9%
sum/hand-written-PLC/sum-left-builtin/5000 4.784 ms 3.406 ms -28.8%
sum/hand-written-PLC/sum-left-Scott/100 57.50 μs 58.16 μs +1.1%
sum/hand-written-PLC/sum-left-Scott/500 288.9 μs 289.2 μs +0.1%
sum/hand-written-PLC/sum-left-Scott/1000 586.1 μs 592.8 μs +1.1%
sum/hand-written-PLC/sum-left-Scott/2500 1.632 ms 1.653 ms +1.3%
sum/hand-written-PLC/sum-left-Scott/5000 3.567 ms 3.639 ms +2.0%
TOTAL 421.6 ms 393.6 ms -6.6%
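The Change column appears to be the relative difference between the PR timing and the base timing. That is an inference from the rows, not something the bot states. A minimal sketch of that arithmetic, with `percentChange` as a hypothetical helper name:

```haskell
-- Relative change in percent from a base timing to a PR timing.
-- Both timings must be in the same unit: convert e.g. 769.2 μs to
-- 0.7692 ms before comparing it against 1.108 ms.
percentChange :: Double -> Double -> Double
percentChange base pr = (pr - base) / base * 100
```

For the TOTAL row, `percentChange 421.6 393.6` comes out at roughly -6.6, which matches the table.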

@effectfully removed the EXPERIMENT label ("Experiments that we probably don't want to merge") on Sep 14, 2024
@effectfully
Contributor Author

Closing in favor of #6530.

@effectfully closed this on Oct 1, 2024
@effectfully deleted the effectfully/builtins/pattern-matching-builtins branch on October 1, 2024 at 16:48

3 participants