Named tuples experimental first implementation #19075
Excited for this!
type MyTuple = (name: String, age: Int)
summon[MyTuple <:< Tuple2[String, Int]]
If indeed subtyping exists, then we could actually refer to the named fields with old
type MyTuple = (_1: Int, _2: Int)
summon[MyTuple =:= Tuple2[Int, Int]]
It could be that we should disallow naming them like that, especially:
type MyTuple = (_2: Int, _1: Int) //wrong order
|
In fact, it's the opposite. Named tuples with different names are not in a subtype relation, but unnamed tuples are a subtype of named tuples. It's similar to named parameters: You can still pass an argument without a name by position, but you cannot pass a named argument to a parameter with a different name, just because the position matches. So this works:
type Person = (name: String, age: Int)
val bob: Person = ("Bob", 22)
But this doesn't:
val x: (String, Int) = bob
We can't have both relations because then names would not matter at all, and selection by name would not even be well-defined. And arguably the first relation is a lot more useful than the second. In fact the second relation is actively damaging, because bob's expansion to a tuple is (NamedValue("name", String), NamedValue("age", 22)) and you don't want that to be equivalent to (String, 22) because that would mean that you could strip the important name info at any time.
Maybe just disallow it since it would be genuinely confusing. |
Actually, I have a counter-argument. If I have in my codebase a definition that returns a tuple that I just want to name without immediately changing all the dependent codebase, that could have been useful in gradual adoption, IMO. Maybe not a strong argument in favor, but still is useful. |
There's |
I recently had thoughts about the SIP process and I think any new proposal should define all possible feature interactions (and the implementations tested to cover them). I think feature interaction is where I find the Achilles' heel of newly introduced Scala features.
    if elem > max then max = elem
  (min = min, max = max)

val mm = minMax(1, 3, 400, -3, 10)
Will it still be possible to destructure like this (and not need to call min.value)?
val (min, max) = minMax(1, 3, 400, -3, 10)
Or do we have to do this?
val (min = min, max = max) = minMax(1, 3, 400, -3, 10)
It should be supported, but is not yet implemented.
@soronpo The number of features in Scala is high, and the number of pairwise combinations (and triplets, quadruples, ...) is even higher, so to require "any" proposal to investigate "all" possible feature interactions would effectively make it impractical to write a SIP. It is good to think about feature interaction early on and we can include a section about that in the SIP template, but the requirement should be to investigate the most important/relevant/significant/critical interactions. It is better to leave the complete investigation to the design and implementation phase. Perhaps support from formal reasoning is even needed in the implementation phase...
@soronpo Maybe you want to kick off a thread on Contributors on the topic of "How to handle feature interaction in the SIP process" or similar and we can continue discussing it there? Or else we can discuss this at the next SIP meeting @anatoliykmetyuk |
Feature interaction is a real concern. One way to keep it in check is a reductionist approach, which this proposal follows. We desugar named tuples into regular tuples with named elements, where named elements are just instances of an opaque type. After desugaring, there's almost nothing else to do. The one exception is pattern matching, so that has to be studied carefully. In particular I believe that there's synergy between this proposal and the named pattern matching SIP. |
Pairwise is enough, and you can easily eliminate the trivial ones. It's rare that I hit bugs that require multiple new features to surface, but quite often I find bugs that could have been discovered by a "simple" feature coverage pairing process (e.g., my recent discovery of new style conversions and type classes).
I will make sure it's added to the next meeting's agenda. |
It would be really cool if this turns out to be a feasible unification!!
- Always follow skolems to underlying types
- Also follow other singletons to underlying types if normalize is true
Thank you for exploring this idea!
Besides a couple of comments below, I think it is also important to synthesize Conversions to compatible tuple types.
The test cases don’t show if Mirrors work on named tuples, but I would expect them to work just like they work on regular tuples. Would it be possible to provide access to the field names in the mirrors?
The fact that the following does not compile is a bit counter-intuitive:
val x: (String, Int) = (foo = "hello", x = 42)
Especially since both types have the same run-time representation.
Would you mind elaborating on “we can't have both relations because then names would not matter at all, and selection by name would not even be well-defined”?
In fact the second relation is actively damaging, because bob's expansion to a tuple is (NamedValue("name", String), NamedValue("age", 22)) and you don't want that to be equivalent to (String, 22) because that would mean that you could strip the important name info at any time.
I am not sure that would be so dramatic, this happens all the time already when we upcast a value.
Last, it would be useful to hear from the maintainers of Iskra to see if they would see value in using this feature:
// currently
measurements
  .groupBy($.stationId)
  .agg(
    min($.temperature).as("minTemperature"),
    max($.temperature).as("maxTemperature"),
    avg($.pressure).as("avgPressure")
  )
  .where($.maxTemperature - $.minTemperature < lit(20))
  .select($.stationId, $.avgPressure)
  .show()
// with this proposal
measurements
  .groupBy($.stationId)
  .agg(
    (
      minTemperature = min($.temperature),
      maxTemperature = max($.temperature),
      avgPressure = avg($.pressure)
    )
  )
  .where($.maxTemperature - $.minTemperature < lit(20))
  .select($.stationId, $.avgPressure) // field names would be lost here?
  .show()
(Bob,33)
Would it be possible to show the field names as well?
- (Bob,33)
+ (name = Bob, age = 33)
Not with .toString, since the representation of named and unnamed tuples is exactly the same.
But we really should have a standard Show typeclass that could handle these things.
bob match
  case p @ (name = "Bob", age = _) => println(p.age)
bob match
  case p @ (name = "Peter", age = _) => println(p.age)
Is it possible to use a variable pattern on the right-hand side of the = sign?
- case p @ (name = "Peter", age = _) => println(p.age)
+ case (name = "Peter", age = age) => println(age)
yes, the latest tests contain that pattern.
When matching a regular tuple pattern against a named tuple selector we adapted the tuple to be named, which could hide errors. We now adapt the selector to be unnamed instead.
There's the |
Force-pushed from 73b2ead to e588c5a.
If you have both relations then at least conceptually any named tuple is compatible with any other named tuple (since subtyping is transitive), so what does it even mean to write |
Also disallow checking a named pattern against a top type like Any or Tuple. The names in a named pattern _must_ be statically visible in the selector type.
I was more thinking of the ability to convert from, say, |
@odersky I already came up with an additional feature that can rely on Named Tuples. Would love to get your view on it. |
I find it very confusing that unnamed tuples would be subtypes of the named ones. To me it should be the other way around. A named tuple has a stronger contract than an unnamed tuple, since it assigns specific meaning, through names, to its various fields. A type with a stronger contract should be a subtype of a type with a weaker contract, according to LSP.
object Test:

  object Named:
    opaque type Named[name <: String & Singleton, A] >: A = A
>: A seems to only be there to support the ("bob", 42): (name: String, age: Int) conversion. This is what makes unnamed tuples subtypes of named ones: (String, Int) <:< (name: String, age: Int).
An alternative might be to add the inverse of dropNames to make the conversion in the other direction.
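The effect of the lower bound can be reproduced in isolation. The following is a minimal sketch, not the PR's actual code: the object name `ElementEncoding` and the accessor `unwrap` are made up for illustration. It shows that, outside the defining scope, a plain value conforms to the opaque element type, while the reverse direction needs an explicit accessor.

```scala
object ElementEncoding:
  // Outside this object, Named[N, A] is abstract with lower bound A,
  // so every A is accepted where a Named[N, A] is expected.
  opaque type Named[N <: String & Singleton, A] >: A = A

  extension [N <: String & Singleton, A](x: Named[N, A])
    // Hypothetical accessor: inside the object the alias is transparent.
    def unwrap: A = x

import ElementEncoding.*

// Unnamed value used where a named element is expected: allowed by `>: A`.
val age: Named["age", Int] = 42
val rawAge: Int = age.unwrap

// A whole "named tuple" in this encoding is a regular tuple of Named elements.
type Person = (Named["name", String], Named["age", Int])
val bob: Person = ("Bob", 22)

// val n: Int = age  // would not compile: Named["age", Int] is not a subtype of Int here
```

Dropping the `>: A` bound would make the two `val` definitions with plain right-hand sides fail to type-check, which is the trade-off this comment is pointing at.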
With this encoding we can end up with the malformed (String, Tuple.NamedValue["age", Int]) (which represents (String, age: Int)).
Therefore subtyping relation holds:
(String, Tuple.NamedValue["age", Int]) <:< (name: String, age: Int)
We might not have syntax for it but we might get that type through concatenation or inference.
Maybe an analogy with named parameters helps explain the subtyping rules? I can assign parameters by position to a named parameter list.
def f(param: Int) = ...
f(param = 1) // OK
f(2) // Also OK
But I can't use a name to pass something to an unnamed parameter:
val f: Int => T
f(2) // OK
f(param = 2) // Not OK
The rules for tuples are the same.
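The analogy can be checked with ordinary Scala 3 code that does not involve named tuples at all. This is only an illustrative sketch; `greet` and the value names are made up.

```scala
// A method has parameter names, so arguments can be passed by name or by position.
def greet(name: String, age: Int): String = s"$name is $age"

val byName: String = greet(name = "Bob", age = 22)
val byPosition: String = greet("Bob", 22)

// A function value only has a type; the parameter names of `greet` are gone.
val f: (String, Int) => String = greet
val viaFunction: String = f("Bob", 22)

// f(name = "Bob", age = 22)  // would not compile: Function2.apply has no parameters called name/age
```

In the analogy, the method plays the role of the named tuple type (positional arguments are accepted) and the function value plays the role of the unnamed tuple type (names cannot be used against it).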
We also have complications when pattern matching on named tuples
def foo(x: (name: String, age: Int)) =
  x match
    case (name, _) =>
      name: String // error
  x match
    case name *: _ =>
      name: String // error
  val (name, _) = x
  name: String // error
Similar issues with
def test(coord: (x: Int, y: Int)) =
  val coordList: List[Int] = coord.toList
  ...
@nicolasstucki yes, there's some discussion of this in the pos test. Not sure what to do. We could modify the operations to strip the named parts, but that risks making other operations weaker. For instance:
def swap[A, B](pair: (A, B)): (B, A) = (pair(1), pair(0))
The way things are defined now,
swap((name = "Bob", age = 22)) = (age = 22, name = "Bob")
Arguably, that's useful. There's also the problem that if we tweak apply, then it will behave differently from _1, _2, _3. So my tentative conclusion is that it's better not to do what "users expect" and instead keep the model simple and educate users what to expect.
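Why generic tuple code keeps the names under the element-wise encoding can be sketched as follows. This is an assumption-laden illustration rather than the PR's implementation: `NamedElems` and `unwrap` are hypothetical names, and `swap` uses `_1`/`_2` instead of `apply` to stay close to plain tuples.

```scala
object NamedElems:
  // Element-wise encoding: the name is a phantom type argument of each element.
  opaque type Named[N <: String & Singleton, A] >: A = A

  extension [N <: String & Singleton, A](x: Named[N, A])
    def unwrap: A = x

import NamedElems.*

// Ordinary generic code over pairs; it knows nothing about names.
def swap[A, B](pair: (A, B)): (B, A) = (pair._2, pair._1)

val bobPair: (Named["name", String], Named["age", Int]) = ("Bob", 22)

// A and B are instantiated to the Named element types, so the names move with the values.
val swappedPair: (Named["age", Int], Named["name", String]) = swap(bobPair)
```

Stripping the names inside operations such as `apply` would make the result type of `swap` lose this information, which is the weakening the comment warns about.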
Initially I also thought that this would be good enough. The issue is that when I tried to use more realistic use cases I noticed that we would need to use The issue I am worried about is that there are more methods that do not work than ones that do work.
|
Maybe instead of
@bmeesters I assume that's what Martin meant (not a language feature, but in the standard library). I agree projections and conversions from case classes are useful, but they will always allocate. I guess that's the tradeoff for named tuples: because the runtime representation is fixed, a field access can be efficiently compiled directly as such. Structural types are on the other end: no conversion required, but field access has to go through reflection.
It would also lead to a great duplication of the whole machinery we have for tuples (which is extensive). For instance, we have a lot of code for specializing tuples (and it's still not enough, by a large margin!). Do we want to duplicate all of that for named tuples? I think we'll likely never finish doing this. |
That's not what the spec says or what I see in the tests. I see:
def foo(x: (name: String, age: Int)) =
  x match
    case (name, _) =>
      name: String // OK
  x match
    case name *: _ =>
      name: String // error
  val (name, _) = x
  name: String // OK (in fact, currently warning about refutable match, but this will be fixed)
What does "not work" mean? For instance The |
A possible improvement is to make named tuple elements easier to deal with in isolation. We already have a good
object Element:
  def apply[S <: String & Singleton, A](name: S, x: A): Element[name.type, A] = x
  inline def unapply[S <: String & Singleton, A](named: Element[S, A]): Some[(S, A)] =
    Some((compiletime.constValue[S], named))
With that, we can write code like this one:
import NamedTuple.Element
val Element(nameStr, n) *: Element(ageStr, a) *: EmptyTuple = bob
println(s"matched elements ($nameStr, $n), ($ageStr, $a)")
val Element(ageStr1, age) = bob(1)
assert(ageStr1 == "age" && age == 33)
- Add unapply method to NamedTuple.Element
- Avoid spurious refutability warning when matching a named tuple RHS against an unnamed pattern.
We could also try
// Almost complete implementation of NamedTuple[Nme <: Tuple, Tup <: Tuple]
object NamedTuples:
  import Tuple.*
  opaque type NamedTuple[Nme <: Tuple, Tup <: Tuple] = Tup
  extension [Nme <: Tuple, Tup <: Tuple](tup: NamedTuple[Nme, Tup])
    inline def dropNames: Tup = tup
    inline def toList: List[Union[Tup]] = (tup: Tup).toList
    inline def toArray: Array[Object] = tup.toArray
    inline def toIArray: IArray[Object] = tup.toIArray
    inline def ++[Nme2 <: Tuple, Tup2 <: Tuple](that: NamedTuple[Nme2, Tup2]): NamedTuple[Concat[Nme, Nme2], Concat[Tup, Tup2]] = tup ++ that
    // inline def :* [L] (x: L): NamedTuple[Append[Nme, ???], Append[Tup, L] = ???
    // inline def *: [H] (x: H): NamedTuple[??? *: Nme], H *: Tup] = ???
    inline def size: Size[Tup] = tup.size
    inline def map[F[_]](f: [t] => t => F[t]): NamedTuple[Nme, Map[Tup, F]] = (tup: Tup).map(f)
    inline def take(n: Int): NamedTuple[Take[Nme, n.type], Take[Tup, n.type]] = tup.take(n)
    inline def drop(n: Int): NamedTuple[Drop[Nme, n.type], Drop[Tup, n.type]] = tup.drop(n)
    inline def splitAt(n: Int): NamedTuple[Split[Nme, n.type], Split[Tup, n.type]] = tup.splitAt(n)
    inline def reverse: NamedTuple[Reverse[Nme], Reverse[Tup]] = tup.reverse
    inline def zip[Tup2 <: Tuple](that: NamedTuple[Nme, Tup2]): NamedTuple[Nme, Zip[Tup, Tup2]] =
      tup.zip(that) // note that this zips only if names are equal
  end extension
  extension [Nme <: NonEmptyTuple, Tup <: NonEmptyTuple](tup: NamedTuple[Nme, Tup])
    inline def apply(n: Int): Elem[Tup, n.type] = tup(n)
    inline def head: Head[Tup] = tup.head
    inline def last: Last[Tup] = tup.last
    inline def tail: NamedTuple[Tail[Nme], Tail[Tup]] = tup.tail
    inline def init: NamedTuple[Init[Nme], Init[Tup]] = tup.init
  end extension
  // def unapply[H, T <: Tuple](x: H *: T): (H, T) = (x.head, x.tail)
import NamedTuples.*
def test(x: NamedTuple[("name", "age"), (String, Int)], y: NamedTuple[("user", "email"), (String, String)]) =
  x.head : String
  x.last : Int
  x(0) : String
  x(1) : Int
  x ++ y : NamedTuple[("name", "age", "user", "email"), (String, Int, String, String)]
  x.toList : List[Int | String]
  x.head : String
  x.tail : NamedTuple["age" *: EmptyTuple, Int *: EmptyTuple]
  x.last : Int
  x.init : NamedTuple["name" *: EmptyTuple, String *: EmptyTuple]
We do have some optimization for tuples that are not identified because of some extra
We could also use
- opaque type NamedTuple[Nme <: Tuple, Tup <: Tuple] = Tup
+ opaque type NamedTuple[Nme <: Tuple, Tup <: Tuple] >: Tup = Tup
to make this conversion work
val _: NamedTuple[("name", "age"), (String, Int)] = ("bob", 42)
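A much smaller, self-contained variant of this pair-of-tuples encoding can make the idea concrete. It is a sketch under assumptions: the object name `NamedTupleSketch` and the `names` helper are hypothetical additions, and only `dropNames` mirrors an operation from the listing above. The names exist purely in the first type argument and are materialized with `scala.compiletime.constValueTuple`.

```scala
import scala.compiletime.constValueTuple

object NamedTupleSketch:
  // Names live only in the type argument N; at runtime the value is just V.
  // The lower bound `>: V` lets a plain tuple be used where a named one is expected.
  opaque type NamedTuple[N <: Tuple, V <: Tuple] >: V = V

  extension [N <: Tuple, V <: Tuple](x: NamedTuple[N, V])
    // Forget the names and return the underlying tuple.
    inline def dropNames: V = x
    // Hypothetical helper: turn the literal name types into runtime strings.
    inline def names: List[String] =
      constValueTuple[N].productIterator.map(_.toString).toList

import NamedTupleSketch.*

val person: NamedTuple[("name", "age"), (String, Int)] = ("Bob", 22)
val personNames: List[String] = person.names
val personValues: (String, Int) = person.dropNames
```

Because names and values are separate tuples, a type such as `(String, age: Int)` with mixed named and unnamed elements cannot be expressed in this encoding.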
That's an interesting idea (and matches what @lrytz proposed). I might have been too concerned about duplication, maybe that's manageable. |
It's just a tiny change, once we have named tuples.
@nicolasstucki This looks interesting! I guess my main question is how easy or hard would it be to define new generic operations over named tuples. As a start, how would we implement and name the analogue of |
In my example encoding, I managed to encode that operation with this extension
opaque type NamedTuple[Nme <: Tuple, Tup <: Tuple] = Tup
extension [Nme <: Tuple, Tup <: Tuple](tup: NamedTuple[Nme, Tup])
  ...
  inline def apply[N <: String & Singleton]: NamedAddition[N, Nme, Tup] = tup
end extension
extension [Nme <: NonEmptyTuple, Tup <: NonEmptyTuple](tup: NamedTuple[Nme, Tup])
  // inline def apply(n: Int): Elem[Tup, n.type] = tup(n) // removed to make the other apply work when only type parameters are passed
  ...
end extension
opaque type NamedAddition[N <: String & Singleton, Nme <: Tuple, Tup <: Tuple] = Tup
extension [N <: String & Singleton, Nme <: Tuple, Tup <: Tuple](tup: NamedAddition[N, Nme, Tup])
  inline def *:[H] (x: H): NamedTuple[N *: Nme, H *: Tup] = x *: tup
  inline def :*[L] (x: L): NamedTuple[Append[Nme, N], Append[Tup, L]] = tup :* x
end extension
def test(tup: NamedTuple[("name", "age"), (String, Int)]) =
  tup ["user"]*: "bob1"
  tup ["user"]:* "bob2"
  tup["user"] *: "bob1"
  tup["user"] :* "bob2"
I also tried
extension [Nme <: Tuple, Tup <: Tuple](tup: NamedTuple[Nme, Tup])
  ...
  inline def :*[N <: String & Singleton]: NamedAppend[N, Nme, Tup] = tup
  inline def *:[N <: String & Singleton]: NamedAppend[N, Nme, Tup] = tup
opaque type NamedPrepend[N <: String & Singleton, Nme <: Tuple, Tup <: Tuple] = Tup
object NamedPrepend:
  extension [N <: String & Singleton, Nme <: Tuple, Tup <: Tuple](tup: NamedPrepend[N, Nme, Tup])
    inline def apply[H] (x: H): NamedTuple[N *: Nme, H *: Tup] = (x *: tup).asInstanceOf[NamedTuple[N *: Nme, H *: Tup]]
opaque type NamedAppend[N <: String & Singleton, Nme <: Tuple, Tup <: Tuple] = Tup
object NamedAppend:
  extension [N <: String & Singleton, Nme <: Tuple, Tup <: Tuple](tup: NamedAppend[N, Nme, Tup])
    inline def apply[L] (x: L): NamedTuple[Append[Nme, N], Append[Tup, L]] = (tup :* x).asInstanceOf[NamedTuple[Append[Nme, N], Append[Tup, L]]]
def test(tup: NamedTuple[("name", "age"), (String, Int)]) =
  tup *:["user"] "bob1" // error: expression expected but '[' found
  tup :*["user"] "bob2" // error: expression expected but '[' found
Great. Here's another challenge: We need to express an operation |
like this?
val bob = (name = "Bob", age = 23)
val scala = (age = 19, platforms = List("jvm", "js", "native"))
val bobScala = bob updateWith scala
assert(bobScala == (name = "Bob", age = 19, platforms = List("jvm", "js", "native")))
@bishabosha Yes, exactly. |
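The intended semantics of such an update can be spelled out at the value level, independently of any named-tuple encoding. The sketch below is a hypothetical runtime model only: it represents a named tuple as an ordered sequence of label/value pairs, and the names `Fields` and `updateWith` are illustrative. Fields already present keep their position and take the new value; new fields are appended in order.

```scala
// Hypothetical runtime model: a named tuple as ordered (label, value) pairs.
type Fields = Vector[(String, Any)]

def updateWith(base: Fields, updates: Fields): Fields =
  val updateMap = updates.toMap
  // Keep the order of `base`, replacing values whose label also occurs in `updates`.
  val replaced = base.map { case (label, value) => label -> updateMap.getOrElse(label, value) }
  // Append the fields that are new in `updates`, in their original order.
  val added = updates.filterNot { case (label, _) => base.exists(_._1 == label) }
  replaced ++ added

val bobFields: Fields = Vector("name" -> "Bob", "age" -> 23)
val scalaFields: Fields = Vector("age" -> 19, "platforms" -> List("jvm", "js", "native"))
val merged: Fields = updateWith(bobFields, scalaFields)
```

A typed version has to compute the same merge on the name and value tuples at the type level, which is where the difficulty discussed in this thread lies.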
Shows how UpdateWith on named tuples can be implemented on the type level.
Just pushed a test that implements So far I have the impression that a |
I had a go with the old encoding of named tuples - just that it doesn't handle two labels of the same name but different type:
import scala.language.experimental.namedTuples
import quoted.*
extension [Ts <: Tuple](ts: Ts)
  infix inline def updateWith[Us <: Tuple](us: Us): Ops.UpdateWith[Ts, Us] =
    // probably could optimise to unique integer keys across both tuples
    val tsLabels = LabelledWitness.resolveLabels[Ts]
    val usLabels = LabelledWitness.resolveLabels[Us]
    // use ListMap to preserve insertion order
    val tsMap = tsLabels.iterator.zip(ts.productIterator).to(collection.immutable.ListMap)
    val merged = usLabels.iterator.zip(us.productIterator).foldLeft(tsMap):
      case (acc, (label, elem)) => acc.updated(label, elem)
    val values = merged.valuesIterator.toArray
    Tuple.fromArray(values).asInstanceOf[Ops.UpdateWith[Ts, Us]]
object LabelledWitness:
  inline def resolveLabels[Ts <: Tuple]: List[String] = ${ resolveLabelsImpl[Ts] }
  def resolveLabelsImpl[Ts <: Tuple: Type](using Quotes): Expr[List[String]] =
    import quotes.reflect.*
    val namedTupleElement = Symbol.requiredModule("scala.NamedTuple").typeMember("Element")
    def inner[Ts <: Tuple: Type](acc: List[String]): List[String] =
      Type.of[Ts] match
        case '[EmptyTuple] => acc
        case '[t *: tail] => TypeRepr.of[t] match
          case AppliedType(ref, List(ConstantType(StringConstant(s0)), _)) if ref.typeSymbol == namedTupleElement =>
            inner[tail](acc :+ s0)
          case _ =>
            report.errorAndAbort(s"Expected NamedTuple.Element, got: ${Type.show[t]}")
    Expr(inner[Ts](Nil))
object Ops:
  type Contains[Needle, Haystack <: Tuple] <: Boolean = Haystack match
    case EmptyTuple => false
    case Needle *: ts => true
    case t *: ts => Contains[Needle, ts]
  type UpdateWith[Acc <: Tuple, Explore <: Tuple] <: Tuple = Explore match
    case EmptyTuple => Acc
    case t *: ts => Contains[t, Acc] match
      case true => UpdateWith[Acc, ts]
      case false => UpdateWith[Tuple.Append[Acc,t], ts]
test case in a separate file:
val bob = (name = "Bob", age = 23)
val scalaLang = (age = 19, platforms = List("jvm", "js", "native"))
val bobScala = bob updateWith scalaLang
@main def Test() =
  assert(bobScala == (name = "Bob", age = 19, platforms = List("jvm", "js", "native")))
#19174 implements the suggestion by @lrytz and @nicolasstucki. Overall, I think I prefer that version. |
Closed in favor of #19174 |
Just a periodic reminder that Scala subtyping is not transitive (notably due to path-dependent types selecting abstract types with both upper and lower bounds) and that this relation would not have to be transitive either. In which situations would you foresee a problem?
It means what the type of |
One problem I would see is here:
val x: (name: String, age: Person)
val y: (String, Person)
val z = if ??? then x else y
What is the type of
One could decree that the LUB should go either way, and as long as it's done consistently I don't see how that could lead to an issue. Here, intuitively I'd say But TypeScript (which also subtypes in both directions) happens to think the other way:
function lub(b: boolean, x: [name: String, age: number], y: [String, number]) { return b ? x : y }
// function lub(b: boolean, x: [name: String, age: number], y: [String, number]): [name: String, age: number]
There are no specific rules for disambiguating LUBs, and I would think we prefer if we could keep it that way. I also prefer Typescript's behavior. You have the names, why throw them away? In summary my tendency is to keep the subtyping unnamed <: named and complement it with an implicit conversion named -> unnamed. |
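The combination described here (unnamed conforms to named by subtyping, named converts to unnamed implicitly) can be sketched on top of a pair-of-tuples opaque type. This is an assumed, simplified model rather than the design that was eventually implemented; `ConvSketch` and `dropNamesConversion` are hypothetical names.

```scala
import scala.language.implicitConversions

object ConvSketch:
  // unnamed <: named: a plain tuple V conforms to NamedTuple[N, V] via the lower bound.
  opaque type NamedTuple[N <: Tuple, V <: Tuple] >: V = V

  // named -> unnamed: not a subtype relation, but an implicit conversion that forgets the names.
  given dropNamesConversion[N <: Tuple, V <: Tuple]: Conversion[NamedTuple[N, V], V] =
    (x: NamedTuple[N, V]) => x

import ConvSketch.{NamedTuple, given}

type Person = NamedTuple[("name", "age"), (String, Int)]

val bob: Person = ("Bob", 22)     // accepted by subtyping
val plain: (String, Int) = bob    // accepted through the conversion
```

Since only one direction is a subtype relation, the least upper bound of a named and an unnamed tuple stays the named type, matching the preference stated above.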
This implementation follows the alternative representation scheme, where a named tuple type is represented as a pair of two tuples: one for the names, the other for the values. Compare with #19075, where named tuples were regular types, with special element types that combine name and value. In both cases, we use an opaque type alias so that named tuples are represented at runtime by just their values - the names are forgotten.
The new implementation has some advantages
- we can control in the type that named and unnamed elements are not mixed,
- no element types are leaked to user code,
- non-sensical operations such as concatenating a named and an unnamed tuple are statically excluded,
- it's generally easier to enforce well-formedness constraints on the type level.
The main disadvantage compared to #19075 is that there is a certain amount of duplication in types and methods between `Tuple` and `NamedTuple`. On the other hand, we can make sure by this that no non-sensical tuple operations are accidentally applied to named tuples.
No docs yet, but here are some tests to show what is supported:
tests/run/named-tuples.scala
EDIT: Docs are now included.