Scala Records introduce a data type, Rec, for representing record types. Records are convenient for accessing and manipulating semi-structured data. They are similar in functionality to F# records and shapeless records; however, they do not impose an ordering on their fields. The most relevant use cases are:
- Manipulating large tables in big-data frameworks like Spark and Scalding
- Manipulating results of SQL queries
- Manipulating JSON, YAML, XML, etc.
Records are implemented using macros and blend completely into the Scala environment. With records:
- Fields are accessed with a path, just like fields of regular case classes (e.g. rec.country.state)
- Type errors are comprehensible and elaborate
- Auto-completion in the Eclipse IDE works seamlessly
- Run-time performance is high due to specialization with macros
- Compile-time performance is high due to the use of macros
To create a simple record, run:
import records.Rec
scala> val person = Rec("name" -> "Hannah", "age" -> 30)
person: records.Rec{def name: String; def age: Int} = Rec { name = Hannah, age = 30 }
Fields of records can be accessed just like fields of classes:
if (person.age > 18) println(s"${person.name} is an adult.")
Scala Records allow for arbitrary levels of nesting:
val person = Rec(
"name" -> "Hannah",
"age" -> 30,
"country" -> Rec("name" -> "US", "state" -> "CA"))
They can be explicitly converted to case classes:
case class Country(name: String, state: String)
case class Person(name: String, age: Int, country: Country)
val personClass = person.to[Person]
Records can also be converted implicitly when the contents of records.RecordConversions are imported:
import records.RecordConversions._
val personClass: Person = person
In case of erroneous access, type errors will be comprehensible:
scala> person.nme
<console>:10: error: value nme is not a member of records.Rec{def name: String; def age: Int}
person.nme
^
Errors are also informative when converting to case classes:
val person = Rec("name" -> "Hannah", "age" -> 30)
val personClass = person.to[Person]
<console>:13: error: Converting to Person would require the source record to have the following additional fields: [country: Country].
val personClass = person.to[Person]
^
To include Scala Records in your SBT build, add:
libraryDependencies += "ch.epfl.lamp" %% "scala-records" % <version>
It is "safe" to use Scala Records in your project. They cross-compile against all minor Scala versions after 2.10.2. We will make our best effort to fix all bugs promptly until we find a more principled, and functioning, solution for accessing semi-structured data in Scala. For further details see this page.
- Record types must not be mentioned explicitly. An explicit mention results in a run-time exception. In 2.11.x this would be detected by a warning. For example:
val rec: Rec { def x: Int } = Rec("x" -> 1)
rec.x // throws an exception
- Fixing SI-7340 would resolve this issue.
- A workaround would be to write a case class for a record type.
- Records will not display nicely in IntelliJ IDEA, because IntelliJ IDEA does not support whitebox macros.
- Writing a custom implementation for IntelliJ would remove this limitation.
- In the Eclipse debugger, records cannot be debugged when conversions to case classes are used. For this to work, the IDE must understand the behavior of implicit macros.
- In the Eclipse debugger, records display as their underlying data structures. If these structures are optimized, it is hard to keep track of the fields.
- All record calls will fire a warning for a reflective macro call.
[warn] 109: reflective access of structural type member macro method baz should be enabled
[warn] by making the implicit value scala.language.reflectiveCalls visible.
[warn] row.baz should be (1.7)
To disable this warning, users must introduce import scala.language.reflectiveCalls in scope or set the compiler option -language:reflectiveCalls.
- Least upper bounds (LUBs) of two records cannot be found. Consequences are the following:
- If two queries return the same records, the results cannot be directly combined under the same type. For example, List(Rec("a" -> 1), Rec("a" -> 2)) will not be usable.
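The reflective-call warning described above is not specific to records; the same language feature applies to any structural type access. The following is a minimal plain-Scala sketch, assuming Scala 2 and no dependency on Scala Records. The names ReflectiveCallsDemo, HasBaz, Row, and readBaz are hypothetical and exist only for illustration; the sketch shows where the import sits, not how records are implemented.

```scala
// Plain-Scala sketch; it does not use Scala Records. All names are hypothetical.
// Without this import, Scala 2 reports a feature warning for the call to row.baz below.
import scala.language.reflectiveCalls

object ReflectiveCallsDemo {
  // A structural type: any value that has a `baz` member returning Double.
  type HasBaz = { def baz: Double }

  // Stand-in for a row; it conforms to HasBaz without extending it.
  class Row { def baz: Double = 1.7 }

  // Selecting `baz` through the structural type is a reflective access.
  def readBaz(row: HasBaz): Double = row.baz

  def main(args: Array[String]): Unit =
    println(readBaz(new Row))
}
```

Placing the import at the top of a file enables the feature for that file only, while the -language:reflectiveCalls compiler option enables it for the whole build.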
Scala Records compile and run asymptotically faster than type-based approaches to records (e.g. HMaps). For up-to-date benchmarks, check out this repo.
If you would like new functionality or find errors in the existing functionality, please report them in the issue tracker. We will gladly discuss further development and accept your pull requests.
Scala Records are developed with love and joy in the Scala Lab at EPFL in collaboration with Michael Armbrust from Databricks. Main contributors are:
- Vojin Jovanovic (@vjovanov)
- Tobias Schlatter (@gzm0)
- Hubert Plociniczak (@hubertp)