# language-proposals
z
Has there ever been a discussion about more idiomatic union types in Kotlin? Sealed classes have a lot of limitations compared to conventional counterparts in other languages. I’d love to propose something like this:
sealed typealias Pet = Dog | Cat | String
• Keywords already exist. RHS expression syntax would need a little work, but it could work
• At runtime they’re all just objects being cast, but there’s plenty of precedent for this (generics)
• Self-contained in the alias expression
• Doesn’t allow anonymous expressions like TypeScript or similar languages do, which I think is a net positive
• Can be compiler-checked just like sealed classes but without requiring inheritance. Instance check just results in a smart cast like sealed classes or other instance checks do today
when (somePet) {
  is Dog ->
  is Cat ->
  is String -> 
}
👍 17
💯 9
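For comparison, here is a minimal sketch of what the same shape costs with sealed classes today. The `Dog` and `Cat` classes and the `OfDog`/`OfCat`/`OfName` wrapper names are hypothetical, made up for illustration. Because `String` cannot be made a subclass of a sealed hierarchy, each member has to be wrapped, which is the limitation the proposal is trying to remove.

```kotlin
// Hypothetical domain types, not from any library.
class Dog(val name: String)
class Cat(val name: String)

// A sealed hierarchy cannot adopt String (or any class you do not own),
// so every member of the "union" needs its own wrapper.
sealed class Pet {
    data class OfDog(val dog: Dog) : Pet()
    data class OfCat(val cat: Cat) : Pet()
    data class OfName(val name: String) : Pet()
}

// Exhaustive without an else branch, with smart casts in each branch.
fun describe(pet: Pet): String = when (pet) {
    is Pet.OfDog -> "dog ${pet.dog.name}"
    is Pet.OfCat -> "cat ${pet.cat.name}"
    is Pet.OfName -> "unknown pet called ${pet.name}"
}

fun main() {
    println(describe(Pet.OfDog(Dog("Rex"))))
    println(describe(Pet.OfName("Tweety")))
}
```

The `when` is checked for exhaustiveness, but every value pays for a wrapper allocation and callers have to wrap and unwrap by hand.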
r
Can be compiler-checked
Not really, especially when taking any sort of interop into account
s
z
@Ruckus can you elaborate in a meaningful way? I don’t see how it wouldn’t be able to lean on the same kind of smart casting that instance checks do @stojan nothing against arrow, but I’m strictly interested in first party support on this 🙂. I find that syntax a little overly verbose too. It also appears to see adhoc/anonymous definitions as a good thing, which I don’t agree with. Seems partially related to trying to avoid new syntax changes. I also suspect the IDE support doesn’t exist in the same way it would for a first party feature
r
Sorry, I think I misunderstood compiler-checked as compile time enforced. I just meant there will definitely need to be runtime checks.
z
that seems reasonable as an intrinsic, yeah
Roman describes my reservations about anonymous declarations well here https://discuss.kotlinlang.org/t/union-types/77/28
g
I like the idea of restricting anonymous types. But will this work:
sealed typealias Pet = Dog | Cat
sealed typealias Animal = Dog | Cat | Lion
fun Animal.feed()
val pet: Pet = Cat()
pet.feed()
??
If it works as typealias, it should work of course, but curious about instance check
l
That is a really nice proposal, Zac! I'd like to know @elizarov thoughts, and from others in the Kotlin team as well.
u
Sorry, but if they are objects at runtime (and at the JVM level signature) it would be problematic
l
@Uberto Barbini I don't see why. You'd use
when
anyway with smart casts. Java interop would be not ideal, but interop is still there, you can cast in java.
u
I'm not an expert on the JVM, but afaik either you declare the sealed typealias as a type at bytecode level or as just Object/Any. In the first case, ok, you have the problem of how to do the cast to final classes (the sealed subclasses); in the second case it will be indistinguishable from Object unless you have the sources. I mean you cannot use it from another Kotlin jar... Note that Java generics have type information attached to the class: even if the internal code uses it as Object, the JVM can distinguish whether a method wants a List<String> or a List<Integer>
(at least this is at best of my knowledge, happy to be wrong)
e
We'd love to have untagged unions in Kotlin one day, but it is quite hard to integrate them into the type system properly. The original Hindley-Milner type system did not support untagged unions for a reason (only tagged unions aka sealed types are supported). Simply adding them makes the problem of type inference intractable. That's the reason why very few modern languages have untagged unions (most of them are HM derivations of some form). There's been some modern research and even some sound implementations, but not many (AFAIK, TypeScript has an unsound type system and does not attempt to fix it). There are some sketchy ideas of how it can be pulled off in Kotlin, but it requires a lot of scrupulous expert work to design properly. The major concern with the naive implementation is type explosion during inference, which has to be carefully constrained without harming useful cases.
👍 8
l
@Uberto Barbini Extension functions are recognized as such in consuming libraries thanks to kotlin-metadata being provided. The same is already done for
typealias
and everything Kotlin, and this feature proposal would also work in consumer Kotlin projects thanks to that metadata, which would support this as well. I once unintentionally ditched Kotlin metadata from a library publication, and it was recognized as a Java library using only info from the class files, which meant no extensions, no named arguments, no type aliases, etc.
u
mmmh you are right, I forgot about that. 🙂
e
Java interop would suffer too, of course, but we've already crossed that rubicon with suspending functions.
r
@elizarov Does that mean interop is no longer as high of a priority?
e
To give you some food for thought. Assume Kotlin has untagged unions. What should be the inferred type of
listOf(1, "A")
then? Now it is
List<Any>
. But with unions should it become
List<Int | String>
then?
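To make the question concrete, here is a small sketch of how a mixed list has to be handled when it is typed as `List<Any>`, as the message describes. The `describe` helper is a hypothetical name used only for illustration.

```kotlin
// With Any as the element type the compiler cannot know the set of
// possible types, so the when needs an else branch.
fun describe(x: Any): String = when (x) {
    is Int -> "Int $x"
    is String -> "String $x"
    else -> "something else"
}

fun main() {
    // Explicitly typed as List<Any>, matching the type named in the message.
    val mixed: List<Any> = listOf(1, "A")
    mixed.forEach { println(describe(it)) }
}
```

With an inferred `List<Int | String>` the `else` branch could presumably be dropped, at the price of the inference questions raised in this thread.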
@Ruckus Don't get me wrong. Interop is still of a very high priority and consumes a significant fraction of our design efforts. It is just we got used to living with the idea that some narrow features might survive without a great and seamless interop, provided that you can still design APIs in Kotlin for Java consumers and consume Java libraries naturally.
l
@elizarov I've long believed that Kotlin files should have a Kotlin version header in their first line of code (or below copyright maybe), and that it'd allow to deal with incompatible source changes, where you could run a per-file migration that'd, for example, building upon your example, add
List<Any>
as an explicit type for a list of mixed types where inference would now generate an implicit type union. That said, Zac's original proposal would never be implicit as it requires an explicit typealias declaration that becomes the union type.
u
assuming sealed typealias Pet = Dog | Cat | String how would the when work? using reflection? unless you put some hidden Boxing / Unboxing... it starts to seem like the Variant type of .NET
r
@elizarov Okay, that makes sense. I was a bit worried there for a second 🙂
e
@louiscad If inference of unions is allowed to run unconstrained, it will be producing types so big you cannot even display them, let alone read and understand them. And it will be happening in some very real-life scenarios, not in something artificial. That's not something you really want. But you don't want to explicitly write the types all the time, either. You do want type inference. These two desires are hard to combine into a single coherent design. Not impossible, but hard.
👍 2
l
@elizarov If inferred type unions are allowed, I see them being limited in width and depth. Now, if they are always explicit/declared, with or without a name, then I think it'd be more viable, and it'd be up to the developers to avoid getting types too wide for the human mind, pretty much like function names.
@elizarov Can't explicitly declared typealiases be a first step towards built-in type unions support? I think even with inference possibly coming later on, they'd still be useful.
e
The runtime representation is another thing, but it is the second order of complexity (seems easier to solve). The trickiest questions are related to generics:
typealias DogOr<T> = Dog | T
. What you do with
if (dogOrT is T)
condition when you have
DogOr<Dog>
. Even trickier are things like
Either<A, B> = A | B
that FP people would love to have. How would they even work? How'd you even deconstruct it into its left and right parts using
when
or
if
?
👍 1
TL;DR: There are lots and lots of questions to ask and to answer just to coherently design this feature even in its simplest shape, let alone to implement it.
If there are any experts out there willing to help with a concrete proposal on what kind of constraints we should put in place to make it work in some limited shape, we'd love to see that. It would be very appreciated. We don't have a shortage of use-cases for untagged unions. We have tons of them in our own code.
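The `DogOr<Dog>` ambiguity can be approximated in today's Kotlin with a reified type parameter. This is only a sketch under that assumption: `whichSide` is a hypothetical helper standing in for a check against `Dog | T`, not anything from the proposal.

```kotlin
class Dog

// Hypothetical stand-in for matching on `Dog | T`.
// T is reified so that `is T` is allowed at all.
inline fun <reified T> whichSide(value: Any): String = when (value) {
    is Dog -> "Dog branch"
    is T -> "T branch"
    else -> "neither"
}

fun main() {
    println(whichSide<String>("hello")) // T branch
    println(whichSide<String>(Dog()))   // Dog branch
    // When T is Dog, both checks describe the same type, so the
    // T branch can never be reached.
    println(whichSide<Dog>(Dog()))      // Dog branch
}
```

Without `reified`, `is T` does not compile at all because `T` is erased, which is the other half of the problem.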
u
using some kind of Boxing(sealcase, object) the generics would be easy to solve, since you have an actual class...
l
I think it'd be right to disallow making a union type a union of the same resolved type. That'd ensure casts can be used, and that the example with the Dog type you mentioned @elizarov is impossible to have. For cases where such a thing comes from a generic function where the union type is not part of the signature, I see 2 options: either disallow making the union type depend on a type parameter defined in a declaration (function, class or interface) that doesn't expose that union type, or let it take the first matching when branch if, for example, we get
Int | Int
at runtime.
r
We are releasing union types as a plugin this summer. It’s already implemented and it has a minimal inlined runtime, more efficient than sealed classes for many use cases https://github.com/arrow-kt/arrow-meta/blob/e3c602633948dad3594cce9a22035e0d06e3b1a7/compiler-plugin/src/test/kotlin/arrow/meta/plugins/union/UnionTest.kt Runtime: https://github.com/arrow-kt/arrow-meta/blob/e3c6026339/prelude/src/main/kotlin/arrow/union/unions.kt
arrow 5
😍 8
💯 2
metal 1
They are commutative and can widen without nullable casts to their intersected upperbound
They can be implemented more efficiently directly in the compiler if you make them part of the type hierarchy such as A? =!= A | null
That is placing it alongside synthetically between Any? and Any
Not allowing types in infix position makes it hard to read when nested and that is why the IDEA integration simplifies the display
it’s not just mine, many others have participated on this feature alongside others like type refinements, coercion functions etc.
In fact union types is not a feature just a possible encoding of a feature in the underlying proof system. You can see all the proofs that unlock unions in the prelude in meta. They are all replacements of subtype relationship for functions that take you from a type to another.
This will be available as a plugin from Arrow Meta for anyone interested after IR is stable in 1.4
u
this "runtime" works like a transparent Box/Unbox, right?
so Pet = Dog | Cat is transformed in a Union<Dog, Cat>
going back to @louiscad idea of typealias, I gave it a bit of thought. If we super prune the proposal to exclude any support for generics, so List<Pet> won't compile, we can ask ourselves 2 questions: 1. would it still be useful? 2. would it be possible? I think 1 is a yes. I'm still not convinced that 2 is possible without reflection or a hidden type. If 2 is a yes, then we can think about how to tackle generics
z
With all due respect to arrow folks, I feel this is sidetracking the thread as this is a suggestion for the language itself, especially when Roman just said they'd be open to a concrete proposal
3
k
This is interesting. Can you pattern match on this meta
Union
container?
r
@Zac Sweers agreed, the point of showing this code is that this is also gonna become a KEEP. All features of meta around new types. In your original question you ask if there has been a discussion, and not only has there been, but they are already implemented and working, passing the laws of union types; we are just refining the IDE experience to propose exactly what you described above A | B | C
@elizarov If there is ever interest in implementing union types in the compiler I’m happy to help. I know the theory, what needs to change and I’ve done it already respecting the Kotlin subtype hierarchy. Also we understand the limitations of this as a plugin. In a compiler version of it you can also extend it to intersection types as their dual and support both A | B and A & B . The Kotlin compiler already has intersection types but they are not exposed in syntax. @Zac Sweers bottom line, if you want to help bring union types, we have been working on it for several months. Also if you want to push the proposal I’ll be happy to advise you on what we know as limitations and how they could be overcome in the compiler.
I find the typealias approach unambitious and conflicting in semantics in Kotlin, because Kotlin has made the clear case that `typealias` is just an alias, not a new type, which implies it just desugars to what is on the right.
I feel union types can appear in any type position because they are just sitting under Any and above Any? by being just Any | null
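The point that `typealias` introduces only an alias and not a new type can be seen in current Kotlin. The `UserId`, `OrderId` and `loadUser` names below are hypothetical and exist only for this sketch.

```kotlin
// Two aliases for the same underlying type.
typealias UserId = String
typealias OrderId = String

fun loadUser(id: UserId): String = "user:$id"

fun main() {
    val order: OrderId = "42"
    // Compiles without complaint: both aliases desugar to String,
    // so the compiler sees no difference between them.
    println(loadUser(order)) // prints "user:42"
}
```

A `sealed typealias` that behaved as a distinct, checkable type would therefore need rules that plain aliases do not have today.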
e
There's no need for a "formal proposal", but it would be great to see some writeup on your vision of how union types might be integrated into Kotlin's type system, especially into type inference and overload resolution. I'm not worried about runtime representation at all. It does not need to be formal, because we have not released the formal spec for either inference or overload resolution yet (but they are coming)
r
Sounds good. I’ll send some docs this week with the considerations and what we’ve learned
e
The semantics of union types are important, too, especially around their upcast to
Any
or to generics and how that behaves w.r.t. subtypes checking at runtime using
is
operator.
r
right, they have to widen to the upper bound in type bounds; generically they can also be flattened if they contain repeated types, and they are both commutative and associative over the intersection
meaning they are unrelated in semantics to Result and Either
so A|B == B|A ,etc.
so it’s the same notion as nullable types but where the predicate is not just null but an instanceOf and when you run into A | A that is A non nullable
Also A? | B? | C? == A | B | C | null
that is one of the limitations since null has no type marker like Unit
e
There's lots of conflicting desires on what are "nice to have" properties of union types. I'm not interested in formalities, I'm interested in your take on what compromise in this landscape shall be taken.
r
the biggest feature that everyone wants regardless of laws or properties is the ability to declare a return result from a function with multiple exit cases without the cost of a sealed hierarchy in allocations and boxing
union types at the compiler level should for example be able to specialize primitives
Today there is a cost to put a String in a union in Kotlin in terms of allocations
then there is the fancy laws and maths some would expect but basically we want to go from tagged unions with runtime and overhead cost to unions without cost
or the same cost nullable types have
so basically Kotlin is forcing now a style with a runtime penalty over an abstraction that should not have since it can be implemented without sealed classes or anything that implies allocations
I think this is consistent with the lang because kotlin has a history of swallowing some useful data types as syntax: Option = ?
IO = suspend
so this in a way I find especially interesting to users because it hides the Left, Right, Success, Error etc. constructors
when they just want a value of one of these list of types
e
Yes. But, at the same time, there's one rule we cannot violate in Kotlin. It has to support abstraction over those types in intersection. If one can have a function returning a concrete
Dog | Cat
then, in Kotlin, one should be able to abstract over both of them via some
fun <A, B> foo(): A | B
. Now, it begs a question of what syntax one shall use when they have
val x: A | B = ...
to figure out whether
x is A
or
x is B
and what happens when the check for
x is SomeInterface
is performed when both
A
and
B
at run time implement this interface. Should this even be allowed and what should be the effect of this check? TL;DR: The question is not just how it integrates into compile-time type system, but also into a run-time type system (as they are different).
r
To abstract over the intersection the syntax used frequently is:
fun <A: Persistence & Datasource> A.foo(): Unit
meaning A has both capabilities and for the union :
fun <A: Persistence | Datasource> A.foo(): Unit
As for the runtime the runtime always knows the real type because the value is that of class that is known.
Java interop is an entire different story
In a union at runtime there is a single known value of a class that could not have gotten there unless injected with reflection or hackery
meaning all runtime dispatching is the same as nullable types
there is barely any cost for the abstraction
e
That's not what I'm talking about. And I don't care about cost that much at all. Kotlin is all about convenience in writing code. Solving your problem easily. We don't have "no cost" mantra. Here I'm talking about unions and abstraction over types in Kotlin way, not about hackery. With unions I can write
readData(): Data | Failure
That's great and concise, everyone wants it. But I also want
fun retry(block: () -> Data | Failure)
function so that I can write
retry { readData() }
So far so good. But now I want to abstract
retry
over the type of
Data
to
fun <D> retry(block: () -> D | Failure)
. and then maybe over the type of
Failure
too, to
fun <D, F> retry(block: () -> D | F)
. How that's going to work? How's the
retry
function going to be syntactically written?
(again, I do not care that much of how it is going to be implemented under the hood. First, I need it to consistently work from a type-system, semantic, and syntax perspective)
Syntax is first here, of course. That is what programmers work with every day. So, the no.1 question here is how the code for
fun <D, F> retry(block: () -> D | F): D
should ideally look like assuming that it simply retries endlessly on failure?
r
I guess a simplistic version of that is just:
suspend fun <D, F> retry(block: suspend () -> D | F): D =
  when (val df = block()) {
    is D -> df
    is F -> retry(block)
  }
is that what you mean?
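For reference, the snippet above is not valid Kotlin today: besides the `D | F` syntax, `is D` on an erased type parameter does not compile. A rough equivalent using a tagged union might look like the sketch below. The `Outcome` type and its `Success`/`Failure` cases are assumptions made for illustration (the name echoes the `Outcome<ERR, T>` type mentioned later in the thread, but this is not that implementation), and `suspend` is dropped to keep the sketch self-contained.

```kotlin
// Hypothetical tagged union standing in for `D | F`.
sealed class Outcome<out D, out F> {
    data class Success<D>(val value: D) : Outcome<D, Nothing>()
    data class Failure<F>(val error: F) : Outcome<Nothing, F>()
}

// Retries endlessly on failure, as in the question being discussed.
fun <D, F> retry(block: () -> Outcome<D, F>): D {
    while (true) {
        when (val result = block()) {
            is Outcome.Success -> return result.value
            is Outcome.Failure -> Unit // ignore the error and try again
        }
    }
}

fun main() {
    var attempts = 0
    val data = retry<String, String> {
        attempts++
        if (attempts < 3) Outcome.Failure("timeout") else Outcome.Success("payload")
    }
    println("$data after $attempts attempts") // prints "payload after 3 attempts"
}
```

Here the wrapper classes carry the tag at runtime, so the `is` checks never depend on the erased `D` and `F`. That is exactly the information an untagged `D | F` would not have.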
e
Yes. Now, what if we refactor this code and add
val df: Any
type. Would
is D
and
is F
check still work and should it be allowed? Now how
df is SomeInterface
is going to conceptually work when, at runtime, both
D
and
F
implement this interface?
Note, that in Kotlin today
is T
performs a runtime type check. What I see in this code that union types in your example will (should?) allow for
is
to perform a compile-time-aided (for a lack of a better word) type check. Thus interaction between compile-time and run-time type systems becomes important.
Would it be clear or confusing for developers to have an
is
operator that does slightly different thing in different contexts? Should we have two different operators or can we design it so that there are no error-prone ambiguities?
r
is D
or
is F
can be allowed if
df : Any
if you add an else clause to become exhaustive and that would work in the same way as it works today if types were reified in this case. But in the case of the compiler knowing the match is exhaustive it can be turned into an int indexed table-switch
The reason that would work is because all types that extend A? are also A | null. We need the else because we ascribed Any explicitly and that is an upper bound of the union but it does not cover the null case
or other cases but still Union : Any
The is operator can also capture types like
is (A | B)
since Union would be a synthetic type like
A?
is used for nullability to represent Nullable A
all discussed may apply to intersection too.
A & B
The type inference rules and laws for these kinds of unions are described here https://dotty.epfl.ch/docs/reference/new-types/union-types-spec.html#type-inference
u
What I find confusing in these example is the semantic : A | B is it a type (even if anonymous) or a typealias? So Pet = Cat | Dog and MyPet = Cat | Dog are the same type? If we are happy they are different types everything is easy to understand. If they are the same type I can imagine tons of strange combinations that would make my choice hard to understand.
r
if they are type aliases then they are the same type
kotlin has no notion of newtype or abstract dependent type you can create without defining a class, interface or object
also these are the same type:
typealias Pet = Cat | Dog
typealias MyPet = Dog | Cat
and these are also the same type:
Pet? == Cat | Dog | null
MyPet? == Dog | null | Cat
And this is a type error:
Dog | Dog
because is the same as just
Dog
If you had animal in this hierarchy this is ok
val animal: Animal = myPet // Cat | Dog
Because Animal is in bounds before Any for both Cat and Dog
same for
Animal?
if it was `Cat | Dog | null`
there is no reason why typealiases should play a role in union types in my opinion
we are mixing newtyping with union types
union types can be aliased when you want a clear name. @Uberto Barbini is this related because in Haskell you use
|
to separate what in kotlin we do as sealed class hierarchies?
what I mean is that these don’t have the same semantics:
data Bool = False | True

typealias Bool = False | True
| in haskell separates constructors but in Kotlin just means choice of types over a single value
u
Exactly as you said @raulraja typealias have a completely different semantic from "reified" union types. I'm confused what is the discussion here about.
it started with a typealias proposal, which is fine but I consider it a kind of "inlined" type that can be useful on some occasions but not really something I would like to see creeping all over my code in generics etc.
readData(): Data | Failure
As a dev that uses Outcome<ERR, T> everywhere in a big codebase 8 hours a day for years... I don't see a big advantage here using typealias. Not sure why the last example
fun <D, F> retry(block: () -> D | F)
would be preferable over a real type Outcome, and I can think of several reasons why having a specific type would be better. My motivation for union types would be more about having stuff like:
parseJsonValue(): Int | String | Date | Node
Where having sealed classes just to throw them away seems unnecessary and a typealias would do ok.
r
Same here
l
@raulraja For the sake of readability, can you avoid putting many messages next time, making a few big messages instead? Right now, I have your name in bold all over the place breaking the reading flow, and making your writing seem overwhelming.
👍 3
r
sure thing, I can see that it’s annoying. Also looks to me like Slack could group messages like that too and show the face once.
l
Maybe their support would consider such a feature request, yes. In the meantime, maybe you can merge the messages that can using the edit feature? I think it'd be nice for your readers 🙂
r
There does not seem to be an automated way to do that so for this time I send you all my most sincere apologies 🙏. I’m passionate about this because of all the time I put into it and did not realise I was spamming
z
I feel like right-hand-side literals like
true
or
null
weren’t in this proposal and sort of confuse the discussion here. I’d suggest dropping that for now
r
@Zac Sweers Can they be ignored given the Kotlin hierarchy already has
Any?
z
I find it would be simpler to just denote nullability on the type if the instance could be null. Sort of how you wouldn’t type a generic as nullable. There’s precedent for this with generics, you wouldn’t write this
fun <T?> foo(): T where T : Serializable
or if you did an anonymous one (though I don’t personally think these should be in scope for this proposal)
fun foo(): (Dog | Cat)?
r
Since this is already valid kotlin how many possible values are in Foo?
typealias Foo = String?

sealed typealias Foo = String? | Int?
z
sealed typealias
could have different rules, same way
fun interface
or
inline class
have different rules from regular interfaces or classes
r
you mean disallow nullables in types?
z
in the RHS definition, yeah. Same way it’s disallowed on the type variable in a generic function. It can be declared as nullable at the use site instead
sealed typealias Foo = String | Int

fun returnsNullableFoo(): Foo?
r
What happens in cases where you have to infer from the returns of many functions that each return a union of a different shape? Would inference in that case flatten all types and just add nullable? What would we expect `z` to return as its inferred type in this case? If the expected result is
(A | B | C)?
then the typealias restriction you propose would work.
fun a(): A | B | C
fun b(): (A | B)?
fun c(): C
fun z() = if (predicate) a() else b() ?: c()
u
A | B | null
seems much better than
( A | B )?
to me
r
@Uberto Barbini that is the reason why in the plugin we present it like that with masks. We found infix | imposes parentheses that make the code less readable unless it’s all flattened
👍 1
z
Another thing I thought about - Java even has sort of a syntax for this via merged exception catch blocks
catch (ExceptionA | ExceptionB | ExceptionC e) {

}
Supporting this could facilitate supporting multi-catch blocks too: https://youtrack.jetbrains.com/issue/KT-7128
👍 3
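For context, the usual workaround in current Kotlin, which has no multi-catch syntax, is to catch a common supertype and branch on the concrete type. The `firstNumber` function is a hypothetical example, not taken from the linked issue.

```kotlin
// Returns the first element parsed as an Int, or null when the list is
// empty or the element is not a number.
fun firstNumber(items: List<String>): Int? =
    try {
        items[0].toInt()
    } catch (e: Exception) {
        when (e) {
            // The two cases a multi-catch would name directly.
            is IndexOutOfBoundsException, is NumberFormatException -> null
            // Anything else is not ours to handle, so rethrow it.
            else -> throw e
        }
    }

fun main() {
    println(firstNumber(listOf("42")))   // 42
    println(firstNumber(listOf("oops"))) // null
    println(firstNumber(emptyList()))    // null
}
```

The `else -> throw e` branch is the part that is easy to forget, and it is what a union-typed catch parameter would make unnecessary.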
Thinking more on this - if anonymous expressions were allowed a la
|
, then you don’t need to denote typealias or anything. It also becomes easy to write
typealias Pet = Dog | Cat

fun pet(input: Dog | Cat) {

}

fun getPet(): Dog | Cat {

}

val Shelter.getPet: Dog | Cat get() = ...

fun (Cat | Dog).pet()
It gets harder to read with higher order functions, but that’s an existing cost anyway. A complex example
fun (Cat | Dog).pet(chooser: (List<(Cat | Dog)>) -> (Cat | Dog)) {
  // do stuff
}
r
I believe some of these are unnecessary since if there is a real token in the grammar for infix types the compiler should be able to parse this correctly:
(List<Cat | Dog>) -> Cat | Dog
As these would be equivalent to a desugared version such as:
(List<Union2<Cat, Dog>>) -> Union2<Cat, Dog>
e
Indeed, making it work in a parser is not at all a problem here. You would not need all those braces inside
<...>
but whether you need to put braces to the right hand-side of
->
could be an interesting (albeit minor) design issue. The major design problem here is how union types would interact with Kotlin type system and generics.
z
wouldn’t the type in this case be treated as
Any
until type checked?
l
I'd even say
Any?
r
A proper compiler support for unions implies union is a member in the type hierarchy for all other things to work in sub-typing so if Kotlin had unions they would have to be on top of the hierarchy below Any and having
Any? == Any | null
and
Any | Any == Any
etc, unions and intersections can’t be implemented any other way in the current Kotlin type system without breaking generic type constraints in generic functions, which are based on the dual of the union, the intersection. I think a good baseline for what unions could look like in Kotlin is the Scala dotty impl and spec, since Dotty and Scala are also subtype based and Kotlin does not need to account for path dependent types or other esoteric features. https://dotty.epfl.ch/docs/reference/new-types/union-types.html https://dotty.epfl.ch/docs/reference/new-types/intersection-types.html
@Zac Sweers
wouldn’t the type in this case be treated as 
Any
 until type checked?
The super type of a union type is where their hierarchy intersect. Not sure if I understood your question but in the case of something like
Dog | Cat
the compiler knows the bound is
Animal
in the same way as sealed classes. Union types untag sealed hierarchies but the inference rules are the same since they are still constrained by subtyping. @elizarov Regarding your initial question about generic constraints in functions, they should naturally work in bounds if the rules regarding subtyping and making them part of the hierarchy are respected. For example in this case:
fun <A: Animal, B: Plant> foo(creature : A | B)
Both A and B are constrained, used alongside a Union and don’t necessarily need a common parent. If they had
Creature
as parent the Api of
Creature
is available, if they don’t their upperbound is still
Any
in all cases. Was that your concern about mixing subtyping, generic constraints and unions or did I misunderstand? Thanks.
z
yeah the point was to treat them the way generics are - compiler gives you safety but at runtime they’re erased
👍 1
l
Wouldn't union types be helpful for HMPP? Could it help for types that are aliased to different ones based on the target, like the infamous
NSInteger
that is
Int
on watchOS and iosArm32, and
Long
otherwise?