I see myself using `data class` a lot - but mostly...
# language-proposals
m
I see myself using `data class` a lot - but mostly for automatically generating `toString()` and `equals()`, sometimes `hashCode()`, rarely `copy()`. I never care about `componentX()` for destructuring but noticed that their existence can be confusing to people (and me changing the order of properties breaks their code).
The general idea of having less boilerplate for simple repetitive tasks / common idioms is good. It would be awesome if developers could express in more detail which auto-generated functionality is actually desired and which is not. This also makes the actual capabilities and the intended usage of a class clearer than using a `data class` that adds functionality which may not always make sense.
I know that I may be misusing data classes here and I guess I'm not the only one using them to avoid manual and repetitive implementation of `toString()`, `equals()` and `hashCode()`. In addition to having less boilerplate the compiler can generate well-optimized implementations which are even more work & code when done manually (think for example of a good `hashCode()` implementation over multiple properties without boxing or platform dependency à la Java's `Objects.hash(…)`).
I like Swift's approach where a developer can for example state that a `struct` is `Equatable` or even `Hashable` and if all properties are too then the implementation is auto-generated by the compiler. Kotlin could do the same to make adding these methods simpler. E.g. if I add `: AutoEquatable` to a class the compiler could auto-generate `equals()` for me over all properties - just like a data class and potentially using the same constraints as for these. Same for `AutoHashable` -> `hashCode()`, `AutoStringConvertible` (or alike) -> `toString()`, `AutoCopyable` -> `copy()` and `AutoDestructurable` -> `componentX()`. A `data class` is automatically `AutoEquatable`, `AutoCopyable`, `AutoDestructurable`, `AutoHashable` and `AutoStringConvertible`.
@gildor suggested to use annotations instead of interfaces, which is also a good option. There would be no need to change the language itself. The Standard Library would merely add a few interfaces and the compiler would use them for auto-generating some code.
Another alternative could be to use data classes but opt out of some of their functionality, e.g. `data(nocopy, nocomponents) class …`. That would require syntax and language changes though.
What do you think? 🤔
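For readers who want to see the boilerplate in question, here is a minimal sketch of what has to be written by hand today to get value equality and a readable `toString()` without also getting `copy()` and `componentN()`. The `Money` class is a made-up example, not something from the discussion.

```kotlin
// Hypothetical example class: value equality and a readable toString(),
// but no copy() and no componentN() functions.
class Money(val currency: String, val cents: Long) {

    override fun equals(other: Any?): Boolean {
        if (this === other) return true
        if (other !is Money) return false
        return currency == other.currency && cents == other.cents
    }

    // Combines the properties with the common 31-multiplier scheme.
    // Unlike java.util.Objects.hash(...), no varargs array is allocated.
    override fun hashCode(): Int {
        var result = currency.hashCode()
        result = 31 * result + cents.hashCode()
        return result
    }

    override fun toString(): String = "Money(currency=$currency, cents=$cents)"
}
```

Every property added to such a class has to be reflected in all three functions by hand, which is the repetitive part the proposal wants the compiler to take over.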
o
i
This sounds like it would add complexity without adding a significant language improvement. A natural part of a new language is learning about the new features it offers, and these fall in that category imo. If you use a library written in Kotlin these are just some of the things that come with it and sure, maybe you have to learn a little there, but that's a pretty low bar imo.
m
@Oleg Yukhnevich interesting, thanks! Not for Kotlin though 😕
o
I'm saying that if you need it, you can just write an annotation processor or, once IR is public, a compiler plugin
m
@ian.shaun.thomas complexity for whom? In the compiler most parts should already be there for data classes. Developers can but don't have to use it. And since data classes need to be explained anyway this could just be part of it - a specific case of automatic function generation. It's not that there is something entirely new here. It would make more explicit and intentional what is already there. It sure would add a lot of improvement for me as I could stop misusing data classes.
It could be a compiler plugin, that's true. It would make your code dependent on it though for already-existing functionality. Plus you'd have to add another library for the interfaces. And these interfaces reflect core functionality of the language - hashable, equatable, etc.
i
because this would require changes to the compiler, it pushes more knowledge requirements onto the developer 'remember all the options', and lastly in most situations the availability of things you never use is far better than the unavailability of things you need. For instance, as a consumer of a library, it sucks when some model only implements equality but doesn't implement the hashCode or only implements copy but not equality because that's all the library needed and then leaves me stuck with the trade off of dealing with that.
m
If my class implements `equals` and not `hashCode` then there must be a reason for it. Just because you can compare two objects doesn't mean that they have a meaningful identity and that you can generate any meaningful hash code.
Same for copy. Just because you can copy something doesn't imply that it makes sense to compare two objects by value.
It depends on the use case / meaning. And that is only known by the developer, never by the compiler.
k
@ian.shaun.thomas that is why KEEP 87 is important, extending types with features is something we should consider so you can request the compiler to generate a hashCode implementation for the library specific data class
k
This is exactly my opinion about data classes: I want `equals` and `hashCode`, rarely the `componentX` functions and even more rarely the `copy` function. There's some discussion from a couple of days ago here: https://kotlinlang.slack.com/archives/C0922A726/p1559163411143400?thread_ts=1559161518.140900&cid=C0922A726
The `copy` function is not only useless sometimes, it is actively harmful: it disallows private constructors for data classes.
i
do you have an example of when the lack of a private constructor support is harmful for a data class?
m
I agree. It's allowed though but there's a warning. But it shows that we don't use data classes for simply holding dumb data but that often we use it in a little more complex scenarios which seem to require a private constructor, a specialized copy or other functionality somehow. That's not pure data holding anymore.
k
@ian.shaun.thomas Yes, sometimes I want a class with a couple of properties that can't be freely constructed but it should have an equals implementation. It's not harmful, you could remove private constructors entirely in the language and everything would still work too.
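A rough sketch of the kind of class described here: a couple of properties, construction only through a validating factory, and value equality. Since a `data class` would also expose a generated `copy()`, this version uses a plain class with hand-written members. `Email` and `parse` are hypothetical names chosen for illustration.

```kotlin
// Hypothetical example: instances can only be created through the validating factory.
class Email private constructor(val value: String) {

    override fun equals(other: Any?): Boolean =
        other is Email && value == other.value

    override fun hashCode(): Int = value.hashCode()

    override fun toString(): String = "Email($value)"

    companion object {
        // Returns null instead of constructing an invalid instance.
        fun parse(raw: String): Email? {
            val trimmed = raw.trim()
            return if ('@' in trimmed) Email(trimmed.lowercase()) else null
        }
    }
}
```

The check for `'@'` is deliberately simplistic; the point is only that no caller can create or clone an `Email` without going through `parse`.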
g
What do you mean by `equals(), sometimes hashCode()`? Shouldn't they always be implemented together to follow the contract of hashCode?
➕ 5
k
I hope that's a typo 🙂
g
I like the idea of a more flexible approach for this, but I think that using interfaces is not a good idea, especially interfaces without methods (equals, hashCode and toString are already in Any). I believe one annotation with properties and a compiler plugin would be enough
➕ 2
m
@gildor in most non-library projects I don't bother implementing `hashCode` unless it makes sense to have an object as a map key or in a set or alike. I handle that more like Swift does (Hashable implies equatable but not the other way round). I always found it unfortunate that everything in Java is both equatable and hashable even if there is no meaningful way to do so. I wish there would be proper interfaces for that. In most classes hashCode is provided though because I use data classes for the sake of being lazy with equals implementations 😛
😱 4
@gildor for `equals()`, `hashCode()` and `toString()` they can be stated in the interface. It's impossible for `copy()` and `componentX()` though 😕 What would you suggest? Annotations? I don't think that this is worth additional keywords.
k
Definitely not interfaces, I think we learned that with `java.lang.Cloneable`. I'd be happy with annotations either like `@Derive("equals")` or `@GenerateEquals` (both implying `hashCode` too).
➕ 2
And then `@Derive("component")` and `@Derive("copy")`.
m
That's funny. Objective-C had the same contract between `isEqual` and `hash`. With Swift that was changed to `Hashable` being an explicit trait rather than being implicit. I don't know if that's possible with Kotlin though, looking at backward-compatibility 😕
g
Yes, annotations, like with @Parcelize or @Serializable
m
@karelpeeters I'd suggest distinct annotations then because otherwise you have to remember the right string value rather than using autocompletion.
k
It's definitely cleaner to have a distinction between equality and hashability but it's way too late for that in Kotlin and wasn't possible in the first place looking at Java interop.
@Marc Knaup It would autocomplete the string of course 🙂. I don't have a strong opinion either way.
👍 1
m
Interop shouldn't be the problem. Swift also has ObjC interop and there was no outcry about the contract being broken in some cases 🙂
But agree regarding backward-compatibility. Kotlin is where it is now and removing the implicit `Hashable` is probably a lot of work for everyone.
g
Using strings as params is definitely too error prone; it could be an annotation with an array of enum values, or just separate annotations (or both)
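To make the enum-based variant concrete, here is a minimal sketch. `Derive` and `Feature` are hypothetical names; nothing like this exists in the Kotlin standard library, and without a compiler plugin or annotation processor that understands the annotation, no code is generated.

```kotlin
// Hypothetical names; nothing like this exists in the Kotlin standard library.
enum class Feature { EQUALS, HASH_CODE, TO_STRING, COPY, COMPONENTS }

@Target(AnnotationTarget.CLASS)
@Retention(AnnotationRetention.RUNTIME)
annotation class Derive(vararg val features: Feature)

// Declares the intent only. Without a compiler plugin or annotation processor
// that understands @Derive, no functions are generated for this class.
@Derive(Feature.EQUALS, Feature.HASH_CODE)
class Point(val x: Int, val y: Int)

// Reads the requested features back via Java reflection,
// the way a runtime tool could inspect the declaration.
fun requestedFeatures(type: Class<*>): List<Feature> =
    type.getAnnotation(Derive::class.java)?.features?.toList() ?: emptyList()
```

Compared with `@Derive("equals")`, a misspelled feature here is a compile error rather than a silently ignored string, and the IDE can autocomplete the enum entries.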
m
For years I had my own `Hashable` and `Equatable` interfaces and added them to all classes where it made sense, to tell class users about the intended use of the class - whether it makes sense to equate/hash them or not 😡
Right now you can only guess for most classes what the result of comparing them is. Is it default comparison by identity? Is it by value? What properties? What happens if I put them in a `Set`? I don't like the current state of that 😕
g
Not following the hashCode/equals contract is a really bad thing; I'm not sure why you think it would be somehow beneficial. How do you know whether this class will be used in a Map or not? Or in any other code which may require a hash
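A small sketch of the failure mode being described, using two made-up classes. The first overrides only `equals()`, so two equal instances will usually have different identity-based hash codes and a hash-based collection will typically not find one when probed with the other.

```kotlin
// Overrides equals() only; hashCode() is still the identity-based default,
// so equal instances usually end up in different hash buckets.
class EqualsOnly(val id: Int) {
    override fun equals(other: Any?): Boolean = other is EqualsOnly && id == other.id
}

// Overrides both, so equal instances always share a hash code.
class EqualsAndHash(val id: Int) {
    override fun equals(other: Any?): Boolean = other is EqualsAndHash && id == other.id
    override fun hashCode(): Int = id
}

// Stores one object in a HashSet and looks it up with another.
fun foundInHashSet(stored: Any, probe: Any): Boolean = hashSetOf(stored).contains(probe)
```

`foundInHashSet(EqualsAndHash(1), EqualsAndHash(1))` is reliably `true`. The same call with `EqualsOnly` is not guaranteed either way, which is exactly the problem: the author of the class cannot know whether a caller will ever put it into a `Map` or `Set`.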
👍 2
m
Assuming that no class behaves correctly and marking the ones that actually do with `Equatable`/`Hashable` (or documenting it) seems safer to me than relying on everyone following that contract, which likely isn't the case. Some classes may be very expensive to compare and hashing them is bad. You can still wrap them in your own hashable wrapper. It doesn't make much sense to me to spend much time on implementing a function for classes where it is most likely never used - spare some programmer errors. Again, libraries, esp. public ones, are different because you don't know the environment they're being used in. In non-library projects it's just a convention to not rely on it except for where it's needed - in which case you spend a little extra time on implementing it or researching whether it's implemented/supported.
Btw, this is an interesting topic but a different one. I suggest getting back to autogenerated common functions 🙂
r
@Marc Knaup All that is what KEEP-87 proposes https://github.com/Kotlin/KEEP/pull/87
👍 2
you are describing type classes
A bridge interface defines functions that are parametric over a type. The type implements those via structural discovery or a bridge contract, and then the syntax is projected over the data type wherever it's needed
k
Except the actually generating the function automatically part, right?
r
no, that is a feature of type classes too
you can have type class derivation based on structural types
for example if a type has a method like `map` it can also get `lift` and others for free
it's the same deal here: if a class is a product of arity N with these types, you get a copy for free
Type classes project a family of extensions over a type without using inheritance but they can be activated automatically which is usually called derivation or manually if a user provides an impl
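For readers unfamiliar with the term, the type class idea can be approximated in today's Kotlin with a plain interface and explicitly passed instances. This is a hand-rolled sketch, not KEEP-87 syntax, and `Eq`, `User` and the instance names are made up for illustration.

```kotlin
// Hand-rolled sketch of the type class idea; this is not KEEP-87 syntax.
interface Eq<A> {
    fun eqv(a: A, b: A): Boolean
}

class User(val name: String, val email: String)

// Two instances giving different meanings of equality for the same type,
// without touching the declaration of User.
val structuralUserEq: Eq<User> = object : Eq<User> {
    override fun eqv(a: User, b: User): Boolean = a.name == b.name && a.email == b.email
}

val emailOnlyUserEq: Eq<User> = object : Eq<User> {
    override fun eqv(a: User, b: User): Boolean = a.email.equals(b.email, ignoreCase = true)
}

// Derivation: an Eq for List<A> is obtained from any Eq<A>.
fun <A> listEq(element: Eq<A>): Eq<List<A>> = object : Eq<List<A>> {
    override fun eqv(a: List<A>, b: List<A>): Boolean =
        a.size == b.size && a.indices.all { element.eqv(a[it], b[it]) }
}
```

The difference from the proposal in this thread is visible here: the instances live outside the class and are chosen by the caller, whereas `AutoEquatable` would make the compiler generate the `equals()` member itself.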
m
I've seen that. This proposal is more about the compiler-based auto-generation though and ideally not dependent on KEEP-87. As far as I know typeclasses don't require altering the declaration of the extended type, do they?
r
they do if they are implemented structurally
for example `List.map` is compatible with the main method that can derive the syntax for `Functor`, so you can match structurally and then inject in members or as extension functions, whatever is more convenient for usage
m
In the case for this idea the type needs to be annotated in some way that triggers the compiler to auto-generate some functionality. Does KEEP-87 propose any syntax which would be suitable for that?
r
then List all of a sudden gets all other functions besides `map` for free
@Marc Knaup we discussed auto derivation, which is the part you describe, but we have not implemented it because JetBrains has not yet said whether they like it as proposed or whether it's going to go in a different direction. Andrey commented on Reddit that he liked type classes, though he didn't know if they'd end up as proposed
m
In that case interfaces or annotations need to do for now but could be revisited later if KEEP-87 provides a better alternative, right?
r
yes, my point was that if you want auto derivation you may want to propose it as a companion to KEEP-87 because regardless of the proposal getting accepted if type classes land in Kotlin in one shape or another it will definitely be coupled or tightly affect the feature you are proposing
m
But how are they related to that KEEP except for having auto-generated code, which isn't much? You can't derive these functions from just looking at other methods or functions of the type. It has to be explicit.
r
Without type classes your `equals` etc. implementations are hardcoded
or can they be customized by users that want to define what equality means for a given type?
m
`equals` is always hardcoded as it's a method overridden from `Object`/`Any` 🤔
r
equals is not always based on structural equality
I thought you would provide the implementations of equals in `AutoEquatable`, but if it's a marker interface and you are just injecting methods then it's unrelated to type classes
m
No, you can't provide the implementation in the interface. Just like for a `data class`, an `equals` will be generated based on all stored properties.
r
because type classes can do that based on inductive derivation of product types like data classes automatically but in addition allow the user to override the meaning of the method
In your case you are saying AutoEquatable will project equals based on structure over a class automatically? But it can't be a data class since that already has equals synthetically added by the compiler
m
`data class` would just be syntactic sugar for a class which is `AutoEquatable`, `AutoCopyable`, etc. because that's all it is.
d
I think this is complexifying data classes far too much. The only problem is `copy`. The visibility of `copy` should be the same as the constructor it calls. I don't think anyone is harmed by the generation of componentX functions, contract-compliant hashCode/equals functions and toString
☝️ 1
Even if you dont need it...
m
@Dico why would it make anything more complex? It doesn't change anything for data classes.
I don't want `copy` or `componentX`/destructuring just to make a class equatable. Or maybe not even a custom `toString()`. Or have `equals` be implemented differently just because I get my `toString()` generated by the compiler.
d
The addition of 4 interfaces makes the whole thing more complex
m
It adds a new functionality (auto-generating implementation). It doesn't touch data classes.
d
Data classes are final for a reason, should all classes that implement AutoEquatable be final?
m
Maybe yes, maybe no. There are likely multiple options here. Data classes are likely final because of `copy`, not because of `equals`.
d
If you need something specific and explicitly dont want the other features, use alt+insert to generate what you need.
Well, there is a compiler option to make them non final I think
Or to allow it rather
With equals you have to ask then, should it accept subclasses?
m
Using an IDE-based generation is again a lot of repetitive work, is IDE-dependent, adds code and makes it less readable and you must reflect changes to your properties in the generated code which can also be forgotten.
d
How do you do that, add another annotation
r
data classes would have issues with `equals` if they are not final unless codegen accounts for all super properties.
not with equals but with anything really that is an ad-hoc contract based on property structure
d
My opinion is, there is too little benefit to doing this for the cost of complexity.
m
@raulraja data classes have the same problem with properties of supertypes already
d
The only thing that should be changed is visibility of copy because it can violate encapsulation
r
which ones?
m
@raulraja whether or not to include a supertype's properties in `equals`.
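The existing behaviour referred to here can be shown in a few lines: the generated members of a data class only consider properties declared in its primary constructor, so state inherited from a supertype is ignored. `Tagged` and `Item` are made-up names for this sketch.

```kotlin
// Supertype with its own state.
open class Tagged {
    var tag: String = ""
}

// The generated equals()/hashCode()/toString() only look at `id`.
data class Item(val id: Int) : Tagged()

// Returns true: the differing inherited `tag` values do not affect equality.
fun sameDespiteDifferentTags(): Boolean {
    val a = Item(1).apply { tag = "a" }
    val b = Item(1).apply { tag = "b" }
    return a == b
}
```

Any opt-in generation for non-data classes would have to pick a rule for this case too, whether that is the same rule or a different one.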
The potential benefit in my case is huge. I have an enormous amount of data classes which are just like that to get either `equals` and/or `toString`. That doesn't mean I want the other one. Or copy. Or destructuring. It just messes with the API for the user of the data class.
d
Then don't use data classes
m
And it's mostly simple classes. It shouldn't be necessary to manually add three method implementations to each single class.
"Then don't use data classes" - that's why I raise this idea…
d
Data class is not intended to work for all use cases
m
`class`: You get nothing. `data class`: You get these 5+ methods. Interface/annotations: You can choose between those methods.
d
Its intended to be simple and speed up coding, reduce boilerplate
Yeah, well, then you generate 1 or 2 manually instead of 5
Data class covers all 5 and assumes you'll use some of those functions
m
Less unused methods, clearer API & intent.
d
I dont think less unused functions matters
m
If a data class adds destructuring then users of the class will assume it's meant for destructuring. Yet their code can break because it wasn't.
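A minimal sketch of that breakage, with two made-up classes standing in for the same class before and after its properties were reordered. Destructuring is positional, so callers keep compiling and silently get the values swapped.

```kotlin
// Original declaration.
data class Person(val firstName: String, val lastName: String)

// The "same" class after its author swapped the property order.
data class PersonReordered(val lastName: String, val firstName: String)

// Caller code written against the original order.
fun greetingBefore(): String {
    val (first, last) = Person(firstName = "Ada", lastName = "Lovelace")
    return "Hello $first $last"
}

// Identical caller code against the reordered class: component1() is now lastName.
fun greetingAfter(): String {
    val (first, last) = PersonReordered(firstName = "Ada", lastName = "Lovelace")
    return "Hello $first $last"
}
```

`greetingBefore()` returns `"Hello Ada Lovelace"`, while `greetingAfter()` returns `"Hello Lovelace Ada"` with no compiler error, because both properties have the same type.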
d
The clearer api & intent is a potential pain point
Again, the benefits dont outweigh the cost imo
m
Same for copy. It may or may not make sense to have it copyable. Only the developer will know whether it's a sensible thing to have.
What is the cost?
d
The cost is adding a bunch of interfaces that people will need to be able to know and understand when they are used
This doesn't really add anything important
Like a date & time api, which adds new functionality in the stdlib
There's more benefit there, theres a clear need for that
In comparison, this change seems not worth making is all I'm saying
m
Understanding the meaning of `AutoEquatable`, `AutoToString` or however they're called as interfaces or annotations shouldn't be difficult to learn and can even be derived from their name (you learn them once per language). Having to learn NOT to use some methods/destructuring of a class just because they happen to be part of a data class is much more learning (1+ per data class). Having to manually implement these is a lot of work (1+ per no-longer-data class). Having them generated by the IDE has its own downsides (see above).
➕ 1
d
No, theres no effort to not using functions you dont want to use.
m
The problem is not "not wanting" to use them. The problem is that people see that a class is copyable or destructurable and think it's intended.
It also increases the size of libraries for methods which are never [supposed to be] used.
d
Yeah and my opinion is that if that's a problem, you can use the ide to generate those functions. It's a compromise for not having all of the features you personally need in the language. I've stated my opinion, you can choose to ignore it, I will not engage on this topic further.
t
Just to throw in some prior art: Groovy has annotations that generate equals, hashCode, toString, etc. These annotations also provide options to customise them. Groovy has support for AST transformations which makes these possible. In Kotlin that would probably mean writing a compiler plugin (or of course be part of Kotlin itself). See http://docs.groovy-lang.org/2.4.7/html/gapi/groovy/transform/Immutable.html and http://docs.groovy-lang.org/2.4.7/html/gapi/groovy/transform/Canonical.html and its associated annotations.
As for the need for this feature I can see the point: The extra methods for the data class are not just unused functions but imply how the class should be used: as data that can be controlled, copied and extracted outside the class. If you want to hide the internals of a class a data class is not the way to go: the copy, component and even toString leak the internal structure. Your proposal would fill this gap of classes that need some of the data class methods but do not entirely fit the data class usage pattern.
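The leak through `toString()` is easy to reproduce. In this sketch, `Credentials` and `logLine` are hypothetical; the class is a data class only to get `equals()`, yet the generated `toString()` prints every primary-constructor property.

```kotlin
// Made a data class only for equals()/hashCode(), but it also gets
// toString(), copy() and componentN() that expose the password.
data class Credentials(val user: String, val password: String)

// An innocent-looking log statement that interpolates the object.
fun logLine(c: Credentials): String = "login attempt: $c"
```

`logLine(Credentials("alice", "s3cret"))` yields `login attempt: Credentials(user=alice, password=s3cret)`, so anything that logs or prints the object exposes its internals unless `toString()` is overridden by hand.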
👍 1