It seems kind of concerning that everything in the...
# announcements
n
It seems kind of concerning that everything in the standard library that counts things, returns integers, accepts an index, etc, always seems to use `Int`. This seems like a pretty odd choice. Is there any way to instead have the standard library use a 64 bit integer?
c
The vast majority of use cases will not go outside the bounds of int, so it would be more strange to use long.
n
uhm why would it be more strange?
The vast vast majority of Java is running on 64 bit systems. There's likely no performance penalty to returning a Long instead of an Int from a function. Even if you only need 64 bits ten percent of the time, what's the benefit to defaulting to 32 bit?
e
would break Java compatibility (which can't create arrays or strings or collections larger than Int.MAX_VALUE - epsilon anyway)
n
I guess that's fair for an index
and containers that size are especially rare
but something like `count`?
j
Isn't the JVM still based around a 32bit stack? Maybe it's updated 🤷
n
Err I don't think so. They have builds for both, generally, but I'm pretty sure 64 bit builds of the JVM are by far the majority now
e
even on 64-bit architectures, JVM uses 32-bit "compressed ordinary object pointers" for heap sizes up to about 32GB I believe
n
sure, for the pointers that's beneficial because you have a huge number of pointers, and that lets you use less memory generally
e
and Jon is correct, the operand stack is natively 32-bit regardless
n
wow, still? That's incredibly unfortunate
you have far higher performance languages defaulting to 64 bit integers generally, so it's unfortunate to see this
IME 32 bit overflows are not all that rare, and sometimes a huge pain to debug
e
the size of Int is explicitly 32-bit in the Java Virtual Machine Specification
n
Sure, I figured that from how it's specified, but not using e.g. `Long` for the return type of `count`, for example, is a shame
Quite possibly the right decision given Kotlin's Java/JVM related constraints, but in the grand scheme of things, a shame
e
how would `.count()` return a larger value than the possible size of a collection?
c
Can always add your own extension if you feel it’s necessary.
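A minimal sketch of what such an extension could look like. The name `countLong` is hypothetical and not part of the Kotlin standard library; it only illustrates the suggestion above.

```kotlin
// Hypothetical extension (not in the stdlib): counts matching elements into a Long.
fun <T> Iterable<T>.countLong(predicate: (T) -> Boolean): Long {
    var total = 0L
    for (element in this) {
        if (predicate(element)) total++
    }
    return total
}

fun main() {
    val evens = listOf(1, 2, 3, 4).countLong { it % 2 == 0 }
    println(evens)  // 2
}
```

The result is a `Long`, so later arithmetic on it stays 64-bit without an explicit conversion.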
n
you mean I guess "why" not "how"
because the how is pretty easy
The point is not really whether count itself is likely to give results bigger than the size of a collection, the point is once you start doing arithmetic with the results of count, everything defaults to the type that count already has. It doesn't seem like Kotlin does any promotion like Int * Int -> Long
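To make the concern concrete, here is a small sketch. The helper names `pairCountInt` and `pairCountLong` are made up for illustration; the point is only that `Int * Int` stays `Int` and wraps unless one operand is widened first.

```kotlin
// Int * Int is typed Int, so the product wraps silently on overflow.
fun pairCountInt(n: Int): Int = n * n

// Widening one operand first makes the whole multiplication 64-bit.
fun pairCountLong(n: Int): Long = n.toLong() * n

fun main() {
    val n = List(100_000) { it }.count()  // count() returns Int
    println(pairCountInt(n))   // 1410065408 (wrapped)
    println(pairCountLong(n))  // 10000000000
}
```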
c
That’s a developer decision though. If you’re going to use it in a context that requires long, you need to handle the conversion.
n
Yeah, every piece of code I write is a developer decision... how does that excuse defaults that make needless developer mistakes more likely?
c
By that logic we could make everything a BigInteger and be done with it.
n
No, because there is a large performance hit to BigInteger...
c
My point is that it’s not black and white. A decision needs to be made weighing pros and cons, which is what the language designers have done.
n
Yes, and the pros and cons don't have anything to do with what you suggested... they have to do with what ephemient said
it's just compat with Java/JVM. If you were doing a language "from scratch" designed primarily (overwhelmingly) for 64 bit systems you would use 64 bit integers in all these places
c
So you’re satisfied then?
n
Not sure how that's relevant
c
Just don’t want to continue discussing it if your question is answered.
n
even if my question is not answered, you don't need to continue discussing it
c
🤷‍♂️
r
This seems like a classic trade-off between storage and performance. It is not "shameful" to prefer the former, especially given that it's likely most, if not all, of the performance losses of 32 bit ints on 64-bit platforms have been mitigated by clever VM designers.
k
If your question is specifically about Kotlin, then the answer is JVM compatibility. If you want to move the goalposts in the middle of it and talk about a new greenfield language designed from scratch, that's probably not very fruitful given the focus of this slack place on Kotlin.
n
@Kirill Grouchnikov I totally agree that's the answer, it's very obvious. I just find it strange that people feel the need to try to defend the decision, outside of those boundaries.
@rocketraman it's not really storage vs performance at all... there's a huge correctness component here. 32 bit overflows are not all that rare. I won't say it's the majority of integer usage, but I have run across it many times in real codebases.
k
What's the point then? Some abstract discussion of what an "ideal" language should look like? 32 bit overflow in 2020 is 64 bit overflow in 2030. If there was such a thing as an ideal language, people and companies wouldn't be starting work on new languages all the time.
👍 1
n
you can always blame the dev of course for not considering when something could get "big" but the whole point of language design is that humans are fallible, we try to make choices to minimize rather than maximize that
e
it's a balance between lots of things. 32-bit pointers/indices are still better for the majority of applications - less memory pressure means better performance. (doesn't quite apply to x86, as the change in registers and calling convention during the move to 64-bit has a big impact)
n
the point was to understand it. And it's ok to admit there's something sub-optimal about a language because of practical constraints, we don't need to go to a place of "what's the point of these statements"
e
but for the JVM, where nearly everything is a pointer... 32-bit it is.
n
for pointers, sure, it doesn't mean that you couldn't use a 64 bit integer everywhere for convenience
r
Integer overflow is not a huge, or even a minor, problem in the typical use cases for Kotlin.
👍 1
e
the standard library does use Long in places that are expected to be larger - file I/O, for example
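One example of this, as far as I know: `InputStream.copyTo` in `kotlin.io` reports the number of bytes copied as a `Long` (and `java.io.File.length()` is a `Long` as well). The helper name `copiedByteCount` below is just for illustration.

```kotlin
import java.io.ByteArrayInputStream
import java.io.ByteArrayOutputStream

// copyTo returns the number of bytes copied as a Long, not an Int.
fun copiedByteCount(data: ByteArray): Long {
    val out = ByteArrayOutputStream()
    return ByteArrayInputStream(data).copyTo(out)
}

fun main() {
    println(copiedByteCount(byteArrayOf(1, 2, 3, 4, 5)))  // 5
}
```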
n
citation needed
integer overflow is a minor but present problem in most sufficiently large codebases
@ephemient yeah, I understand the reasoning the stdlib applies, it is consistent so that will be helpful. I just need to be very careful when I start doing any non-trivial arithmetic I guess, more than I would usually be
At any rate thanks to you and Johnathan for answering the original question. I would say so far, 100% of the time something strikes me as strange about Kotlin, the answer to why it is that way is Java/JVM 🙂
👍 1
k
And that is an immense strength of the language - that it builds on top of something that is quite powerful and ubiquitous
👍 3
e
if you want something actionable, I'm sure you could make a feature request for Kotlin to optionally (when targeting Java 8+) compile arithmetic operators to Math.addExact() etc. which check for overflows/underflows
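For reference, this is roughly what calling `Math.addExact` by hand looks like from Kotlin on the JVM today. The wrapper `addOrNull` is a hypothetical helper, not an existing API.

```kotlin
// Hypothetical helper: returns null instead of throwing when the sum overflows.
fun addOrNull(a: Int, b: Int): Int? =
    try {
        Math.addExact(a, b)  // java.lang.Math, available since Java 8
    } catch (e: ArithmeticException) {
        null
    }

fun main() {
    val max = Int.MAX_VALUE
    println(max + 1)            // -2147483648 (plain + wraps)
    println(addOrNull(max, 1))  // null
    println(addOrNull(2, 3))    // 5
}
```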
n
Nobody is denying that it's a strength, but it does in some cases force things that are sub-optimal. Almost like it's a trade-off 😉
@ephemient yeah, that would be a very nice feature IMHO
https://discuss.kotlinlang.org/t/checked-and-unsigned-integer-operations/529/4 is what I was able to find on it, doesn't seem like it was discussed in terms of an optional feature
e
Java's Math.addExact() etc. came in Java 8, which Kotlin didn't yet target at that time. in any case, as long as Java 6 and 7 are targets, Kotlin can't rely on that feature
n
Sure, makes sense, I'll try to raise this somewhere, thanks for the suggestion
r
The first thing you're going to get asked is... show some evidence that the benefits are worth the additional complexity and runtime overhead. I think you'll have a hard time making that argument, especially when `Math.addExact` already exists as a developer option for the use cases when it's important. It can't hurt to try though...
The best place to start will probably be #C0B9K7EP2
n
@rocketraman No kidding 🙂. You may not be aware of it, but the fact that integer overflow is a minor but existent problem is pretty much consensus, which is why e.g. C++, Rust, C# (to name a few) all support compilation modes which trap integer operations
Having a compilation mode that does this project wide, for use in debug builds is a lot more practical than deciding to use addExact in every situation, and unlike addExact has no cost in release builds (which is why, again, all these languages offer this as a compilation option)
r
<sarcasm>It's also well known that debug builds catch all problems before they get to production.</sarcasm>
n
Clearly, to get something accepted into a language implementation, you have to prove it catches all problems 😂
r
Obviously not, which is why I labelled it as sarcasm. The point wasn't that it invalidates your argument -- it just means that the bar will be that much higher to get that feature into the language, as you're talking about solving an (even) smaller set of problems.
e
there's precedent set by `@Strictfp`... sorta. this would be a more invasive change
n
it's not a smaller set of problems. It's a really useful tool for debugging programs, packaged in a separate compilation mode, the same way that there are already many useful things packaged as compiler options for detecting problems, even if they have costs in compilation time, binary size, or performance.
You're making this out to be some enormously high bar, hugely philosophical whether it's worth it, etc... it's not any of these things. It's very clearly a useful thing, which is why most mainstream, non-JVM languages support it.
That doesn't mean I expect someone to run and implement it, maybe there are even more valuable things to do
@ephemient i'm not sure how it's invasive if it's a compilation option that's off by default. C++ compilers mostly all implement integer trapping as a compiler option, not even as part of the standard
r
So you're saying solving overflows only in debug builds is not "smaller" than solving it in all builds? Now you're just arguing semantics and this is getting pointless. Make your argument to #C0B9K7EP2 and/or on the JVM mailing lists, not here.
e
I mean, implementation-wise.
`@Strictfp` just sets a flag on the method, this would involve deeper compiler changes
honestly it would be pretty easy to implement after the fact with bytecode rewriting
easy to do yourself instead of touching Kotlin
r
Could it be done in a Kotlin compiler plugin?
n
dude, if it's pointless it's not necessary to keep arguing with me 🙂
@ephemient probably, but I'd expect at least 95% of devs would probably never ever do this on their own, probably higher. If it's easy to implement maybe they would do it. It's probably relatively straightforward, backwards compatible, decent value... but who knows. I'm actually pretty surprised that Java 8 added the "addExact" functions but not a mode that does this universally.
e
I mean, like it or not, Java has always explicitly defined arithmetic to wrap
that would be a breaking change for anybody emulating unsigned arithmetic in Java
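A small sketch of why trapping would break unsigned emulation. The function name `unsignedAdd` is illustrative; `Integer.toUnsignedString` is the Java 8 helper for reading an `int`'s bits as an unsigned value.

```kotlin
// Emulated unsigned add: the signed + produces the same 32 bits as an unsigned add,
// but only because signed overflow is defined to wrap.
fun unsignedAdd(a: Int, b: Int): Int = a + b

fun main() {
    val sum = unsignedAdd(2_000_000_000, 2_000_000_000)
    println(sum)                            // -294967296 (signed view, overflowed)
    println(Integer.toUnsignedString(sum))  // 4000000000 (unsigned view, correct)
}
```

With an overflow-trapping `+`, this addition would throw even though the unsigned result fits in 32 bits.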
n
ah I see, I wasn't aware of it
unfortunate
r
I'm not arguing, I'm interested in the subject, and in language design. I'd definitely read your proposal if you actually made one.
n
Here's the proposal: add a flag to the compiler that if passed, causes integer overflow to throw or abort. The end 🙂. This isn't putting a man on the moon.
@ephemient so the flag would make your program behave in a non-conformant way for Java (and I'd assume Kotlin)
but it's a flag that defaults to off for debugging purposes, that's probably allowed, no? Isn't there precedent for that sort of thing?
r
If 95% of java devs wouldn't do this on their own, then why would those devs update their builds to add the compiler flag? Complicate their CI and deployment systems to deal with different test and prod build?
e
no, it would have to make `operator fun Int.plus(Int)` translate to `Math.addExact()` instead of the bytecode instruction `iadd`
and it would have to be scoped so that it is still possible to implement unsigned arithmetic… or we assume that `UInt` etc become stable and the standard library is always compiled with the flag off
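Without any compiler flag, a rough user-level approximation of this scoping is an opt-in wrapper type whose operators call the exact methods. `CheckedInt` below is a hypothetical type, not something Kotlin provides, and it assumes Kotlin 1.5+ for `@JvmInline value class`.

```kotlin
// Hypothetical opt-in wrapper: operators delegate to java.lang.Math's exact methods,
// so overflow throws ArithmeticException instead of wrapping.
@JvmInline
value class CheckedInt(val value: Int) {
    operator fun plus(other: CheckedInt) = CheckedInt(Math.addExact(value, other.value))
    operator fun minus(other: CheckedInt) = CheckedInt(Math.subtractExact(value, other.value))
    operator fun times(other: CheckedInt) = CheckedInt(Math.multiplyExact(value, other.value))
}

fun main() {
    println((CheckedInt(2) + CheckedInt(3)).value)  // 5
    val result = runCatching { CheckedInt(Int.MAX_VALUE) + CheckedInt(1) }
    println(result.exceptionOrNull() is ArithmeticException)  // true
}
```

Plain `Int` arithmetic elsewhere still wraps, which is the kind of scoping an unsigned implementation would need.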
n
@rocketraman.... because it's waaaaaaaay less work to do that? Lots of places may already have debug/non-debug builds anyway? debug/non-debug builds are 100% standard practice in many languages so people are used to the idiom? Should I go on? This comparison is silly. "if a dev won't sit down and dive into the compiler and write a plugin which they've probably never done before, why would they spend 10 minutes updating a build matrix". I mean come on.
e
(translating to anything else would have a significant performance penalty)
n
I didn't realize that Kotlin implemented unsigned in terms of signed
r
Dude, the comparison isn't between writing a compiler plugin and enabling a flag. It's between using a compiler plugin (or post-compilation bytecode manipulator) that you download from GitHub, and enabling a compiler flag. Not much difference there. You're right, this is getting silly.
👋 1
e
how else do you think it's implemented??
n
I thought it was a native sort of thing, tbh. My perspective is mostly from C++
Obviously the JVM presents you with a certain set of primitives, from my perspective it would make sense if the JVM provided both kinds of integers among those primitives, given that they both have representation in the underlying machine
e
the JVM only has one integer type and one long integer type, both signed. boolean, byte and short are all represented by int.
and the underlying machine typically only has one integer type - the only difference is which instructions are used on it
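This shows up at the Kotlin source level too: arithmetic on `Byte` (and `Short`) operands is typed `Int`, and narrowing back is explicit. A minimal sketch, with `sumBytes` as an illustrative name:

```kotlin
// Byte + Byte is typed Int in Kotlin; the result has to be narrowed explicitly.
fun sumBytes(a: Byte, b: Byte): Int = a + b

fun main() {
    val b: Byte = 100
    val sum = sumBytes(b, b)
    println(sum)           // 200
    println(sum.toByte())  // -56 (narrowing keeps only the low 8 bits)
}
```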
n
learning more and more. I guess the idea is to keep the JVM implementation somewhat simpler and push more of the implementation into the language, which is reasonable as well.
sure, yes, you're correct.... I mean assembly doesn't really have types at all
it just uses an instruction and assumes that whatever is in whatever register is the thing
so to the extent assembly recognizes types, it does recognize signed and unsigned separately
anyhow though, it's pretty interesting. Now I want to look at C# and see if they also have defined signed integer to wrap
e
x86, for example, has imul/mul and idiv/div, but only a single add and sub instruction
because in 2's complement it doesn't matter if it's signed or not
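The same effect can be observed with Kotlin's `UInt` (stable since Kotlin 1.5). The helper names below are made up for illustration: addition gives the same bits either way, while division does not.

```kotlin
// True when signed and unsigned addition produce the same 32-bit pattern.
fun sameBitsAdd(a: Int, b: Int): Boolean =
    (a + b).toUInt() == a.toUInt() + b.toUInt()

// True when signed and unsigned division produce the same 32-bit pattern.
fun sameBitsDiv(a: Int, b: Int): Boolean =
    (a / b).toUInt() == a.toUInt() / b.toUInt()

fun main() {
    println(sameBitsAdd(-7, 2))           // true
    println(sameBitsDiv(-7, 2))           // false
    println(-7 / 2)                       // -3
    println((-7).toUInt() / 2.toUInt())   // 2147483644
}
```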
n
right, but it makes the distinction at least part of the time
so in that regard, it seems like the JVM is potentially leaving some fairly free performance on the table
a very very tiny amount, probably 🙂
e
well, only for people that care about unsigned multiply/divide and have to implement it out-of-line
n
right. I am very glad that Java defaults to signed integers everywhere. That's a well acknowledged mistake in C++, but I see new languages copying it unfortunately.
(java/kotlin)
Anyhow. I'll try to find a good place to make the suggestion, curious what the JB people will say. thanks for your help.