I was wondering if value object declarations have ever been kotlinlang #language-proposals

Wyatt Kennedy

02/15/2022, 5:51 AM

I was wondering if value object declarations have ever been discussed within Kotlin? My bad if I'm beating a dead horse somewhere. I've discussed this a few times over the years with coworkers in the context of garbage-collected languages, and the answer is always, "If you need control over memory locality, you're using the wrong language," but that just seems like an incomplete answer. I'd like to discuss more technically what prevents this type of control, because I'm not super involved in the development of garbage collectors and don't quite understand the technical limitation or the desire to avoid value types. Why must languages with a clean syntax and languages with control of memory locality be mutually exclusive? What I'm proposing is the ability to declare that a property of an object type (not a primitive) within a class is not a reference and is instead laid out inline as part of the class in memory, as in other statically compiled languages. Obviously, this is to reduce heap allocations and garbage collections and to improve CPU cache performance in places where performance is critical. This could just be a keyword or symbol applied at the declaration site of a property. Here is some possible syntax and the semantics for what I'm proposing:

class TestInnerClass {
    var x : Float = 0f
    var y : Float = 0f
}

class TestClass {
    val bar : TestInnerClass! // the exclamation point makes it a value property declaration. Initialization rules are the same as type TestInnerClass
    var bar2 : TestInnerClass! // ERROR : value types cannot be mutable, since they aren't references
}

fun semanticExamples() {
    val foo = TestClass()
    foo.bar.x = 6f

    val test : TestInnerClass! = foo.bar // FINE: When used on a function local variable, TestInnerClass! is essentially a const ref in c++
    // when compiled, all usages of test can be replaced by foo.bar.

    val test2 : TestInnerClass = foo.bar // ERROR references of this type can be assigned elsewhere, would allow undefined behavior when foo is destroyed.
    val test3 : TestInnerClass? = foo.bar // ERROR same as above as well as not being able to be a nullable reference
    
    val foo2 = TestClass()
    foo.bar = foo2.bar // ERROR: value properties cannot be assigned because they are not references and implicit copy constructors do not exist

    takesTestInnerClassRef(foo.bar)
    functionTypeExample {
        foo.bar.x = 6f // the value type can still be referenced because the reference to foo is enclosed.
        test.x = 6f // ERROR: cannot be sure that the owner of test hasn't been garbage collected
    }
}

// Types TestInnerClass and TestInnerClass? are demotable to TestInnerClass! which is least permissive
// The value type "!" annotates that the reference cannot be assigned to anything.
fun takesTestInnerClassRef(testInnerClass: TestInnerClass!) {
    val test : TestInnerClass! = testInnerClass // still valid because they are basically just const refs

    val foo2 = TestClass()
    foo2.bar = testInnerClass // Error: value properties cannot be assigned to as mentioned above
}

fun functionTypeExample(func: () -> Unit) {
}

Most of these are simple semantic rules in the type system; the only compilation issue is whether the JVM can express an indirection through a reference that is not managed by the garbage collector (I'm not sure whether it already supports this). Because this syntax is purely ad hoc, its use is discouraged unless someone explicitly needs better control of memory locality for high-performance sections. If you also introduced value generics, you could use this for compile-time fixed-length arrays of value objects, should people want them.

//   a way to declare value generics, I'm sure this has been discussed elsewhere
//                          |
//                          v
class FixedArray<T : Any!, val Length : Int> {
    // ...
}

class Test {
    var x : Float = 0f
    var y : Float = 0f
}

fun valueArrayTest() {
    val foo = FixedArray<Test!, 6>()
}

While the semantic rules could apply to all compilation targets (JS, JVM, Native, etc.), you could just provide a warning that value annotations will be ignored in environments where they simply cannot be enforced (probably JS). Is any or all of this something that has already been taken permanently off the table?

Ilmir Usmanov [JB]

02/15/2022, 7:20 AM

Sounds very similar to https://github.com/Kotlin/KEEP/blob/master/notes/value-classes.md, however, there will still be boxing (allocations), like for inline classes.

mcpiroman

02/15/2022, 11:44 AM

What you describe is partly available already as value classes, although they are more limited; mostly, they can have only one property. Allowing more is possible and being considered (IIRC); it's just complex to design and implement properly. As for the general topic of densely packed values in GC languages, I do agree. I assume they have historically been left out because they were not needed enough to justify the complexity of their implementation. One counter-example, though, is C# (.NET), which has had C++-like `struct`s, along with JIT-time specialization, for a long time, and it's doing fine. However, there are recent moves in this field, namely:

• JVM - There is the well-known Project Valhalla, which aims to provide native value classes for the JVM and Java, mostly like in .NET but exclusively immutable.
• JS - There is a proposal for Records & Tuples, which would at least make flattening of objects more predictable (although there are no hard guarantees; it depends on the engine). They are also immutable.

For native (and Wasm) targets, Kotlin could have a flat representation OOTB; however, I guess the team is mostly waiting for Valhalla before implementing that, so that semantics on all Kotlin platforms match.
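For reference, here is a minimal sketch of what a single-property value class looks like in Kotlin today. `Meters` and the helper functions are hypothetical names chosen for illustration; the comments about boxing describe the documented JVM behavior of inline value classes, not anything specific to this proposal.

```kotlin
// Hypothetical example type: a value class may wrap exactly one property.
@JvmInline
value class Meters(val value: Float) {
    operator fun plus(other: Meters): Meters = Meters(value + other.value)
}

// In this position the compiler can represent Meters as a plain float.
fun totalLength(a: Meters, b: Meters): Meters = a + b

// Nullable use: the value has to be boxed into a wrapper object.
fun describe(length: Meters?): String = length?.value?.toString() ?: "unknown"

// Generic use: each element of the list is a boxed Meters instance.
fun boxedList(): List<Meters> = listOf(Meters(1f), Meters(2f))
```

So the flattening only applies where the static type is exactly the value class; as soon as it is used as a nullable or generic type, allocations come back.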

Ilmir Usmanov [JB]

02/15/2022, 6:56 PM

Have you read the document I linked?

Wyatt Kennedy

02/15/2022, 6:58 PM

Still getting through it, there is quite a lot in that KEEP. Thus far I believe the concept I'm proposing is mutually exclusive. Give me some time to finish the KEEP and make a better response.

Wyatt Kennedy

02/15/2022, 7:44 PM

@Ilmir Usmanov [JB] I've read through most of the KEEP, thanks for pointing it out. I believe I understand the approach, and I don't think it is mutually exclusive with what I've described. A value class is a way of declaring outright that all of an object's data is truly immutable. References to a value class are implicitly replaced on changes to ensure other references to the data don't see subtly different results. This does NOT preclude the use of the heap. For example:

value class Test {
    val bar1 = 0.5f
    val bar2 = 0.5f
}

value class TestOuter {
    cov /* copy val*/ test = Test()
}

fun test() {
    var foo = TestOuter() // here foo is heap allocated, and let's say it points to 0x1234
    foo.test.bar1 = 0.6f    // were Test not a value class, this would simply mutate the object
                            // however, a new instance of Test is implicitly heap allocated, and test now points to 0x3232 (somewhere in heap)
                            // to hold the new result.
}

// results in heap allocations like so
TestOuter@0x1234 [
    Test@0x3232 // originally pointed to Test@0x3030
]

Test@0x3030 [
    bar1 = 0.5f
    bar2 = 0.5f
]

Test@0x3232 [
    bar1 = 0.6f
    bar2 = 0.5f
]

My reservations about this syntax aside (implicit heap allocation is a dangerous thing), it actually has valid use cases that are unrelated to my suggestion. My suggestion would be to prevent certain objects from being allocated on the heap altogether, with many semantic restrictions to enforce it. For example:

data class Test(
    val bar1: Float = 0.5f,
    val bar2: Float = 0.5f
)

class TestOuter {
    var test = Test!() // proposed syntax: stored inline, not as a reference
}

fun test() {
    var foo = TestOuter() // here foo is heap allocated, and let's say it points to 0x1234
    foo.test = Test(2.0f, 3.0f) // allow only constructors
}

// there is then only one heap allocation.
TestOuter@0x1234 [
    // Test implied
    test = {
        bar1 = 2.0f
        bar2 = 3.0f
    }
]

This prevents the heap allocation of an instance of Test altogether, removes an indirection when accessing its properties, and so avoids the cache miss that following that reference could cause. If you can have fixed-length arrays, this becomes very valuable. For example:

fun test() {
    // with normal references
    val foo = arrayOfNulls<Test>(6)
    for (i in foo.indices) {
        foo[i] = Test()
    }

    // with value types
    // A default constructor will probably be required, or some other way to provide
    // an initializer at compile time; not exactly sure what the syntax might be.
    val foo2 = FixedArray<Test!, 6>(::Test)
}

// first type
Array@1234 {
    Test@3232
    Test@3434
    Test@3535
    Test@3636
    Test@3737
    Test@3838
}

// ... and then six more allocations that look like this at each of those references scattered across heap. This can result in cache misses on iteration.
Test@3232 {
    bar1 = 0.5f
    bar2 = 0.5f
}

Whereas the second example is a single fixed-length allocation.

Array@1234 {
    {
        bar1 = 0.5f
        bar2 = 0.5f
    }
    {
        bar1 = 0.5f
        bar2 = 0.5f
    }
    // etc
}

The tradeoff is that the references cannot simply be passed around and stored in other objects, as their lifetime is not managed individually by the GC (this can be enforced semantically), but in exchange you benefit from many fewer cache misses.
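To make the single-allocation layout concrete, here is a rough sketch of how it can be approximated by hand in current Kotlin, by packing the two fields of each element into one primitive array. `FlatTestArray` and its methods are hypothetical names, and this is only a workaround under the assumption that every field is a Float; it is not the proposed `FixedArray<Test!, 6>` feature.

```kotlin
// Hypothetical workaround: one FloatArray holds (bar1, bar2) pairs back to back,
// so all elements live in a single contiguous allocation.
class FlatTestArray(val size: Int) {
    // Every slot starts at 0.5f, matching the defaults of Test in the example.
    private val data = FloatArray(size * 2) { 0.5f }

    fun bar1(index: Int): Float = data[index * 2]
    fun bar2(index: Int): Float = data[index * 2 + 1]

    // Overwrites both fields of one element in place; nothing is allocated.
    fun set(index: Int, bar1: Float, bar2: Float) {
        data[index * 2] = bar1
        data[index * 2 + 1] = bar2
    }

    // Iteration walks the backing array linearly instead of chasing references.
    fun sumBar1(): Float {
        var total = 0f
        for (i in 0 until size) total += data[i * 2]
        return total
    }
}
```

The same restriction shows up here as in the proposal: there is no `Test` reference to hand out, only an index into the owning array.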

Wyatt Kennedy

02/17/2022, 1:11 AM

After reading about Project Valhalla and reading some documents detailing JVM opcodes, it looks like this is entirely impossible. You could resolve all of the properties for value types and apply them to the owning object when the AST is turned into JVM code, but there is no way to do anything useful like inheritance, or to pass references to the "child" structures down the stack, since those references would not be managed by the garbage collector. Until the JVM changes to allow object references to unmanaged memory, we'll just have to live with poor cache performance. This isn't so much a problem on the JVM, since heap allocations are very fast and freeing is nearly costless, but on other targets like native, heap allocations are terrible until such a time that the garbage collector becomes competitive (not likely). I suppose you could introduce common syntax like this to the language to indicate that a variable or object declaration can be "inlined", with the semantic rules to enforce it, and then provide a compiler flag to enable or disable it per target. This would significantly improve performance in native, but may negatively impact performance on the JVM, even if the JVM supported it.
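As a sketch of what "resolving the properties onto the owning object" would amount to, here is a hand-flattened version of the earlier TestOuter example. `FlattenedTestOuter`, `testBar1`, `testBar2`, and `setTest` are hypothetical names; this is written by hand here, and is only an assumption about what a compiler could generate.

```kotlin
// Hypothetical hand-flattened TestOuter: the fields of Test live directly in
// the outer object, so no separate Test instance is ever allocated.
class FlattenedTestOuter {
    var testBar1: Float = 0.5f
        private set
    var testBar2: Float = 0.5f
        private set

    // Stands in for `foo.test = Test(2.0f, 3.0f)` from the proposal:
    // both fields are overwritten in place.
    fun setTest(bar1: Float, bar2: Float) {
        testBar1 = bar1
        testBar2 = bar2
    }
}
```

This also shows the limitation: there is no object that represents `test` on its own, so it cannot be passed to a function expecting a `Test` without copying the fields into a new instance.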
