I've been writing a comprehensive `Matrix` class. ...
# announcements
h
I've been writing a comprehensive
Matrix
class. It extends
MutableList<Vector>
(
Vector
extends
MutableList<Double>
). This allows me to add really nice and intuitive operator functions for row operations like
Copy code
val A = Matrix(5, 5) { i, j -> 1 }
A.row(2) *= 3
A[0] += Vector(0.0, 1.0, 3.0, 4.0, 5.0)
println(A[2]) // < 3.0, 3.0, 3.0, 3.0, 3.0 >
However, column operations aren't going to work the same way because the row and col getters are as so:
Copy code
fun row(i: Int) = this[i]
fun col(j: Int) = Vector(numRows) { this[it][j] }
so a row operation is performed on the stored Vector from the MutableList, while the col operation is performed on a newly created Vector that wraps the corresponding values. Does anyone have any ideas how I could allow for col operations in the same intuitive way?
k
Make
Vector
an interface, and implement a special type of vector that doesn't store values itself but is backed by the original
Matrix
.
You could do the same for
row
and then change
Matrix
to store a flat
DoubleArray
instead for a lot better performance.
✍️ 1
h
That's a great idea. Thanks a lot
it is surprisingly easy to convert classes into interfaces. you can transform constructors into just top-level functions.
k
Isn't there an intelij refactoring for it too? But yes, that part helps a lot to keep factory functions discoverable.
h
is there?
Do you think my Vector interface should extend DoubleArray?
k
"wrap"?
h
extend
it currently extends
MutableList<Double>
k
You can't extend
DoubleArray
though. I'd say it depends, are you planning to use those vectors in a list-like way a lot? You can't change their size, so it's already not really going to be a mutable list.
h
Oh, righto
Yeah, i like that in concept they'd be easily mutable, but in practice matrix classes just need to be fast
k
You could make your own interfaces,
(Mutable)MatrixSlice
that work better and have functions like
subSlice
.
And they won't have to box doubles.
h
hmmm, i'm not sure how that would help, or rather, how to take advantage of that if i implemented it
k
Well mostly because implementing
MutableList
from scratch is quite difficult, and you could choose a simpler interface yourself. But if you have no need for such an abstraction then just don't do it simple smile
h
gotcha. yeah, I already implemented MutableList from scratch.... my Matrix class is over 1000 lines of code atm
k
Why is
Matrix
a
List
?
h
Originally because it was an easy way to get access to kotlin's collections functions by delegation.
k
Delegation doesn't really come into play there, but I see what you mean.
h
open class Matrix(data: List<List<Double>> = listOf()) : MutableList<Vector> by data.map({ it.toVector() }).toMutableList()
k
Right, I meant that just implementing
List
also gets to access to Kotlin's collections functions.
h
Other than the mutable ones 😛
k
Haha true. Note that
.map{}.toMutableList{}
is terribly inefficient, it copies the whole list over for no reason. Use
mapTo
instead.
h
alright, thanks
i really should store the matrix as a flat double array, but holy shit is that a headache to implement
k
Just have a
get(r,c) = arr[r*width + c]
and corresponding
set(r,c,v)
and everything else can stay about the same, right?
h
yeah, but there's other stuff like diagonals, bidiagonals, etc. it gets complex quickly
maybe i'm just dumb and it really isn't that complex
so i'm trying to wrap my head around this special Vector interface for the matrix that doesn't store values. How can i get it to perform its operations on the matrix instead of itself?
wouldn't i not HAVE to make a special vector interface now? couldn't I simply pass a col
as Vector
instead of
to Vector
and thus operations would be performed on the matrix?
hmmmm. this still wouldn't do what I want it to. I'm not sure how to grab all the pointers instead of copying all the elements.
fun col(index: Int) = List(numRows) { this[it][index] } as Vector
oh, the woes of not having pointers
k
Quick mock-up:
Copy code
interface Vector {
    val size: Int
    operator fun get(i: Int): Double
}

fun Vector(elements: DoubleArray): Vector = BasicVector(elements)

private class BasicVector(val arr: DoubleArray): Vector {
    override val size: Int get() = arr.size
    override fun get(i: Int) = arr[i]
}

class Matrix(val width: Int, val height: Int) {
    private val arr = DoubleArray(width * height)

    private fun index(row: Int, col: Int) = row * width + col

    operator fun get(row: Int, col: Int) = arr[index(row, col)]
    operator fun set(row: Int, col: Int, value: Double) { arr[index(row, col)] = value }

    operator fun get(row: Int) = row(row)
    fun row(row: Int): Vector = RowVector(row)
    fun col(col: Int): Vector = ColVector(col)

    private inner class RowVector(val row: Int): Vector {
        override val size: Int get() = this@Matrix.width
        override fun get(i: Int) = this@Matrix[row, i]
    }

    private inner class ColVector(val col: Int): Vector {
        override val size: Int get() = this@Matrix.height
        override fun get(i: Int) = this@Matrix[i, col]
    }
}
h
THANK YOU SO MUCH!
i love you, Karel
k
😄 that was fast.
h
yeah, i'm currently converting my matrix to an interface also, and refactoring out all metadata into a class called DataFrame
k
Why does matrix have to be an interface?
And what metadata? Just the dimensions?
h
because i don't like having open classes, and that way i can extend Matrix from Dataframe
no, as in attributes
whether a column is continuous or categorical. if the latter, how do the categories map to doubles?
k
That feels something that's better for composition instead of inheritance, but you have more information of course.
h
was this loaded from a csv or something? then what is the filename? was this loaded from an aarf? then same shit
k
Now it feels even more like composition simple smile
h
composition? do you mean better if it was a hierarchy?
i'm sorry, i just don't know what you mean by composition
k
It's one of those oneliners from effective Java: favor composition over inheritance. There's lots of stuff about that online I'm sure. Basically instead of inheriting
Matrix
and adding stuff, do this instead:
Copy code
class DataFrame {
    val matrix: Matrix = ...
    val fileName: String = ...
    ...
}
h
yeah, but i'd like to be able to perform all the functions of a matrix on the dataframe without having to rewrite them all as a call to the matrix
k
Hmm okay, still feels wrong to me 😕
h
so what do you think is best?
k
Composition. To elaborate on why it feels wrong to me: you can multiply matrices together, and you get a new matrix. What would that look like for dataframes?
h
that's a very good point
thank you for pointing that out before i committed to much time to an erroneous path
This is the first time I've ever written with inner classes. I didn't know about
this@Matrix.foo
. that's great
The flattened matrix gets a bit confusing when adding and subtracting columns more than anything else
adding and subtracting rows aren't so bad since it's just performing an
addAll
in the correct place or repeatedly performing
removeAt
in the correct place
adding and subtracting columns involved placing a value into the array at the correct location repeatedly, but the width of the matrix is in flux from start to finish, making identifying to correct index a bit tricky
it's actually not as complicated as i'm making it, i think
they are actually this simple:
Copy code
fun addCol(j: Int, col: List<Double> = List(numRows) { 0.0 }): Vector {
        if (col.size != numRows)
            throw IllegalArgumentException("Column does not have number of values equal to number of entries in matrix.")
        width++
        repeat(numRows) { i -> add(i * width + j, col[i]) }
        return col(j)
    }

    fun removeCol(j: Int = lastColIndex): Vector {
        val res = mutableListOf<Double>() as Vector
        width--
        repeat(numRows) { i -> res += removeAt(i * width + j) }
        return res
    }
k
What do those functions do?
+
and
-
between a col/row of a matrix and a vector, mutating the matrix?
h
vectors + vectors are vector addition. matrices plus vectors extend the matrix
k
You're going to change the size of the matrix? Hmm.
h
i mean, i've been debating making the matrix size immutable
it's probably a better course of action
k
Yeah I would do that.
h
well, i think i'll keep it mutable simply because it'll be used in data mining algorithms, and it would take a lot of extra memory to perform preprocessing on if it weren't mutable as far as i understand how
mutableList
vs
List
function in memory
i'll just remove the
+
and
-
operators for adding and removing rows from the matrix
that'll require deliberate function calls like
addRow
and
addCol
k
Are you going to implement resizing youself then? Lists preallocate extra memory to make it faster, allocating for every new row is going to be slow.
h
hmm, wasn't planning on it
so it would be faster to create a new matrix every time you want to remove a column from a matrix?
these matrices can get HUUUUUUUUUUUUUUGE
k
No, I was talking about separate things simple smile . • matrices with mutable size aren't the standard as far as I know • if you're going to make them resizable you should probably preallocate avoid being incredebly slow
h
so by preallocating i assume you mean creating additional columns and rows beyond the "height" and "width" of the matrix
is there a resource you know of that i can look at for best practices in when and how much to preallocate?
k
Yes. And figure out a layout because
r * width + c
isn't going to work out simple smile .
h
yeah, that definitely makes things more complicated 😮
k
I don't know of any general resource, but you can look in the source of
ArrayList
and
HashMap
, they both preallocate.
h
why not just extend ArrayList then?
this is my current primary constructor / class declaration:
open class Matrix(width: Int = 0, values: List<Double> = emptyList()) : MutableList<Double> by values as MutableList
k
Because that will box the doubles and it's wrong, a matrix isn't an ArrayList.
h
since MutableList is arraylist, do i have the memory management covered already?
that's a fair point
k
Are you still using lists? You definitely need
DoubleArray
both for performance and memory.
h
so then can i not store my data privately as an arraylist inside the matrix?
k
No.
h
alrighty, i'll do it the hard way. this is good general practice. i appreciate all the advice.
what's the best way to convert a List into an Array? just
toDoubleArray
function?
k
Yes, but ideally you don't ever put the doubles in a list in the first place.
h
just for constructors
in case someone wants to make a matrix from a 2d list
constructor(data: List<List<Double>> = listOf()) : this(data[0].size, data.flatten().toDoubleArray())
k
You need to check all lists are the same size.
h
yeah, i have too much faith in people
i should most likely convert this to an
invoke
function in a companion object then, so that i can throw an Exception before instantiating the matrix
k
Or a top level function with the same name as the class, or just a normal constructor?
You could use
DoubleArrayList
from here: https://korlibs.soywiz.com/kds/#ArrayList
h
i thought the init block is called after the object is created?
oh neato
k
What does the init block have to do with it?
h
i thought the init block is to the primary constructor what the block following a constructor declaration is for the constructor?
k
No, the init block runs for every constructor.
h
oh wow, didn't know that. so the block after a constructor declaration is run before object instantiation?
i don't know why i thought it was run after all this time. maybe because that's what it appears as sequentially in the code?
k
What block after the constructor declaration?
There a detailed explanation of the order here: https://kotlinlang.org/docs/reference/classes.html#constructors
h
constructor() { val thisOne: Foo }
k
Well first it's init blocks and properties, then the constructor.
h
Copy code
companion object {
    operator fun invoke(rows: List<List<Double>>): Matrix {
        val width = rows[0].size
        if (!rows.drop(1).all { it.size == width })
            throw IllegalArgumentException("All rows must be of equal length.")
        return Matrix(width, rows.flatten().toDoubleArray())
    }
}
Wouldn't this assuredly prevent instantiation of the Matrix, and thus save time and memory? I'm not sure how to perform actions in this order using normal constructors
k
Ah you're right, a normal constructor doesn't work.
I still prefer top-level over invoke though, but it really doesn't matter.
h
I like that invoke is called like a constructor from the outside.
k
Yes, same for top level functions.
Copy code
class Matrix(val width: Int, val height: Int, val content: DoubleArray)

fun Matrix(list: List<List<Double>>): Matrix {
    //todo validation
    return Matrix(list[0].size, list.size, list.flatten().toDoubleArray())
}

fun main() {
    Matrix(2, 3, doubleArrayOf(.0, .0, .0, .0, .0, .0))
    Matrix(listOf(listOf(.0, .0, .0), listOf(.0, .0, .0)))
}
h
yeah, but code style convention poo-poos about function names being PascalCase
as if i didn't suppress the warning already because of other reasons anyways, so i get your point
k
Code style is there for the 99% of code.
h
yep