I'm trying to use pgvector with exposed, but runni...
# exposed
m
I'm trying to use pgvector with exposed, but running into a few issues. Here was my attempt:
Copy code
class PgVectorColumnType(private val size: Int) : ColumnType() {
    override fun sqlType(): String = "vector($size)"

    override fun validateValueBeforeUpdate(value: Any?) {
        if (value !is Embedding) error("Value must be an Embedding")
        require(value.vector().size == size) { "Embedding size must be $size" }
    }

    override fun valueFromDB(value: Any) = when (value) {
        is String -> Embedding(value.substring(1, value.length - 1)
                                    .split(",")
                                    .map(String::toFloat)
                                    .toFloatArray())
        else -> value
    }

    override fun valueToDB(value: Any?): Any? {
        if (value == null) return value
        return notNullValueToDB(value)
    }

    override fun notNullValueToDB(value: Any): Any {
        require(value is Embedding) { "Value must be an Embedding" }
        require(value.vector().size == size) { "Embedding size must be $size" }
        return "'${value.vector().contentToString()}'"
    }
}

fun Table.vector(name: String, size: Int): Column<Embedding> = registerColumn(name, PgVectorColumnType(size))
But trying to insert data results in the following error:
Copy code
Transaction attempt #1 failed: org.postgresql.util.PSQLException: ERROR: column "embedding" is of type vector but expression is of type character varying
  Hint: You will need to rewrite or cast the expression.
  Position: 46. Statement(s): INSERT INTO notes (embedding, title) VALUES (?, ?)
However, manually running the command in the IJ Database Console works just fine.
c
Hi @martmists When setting an object in a prepared statement with JDBC, any unknown data type without a JDBC mapping is set as a varchar string. To avoid the casting exception due to PostgreSQL's stricter type system, you'll need to manually declare what the type of the set parameter should be, using a
PGobject
in
setParamater()
. Here's a more thorough explanation of the issue if you're interested. I had to make some assumptions about
Embedding
, but here's the changes I made to your column that passed tests on my end:
Copy code
class PgVectorColumnType(private val size: Int) : ColumnType() {
    override fun sqlType(): String = "vector($size)" // no change

    override fun validateValueBeforeUpdate(value: Any?) { // no change
        if (value !is Embedding) error("Value must be an Embedding")
        require(value.vector().size == size) { "Embedding size must be $size" }
    }

    override fun valueFromDB(value: Any): Any = when (value) {
        is PGobject -> valueFromDB(value.value!!) // <- change
        is String -> Embedding(value
            .substring(1, value.length - 1)
            .split(",")
            .map(String::toFloat)
            .toFloatArray()
        )
        else -> value
    }

    // valueToDB duplicated default logic

    override fun notNullValueToDB(value: Any): String {
        require(value is Embedding) { "Value must be an Embedding" }
        require(value.vector().size == size) { "Embedding size must be $size" }
        // the original version kept throwing "ERROR: malformed vector literal..."
        // "Vector contents must start with "["."
        return value.vector().contentToString() // <- change
    }

    override fun setParameter(stmt: PreparedStatementApi, index: Int, value: Any?) { // <- change
        val parameterValue: PGobject? = value?.let {
            PGobject().apply {
                type = sqlType().substringBefore('(')
                this.value = value as? String
            }
        }
        super.setParameter(stmt, index, parameterValue)
    }
}
348 Views