# squarelibraries
c
Does this answer still hold true? No bulk way to insert into sqldelight?
🚫 1
Seems like I have to do something like this?
suspend fun insertAllNames(list: List<String>) =
    withContext(Dispatchers.IO) {
      list.forEach {
        personQueries.insert(it)
      }
    }
j
use a transaction though
The inserts are still not executed in batch but a transaction is a good workaround for a small amount of inserts.
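For reference, the speedup from a transaction comes from SQLite itself, not SQLDelight: outside a transaction, every INSERT is its own implicit transaction with its own commit and journal sync. A minimal sketch of the two shapes using Python's built-in sqlite3 module (the `person` table and row counts are made up for illustration; the thread's actual code is Kotlin/SQLDelight):

```python
import sqlite3

def insert_autocommit(conn, names):
    # One commit per row: each insert pays the full journal-sync cost.
    for name in names:
        conn.execute("INSERT INTO person (name) VALUES (?)", (name,))
        conn.commit()

def insert_in_transaction(conn, names):
    # One BEGIN ... COMMIT around all inserts: a single journal sync.
    with conn:  # commits on success, rolls back on exception
        for name in names:
            conn.execute("INSERT INTO person (name) VALUES (?)", (name,))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT NOT NULL)")
insert_autocommit(conn, ["x", "y"])
insert_in_transaction(conn, [f"name-{i}" for i in range(1000)])
count = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
```

On a real on-disk database the per-commit sync is what turns 100k inserts into minutes; batching them under one transaction is the standard fix.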
c
In this case... it's a large amount of inserts. About 100k records. I have to parse a CSV and put it into a table. I guess I can just iterate over the CSV one by one and just insert it into the table.
h
Depending on the db you could also use a file import, like Postgres COPY.
c
currently on android, so I think I'm bound to sqlite. I'll see how the 100k inserts end up going 🤞
took about 3 minutes to do all of those inserts.... so I don't think I'm going to go this route. 😂
h
Inside a transaction?
SQLite supports multiple values, so you could write a single query with eg 100/1000 values.
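The multi-row VALUES idea looks like this in plain SQL (sketched with Python's sqlite3 for illustration; in SQLDelight you would need a query per batch size or raw SQL). One caveat: SQLite caps the number of bound parameters per statement (999 in older versions, 32766 since SQLite 3.32), so the chunk size must stay under that limit:

```python
import sqlite3

def insert_multirow(conn, names, chunk_size=500):
    # One statement per chunk: INSERT INTO person (name) VALUES (?), (?), ...
    with conn:
        for start in range(0, len(names), chunk_size):
            chunk = names[start:start + chunk_size]
            placeholders = ", ".join("(?)" for _ in chunk)
            conn.execute(
                f"INSERT INTO person (name) VALUES {placeholders}", chunk
            )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT NOT NULL)")
insert_multirow(conn, [f"name-{i}" for i in range(1234)])
count = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
```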
c
nope. not a transaction. just one insert at a time. let me see how I can do this with a transaction.
m
Better?
k
If you can pre-seed your application with a DB you could do a one time conversion from CSV to SQLite and then ship that db with your application
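The pre-seeding idea amounts to a one-time build-step script: convert the CSV to a .db file once, ship it in the app's assets, and open it directly at runtime. A rough sketch (file names and the single-column `person` schema are assumptions for illustration):

```python
import csv
import os
import sqlite3
import tempfile

def csv_to_sqlite(csv_path, db_path):
    # One-off conversion: run at build time, ship the resulting .db with the app.
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS person (name TEXT NOT NULL)")
    with conn, open(csv_path, newline="") as f:
        rows = ((row[0],) for row in csv.reader(f))
        conn.executemany("INSERT INTO person (name) VALUES (?)", rows)
    conn.close()

# demo with a throwaway CSV
tmp = tempfile.mkdtemp()
csv_path = os.path.join(tmp, "people.csv")
db_path = os.path.join(tmp, "people.db")
with open(csv_path, "w", newline="") as f:
    csv.writer(f).writerows([["alice"], ["bob"], ["carol"]])
csv_to_sqlite(csv_path, db_path)
count = sqlite3.connect(db_path).execute(
    "SELECT COUNT(*) FROM person"
).fetchone()[0]
```

This moves the entire insert cost out of the app, at the price of a larger APK and a sync story if the data changes.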
j
100k is not a lot of data. I would expect that to insert in 3 seconds if within a transaction, not 3 minutes
c
Okay, using a transaction (not exactly sure if I'm using it correctly) takes me down to about 10 seconds.
Going to see if I can easily write a single query to take 100 values, etc. like hfhbd suggested
Here is my final result. Inserting seems to take about 3 seconds now. Would appreciate any code review since it's a fairly self-contained method where I grab a CSV from the network, clear the database, then insert all into the db with a transaction.
class MyUseCase @Inject constructor(private val persistence: PersistenceLayer) {
  suspend fun sync() = withContext(Dispatchers.IO) {
    val client = OkHttpClient()

    val request = Request.Builder()
      .url("https://example.com/150k_records.CSV")
      .build()

    client.newCall(request).execute().use { response ->
      if (!response.isSuccessful) throw IOException("Unexpected code $response")

      val bufferedReader = BufferedReader(StringReader(response.body!!.string()))

      persistence.deleteAll()
      Log.e("TIMER", "START inserting")
      persistence.myQueries.transaction {
        bufferedReader.forEachLine { line ->
          persistence.myQueries.insert(line)
        }
      }
      Log.e("TIMER", "STOP inserting")
    }
  }
}
The only thing I'm really not happy with is that my PersistenceLayer (which is supposed to abstract away SQLDelight) now has to expose myQueries so that I could wrap the insert in a transaction.
j
The delete should be in the transaction
👍 1
h
You could also make PersistentLayer inherit Transacter (and implement it using delegation of schema.myQueries).
🤔 1
j
If you're going to read line by line you should not buffer the entire response body as a string first
That assumes that this function is running first, otherwise you'll block other transactions on the speed of the network which is not good
Also you need to shut down the OkHttp Dispatcher if you are discarding the instance
👍 1
c
Thanks. This is all great info. I moved the delete into the transaction. 🤦 And using
client.dispatcher().executorService().shutdown()
to shut down. Not sure how to read line by line if I don't put it into a string, but lemme look it up.
Okay, so currently creating the buffer by first converting it to a string takes ~7 seconds, and inserting takes 3. If I switch over to
val bufferedReader = BufferedReader(InputStreamReader(response.body!!.byteStream()))
then creating the buffer is instant, but inserting takes a bit longer at ~6 seconds. Overall, the time saved by not converting to a string first seems worth it. Curious if there are any other optimizations there, but that seems pretty good.
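The memory trade-off under discussion: string() materializes the whole body before the first line can be parsed, while a streaming reader hands out lines as bytes arrive, so parsing and inserting overlap with the download. The two shapes, sketched with a fake in-memory "body" in Python stdlib (for illustration only, not okhttp):

```python
import io

# A fake response body: 10,000 newline-separated records.
body_bytes = b"\n".join(b"name-%d" % i for i in range(10_000)) + b"\n"

# Shape 1: buffer everything, then split.
# Peak memory holds the entire decoded body at once.
whole = body_bytes.decode("utf-8")
buffered_lines = whole.splitlines()

# Shape 2: wrap the raw byte stream and read line by line.
# Constant memory, and processing can start before the "download" finishes.
stream = io.TextIOWrapper(io.BytesIO(body_bytes), encoding="utf-8")
streamed_lines = [line.rstrip("\n") for line in stream]
```

Both shapes yield the same lines; only when the work happens (and how much sits in memory) differs.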
j
Next up would be using Okio to read lines rather than java.io
You're unlikely to see any performance changes since now your insertion time is likely limited by the network speed, not the database speed
c
Surprisingly the network request comes back in about 600ms. I will put a TODO to replace java.io with Okio though.
j
No the headers come back in 600ms, the body takes the other 5.4 seconds
The cost of inserting into the database has been amortized over the slower network request so it now is essentially free
c
🤯 okay. apparently I don't understand how this stuff works. lmaooo
ah. okay. so that's why creating the buffer was ~6 seconds before: creating a string meant I had to read the whole thing from the network.
j
precisely!
c
so interesting! If I wanted to just get an idea of how long it took to read from the network, I could remove all the db calls, buffered reader creation, etc., and then simply call string() on the response.body, and that's roughly how long the "network call" takes?
j
Yes, that's a close approximation. It would include UTF-8 decoding the bytes into a string. The best you can do is probably calling
source().skip(Long.MAX_VALUE)
which will discard the bytes without any additional processing or copies
❤️ 1
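The point of skip() is that it discards bytes without the UTF-8 decode or string allocation. The same "drain without processing" idea, sketched with a Python byte stream standing in for the network body (chunked reads in place of Okio's skip; okhttp specifics are out of scope here):

```python
import io

def drain(stream, chunk_size=8192):
    # Read and discard until EOF; return the total number of bytes seen.
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return total
        total += len(chunk)

payload = bytes(100_000)  # stand-in for a response body
drained = drain(io.BytesIO(payload))
```

Worth noting on the EOF question that comes up next: Okio's BufferedSource.skip(n) is documented to throw EOFException when the source is exhausted before n bytes, so hitting it with Long.MAX_VALUE is expected, not an error in the caller's code.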
c
QQ: response.body?.source()?.skip(Long.MAX_VALUE) gives me an EOF exception. Any other ideas?
Surrounded it with a try/catch just for the hell of it. Seems like that did the trick? About 5 seconds for that. Interesting, because Charles says the duration of the call was 1.39 seconds. So I guess that time is just for getting the headers.
Just another ping on this.
response.body?.source()?.skip(Long.MAX_VALUE)
gives an EOF exception. Am I good with just catching that if I wanna do some really crude "benchmarking", or would you recommend some other technique?