Why does Kotlinx Serialization protobufs target proto 2 not kotlinlang #serialization

# serialization

spierce7

01/15/2022, 4:13 AM

Why does Kotlinx Serialization protobufs target proto 2, not proto 3?

ephemient

01/15/2022, 4:20 AM

proto 2 and 3 are equivalent on the wire

01/15/2022, 4:26 AM

The serialization format is equivalent, but the semantics are not the same. Proto 3 will not write default values, even ones which are explicitly assigned, whereas proto 2 has a formal concept of absence and will write defaults if explicitly assigned.

ephemient

01/15/2022, 4:27 AM

kotlinx.serialization.protobuf also excludes default values by default, unless `ProtoBuf { encodeDefaults = true }` is used
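A minimal sketch of that behaviour, assuming the `kotlinx-serialization-protobuf` artifact and the serialization compiler plugin are set up. `Settings` and its properties are hypothetical names used only for illustration.

```kotlin
import kotlinx.serialization.ExperimentalSerializationApi
import kotlinx.serialization.Serializable
import kotlinx.serialization.encodeToByteArray
import kotlinx.serialization.protobuf.ProtoBuf
import kotlinx.serialization.protobuf.ProtoNumber

// Hypothetical message type, not taken from the discussion.
@OptIn(ExperimentalSerializationApi::class)
@Serializable
data class Settings(
    @ProtoNumber(1) val name: String = "",
    @ProtoNumber(2) val retries: Int = 3,
)

@OptIn(ExperimentalSerializationApi::class)
fun main() {
    val value = Settings() // every property is left at its declared default

    // Default format: properties equal to their declared defaults are skipped.
    val skipped = ProtoBuf.encodeToByteArray(value)

    // Custom format: defaults are written to the output as well.
    val written = ProtoBuf { encodeDefaults = true }.encodeToByteArray(value)

    println("default config: ${skipped.size} bytes")
    println("encodeDefaults = true: ${written.size} bytes")
}
```

Note that "default" here means the Kotlin default declared on the property (such as `retries = 3`), which is not necessarily the protobuf zero value.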

ephemient

01/15/2022, 4:29 AM

for compatibility, the usual protobuf things to think about: field names don't matter, but `@ProtoNumber` does matter. once a field is serialized as int/sint/fixed type, that can't be changed.
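To see why the int/sint/fixed choice cannot be changed later, here is a small standalone sketch (plain Kotlin, no serialization library) that hand-encodes the value -1 the way the protobuf wire format does for `int32`, `sint32`, and `fixed32`. The helper names are made up for this illustration.

```kotlin
// Base-128 varint: 7 payload bits per byte, lowest group first,
// high bit set on every byte except the last.
fun varint(value: Long): List<Int> {
    val out = mutableListOf<Int>()
    var rest = value
    while ((rest and 0x7FL.inv()) != 0L) {
        out += ((rest and 0x7FL) or 0x80L).toInt()
        rest = rest ushr 7
    }
    out += rest.toInt()
    return out
}

// ZigZag mapping used by sint32: 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ...
fun zigZag32(n: Int): Int = (n shl 1) xor (n shr 31)

// fixed32: always four bytes, little-endian.
fun fixed32(n: Int): List<Int> = List(4) { i -> (n ushr (8 * i)) and 0xFF }

fun hex(bytes: List<Int>): String = bytes.joinToString(" ") { "%02x".format(it) }

fun main() {
    val n = -1
    // int32 sign-extends negative values to 64 bits before varint encoding.
    println("int32   : " + hex(varint(n.toLong())))
    // sint32 applies ZigZag first, then treats the result as unsigned.
    println("sint32  : " + hex(varint(zigZag32(n).toLong() and 0xFFFFFFFFL)))
    println("fixed32 : " + hex(fixed32(n)))
}
```

The same number comes out as ten bytes, one byte, and four bytes respectively, and `fixed32` also uses a different wire type than the two varint encodings, so a reader expecting one encoding will misread data written with another.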

01/15/2022, 4:57 AM

field names tragically matter if you also target the protobuf json format

spierce7

01/15/2022, 5:17 AM

So basically I need to make sure that I never remove a field, or change the order of a property in an `@Serializable` class?

spierce7

01/15/2022, 5:17 AM

Or change the type of a property once it’s been set

01/15/2022, 5:22 AM

Order, thankfully, is actually always irrelevant! (But that's also true of JSON)

01/15/2022, 5:22 AM

You can sometimes change the type but not significantly. A few are byte-compatible on the wire.

ephemient

01/15/2022, 5:23 AM

order is irrelevant if you set `@ProtoNumber`; otherwise changing order changes the auto-generated field IDs. (you should use `ProtoNumber` instead of relying on order)
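A hedged sketch of this point, assuming the `kotlinx-serialization-protobuf` artifact and compiler plugin are available. `UserV1` and `UserV2` are hypothetical stand-ins for an older and a newer version of the same class.

```kotlin
import kotlinx.serialization.ExperimentalSerializationApi
import kotlinx.serialization.Serializable
import kotlinx.serialization.decodeFromByteArray
import kotlinx.serialization.encodeToByteArray
import kotlinx.serialization.protobuf.ProtoBuf
import kotlinx.serialization.protobuf.ProtoNumber

// Hypothetical "before" version of a message.
@OptIn(ExperimentalSerializationApi::class)
@Serializable
data class UserV1(
    @ProtoNumber(1) val id: Int,
    @ProtoNumber(2) val name: String,
)

// Hypothetical "after" version: properties reordered and one renamed,
// but every field number is unchanged.
@OptIn(ExperimentalSerializationApi::class)
@Serializable
data class UserV2(
    @ProtoNumber(2) val displayName: String,
    @ProtoNumber(1) val id: Int,
)

@OptIn(ExperimentalSerializationApi::class)
fun main() {
    val bytes = ProtoBuf.encodeToByteArray(UserV1(id = 7, name = "sam"))

    // Fields are matched by number, so the reordered class still reads the data.
    val decoded = ProtoBuf.decodeFromByteArray<UserV2>(bytes)
    println(decoded)
}
```

Without the explicit annotations the numbers would be derived from declaration order, and the same reordering would silently swap the two fields.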

🙏 1

ephemient

01/15/2022, 5:24 AM

oh, and kotlinx.serialization polymorphism uses SerialName, not protobuf's oneof, so those should remain stable as well

🙏 1

spierce7

01/15/2022, 5:38 AM

"Order, thankfully, is actually always irrelevant!"

"order is irrelevant if you set `@ProtoNumber`"

@jw So you are setting `@ProtoNumber` for each field?

01/15/2022, 5:38 AM

We use Wire, not kotlinx.serialization

spierce7

01/15/2022, 5:38 AM

with wire - order is irrelevant?

01/15/2022, 5:39 AM

Yes. As well as Google protobuf and kotlinx.serialization with `@ProtoNumber`

👍 1

Paul Woitaschek

01/15/2022, 10:25 AM

We have several tests for protobuf with kotlinx where we have a binary representation of an older class and verify how it gets deserialized to a newer format
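One way such a test could look, as a sketch only. `Account` is a hypothetical class, and the stored bytes are hand-encoded here following the protobuf wire format (field 1 = 7, field 2 = "sam"); in a real test they would be captured from the older version of the class.

```kotlin
import kotlinx.serialization.ExperimentalSerializationApi
import kotlinx.serialization.Serializable
import kotlinx.serialization.decodeFromByteArray
import kotlinx.serialization.protobuf.ProtoBuf
import kotlinx.serialization.protobuf.ProtoNumber

// Hypothetical current version of the class. Field 3 was added later,
// so it needs a default for old payloads that do not contain it.
@OptIn(ExperimentalSerializationApi::class)
@Serializable
data class Account(
    @ProtoNumber(1) val id: Int,
    @ProtoNumber(2) val name: String,
    @ProtoNumber(3) val email: String = "",
)

// Turns a hex string such as "0807" into the bytes 0x08, 0x07.
fun hexToBytes(hex: String): ByteArray =
    hex.chunked(2).map { it.toInt(16).toByte() }.toByteArray()

@OptIn(ExperimentalSerializationApi::class)
fun main() {
    // 08 07          -> field 1, varint, value 7
    // 12 03 73 61 6d -> field 2, length-delimited, "sam"
    val golden = hexToBytes("0807120373616d")

    val decoded = ProtoBuf.decodeFromByteArray<Account>(golden)

    // The old payload must still decode, with the new field at its default.
    check(decoded == Account(id = 7, name = "sam"))
    println(decoded)
}
```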

spierce7

01/15/2022, 5:37 PM

Is protobuf worth the extra effort?

Paul Woitaschek

01/15/2022, 5:42 PM

Depends on what you are doing. We have an internal BI solution, and protobuf is hardcore small

01/16/2022, 12:33 AM

Kotlinx.serialization protobuf is rarely worth it. JSON + Gzip matches size and doesn't really incur drastic CPU overhead. Protobuf has all kinds of dumb things that make it waste CPU anyway.

01/16/2022, 12:34 AM

The biggest advantage is sharing a stable schema across platforms, in which case you're better off using Wire or Google protobuf anyway because the types can be autogenerated from that schema

ephemient

01/16/2022, 12:42 AM

you can experimentally generate a protobuf descriptor from the kotlinx.serialization descriptor, but I agree that if you're gonna use the protobuf protocol you'd be better off writing the schema in protobuf and using wire- or protobuf-generated data classes instead of the other way around
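A sketch of the experimental schema generation mentioned here, assuming the `ProtoBufSchemaGenerator` object from `kotlinx.serialization.protobuf.schema` is available in the library version in use. `Event` is a hypothetical class; the exact text produced may differ between versions, so no output is shown.

```kotlin
import kotlinx.serialization.ExperimentalSerializationApi
import kotlinx.serialization.Serializable
import kotlinx.serialization.protobuf.ProtoNumber
import kotlinx.serialization.protobuf.schema.ProtoBufSchemaGenerator

// Hypothetical message type used only to have a descriptor to feed in.
@OptIn(ExperimentalSerializationApi::class)
@Serializable
data class Event(
    @ProtoNumber(1) val id: Int,
    @ProtoNumber(2) val label: String,
)

@OptIn(ExperimentalSerializationApi::class)
fun main() {
    // Builds .proto text from the class's serial descriptor.
    val schema = ProtoBufSchemaGenerator.generateSchemaText(Event.serializer().descriptor)
    println(schema)
}
```

This goes from Kotlin classes to a schema, which is the reverse of the schema-first workflow recommended in the message above.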

spierce7

01/16/2022, 4:59 AM

thank you guys for your insight

Paul Woitaschek

01/16/2022, 8:18 AM

We are doing proto + gzip. But never measured vs json + gzip

Dariusz Kuc

01/17/2022, 3:28 AM

"JSON + Gzip matches size and doesn't really incur drastic CPU overhead."

From our experience, even with small messages serialization of protos was much faster than JSON+gzip. With growing message sizes, the perf difference was growing as well (in favor of protos).

01/17/2022, 3:46 AM

With what library? Are you streaming or buffered? When you're streaming it's mostly irrelevant as the cost is amortized over the cost of the network which is easily 10-100x slower.

Dariusz Kuc

01/17/2022, 2:15 PM

no streaming, just plain RPC calls using gRPC and Jackson for microservice-to-microservice communication deployed to AWS
