# serialization
s
Why does Kotlinx Serialization protobufs target proto 2, not proto 3?
e
proto 2 and 3 are equivalent on the wire
j
The serialization format is equivalent, but the semantics are not the same. Proto 3 will not write default values, even ones that are explicitly assigned, whereas proto 2 has a formal concept of absence and will write defaults if explicitly assigned.
e
kotlinx.serialization.protobuf also excludes default values by default, unless `ProtoBuf { encodeDefaults = true }` is used
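A minimal sketch of the `encodeDefaults` switch mentioned above, assuming the kotlinx-serialization-protobuf artifact and the serialization compiler plugin are on the classpath. The `Settings` class and its field are hypothetical, made up only for illustration.

```kotlin
@file:OptIn(ExperimentalSerializationApi::class)

import kotlinx.serialization.ExperimentalSerializationApi
import kotlinx.serialization.Serializable
import kotlinx.serialization.encodeToByteArray
import kotlinx.serialization.protobuf.ProtoBuf
import kotlinx.serialization.protobuf.ProtoNumber

// Hypothetical message with a single defaulted property.
@Serializable
data class Settings(
    @ProtoNumber(1) val retries: Int = 3,
)

// Default format instance: properties equal to their default are skipped.
fun encodeWithoutDefaults(value: Settings): ByteArray =
    ProtoBuf.encodeToByteArray(value)

// Configured instance: defaults are written like any other value.
fun encodeWithDefaults(value: Settings): ByteArray =
    ProtoBuf { encodeDefaults = true }.encodeToByteArray(value)

fun main() {
    println(encodeWithoutDefaults(Settings()).size) // 0: nothing is written
    println(encodeWithDefaults(Settings()).size)    // 2: key byte + varint 3
}
```

Note that the skip is decided by comparing the value with the declared default, so `Settings(retries = 3)` is omitted as well.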
for compatibility, the usual protobuf things to think about: field names don't matter, but `@ProtoNumber` does matter. once a field is serialized as an int/sint/fixed type, that can't be changed.
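To make the two points above concrete, here is a rough stdlib-only sketch of the protobuf wire encoding, written from the public encoding rules rather than taken from any library. All helper names (`varint`, `key`, `int32`, `sint32`, `fixed32`, `hex`) are hypothetical. It shows that a field is identified only by its number and wire type, and that the same value encodes to different bytes as int, sint, or fixed.

```kotlin
/** Encodes an unsigned value as a base-128 varint, low 7-bit group first. */
fun varint(value: ULong): List<Int> {
    val out = mutableListOf<Int>()
    var v = value
    while (v >= 0x80uL) {
        out += ((v and 0x7FuL) or 0x80uL).toInt() // set continuation bit
        v = v shr 7
    }
    out += v.toInt()
    return out
}

/** Field key: (field number shl 3) or wire type. The field name never appears. */
fun key(fieldNumber: Int, wireType: Int): List<Int> =
    varint(((fieldNumber shl 3) or wireType).toULong())

/** int32: sign-extended to 64 bits, then varint (wire type 0). */
fun int32(n: Int): List<Int> = varint(n.toLong().toULong())

/** sint32: zigzag-mapped so small negatives stay small, then varint (wire type 0). */
fun sint32(n: Int): List<Int> = varint(((n shl 1) xor (n shr 31)).toUInt().toULong())

/** fixed32: always four little-endian bytes (wire type 5). */
fun fixed32(n: Int): List<Int> = (0 until 4).map { (n ushr (8 * it)) and 0xFF }

fun hex(bytes: List<Int>): String = bytes.joinToString(" ") { "%02x".format(it) }

fun main() {
    println(hex(key(1, 0) + int32(1)))   // 08 01
    println(hex(key(1, 0) + sint32(1)))  // 08 02
    println(hex(key(1, 5) + fixed32(1))) // 0d 01 00 00 00
    println(hex(key(1, 0) + int32(-1)))  // 08 ff ff ff ff ff ff ff ff ff 01
    println(hex(key(1, 0) + sint32(-1))) // 08 01
}
```

The bytes `08 01` mean 1 when read as an int field but -1 when read as a sint field, which is why switching between those types after data has been written is not safe.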
j
field names tragically matter if you also target the protobuf json format
s
So basically I need to make sure that I never remove a field, or change the order of a property in an `@Serializable` class?
Or change the type of a property once it's been set?
j
Order, thankfully, is actually always irrelevant! (But that's also true of JSON)
You can sometimes change the type but not significantly. A few are byte-compatible on the wire.
e
order is irrelevant if you set `@ProtoNumber`; otherwise changing order changes the auto-generated field IDs. (you should use `@ProtoNumber` instead of relying on order)
oh, and kotlinx.serialization polymorphism uses SerialName, not protobuf's oneof, so those should remain stable as well
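The `@ProtoNumber` advice above can be sketched as follows, assuming kotlinx-serialization-protobuf and the compiler plugin are available. `UserV1` and `UserV2` are hypothetical classes invented for the example. They share field numbers but differ in property names and declaration order.

```kotlin
@file:OptIn(ExperimentalSerializationApi::class)

import kotlinx.serialization.ExperimentalSerializationApi
import kotlinx.serialization.Serializable
import kotlinx.serialization.decodeFromByteArray
import kotlinx.serialization.encodeToByteArray
import kotlinx.serialization.protobuf.ProtoBuf
import kotlinx.serialization.protobuf.ProtoNumber

// Hypothetical original shape of the message.
@Serializable
data class UserV1(
    @ProtoNumber(1) val id: Int,
    @ProtoNumber(2) val name: String,
)

// Hypothetical later shape: properties renamed and reordered, numbers unchanged.
@Serializable
data class UserV2(
    @ProtoNumber(2) val displayName: String,
    @ProtoNumber(1) val userId: Int,
)

// Encodes with the old class and decodes with the new one.
fun roundTrip(old: UserV1): UserV2 =
    ProtoBuf.decodeFromByteArray<UserV2>(ProtoBuf.encodeToByteArray(old))

fun main() {
    println(roundTrip(UserV1(id = 7, name = "ada"))) // UserV2(displayName=ada, userId=7)
}
```

Without the explicit annotations the numbers are derived from declaration order, so the same reordering would silently map values onto the wrong properties or fail to decode.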
s
Order, thankfully, is actually always irrelevant!
order is irrelevant if you set `@ProtoNumber`
@jw So you are setting `@ProtoNumber` for each field?
j
We use Wire, not kotlinx.serialization
s
With Wire, order is irrelevant?
j
Yes. As well as Google protobuf and kotlinx.serialization with `@ProtoNumber`
p
We have several tests for protobuf with kotlinx where we have a binary representation of an older class and verify how it gets deserialized to a newer format
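One way such a compatibility test might look, as a sketch only and not the poster's actual test code. It assumes kotlinx-serialization-protobuf is on the classpath. The `Account` class and the stored bytes are made up for illustration: the bytes are a hand-written encoding of field 1 = 7 and field 2 = "ada", standing in for a payload captured from an older version of the class that had no `email` field.

```kotlin
@file:OptIn(ExperimentalSerializationApi::class)

import kotlinx.serialization.ExperimentalSerializationApi
import kotlinx.serialization.Serializable
import kotlinx.serialization.decodeFromByteArray
import kotlinx.serialization.protobuf.ProtoBuf
import kotlinx.serialization.protobuf.ProtoNumber

// Hypothetical current version of the class; `email` was added later with a default.
@Serializable
data class Account(
    @ProtoNumber(1) val id: Int,
    @ProtoNumber(2) val name: String,
    @ProtoNumber(3) val email: String? = null,
)

// Frozen payload from the older class: 08 07 (id = 7), 12 03 61 64 61 (name = "ada").
val legacyAccountBytes: ByteArray =
    byteArrayOf(0x08, 0x07, 0x12, 0x03, 0x61, 0x64, 0x61)

// Decodes the frozen payload with the current class definition.
fun decodeLegacyAccount(): Account =
    ProtoBuf.decodeFromByteArray<Account>(legacyAccountBytes)

fun main() {
    println(decodeLegacyAccount()) // Account(id=7, name=ada, email=null)
}
```

Keeping the old payload as a literal in the test means a later change to a field number or type fails the test instead of failing in production.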
s
Is protobuf worth the extra effort?
p
Depends on what you are doing. We have an internal BI solution and protobuf is hardcore small
j
Kotlinx.serialization protobuf is rarely worth it. JSON + Gzip matches size and doesn't really incur drastic CPU overhead. Protobuf has all kinds of dumb things that make it waste CPU anyway.
The biggest advantage is sharing a stable schema across platforms in which case you're better off using Wire or Google protobuf anyway because the types can be autogenerated from that schema
e
you can experimentally generate a protobuf descriptor from the kotlinx.serialization descriptor, but I agree that if you're gonna use the protobuf protocol you'd be better off writing the schema in protobuf and using wire- or protobuf-generated data classes instead of the other way around
s
thank you guys for your insight
p
We are doing proto + gzip. But never measured vs json + gzip
d
JSON + Gzip matches size and doesn't really incur drastic CPU overhead.
From our experience, even with small messages serialization of protos was much faster than JSON+gzip. With growing message sizes, the perf difference was growing as well (in favor of protos).
j
With what library? Are you streaming or buffered? When you're streaming it's mostly irrelevant as the cost is amortized over the cost of the network which is easily 10-100x slower.
d
no streaming, just plain RPC calls using gRPC and Jackson for microservice-to-microservice communication deployed to AWS