As we saw in a previous post, microservices communicate with UI clients and with other microservices. We want low latency and high throughput, so the size of the message definitely makes a difference. But beyond size, we need to make sure that a message sent by a sender is understood by the receiver. What happens when the sender changes the structure of the message? How can the receiver still understand it?
At this point I want to focus on Apache Avro. Avro is a data serialization system in which data is sent along with its description (the schema). This is fundamentally different from other serialization systems like Google Protocol Buffers or Thrift, where the schema is needed at compile time to generate code but is not shipped with the data. With Avro, no static code compilation is required beforehand (it is optional). An Avro schema, itself written in JSON, can contain primitive and complex types (records, enums, arrays, maps, unions and fixed), and the data can be encoded in binary or in JSON (which makes sense for webapps).
Avro can be used with code generation, which adds an extra step: the classes are generated from the schema, and this needs to happen on the producer service and on the consumers as well. Or it can be used without code generation. We will look into both options.
Code generation
This involves generating the message class beforehand from the schema, using the Avro Maven plugin at build time in the generate-sources phase. The generated class will look something like an ItemMessage class.
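As a sketch, an instance of the generated class could be built like this (the code and reservePrice fields, and their types, are assumptions based on the schema used later in this post; the builder methods follow Avro's generated naming convention):
// Build an ItemMessage with the builder Avro generates alongside the class.
// The exact fields and types are assumptions for illustration.
ItemMessage item = ItemMessage.newBuilder()
        .setCode("ABC-123")
        .setReservePrice(100.0)
        .build();
We serialize it and write it to an output stream which we will send as the response.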
DatumWriter<ItemMessage> datumWriter = new SpecificDatumWriter<>(ItemMessage.class);
DataFileWriter<ItemMessage> dataFileWriter = new DataFileWriter<>(datumWriter);
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
// create() writes the header (including the schema), append() writes the record
dataFileWriter.create(item.getSchema(), outputStream);
dataFileWriter.append(item);
dataFileWriter.close();
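For completeness, the resulting bytes could be exposed over HTTP roughly like this (a hypothetical Spring endpoint; the actual controller, route and helper method in the project may differ):
// Hypothetical controller method; serializeItem(...) stands for the serialization code above.
@GetMapping(value = "/items/{code}", produces = MediaType.APPLICATION_OCTET_STREAM_VALUE)
public ResponseEntity<ByteArrayResource> getItem(@PathVariable String code) throws IOException {
    byte[] avroBytes = serializeItem(code);
    return ResponseEntity.ok(new ByteArrayResource(avroBytes));
}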
On the consumer side we need to deserialize the response and read the message (only one message in my case).
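For context, the response used below could come from a plain RestTemplate call (hypothetical client code; the real service URL will differ):
// Hypothetical call that produces the response consumed below.
ResponseEntity<ByteArrayResource> response =
        restTemplate.getForEntity("http://item-service/items/{code}", ByteArrayResource.class, code);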
ByteArrayResource byteResource = response.getBody();
DatumReader<ItemMessage> datumReader = new SpecificDatumReader<>(ItemMessage.class);
// the schema travels with the data, so the stream knows how to decode the records
DataFileStream<ItemMessage> streamReader = new DataFileStream<>(byteResource.getInputStream(), datumReader);
ItemMessage item = new ItemMessage();
if (streamReader.hasNext()) {
    return streamReader.next(item);
}
Without code generation
Here we don’t generate the class at build time; instead we use a generic record. First we parse the schema
schema = new Schema.Parser().parse(this.getClass().getResourceAsStream("/avro/item.avsc"));
and then we fill it with data.
GenericRecord itemMessage = new GenericData.Record(schema);
itemMessage.put("code", item.getCode());
itemMessage.put("reservePrice", item.getReservePrice());
The serialization part is similar to the previous one,
DatumWriter<GenericRecord> datumWriter = new GenericDatumWriter<>(schema);
DataFileWriter<GenericRecord> dataFileWriter = new DataFileWriter<>(datumWriter);
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
// append the GenericRecord we filled above
dataFileWriter.create(schema, outputStream);
dataFileWriter.append(itemMessage);
dataFileWriter.close();
as is the deserialization:
ByteArrayResource byteResource = response.getBody();
DatumReader<GenericRecord> datumReader = new GenericDatumReader<>(schema);
DataFileStream<GenericRecord> streamReader = new DataFileStream<>(byteResource.getInputStream(), datumReader);
GenericRecord item = null;
if (streamReader.hasNext()) {
    return streamReader.next(item);
}
Since the consumer should be the one who dictates the structure of the message, the schema should be imported from the consumer. The consumer is the driver, as we saw in Consumer Driven Contract Testing.
I won’t focus on benchmarking, as others have already done it; this is another set of results. In my opinion Avro is a good choice because it lets the message structure evolve easily.
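As a quick illustration of that evolution (a sketch using the assumed fields from above): the consumer can add a field with a default value to its reader schema and still read data written with the old schema, because the writer’s schema travels with the data and Avro resolves the two.
// Reader schema with an extra "currency" field that has a default value (assumed fields).
Schema readerSchema = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"ItemMessage\",\"namespace\":\"com.example.avro\","
      + "\"fields\":[{\"name\":\"code\",\"type\":\"string\"},"
      + "{\"name\":\"reservePrice\",\"type\":\"double\"},"
      + "{\"name\":\"currency\",\"type\":\"string\",\"default\":\"EUR\"}]}");
// DataFileStream reads the writer's schema from the stream and resolves it against readerSchema,
// so the missing field is filled with its default.
DatumReader<GenericRecord> datumReader = new GenericDatumReader<>(readerSchema);
DataFileStream<GenericRecord> streamReader = new DataFileStream<>(byteResource.getInputStream(), datumReader);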
You can even create your own communication protocol, though this is not in the scope of this article.