In Avro, is there a difference between calling toString() on a GenericRecord and using JsonEncoder?

I have some Avro data in Java, held as GenericRecord instances, that I want to convert to JSON. I noticed that there are two ways to do it: one is to use JsonEncoder, and the other is to simply call toString() on the GenericRecord.

After a little experimentation, both approaches seem to give equivalent results, and the resulting JSON string can be converted back to Avro using JsonDecoder either way. So my question is:

Is there any functional difference between the two, and is there any reason to use one over the other?

I am using Avro 1.7.7.



1 answer


After some additional testing and a look at Avro's source, it seems that the toString() method for GenericRecord is implemented by GenericData.Record.toString(), which calls GenericData.toString(). The Javadoc on this method states that it should render a valid JSON representation of the record it is given.

However, it differs from the JsonEncoder implementation: JsonEncoder uses the Jackson libraries and follows the Avro schema closely, whereas GenericRecord.toString() simply builds the JSON text with a StringBuilder and pays much less attention to the schema.
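To make the two paths concrete, here is a minimal sketch that serializes the same record both ways. It assumes the Avro library is on the classpath; the User schema, the class name AvroToJson, and the helper names are hypothetical and only for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.EncoderFactory;
import org.apache.avro.io.JsonEncoder;

public class AvroToJson {
    // Hypothetical schema with only primitive fields (no unions).
    static final Schema SCHEMA = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"User\",\"fields\":["
        + "{\"name\":\"name\",\"type\":\"string\"},"
        + "{\"name\":\"age\",\"type\":\"int\"}]}");

    static GenericRecord sampleRecord() {
        GenericRecord record = new GenericData.Record(SCHEMA);
        record.put("name", "Ann");
        record.put("age", 30);
        return record;
    }

    // Schema-driven path: the writer walks the schema and the
    // JsonEncoder emits JSON for each field it is handed.
    static String encodeWithJsonEncoder(GenericRecord record) throws IOException {
        Schema schema = record.getSchema();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        JsonEncoder encoder = EncoderFactory.get().jsonEncoder(schema, out);
        new GenericDatumWriter<GenericRecord>(schema).write(record, encoder);
        encoder.flush(); // the encoder buffers, so flush before reading the bytes
        return out.toString("UTF-8");
    }

    public static void main(String[] args) throws IOException {
        GenericRecord record = sampleRecord();
        System.out.println("JsonEncoder: " + encodeWithJsonEncoder(record));
        // Convenience path: plain string rendering of the record.
        System.out.println("toString():  " + record.toString());
    }
}
```

For a schema like this one, with only primitive fields, the two outputs should differ only in whitespace, which matches the observation in the question.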



This means that there are cases where calling toString() produces a JSON representation that cannot be deserialized using JsonDecoder, for example when the schema contains unions.
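The union case can be reproduced with a short sketch like the one below. It assumes Avro is on the classpath; the schema, the class name UnionRoundTrip, and the helper methods are hypothetical. Avro's JSON encoding wraps a non-null union value in an object naming its branch type, while toString() writes the bare value, so JsonDecoder is expected to reject the toString() output here.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.avro.AvroTypeException;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;
import org.apache.avro.io.JsonEncoder;

public class UnionRoundTrip {
    // Hypothetical schema: "nickname" is a union of null and string.
    static final Schema SCHEMA = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"User\",\"fields\":["
        + "{\"name\":\"name\",\"type\":\"string\"},"
        + "{\"name\":\"nickname\",\"type\":[\"null\",\"string\"]}]}");

    static GenericRecord sampleRecord() {
        GenericRecord record = new GenericData.Record(SCHEMA);
        record.put("name", "Ann");
        record.put("nickname", "annie"); // non-null branch of the union
        return record;
    }

    static String encodeWithJsonEncoder(GenericRecord record) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        JsonEncoder encoder = EncoderFactory.get().jsonEncoder(SCHEMA, out);
        new GenericDatumWriter<GenericRecord>(SCHEMA).write(record, encoder);
        encoder.flush();
        return out.toString("UTF-8");
    }

    static GenericRecord decode(String json) throws IOException {
        return new GenericDatumReader<GenericRecord>(SCHEMA)
            .read(null, DecoderFactory.get().jsonDecoder(SCHEMA, json));
    }

    // Returns true if JsonDecoder accepts the JSON text for SCHEMA.
    static boolean canDecode(String json) {
        try {
            decode(json);
            return true;
        } catch (AvroTypeException | IOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        GenericRecord record = sampleRecord();
        String viaEncoder = encodeWithJsonEncoder(record); // union written as {"string":"annie"}
        String viaToString = record.toString();            // union written as a bare string
        System.out.println(viaEncoder + " decodes: " + canDecode(viaEncoder));
        System.out.println(viaToString + " decodes: " + canDecode(viaToString));
    }
}
```

If nickname were null, both forms would write a plain JSON null and both would decode, which may explain why a quick experiment can make the two approaches look interchangeable.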

Based on this, it looks like toString() is a simple and convenient way to get a human-readable representation of a record, but it is not a reliable way to serialize data according to a schema.
