When I started my journey with Apache Kafka, JSON was already everywhere. From JavaScript UIs, through API calls, and even databases - it became a lingua franca of data exchange. For many of the Kafka-adopting organizations I worked with, combining the two was a no-brainer.

Yet, this post is not really about Kafka itself. Using Kafka made me less enthusiastic about JSON and more focused on data evolution aspects. Here, I'd like to share some of my observations with you.

As always, context matters. Unless specified explicitly, I'll refer to one specific use case: exchanging structured messages between services.

The one thing you have to know about Kafka is that messages published to it are not removed once they have been read. In fact, the broker doesn't care about that aspect at all. It's the retention period that defines how long (in terms of time or size) messages from a given topic remain available for consumption. Furthermore, defining topics with unlimited retention (storing messages forever) is not unusual either. As a result, messages outlive the specific consumer versions that were current when they were written.

This property has consequences for altering message structure. Since we have to deal with both existing and future consumers, we also have to consider compatibility. To be fair, this is not a brand-new problem: when using synchronous communication (like REST) we may run into similar issues. With Kafka, though, updating all the existing consumers is not always enough. Newer consumers introduced later may still try to read the log from the beginning and fail on the incompatible messages. This makes every structure-related mistake more painful. Before you even realize it, incompatible messages are flooding your Kafka cluster, and if you're already in production, simple solutions like wiping all the data and starting from scratch are no longer an option. Scenarios like reading new messages with an old schema, or old messages with a new schema, are nothing unusual. In fact, the situation gets even more interesting when some producers publish messages using an older schema while others already use a newer one (a small sketch at the end of this post shows this failure mode with Jackson).

The hell of flexible notation

Flexibility is usually desired. Yet, it can also create more problems than it solves. All the projects using JSON as a data exchange format (not only with Kafka) have something in common: there's a "standard" set of questions that must be answered sooner or later, like:

- How should we encode decimal numbers to prevent accuracy loss?
- Should we encode dates as timestamps or as ISO-compatible strings?
- Should we send null for every empty field, or simply omit it?

Although there are many guidelines available, it still feels like a never-ending series of discussions and debugging (one way to pin these choices down in Jackson is sketched at the end of this post). Don't we have more important problems to focus on? Questions like "does '7' mean June or July?" may sound funny, but it's not uncommon to end up with such a mess in a real-life project.

JSON really shines when data flexibility is required, thanks to its schema-less nature. It allows us to keep accepting data without worrying too much about what's inside. While this works well in certain use cases (including Kafka!), it can become painful in others. When exchanging structured messages between services, having a schema is usually a reasonable choice. JSON Schema was born to fill this gap by introducing schemas for JSON documents.
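To make that idea concrete, here is a minimal sketch of validating a message against a JSON Schema from Java (15+, for the text block). The library choice (networknt's json-schema-validator), the field names, and the schema itself are my assumptions for illustration - the post doesn't prescribe any of them:

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.networknt.schema.JsonSchema;
import com.networknt.schema.JsonSchemaFactory;
import com.networknt.schema.SpecVersion;
import com.networknt.schema.ValidationMessage;

import java.util.Set;

public class SchemaCheck {
    public static void main(String[] args) throws Exception {
        // A hypothetical order-message schema (draft-07). Note that "amount"
        // is deliberately a string - one possible answer to the decimal question above.
        String schemaJson = """
            {
              "$schema": "http://json-schema.org/draft-07/schema#",
              "type": "object",
              "required": ["orderId", "amount"],
              "properties": {
                "orderId":  { "type": "string" },
                "amount":   { "type": "string" },
                "placedAt": { "type": "string" }
              }
            }
            """;

        JsonSchema schema = JsonSchemaFactory
                .getInstance(SpecVersion.VersionFlag.V7)
                .getSchema(schemaJson);

        // This message sends "amount" as a number, violating the schema
        JsonNode message = new ObjectMapper()
                .readTree("{\"orderId\": \"42\", \"amount\": 10.5}");

        Set<ValidationMessage> errors = schema.validate(message);
        errors.forEach(e -> System.out.println(e.getMessage()));
    }
}
```

The point is not this particular library but the contract: once the schema is explicit, a malformed message fails fast at the boundary instead of deep inside some consumer.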
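Going back to the "standard questions" listed earlier, here is one possible set of answers expressed as Jackson configuration: decimals as strings, dates as ISO-8601, nulls omitted. The Payment class and its values are made up for the example; treat this as a sketch of one convention, not the convention:

```java
import com.fasterxml.jackson.annotation.JsonFormat;
import com.fasterxml.jackson.annotation.JsonInclude;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.SerializationFeature;
import com.fasterxml.jackson.datatype.jsr310.JavaTimeModule;

import java.math.BigDecimal;
import java.time.Instant;

public class EncodingDefaults {

    static class Payment {
        // Encode the decimal as a JSON string to rule out accuracy loss in clients
        @JsonFormat(shape = JsonFormat.Shape.STRING)
        public BigDecimal amount = new BigDecimal("19.99");
        public Instant placedAt = Instant.parse("2023-08-13T10:15:30Z");
        public String note = null; // will be omitted rather than sent as null
    }

    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper()
                .registerModule(new JavaTimeModule())                     // java.time support
                .disable(SerializationFeature.WRITE_DATES_AS_TIMESTAMPS)  // ISO-8601 strings, not numbers
                .setSerializationInclusion(JsonInclude.Include.NON_NULL); // drop null fields

        System.out.println(mapper.writeValueAsString(new Payment()));
        // {"amount":"19.99","placedAt":"2023-08-13T10:15:30Z"}
    }
}
```

Whatever choices you make, the real win is making them once, centrally, instead of re-debating them in every producer and consumer.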
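Finally, the promised sketch of the compatibility trap: a consumer compiled against an old message shape meets a newer message carrying an extra field. The class and field names are invented, but the failure mode is Jackson's default behavior:

```java
import com.fasterxml.jackson.databind.DeserializationFeature;
import com.fasterxml.jackson.databind.ObjectMapper;

public class OldConsumer {

    // The consumer's "old schema": it knows nothing about fields added later
    static class OrderV1 {
        public String orderId;
    }

    public static void main(String[] args) throws Exception {
        // A "new" message already sitting in the topic, produced with an extra field
        String newMessage = "{\"orderId\":\"42\",\"currency\":\"EUR\"}";

        // A strict mapper fails on the unknown field (Jackson's default)...
        ObjectMapper strict = new ObjectMapper();
        try {
            strict.readValue(newMessage, OrderV1.class);
        } catch (Exception e) {
            System.out.println("strict consumer failed: " + e.getMessage());
        }

        // ...while a forward-compatible one simply ignores what it doesn't know
        ObjectMapper lenient = new ObjectMapper()
                .disable(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES);
        OrderV1 order = lenient.readValue(newMessage, OrderV1.class);
        System.out.println("lenient consumer read orderId=" + order.orderId);
    }
}
```

Tolerating unknown fields is the bare minimum for forward compatibility; removing or renaming a field that old consumers still require is the part no mapper setting can save you from.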