Solved: ClickHouse Avro Support

Reopen #1342 for triage.

18 Answers

βœ”οΈAccepted Answer

+1 for Avro support. This schema-validated format is super useful on large-scale projects involving many separate teams that may not even need to speak to each other to work with the data.

With Kafka and the Avro schema registry, it ensures data is of the expected structure and type, and the full doc (schema) for that structure is viewable and easily understandable by anyone on the project(s). A "contract" is made on a topic once you push Avro to it; anything that doesn't match the schema is rejected. That forces devs to send clear, predictable data that anyone can build on.

The Confluent Kafka Connect module even uses it to let you dump data from a topic directly into an SQL table, and it can do upserts for you and give you exactly-once delivery (which any Kafka consumer can do too); see the sketch below.
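
Just to illustrate (a rough sketch; the connector name, database URL, and registry address are placeholders, not from this thread), a standalone-mode JDBC sink config for that looks roughly like:

# Hypothetical Confluent JDBC sink connector config; all names and URLs are placeholders
name=my-jdbc-sink
connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
topics=my_kafka_topic
connection.url=jdbc:postgresql://db:5432/mydb
insert.mode=upsert
pk.mode=record_value
pk.fields=some_id
auto.create=true
value.converter=io.confluent.connect.avro.AvroConverter
value.converter.schema.registry.url=http://schema-registry:8081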

It's probably obvious, but I'd go with supporting an Avro schema where the top-level type is a record with its fields:

{
  "type": "record",
  "name": "Something",
  "fields": [
    {"name": "some_id", "type": "int", "doc": "A super useful ID of something"},
    {"name": "some_string", "type": "string", "doc": "blabla"}
  ]
}

and just for kicks:

CREATE TABLE my_kafka_data ENGINE = Kafka()
  SETTINGS kafka_topic_list = 'my_kafka_topic', kafka_format = 'AvroRecord'
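
For anyone finding this later: the formats that eventually shipped are Avro and AvroConfluent (the latter pulls schemas from a schema registry). A rough sketch of the usual Kafka-engine-plus-materialized-view setup, with placeholder broker/registry addresses and columns matching the schema above, would look something like:

-- Rough sketch: broker and registry addresses are placeholders.
CREATE TABLE my_kafka_source
(
    some_id Int32,
    some_string String
)
ENGINE = Kafka()
SETTINGS kafka_broker_list = 'kafka:9092',
         kafka_topic_list = 'my_kafka_topic',
         kafka_group_name = 'clickhouse_consumer',
         kafka_format = 'AvroConfluent',
         -- on older versions, set format_avro_schema_registry_url at the user/profile level instead
         format_avro_schema_registry_url = 'http://schema-registry:8081';

-- Persist the stream into a regular MergeTree table via a materialized view.
CREATE TABLE my_kafka_data
(
    some_id Int32,
    some_string String
)
ENGINE = MergeTree()
ORDER BY some_id;

CREATE MATERIALIZED VIEW my_kafka_mv TO my_kafka_data AS
SELECT some_id, some_string
FROM my_kafka_source;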

Other Answers:

This task is assigned to an external contributor, Pavel Kruglov (@Avogar).

In master.
