Overview
This setup streams real-time sensor data from edge devices into Kafka and delivers it to persistent storage. By combining Kafka with Avro serialization, the architecture provides reliable, high-throughput data flow with per-partition ordering for critical telemetry.
Architecture Overview
Note: The “Edge Device” could be anything from ground rover telemetry to a swarm of IoT sensors.
A toned-down version used for the demo is shown below:
Key Components
- Edge Device: Collects sensor data and publishes it in batches (e.g., every 30 seconds or after 100 data points). For this demo, mock data simulates the stream: two temperature sensors each burst 100 messages per second (see the producer sketch after this list).
- Schema: Avro serializes records on the producer side and deserializes them on the consumer side, ensuring consistency and compatibility. The schema is registered in the Schema Registry, which handles versioning and compatibility checks.
- Kafka Cluster: Runs in KRaft (Kafka Raft) mode for metadata management, removing the ZooKeeper dependency. The setup uses three brokers, three partitions, and a replication factor of three for availability and fault tolerance (see the topic-creation sketch after this list).
- Consumers: Two consumers subscribe to the topic under the same group ID, so Kafka distributes the three partitions between them, parallelizing reads and increasing ingestion throughput (see the consumer sketch after this list). Consumers in other groups can also subscribe to the topic to apply transformations and derive insights, such as delta temperature or average temperature.
- Database: Deserialized records are written to a database for long-term storage and analysis.
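
To make the edge-device producer concrete, here is a minimal sketch using the confluent-kafka Python client. The broker and Schema Registry addresses, the topic name `sensor-telemetry`, and the schema fields are illustrative assumptions, not taken from the actual demo:

```python
import time
import random

from confluent_kafka import Producer
from confluent_kafka.schema_registry import SchemaRegistryClient
from confluent_kafka.schema_registry.avro import AvroSerializer
from confluent_kafka.serialization import SerializationContext, MessageField

# Hypothetical Avro schema for a temperature reading; the demo's real schema may differ.
SCHEMA_STR = """
{
  "type": "record",
  "name": "TemperatureReading",
  "namespace": "demo.telemetry",
  "fields": [
    {"name": "sensor_id", "type": "string"},
    {"name": "temperature_c", "type": "double"},
    {"name": "timestamp_ms", "type": "long"}
  ]
}
"""

TOPIC = "sensor-telemetry"  # assumed topic name

schema_registry = SchemaRegistryClient({"url": "http://localhost:8081"})
avro_serializer = AvroSerializer(schema_registry, SCHEMA_STR)
producer = Producer({"bootstrap.servers": "localhost:9092"})

while True:
    # Two sensors, each bursting 100 messages per second, matching the demo rate.
    for sensor_id in ("sensor-1", "sensor-2"):
        for _ in range(100):
            reading = {
                "sensor_id": sensor_id,
                "temperature_c": 20.0 + random.uniform(-2.0, 2.0),
                "timestamp_ms": int(time.time() * 1000),
            }
            producer.produce(
                topic=TOPIC,
                key=sensor_id,  # keying by sensor pins each sensor to one partition
                value=avro_serializer(
                    reading, SerializationContext(TOPIC, MessageField.VALUE)
                ),
            )
    producer.flush()  # serve delivery callbacks; block until the burst is delivered
    time.sleep(1)
```

Keying each message by sensor ID routes a given sensor's readings to a single partition, which is what preserves per-sensor ordering end to end.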
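Creating the topic with three partitions and a replication factor of three might look like the following, assuming an AdminClient can reach one of the brokers (the address and topic name are again illustrative):

```python
from confluent_kafka.admin import AdminClient, NewTopic

admin = AdminClient({"bootstrap.servers": "localhost:9092"})

# Three partitions to spread across consumers, replicated to all three brokers.
futures = admin.create_topics(
    [NewTopic("sensor-telemetry", num_partitions=3, replication_factor=3)]
)
for topic, future in futures.items():
    future.result()  # raises if creation failed
    print(f"Created topic {topic}")
```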
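On the consuming side, a sketch of one consumer in the shared group follows; running two copies of this process causes Kafka to rebalance the three partitions between them. The group ID, addresses, and the SQLite sink are assumptions for illustration (the demo's actual database may differ):

```python
import sqlite3

from confluent_kafka import Consumer
from confluent_kafka.schema_registry import SchemaRegistryClient
from confluent_kafka.schema_registry.avro import AvroDeserializer
from confluent_kafka.serialization import SerializationContext, MessageField

schema_registry = SchemaRegistryClient({"url": "http://localhost:8081"})
# No schema passed here: the writer schema is fetched from the registry by ID.
avro_deserializer = AvroDeserializer(schema_registry)

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "telemetry-writers",  # both consumers share this group ID
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["sensor-telemetry"])

# Hypothetical sink: a local SQLite table standing in for the real database.
db = sqlite3.connect("telemetry.db")
db.execute(
    "CREATE TABLE IF NOT EXISTS readings "
    "(sensor_id TEXT, temperature_c REAL, timestamp_ms INTEGER)"
)

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None or msg.error():
            continue
        reading = avro_deserializer(
            msg.value(), SerializationContext(msg.topic(), MessageField.VALUE)
        )
        db.execute(
            "INSERT INTO readings VALUES (?, ?, ?)",
            (reading["sensor_id"], reading["temperature_c"], reading["timestamp_ms"]),
        )
        db.commit()
finally:
    consumer.close()
```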
