Lambda vs Delta Architecture - Realtime Analytics on Delta Lake

Before going into the details of Delta Architecture, let's recap Lambda Architecture first; that will make it easier to appreciate the beauty of Delta Architecture.
Lambda architecture is a popular technique where records are processed by a batch system and a streaming system in parallel. The results are then combined at query time to provide a complete answer. Strict latency requirements for processing both old and recently generated events made this architecture popular. As the diagram below shows, there are 3 layers in this architecture:
1)     Batch layer, which processes the older, accumulated events in batches
2)     Speed layer, where events are processed in real time
3)     Serving layer, where the batch results are combined with the real-time results
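
The three layers can be sketched in a few lines of plain Python. This is only a toy illustration under stated assumptions: the event shape, the `LAST_BATCH_RUN` cutoff and the function names are made up for the example, and real systems would run the batch and speed layers on separate engines.

```python
from collections import Counter


def batch_view(events, cutoff):
    """Batch layer: recompute page counts over all events up to the last batch run."""
    return Counter(e["page"] for e in events if e["ts"] <= cutoff)


def speed_view(events, cutoff):
    """Speed layer: count only the events that arrived after the last batch run."""
    return Counter(e["page"] for e in events if e["ts"] > cutoff)


def serve(batch, speed):
    """Serving layer: merge both views at query time into one answer."""
    return dict(batch + speed)


# Hypothetical click events; "ts" is a simple logical timestamp.
events = [
    {"ts": 1, "page": "home"},
    {"ts": 2, "page": "cart"},
    {"ts": 3, "page": "home"},
    {"ts": 4, "page": "home"},
]
LAST_BATCH_RUN = 2  # assumed cutoff: the batch layer has seen events up to ts=2

result = serve(batch_view(events, LAST_BATCH_RUN), speed_view(events, LAST_BATCH_RUN))
print(result)
```

Note that the same counting logic exists twice, once per layer. That duplication is exactly what has to be developed and operated in two systems.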



The key downside of this architecture is the development and operational overhead of managing two different systems. Keeping them in agreement is complex. Examples of what has to be handled:
1)     Read consistency vs. write consistency
2)     Incremental reads with good throughput
3)     Rollback of bad writes
4)     Replaying history to debug
5)     Handling of late-arriving data, and so on

There have been attempts in the past to unify batch and streaming into a single system, but organizations have not been very successful in those attempts. With the advent of Delta Lake, it now seems possible.
Delta architecture is simply Structured Streaming combined with the power of Delta Lake. The complexities listed above for Lambda architecture are addressed by Delta Lake features such as ACID transactions, time travel, CDC, and rollback. To understand more about Delta, follow the article here.

To summarize, Delta Architecture lets you read data as a stream or as a batch, and write data as a stream or as a batch, in the same way within a single system.
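
To make the idea concrete, here is a minimal sketch of a single table backed by an append-only commit log, written in plain Python. It is a toy model, not the Delta Lake API: the class `ToyVersionedTable` and its methods are hypothetical names invented for this example. It only shows why one versioned table can serve batch reads, incremental reads, time travel and rollback at once.

```python
class ToyVersionedTable:
    """Toy model of a table backed by an append-only commit log (not the Delta Lake API)."""

    def __init__(self):
        self._commits = []  # each commit is a list of rows, added as one unit

    def write(self, rows):
        """Commit a batch or a streaming micro-batch; returns the new version number."""
        self._commits.append(list(rows))
        return len(self._commits) - 1

    def read(self, version=None):
        """Batch read: snapshot of the table as of `version` (latest if omitted)."""
        last = len(self._commits) - 1 if version is None else version
        return [row for commit in self._commits[: last + 1] for row in commit]

    def read_since(self, version):
        """Incremental read: only the rows committed after `version`."""
        return [row for commit in self._commits[version + 1:] for row in commit]

    def rollback(self, version):
        """Undo bad writes by dropping every commit after `version`.

        Simplification: this toy truncates the log instead of keeping history.
        """
        del self._commits[version + 1:]


table = ToyVersionedTable()
v0 = table.write([{"id": 1, "temp": 20.5}])    # batch load
v1 = table.write([{"id": 2, "temp": 21.0}])    # streaming micro-batch
v2 = table.write([{"id": 3, "temp": -999.0}])  # bad write

snapshot_v0 = table.read(version=v0)  # time travel to the first version
new_rows = table.read_since(v0)       # what a streaming reader would pick up next
table.rollback(v1)                    # remove the bad write
after_rollback = table.read()
```

Batch and streaming writers go through the same `write` call, and batch and streaming readers differ only in whether they ask for a snapshot or for the commits after a given version.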

Follow here if you are interested in developing a Modern Data Warehouse solution using Delta Lake.

Continuing from that Modern Data Warehouse design, the diagram below adds a real-time analytics architecture. In this diagram, we ingest media and sensor IoT data using Azure IoT Hub and Azure Event Hubs, then store it in the same Delta Lake where our batch data was ingested. After this, we can use Azure Databricks and Azure Stream Analytics to analyze events in real time. Finally, this adds a real-time flavor to BI reports or apps.
