Below you will find pages that utilize the taxonomy term “Apache-Beam”
dataflow real time + aggregate
A great way to split up your pipeline based on the urgency of results.
Taming the stragglers in Google Cloud Dataflow
I’m currently benchmarking Flink against Google Cloud Dataflow using the same Apache Beam pipeline for quantitative analytics. One thing I’ve noticed with Flink is the tail latency associated with some shards.
Google Cloud Dataflow can optimise away stragglers in large jobs using “Dynamic Work Rebalancing”. As far as I know, Flink is currently unable to perform similar optimisations.
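To make the straggler effect concrete, here is a toy model in plain Python. It is not Dataflow's actual algorithm and it uses no Beam or Flink APIs. The function names, the one-unit-per-tick cost model and the "steal half of the largest backlog" rule are all assumptions made for illustration. It only shows why a single oversized shard dominates job time when work is fixed up front, and how letting idle workers take over unprocessed work shortens the tail.

```python
def static_makespan(shards):
    """Ticks to finish when each worker keeps its own shard to the end.

    `shards` is a list of work-unit counts, one per worker (hypothetical model:
    every work unit costs exactly one tick).
    """
    return max(shards, default=0)


def rebalanced_makespan(shards):
    """Ticks to finish when idle workers take half of the largest backlog.

    This is a simplified stand-in for rebalancing, not a real scheduler.
    """
    remaining = list(shards)
    ticks = 0
    while any(remaining):
        # Idle workers split off half of the unprocessed work of the
        # currently most-loaded worker.
        for i in range(len(remaining)):
            if remaining[i] == 0:
                donor = max(range(len(remaining)), key=remaining.__getitem__)
                if remaining[donor] > 1:
                    stolen = remaining[donor] // 2
                    remaining[donor] -= stolen
                    remaining[i] = stolen
        # Every worker that has work processes one unit this tick.
        remaining = [r - 1 if r > 0 else 0 for r in remaining]
        ticks += 1
    return ticks


if __name__ == "__main__":
    shards = [10, 10, 10, 100]  # one straggler shard
    print("static:", static_makespan(shards))
    print("rebalanced:", rebalanced_makespan(shards))
```

In this model the static run takes as long as the largest shard, while the rebalanced run gets close to total work divided by the number of workers. Real systems also pay for splitting and moving work, which the sketch ignores.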
Pushing the limits of the Google Cloud Platform
This one is better explained with the presentation below. If you want to learn how to run quantitative analytics at scale, it’s well worth a watch.
Our team recently completed a challenging yet rewarding project: building a scalable and portable risk engine using Apache Beam and Google Cloud Dataflow. This project allowed us to delve deeper into distributed computing and explore the practical application of these technologies in the financial domain.