Managing Flink Jobs
The DA Platform is a huge step forward for running Flink at scale. I was lucky enough to see a demo and was really impressed. Far more advanced than what can be achieved with Dataflow at the moment.
How to create an effective SRE culture
Here are some tips on how to create an effective SRE culture:
- Start with the right mindset. SRE treats reliability as everyone’s responsibility, not just the SRE team’s. It is important to create a culture where everyone is empowered to take ownership of reliability and to make decisions that improve the reliability of the systems they work on.
- Embrace failure. Failure is inevitable, so it is important to create a culture where failure is seen as an opportunity to learn and improve. The SRE team should be empowered to experiment and to take risks, knowing that they will not be punished for failure.
- Promote collaboration. SRE is a team sport, so it is important to create a culture where collaboration is encouraged. The SRE team should work closely with other teams, such as development, operations, and security, to ensure that the systems are reliable.
- Automate everything. Automation is essential for SRE. By automating tasks, the SRE team can free up time to focus on more strategic work. It is also important to automate the collection of data so that the SRE team can have a clear understanding of the health of the systems.
- Measure everything. SRE is data-driven, so it is important to measure everything. The SRE team should collect data on the performance of the systems, the number of incidents, and the time it takes to resolve incidents. This data can be used to identify areas where improvements can be made (see the error-budget sketch after this list).
- Celebrate successes. It is important to celebrate successes, both big and small. This will help to keep the SRE team motivated and to create a positive culture.
By following these tips, you can create an effective SRE culture that will help to improve the reliability of your systems.
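To make “measure everything” concrete, here is a minimal sketch of the error-budget arithmetic many SRE teams track. The 99.9% availability target and 30-day window are assumptions for illustration, not values from this post.
Python
# Error-budget arithmetic: how much downtime a 99.9% SLO allows per 30 days.
# The SLO target and window are illustrative assumptions.
slo = 0.999                      # availability target
window_minutes = 30 * 24 * 60    # 30-day window = 43,200 minutes

error_budget_minutes = (1 - slo) * window_minutes
print(f"Error budget: {error_budget_minutes:.1f} minutes per 30 days")
# -> Error budget: 43.2 minutes per 30 days

Burning through the budget faster than expected is the data-driven signal to slow feature work and invest in reliability.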
How to deliver microservices
Here are some tips on how to deliver reliable, high-throughput, low-latency (micro)services:
- Design your services for reliability. This means designing your services to be fault-tolerant, scalable, and resilient. You can do this by using techniques such as redundancy, load balancing, and caching (a retry-with-backoff sketch follows this list).
- Use the right tools and technologies. There are a number of tools and technologies that can help you to deliver reliable, high-throughput, low-latency microservices. These include messaging systems, load balancers, and caching solutions.
- Automate your deployments. Automated deployments can help you to quickly and easily deploy new versions of your microservices. This can help to improve reliability by reducing the risk of human errors.
- Monitor your services. It is important to monitor your services so that you can identify and address problems quickly. You can use a variety of monitoring tools to collect data on the performance of your services.
- Respond to incidents quickly. When incidents occur, it is important to respond quickly to minimize the impact on your users. You should have a process in place for responding to incidents that includes identifying the root cause of the problem and taking steps to fix it.
By following these tips, you can deliver reliable, high-throughput, low-latency microservices.
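As a small illustration of the fault-tolerance point above, here is a minimal retry-with-exponential-backoff sketch in Python. fetch_quote is a hypothetical flaky downstream call invented for the example, not a real API.
Python
import random
import time

def call_with_retries(fn, attempts=3, base_delay=0.1):
    # Retry a flaky call with exponential backoff plus jitter,
    # re-raising the last error once the attempts are exhausted.
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.05))

def fetch_quote():
    # Hypothetical stand-in for a network call that times out half the time
    if random.random() < 0.5:
        raise TimeoutError("downstream timed out")
    return {"price": 101.25}

print(call_with_retries(fetch_quote))

In a real service you would retry only idempotent operations, and combine retries with deadlines and circuit breakers so they cannot amplify an outage.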
Predict the stock market
The premise was simple. Use “big” data analytics and machine learning models to predict the movement of stock prices. However, we had really “dirty” data, and our Data Scientists were struggling to separate the noise from the signals. We spent a lot of time cleaning the data and introducing good old principles like “how can I run the model somewhere other than a laptop?”. This was a true startup, a bunch of people in a room trying to get stuff working. No red tape, no calling the “helpdesk” to sort out your IT problems (I actually was the helpdesk).
Delta risk
QuantLib is a free and open-source software library for quantitative finance. It provides a wide range of functionality for pricing and risk-managing financial derivatives, including interest rate swaps.
To calculate the delta risk of an interest rate swap in Python using QuantLib, you can follow these steps:
- Import the necessary QuantLib modules:
Python
import QuantLib as ql
- Create a QuantLib YieldTermStructure object to represent the current interest rate curve.
- Build and price the swap, then bump the curve by one basis point and reprice; the change in NPV is the delta.
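A minimal end-to-end sketch of these steps follows. The evaluation date, the flat 3% curve, and the 5-year swap terms are illustrative assumptions, not figures from this post.
Python
import QuantLib as ql

# Illustrative evaluation date and a flat 3% curve standing in for a
# real bootstrapped curve
today = ql.Date(15, ql.June, 2023)
ql.Settings.instance().evaluationDate = today

rate = ql.SimpleQuote(0.03)
curve = ql.FlatForward(today, ql.QuoteHandle(rate), ql.Actual365Fixed())
curve_handle = ql.YieldTermStructureHandle(curve)

# A 5-year payer swap: 3% fixed versus Euribor 6M, forecast and
# discounted off the same curve
index = ql.Euribor6M(curve_handle)
swap = ql.MakeVanillaSwap(ql.Period('5Y'), index, 0.03, ql.Period('0D'))
swap.setPricingEngine(ql.DiscountingSwapEngine(curve_handle))

base_npv = swap.NPV()

# Bump the whole curve up by 1bp and reprice; the change in NPV is the
# swap's delta (its DV01, per unit notional here)
rate.setValue(0.0301)
delta = swap.NPV() - base_npv
print(f"Base NPV: {base_npv:.6f}  Delta per 1bp: {delta:.6f}")

In practice you would bump each pillar of a bootstrapped curve separately to get bucketed deltas, rather than the single parallel-shift number shown here.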
Taming the stragglers in Google Cloud Dataflow
I’m currently benchmarking Flink against Google Cloud Dataflow using the same Apache Beam pipeline for quantitative analytics. One thing I’ve observed with Flink is the tail latency associated with some shards.
Google Cloud Dataflow can optimise away stragglers in large jobs using “Dynamic Work Rebalancing”. As far as I know, Flink is currently unable to perform similar optimisations.
Crypto - why?
The point of cryptocurrency is to provide a decentralized, secure, and efficient way to transfer value. Cryptocurrencies are not issued by any central authority, such as a government or bank, and they are not backed by any physical asset. Instead, they are created and maintained by a network of computers that are running a special software program. This software program is designed to verify and record cryptocurrency transactions, and to prevent fraud.
Latency Sensitive Microservices
Great talk by Peter Lawrey on latency in microservices. https://www.infoq.com/presentations/latency-sensitive-microservices/
Differences between Beam and Flink
Apache Beam vs. Apache Flink: Choosing the Right Distributed Processing Framework
Apache Beam and Apache Flink are both powerful open-source frameworks for distributed data processing, enabling efficient handling of massive datasets. While they share the common goal of parallel data processing, they differ significantly in their architecture, programming model, and execution strategies. Understanding these differences is crucial for choosing the right tool for your specific needs. This article will help you navigate the decision-making process.
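To make the programming-model difference concrete, here is a minimal sketch of a Beam pipeline: the pipeline definition is runner-agnostic, so the same code can execute on Flink, on Dataflow, or locally just by changing the runner option. The tiny word-count logic is illustrative only.
Python
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# Swap 'DirectRunner' for 'FlinkRunner' or 'DataflowRunner' to move the
# same pipeline onto a different execution engine
options = PipelineOptions(runner='DirectRunner')

with beam.Pipeline(options=options) as p:
    (p
     | 'Read' >> beam.Create(['flink', 'beam', 'flink'])
     | 'Pair' >> beam.Map(lambda word: (word, 1))
     | 'Count' >> beam.CombinePerKey(sum)
     | 'Print' >> beam.Map(print))

Flink, by contrast, is an execution engine with its own native APIs; Beam acts as the portability layer that can target Flink as one of several runners.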
Pushing the limits of the Google Cloud Platform
This one is best explained by the presentation below. If you want to learn how to run quantitative analytics at scale, it’s well worth a watch.