In my previous post, I introduced an automated SOD (Start of Day) Monitor job. Here is what it does:
Purpose: The SOD Monitor job is designed to keep track of live data directories for a given day.
Functionality:
Data Collection: It loops through all the live data directories created on the current day.
File Count: For each directory, it determines the total number of files present.
Statistics: It calculates summary statistics (minimum, average, and maximum file size) for the files in each directory.
Comparison: The job then compares these metrics with the previous N-day running average (a minimal sketch of this scan follows below).
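To make the loop concrete, here is a minimal sketch in Python. The directory layout, the JSON history store, the 5-day window, and the 20% deviation threshold are all placeholders for illustration, not the actual implementation.

```python
from pathlib import Path
from statistics import mean
import datetime as dt
import json

# Hypothetical layout: one sub-directory per feed under /data/live/YYYY-MM-DD/.
LIVE_ROOT = Path("/data/live")                  # placeholder path
HISTORY_FILE = Path("/data/sod_history.json")   # placeholder rolling-stats store
N_DAYS = 5                                      # placeholder running-average window


def scan_today() -> dict:
    """Collect file count and size statistics for each live data directory created today."""
    today = dt.date.today().isoformat()
    stats = {}
    for feed_dir in sorted((LIVE_ROOT / today).iterdir()):
        if not feed_dir.is_dir():
            continue
        sizes = [f.stat().st_size for f in feed_dir.iterdir() if f.is_file()]
        stats[feed_dir.name] = {
            "count": len(sizes),
            "min": min(sizes, default=0),
            "avg": mean(sizes) if sizes else 0,
            "max": max(sizes, default=0),
        }
    return stats


def compare_to_history(stats: dict) -> list[str]:
    """Flag feeds whose file count deviates from the previous N-day running average."""
    history = json.loads(HISTORY_FILE.read_text()) if HISTORY_FILE.exists() else {}
    alerts = []
    for feed, today_stats in stats.items():
        recent = history.get(feed, [])[-N_DAYS:]
        if not recent:
            continue
        avg_count = mean(day["count"] for day in recent)
        if avg_count and abs(today_stats["count"] - avg_count) / avg_count > 0.2:
            alerts.append(f"{feed}: {today_stats['count']} files vs {avg_count:.0f} running avg")
    return alerts
```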
Email Notification:
Every business day, between 8:30 AM and 9:00 AM, I receive an email.
This email contains multiple tables generated with the redmail library (a sketch of that step follows this list).
In under ten seconds of glancing at these tables, I can tell whether everything is running as expected or whether any failures need attention.
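For the notification step, a minimal sketch using redmail's EmailSender is shown below. It assumes the per-directory statistics arrive as a dict (like the one built above) and that pandas is available; the SMTP host, credentials, and addresses are placeholders.

```python
import pandas as pd
from redmail import EmailSender

# Placeholder SMTP settings; substitute your own mail server and credentials.
email = EmailSender(host="smtp.example.com", port=587,
                    username="sod-monitor@example.com", password="***")


def send_report(stats: dict) -> None:
    """Render the per-directory statistics as an HTML table and email it."""
    df = pd.DataFrame.from_dict(stats, orient="index")  # one row per data directory
    email.send(
        subject="SOD Monitor report",
        sender="sod-monitor@example.com",
        receivers=["me@example.com"],
        html="<h2>SOD Monitor</h2>{{ file_stats }}",
        body_tables={"file_stats": df},  # redmail embeds DataFrames as HTML tables
    )
```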
This monitoring system ensures that any deviations from expected behavior are identified promptly.
Let’s delve into a specific incident. On a particular day (depicted below), I encountered a reference data ingestion error originating from polygon.io. The error cascaded downstream, causing dependent programs to crash. Here’s how I approached it:
Visual Clarity:
The visual representation allowed me to quickly identify the problem.
I could see which components crashed and pinpoint the affected areas.
Efficient Debugging:
Rather than wasting time searching for the issue, I focused on the critical points.
By understanding what crashed and where, I could streamline my debugging efforts.
Orchestration Tool:
Once I fixed the underlying issue, I leveraged my orchestration tool.
Using the graphical user interface (GUI), I effortlessly restarted all relevant jobs.
In summary, visual cues and efficient tools are essential for effective troubleshooting.
The central message of this post is to emphasize that managing all the components of a hedge fund can be an overwhelming task for a single individual. To navigate this challenge effectively, it’s crucial to develop tools that enhance efficiency wherever possible. These tools streamline processes, optimize decision-making, and allow you to focus on strategic aspects rather than getting bogged down by routine tasks.
Coming up next
Let's tackle the most demanding part of this section: historical data. First, we need to set up jobs that run intermittently throughout the day without overlapping our live data ingestion process, all while staying within the API rate limits set by our data vendors (a simple throttling sketch follows below).
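As a preview, the rate-limit constraint can be handled with a simple throttle around the vendor calls. This is only a sketch: the five-requests-per-minute figure and the fetch_day helper are placeholders, not polygon.io's actual limits or API.

```python
import time

MAX_REQUESTS_PER_MINUTE = 5                    # placeholder; use your vendor's documented limit
MIN_INTERVAL = 60.0 / MAX_REQUESTS_PER_MINUTE


def throttled(items):
    """Yield items no faster than the configured request rate."""
    last_yield = 0.0
    for item in items:
        wait = MIN_INTERVAL - (time.monotonic() - last_yield)
        if wait > 0:
            time.sleep(wait)
        last_yield = time.monotonic()
        yield item


# Usage sketch: fetch_day is a hypothetical function that pulls one day of history.
# for day in throttled(missing_days):
#     fetch_day(day)
```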