Let's start with the basic query that finds the 200,000 most active seconds, i.e. the seconds with the most executed trades. In GrokIt, the query is simply:
In the above query, MsOfDay %/% 1000 converts milliseconds into seconds (%/% is the integer division operator). The final division by 3600 converts the second into an hour expressed as a double, which makes the result easier to read. The query is similar to the queries in the previous blog post, but the number of groups is much larger: 87,872,904 (this was determined by running a query that ends with a Count instead of an OrderBy).
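To make the time arithmetic concrete, here is a minimal sketch in plain Python (not GrokIt syntax) of the same idea: integer-divide the millisecond timestamp down to a second, count trades per (date, second) group, keep the top k, and express the second as a decimal hour. The function names and the toy trades are made up for illustration; only the `// 1000` and `/ 3600` steps mirror what the query is described as doing.

```python
from collections import Counter

MS_PER_SECOND = 1000
SECONDS_PER_HOUR = 3600


def second_of_day(ms_of_day: int) -> int:
    """Integer division, the Python counterpart of R's %/% operator."""
    return ms_of_day // MS_PER_SECOND


def decimal_hour(second: int) -> float:
    """Express a second-of-day as an hour with a fractional part."""
    return second / SECONDS_PER_HOUR


def hottest_seconds(trades, k):
    """Count trades per (date, second) group and return the k largest groups."""
    counts = Counter((date, second_of_day(ms)) for date, ms in trades)
    return counts.most_common(k)


# Toy input, invented for illustration: (date, milliseconds since midnight).
trades = [
    ("2014-03-21", 56_700_001),
    ("2014-03-21", 56_700_250),
    ("2014-03-21", 56_700_999),
    ("2014-03-21", 56_701_000),
    ("2011-09-01", 36_000_500),
    ("2011-09-01", 36_000_900),
]

top = hottest_seconds(trades, 2)
rows = [(date, decimal_hour(sec), cnt) for (date, sec), cnt in top]
for row in rows:
    print(row)
```

On the toy data, the three trades that fall in second 56,700 of 2014-03-21 form the hottest group, and 56,700 / 3600 is exactly 15.75, which is how a value such as 15.75 in the Hour column corresponds to 3:45 PM.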
The running time for the query on the two machines is:
| Server | Time | Tuple Speed | Read Speed | Cpu Load |
|---|---|---|---|---|
| grokit | 387.6 s | 150 MT/s | 1200 MB/s | 14/64 |
| fgrokit | 127.8 s | 500 MT/s | 4000 MB/s | 50/64 |
This is in line with the results we have seen in previous blog posts. Notice that on fgrokit, the query takes just over 2 minutes.
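As a quick sanity check on the table, the sketch below derives two numbers from it: the bytes read per tuple implied by each row, and the speedup of fgrokit over grokit. It assumes that MB/s and MT/s both use a factor of one million, so the prefixes cancel; the dictionary layout and names are ours, not part of GrokIt.

```python
# Figures copied from the timing table above.
runs = {
    "grokit": {"time_s": 387.6, "tuples_mt_s": 150, "read_mb_s": 1200},
    "fgrokit": {"time_s": 127.8, "tuples_mt_s": 500, "read_mb_s": 4000},
}


def bytes_per_tuple(run):
    """MB/s divided by MT/s; assumes both 'mega' prefixes mean 10**6."""
    return run["read_mb_s"] / run["tuples_mt_s"]


# Ratio of wall-clock times, slow machine over fast machine.
speedup = runs["grokit"]["time_s"] / runs["fgrokit"]["time_s"]

for name, run in runs.items():
    print(name, bytes_per_tuple(run), "bytes per tuple")
print("speedup:", round(speedup, 2))
```

Under that assumption both machines read 8 bytes per tuple, and fgrokit finishes the query roughly three times faster.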
The top 10 rows in the result are:
| Date | Hour (dec) | cnt |
|---|---|---|
| 2014-Mar-21 | 15.75 | 106,063 |
| 2011-Sep-01 | 10 | 99,914 |
| 2014-Mar-21 | 15.83 | 95,796 |
| 2014-Mar-21 | 15.92 | 90,809 |
| 2014-May-30 | 15.75 | 89,582 |
| 2013-Dec-20 | 15.92 | 87,102 |
| 2013-Dec-20 | 15.75 | 85,679 |
| 2011-Nov-29 | 10 | 82,537 |
| 2012-Jun-22 | 15.92 | 82,002 |
| 2012-Jun-22 | 15.75 | 81,267 |
Two interesting facts are immediately visible within the first 10 rows. First, the most active second contains more than 100,000 trades. This gives a sense of what the peak load on financial trading systems might be. Arguably, this number is hard to guess, and the query readily provides the answer. Second, the hottest seconds seem to be clustered around 10 AM and between 3:45 and 4 PM.
To shed more light on this, we plot the distribution of the hottest 1000 seconds against the hour of the day in which they occur:
We can immediately see the peak at 10 AM and the very large peak at 4 PM. This strongly suggests algorithmic trading behavior: obtain positions at the beginning of the day and clear them before the end of the day.
Just for comparison, here is what the graph looks like when all 200,000 hottest seconds are included:
In fact, 80,000 of the 200,000 hottest seconds (40%) fall between 3:30 and 4 PM. That seems to be by far the busiest part of the trading day.
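The bucketing behind such a distribution can be sketched in a few lines of Python. This is an illustration under our own assumptions, not the code used to produce the plots: it bins decimal hours into half-hour buckets and computes the share of seconds that fall inside a time window. The helper names are hypothetical, and the input is just the Hour column of the top-10 table above, since the full list of 200,000 seconds is not reproduced here.

```python
from collections import Counter


def half_hour_bucket(hour: float) -> float:
    """Floor a decimal hour to the start of its half-hour bucket (15.75 -> 15.5)."""
    return int(hour * 2) / 2


def bucket_histogram(hours):
    """Number of hot seconds per half-hour bucket."""
    return Counter(half_hour_bucket(h) for h in hours)


def share_in_window(hours, start, end):
    """Fraction of the given hours h with start <= h < end."""
    hours = list(hours)
    inside = sum(1 for h in hours if start <= h < end)
    return inside / len(hours)


# Hour (dec) column of the top-10 table, in row order.
top10_hours = [15.75, 10, 15.83, 15.92, 15.75, 15.92, 15.75, 10, 15.92, 15.75]

hist = bucket_histogram(top10_hours)
share = share_in_window(top10_hours, 15.5, 16.0)
print(dict(hist))
print(share)
```

Even on these ten rows, eight of the hottest seconds land in the 3:30 to 4 PM bucket and the other two in the 10 AM bucket.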
As a further point of comparison, here is the distribution of the number of trades as a function of the time of day:
This immediately raises the question of what happens at the millisecond level. We'll see that in a future blog post.


