
Tuesday, November 24, 2020

Temporal Clustering on Real Prices

Having now had time to run the code shown in my previous post, Temporal Clustering, part 3, in this post I want to show the results on real prices.

Firstly, I have written two functions in Octave to identify market turning points. Each function takes as input an n_bar argument which determines the lookback/lookforward length along the price series used to determine local relative highs and lows. I ran both functions for n_bar values of 1 to 15 inclusive on EUR_USD forex 10 minute bars from July 2012 up to and including last week's set of 10 minute bars. For each function I created 3 sets of turning point data by averaging the function outputs over n_bar ranges of 1 to 15, 1 to 6 and 7 to 15, and I created a further 3 sets by averaging the outputs of the 2 functions together over the same ranges. In total this gives 9 slightly different sets of turning point data.
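The two turning point functions themselves are not shown in this post, so the following is only a minimal sketch of the general idea, under stated assumptions. The function name local_turns, the use of separate high and low series, and the simple centred window test are all hypothetical, as is the way the outputs are averaged over a range of n_bar values.

```octave
1 ; # script file marker so that the function below can be defined inline

## Hypothetical helper, NOT one of the functions used for the results above.
## Bar ii is flagged as a local high (low) if its high (low) is the extreme
## value of the centred window ii - n_bar : ii + n_bar.
function [ is_high , is_low ] = local_turns ( high , low , n_bar )
  n = numel( high ) ;
  is_high = zeros( size( high ) ) ;
  is_low = zeros( size( low ) ) ;
  for ii = n_bar + 1 : n - n_bar
    win = ii - n_bar : ii + n_bar ;
    is_high( ii ) = ( high( ii ) == max( high( win ) ) ) ;
    is_low( ii ) = ( low( ii ) == min( low( win ) ) ) ;
  endfor
endfunction

## toy data, purely illustrative
high = [ 1 2 3 2 4 2 5 2 1 ] ;
low = high - 0.5 ;

## average the local high indicator over a range of n_bar values
## (assumed interpretation of "averaging the function outputs")
n_bars = 1 : 2 ;
avg_high = zeros( size( high ) ) ;
for n_bar = n_bars
  [ is_high , ~ ] = local_turns( high , low , n_bar ) ;
  avg_high += is_high / numel( n_bars ) ;
endfor

disp( avg_high ) ;
```

In this toy example the highs at bars 3 and 5 only qualify for n_bar = 1 and so get an averaged value of 0.5, whereas the high at bar 7 qualifies for both n_bar values and gets a value of 1.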

I then ran the optimal K clustering code, shown in previous posts, over each set of data to get the "solutions" per set of data. Six of the sets had an optimal K value of 8 and a combined plot of these is shown below.

For each "solution" turning point ix (ix ranges from 1 to 198) a turning point value of 1 is added to get a sort of spike train plot through time. The ix = 1 value is 22:00 BST on Sunday and ix = 198 is 06:50 BST on Tuesday. I chose this range so that there would be a buffer at each end of the time range I am really interested in: 07:00 BST to 22:00 BST, which covers the time from the London open to the New York close. The vertical blue lines are plotted for clarity to help identify the turns and are plotted as 3 consecutive lines 10 minutes apart. The added text shows the time of occurrence of the first bar of each triplet of lines, the time being London BST. The following second plot is the same as above but with the other 3 "solutions" of K = 5, 10 and 11 added.
For those readers who are familiar with the Delta Phenomenon the main vertical blue lines could conceptually be thought of as MTD lines with the other lines being lower timeframe ITD lines, but on an intraday scale. However, it is important to bear in mind that this is NOT a Delta solution and therefore rules about numbering, alternating highs and lows and inversions etc. do not apply. It is more helpful to think in terms of probability and see the various spikes/lines as indicating times of the day at which there is a higher probability of price making a local high or low. The size of a move after such a high or low is not indicated, and the timings are only approximate or alternatively represent the centre of a window in which the high or low might occur.
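As a small worked example of the ix indexing described above, the following sketch converts an ix value to a clock time, taking ix = 1 as 22:00 BST on Sunday with 10 minute bars. The variable names are illustrative only.

```octave
bar_minutes = 10 ;
start_minutes = 22 * 60 ;        # ix == 1 is 22:00 on Sunday

ix = [ 1 55 198 ] ;              # example ix values to convert
total = start_minutes + ( ix - 1 ) * bar_minutes ; # minutes since 00:00 Sunday

day_offset = floor( total / 1440 ) ;        # 0 = Sunday, 1 = Monday, 2 = Tuesday
hh = floor( mod( total , 1440 ) / 60 ) ;    # hour of the day
mm = mod( total , 60 ) ;                    # minute of the hour

## printf cycles through the columns of the matrix argument
printf( "ix %3d -> day offset %d, %02d:%02d\n" , [ ix ; day_offset ; hh ; mm ] ) ;
```

This confirms that ix = 198 falls at 06:50 on Tuesday and that ix = 55 is the 07:00 Monday bar at the start of the window of interest.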

The proof of the pudding is in the eating, however, and the following plots are yesterday's (23 November 2020) out of sample EUR_USD forex pair price action with the lines of the above "solution" overlaid. The first plot is just the K = 8 solution plot

whilst this second plot has all lines shown.
Given the above caveat that the lines only indicate probabilities, it seems uncanny how accurately the major highs and lows of the day are picked out. I only wish I had done this analysis sooner as then yesterday could have been one of my best trading days ever!

More soon.

Tuesday, October 20, 2020

A Temporal Clustering Function

Recently a reader contacted me with a view to collaborating on some work regarding the Delta phenomenon but after a brief exchange of e-mails this seems to have petered out. However, for my part, the work I have done has opened a few new avenues of investigation and this post is about one of them.

One of the problems I set out to solve was clustering in the time domain, or temporal clustering as I call it. Take a time series and create a zero filled 1-dimensional vector of the same length. Record the time of occurrence of an event by setting the value of this vector to 1 at the event's time index tx, and repeat for all occurrences of the event. In my case the events I am interested in are local highs and lows in the time series. This vector is then "chopped" into segments representing distinct periods of time, e.g. 1 day, 1 week etc. and stacked into a matrix where each row is one complete period and the columns represent the same time in each period, e.g. the first column is the first hour of the trading week, the second column the second hour etc. Sum down the columns to get a final 1-dimensional vector of counts of how often events happen at each time within the period, over the entire time series data record. A chart of such is shown below.
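The chop, stack and sum steps described above can be sketched in a few lines of Octave. This is a toy illustration only: it assumes the event vector starts exactly on a period boundary and has no missing bars, which real price data may not satisfy, and the variable names are hypothetical.

```octave
## toy example: 3 complete periods of 4 bars each, 1 marks an event
bars_per_period = 4 ;
event_vec = [ 0 1 0 0   0 1 0 1   1 1 0 0 ] ;

## drop any incomplete final period
n_periods = floor( numel( event_vec ) / bars_per_period ) ;
usable = event_vec( 1 : n_periods * bars_per_period ) ;

## reshape fills column-wise, so build a bars x periods matrix and transpose
## it to get one row per period and one column per time within the period
period_matrix = reshape( usable , bars_per_period , n_periods )' ;

## sum down the columns to count events per time slot
counts = sum( period_matrix , 1 ) ;

disp( counts ) ;
```

Here the second time slot contains an event in all 3 periods, so its count is 3.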

The coloured vertical lines show the opening and closing times of the London and New York sessions (7am to 5pm in their respective local times) for one complete week at a 10 minute bar time scale, in this case for the GBP_USD forex pair. This is what I want to cluster.

The solution I have come up with is the Octave function in the code box below

## Copyright (C) 2020 dekalog
##
## This program is free software: you can redistribute it and/or modify it
## under the terms of the GNU General Public License as published by
## the Free Software Foundation, either version 3 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program. If not, see
## <https://www.gnu.org/licenses/>.

## -*- texinfo -*-
## @deftypefn {} {@var{new_train_vec} =} blurred_maxshift_1d_linear (@var{train_vec}, @var{bandwidth})
##
## Clusters the 1 dimensional vector TRAIN_VEC using a "centred" sliding window of length 2 * BANDWIDTH + 1.
##
## Based on the idea of the Blurred Meanshift Algorithm.
##
## The "centre ix" value of the odd length sliding window is assigned to the
## maximum value ix of the sliding window. The centre_ix, if it is not the
## maximum value, is then set to zero. A pass through the whole length of
## TRAIN_VEC is completed before any assignments are made.
##
## @seealso{}
## @end deftypefn

## Author: dekalog
## Created: 2020-10-17

function new_train_vec = blurred_maxshift_1d_linear ( train_vec , bandwidth )

  if ( nargin < 2 )
    bandwidth = 1 ;
  endif

  if ( numel( train_vec ) < 2 * bandwidth + 1 )
    error( 'Bandwidth too wide for length of train_vec.' ) ;
  endif

  length_train_vec = numel( train_vec ) ;

  ## initialise the while condition variable
  has_converged = 0 ;

  while ( has_converged < 1 )

    new_train_vec = zeros( size( train_vec ) ) ;

    ## do the beginning and end of train_vec first: the first and last
    ## bandwidth elements are shifted to the maximum of the first and last
    ## full length windows respectively
    [ ~ , ix ] = max( train_vec( 1 : 2 * bandwidth + 1 ) ) ;
    new_train_vec( ix ) += sum( train_vec( 1 : bandwidth ) ) ;

    [ ~ , ix ] = max( train_vec( end - 2 * bandwidth : end ) ) ;
    new_train_vec( end - 2 * bandwidth - 1 + ix ) += sum( train_vec( end - bandwidth + 1 : end ) ) ;

    ## slide the centred window over the remaining elements, starting at
    ## bandwidth + 1 so that no element between the beginning block and
    ## the loop is skipped
    for ii = bandwidth + 1 : numel( train_vec ) - bandwidth
      [ ~ , ix ] = max( train_vec( ii - bandwidth : ii + bandwidth ) ) ;
      ## shift the value at the window centre to the window maximum
      new_train_vec( ii - bandwidth - 1 + ix ) += train_vec( ii ) ;
    endfor

    ## converged when a complete pass leaves the vector unchanged
    if ( sum( ( train_vec == new_train_vec ) ) == length_train_vec )
      has_converged = 1 ;
    else
      train_vec = new_train_vec ;
    endif

  endwhile

endfunction

I have named the function "blurred_maxshift_1d_linear" as it is inspired by the "blurred" version of the Mean shift algorithm, operates on a 1-dimensional vector and is "linear" in that there is no periodic wrapping of the data within the function code. The two function inputs are the above type of data, obviously, and an integer parameter "bandwidth" which controls the size of a moving window over the data in which the data is shifted according to a maximum value, hence maxshift rather than meanshift. I won't discuss the code further as it is pretty straightforward.

A chart of a typical clustering solution is (bandwidth setting == 2)

where the black line is the original count data and red the clustering solution. The bandwidth setting in this case is approximately equivalent to clustering with a 50 minute moving window. 
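The 50 minute figure follows from the sliding window length of 2 * bandwidth + 1 bars on 10 minute bars, as the short calculation below shows. The variable names are illustrative only.

```octave
bar_minutes = 10 ;                     # 10 minute bars, as in the chart above
bandwidth = 2 ;

window_bars = 2 * bandwidth + 1 ;      # centred window length in bars
window_minutes = window_bars * bar_minutes ;
printf( "bandwidth %d -> %d bars -> %d minutes\n" , bandwidth , window_bars , window_minutes ) ;

## the same calculation for bandwidth settings of 1 to 10 inclusive
all_window_minutes = ( 2 * ( 1 : 10 ) + 1 ) * bar_minutes ;
disp( all_window_minutes ) ;
```

At the widest setting of 10 the window spans 21 bars, i.e. 210 minutes.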

The following heatmap chart is a stacked version of the above where the bandwidth parameter is varied from 1 to 10 inclusive upwards, with the original data being at the lowest level per pane.

The intensity reflects the counts at each time tx index per bandwidth setting. The difference between the panes is that in the upper pane the raw data is the function input per bandwidth setting, whilst the lower pane shows hierarchical clustering whereby the output of the function is used as the input to the next function call with the next higher bandwidth parameter setting.
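The difference between the two panes can be sketched as follows. This is not the code used to produce the chart: the helper name stack_solutions is hypothetical, and a trivial stand-in function handle is used so that the example runs on its own.

```octave
1 ; # script file marker so that the function below can be defined inline

## Hypothetical helper: builds the two stacked matrices, with the original
## counts in row 1 and one further row per bandwidth setting.
function [ raw_map , hier_map ] = stack_solutions ( counts , cluster_fn , bandwidths )
  n_rows = numel( bandwidths ) + 1 ;
  raw_map = zeros( n_rows , numel( counts ) ) ;
  hier_map = zeros( n_rows , numel( counts ) ) ;
  raw_map( 1 , : ) = counts ;
  hier_map( 1 , : ) = counts ;
  for k = 1 : numel( bandwidths )
    ## upper pane: the raw data is always the function input
    raw_map( k + 1 , : ) = cluster_fn( counts , bandwidths( k ) ) ;
    ## lower pane: the previous output is the next input
    hier_map( k + 1 , : ) = cluster_fn( hier_map( k , : ) , bandwidths( k ) ) ;
  endfor
endfunction

## stand-in for the clustering function, for illustration only
toy_fn = @( v , bw ) v * 2 ;
[ raw_map , hier_map ] = stack_solutions( [ 1 0 2 ] , toy_fn , 1 : 2 ) ;
disp( raw_map ) ;
disp( hier_map ) ;

## with the clustering function on the path the call would look like
## [ raw_map , hier_map ] = stack_solutions( counts , @blurred_maxshift_1d_linear , 1 : 10 ) ;
```

With the stand-in, every row of raw_map after the first is the same transformation of the original data, whereas each row of hier_map compounds the one before it, which is the distinction between the two panes.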

More in due course.