Com TW NOw News 2024

High-frequency data analysis: converting high-frequency signals into discrete buy/sell signals

In high-frequency trading, we derive high-frequency signals from trade and quote tick data and analyze them to identify trading opportunities. This tutorial shows how to convert such signals into discrete buy/sell/hold signals. In essence, the task is to convert an array of floating-point values into an array that contains only three integers: +1 (buy), 0 (hold), and -1 (sell).

The conversion rules can be quite simple. For example:

  • +1 if signal > t1
  • -1 if signal < t2
  • 0 otherwise

Such conversion rules can be easily implemented in DolphinDB using the function iif:

iif(signal > t1, 1, iif(signal < t2, -1, 0))
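
For readers who think in Python, the simple three-way rule above can be sketched with NumPy. The function name simple_direction is hypothetical, and the thresholds t1 = 10 and t2 = -10 are sample values chosen only for illustration.

```python
import numpy as np


def simple_direction(signal, t1, t2):
    """Map each value to +1 (above t1), -1 (below t2) or 0 (otherwise)."""
    signal = np.asarray(signal, dtype=float)
    # np.select checks the conditions in order and takes the first match;
    # values matching neither condition get the default, 0 (hold).
    return np.select([signal > t1, signal < t2], [1, -1], default=0)


sample = [20.12, 8.78, 4.39, -20.68, -8.49, -6.98, 0.7, 2.08, 8.97, 12.41]
print(simple_direction(sample, t1=10, t2=-10).tolist())
# -> [1, 0, 0, -1, 0, 0, 0, 0, 0, 1]
```

Each output depends only on the corresponding input value, which is why this rule vectorizes trivially.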

However, to avoid reversing the trading direction too frequently, we usually apply a more elaborate set of rules. Once a signal rises above the threshold t1, it is a buy signal (+1), and subsequent signals remain buy signals until one falls below the threshold t10. Similarly, once a signal falls below the threshold t2, it is a sell signal (-1), and subsequent signals remain sell signals until one rises above the threshold t20. The thresholds are ordered as follows:

t1 > t10 > t20 > t2

Under these rules, the value of a trading signal depends not only on the current signal but also on the state of the preceding signal. This is a typical path-dependent problem. Such problems are usually considered difficult or impossible to express with vector operations, so they tend to run very slowly in scripting languages, including DolphinDB.

In some cases, however, a path-dependent problem can be solved with vector operations, and the problem above is one of them. The following steps show how.

First, identify the signals that fall within a range where the state is definite:

  • if signal > t1, state = 1
  • if signal < t2, state = -1
  • if t20 <= signal <= t10, state = 0

The states of the signals in the ranges (t2, t20) and (t10, t1) are not definite: each one inherits the state of the signal that precedes it.
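
As a reference point, the state rules above can be written as an explicit loop. This is only a sketch: the function name direction_loop and the starting state of 0 (hold) before any signal has been seen are assumptions, and the sample thresholds are illustrative.

```python
def direction_loop(signal, t1, t10, t20, t2):
    """Element-by-element reference: carry the previous state through the bands."""
    state = 0  # assumed starting state before any signal is seen
    out = []
    for s in signal:
        if s > t1:
            state = 1
        elif s < t2:
            state = -1
        elif t20 <= s <= t10:
            state = 0
        # Otherwise s is in one of the in-between bands: keep the previous state.
        out.append(state)
    return out


sample = [20.12, 8.78, 4.39, -20.68, -8.49, -6.98, 0.7, 2.08, 8.97, 12.41]
print(direction_loop(sample, t1=10, t10=6, t20=-6, t2=-10))
# -> [1, 1, 0, -1, -1, -1, 0, 0, 0, 1]
```

The loop makes the path dependence visible: the only information carried from one element to the next is the most recent definite state, which is what makes a forward-fill formulation possible.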

The DolphinDB script to implement the above rules:

direction = (iif(signal>t1, 1, iif(signal<t2, -1, iif(signal between t20:t10, 0, 00i)))).ffill().nullFill(0)

Let’s run a simple test:

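t1 = 10
t10 = 6
t20 = -6
t2 = -10
signal = 20.12 8.78 4.39 -20.68 -8.49 -6.98 0.7 2.08 8.97 12.41
// 00i is the integer NULL; ffill carries the last definite state forward
direction = (iif(signal>t1, 1, iif(signal<t2, -1, iif(signal between t20:t10, 0, 00i)))).ffill().nullFill(0)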

If we use pandas, the script looks like this:

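import time
import numpy as np
import pandas as pd

t1 = 60
t10 = 50
t20 = 30
t2 = 20
signal = pd.Series((20.12, 8.78, 4.39, -20.68, -8.49, -6.98, 0.7, 2.08, 8.97, 12.41))
start = time.time()
# NaN marks signals whose state is inherited from the preceding signal
direction1 = signal.apply(lambda s: 1 if s > t1 else (-1 if s < t2 else (0 if t20 <= s <= t10 else np.nan))).ffill().fillna(0)
print(time.time() - start)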
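
The script above calls a Python lambda once per element. As a possible alternative, the band classification can be done with array operations before the forward fill. The sketch below uses NumPy's np.select for that step; the function name direction_vectorized is hypothetical, the thresholds are sample values, and this is not the code behind the timings reported in this article.

```python
import numpy as np
import pandas as pd


def direction_vectorized(signal, t1, t10, t20, t2):
    """Classify definite bands with array operations, then forward-fill."""
    values = np.asarray(signal, dtype=float)
    # Definite states where the signal alone decides; NaN where the state
    # has to be inherited from the preceding signal.
    definite = np.select(
        [values > t1, values < t2, (values >= t20) & (values <= t10)],
        [1.0, -1.0, 0.0],
        default=np.nan,
    )
    # ffill propagates the last definite state through the NaN gaps;
    # leading NaNs (no earlier state) fall back to 0 (hold).
    return pd.Series(definite).ffill().fillna(0).astype(int)


sample = [20.12, 8.78, 4.39, -20.68, -8.49, -6.98, 0.7, 2.08, 8.97, 12.41]
print(direction_vectorized(sample, t1=10, t10=6, t20=-6, t2=-10).tolist())
# -> [1, 1, 0, -1, -1, -1, 0, 0, 0, 1]
```

The structure mirrors the nested iif expression: a classification step that leaves gaps, followed by a forward fill and a default for any leading gap.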

The test below generates a random array of 10 million signals between 0 and 100 to test the performance of DolphinDB and pandas. The test environment is set up as follows:

CPU: Intel(R) Core(TM) i7-7700 CPU @ 3.60 GHz

Memory: 16 GB

Operating System: Windows 10

Execution takes 171.73 ms in DolphinDB and 3.28 s in pandas, so DolphinDB is roughly 19 times faster in this test.

DolphinDB script:

t1 = 10
t10 = 6
t20 = -6
t2 = -10
signal = rand(100.0, 10000000)
direction = (iif(signal>t1, 1, iif(signal<t2, -1, iif(signal between t20:t10, 0, 00i)))).ffill().nullFill(0)

pandas script:

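import time
import numpy as np
import pandas as pd

t1 = 60
t10 = 50
t20 = 30
t2 = 20
signal = pd.Series(np.random.random(10000000) * 100)
start = time.time()
# NaN marks signals whose state is inherited from the preceding signal
direction1 = signal.apply(lambda s: 1 if s > t1 else (-1 if s < t2 else (0 if t20 <= s <= t10 else np.nan))).ffill().fillna(0)
print(time.time() - start)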
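
A single time.time() measurement can be noisy. One way to make such a comparison more repeatable is to time several runs and keep the best one. The sketch below is an assumption about how this could be done, not the procedure used for the figures reported here: time_call and direction_apply are hypothetical names, and the array is shrunk to 100,000 elements so the example finishes quickly.

```python
import time

import numpy as np
import pandas as pd


def time_call(fn, *args, repeats=3):
    """Return (best elapsed seconds, last result) over `repeats` runs."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = fn(*args)
        best = min(best, time.perf_counter() - start)
    return best, result


def direction_apply(signal, t1, t10, t20, t2):
    """Element-wise rule via Series.apply, then forward-fill the gaps."""
    def state(s):
        if s > t1:
            return 1
        if s < t2:
            return -1
        if t20 <= s <= t10:
            return 0
        return np.nan  # state inherited from the preceding signal
    return signal.apply(state).ffill().fillna(0).astype(int)


# Fixed seed so repeated runs classify the same data.
rng = np.random.default_rng(0)
signal = pd.Series(rng.random(100_000) * 100)

elapsed, direction = time_call(direction_apply, signal, 60, 50, 30, 20)
print(f"best of 3: {elapsed * 1000:.1f} ms for {len(signal)} signals")
```

Absolute numbers depend on hardware and library versions, so results from this harness are only comparable when both implementations run on the same machine.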