Applying Machine Learning to Cryptocurrency Trading

Paweł Duda (@paweldude)

Disclaimer

I am no expert in any of this

My machine learning skill

My cryptocurrency trading skill

My time for this presentation

A very simple

classification example

Sides Figure
3 Triangle
4 Quadrilateral
5 Pentagon

Feature

Classes

Sides Interior angles sum (...) Figure
3 180° (...) Triangle
4 360° (...) Quadrilateral
5 540° (...) Pentagon

Features

Classes

We can extract more features from our data set to improve accuracy
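A minimal sketch of deriving the second feature from the first. The helper names are made up for illustration; the only fact assumed is that the interior angles of a polygon with n sides sum to (n - 2) * 180 degrees.

```python
def interior_angle_sum(sides):
    """Sum of the interior angles of a simple polygon, in degrees."""
    return (sides - 2) * 180


def feature_vector(sides):
    """Build [sides, interior angles sum] for one figure (hypothetical helper)."""
    return [sides, interior_angle_sum(sides)]


X = [feature_vector(sides) for sides in (3, 4, 5)]
print(X)  # [[3, 180], [4, 360], [5, 540]]
```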

Feature vector

from sklearn.svm import SVC

# Feature vectors: [[sides]]
X = [[3],        [4],             [5],        [3],        [3]       ]
# Classes: figure name
y = ['triangle', 'quadrilateral', 'pentagon', 'triangle', 'triangle']

classifier = SVC() # support vector classifier
classifier.fit(X, y)

X_test = [[3], [4], [4], [5], [3], [4]]
y_test = ['triangle', 'quadrilateral', 'quadrilateral', 'pentagon', 'triangle', 'quadrilateral']

print(classifier.score(X_test, y_test)) # 1.0 (100% correct)
print(classifier.predict([[3], [5], [4]])) # ['triangle', 'pentagon', 'quadrilateral']

My "Hello World" of Machine Learning

The trading problem

Price change (5 min) Signal
Significant* increase Buy
No significant* change Hold
Significant* decrease Sell

* more than a transaction fee would cost
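One possible way to turn the table above into class labels. This is a sketch, not the code used in the project: the function name is hypothetical, and the threshold is assumed to equal a single 0.25% fee.

```python
def signal(price_now, price_later, fee=0.0025):
    """Label one 5-minute price change as 'Buy', 'Hold' or 'Sell'.

    Assumption: a change is "significant" when its relative size
    exceeds the transaction fee (0.25% by default).
    """
    change = (price_later - price_now) / price_now
    if change > fee:
        return "Buy"
    if change < -fee:
        return "Sell"
    return "Hold"


print(signal(100.0, 101.0))  # Buy
print(signal(100.0, 100.1))  # Hold
print(signal(100.0, 99.0))   # Sell
```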

What I used

A few years of historical data

about 70 BTC/altcoin markets

The simulation

  • Train the classifier using data from 1 market, run simulations on the remaining markets, and manually review the results
  • Always assume worst-case transaction fees (0.25%)
  • Assume a starting investment portfolio of $10 worth of Bitcoin (back then ~0.008 BTC)
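A rough sketch of what such a simulation loop could look like. The function and its all-in/all-out trading rule are assumptions made for illustration; only the 0.25% fee and the 0.008 BTC starting balance come from the slides.

```python
def simulate(prices, signals, start_btc=0.008, fee=0.0025):
    """Replay signals over a price series and return the final value in BTC.

    prices  -- altcoin price in BTC at each step
    signals -- 'Buy', 'Hold' or 'Sell' for each step
    Assumption: every trade moves the whole balance and pays the fee once.
    """
    if not prices:
        return start_btc
    btc, coins = start_btc, 0.0
    for price, sig in zip(prices, signals):
        if sig == "Buy" and btc > 0:
            coins = btc * (1 - fee) / price   # pay the fee, then convert to altcoin
            btc = 0.0
        elif sig == "Sell" and coins > 0:
            btc = coins * price * (1 - fee)   # convert back to BTC, minus the fee
            coins = 0.0
    # Value any coins still held at the last price (no exit fee applied).
    return btc + coins * prices[-1]


print(simulate([1.0, 2.0], ["Buy", "Sell"], start_btc=1.0, fee=0.0))  # 2.0
```

Even in this toy version a round trip costs two fees, so a price has to move by more than about 0.5% for a buy followed by a sell to break even.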

First simulations

BTC/DGB (Digibyte), 02/2015 - 04/2017

Problem: too many unprofitable trades

BTC/FCT (Factom), 10/2015 - 04/2017

Problem: too few trade signals over years

What I tried to improve the model

  • extracted more features from my dataset (Technical Analysis indicators)
  • tried different classifiers/parameters
  • basically followed my gut while brute forcing different possibilities and comparing results

Technical analysis

In finance, technical analysis is an analysis methodology for forecasting the direction of prices through the study of past market data, primarily price and volume.

https://en.wikipedia.org/wiki/Technical_analysis

http://blueeconomy.net/wp-content/uploads/2016/01/forex-technical.png

Some of the technical indicators I used during training:

  • Simple, Weighted Moving Averages
  • Momentum
  • Relative Strength Index
  • Commodity Channel Index
  • Stochastic Oscillator
  • Moving Average Convergence/Divergence Oscillator
  • Williams %R
  • Accumulation/Distribution oscillator
  • On Balance Volume
  • Aroon
  • Average True Range

* I don't understand what most of these are for
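For a feel of what these indicators compute, here is a minimal sketch of two of the simplest ones, a moving average and momentum, in plain Python. The function names and the sample prices are made up; a real project would more likely use a technical analysis library.

```python
def simple_moving_average(prices, window):
    """Average of the last `window` prices, for every step where that is defined."""
    return [
        sum(prices[i - window + 1:i + 1]) / window
        for i in range(window - 1, len(prices))
    ]


def momentum(prices, period):
    """Difference between each price and the price `period` steps earlier."""
    return [prices[i] - prices[i - period] for i in range(period, len(prices))]


prices = [1, 2, 3, 4, 5]
print(simple_moving_average(prices, 3))  # [2.0, 3.0, 4.0]
print(momentum(prices, 2))               # [2, 2, 2]
```

Each indicator value can then be appended to the feature vector for the corresponding time step.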

Different algorithms I used for training

(with scikit-learn switching from one to another is really simple)

* I don't understand what most of these are for

from sklearn.discriminant_analysis import (
    LinearDiscriminantAnalysis,
    QuadraticDiscriminantAnalysis,
)
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
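A sketch of why switching is simple: every scikit-learn classifier exposes the same fit/score interface, so comparing them is a loop. The data here is synthetic (generated with make_classification), not market data, and the parameters are arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic 3-class problem standing in for Buy/Hold/Sell (assumption).
X, y = make_classification(
    n_samples=300, n_features=5, n_informative=3, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

classifiers = {
    "LDA": LinearDiscriminantAnalysis(),
    "RandomForest": RandomForestClassifier(random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "GaussianNB": GaussianNB(),
    "KNeighbors": KNeighborsClassifier(),
    "SVC": SVC(),
}

scores = {}
for name, classifier in classifiers.items():
    classifier.fit(X_train, y_train)                      # same call for every model
    scores[name] = classifier.score(X_test, y_test)      # accuracy on held-out data
    print(f"{name}: {scores[name]:.2f}")
```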

The best algorithm
in my case: AdaBoost

AdaBoost is exactly like human specialization. Get person (weak learner) A to learn problem X. Whatever part of X A is not good at, get person B to learn that subset. Whatever A and B are not good at, get C to learn that. And so on. Each learner specializes in the weakest area that needs the most improvement.

https://www.reddit.com/r/MachineLearning/comments/1jcx2a/an_eli5_explanation_of_adaboost/cbdhb4d/?st=j3ulo3je&sh=a9a90fee
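To make the analogy concrete, below is a toy version of discrete AdaBoost on one-dimensional data with threshold "stumps" as weak learners. It is a teaching sketch with made-up helper names, not scikit-learn's implementation. After each round the samples the last learner got wrong are given more weight, so the next learner specializes in them.

```python
import math


def stump_predict(x, threshold, polarity):
    """Weak learner: predict `polarity` at or above the threshold, else the opposite."""
    return polarity if x >= threshold else -polarity


def best_stump(xs, ys, weights):
    """Return (weighted error, threshold, polarity) of the best stump."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            error = sum(
                w for x, y, w in zip(xs, ys, weights)
                if stump_predict(x, threshold, polarity) != y
            )
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best


def adaboost_fit(xs, ys, rounds):
    """Train `rounds` stumps; labels must be +1 or -1."""
    weights = [1.0 / len(xs)] * len(xs)
    ensemble = []
    for _ in range(rounds):
        error, threshold, polarity = best_stump(xs, ys, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)   # avoid division by zero
        alpha = 0.5 * math.log((1 - error) / error)  # this learner's say in the vote
        ensemble.append((alpha, threshold, polarity))
        # Misclassified samples get heavier, correct ones lighter.
        weights = [
            w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
            for x, y, w in zip(xs, ys, weights)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def adaboost_predict(ensemble, x):
    """Weighted vote of all stumps."""
    vote = sum(a * stump_predict(x, t, p) for a, t, p in ensemble)
    return 1 if vote >= 0 else -1


xs = [0, 1, 2, 3, 4, 5]
ys = [1, 1, -1, -1, 1, 1]   # no single threshold separates these

one = adaboost_fit(xs, ys, rounds=1)
three = adaboost_fit(xs, ys, rounds=3)
print([adaboost_predict(one, x) for x in xs])    # [1, 1, 1, 1, 1, 1]
print([adaboost_predict(three, x) for x in xs])  # [1, 1, -1, -1, 1, 1]
```

One stump alone gets 4 of 6 samples right; three stumps voting together classify all of this toy training set correctly.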

Simulations after improvements

BTC/BTCD (BitcoinDark), 06/2014 - 04/2017

Start: 0.008 BTC | Exit: >100 BTC

BTC/BCN (Bytecoin), 05/2014 - 04/2017

Start: 0.008 BTC | Exit: 4 * 10^45 BTC

BTC/EXP (Expanse), 03/2014 - 04/2017

Start: 0.008 BTC | Exit: ~ 1.25 BTC

How it went in production

  • Simulation =/= reality
  • Slowly losing money over time
  • After a few tweaks it sometimes gained money, but it still lost money faster than it made it
  • A sudden change in the API rate limits enforced by the exchange made it impossible to continue the experiment

 

Conclusion: the pre-alpha doesn't look ready but I expected it to be much worse

How it went in production

  • Timeframe - 6 weeks
  • Lost about $10 (~3 Grander Texas burgers)
  • Learned a thing or two about machine learning and how markets work
  • Success: managed to improve the simulated results dramatically
  • One of the most exciting side-projects I have been working on
  • Got a lot of great ideas about improving the project that could consume hundreds of hours

Summary

Thank you
