Presenter: Simone Scardapane
Webinar, Italy Big Data & Machine Learning Meetup, 21 July 2017
Can we predict the skill of a player?
import numpy as np
np.random.seed(256)
# Let us load some data!
import pandas as pd
data = pd.read_csv('./Data/SkillCraft1_Dataset.csv', na_values='?')
# Show the first row (the .ix indexer has been removed from pandas)
data.iloc[0]
GameID 52.000000
LeagueIndex 5.000000
Age 27.000000
HoursPerWeek 10.000000
TotalHours 3000.000000
SelectByHotkeys 0.003515
AssignToHotkeys 0.000220
UniqueHotkeys 7.000000
MinimapAttacks 0.000110
MinimapRightClicks 0.000392
ActionLatency 40.867300
TotalMapExplored 28.000000
WorkersMade 0.001397
UniqueUnitsMade 6.000000
ComplexUnitsMade 0.000000
ComplexAbilitiesUsed 0.000000
Name: 0, dtype: float64
Thompson, J.J., Blair, M.R., Chen, L. and Henrey, A.J., 2013. Video game telemetry as a critical tool in the study of complex skill learning. PloS one, 8(9), p.e75129.
# We fill in missing values in the dataset
# by replacing them with the column mean (the imputer's default strategy)
from sklearn.impute import SimpleImputer
data = pd.DataFrame(SimpleImputer().fit_transform(data.values), columns=data.columns)
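The imputation step can be illustrated on a toy array. This is only a minimal sketch, assuming mean imputation (the default strategy of scikit-learn's imputer); the values below are made up for illustration and are not taken from the dataset.

```python
import numpy as np

# Toy feature matrix (made-up values); np.nan marks a missing entry
X = np.array([
    [27.0, 10.0],
    [np.nan, 20.0],
    [21.0, np.nan],
])

# Column means computed while ignoring the missing entries
col_means = np.nanmean(X, axis=0)

# Replace each missing entry with the mean of its own column
X_filled = np.where(np.isnan(X), col_means, X)

print(X_filled)
```

Each missing entry is filled from its own column only, so features on very different scales do not affect each other.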
# We train a random forest to predict
# the league of a player, holding out the first player (row 0)
from sklearn import ensemble
rf = ensemble.RandomForestClassifier()
rf.fit(data.values[1:, 2:], data.values[1:, 1])
print('Predicted league is:', rf.predict(data.values[0, 2:].reshape(1, -1)))
Predicted league is: [ 5.]
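A single correct prediction says little about how well the model generalizes. One common way to check is to hold out a fraction of the data and measure accuracy on it. The sketch below assumes a synthetic dataset generated with scikit-learn as a stand-in (it is not the SkillCraft data), and the split size and number of trees are arbitrary choices for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real dataset (NOT the SkillCraft data)
X, y = make_classification(n_samples=500, n_features=10, n_informative=5,
                           n_classes=3, random_state=0)

# Keep 25% of the samples aside; stratify to preserve class proportions
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

# Train on the training split only, then score on the unseen split
rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(X_train, y_train)
accuracy = rf.score(X_test, y_test)

print('Held-out accuracy:', accuracy)
```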
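import autosklearn.classification
# auto-sklearn searches over models and hyperparameters automatically
automl = autosklearn.classification.AutoSklearnClassifier()
automl.fit(data.values[1:, 2:], data.values[1:, 1])
# predict expects a 2D array, so the single sample is reshaped to (1, n_features)
y_hat = automl.predict(data.values[0, 2:].reshape(1, -1))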
An overview of the AutoML system taken from: Feurer, M., Klein, A., Eggensperger, K., Springenberg, J., Blum, M. and Hutter, F., 2015. Efficient and robust automated machine learning. In Advances in Neural Information Processing Systems (pp. 2962-2970).
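# Create a simple Keras model
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dropout, Dense
from keras.regularizers import l2
from keras.optimizers import SGD
model = Sequential()
# Input: one-channel 50x50 image, assuming the default channels-last ordering
model.add(Conv2D(6, (3, 3), input_shape=(50, 50, 1), activation='relu'))
model.add(Conv2D(3, (3, 3), strides=(2, 2), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dropout(0.3))
# kernel_regularizer is the Keras 2 name of the former W_regularizer argument
model.add(Dense(1, activation='sigmoid', kernel_regularizer=l2(0.1)))
# Compile the model
sgd = SGD(learning_rate=0.01, momentum=0.8, nesterov=True)
model.compile(loss='binary_crossentropy', optimizer=sgd, metrics=['accuracy'])
# summary() prints the table itself, so it is not wrapped in print()
model.summary()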
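The layer shapes and parameter counts that the summary reports can be worked out by hand. The sketch below assumes a single-channel 50x50 input and unpadded ('valid') convolutions and pooling, which is the Keras default. The helper names `conv_out` and `conv_params` are hypothetical and exist only for this illustration.

```python
def conv_out(size, kernel, stride=1):
    """Output length along one axis for an unpadded convolution or pooling."""
    return (size - kernel) // stride + 1


def conv_params(kernel, in_channels, filters):
    """Weights plus one bias per filter for a square 2D convolution."""
    return kernel * kernel * in_channels * filters + filters


side = 50
side = conv_out(side, 3)            # first Conv2D, 3x3, stride 1 -> 48
side = conv_out(side, 3, stride=2)  # second Conv2D, 3x3, stride 2 -> 23
side = conv_out(side, 2, stride=2)  # MaxPooling2D, 2x2 -> 11

flat = side * side * 3              # Flatten over 3 feature maps -> 363
dense_params = flat + 1             # one weight per input plus a bias -> 364

# Flatten and Dropout have no trainable parameters
total = conv_params(3, 1, 6) + conv_params(3, 6, 3) + dense_params

print('Flattened features:', flat)   # 363
print('Total parameters:', total)    # 589
```

Most of the parameters sit in the final dense layer, which is also the only layer with an L2 penalty in the model.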
https://developer.apple.com/documentation/coreml
https://cloud.google.com/ml-engine/
McDaniel, P., Papernot, N. and Celik, Z.B., 2016. Machine learning in adversarial settings. IEEE Security & Privacy, 14(3), pp. 68-72.
ML offers a fantastically powerful toolkit for building useful complex prediction systems quickly. ... it is dangerous to think of these quick wins as coming for free. ... it is common to incur massive ongoing maintenance costs in real-world ML systems. [Risk factors include] boundary erosion, entanglement, hidden feedback loops, undeclared consumers, data dependencies, configuration issues, changes in the external world, and a variety of system-level anti-patterns.
Machine Bias [Pro Publica]
There is a blind spot in AI research [Nature]
DeepCoder: Learning to Write Programs [arXiv preprint]