ML frameworks - comparison and experience
AutoML
How to evaluate an ML model
AutoKeras: an AutoML system based on Keras.
pip install autokeras
Requires 3.5 <= Python < 3.9 and TensorFlow >= 2.3.0.
from sklearn.datasets import fetch_california_housing
import numpy as np
import pandas as pd
import tensorflow as tf
import autokeras as ak

house_dataset = fetch_california_housing()
df = pd.DataFrame(
    np.concatenate(
        (house_dataset.data, house_dataset.target.reshape(-1, 1)),
        axis=1),
    columns=house_dataset.feature_names + ['Price'])

# Hold out the last 10% of rows for evaluation.
train_size = int(df.shape[0] * 0.9)
df[:train_size].to_csv('train.csv', index=False)
df[train_size:].to_csv('eval.csv', index=False)

train_file_path = 'train.csv'
test_file_path = 'eval.csv'

# Initialize the structured data regressor.
reg = ak.StructuredDataRegressor(
    overwrite=True,
    max_trials=3)  # It tries 3 different models.

# Feed the structured data regressor with training data.
reg.fit(
    train_file_path,  # The path to the train.csv file.
    'Price',          # The name of the label column.
    epochs=10)

# Predict with the best model.
predicted_y = reg.predict(test_file_path)

# Evaluate the best model with testing data.
print(reg.evaluate(test_file_path, 'Price'))
Confusion matrix:

| | Data positive | Data negative |
|---|---|---|
| Model positive | a | b |
| Model negative | c | d |
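As a sketch of how to obtain the cells a–d in practice (the labels here are made-up illustration data): scikit-learn's `confusion_matrix` puts actual labels on the rows and predictions on the columns, so it must be transposed to match the table above, where rows are the model's predictions.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = np.array([1, 1, 0, 1, 0, 0, 1, 0])  # Data (actual labels)
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 1])  # Model (predictions)

# labels=[1, 0] puts the positive class first; .T flips the matrix so that
# rows = Model and columns = Data, as in the table above.
cm = confusion_matrix(y_true, y_pred, labels=[1, 0]).T
a, b = cm[0]  # a = true positives,  b = false positives
c, d = cm[1]  # c = false negatives, d = true negatives
print(a, b, c, d)
```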
Accuracy = (a+d)/(a+b+c+d)
Precision = a/(a+b): proportion of predicted positives that are actually positive
Sensitivity (Recall) = a/(a+c): proportion of positive cases correctly identified
Specificity = d/(b+d): proportion of negative cases correctly identified
F1 = 2*((precision*recall)/(precision+recall))
Matthews correlation coefficient (MCC): value from -1 to 1, where 0 corresponds to random guessing.
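The formulas above can be computed directly from the four confusion-matrix cells; `classification_metrics` is a hypothetical helper name used only for this sketch.

```python
def classification_metrics(a, b, c, d):
    """Compute the metrics above from confusion-matrix cells:
    a = TP, b = FP, c = FN, d = TN."""
    accuracy = (a + d) / (a + b + c + d)
    precision = a / (a + b)
    recall = a / (a + c)            # sensitivity
    specificity = d / (b + d)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, specificity, f1

# Example: 3 TP, 2 FP, 1 FN, 2 TN.
print(classification_metrics(3, 2, 1, 2))
```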
Root mean squared error (RMSE) is probably the most popular measure of a regression model's error rate.
Relative squared error (RSE) can be compared between models whose errors are measured in different units.
The mean absolute error (MAE) has the same unit as the original data, so it can only be compared between models whose errors are measured in the same unit. It is usually similar in magnitude to RMSE, but slightly smaller.
Like RSE, the relative absolute error (RAE) can be compared between models whose errors are measured in different units.
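A minimal sketch of these four error measures, using the common definitions in which RSE and RAE normalise by a naive predictor that always outputs the mean of the actual values (that normalisation is what makes them unit-free and comparable across models); `regression_errors` and the sample values are illustrative only.

```python
import numpy as np

def regression_errors(y_true, y_pred):
    """Compute RMSE, MAE, RSE and RAE as defined above."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred

    rmse = np.sqrt(np.mean(err ** 2))          # same unit as the data
    mae = np.mean(np.abs(err))                 # same unit as the data

    # Relative errors: ratio to the error of always predicting the mean.
    dev = y_true - y_true.mean()
    rse = np.sum(err ** 2) / np.sum(dev ** 2)
    rae = np.sum(np.abs(err)) / np.sum(np.abs(dev))
    return rmse, mae, rse, rae

print(regression_errors([3.0, 5.0, 2.0], [2.5, 5.0, 3.0]))
```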
# Export as a Keras Model.
model = reg.export_model()
print(type(model))  # <class 'tensorflow.python.keras.engine.training.Model'>

try:
    # Save in the TensorFlow SavedModel format.
    model.save("model_autokeras", save_format="tf")
except Exception:
    # Fall back to the HDF5 format.
    model.save("model_autokeras.h5")

from tensorflow.keras.models import load_model

loaded_model = load_model("model_autokeras", custom_objects=ak.CUSTOM_OBJECTS)

# Predict on the held-out features (eval.csv without the 'Price' column).
x_test = pd.read_csv(test_file_path).drop(columns=['Price']).to_numpy()
predicted_y = loaded_model.predict(x_test)
print(predicted_y)