Data Scientist
formulate
problem
get data
build and clean dataset
study dataset
train model
+ feature selection
+ algorithm selection
+ hyperparameter optimization
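The training step above bundles feature selection, algorithm selection and hyperparameter optimization. A minimal scikit-learn sketch of that step, assuming a regression task with the prepared features and target already loaded into X and y (the estimator and the grid values are purely illustrative):

from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Feature selection and the estimator live in one pipeline, tuned by grid search
pipeline = Pipeline([
    ('select', SelectKBest(score_func=f_regression)),
    ('model', RandomForestRegressor(random_state=0)),
])
param_grid = {
    'select__k': [3, 5, 7],            # how many features to keep
    'model__n_estimators': [50, 100],  # illustrative hyperparameter grid
}
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X, y)                       # X, y assumed to be the prepared dataset
model = search.best_estimator_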
goal = "automatically serve predictions to any given information system"
philosophy = "start from final use case and work your way back to data-prep"
potential_constraints = [
    data_sources,
    data_enrichment,
    model_stability,
    scalability,
    resilience,
    resources
]
Think about your production use-case as early as possible.
This approach is based on serialization and also works for models trained with libraries other than scikit-learn.
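A minimal sketch of that serialization step with joblib (pickle works the same way for models from other libraries); the filename is illustrative:

import joblib

# Persist the trained estimator after training ...
joblib.dump(model, 'model.joblib')

# ... and restore the same artifact later, inside the serving application
model = joblib.load('model.joblib')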
import joblib
from flask import Flask, jsonify, request

app = Flask(__name__)
model = joblib.load('model.joblib')  # deserialize the model saved during training

@app.route('/api/v1.0/abalone', methods=['POST'])
def index():
    query = request.get_json()['inputs']   # raw feature rows from the JSON payload
    data = prepare(query)                   # same preprocessing as at training time
    output = model.predict(data)
    return jsonify({'outputs': output.tolist()})  # ndarray -> JSON-serializable list

if __name__ == '__main__':
    app.run(host='0.0.0.0')
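A hypothetical client call against this endpoint, assuming the server runs on Flask's default port 5000 and that prepare() accepts a list of feature rows (the values below are placeholders):

import requests

response = requests.post(
    'http://localhost:5000/api/v1.0/abalone',
    json={'inputs': [[0.455, 0.365, 0.095, 0.514, 0.2245, 0.101, 0.15]]},
)
print(response.json())  # e.g. {'outputs': [...]}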