A quick-start guide to building maker projects with AI and ML
Supervised Learning
Unsupervised Learning
Reinforcement Learning
Computer Vision & Natural Language Processing
Audio data is usually doable, but results tend to be weaker
| Pros | Cons |
|---|---|
| Mature tech | Heavy duty (forget the RPi) |
| Easy to start with | Hard to understand |
Tabular data is most likely classical machine learning
Reinforcement Learning is mainly interesting for games
Unsupervised ML sucks hard for makers
Try to minimize your inputs and outputs
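For tabular data, classical ML really is this little code. A minimal sketch with Scikit-Learn; the sensor readings and labels below are made up for illustration:

```python
# Classical ML on tabular data: a few sensor readings in, one label out.
from sklearn.ensemble import RandomForestClassifier

# Rows: [temperature, humidity, light_level] -> label (0 = off, 1 = on)
X = [[21.0, 40, 300], [30.5, 20, 800], [19.5, 55, 150], [28.0, 25, 700]]
y = [0, 1, 0, 1]

clf = RandomForestClassifier(random_state=0).fit(X, y)
print(clf.predict([[29.0, 22, 750]]))
```

Few inputs, one output: exactly the "minimize your inputs and outputs" rule above.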
Matlab
Intel OneAPI
ONNX
TensorRT
TensorFlow / Keras
+ Lite!
Scikit-Learn
Or even replace what a human would manually input
¯\_(ツ)_/¯
Person is present on the camera feed
Tweet is unfriendly
Audio sample contains dog barks
or proximity sensor?
or weight sensor?
DATA
MODEL
DEPLOYMENT
All examples of CV applications
Keypoints
Partial credit: Anthony Sarkis, Stanford CS231n
Pretrained Models
Transfer Learning
🔥Huggingface
🔥Pinto Model Zoo
Tensorflow Hub
TensorFlow Model Garden
OpenVino Model Zoo
Awesome Pytorch List
Awesome TensorFlow Lite
The project itself on GitHub
🔥Huggingface
from torch import autocast
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")
pipe = pipe.to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"
with autocast("cuda"):
    image = pipe(prompt, guidance_scale=7.5)["sample"][0]
image.save("astronaut_rides_horse.png")
Reuse the pretrained model that's closest to what you want
Retrain only the last layers with a little data
???
Profit
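That recipe in its simplest form: keep the pretrained network frozen as a feature extractor and fit only a small classifier (the "last layer") on your handful of labels. The embeddings below are random stand-ins so the sketch runs anywhere; in a real project they would come out of e.g. a frozen MobileNet backbone:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Stand-in for embeddings from a frozen pretrained backbone:
# 40 "images", 1280 features each (MobileNetV2's embedding size)
features = rng.normal(size=(40, 1280))
labels = np.array([0, 1] * 20)  # tiny labeled dataset, 2 classes

# "Retrain only the last layers": fit just this head, nothing else
head = LogisticRegression(max_iter=1000).fit(features, labels)
print(head.score(features, labels))
```

Because only the small head is trained, a few dozen labeled examples are often enough.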
Found BlazePose on the PINTO Model Zoo
| | kp1 | kp2 | kp3 | kp4 | kp5 | kp6 | kp7 | kp8 | kp9 |
|---|---|---|---|---|---|---|---|---|---|
| x | 2 | 3 | 5 | 7 | 8 | 9 | 0 | -1 | 4 |
| y | 6 | 8 | 1 | 3 | 5 | -4 | -2 | 1 | 0 |
Tabular Data in disguise!
>>> from sklearn.pipeline import make_pipeline
>>> from sklearn.preprocessing import StandardScaler
>>> from sklearn.svm import SVC
>>> clf = make_pipeline(StandardScaler(), SVC(gamma='auto'))
>>> clf.fit(X, y)  # X: keypoint rows, y: pose labels
Pipeline(steps=[('standardscaler', StandardScaler()),
                ('svc', SVC(gamma='auto'))])
>>> print(clf.predict([[-0.8, -1]]))
[1]
PUSH UP
PUSH DOWN
Classical ML doesn't need a model server
import joblib
from clearml import Task

class Inference:
    def __init__(self):
        # Fetch the scaler and model artifacts from the ClearML training task
        training_task = Task.get_task(task_id='4cbfaf6975df463b89ab28379b00639b')
        self.scaler = joblib.load(training_task.artifacts['scaler_remote'].get_local_copy())
        self.model = joblib.load(training_task.artifacts['model_remote'].get_local_copy())

    def preprocess_landmarks(self, landmarks):
        # transform() returns a scaled copy; it does not modify in place
        return self.scaler.transform(landmarks)

    def run_model(self, landmarks):
        return self.model.predict(landmarks)

    def predict(self, landmarks):
        scaled_landmarks = self.preprocess_landmarks(landmarks)
        return self.run_model(scaled_landmarks)
GOOD ENOUGH
Labeling Tool Edition
🔥Huggingface
Premade APIs
import soundfile as sf
from transformers import Speech2Text2Processor, SpeechEncoderDecoderModel

model = SpeechEncoderDecoderModel.from_pretrained("facebook/s2t-wav2vec2-large-en-de")
processor = Speech2Text2Processor.from_pretrained("facebook/s2t-wav2vec2-large-en-de")

# The processor expects raw samples, not a filename: load the 16 kHz audio first
audio, _ = sf.read("hey_newline.wav")
inputs = processor(audio, sampling_rate=16_000, return_tensors="pt")
generated_ids = model.generate(inputs=inputs["input_values"],
                               attention_mask=inputs["attention_mask"])
transcription = processor.batch_decode(generated_ids)
Just build it in
This slide structure is already a dumpster fire
+ Voice activated labeling tool
Takeaways
Use pretrained models: check Huggingface
Force ML into your project
Label yourself, you don't need much
victor@projectwhy.be
projectwhy.be
@VictorSonck
Check out ClearML!
It's Open Source
It can help