There’s one type of osh for everyone. You only request how much you want.
You get what’s cooked: "Give me one portion."
You ask for: nvidia.com/gpu: 1

resources:
  limits:
    nvidia.com/gpu: 1

You could not describe what you actually need.

You request exactly what you want. You can customize based on your taste.
"Give me one portion with: extra meat, less oil, eggs."
You can ask for:

spec:
  requirements:
  - deviceClassName: gpu
    selectors:
    - name: model
      value: A100
    - name: memory
      value: "40Gi"

Driver developer
Cluster admin
Application Developer/DevOps
A driver developer is someone who understands how a piece of hardware works and writes the software that controls and allocates that hardware.
A cluster admin is someone who installs the DRA driver, sets up device classes, and configures nodes (e.g., attaching GPUs) so workloads can use the hardware.
An app developer/DevOps engineer is someone who knows the application's needs and defines the resource requirements (ResourceClaims) for their application.
helm install dra-driver-pizza....

kubectl get resourceSlices
NAME                                 NODE                 DRIVER      POOL                 AGE
kind-control-plane-dra.pizza-9q2ls   kind-control-plane   dra.pizza   kind-control-plane   20h

apiVersion: pizza.kitchen/v1
kind: ResourceSlice
metadata:
  name: pizzahut-matinkyla
spec:
  pizzas:
  - name: margherita-pan-pizza
    attributes:
      kitchen.pizza.example/dough:
        string: pan
      kitchen.pizza.example/sauce:
        string: tomato
      kitchen.pizza.example/cheese:
        string: mozzarella
      kitchen.pizza.example/toppings:
        string: "basil"
      kitchen.pizza.example/extraCheeseAvailable:
        bool: true
      kitchen.pizza.example/availableSlices:
        string: "4,6,8"
  - name: pepperoni-pan-pizza
    attributes:
      kitchen.pizza.example/dough:
        string: pan
      kitchen.pizza.example/sauce:
        string: tomato
      kitchen.pizza.example/cheese:
        string: mozzarella
      kitchen.pizza.example/toppings:
        string: "pepperoni"
      kitchen.pizza.example/extraCheeseAvailable:
        bool: true
      kitchen.pizza.example/availableSlices:
        string: "6,8"
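
The pizza manifest above is an analogy. As a rough sketch of how the same idea maps onto the real API, assuming the `resource.k8s.io/v1` schema, a ResourceSlice published by the `dra.pizza` driver might look like the following. The object name, node, driver, and pool are taken from the `kubectl get resourceSlices` output; the `generation` and `resourceSliceCount` values are assumptions, and the device and its attributes are carried over from the analogy for illustration only.

```yaml
# Sketch only: in practice the DRA driver publishes this object,
# it is not written by hand.
apiVersion: resource.k8s.io/v1
kind: ResourceSlice
metadata:
  name: kind-control-plane-dra.pizza-9q2ls
spec:
  driver: dra.pizza
  nodeName: kind-control-plane
  pool:
    name: kind-control-plane
    generation: 1            # assumed value
    resourceSliceCount: 1    # assumed value
  devices:
  - name: margherita-pan-pizza   # hypothetical device
    attributes:
      kitchen.pizza.example/dough:
        string: pan
      kitchen.pizza.example/toppings:
        string: "basil"
      kitchen.pizza.example/extraCheeseAvailable:
        bool: true
```

Each attribute carries a typed value (`string`, `bool`, `int`, or `version`), which is what CEL selectors in a DeviceClass or ResourceClaim are evaluated against.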
"I want a vegetarian pizza with mushrooms, basil, and extra cheese"
deviceClass
attributes
apiVersion: resource.k8s.io/v1
kind: DeviceClass
metadata:
  name: amd-cpu
spec:
  selectors:
  - cel:
      expression: |
        device.attributes["hardware.cpu"].architecture == "x86_64" &&
        device.attributes["hardware.cpu"].vendor == "amd"
---
apiVersion: resource.k8s.io/v1
kind: DeviceClass
metadata:
  name: intel-cpu
spec:
  selectors:
  - cel:
      expression: |
        device.attributes["hardware.cpu"].architecture == "x86_64" &&
        device.attributes["hardware.cpu"].vendor == "intel"
"I want a vegetarian pizza with mushrooms, basil, and extra cheese"
deviceClass
attributes
apiVersion: resource.k8s.io/v1
kind: DeviceClass
metadata:
  name: pizza-vegetarian
  ...
spec:
  selectors:
  - cel:
      expression: |
        device.attributes["kitchen.pizza.example"].cheese == "mozzarella" &&
        device.attributes["kitchen.pizza.example"].toppings.contains("mushroom") &&
        device.attributes["kitchen.pizza.example"].extraCheeseAvailable == true

apiVersion: resource.k8s.io/v1
kind: DeviceClass
metadata:
  name: gpu.amd.com
spec:
  selectors:
  - cel:
      expression: "device.driver == 'gpu.amd.com'"
---
apiVersion: resource.k8s.io/v1
kind: DeviceClass
metadata:
  name: mig.nvidia.com
spec:
  selectors:
  - cel:
      expression: "device.driver == 'gpu.nvidia.com' && device.attributes['gpu.nvidia.com'].type == 'mig'"
---
apiVersion: resource.k8s.io/v1
kind: ResourceClaim
metadata:
  name: my-pizza-order
spec:
  devices:
    requests:
    - name: pizza
      exactly:
        deviceClassName: pizza-vegetarian
        selectors:
        - cel:
            expression: |-
              device.attributes["kitchen.pizza.example"].cheese == "mozzarella" &&
              device.attributes["kitchen.pizza.example"].toppings.contains("mushroom") &&
              device.attributes["kitchen.pizza.example"].extraCheeseAvailable == true
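
A claim does nothing until a Pod references it. The sketch below shows one way a Pod could consume `my-pizza-order`: the claim is listed under `spec.resourceClaims`, and the container opts in through `resources.claims`. The Pod name, container name, and image are placeholders, not part of the original example.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pizza-eater                 # hypothetical name
spec:
  resourceClaims:
  - name: pizza                     # local name used inside this Pod
    resourceClaimName: my-pizza-order
  containers:
  - name: app                       # hypothetical container
    image: registry.example/app:latest   # placeholder image
    resources:
      claims:
      - name: pizza                 # must match spec.resourceClaims[].name
```

Only containers that list the claim under `resources.claims` get access to the allocated device; other containers in the same Pod do not.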

apiVersion: resource.k8s.io/v1
kind: ResourceClaim
metadata:
  name: claim-cpu-capacity-20
spec:
  devices:
    requests:
    - name: numa0-cpus
      exactly:
        deviceClassName: dra.cpu
        capacity:
          requests:
            dra.cpu/cpu: "10"
        selectors:
        - cel:
            expression: device.attributes["dra.cpu"].numaNodeID == 0
    - name: numa1-cpus
      exactly:
        deviceClassName: dra.cpu
        capacity:
          requests:
            dra.cpu/cpu: "10"
        selectors:
        - cel:
            expression: device.attributes["dra.cpu"].numaNodeID == 1

Pod references ResourceClaim
ResourceClaim: deviceClass + CEL constraints
DeviceClass: cluster-wide device filter
ResourceSlice: devices advertised by the DRA driver
Scheduler: matches the claim to the node
ResourceClaim is allocated: status is updated by the scheduler
DRA driver on node: NodePrepareResources called
Kubelet mounts devices into the Pod sandbox
Pod starts: device ready, CDI injected

CPU, memory, hugepages, NIC, RDMA net devices, etc.
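
To make the "ResourceClaim is allocated" step concrete, here is a rough sketch of what the status of `my-pizza-order` might look like once the scheduler has picked a device, assuming the `resource.k8s.io/v1` status layout. The driver and pool names come from the earlier `kubectl get resourceSlices` output; the device name, Pod name, and UID are hypothetical placeholders.

```yaml
# Sketch of an allocated claim; the status is written by the control plane,
# never by the user.
apiVersion: resource.k8s.io/v1
kind: ResourceClaim
metadata:
  name: my-pizza-order
status:
  allocation:
    devices:
      results:
      - request: pizza                 # name of the request in the claim spec
        driver: dra.pizza
        pool: kind-control-plane
        device: margherita-pan-pizza   # hypothetical device name
  reservedFor:
  - resource: pods
    name: pizza-eater                  # hypothetical Pod
    uid: 00000000-0000-0000-0000-000000000000   # placeholder UID
```

Running `kubectl get resourceclaim my-pizza-order -o yaml` is one way to check whether a claim has reached this state.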