OMB - PARAMETER FILTERING

CW - 12.07.23

OMB - Python module

  • User interaction with renku client

renku client  -->  renku workflows, activities, datasets

  • Automatically manage inputs, outputs, parameters

manage --> identify matching inputs, check for updates, "match" outputs

User mode I: "config.yaml"

---
data:
    name: "method_XY"
    title: "example method XY"
    description: "method to do ..."
    keywords: ["method_omniexample"]
script: "src/run_method_XY.R"
benchmark_name: "omniexample"
inputs:
    keywords: ["dataset_omniexample"]
    files: ["counts", "meta_file"]
    prefix:
        counts: "_counts"
        meta_file: "_meta"
outputs:
    files:
        method_res:
            end: ".json"
parameter:
    names: ["dims", "n"]
    keywords: ["parameter_omniexample"]

Explicit input files

---
data:
    ...
script: "src/run_method_XY.R"
benchmark_name: "omniexample"
inputs:
    keywords: ["dataset_omniexample"]
    files: ["counts", "meta_file"]
    prefix:
        counts: "_counts"
        meta_file: "_meta"
    input_files:
        explicit_dataset:
            count_file: "data/explicit_dataset/explicit_dataset.mtx.gz"
            meta_file: "data/explicit_dataset/explicit_dataset.json"
outputs:
    ...

Input dataset filter

---
data:
    ...
script: "src/run_method_XY.R"
benchmark_name: "omniexample"
inputs:
    keywords: ["dataset_omniexample"]
    files: ["counts", "meta_file"]
    prefix:
        counts: "_counts"
        meta_file: "_meta"
    filter_names: ["dataset1", "dataset2"]
outputs:
    ...
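
A minimal sketch of how a name-based input filter could be applied once the datasets matching the input keyword are known. The slide does not state whether the names in filter_names are kept or dropped; this sketch assumes they are dropped and offers a flag for the opposite reading. The function name, the flag and the dataset list are hypothetical and not part of the omnibenchmark API.

```python
def filter_datasets(available, filter_names=None, exclude=True):
    """Apply a name filter to a list of dataset names.

    exclude=True  -> drop the names in filter_names (assumed reading of the slide)
    exclude=False -> keep only the names in filter_names
    """
    if not filter_names:
        return list(available)
    names = set(filter_names)
    if exclude:
        return [d for d in available if d not in names]
    return [d for d in available if d in names]


# Hypothetical datasets found via the "dataset_omniexample" keyword
available = ["dataset1", "dataset2", "dataset3"]

remaining = filter_datasets(available, ["dataset1", "dataset2"])
only_listed = filter_datasets(available, ["dataset1", "dataset2"], exclude=False)
print(remaining, only_listed)
```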

Implicit parameter space

---
inputs:
    ...
outputs:
    ...
parameter:
    names: ["dims", "n"]
    keywords: ["parameter_omniexample"]

Explicit parameter space

---
inputs:
    ...
outputs:
    ...
parameter:
    values:
        dims: [10, 20]
        n: [10, 100, 1000]
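
A short sketch of what an explicit parameter space amounts to, assuming the listed values are expanded into their full grid (every dims value paired with every n value). The helper name is hypothetical; only the values come from the slide.

```python
from itertools import product


def expand_parameter_space(values):
    """Expand {name: [values]} into a list of {name: value} combinations."""
    names = list(values)
    return [
        dict(zip(names, combo))
        for combo in product(*(values[name] for name in names))
    ]


values = {"dims": [10, 20], "n": [10, 100, 1000]}
grid = expand_parameter_space(values)

for combo in grid:
    print(combo)
```

With two values for dims and three for n this yields six combinations, which is the space the later filters would narrow down.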

Explicit parameter combinations

---
inputs:
    ...
outputs:
    ...
parameter:
    combinations:
        comb1:
            dims: 10
            n: 10
        comb2:
            dims: 10
            n: 100
        comb3:
            dims: 20
            n: 10

Parameter limits and filter

---
inputs:
    ...
outputs:
    ...
parameter:
    names: ["dims", "n"]
    keywords: ["parameter_omniexample"]
    filter:
        dims:
            upper: 100
            lower: 10
        n:
            exclude: 100
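
A minimal sketch of how such limits could be checked against candidate parameter combinations. It assumes that upper and lower are inclusive bounds and that exclude may be a single value or a list; neither is stated on the slide. The function names and the candidate combinations are hypothetical.

```python
def passes_rule(value, rule):
    """Check one parameter value against its upper/lower/exclude rule."""
    if "lower" in rule and value < rule["lower"]:
        return False
    if "upper" in rule and value > rule["upper"]:
        return False
    exclude = rule.get("exclude", [])
    if not isinstance(exclude, (list, tuple, set)):
        exclude = [exclude]  # allow a single excluded value
    return value not in exclude


def apply_parameter_filter(combinations, rules):
    """Keep the combinations whose values satisfy every rule."""
    return [
        combo
        for combo in combinations
        if all(passes_rule(combo[name], rule)
               for name, rule in rules.items() if name in combo)
    ]


# Rules taken from the config above
rules = {"dims": {"upper": 100, "lower": 10}, "n": {"exclude": 100}}

# Hypothetical candidate combinations
candidates = [
    {"dims": 5, "n": 10},
    {"dims": 10, "n": 10},
    {"dims": 10, "n": 100},
    {"dims": 100, "n": 10},
    {"dims": 200, "n": 10},
]

kept = apply_parameter_filter(candidates, rules)
print(kept)
```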

How to combine input and parameter filter?

Dataset X    true k: 3
Dataset Y    true k: 8

Parameter    k: [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]

k per dataset:
Dataset X:   [1,2,3,4,5,6]
Dataset Y:   [5,6,7,8,9,10]
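
One way to derive such dataset-specific values of k is a fixed-width window around the ground truth. The rule below (start three below the true k, never below the smallest candidate, keep six values) is an assumption chosen because it reproduces the two windows shown here; it is not taken from omnibenchmark, and the function name is hypothetical.

```python
def k_window(true_k, candidates, below=3, width=6):
    """Select `width` consecutive candidates starting `below` under true_k."""
    start = max(min(candidates), true_k - below)
    return [k for k in candidates if start <= k < start + width]


k_values = list(range(1, 16))       # k: 1..15
true_k = {"X": 3, "Y": 8}           # ground truth per dataset

windows = {name: k_window(k, k_values) for name, k in true_k.items()}
print(windows)
```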

Implicit output definition

---
inputs:
    ...
outputs:
    files:
        method_res:
            end: ".json"

Explicit output definition

---
inputs:
    ...
outputs:
  file_mapping:
    mapping1: 
      output_files:
        method_res: "data/method_XY/method_XY_data1__dims_10__n_100.json"
      input_files:
        count_file: "data/data1/data1__counts.mtx.gz"
        meta_file: "data/data1/data1__meta_file.json"
      parameter:
        dims: 10
        n: 100
    mapping2: 
      output_files:
        method_res: "data/method_XY/method_XY_data2__dims_10__n_10.json"
      input_files:
        count_file: "data/data2/data2__counts.mtx.gz"
        meta_file: "data/data2/data2__meta_file.json"
      parameter:
        dims: 10
        n: 10
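
The output paths in the mapping follow a visible pattern: method name, dataset name, then one name_value pair per parameter, joined by double underscores. A small sketch that rebuilds these paths is given below; the pattern is inferred from the two examples only, and the helper is hypothetical rather than an omnibenchmark function.

```python
def output_path(method, dataset, params, end=".json"):
    """Build an output path following the pattern seen in the mapping above."""
    param_part = "__".join(f"{name}_{value}" for name, value in params.items())
    return f"data/{method}/{method}_{dataset}__{param_part}{end}"


path1 = output_path("method_XY", "data1", {"dims": 10, "n": 100})
path2 = output_path("method_XY", "data2", {"dims": 10, "n": 10})
print(path1)
print(path2)
```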

How to combine input and parameter filter?

Dataset X    true k: 3
Dataset Y    true k: 8
Dataset ?    true k: ?

Parameter    k: [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]

k per dataset:
Dataset X:   [1,2,3,4,5,6]
Dataset Y:   [5,6,7,8,9,10]
Dataset ?:   [?]

Filter-combinations.json

---
inputs:
    ...
outputs:
    files:
        method_res:
            end: ".json"
    filter_json: "path/to/filter_combinations.json"
    

User mode II: OmniObject

import omnibenchmark as omni

omni_obj = omni.utils.get_omni_object_from_yaml("config.yaml")

omni_obj.__class__
<class 'omnibenchmark.core.omni_object.OmniObject'>
omni_obj.__dict__

{'name': 'method_XY', 
'keyword': ['method_omniexample'], 
'title': 'example method XY', 
'description': 'method to do ...', 
'command': <omnibenchmark.core.output_classes.OmniCommand object>, 
'inputs': <omnibenchmark.core.input_classes.OmniInput object>, 
'outputs': <omnibenchmark.core.output_classes.OmniOutput object>, 
'parameter': <omnibenchmark.core.input_classes.OmniParameter object>, 
'script': 'src/run_method_XY.R', 
'omni_plan': None, 
'orchestrator': 'https://renkulab.io/omniexample/orchestrator', 
'benchmark_name': 'omniexample', 
'wflow_name': 'method_XY', 
'dataset_name': 'method_XY', 
'renku': True, 
'kg_url': 'https://renkulab.io/knowledge-graph'}

OmniObject

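import json

import omnibenchmark as omni
import filter_combinations
# renku_save is used below; this import path is an assumption
from omnibenchmark.renku_commands.general import renku_save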

# Build object from config
omni_obj = omni.utils.get_omni_object_from_yaml("config.yaml")
omni_obj.__dict__

# Check for new inputs/updates 
omni_obj.update_object()

# Generate json with filter combinations
filter_comb = filter_combinations.get_param_filter_by_ground_truth(omni_obj)
with open("src/filter_comb.json", "w") as fp:
    json.dump(filter_comb, fp, indent=3)
renku_save()

# Update outputs and commands
omni_obj.outputs.filter_json = "src/filter_comb.json"
omni_obj.outputs.update_outputs()
omni_obj.command.outputs = omni_obj.outputs
omni_obj.command.update_command()

Automatically generate filter_comb.json


Summary Input-Parameter Filtering

Three main levels of filtering:

  • inputs
  • parameter
  • combinations

(renku-independent)

flexibility vs. simplicity?
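
The three levels can be sketched together in plain Python, independent of renku. This is a hedged illustration of the idea, not the omnibenchmark implementation: the function, its arguments, the third dataset "Z" and the "within 2 of the true k" rule are all hypothetical.

```python
from itertools import product


def select_runs(datasets, param_values, input_exclude=(), param_rules=None,
                allowed=None):
    """Return (dataset, params) pairs that survive all three filter levels."""
    param_rules = param_rules or {}
    excluded = set(input_exclude)

    # Level 1: inputs
    kept = [d for d in datasets if d not in excluded]

    # Level 2: parameter (expand the grid, then apply per-parameter rules)
    names = list(param_values)
    grid = [dict(zip(names, c))
            for c in product(*(param_values[n] for n in names))]
    grid = [p for p in grid
            if all(rule(p[n]) for n, rule in param_rules.items())]

    # Level 3: combinations of dataset and parameters
    runs = [(d, p) for d in kept for p in grid]
    if allowed is not None:
        runs = [(d, p) for d, p in runs if allowed(d, p)]
    return runs


true_k = {"X": 3, "Y": 8, "Z": 5}
runs = select_runs(
    datasets=["X", "Y", "Z"],
    param_values={"k": list(range(1, 16))},
    input_exclude=["Z"],                                 # input filter
    param_rules={"k": lambda k: k <= 10},                # parameter filter
    allowed=lambda d, p: abs(p["k"] - true_k[d]) <= 2,   # combination filter
)
print(len(runs))
```

Keeping the three levels as separate arguments is the flexible option; collapsing them into one precomputed list of allowed combinations would be the simpler one.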

cw: omb-filtering

By Almut Luetge
