Data Analysis Working Group
presented by Matias & Wout, Software Group
23/06/2021
BLISS: a Swiss Army knife to open up Data Analysis
Data flow
Acquisition chain
Diagram: acquisition chain → channels 1, 2, 3 → Redis
What is Redis?
Diagram: several producers push entries with XADD; several consumers read them with XREAD
Redis streams
Individual string values
We arbitrarily limit streams to 2000 string values max (see the redis-py sketch below)
One string value can correspond to multiple data events
BLISS client API: knows the scan structure, reads the data
Note: Redis data is only transient!
Consumers need to read from it continuously to get the complete data
(like the Nexus Writer, for example)
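To make the producer/consumer picture above concrete, here is a minimal redis-py sketch; the stream key and field names are illustrative (not the actual keys BLISS uses), and the 2000-entry cap mentioned above is applied with XADD's maxlen option.

import redis

r = redis.Redis(host="localhost", port=6379)

# producer side: append one string value (which may carry several data
# events), trimming the stream to roughly 2000 entries as noted above
r.xadd("scan_42:channel:diode", {"data": "1.234"}, maxlen=2000, approximate=True)

# consumer side: read everything new since last_id, blocking up to 1 s;
# consumers must keep reading, since trimmed entries are gone for good
last_id = "0"
for stream_key, events in r.xread({"scan_42:channel:diode": last_id}, block=1000):
    for event_id, fields in events:
        print(event_id, fields[b"data"])
        last_id = event_id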
Beware of "scan.get_data()"
Only useful for small scans with the current implementation (alignment scans, etc.)
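For the small-scan case this warning is about, a sketch of what interactive use looks like in a BLISS session; the motor and counter names are illustrative, and the exact return type of get_data() may vary between BLISS versions.

# in a BLISS shell session (motor/counter names are illustrative)
s = ascan(roby, 0, 10, 20, 0.1, diode)

# pulls the whole scan back from Redis into memory: fine for a short
# alignment scan, too heavy for large acquisitions
data = s.get_data()
print(data["diode"])   # per-channel data, typically a numpy array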
References
Online Data Analysis
ODA: definition and ESRF use cases?
Real-time analysis while the acquisition is running?
Opening the scan file only at the end of the scan?
Running in the BLISS shell process?
Running on another computer/cluster?
Stopping the scan if the data is not good?
Live feedback and scan re-orientation in real time?
Automatic analysis? How to feed back the results?
ODA data flow today
Diagram: BLISS, Lima, Flint, Writer and Analysis components, connected by network I/O (serialization/deserialization), file I/O and links; one path is currently unused (?)
Leveraging the BLISS API to perform Online Data Analysis
BLISS API for Online Data Analysis
Scan Watching
Example from Wout, running in a Jupyter Notebook; see also on-the-fly FFT from scan data
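As a rough illustration of the scan-watching pattern (not a copy of Wout's notebook), a sketch assuming the bliss.data.node API; the session name is made up, and the exact signature and the events yielded by walk_on_new_events differ between BLISS versions, so treat the details as assumptions to check against your release.

from bliss.data.node import get_session_node

# Follow a running BLISS session and react to events as they are
# published to Redis (new scan, new channel, new data...).
session = get_session_node("demo_session")      # session name is illustrative
for event in session.walk_on_new_events():
    # each event is assumed to expose the event type and the data node
    # it refers to; the exact layout depends on the BLISS version
    print(event)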
Example from BM29:
Sequence diagram: the BLISS shell issues a scan command; the BM29 watcher, already listening, receives the "scan start", "scan data" and "scan end" events, executes commands and sends the relevant data to the processing (DAHU server); the scan command then returns in the BLISS shell.
Questions so far
More complicated ODA...
Feeding ODA results back to the BLISS shell
Sequence diagram: an experiment script in the BLISS shell issues a scan command; the processing, with a Scan Watcher already listening, receives the "scan start", "scan data" and "scan end" events; when the scan command returns, the script waits for the result on a Beacon channel; the processing sets the result value on that Beacon channel and the script resumes once the result is received.
Driving acquisition from ODA (1)
Problems with running multiple BLISS sessions in multiple processes
Driving acquisition from ODA (2)
Another way is to add "remote control" capabilities to a running BLISS process, similar to the SPEC remote-control feature
from bliss.setup_globals import *
from bliss.common import standard
from xmlrpc.server import SimpleXMLRPCServer
import inspect
import gevent

# allow_none=True so that functions returning None can be marshalled over XML-RPC
xmlrpc_server = SimpleXMLRPCServer(("", 8000), allow_none=True)
xmlrpc_server.register_introspection_functions()

# register all standard functions to make them available via the xml-rpc server
for name, func in inspect.getmembers(standard, inspect.isfunction):
    xmlrpc_server.register_function(func)

# start the xml-rpc server in a background greenlet
gevent.spawn(xmlrpc_server.serve_forever)
in the BLISS setup script...
from a remote Python process...
Python 3.7.10 (default, Feb 26 2021, 18:47:35)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from xmlrpc.client import ServerProxy
>>> xmlrpc_client = ServerProxy("http://localhost:8000", allow_none=True)
>>> xmlrpc_client.system.list_methods()
['Group', 'SoftAxis', '__move', '_lsmot', '_lsobj', 'a2scan', 'a3mesh', 'a3scan', 'a4scan', 'a5scan', 'amesh', 'anmesh', 'anscan', 'ascan', 'cleanup', 'clear_cache', 'ct', 'd2scan', 'd3mesh', 'd3scan', 'd4scan', 'd5scan', 'dmesh', 'dnscan', 'dscan', 'error_cleanup', 'info', 'interlock_state', 'iter_axes_position', 'iter_axes_position_all', 'iter_axes_state', 'iter_axes_state_all', 'iter_counters', 'lineup', 'lookupscan', 'loopscan', 'move', 'mv', 'mvd', 'mvdr', 'mvr', 'namedtuple', 'plot', 'pointscan', 'reset_equipment', 'rockit', 'safe_get', 'sct', 'sync', 'system.listMethods', 'system.methodHelp', 'system.methodSignature', 'timescan', 'wid']
>>> xmlrpc_client.mv("roby", 5)
>>>
Perspectives
A future objective is to turn BLISS into a server, which would host sessions and offer a web shell
This will come with a REST interface for remote operation
What about a BLISS Data Analysis object?
da.load_module("id31.analysis.whatever")
Proposal to have a common way to deal with analysis from BLISS scripts
Diagram: a BLISS process on computer C1 talks, through a BLISS Scan Watcher, to a generic analysis server compatible with the Data Analysis object, running on computer C1 (or C2)
da.execute_while_scanning(scan, "function_name")
da.execute_at_scan_end(scan, "function_name")
da.wait_result()
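The da.* calls above are only a proposal; as a talking point, here is one hypothetical shape such a Data Analysis object could take (every name, and the transport behind it, is invented for illustration).

class DataAnalysis:
    """Hypothetical facade: BLISS scripts submit analysis work to a
    generic analysis server without caring whether it runs on C1 or C2."""

    def __init__(self, server):
        self._server = server          # e.g. an RPC or REST client
        self._pending = None

    def load_module(self, name):
        # ask the analysis server to import e.g. "id31.analysis.whatever"
        self._server.call("load_module", name)

    def execute_while_scanning(self, scan, function_name):
        # run function_name on data streamed by the Scan Watcher while scanning
        self._pending = self._server.call("run_streaming", scan.name, function_name)

    def execute_at_scan_end(self, scan, function_name):
        # run function_name once on the complete scan, at scan end
        self._pending = self._server.call("run_at_end", scan.name, function_name)

    def wait_result(self):
        # block the BLISS script until the analysis result is available
        return self._server.wait(self._pending)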
Beyond Redis (?)
An in-memory store seems a good choice
How to scale up? Distributed memory across several computers?
Which level of performance do we need?
What are the bottlenecks?
(serialization/deserialization and copying?)
Diagram: Lima, Flint, Writer, Analysis and BLISS connected to an object store backed by shared memory, with Redis keeping the indexing role
1) Redis for event streaming and indexing of acquisition data
2) An immutable object store provides access to a shared memory space. Possible technology: vineyard (see the sketch below). Needs infrastructure (Kubernetes?)
3) Lima informs BLISS about acquisition progress and object IDs in the store
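To give an idea of what the immutable object store of point 2 could look like with vineyard (one candidate technology, as noted above), a minimal sketch; it assumes a vineyardd daemon is running, and the socket path is illustrative.

import numpy as np
import vineyard

# connect to a running vineyardd instance (socket path is illustrative)
client = vineyard.connect("/var/run/vineyard.sock")

# a producer (e.g. Lima) puts an immutable frame into shared memory and
# only passes the resulting object ID around (e.g. via Redis, point 1)
frame = np.zeros((2048, 2048), dtype=np.uint16)
object_id = client.put(frame)

# a consumer on the same host (BLISS, analysis...) retrieves the frame
# by ID from shared memory instead of copying it through Redis
same_frame = client.get(object_id)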
Introducing a dedicated data manager
Conclusion
The BLISS API can provide data for online data analysis
There is room for improvement, though: the final components are still to be defined
It is unclear what our needs are: how do we address the use cases efficiently?
Depending on the use cases, a solution for current and future needs might imply opting for other technologies and infrastructure