A Metrics-Oriented Architectural Model
to Characterize Complexity on
Machine Learning-Enabled Systems

https://renatocf.xyz/cain-2025-slides

2025

Renato Cordeiro Ferreira

Institute of Mathematics and Statistics (IME)
University of São Paulo (USP) – Brazil

Jheronimus Academy of Data Science (JADS)
Technical University of Eindhoven (TUe) – The Netherlands

Co-founder of CodeLab
Brazilian student group focused on stimulating technological innovation.

B.Sc. and M.Sc. at University of São Paulo (BR)

Theoretical and practical experience with machine learning and software engineering.

Scientific Programmer at JADS (NL)

Participating in European projects that use machine learning techniques.

Ph.D. candidate at USP

Research on intelligent software engineering, in particular metrics for building ML-enabled systems.

Renato Cordeiro Ferreira

https://renatocf.xyz/contacts

My goal is to
use metrics to identify
where complexity emerges
in the software architecture
of ML-enabled systems

Research Questions

RQ1
What are the measurable dimensions of complexity
in the architecture of ML-enabled systems?

RQ2
How can complexity metrics be operationalized
over the architecture of ML-enabled systems?

RQ3
How can complexity metrics be used to aid
the development, operation, and evolution
of real-world ML-enabled systems?

Research Questions

RQ3
How can complexity metrics be used to aid
the development, operation, and evolution
of real-world ML-enabled systems?

RQ3.1
How can complexity metrics be used to choose between architecture proposals for an ML-enabled system?

RQ3.2
How can complexity metrics be used to identify refactoring opportunities in an ML-enabled system?

Research Methodology

State of the art about metrics regarding
ML-Enabled Systems

Industry- and academic-based case study on complexity metrics for
ML-Enabled Systems

Mixed-method approach to assess the impact of complexity in development tasks for
ML-Enabled Systems

Threats to Validity

Construct Validity
The study can measure what it proposed to measure

Internal Validity
The study can produce the results it reported

External Validity
The study can be generalized to other contexts

Reliability
The study can be replicated by other researchers

[Diagram: items tagged with the validity categories C, I, E, and R]
First view: Data from knowledge bases; Researcher Ontology Design; Choice of Case Studies; Selection of Metrics; Constructed Examples; Sampling Population of Practitioners
Second view: Established Publication Databases; Guidelines from Literature; Inclusion Criteria for Case Studies; Exploratory + Confirmatory Case Studies; Sampling Population of Practitioners; Constructed Examples

Reference Architecture
for ML-Enabled Systems

Case Studies

Data Collection App
Scientific Initiation 2021
Francisco Wernke

Streaming Prediction Server + Client API / App
Capstone Project 2022
Vitor Tamae

High Availability with Kubernetes
Capstone Project 2023
Vitor Guidi

Redesign Continuous Training Subsystem
Capstone Project 2023
Daniel Lawand

CI/CD/CD4ML on Training Pipeline
Capstone Project 2024
Lucas Quaresma + Roberto Bolgheroni

[Diagram: reference architecture for ML-enabled systems]
Subsystems: Continuous Training, Serving, Continuous Delivery, Data Acquisition, Development, Monitoring
Components: Data Store, Synthetic Data Gen. Pipeline, Rule-Based Training Pipeline, ML-Based Training Pipeline, Data Augmentation Pipeline, Metadata Store, Raw Data Store, Model Registry, CI Pipeline, CD Pipeline, CD4ML Pipeline, Artifact Registry, Scheduler Service, Code Repository, Batch Prediction Pipeline, API Prediction Service, Prediction Store, Code Editor / IDE, Notebooks, 3rd party Providers, Label Store, Web Application, Physical Sensors (Marine Objects), Data Crawlers, Governance Application, Telemetry Store
Connection labels: build, trigger, Manual Trigger, redeploy, run, deploy, update, rollback, train
Element markers: 1 to 9, I to VIII, A to H
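One way to make "complexity over the architecture" concrete is to treat the diagram as a directed graph and count connections per component. The sketch below is only an illustration under stated assumptions: the component names are taken from the diagram, but the edges are hypothetical (the slide text does not preserve them), and fan-in/fan-out is a generic coupling measure chosen for the example, not necessarily one of the metrics proposed in this research.

```python
from collections import defaultdict

# Component names come from the reference architecture diagram.
# The edges are HYPOTHETICAL, chosen only to illustrate the computation.
EDGES = [
    ("Code Repository", "CI Pipeline"),
    ("CI Pipeline", "Artifact Registry"),
    ("Scheduler Service", "ML-Based Training Pipeline"),
    ("Raw Data Store", "ML-Based Training Pipeline"),
    ("ML-Based Training Pipeline", "Model Registry"),
    ("ML-Based Training Pipeline", "Metadata Store"),
    ("Model Registry", "API Prediction Service"),
    ("API Prediction Service", "Prediction Store"),
]


def fan_in_out(edges):
    """Return {component: (fan_in, fan_out)} for a directed edge list."""
    fan_in = defaultdict(int)
    fan_out = defaultdict(int)
    for source, target in edges:
        fan_out[source] += 1
        fan_in[target] += 1
    components = set(fan_in) | set(fan_out)
    return {name: (fan_in[name], fan_out[name]) for name in components}


def most_coupled(edges):
    """Return the component with the highest fan_in + fan_out.

    Ties are broken alphabetically by component name.
    """
    degrees = fan_in_out(edges)
    return max(sorted(degrees), key=lambda name: sum(degrees[name]))


if __name__ == "__main__":
    for name, (fi, fo) in sorted(fan_in_out(EDGES).items()):
        print(f"{name}: fan-in={fi}, fan-out={fo}")
    print("Most coupled:", most_coupled(EDGES))
```

With a real edge list extracted from an actual system, the same counts could be aggregated per subsystem to compare where connections concentrate.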

Research Team
MSc Students

Innovation Team
PDEng Trainees

UI Dev Team
Hired Developers

Core Dev Team
Scientific Programmers

Work Plan

A Metrics-Oriented Architectural Model
to Characterize Complexity on
Machine Learning-Enabled Systems

https://renatocf.xyz/phd-quali-live

2025

Renato Cordeiro Ferreira

Supervisor: Prof. Dr. Alfredo Goldman

Co-Supervisor: Prof. Dr. Damian Tamburri

IME-USP