Evolutionary and Reinforcement Learning Approaches for GW Data Analysis

He Wang

International Centre for Theoretical Physics Asia-Pacific (ICTP-AP)

University of Chinese Academy of Sciences  (UCAS)

hewang@ucas.ac.cn

Adaptive trajectory, not a single optimum

Abstract

Upcoming challenges such as MLGWSC2, currently at the proposal stage, provide a new testbed for exploring machine-learning–based approaches to gravitational-wave analysis. In this flash talk, I briefly introduce my core ideas and experience using evolutionary algorithms, Evo-MCTS, and reinforcement learning as adaptive search and optimization tools. I outline key methodological insights and discuss how these ideas may inform future GW analysis tasks, including potential applications to LISA.

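I just scrolled up and saw that someone had taken photos during the talk [laughing through tears], so here are a few quick notes:

  • The English typeface I have been using most in slides lately is Economica, a fairly modern sans-serif: https://fonts.google.com/specimen/Economica
  • It looks good but costs a little legibility; when that matters I go back to Helvetica Neue.
  • For serif type I like Arno Pro: https://fonts.adobe.com/fonts/arno
  • For Chinese I have settled on 喜鹊宋 or 木叶 (both paid fonts).
  • Colors usually come from MetBrewer, though I do not pay special attention to the palette: https://github.com/BlakeRMills/MetBrewer
  • I told Prof. Shao just today that it may be a sign of a midlife crisis: I like colorful things more and more, and it shows in my slides. This is entirely a matter of taste.
  • If anyone is interested in this slide style, here is a short deck from a conference in July for reference: https://www.dropbox.com/scl/fi/duez2bpbcck4ogtn98sw6/songhuang_sesto_20250707.key?rlkey=g18rnjym1hpzke3jxcj5y6ezh&st=ot5xu2w8&dl=0
  • My current layout style only works when revealed in steps; it cannot all be shown at once. I started using it when I began teaching, where slides have to look good and hold attention while carrying enough information for students to review from. It works for now, but in a couple of years I may learn to make things simpler.
  • Using font size and color to highlight keywords is the simplest, crudest, and most clichéd way to guide the eye, and advertising wore it out long ago. There are better design languages, but I do not know them.
  • Slide style is purely a matter of personal taste; it says nothing about the quality of the talk, let alone its content.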

When LLMs Enter the Algorithmic Loop

The LLM does not predict answers — it reshapes how we search for algorithms.

Evaluation for MLGWSC-1 benchmark

LLMs act as policies over algorithms, not predictors of data.

Concept

Mechanism

problem → algorithm

data → algorithm → reward
↺ LLM-guided algorithm updates

LLM as designer

external_knowledge
(constraint)

from problem-solving to algorithm discovery

HW, LZ. arXiv:2508.03661 [cs.AI]
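
The loop on this slide (data → algorithm → reward, with LLM-guided updates) can be sketched as a toy. This is not the Evo-MCTS implementation: `propose_update` is a hypothetical stand-in for the LLM call, the "algorithm" is a one-parameter threshold detector, and the data and reward are made up for illustration.

```python
import random

def make_data(rng, n=200):
    """Toy data: (value, label) pairs; label 1 means a 'signal' is present."""
    data = []
    for _ in range(n):
        label = rng.random() < 0.3
        value = rng.gauss(2.0 if label else 0.0, 1.0)
        data.append((value, int(label)))
    return data

def make_algorithm(threshold):
    """An 'algorithm' here is just a parameter plus a callable detector."""
    return {"threshold": threshold, "detect": lambda x: x > threshold}

def reward(algorithm, data):
    """Reward = classification accuracy of the candidate on the data."""
    correct = sum(int(algorithm["detect"](x)) == y for x, y in data)
    return correct / len(data)

def propose_update(parent, history, rng):
    """Hypothetical stand-in for the LLM: propose a variant of the parent.
    A real system would prompt an LLM with the parent code and the history."""
    return make_algorithm(parent["threshold"] + rng.gauss(0.0, 0.5))

def run_search(n_steps=50, seed=0):
    rng = random.Random(seed)
    data = make_data(rng)
    best = make_algorithm(threshold=5.0)      # deliberately poor starting point
    best_reward = reward(best, data)
    history = [(best["threshold"], best_reward)]  # evaluations kept as memory
    for _ in range(n_steps):
        child = propose_update(best, history, rng)
        r = reward(child, data)
        history.append((child["threshold"], r))
        if r >= best_reward:                  # keep the candidate if not worse
            best, best_reward = child, r
    return best, best_reward, history

best, best_reward, history = run_search()
print(f"threshold={best['threshold']:.2f} reward={best_reward:.2f}")
```

The point of the sketch is the shape of the loop: the proposer never sees or predicts the data, it only edits the algorithm, and the reward is computed by running that algorithm.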

PyCBC (linear-core)

cWB (nonlinear-core)

Simple filters (non-linear)

CNN-like (highly non-linear)

Benchmarking against state-of-the-art methods

Evaluation for MLGWSC-1 benchmark

LLM as designer

HW, LZ. arXiv:2508.03661 [cs.AI]

LLMs act as adaptive policy priors over algorithmic decisions.

Evo-MCTS: When LLMs Enter the Algorithmic Loop

The LLM does not predict answers — it shapes the search process itself.

 

  • LLM proposes moves, not outputs

  • Search history becomes reusable knowledge

  • Algorithm behavior evolves, not just parameters

 

What changed?

  • LLMs do not predict waveforms or labels

  • LLMs propose actions that guide the search

  • Evaluations (fitness/likelihood) become reusable memory

Search trajectories matter more than isolated optima.
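
A minimal sketch of how stored evaluations can steer a tree search, using the textbook UCT rule (mean reward plus an exploration bonus). The slide does not state which selection rule Evo-MCTS uses, so the rule, the constant `c`, the node names, and the rewards are all assumptions for illustration; in an LLM-driven search the children would be proposed by the model.

```python
import math

class Node:
    """One node in the search tree: a candidate algorithm plus its statistics."""
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.visits = 0
        self.total_reward = 0.0

    def add_child(self, name):
        child = Node(name, parent=self)
        self.children.append(child)
        return child

def uct_score(node, c=1.4):
    """Textbook UCT: mean reward plus an exploration bonus."""
    if node.visits == 0:
        return math.inf                 # always try unvisited children first
    mean = node.total_reward / node.visits
    bonus = c * math.sqrt(math.log(node.parent.visits) / node.visits)
    return mean + bonus

def select(node):
    """Descend from the root, following the child with the highest UCT score."""
    while node.children:
        node = max(node.children, key=uct_score)
    return node

def backpropagate(node, reward):
    """Push an evaluation up the tree so later selections can reuse it."""
    while node is not None:
        node.visits += 1
        node.total_reward += reward
        node = node.parent

root = Node("seed algorithm")
a = root.add_child("variant A")
b = root.add_child("variant B")
backpropagate(a, 0.9)   # made-up fitness values
backpropagate(a, 0.8)
backpropagate(b, 0.1)
chosen = select(root)
print(chosen.name)
```

The visit counts and reward totals are the "reusable memory": every evaluation changes which branch is expanded next, which is why the trajectory matters and not only the final optimum.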

LLM-Driven Algorithmic Evolution Through Reflective Code Synthesis.

HW, LZ. arXiv:2508.03661 [cs.AI]

Monte Carlo Tree Search (MCTS) Algorithmic Evolution Pathway

  • deepseek-R1 for reflection generation
  • o3-mini-medium for code generation

Interpretability Analysis

Algorithmic Component Impact Analysis.

  • A comprehensive analysis of each technique's impact, using a controlled comparative methodology
import numpy as np
import scipy.signal as signal
from scipy.signal.windows import tukey
from scipy.signal import savgol_filter

def pipeline_v2(strain_h1: np.ndarray, strain_l1: np.ndarray, times: np.ndarray) -> tuple[np.ndarray, np.ndarray, np.ndarray]:
    """
    The pipeline function processes gravitational wave data from the H1 and L1 detectors to identify potential gravitational wave signals.
    It takes strain_h1 and strain_l1 numpy arrays containing detector data, and times array with corresponding time points.
    The function returns a tuple of three numpy arrays: peak_times containing GPS times of identified events,
    peak_heights with significance values of each peak, and peak_deltat showing time window uncertainty for each peak.
    """
    eps = np.finfo(float).tiny
    dt = times[1] - times[0]
    fs = 1.0 / dt
    # Base spectrogram parameters
    base_nperseg = 256
    base_noverlap = base_nperseg // 2
    medfilt_kernel = 101       # odd kernel size for robust detrending
    uncertainty_window = 5     # half-window for local timing uncertainty

    # -------------------- Stage 1: Robust Baseline Detrending --------------------
    # Remove long-term trends using a median filter for each channel.
    detrended_h1 = strain_h1 - signal.medfilt(strain_h1, kernel_size=medfilt_kernel)
    detrended_l1 = strain_l1 - signal.medfilt(strain_l1, kernel_size=medfilt_kernel)

    # -------------------- Stage 2: Adaptive Whitening with Enhanced PSD Smoothing --------------------
    def adaptive_whitening(strain: np.ndarray) -> np.ndarray:
        # Center the signal.
        centered = strain - np.mean(strain)
        n_samples = len(centered)
        # Adaptive window length: between 5 and 30 seconds
        win_length_sec = np.clip(n_samples / fs / 20, 5, 30)
        nperseg_adapt = int(win_length_sec * fs)
        nperseg_adapt = max(10, min(nperseg_adapt, n_samples))
        
        # Create a Tukey window with 75% overlap.
        tukey_alpha = 0.25
        win = tukey(nperseg_adapt, alpha=tukey_alpha)
        noverlap_adapt = int(nperseg_adapt * 0.75)
        if noverlap_adapt >= nperseg_adapt:
            noverlap_adapt = nperseg_adapt - 1
        
        # Estimate the power spectral density (PSD) using Welch's method.
        freqs, psd = signal.welch(centered, fs=fs, nperseg=nperseg_adapt,
                                  noverlap=noverlap_adapt, window=win, detrend='constant')
        psd = np.maximum(psd, eps)
        
        # Compute relative differences for PSD stationarity measure.
        diff_arr = np.abs(np.diff(psd)) / (psd[:-1] + eps)
        # Smooth the derivative with a moving average.
        if len(diff_arr) >= 3:
            smooth_diff = np.convolve(diff_arr, np.ones(3)/3, mode='same')
        else:
            smooth_diff = diff_arr
        
        # Exponential smoothing (Kalman-like) with adaptive alpha using PSD stationarity.
        smoothed_psd = np.copy(psd)
        for i in range(1, len(psd)):
            # Adaptive smoothing coefficient: base 0.8 modified by local stationarity (±0.05)
            local_alpha = np.clip(0.8 - 0.05 * smooth_diff[min(i-1, len(smooth_diff)-1)], 0.75, 0.85)
            smoothed_psd[i] = local_alpha * smoothed_psd[i-1] + (1 - local_alpha) * psd[i]
            
        # Compute Tikhonov regularization gain based on deviation from median PSD.
        noise_baseline = np.median(smoothed_psd)
        raw_gain = (smoothed_psd / (noise_baseline + eps)) - 1.0
        
        # Compute a causal-like gradient using the Savitzky-Golay filter.
        win_len = 11 if len(smoothed_psd) >= 11 else ((len(smoothed_psd)//2)*2+1)
        polyorder = 2 if win_len > 2 else 1
        delta_freq = np.mean(np.diff(freqs))
        grad_psd = savgol_filter(smoothed_psd, win_len, polyorder, deriv=1, delta=delta_freq, mode='interp')
        
        # Nonlinear scaling via sigmoid to enhance gradient differences.
        sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
        scaling_factor = 1.0 + 2.0 * sigmoid(np.abs(grad_psd) / (np.median(smoothed_psd) + eps))
        
        # Compute adaptive gain factors with nonlinear scaling.
        gain = 1.0 - np.exp(-0.5 * scaling_factor * raw_gain)
        gain = np.clip(gain, -8.0, 8.0)
        
        # FFT-based whitening: interpolate gain and PSD onto FFT frequency bins.
        signal_fft = np.fft.rfft(centered)
        freq_bins = np.fft.rfftfreq(n_samples, d=dt)
        interp_gain = np.interp(freq_bins, freqs, gain, left=gain[0], right=gain[-1])
        interp_psd = np.interp(freq_bins, freqs, smoothed_psd, left=smoothed_psd[0], right=smoothed_psd[-1])
        denom = np.sqrt(interp_psd) * (np.abs(interp_gain) + eps)
        denom = np.maximum(denom, eps)
        white_fft = signal_fft / denom
        whitened = np.fft.irfft(white_fft, n=n_samples)
        return whitened

    # Whiten H1 and L1 channels using the adapted method.
    white_h1 = adaptive_whitening(detrended_h1)
    white_l1 = adaptive_whitening(detrended_l1)

    # -------------------- Stage 3: Coherent Time-Frequency Metric with Frequency-Conditioned Regularization --------------------
    def compute_coherent_metric(w1: np.ndarray, w2: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
        # Compute complex spectrograms preserving phase information.
        f1, t_spec, Sxx1 = signal.spectrogram(w1, fs=fs, nperseg=base_nperseg,
                                              noverlap=base_noverlap, mode='complex', detrend=False)
        f2, t_spec2, Sxx2 = signal.spectrogram(w2, fs=fs, nperseg=base_nperseg,
                                               noverlap=base_noverlap, mode='complex', detrend=False)
        # Ensure common time axis length.
        common_len = min(len(t_spec), len(t_spec2))
        t_spec = t_spec[:common_len]
        Sxx1 = Sxx1[:, :common_len]
        Sxx2 = Sxx2[:, :common_len]
        
        # Compute phase differences and coherence between detectors.
        phase_diff = np.angle(Sxx1) - np.angle(Sxx2)
        phase_coherence = np.abs(np.cos(phase_diff))
        
        # Estimate median PSD per frequency bin from the spectrograms.
        psd1 = np.median(np.abs(Sxx1)**2, axis=1)
        psd2 = np.median(np.abs(Sxx2)**2, axis=1)
        
        # Frequency-conditioned regularization gain (reflection-guided).
        lambda_f = 0.5 * ((np.median(psd1) / (psd1 + eps)) + (np.median(psd2) / (psd2 + eps)))
        lambda_f = np.clip(lambda_f, 1e-4, 1e-2)
        # Regularization denominator integrating detector PSDs and lambda.
        reg_denom = (psd1[:, None] + psd2[:, None] + lambda_f[:, None] + eps)
        
        # Weighted phase coherence that balances phase alignment with noise levels.
        weighted_comp = phase_coherence / reg_denom
        
        # Compute axial (frequency) second derivatives as curvature estimates.
        d2_coh = np.gradient(np.gradient(phase_coherence, axis=0), axis=0)
        avg_curvature = np.mean(np.abs(d2_coh), axis=0)
        
        # Nonlinear activation boost using tanh for regions of high curvature.
        nonlinear_boost = np.tanh(5 * avg_curvature)
        linear_boost = 1.0 + 0.1 * avg_curvature
        
        # Cross-detector synergy: weight derived from global median consistency.
        novel_weight = np.mean((np.median(psd1) + np.median(psd2)) / (psd1[:, None] + psd2[:, None] + eps), axis=0)
        
        # Integrated time-frequency metric combining all enhancements.
        tf_metric = np.sum(weighted_comp * linear_boost * (1.0 + nonlinear_boost), axis=0) * novel_weight
        
        # Adjust the spectrogram time axis to account for window delay.
        metric_times = t_spec + times[0] + (base_nperseg / 2) / fs
        return tf_metric, metric_times

    tf_metric, metric_times = compute_coherent_metric(white_h1, white_l1)

    # -------------------- Stage 4: Multi-Resolution Thresholding with Octave-Spaced Dyadic Wavelet Validation --------------------
    def multi_resolution_thresholding(metric: np.ndarray, times_arr: np.ndarray) -> tuple[np.ndarray, np.ndarray, np.ndarray]:
        # Robust background estimation with median and MAD.
        bg_level = np.median(metric)
        mad_val = np.median(np.abs(metric - bg_level))
        robust_std = 1.4826 * mad_val
        threshold = bg_level + 1.5 * robust_std

        # Identify candidate peaks using prominence and minimum distance criteria.
        peaks, _ = signal.find_peaks(metric, height=threshold, distance=2, prominence=0.8 * robust_std)
        if peaks.size == 0:
            return np.array([]), np.array([]), np.array([])

        # Local uncertainty estimation using a Gaussian-weighted convolution.
        win_range = np.arange(-uncertainty_window, uncertainty_window + 1)
        sigma = uncertainty_window / 2.5
        gauss_kernel = np.exp(-0.5 * (win_range / sigma) ** 2)
        gauss_kernel /= np.sum(gauss_kernel)
        weighted_mean = np.convolve(metric, gauss_kernel, mode='same')
        weighted_sq = np.convolve(metric ** 2, gauss_kernel, mode='same')
        variances = np.maximum(weighted_sq - weighted_mean ** 2, 0.0)
        uncertainties = np.sqrt(variances)
        uncertainties = np.maximum(uncertainties, 0.01)

        valid_times = []
        valid_heights = []
        valid_uncerts = []
        n_metric = len(metric)

        # Compute a simple second derivative for local curvature checking.
        if n_metric > 2:
            second_deriv = np.diff(metric, n=2)
            second_deriv = np.pad(second_deriv, (1, 1), mode='edge')
        else:
            second_deriv = np.zeros_like(metric)

        # Validate peak significance with a Ricker-wavelet CWT over several scales.
        # Note: these widths are linearly spaced (1 to 8), not strictly octave-spaced.
        widths = np.arange(1, 9)
        for peak in peaks:
            # Skip peaks lacking sufficient negative curvature.
            if second_deriv[peak] > -0.1 * robust_std:
                continue
            local_start = max(0, peak - uncertainty_window)
            local_end = min(n_metric, peak + uncertainty_window + 1)
            local_segment = metric[local_start:local_end]
            if len(local_segment) < 3:
                continue
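            # signal.cwt and signal.ricker were deprecated in SciPy 1.12 and removed in
            # SciPy 1.15; on newer versions this call raises and the candidate is skipped.
            try:
                cwt_coeff = signal.cwt(local_segment, signal.ricker, widths)
            except Exception:
                continue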
            max_coeff = np.max(np.abs(cwt_coeff))
            # Threshold for validating the candidate using local MAD.
            cwt_thresh = mad_val * np.sqrt(2 * np.log(len(local_segment) + eps))
            if max_coeff >= cwt_thresh:
                valid_times.append(times_arr[peak])
                valid_heights.append(metric[peak])
                valid_uncerts.append(uncertainties[peak])

        if len(valid_times) == 0:
            return np.array([]), np.array([]), np.array([])
        return np.array(valid_times), np.array(valid_heights), np.array(valid_uncerts)

    peak_times, peak_heights, peak_deltat = multi_resolution_thresholding(tf_metric, metric_times)
    return peak_times, peak_heights, peak_deltat
  • Automatically discover and interpret the value of nonlinear algorithms
  • Facilitate new knowledge production, guided by accumulated experience
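
One common form of a controlled comparison is leave-one-out ablation: disable a single component, re-score, and record the drop. The sketch below assumes that form; the slide does not specify the exact protocol. The component names echo the stages of the pipeline above, while `toy_score` and its contribution values are invented for illustration and are not results.

```python
def ablation_impact(components, score_fn):
    """Leave-one-out ablation: disable one component at a time and
    report the drop in score relative to the full configuration."""
    full = {name: True for name in components}
    baseline = score_fn(full)
    impact = {}
    for name in components:
        config = dict(full, **{name: False})
        impact[name] = baseline - score_fn(config)
    return baseline, impact

# Hypothetical per-component contributions (made-up numbers, not measurements).
CONTRIB = {
    "median_detrend": 0.05,
    "adaptive_whitening": 0.30,
    "phase_coherence": 0.20,
    "wavelet_validation": 0.10,
}

def toy_score(config):
    """Stand-in for a real benchmark score: sum of enabled contributions."""
    return sum(v for k, v in CONTRIB.items() if config[k])

baseline, impact = ablation_impact(list(CONTRIB), toy_score)
ranked = sorted(impact, key=impact.get, reverse=True)
print(ranked)
```

In a real study `score_fn` would rerun the pipeline on the benchmark with the component switched off, and interactions between components would need more than one-at-a-time toggles.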

PT Level 5

Framework Mechanism Analysis

Integrated Architecture Validation

  • A comprehensive comparison of our integrated Evo-MCTS framework against its constituent components operating in isolation.
    • Evo-MCTS: MCTS + self-evolution + reflection mechanism.
    • MCTS-AHD: an MCTS framework for combinatorial optimization (CO).
    • ReEvo: an evolutionary framework for CO.

Contributions of knowledge synthesis

  • Comparison against runs without external knowledge
    • non-linear vs. linear-only

LLM Model Selection and Robustness Analysis

  • Ablation study of different LLMs as the code generator, and of their robustness.
    • o3-mini-medium
    • o1-2024-12-17
    • gpt-4o-2024-11-20
    • claude-3-7-sonnet-20250219-thinking

59.1%

115%

### External Knowledge Integration
1. **Non-linear** Processing Core Concepts:
    - Signal Transformation: 
        * Non-linear vs linear decomposition
        * Adaptive threshold mechanisms
        * Multi-scale analysis
    
    - Feature Extraction:
        * Phase space reconstruction
        * Topological data analysis
        * Wavelet-based detection
    
    - Statistical Analysis:
        * Robust estimators
        * Non-Gaussian processes
        * Higher-order statistics

2. Implementation Principles:
    - Prioritize adaptive over fixed parameters
    - Consider local vs global characteristics
    - Balance computational cost with accuracy
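
To make "robust estimators" and "adaptive threshold mechanisms" concrete, here is a small sketch of a median/MAD threshold next to a classical mean/standard-deviation threshold. The factor 1.4826 is the same one used in the pipeline shown earlier; the sample values and the choice `k = 3` are assumptions for illustration.

```python
import statistics

def robust_threshold(values, k=3.0):
    """median + k * robust sigma, where robust sigma = 1.4826 * MAD."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    return med + k * 1.4826 * mad

def classical_threshold(values, k=3.0):
    """mean + k * standard deviation (sensitive to outliers)."""
    return statistics.mean(values) + k * statistics.pstdev(values)

# Mostly quiet data with one loud sample (made-up numbers).
values = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8, 1.3, 50.0]

flagged_robust = [v for v in values if v > robust_threshold(values)]
flagged_classical = [v for v in values if v > classical_threshold(values)]
print(flagged_robust, flagged_classical)
```

The loud sample inflates the mean and standard deviation enough to hide itself from the classical threshold, while the median and MAD barely move. This is the practical reason to prefer robust statistics when the background contains glitches.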

Traditional Physics
  ✓ Fully interpretable
  ✗ Performance ceiling
  Human-designed pipelines, fixed heuristics
  Examples: matched filtering, χ² tests

Black-box AI
  ✓ High performance
  ✗ Opaque decisions
  End-to-end prediction, model-centric learning
  Examples: CNNs, DINGO

Interpretable Algorithmic Discovery
  Algorithms as search objects, physics-informed objectives
  Performance: competitive with state-of-the-art (MLGWSC-1 benchmark)
  Examples: Evo-MCTS (this work), AlphaEvolve

Interpretable AI for Gravitational-Wave Discovery

Scientific discovery requires interpretability — not just performance.

AI should help us understand why an algorithm works — not just output an answer.

Traditional Physics Approach
  Input → Human-Designed Algorithm (based on human insight) → Output
  Example: Matched Filtering, linear regression

Black-Box AI Approach
  Input → AI Model (low interpretability) → Output
  Informed by: Data/Experience
  Examples: CNN, AlphaGo, DINGO

Interpretable AI Approach (the best of both worlds) 🎯 OUR WORK
  Input → Physics-Informed Algorithm (high interpretability) → Output
  Informed by: AI Model, Physics Knowledge, Data/Experience
  Example: Evo-MCTS, AlphaEvolve

Scientific discovery requires interpretability, not just performance.

Key Takeaways: ... against Symbolic Regression

Any algorithm design problem can be seen as an optimization problem

  • Many intermediate processes in gravitational wave data processing can be viewed as "algorithm optimization" problems, such as filter design, noise modeling, detection statistic construction, etc.
  • Many analytical modeling and "symbolic regression" methods in theoretical physics and cosmology can also be seen as "algorithm optimization" problems
    • Symbolic regression vs. algorithm optimization (comparison figure not reproduced here)

  • Other examples of optimization problems:
    • AI-driven design of experiments [Phys. Rev. X 15, 021012 (2025)]
    • RL design of multiple filters in the LIGO control system [Science (2025)]
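
The claim that symbolic regression is itself an optimization over candidate objects can be illustrated with a deliberately tiny search. The sketch below assumes a fixed menu of one-parameter expressions and a mean-squared-error objective; the menu, the data, and the function names are hypothetical. Algorithm optimization has the same structure, with programs in place of expressions and a benchmark score in place of the fit error.

```python
import math

# Candidate expression forms y = a * basis(x); the menu is an assumption.
BASIS = {
    "a*x": lambda x: x,
    "a*x**2": lambda x: x ** 2,
    "a*sqrt(x)": math.sqrt,
    "a/x": lambda x: 1.0 / x,
}

def fit_and_score(basis, xs, ys):
    """Closed-form least-squares fit of y = a * basis(x); returns (a, MSE)."""
    fx = [basis(x) for x in xs]
    a = sum(f * y for f, y in zip(fx, ys)) / sum(f * f for f in fx)
    mse = sum((a * f - y) ** 2 for f, y in zip(fx, ys)) / len(xs)
    return a, mse

def best_expression(xs, ys):
    """Score every candidate and return the one with the lowest error."""
    results = {name: fit_and_score(b, xs, ys) for name, b in BASIS.items()}
    best = min(results, key=lambda name: results[name][1])
    return best, results[best]

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0 * x ** 2 for x in xs]          # synthetic data from a known law
best, (a, mse) = best_expression(xs, ys)
print(best, a, mse)
```

Real symbolic regression searches a far larger, compositional space, but the loop is the same: propose a candidate, score it against an objective, keep the best.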


From Black-Box AI to Algorithmic Co-Design

LLMs as agents that optimize physics-based algorithms

A new axis: adaptivity over algorithm design

LLMs allow us to search over algorithms, not just over parameters.

What do we think about AI for scientific discovery?

Scientific discovery requires interpretability, not just performance.

Why MLGWSC2 Needs More Than Better Classifiers

From single-pipeline performance to ensemble unbiasedness

We emphasize evaluating complementarity and ensemble behavior rather than single-pipeline superiority.

What LVK already does

  • GWTC catalogs rely on multiple independent pipelines

    • A candidate proceeds if any pipeline produces a confident trigger

  • This already forms an implicit ensemble
     

Ensembling is already standard practice — just not explicitly analyzed.

The missing question

  • Each pipeline carries distinct inductive biases (e.g., duration, morphology, noise response, parameter coverage).

LVK anchor pipelines (selected): cWB, GstLAL, PyCBC
AI pipelines: AI_1, AI_2, AI_3, ..., AI_n

Ensemble evaluation: is the ensemble unbiased?

MLGWSC-2 is under proposal development with active community input (e.g. Nitz, Dent, Messenger, ...).
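
A sketch of what "evaluating complementarity and ensemble behavior" could look like for an any-pipeline (logical OR) ensemble on an injection set. Everything here is hypothetical: the pipeline names, the injection IDs, the `duration` categories, and the metrics themselves are illustrative choices and not part of any MLGWSC-2 specification.

```python
def ensemble_report(found_by):
    """found_by maps pipeline name -> set of injection IDs it recovered.
    Returns the OR-ensemble set and, per pipeline, total and unique finds."""
    union = set().union(*found_by.values())
    report = {}
    for name, found in found_by.items():
        others = set().union(*(s for n, s in found_by.items() if n != name))
        report[name] = {"found": len(found), "unique": len(found - others)}
    return union, report

def recovered_fraction(union, injections, key):
    """Fraction of injections recovered by the ensemble, per category."""
    totals, hits = {}, {}
    for inj_id, props in injections.items():
        cat = props[key]
        totals[cat] = totals.get(cat, 0) + 1
        hits[cat] = hits.get(cat, 0) + (inj_id in union)
    return {cat: hits[cat] / totals[cat] for cat in totals}

# Made-up injection set and detections.
injections = {
    1: {"duration": "short"}, 2: {"duration": "short"}, 6: {"duration": "short"},
    3: {"duration": "long"}, 4: {"duration": "long"}, 5: {"duration": "long"},
}
found_by = {
    "pipeline_A": {1, 3, 4},
    "pipeline_B": {3, 4, 5},
    "pipeline_C": {3},
}

union, report = ensemble_report(found_by)
bias = recovered_fraction(union, injections, key="duration")
print(report, bias)
```

In this toy the ensemble recovers every long injection but only a third of the short ones, and one pipeline contributes nothing unique. Those are the two questions the slide raises: is the union biased, and does each member add anything?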

An RL Perspective on LISA Global Fitting

“I don’t claim this is solved. I claim the framing matters.”

Global fitting is not a single inference — it is a long-horizon control problem.

  • Numerical orbits (of Taiji)
  • Unequal-arm
  • TDI-2.0

preliminary

MH Du+, arXiv:2505.16500 [gr-qc]

MDP Element    GW Interpretation
State          Residuals, PSD drift, candidate list
Action         Propose / subtract / refine / allocate compute
Reward         Evidence gain, residual stationarity
Horizon        Entire observing run

A trajectory tree of global-fitting decisions over time

Nodes: residual states
Edges: modeling actions

(HW+, in preparation)

Global fitting as a Markov Decision Process (MDP)

GW Data Analysis as a Markov Decision Process

Many GW pipelines already define an MDP — implicitly and inconsistently.

“Once you phrase the problem this way,
RL and MCTS are not exotic — they are obvious.”
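
The MDP framing above can be written down as a data structure. This is a toy sketch under strong assumptions, not the method in preparation: the residual is reduced to a single number, the reward is a crude proxy for evidence gain, the source names and powers are invented, and `REFINE` and `ALLOCATE` are left as no-ops.

```python
from dataclasses import dataclass, field
from enum import Enum

class Action(Enum):
    PROPOSE = "propose"
    SUBTRACT = "subtract"
    REFINE = "refine"              # no-op in this toy
    ALLOCATE = "allocate compute"  # no-op in this toy

@dataclass
class State:
    residual_power: float                      # stand-in for the residual data
    psd_drift: float = 0.0
    candidates: list = field(default_factory=list)
    subtracted: list = field(default_factory=list)

def step(state, action, source=None):
    """Toy transition: returns (next_state, reward). All numbers are made up."""
    s = State(state.residual_power, state.psd_drift,
              list(state.candidates), list(state.subtracted))
    reward = 0.0
    if action is Action.PROPOSE and source is not None:
        s.candidates.append(source)            # (name, power) tuple
    elif action is Action.SUBTRACT and s.candidates:
        name, power = s.candidates.pop(0)
        s.residual_power -= power
        s.subtracted.append(name)
        reward = power                         # crude proxy for evidence gain
    return s, reward

# One short trajectory through the decision tree (hypothetical sources).
state = State(residual_power=10.0)
total_reward = 0.0
plan = [
    (Action.PROPOSE, ("MBHB-1", 6.0)), (Action.SUBTRACT, None),
    (Action.PROPOSE, ("GB-1", 1.5)), (Action.SUBTRACT, None),
]
for action, source in plan:
    state, r = step(state, action, source)
    total_reward += r
print(state.subtracted, state.residual_power, total_reward)
```

Each `State` is a node and each `Action` an edge in the trajectory tree, so a planner such as MCTS or an RL policy can in principle choose the order of proposals and subtractions over the whole observing run.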

Open Questions for the Community

  • What is the right reward for discovery?
  • Should we train ensembles instead of curating them?
  • When does adaptivity beat optimality?
for _ in range(num_of_audiences):
    print('Thank you for your attention! 🙏')

Call for Speakers - MLA F2F @ March LVK 2026 (Pisa)

Just a gentle reminder that we’re collecting contributions for the Machine Learning Algorithms (MLA) section!

2026/01/27 | Flash Talk @GWFREERIDE