He Wang (王赫)
hewang@ucas.ac.cn
International Centre for Theoretical Physics Asia-Pacific (ICTP-AP), UCAS
Taiji Laboratory for Gravitational Wave Universe (Beijing/Hangzhou), UCAS
On behalf of the KAGRA and Taiji collaborations
Detection
Inference
AHD
Gravitational waves (GWs) are a strong-field effect in General Relativity: ripples in the fabric of spacetime produced by accelerating massive objects.
GW Data Characteristics
LIGO-Virgo-KAGRA
LISA Project
Noise: non-Gaussian and non-stationary
Signal challenges:
(Earth-based) A low signal-to-noise ratio (SNR): the signal amplitude is typically about 1/100 of the noise amplitude (−40 dB).
(Space-based) A superposition of all GW signals received during the mission's observational run (e.g., \(10^4\) galactic binaries (GBs), \(10\)–\(10^2\) supermassive black hole binaries (SMBHs), and \(10\)–\(10^3\) extreme-mass-ratio inspirals (EMRIs)).
Matched Filtering Techniques
In Gaussian, stationary noise, matched filtering is the optimal linear method for extracting weak signals of known morphology.
Statistical Approaches
Frequentist Testing:
Bayesian Testing:
Core Insight from Computer Vision
Performance Analysis
Pioneering Research Publications
PRL, 2018, 120(14): 141103.
PRD, 2018, 97(4): 044039.
Universal Approximation Theorem: Existence Theorem
Beyond Speed: Generalization and Explainability
Convolutional Neural Network (ConvNet or CNN)
feature extraction
classifier
Matched-filtering Convolutional Neural Network (MFCNN)
He Wang, et al. PRD 101, 10 (2020): 104003
>> Is it matched filtering? >> Wait, it can be matched filtering!
GW150914
Transforming the matched-filtering method from the frequency domain to the time domain.
The square of the matched-filtering SNR for given data \(d(t) = n(t) + h(t)\):
\[
\rho^2(t) \equiv \frac{\left|\langle d \mid h \rangle(t)\right|^2}{\langle h \mid h \rangle},
\qquad
\langle a \mid b \rangle(t) = 4 \int_0^{\infty} \frac{\tilde{a}(f)\,\tilde{b}^{*}(f)}{S_n(|f|)}\, e^{2\pi i f t}\, \mathrm{d}f,
\]
where \(S_n(|f|)\) is the one-sided average PSD of \(d(t)\).
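As a minimal numerical illustration of the time-domain form (a sketch assuming white noise, so \(S_n\) is constant and whitening drops out; this is not the MFCNN implementation itself):

import numpy as np

fs, T = 4096, 1.0                                  # sampling rate (Hz), duration (s)
t = np.arange(int(fs * T)) / fs
h = np.sin(2 * np.pi * 100 * t) * np.exp(-((t - 0.5) / 0.05) ** 2)  # toy template
d = np.random.randn(t.size) + np.roll(h, 512)      # noise + time-shifted signal

# <d|h>(t) evaluated for all time shifts at once via FFT cross-correlation
corr = np.fft.irfft(np.fft.rfft(d) * np.conj(np.fft.rfft(h)), n=t.size)
rho = np.abs(corr) / np.sqrt(np.sum(h ** 2))       # normalize by sqrt(<h|h>)
print("peak SNR", rho.max(), "at t =", t[np.argmax(rho)])  # ≈ 512/fs = 0.125 s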
Deep Learning Framework
FYI: \(N_\ast = \lfloor(N-K+2P)/S\rfloor+1\) (output length for input length \(N\), kernel size \(K\), padding \(P\), stride \(S\); see the quick check below the schematic)
(A schematic illustration for a unit of convolution layer)
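A quick sanity check of the output-size formula (a throwaway snippet; \(N\), \(K\), \(P\), \(S\) as defined above):

def conv_out_len(N, K, P=0, S=1):
    # N* = floor((N - K + 2P) / S) + 1
    return (N - K + 2 * P) // S + 1

print(conv_out_len(N=1024, K=3, P=1, S=1))  # 1024: "same" convolution
print(conv_out_len(N=1024, K=4, P=0, S=2))  # 511: strided downsampling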
(Schematic: whitening, matched filtering, and normalizing mapped from the frequency domain to the time domain)
import mxnet as mx
from mxnet import gluon
from mxnet.gluon.nn import Conv2D, MaxPool2D, Flatten, Dense, Activation
from loguru import logger

# MatchedFilteringLayer and CutHybridLayer are custom Gluon blocks
# defined in the gist linked below.

def MFCNN(fs, T, C, ctx, template_block, margin, learning_rate=0.003):
    logger.success('Loading MFCNN network!')
    net = gluon.nn.Sequential()
    with net.name_scope():
        # Matched-filtering front end: correlate the H1/L1 strains
        # with the template bank
        net.add(MatchedFilteringLayer(mod=fs * T, fs=fs,
                                      template_H1=template_block[:, :1],
                                      template_L1=template_block[:, -1:]))
        net.add(CutHybridLayer(margin=margin))
        # Shallow CNN classifier on the matched-filtering output
        net.add(Conv2D(channels=16, kernel_size=(1, 3), activation='relu'))
        net.add(MaxPool2D(pool_size=(1, 4), strides=2))
        net.add(Conv2D(channels=32, kernel_size=(1, 3), activation='relu'))
        net.add(MaxPool2D(pool_size=(1, 4), strides=2))
        net.add(Flatten())
        net.add(Dense(32))
        net.add(Activation('relu'))
        net.add(Dense(2))  # two logits: signal vs. noise
    # Initialize parameters of all layers
    net.initialize(mx.init.Xavier(magnitude=2.24), ctx=ctx, force_reinit=True)
    return net

1 sec duration
35 templates used
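A hypothetical usage sketch (shapes are illustrative only, assuming the custom layers from the gist are defined; a real template bank replaces the random stand-in):

import mxnet as mx
from mxnet import nd

fs, T, C = 4096, 1, 2                            # sampling rate, 1 s duration, 2 detectors
template_block = nd.random.randn(35, 2, fs * T)  # stand-in for the 35-template bank
net = MFCNN(fs=fs, T=T, C=C, ctx=mx.cpu(),
            template_block=template_block, margin=0.5)
out = net(nd.random.randn(8, 2, fs * T))         # a batch of 8 two-detector strains
print(out.shape)                                 # expected: (8, 2) signal/noise logits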
Explainable AI Approach
Matched-filtering Convolutional Neural Network (MFCNN)
Code available (2019): https://gist.github.com/iphysresearch/a00009c1eede565090dbd29b18ae982c
He Wang, et al. PRD 101, 10 (2020): 104003
Benchmark Results
Publications
Key Findings
Note on Benchmark Limitations:
Outperforming PyCBC doesn't conclusively prove that matched filtering is inferior to AI methods. This is both because the dataset represents a specific distribution and because PyCBC settings could be further optimized for this particular benchmark.
arXiv:2501.13846 [gr-qc]
Phys. Rev. D 110, 024024 (2024)
Phys. Rev. D 107, 023021 (2023)
AI Model Denoising
Our Model's Detection Statistics
LVK Official Detection Statistics
Signal denoising visualization using our deep learning model (Transformer-based)
He Wang et al 2024 MLST 5 015046
Detection statistics from our AI model showing O1 events
He Wang et al 2024 MLST 5 015046
GW151226
GW151012
Official detection statistics from LVK collaboration
LVK. PRD (2016). arXiv:1602.03839
arXiv:2407.07820 [gr-qc]
Recent AI Discoveries & Validation Hurdles:
Search
PE
Rate
Parameter Estimation Challenges with AI Models:
arXiv:2404.14286
Phys. Rev. D 109, 123547 (2024)
Given the interpretability challenges we've explored, how might we advance GW detection and parameter estimation while maintaining scientific rigor?
Automatic and Evolutionary Algorithm Heuristics for GW Detection using LLMs
A promising new approach combining the power of large language models with evolutionary algorithms to create interpretable, adaptive detection systems
Evolution of GPT Capabilities
A careful examination of GPT-3.5's capabilities reveals the origins of its emergent abilities:
GPT-3.5 series [Source: University of Edinburgh, Allen Institute for AI]
GPT-3 (2020)
ChatGPT (2022)
Magic: Code + Text
Recent research demonstrates that LLMs can solve complex optimization problems through carefully engineered prompts. DeepMind's OPRO (Optimization by PROmpting) approach showcases how LLMs can generate increasingly refined solutions through iterative prompting techniques.
OPRO: Optimization by PROmpting
Example: Least squares optimization through prompt engineering
arXiv:2309.03409 [cs.NE]
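To make the recipe concrete, here is a toy OPRO-style loop for the least-squares example (a sketch: `llm` is a stand-in random proposer, not a real chat-completion client, and the meta-prompt format is illustrative):

import random

def llm(prompt: str) -> str:
    # Stand-in for an LLM call: propose a candidate solution string.
    return f"w = {random.uniform(0, 4):.2f}, b = {random.uniform(-1, 1):.2f}"

def loss(w, b, xs, ys):
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys))

xs, ys = [0, 1, 2, 3], [0.3, 2.3, 4.3, 6.3]      # generated with w = 2, b = 0.3
history = []                                     # (loss, solution) pairs
for step in range(20):
    # The meta-prompt shows the best solutions so far and asks for a better one.
    prompt = ("Minimize the least-squares loss. Previous (loss, solution) pairs: "
              + "; ".join(f"({l:.2f}, {s})" for l, s in sorted(history)[:5]))
    sol = llm(prompt)
    w, b = (float(part.split('=')[1]) for part in sol.split(','))
    history.append((loss(w, b, xs, ys), sol))

print(min(history))                              # best candidate found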
Two Directions of LLM-based Optimization
arXiv:2405.10098 [cs.NE]
LLMs can generate high-quality solutions to optimization problems without specialized training
The Interpolation Theory
LLMs' ability to generate novel responses from few examples is increasingly understood as manifold interpolation rather than mere memorization:
The theory suggests that in-context learning is not "learning" in the traditional sense, but rather a form of implicit conditioning on the manifold of learned representations.
Representation Space Interpolation
Real-world Case: FunSearch (Nature, 2023)
Key Literature on Manifold Interpolation
https://www.lesswrong.com/posts/GADJFwHzNZKg2Ndti/have-llms-generated-novel-insights
https://gowrishankar.info/blog/deep-learning-is-not-as-impressive-as-you-think-its-mere-interpolation/
REWIRING AGI—NEUROSCIENCE IS ALL YOU NEED
What is test-time scaling?
Why can LLMs do inference/optimization?
What about the theory? (see arXiv:2410.14716)
Why do we need MCTS?
Why, and how, does evolutionary theory apply in the optimization field?
Add computational complexity analysis
Borrow a line from The Wandering Earth?
Drawbacks and limitations: hard to control the optimization direction (when to balance exploration against exploitation); sensitive to the prompt template / LLM version; hard to define the search space for the unknown solution when the problem is complicated
First, carefully review: eccentricity using DINGO; AreaGW
Results of our own OPRO experiments
Progressively deeper reflection, layer by layer
Monte Carlo Tree Search (MCTS)
Evolutionary Algorithms
LLM Agents
Together, these approaches create a powerful framework for heuristic optimization of gravitational wave signal search algorithms
Proposed framework integrating MCTS decision-making, self-evolutionary optimization, and LLM agent guidance for gravitational wave signal search
With route-, short-, and long-term reflection (cf. "Thinking, Fast and Slow")
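A skeleton of how these components might fit together in code (our own sketch; `llm_reflect` and `llm_improve` are placeholder names for LLM-agent calls, and `evaluate` scores a candidate pipeline on the benchmark):

import random

def llm_reflect(worse_code: str, better_code: str) -> str:
    """Placeholder: ask an LLM agent why better_code outperforms worse_code."""
    return "reflection text"

def llm_improve(code: str, reflection: str) -> str:
    """Placeholder: ask an LLM agent to rewrite code guided by the reflection."""
    return code

def evolve_pipelines(seed_code: str, evaluate, generations: int = 50, pop_size: int = 10):
    # population holds (score, code) pairs; higher score = better pipeline
    population = [(evaluate(seed_code), seed_code)]
    for _ in range(generations):
        pair = random.sample(population, 2) if len(population) > 1 else population * 2
        worse, better = sorted(pair, key=lambda p: p[0])
        reflection = llm_reflect(worse[1], better[1])   # comparative reflection
        child = llm_improve(better[1], reflection)      # LLM-guided mutation
        population.append((evaluate(child), child))
        # survival selection: keep only the top pop_size variants
        population.sort(key=lambda p: p[0], reverse=True)
        del population[pop_size:]
    return population[0]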
Preliminary Results (February 2025)
import numpy as np
import scipy.signal as signal

def pipeline_v1(strain_h1: np.ndarray, strain_l1: np.ndarray, times: np.ndarray) -> tuple[np.ndarray, np.ndarray, np.ndarray]:

    def data_conditioning(strain_h1: np.ndarray, strain_l1: np.ndarray, times: np.ndarray) -> tuple[np.ndarray, np.ndarray, np.ndarray]:
        window_length = 4096
        dt = times[1] - times[0]
        fs = 1.0 / dt

        def whiten_strain(strain):
            # Welch PSD estimate, lightly smoothed, then frequency-domain whitening
            strain_zeromean = strain - np.mean(strain)
            freqs, psd = signal.welch(strain_zeromean, fs=fs, nperseg=window_length,
                                      window='hann', noverlap=window_length // 2)
            smoothed_psd = np.convolve(psd, np.ones(32) / 32, mode='same')
            smoothed_psd = np.maximum(smoothed_psd, np.finfo(float).tiny)  # avoid division by zero
            white_fft = np.fft.rfft(strain_zeromean) / np.sqrt(
                np.interp(np.fft.rfftfreq(len(strain_zeromean), d=dt), freqs, smoothed_psd))
            return np.fft.irfft(white_fft)

        whitened_h1 = whiten_strain(strain_h1)
        whitened_l1 = whiten_strain(strain_l1)
        return whitened_h1, whitened_l1, times

    def compute_metric_series(h1_data: np.ndarray, l1_data: np.ndarray, time_series: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
        # Coincident time-frequency energy: squared spectrogram magnitudes
        # averaged over frequency and over the two detectors
        fs = 1 / (time_series[1] - time_series[0])
        f_h1, t_h1, Sxx_h1 = signal.spectrogram(h1_data, fs=fs, nperseg=256,
                                                noverlap=128, mode='magnitude', detrend=False)
        f_l1, t_l1, Sxx_l1 = signal.spectrogram(l1_data, fs=fs, nperseg=256,
                                                noverlap=128, mode='magnitude', detrend=False)
        tf_metric = np.mean((Sxx_h1**2 + Sxx_l1**2) / 2, axis=0)
        # Map spectrogram bin times onto the GPS time axis
        gps_mid_time = time_series[0] + (time_series[-1] - time_series[0]) / 2
        metric_times = gps_mid_time + (t_h1 - t_h1[-1] / 2)
        return tf_metric, metric_times

    def calculate_statistics(tf_metric, t_h1):
        # Peak finding relative to the median background level
        background_level = np.median(tf_metric)
        peaks, _ = signal.find_peaks(tf_metric, height=background_level * 1.0,
                                     distance=2, prominence=background_level * 0.3)
        peak_times = t_h1[peaks]
        peak_heights = tf_metric[peaks]
        peak_deltat = np.full(len(peak_times), 10.0)  # fixed timing uncertainty (s)
        return peak_times, peak_heights, peak_deltat

    whitened_h1, whitened_l1, data_times = data_conditioning(strain_h1, strain_l1, times)
    tf_metric, metric_times = compute_metric_series(whitened_h1, whitened_l1, data_times)
    peak_times, peak_heights, peak_deltat = calculate_statistics(tf_metric, metric_times)
    return peak_times, peak_heights, peak_deltat
Function Role in Framework
Pipeline Workflow
Input: H1 and L1 detector strains, time array | Output: Event times, significance values, and time uncertainties
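An illustrative call with synthetic inputs (toy Gaussian noise only; real runs would load MLGWSC1 data segments):

import numpy as np

fs, duration = 4096, 32                                # Hz, seconds
times = np.arange(fs * duration) / fs + 1126259446.0   # toy GPS time axis
strain_h1 = np.random.randn(times.size) * 1e-21        # fake detector noise
strain_l1 = np.random.randn(times.size) * 1e-21
peak_times, peak_heights, peak_deltat = pipeline_v1(strain_h1, strain_l1, times)
print(f"{len(peak_times)} candidate events")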
Preliminary Results (February 2025)
Prompt Structure for Algorithm Evolution
This template guides the LLM to generate optimized gravitational wave detection algorithms by learning from comparative examples.
Key Components:
One Prompt Template for MLGWSC1 Algorithm Synthesis
You are an expert in gravitational wave signal detection algorithms. Your task is to design heuristics that can effectively solve optimization problems.
{prompt_task}
I have analyzed two algorithms and provided a reflection on their differences.
[Worse code]
{worse_code}
[Better code]
{better_code}
[Reflection]
{reflection}
Based on this reflection, please write an improved algorithm according to the reflection.
First, describe the design idea and main steps of your algorithm in one sentence. The description must be inside a brace outside the code implementation. Next, implement it in Python as a function named '{func_name}'.
This function should accept {input_count} input(s): {joined_inputs}. The function should return {output_count} output(s): {joined_outputs}.
{inout_inf} {other_inf}
Do not give additional explanations.
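For concreteness, the placeholders might be filled like this (a sketch: PROMPT_TEMPLATE is an assumed variable holding the template above, and the worse_code / better_code / reflection strings come from earlier iterations of the loop):

prompt = PROMPT_TEMPLATE.format(
    prompt_task="Design a detection statistic for the MLGWSC1 mock data challenge.",
    worse_code=worse_code,        # lower-scoring pipeline source
    better_code=better_code,      # higher-scoring pipeline source
    reflection=reflection,        # LLM-written comparison of the two
    func_name="pipeline_v1",
    input_count=3,
    joined_inputs="strain_h1, strain_l1, times",
    output_count=3,
    joined_outputs="peak_times, peak_heights, peak_deltat",
    inout_inf="All inputs and outputs are NumPy arrays.",
    other_inf="",
)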
Preliminary Results (February 2025)
MLGWSC1 preliminary results
Tree-based representation of our framework's exploration path, where each node represents a unique algorithm variant generated during the optimization process
Node color intensity: Algorithm performance level | Connections: Algorithmic modifications | Tree depth: Iteration sequence
Preliminary Results (February 2025)
Optimization Progress & Algorithm Diversity
Sensitivity vs False Alarm Rate
Optimization Target: Maximizing Area Under Curve (AUC) in the 10–100 Hz frequency range, balancing detection sensitivity and false alarm rates across algorithm generations
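As a rough sketch of the objective (a toy illustration with made-up numbers, not the actual MLGWSC1 scoring code), the AUC can be computed by numerically integrating the sensitivity curve over the false-alarm-rate axis:

import numpy as np

far = np.logspace(-4, 3, 50)             # false-alarm rate grid (toy units)
sens = 1000.0 * far / (1.0 + far)        # toy monotone sensitivity curve (Mpc)
auc = np.trapz(sens, x=np.log10(far))    # area under the curve in log-FAR
print(f"AUC = {auc:.1f}")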
Preliminary Results (February 2025)
This pipeline combines adaptive PSD whitening and multi-band spectral coherence computation with a noise floor-aware peak detection and a non-linear timing uncertainty model to enhance gravitational wave signal detection accuracy and robustness.
Integrate asymmetric PSD whitening, extended STFT overlap optimization, chirp-enhanced prominence scaling, multi-channel noise floor refinement, and dynamic timing calibration for improved gravitational wave signal detection.
Optimization Target: Maximizing Area Under Curve (AUC) in the 10–100 Hz frequency range, balancing detection sensitivity and false alarm rates across algorithm generations
Optimization Progress & Algorithm Diversity
Preliminary Results (February 2025)
The framework (LLMs) can effectively optimize complex algorithms and guide iterative development along specified optimization directions, achieving targeted performance improvements in GW detection
Preliminary Results (February 2025)
Sensitivity vs False Alarm Rate
PyCBC
CNN-like
Simple non-linear filter
Key Finding: Our framework demonstrates potential to optimize highly interpretable and scalable non-linear algorithm pipelines that achieve performance comparable to traditional matched filtering techniques.
Traditional Physics Approach
Input
Human-Designed Algorithm
(Based on human insight)
Output
Example: Matched Filtering
Black-Box AI Approach
Input
AI Model
(Low interpretability)
Output
Examples: CNN, AlphaGo
Interpretable AI Approach
Input
Optimized
Algorithm
(High interpretability)
Output
Example: OURS (on-going)
The Future: Combining traditional physics knowledge with LLM-optimized algorithms for transparent, reliable scientific discovery
(Diagram: each paradigm is driven by data/experience; in the black-box case an AI model mediates between input and output)
Key Insights from Our Journey
The Critical Role of Interpretability
Algorithm interpretability provides multiple essential benefits:
The future of gravitational wave science lies at the intersection of traditional physics-inspired methods and interpretable AI approaches, creating a new paradigm for reliable scientific discovery.
for _ in range(num_of_audiences):
    print('Thank you for your attention! 🙏')

hewang@ucas.ac.cn
By He Wang
2025/04/08 @KIAA