AI × Cosmology: From Algorithmic Tools to a New Paradigm for Scientific Discovery

AI and Cosmology: From Computational Tools to Scientific Discovery

—— A forward-looking perspective on AI-driven scientific research

王赫

2026/01/10 10:00-11:00 | The 5th Houyi Seminar Series

—— Some thoughts on future research directions

International Centre for Theoretical Physics Asia-Pacific (ICTP-AP)

University of Chinese Academy of Sciences (UCAS)

Just scrolled up and saw that someone was taking photos at the venue [face with tears of joy], so here are a few quick notes:

  • The English slide font I use most these days is Economica, a fairly modern sans-serif: https://fonts.google.com/specimen/Economica
  • It looks good but sacrifices a little legibility; when clarity matters I still fall back to Helvetica Neue.
  • For serif faces I like Arno Pro: https://fonts.adobe.com/fonts/arno
  • For Chinese I have settled on Xique Song or Muye (both paid fonts).
  • Colors are usually picked from MetBrewer, though I don't pay special attention to palettes: https://github.com/BlakeRMills/MetBrewer
  • As I told Prof. Shao today, it may be a symptom of midlife crisis that I increasingly like colorful things... and it shows in my slides. Entirely a matter of taste.
  • If anyone is interested in this slide style, here is a short deck from a July conference for reference: https://www.dropbox.com/scl/fi/duez2bpbcck4ogtn98sw6/songhuang_sesto_20250707.key?rlkey=g18rnjym1hpzke3jxcj5y6ezh&st=ot5xu2w8&dl=0
  • The layout style I am used to now only works when content is revealed step by step; it cannot show everything at once. I adopted it after I started teaching: slides need to look good and hold attention while carrying enough information for students to review. It seems fine for now, but in a couple of years I may learn to simplify.
  • Using font size and color to highlight keywords is the crudest and most clichéd way to guide the eye, long worn out in advertising. There are better design languages; I just don't know them...
  • Slide style is purely personal aesthetics; it says nothing about the quality of the talk, let alone of its content.

Content

  • LLMs for scientific discovery
    • Key insights:
      • Code
      • Natural Language Programming
      • Manifold Interpolation
    • "Program" vs "Solution" Space
    • Optimization Potentials:
      • Combinatorial optimization
      • Interpretable AI approach
    • Evo-MCTS (arXiv:2508.03661 [cs.AI])
  • AI x Cosmology
    • AI for Science
    • AI Scientist
    • AI for cosmology

LLM for Scientific Discovery

AI x Cosmology

How can LLMs be used for scientific discovery?

He Wang | ICTP-AP, UCAS

AI and Cosmology: From Computational Tools to Scientific Discovery

Uncovering the "black box" to reveal how AI actually works

He Wang | ICTP-AP, UCAS

The "Real" Reasons We Apply LLMs to Scientific Discovery

"Scientist"

"Collaborator"

"Evaluator"

Let's be honest about our motivations... 😉

AI and Cosmology: From Computational Tools to Scientific Discovery

He Wang | ICTP-AP, UCAS

The "Real" Reasons We Apply LLMs to Scientific Discovery

Let's be honest about our motivations... 😉

AI and Cosmology: From Computational Tools to Scientific Discovery

The AI Cosmologist I: An Agentic System for Automated Data Analysis

Multi-Agent System for Cosmological Parameter Analysis

He Wang | ICTP-AP, UCAS

The "Real" Reasons We Apply LLMs to Scientific Discovery

Let's be honest about our motivations... 😉

Doesn't work directly? Repackage it and send it back for another round.

npj Artif. Intell. 1, 14 (2025).

"sequence output"

"sequence input"

Direct fails. Refine and recover.

AI and Cosmology: From Computational Tools to Scientific Discovery

Demo: Using an LLM to verify Kepler's three laws of planetary motion
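For instance, Kepler's third law (\(T^2 \propto a^3\)) can be checked in a few lines, in the spirit of this demo (a minimal sketch; the orbital elements are standard textbook values in AU and years):

planets = {
    "Mercury": (0.387, 0.241),
    "Venus":   (0.723, 0.615),
    "Earth":   (1.000, 1.000),
    "Mars":    (1.524, 1.881),
    "Jupiter": (5.203, 11.862),
}
for name, (a, T) in planets.items():
    # Kepler III: T^2 / a^3 is the same constant (~1 in these units) for every planet.
    print(f"{name:8s} T^2/a^3 = {T**2 / a**3:.3f}")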

He Wang | ICTP-AP, UCAS

The "Real" Reasons We Apply LLMs to Scientific Discovery

Let's be honest about our motivations... 😉

AI and Cosmology: From Computational Tools to Scientific Discovery

Generative agents rely on predefined rules. 🤫

📄 Google DeepMind: "Scaling LLM Test-Time Compute Optimally" (arXiv:2408.03314)

🔗 OpenAI: Learning to Reason with LLMs

He Wang | ICTP-AP, UCAS

The "Real" Reasons We Apply LLMs to Scientific Discovery

AI and Cosmology: From Computational Tools to Scientific Discovery

The evolution of GPT's capabilities

A careful examination of GPT-3.5 reveals where its emergent abilities come from:

  • The original GPT-3 acquired generation, world knowledge, and in-context learning through pre-training
  • Instruction-tuned models developed instruction following and generalization to unseen tasks
  • The code-trained model (code-davinci-002) gained the ability to understand code
  • The capacity for complex reasoning is likely a by-product of training on code

What are our thoughts on LLMs?

GPT-3.5 series [Source: University of Edinburgh, Allen Institute for AI]

GPT-3 (2020)

ChatGPT (2022)

Magic: Code + Text

What really makes LLMs so powerful

Code!  (1/3)

He Wang | ICTP-AP, UCAS

The "Real" Reasons We Apply LLMs to Scientific Discovery

AI and Cosmology: From Computational Tools to Scientific Discovery

What are our thoughts on LLMs?

What really makes LLMs so disruptive

MCP

MCP Tool

prompt

"Please generate gw templates first."

The Model Context Protocol (MCP) is an open protocol released by Anthropic for integrating large language models with external data sources and tools, establishing secure, bidirectional connections between models and data sources.

Demo: GW150914 MCP Signal Search
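A minimal sketch of how such a tool could be exposed over MCP, using the FastMCP helper from the official `mcp` Python SDK (assumed installed); the waveform here is a toy linear chirp standing in for a physical template:

import numpy as np
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("gw-tools")

@mcp.tool()
def generate_gw_template(f0: float = 35.0, f1: float = 250.0, duration: float = 0.2) -> list[float]:
    """Return a toy linear-chirp 'template' sampled at 4096 Hz (illustration only)."""
    t = np.arange(0, duration, 1.0 / 4096)
    phase = 2 * np.pi * (f0 * t + 0.5 * (f1 - f0) / duration * t**2)
    return (np.sin(phase) * np.hanning(len(t))).tolist()

if __name__ == "__main__":
    mcp.run()  # serve the tool over stdio for an MCP-capable LLM client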

He Wang | ICTP-AP, UCAS

The "Real" Reasons We Apply LLMs to Scientific Discovery

AI and Cosmology: From Computational Tools to Scientific Discovery

What are our thoughts on LLMs?

Rule-Based Vs. LLMs: (Source)

Natural Language Programming!  (2/3)

What really makes LLMs so disruptive

MCP

RAG

Embodied AI

He Wang | ICTP-AP, UCAS

The "Real" Reasons We Apply LLMs to Scientific Discovery

AI and Cosmology: From Computational Tools to Scientific Discovery

What are our thoughts on LLMs?

It's Mere Interpolation!  (3/3)

How should we explain how AI/LLMs work?

The core driving force of AI4Sci lies largely in its "interpolation"-style generalization, which underpins its powerful modeling of complex systems.

Deep Learning is Not As Impressive As you Think, It's Mere Interpolation (Source)

He Wang | ICTP-AP, UCAS

The "Real" Reasons We Apply LLMs to Scientific Discovery

AI and Cosmology: From Computational Tools to Scientific Discovery

What are our thoughts on LLMs?

It's Mere Interpolation!  (3/3)

How should we explain how AI/LLMs work?

The core driving force of AI4Sci lies largely in its "interpolation"-style generalization, which underpins its powerful modeling of complex systems.

Deep Learning is Not As Impressive As you Think, It's Mere Interpolation (Source)

Representation Space Interpolation

He Wang | ICTP-AP, UCAS

The "Real" Reasons We Apply LLMs to Scientific Discovery

AI and Cosmology: From Computational Tools to Scientific Discovery

What are our thoughts on LLMs?

It's Mere Interpolation!  (3/3)

How should we explain how AI/LLMs work?

Deep Learning is Not As Impressive As you Think, It's Mere Interpolation (Source)

  • [Signal search] Tested on beyond-GR (BGR) waveforms, the framework shows generalization and robustness on par with GR signal detection across different PN orders and luminosity distances.

(Figure: example GR vs. BGR waveforms)

Yu-Xin Wang, Xiaotong Wei, Chun-Yue Li, Tian-Yang Sun, Shang-Jie Jin, He Wang*, Jing-Lei Cui, Jing-Fei Zhang, and Xin Zhang*. “Search for Exotic Gravitational Wave Signals beyond General Relativity Using Deep Learning.” PRD 112 (2), 024030. e-Print: arXiv:2410.20129 [gr-qc]

  • [Cosmology] A CVAE can compress CMB power spectra into just 5 latent dimensions, reconstructing them with better than 99.9% fidelity within Planck uncertainties, and it reconstructs reliably even under parameter extrapolation.

~ sampling

Tian-Yang Sun, Tian-Nuo Li, He Wang*, Jing-Fei Zhang, Xin Zhang*. Conditional variational autoencoders for cosmological model discrimination and anomaly detection in cosmic microwave background power spectra. e-Print: arXiv:2510.27086 [astro-ph.CO]

By designing nonlinear mappings, scientific data can be represented in diverse feature spaces, improving the modeling and inference of complex scientific problems.

The circles in the Nuremberg Chronicle are not idle doodles; they are the notes of an early reader trying to reconcile the two different chronologies of the Septuagint (the Greek Old Testament) and the Hebrew Bible.

AI and archaeology:

Cap Set Problem

  • Given N, find the largest number of points in an N-dimensional grid such that no three of the chosen points lie on a common line.

Bin Packing Problem

  • How to pack items of different sizes into the fewest possible bins when the items arrive online.

The largest cap set in N=2 has size 4.

The largest cap set in N=3 has size 9 > \(2^3\).

For N > 6, the size of the largest cap set is unknown.
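A brute-force sketch of the defining property (assuming the standard formulation over \(\mathbb{F}_3^N\), where three distinct points are collinear iff they sum to zero mod 3 in every coordinate):

from itertools import combinations

def is_cap_set(points):
    # In F_3^N, distinct points p, q, r lie on a line iff (p + q + r) % 3 == 0
    # componentwise; a cap set is a set with no such triple.
    return all(
        any((a + b + c) % 3 for a, b, c in zip(p, q, r))
        for p, q, r in combinations(points, 3)
    )

# Verify the size-4 cap set in N=2 quoted above.
print(is_cap_set([(0, 0), (0, 1), (1, 0), (1, 1)]))  # True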

Discover new knowledge and efficient algorithms using AI

He Wang | ICTP-AP, UCAS

AI and Cosmology: From Computational Tools to Scientific Discovery

Illustrative example of bin packing using existing heuristic – Best-fit heuristic (left), and using a heuristic discovered by FunSearch (right).

DeepMind Blog (Source)
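For reference, the best-fit baseline shown on the left can be sketched in a few lines (a minimal version, assuming unit-capacity bins; FunSearch evolves a replacement for exactly this kind of placement rule):

def best_fit(items, capacity=1.0):
    """Online best-fit: put each item into the fullest bin that still fits."""
    bins = []  # remaining capacity of each open bin
    for item in items:
        candidates = [i for i, space in enumerate(bins) if space >= item]
        if candidates:
            i = min(candidates, key=lambda i: bins[i])  # tightest feasible bin
            bins[i] -= item
        else:
            bins.append(capacity - item)  # no bin fits: open a new one
    return len(bins)

print(best_fit([0.4, 0.7, 0.2, 0.5, 0.3]))  # 3 bins for this toy stream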

LLM-guided search in "program" space

Discover new knowledge and efficient algorithms using AI

He Wang | ICTP-AP, UCAS

AI and Cosmology: From Computational Tools to Scientific Discovery

LLM-guided search in "program" space

Real-world Case: FunSearch (Nature, 2023)

  • Google DeepMind's FunSearch system pairs LLMs with evaluators in an evolutionary process
  • Discovered new mathematical knowledge for the cap set problem in combinatorics, improving on best known bounds
  • Also created novel algorithms for online bin packing that outperform traditional methods
  • Demonstrates LLMs can make verifiable scientific discoveries beyond their training data

YouTube (Source)

Discover new knowledge and efficient algorithms using AI

He Wang | ICTP-AP, UCAS

AI and Cosmology: From Computational Tools to Scientific Discovery

YouTube (Source)

LLM-guided search in "program" space

Discover new knowledge and efficient algorithms using AI

He Wang | ICTP-AP, UCAS

AI and Cosmology: From Computational Tools to Scientific Discovery

LLM-guided search in "solution" space

Recent research demonstrates that LLMs can solve complex optimization problems through carefully engineered prompts. DeepMind's OPRO (Optimization by PROmpting) approach showcases how LLMs can generate increasingly refined solutions through iterative prompting techniques.

Example: Least squares optimization

arXiv:2309.03409 [cs.NE]

Example: Traveling Salesman Problems

LLMs can generate high-quality solutions to optimization problems without specialized training

Discover new knowledge and efficient algorithms using AI

He Wang | ICTP-AP, UCAS

AI and Cosmology: From Computational Tools to Scientific Discovery

LLM-guided search in "solution" space

Recent research demonstrates that LLMs can solve complex optimization problems through carefully engineered prompts. DeepMind's OPRO (Optimization by PROmpting) approach showcases how LLMs can generate increasingly refined solutions through iterative prompting techniques.

LLMs can generate high-quality solutions to optimization problems without specialized training
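A minimal sketch of an OPRO-style loop (here `llm_propose` and `score` are placeholders for an LLM call and a task-specific objective; the meta-prompt simply lists past solution-score pairs and asks for something better):

def opro_loop(score, llm_propose, steps=20, keep=8):
    """OPRO-style optimization: feed the best scored solutions back into the prompt."""
    history = []  # list of (solution_text, score)
    for _ in range(steps):
        top = sorted(history, key=lambda p: -p[1])[:keep]
        meta_prompt = ("Previous solutions and scores:\n"
                       + "\n".join(f"{s} -> {v:.3f}" for s, v in top)
                       + "\nPropose a new solution with a higher score.")
        candidate = llm_propose(meta_prompt)  # hypothetical LLM call
        history.append((candidate, score(candidate)))
    return max(history, key=lambda p: p[1])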

(Figure: point estimates of the GW component masses m1 and m2)

Demo: GW Parameter (Point) Estimation

arXiv:2309.03409 [cs.NE]

The Optimization Potential of Large Language Models

He Wang | ICTP-AP, UCAS

AI and Cosmology: From Computational Tools to Scientific Discovery

Recent research demonstrates that LLMs can solve complex optimization problems through carefully engineered prompts. 

Two Directions of LLM-based Optimization

Large Language Models as Designers: LLMs are used to directly create algorithms or specific components,
which are commonly incorporated iteratively to continuously search for better designs.

What are our thoughts on LLMs in scientific discovery?

Using LLMs to generate algorithms for combinatorial optimization problems (1/3)

The Optimization Potential of Large Language Models

He Wang | ICTP-AP, UCAS

AI and Cosmology: From Computational Tools to Scientific Discovery

What are our thoughts on LLMs in scientific discovery?

ReEvo

MCTS-AHD

Using LLMs to generate algorithms for combinatorial optimization problems (1/3)

The Optimization Potential of Large Language Models

He Wang | ICTP-AP, UCAS

AI and Cosmology: From Computational Tools to Scientific Discovery

Using LLMs to generate solutions to pain points in scientific computing (2/3)

What are our thoughts on LLMs in scientific discovery?

The strict requirements for algorithm discovery

  1. Physical constraints: Must follow physical laws and domain knowledge
  2. Efficiency: Must navigate large, costly search spaces
  3. Interpretability: Must be understandable and verifiable by experts

  • Traditional Physics Approach: Input → Human-Designed Algorithm (based on human insight) → Output. Examples: matched filtering, linear regression.
  • Black-Box AI Approach: Input → AI Model (low interpretability), trained on data/experience → Output. Examples: CNN, AlphaGo, DINGO.
  • Interpretable AI Approach, the best of both worlds (🎯 OUR WORK): Input → Physics-Informed Algorithm (high interpretability), synthesized by an AI model from physics knowledge plus data/experience → Output. Examples: Evo-MCTS, AlphaEvolve.

The Optimization Potential of Large Language Models

He Wang | ICTP-AP, UCAS

AI and Cosmology: From Computational Tools to Scientific Discovery

What are our thoughts on LLMs in scientific discovery?

Motivation I:  Linear template method using prior data

  • Traditional matched filtering relies on large template banks, which raises computational cost and noise sensitivity and hampers the detection of new gravitational-wave signals.

Motivation II:  Black-box data-driven learning methods

  • Deep neural networks excel at nonlinear modeling but are "black boxes" with poor interpretability, making them hard to trust for high-stakes scientific validation.

Nitz et al., ApJ (2017)

Sci4MLGW @ICERM (June 2025)

Automated Heuristic Design: Problem Definition

He Wang | ICTP-AP, UCAS

AI and Cosmology: From Computational Tools to Scientific Discovery

For any complex task \(P\) (especially NP-hard problems), Automated Heuristic Design (AHD) 
searches for the optimal heuristic \(h^*\) within a heuristic space \(H\):

\(h^*=\underset{h \in H}{\arg \max } g(h) \)

The heuristic space \(H\) contains all feasible algorithmic solutions for task \(P\). Each heuristic \(h \in H\) maps from the set of task inputs \(I_P\) to corresponding solutions \(S_P\):

\(h: I_P \rightarrow S_P\)

Performance measure \(g(\cdot)\) evaluates each heuristic's effectiveness, \(g: H \rightarrow \mathbb{R}\). For minimization problems with objective function \(f: S_P \rightarrow \mathbb{R}\), we estimate performance by evaluating the heuristic on instances \(\mathit{ins} \in D \subseteq I_P\) from dataset \(D\):

\(g(h)=\mathbb{E}_{\boldsymbol{ins} \in D}[-f(h(\boldsymbol{ins}))]\)
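In code, evaluating a candidate heuristic reduces to a one-liner (a sketch, assuming each heuristic is a Python callable over task instances and `f` scores solutions):

import numpy as np

def g(h, D, f):
    """Estimate g(h) = E_ins[ -f(h(ins)) ] over the dataset D."""
    return float(np.mean([-f(h(ins)) for ins in D]))

# Toy usage: instances are numbers, solutions are numbers, f is the objective.
D = [0.5, 1.0, 2.0]
f = lambda s: s**2           # smaller is better
h_identity = lambda ins: ins  # a trivial heuristic
print(g(h_identity, D, f))    # -1.75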

(Figure: AHD schematic — heuristics \(h \in H\) map task inputs \(I_P\) to solutions \(S_P\) and are scored by \(g: H \rightarrow \mathbb{R}\), subject to external_knowledge constraints)

Automated Heuristic Design: Problem Definition

He Wang | ICTP-AP, UCAS

AI and Cosmology: From Computational Tools to Scientific Discovery

import numpy as np
import scipy.signal as signal
def pipeline_v1(strain_h1: np.ndarray, strain_l1: np.ndarray, times: np.ndarray) -> tuple[np.ndarray, np.ndarray, np.ndarray]:
    def data_conditioning(strain_h1: np.ndarray, strain_l1: np.ndarray, times: np.ndarray) -> tuple[np.ndarray, np.ndarray, np.ndarray]:
        window_length = 4096
        dt = times[1] - times[0]
        fs = 1.0 / dt
        
        def whiten_strain(strain):
            strain_zeromean = strain - np.mean(strain)
            freqs, psd = signal.welch(strain_zeromean, fs=fs, nperseg=window_length,
                                       window='hann', noverlap=window_length//2)
            smoothed_psd = np.convolve(psd, np.ones(32) / 32, mode='same')
            smoothed_psd = np.maximum(smoothed_psd, np.finfo(float).tiny)
            white_fft = np.fft.rfft(strain_zeromean) / np.sqrt(np.interp(np.fft.rfftfreq(len(strain_zeromean), d=dt), freqs, smoothed_psd))
            return np.fft.irfft(white_fft)

        whitened_h1 = whiten_strain(strain_h1)
        whitened_l1 = whiten_strain(strain_l1)
        
        return whitened_h1, whitened_l1, times
    
    def compute_metric_series(h1_data: np.ndarray, l1_data: np.ndarray, time_series: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
        fs = 1 / (time_series[1] - time_series[0])
        f_h1, t_h1, Sxx_h1 = signal.spectrogram(h1_data, fs=fs, nperseg=256, noverlap=128, mode='magnitude', detrend=False)
        f_l1, t_l1, Sxx_l1 = signal.spectrogram(l1_data, fs=fs, nperseg=256, noverlap=128, mode='magnitude', detrend=False)
        tf_metric = np.mean((Sxx_h1**2 + Sxx_l1**2) / 2, axis=0)
        gps_mid_time = time_series[0] + (time_series[-1] - time_series[0]) / 2
        metric_times = gps_mid_time + (t_h1 - t_h1[-1] / 2)
        
        return tf_metric, metric_times

    def calculate_statistics(tf_metric, t_h1):
        background_level = np.median(tf_metric)
        peaks, _ = signal.find_peaks(tf_metric, height=background_level * 1.0, distance=2, prominence=background_level * 0.3)
        peak_times = t_h1[peaks]
        peak_heights = tf_metric[peaks]
        peak_deltat = np.full(len(peak_times), 10.0)  # Fixed uncertainty value
        return peak_times, peak_heights, peak_deltat

    whitened_h1, whitened_l1, data_times = data_conditioning(strain_h1, strain_l1, times)
    tf_metric, metric_times = compute_metric_series(whitened_h1, whitened_l1, data_times)
    peak_times, peak_heights, peak_deltat = calculate_statistics(tf_metric, metric_times)
    
    return peak_times, peak_heights, peak_deltat

Input: H1 and L1 detector strains, time array | Output: Event times, significance values, and time uncertainties

Optimization Target: Maximizing the Area Under the Curve (AUC) over the 1-1000 false-alarms-per-year range, balancing detection sensitivity against false-alarm rate across algorithm generations
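A sketch of how such a target could be scored (assumptions: `far` in false alarms per year and `sensitivity` arrays come from the MLGWSC-1 evaluation; the integral is taken over \(\log_{10}\) FAR within the stated band):

import numpy as np

def auc_over_far_band(far, sensitivity, lo=1.0, hi=1000.0):
    """Integrate sensitivity over log10(FAR) within [lo, hi] false alarms per year."""
    order = np.argsort(far)
    far, sens = np.asarray(far)[order], np.asarray(sensitivity)[order]
    grid = np.logspace(np.log10(lo), np.log10(hi), 256)  # log-spaced FAR grid
    sens_on_grid = np.interp(grid, far, sens)
    return np.trapz(sens_on_grid, np.log10(grid))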

Problem: Pipeline Workflow

  1. Conditions raw detector data (whitening)
  2. Computes time-frequency metrics
  3. Identifies peaks above background
  4. Returns event candidates with timestamps

MLGWSC-1 benchmark


Automated Heuristic Design: Problem Definition

He Wang | ICTP-AP, UCAS

AI and Cosmology: From Computational Tools to Scientific Discovery

Optimization Target: Maximizing the Area Under the Curve (AUC) over the 1-1000 false-alarms-per-year range, balancing detection sensitivity against false-alarm rate across algorithm generations

MLGWSC-1 benchmark


Evaluation for MLGWSC-1 benchmark

Strategies for Adapting Gravitational Wave Detection for Algorithmic Discovery

Automated Heuristic Design: Problem Definition

He Wang | ICTP-AP, UCAS

AI and Cosmology: From Computational Tools to Scientific Discovery


Strategies for Adapting Gravitational Wave Detection for Algorithmic Discovery

You are an expert in gravitational wave signal detection algorithms. Your task is to design heuristics that can effectively solve optimization problems.

{prompt_task}

I have analyzed two algorithms and provided a reflection on their differences. 

[Worse code]
{worse_code}

[Better code]
{better_code}

[Reflection]
{reflection}

{external_knowledge}

Based on this reflection, please write an improved algorithm according to the reflection. 
First, describe the design idea and main steps of your algorithm in one sentence. The description must be inside a brace outside the code implementation. Next, implement it in Python as a function named '{func_name}'.
This function should accept {input_count} input(s): {joined_inputs}. The function should return {output_count} output(s): {joined_outputs}. 
{inout_inf} {other_inf}

Do not give additional explanations.

One Prompt Template for MLGWSC1 Algorithm Synthesis

Prompt Structure for Algorithm Evolution

This template guides the LLM to generate optimized gravitational wave detection algorithms by learning from comparative examples ("Crossover").

Key Components:

  • Expert role establishment
  • Example pair analysis (worse/better algorithm)
  • Reflection on improvements
  • Targeted new algorithm generation
  • Strict output format enforcement

Algorithmic Synergy: MCTS, Evolution & LLM Agents

He Wang | ICTP-AP, UCAS

AI and Cosmology: From Computational Tools to Scientific Discovery

LLM-Informed Evo-MCTS for AAD

Monte Carlo Tree Search (MCTS)

Algorithmic Synergy: MCTS, Evolution & LLM Agents

He Wang | ICTP-AP, UCAS

AI and Cosmology: From Computational Tools to Scientific Discovery


Monte Carlo Tree Search (MCTS)

Case 1: Go Game

Case 2: OpenAI Strawberry (o1)

The release of o1 marks the inference-time scaling paradigm formally entering production. As Sutton argued in "The Bitter Lesson", only two techniques scale without limit with compute: learning and search. Since then, the emphasis has shifted to search.

Browne et al. (2012)

Monte Carlo Tree Search (MCTS) combines random simulation with tree search to optimize decisions, and has always been the core technique of modern game-playing programs such as AlphaGo.
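At its core, MCTS repeatedly selects children by the UCT rule; a generic sketch (not specific to Evo-MCTS; children are plain dicts with visit and value counters):

import math

def uct_select(children, c=1.4):
    """Pick the child maximizing mean value plus an exploration bonus (UCT)."""
    total = sum(ch["visits"] for ch in children)
    def uct(ch):
        if ch["visits"] == 0:
            return float("inf")  # always try unvisited nodes first
        return ch["value"] / ch["visits"] + c * math.sqrt(math.log(total) / ch["visits"])
    return max(children, key=uct)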

LLM-Informed Evo-MCTS for AAD

Algorithmic Synergy: MCTS, Evolution & LLM Agents

He Wang | ICTP-AP, UCAS

AI and Cosmology: From Computational Tools to Scientific Discovery

Evolutionary Algorithms (EA)

  • Within each evolutionary iteration, Monte Carlo Tree Search (MCTS) decomposes complex signal detection problems into manageable decision sequences, enabling depth-wise and path-wise exploration of algorithmic possibilities.
  • We propose four evolutionary operations for MCTS expansion: Parent Crossover (PC) combines information from nodes at the parent level, Sibling Crossover (SC) exchanges features between nodes sharing the same parent, Point Mutation (PM) introduces random perturbations to individual nodes, and Path-wise Crossover (PWC) synthesizes information along complete trajectories from root to leaf (see the sketch below).
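Schematically, expansion might dispatch one of the four operators as follows (a hypothetical sketch; `llm_generate`, the Node class, and its helpers are illustrative, not the paper's actual API):

import random
from dataclasses import dataclass, field

@dataclass
class Node:
    code: str
    parent: "Node | None" = None
    children: list = field(default_factory=list)

def siblings(node):
    return [c for c in node.parent.children if c is not node] if node.parent else []

def path_from_root(node):
    path = []
    while node:
        path.append(node)
        node = node.parent
    return path[::-1]

OPERATORS = ["PC", "SC", "PM", "PWC"]

def expand(node, llm_generate):
    """Create a child by generating a new heuristic with one evolutionary operator."""
    op = random.choice(OPERATORS)
    if op == "PC" and node.parent:       # combine information at the parent level
        context = [node.parent.code] + [s.code for s in siblings(node.parent)]
    elif op == "SC" and siblings(node):  # exchange features between siblings
        context = [node.code] + [s.code for s in siblings(node)]
    elif op == "PWC":                    # synthesize along the root-to-leaf trajectory
        context = [n.code for n in path_from_root(node)]
    else:                                # point mutation of a single node
        context = [node.code]
    child = Node(code=llm_generate(op, context), parent=node)
    node.children.append(child)
    return child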

LLM-Informed Evo-MCTS for AAD

Algorithmic Synergy: MCTS, Evolution & LLM Agents

He Wang | ICTP-AP, UCAS

AI and Cosmology: From Computational Tools to Scientific Discovery

LLM-Driven Algorithmic Evolution Through Reflective Code Synthesis.

  • deepseek-R1 for reflection generation
  • o3-mini-medium for code generation

LLM-Informed Evo-MCTS for AAD

MLGWSC1 Benchmark: Optimization Performance Results

He Wang | ICTP-AP, UCAS

AI and Cosmology: From Computational Tools to Scientific Discovery

Benchmarking against state-of-the-art methods

Automated exploration of algorithm parameter space

MLGWSC1 Benchmark: Optimization Performance Results

He Wang | ICTP-AP, UCAS

AI and Cosmology: From Computational Tools to Scientific Discovery

Benchmarking against state-of-the-art methods

Automated exploration of algorithm parameter space

(Figure: sensitivity comparison against PyCBC (linear core), cWB (nonlinear core), simple nonlinear filters, and CNN-like (highly nonlinear) pipelines, with 20.2% and 23.4% improvements highlighted)

MLGWSC1 Benchmark: Optimization Performance Results

He Wang | ICTP-AP, UCAS

AI and Cosmology: From Computational Tools to Scientific Discovery

Benchmarking against state-of-the-art methods

Automated exploration of algorithm parameter space


MLGWSC1 Benchmark: Optimization Performance Results

He Wang | ICTP-AP, UCAS

AI and Cosmology: From Computational Tools to Scientific Discovery

Optimization Progress & Algorithm Diversity

Diversity metrics:

  • Shannon index captures algorithmic variety
  • CID measures structural complexity differences.

Diversity in Evolutionary Computation

Population encoding:

  1. Removing comments and docstrings using abstract-syntax tree,
  2. standardizing code snippets into a common coding style (e.g., PEP81),
  3. Convert code snippets to vector representations using a code embedding model.
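A sketch of the docstring-stripping step and the Shannon index (the embedding/clustering model is a placeholder; `labels` stands for cluster assignments of the embedded population):

import ast
import math
from collections import Counter

def strip_docstrings(code: str) -> str:
    """Drop docstrings via the AST; comments vanish automatically on re-parse."""
    tree = ast.parse(code)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef, ast.Module)):
            if (node.body and isinstance(node.body[0], ast.Expr)
                    and isinstance(node.body[0].value, ast.Constant)
                    and isinstance(node.body[0].value.value, str)):
                node.body = node.body[1:] or [ast.Pass()]
    return ast.unparse(tree)

def shannon_index(labels) -> float:
    """H = -sum p_i log p_i over cluster labels of the code embeddings."""
    counts = Counter(labels)
    n = sum(counts.values())
    return -sum((c / n) * math.log(c / n) for c in counts.values())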

Interpretability Analysis

He Wang | ICTP-AP, UCAS

AI and Cosmology: From Computational Tools to Scientific Discovery

Algorithmic Component Impact Analysis.

  • A comprehensive technique impact analysis using controlled comparative methodology
import numpy as np
import scipy.signal as signal
from scipy.signal.windows import tukey
from scipy.signal import savgol_filter

def pipeline_v2(strain_h1: np.ndarray, strain_l1: np.ndarray, times: np.ndarray) -> tuple[np.ndarray, np.ndarray, np.ndarray]:
    """
    The pipeline function processes gravitational wave data from the H1 and L1 detectors to identify potential gravitational wave signals.
    It takes strain_h1 and strain_l1 numpy arrays containing detector data, and times array with corresponding time points.
    The function returns a tuple of three numpy arrays: peak_times containing GPS times of identified events,
    peak_heights with significance values of each peak, and peak_deltat showing time window uncertainty for each peak.
    """
    eps = np.finfo(float).tiny
    dt = times[1] - times[0]
    fs = 1.0 / dt
    # Base spectrogram parameters
    base_nperseg = 256
    base_noverlap = base_nperseg // 2
    medfilt_kernel = 101       # odd kernel size for robust detrending
    uncertainty_window = 5     # half-window for local timing uncertainty

    # -------------------- Stage 1: Robust Baseline Detrending --------------------
    # Remove long-term trends using a median filter for each channel.
    detrended_h1 = strain_h1 - signal.medfilt(strain_h1, kernel_size=medfilt_kernel)
    detrended_l1 = strain_l1 - signal.medfilt(strain_l1, kernel_size=medfilt_kernel)

    # -------------------- Stage 2: Adaptive Whitening with Enhanced PSD Smoothing --------------------
    def adaptive_whitening(strain: np.ndarray) -> np.ndarray:
        # Center the signal.
        centered = strain - np.mean(strain)
        n_samples = len(centered)
        # Adaptive window length: between 5 and 30 seconds
        win_length_sec = np.clip(n_samples / fs / 20, 5, 30)
        nperseg_adapt = int(win_length_sec * fs)
        nperseg_adapt = max(10, min(nperseg_adapt, n_samples))
        
        # Create a Tukey window with 75% overlap.
        tukey_alpha = 0.25
        win = tukey(nperseg_adapt, alpha=tukey_alpha)
        noverlap_adapt = int(nperseg_adapt * 0.75)
        if noverlap_adapt >= nperseg_adapt:
            noverlap_adapt = nperseg_adapt - 1
        
        # Estimate the power spectral density (PSD) using Welch's method.
        freqs, psd = signal.welch(centered, fs=fs, nperseg=nperseg_adapt,
                                  noverlap=noverlap_adapt, window=win, detrend='constant')
        psd = np.maximum(psd, eps)
        
        # Compute relative differences for PSD stationarity measure.
        diff_arr = np.abs(np.diff(psd)) / (psd[:-1] + eps)
        # Smooth the derivative with a moving average.
        if len(diff_arr) >= 3:
            smooth_diff = np.convolve(diff_arr, np.ones(3)/3, mode='same')
        else:
            smooth_diff = diff_arr
        
        # Exponential smoothing (Kalman-like) with adaptive alpha using PSD stationarity.
        smoothed_psd = np.copy(psd)
        for i in range(1, len(psd)):
            # Adaptive smoothing coefficient: base 0.8 modified by local stationarity (±0.05)
            local_alpha = np.clip(0.8 - 0.05 * smooth_diff[min(i-1, len(smooth_diff)-1)], 0.75, 0.85)
            smoothed_psd[i] = local_alpha * smoothed_psd[i-1] + (1 - local_alpha) * psd[i]
            
        # Compute Tikhonov regularization gain based on deviation from median PSD.
        noise_baseline = np.median(smoothed_psd)
        raw_gain = (smoothed_psd / (noise_baseline + eps)) - 1.0
        
        # Compute a causal-like gradient using the Savitzky-Golay filter.
        win_len = 11 if len(smoothed_psd) >= 11 else ((len(smoothed_psd)//2)*2+1)
        polyorder = 2 if win_len > 2 else 1
        delta_freq = np.mean(np.diff(freqs))
        grad_psd = savgol_filter(smoothed_psd, win_len, polyorder, deriv=1, delta=delta_freq, mode='interp')
        
        # Nonlinear scaling via sigmoid to enhance gradient differences.
        sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
        scaling_factor = 1.0 + 2.0 * sigmoid(np.abs(grad_psd) / (np.median(smoothed_psd) + eps))
        
        # Compute adaptive gain factors with nonlinear scaling.
        gain = 1.0 - np.exp(-0.5 * scaling_factor * raw_gain)
        gain = np.clip(gain, -8.0, 8.0)
        
        # FFT-based whitening: interpolate gain and PSD onto FFT frequency bins.
        signal_fft = np.fft.rfft(centered)
        freq_bins = np.fft.rfftfreq(n_samples, d=dt)
        interp_gain = np.interp(freq_bins, freqs, gain, left=gain[0], right=gain[-1])
        interp_psd = np.interp(freq_bins, freqs, smoothed_psd, left=smoothed_psd[0], right=smoothed_psd[-1])
        denom = np.sqrt(interp_psd) * (np.abs(interp_gain) + eps)
        denom = np.maximum(denom, eps)
        white_fft = signal_fft / denom
        whitened = np.fft.irfft(white_fft, n=n_samples)
        return whitened

    # Whiten H1 and L1 channels using the adapted method.
    white_h1 = adaptive_whitening(detrended_h1)
    white_l1 = adaptive_whitening(detrended_l1)

    # -------------------- Stage 3: Coherent Time-Frequency Metric with Frequency-Conditioned Regularization --------------------
    def compute_coherent_metric(w1: np.ndarray, w2: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
        # Compute complex spectrograms preserving phase information.
        f1, t_spec, Sxx1 = signal.spectrogram(w1, fs=fs, nperseg=base_nperseg,
                                              noverlap=base_noverlap, mode='complex', detrend=False)
        f2, t_spec2, Sxx2 = signal.spectrogram(w2, fs=fs, nperseg=base_nperseg,
                                               noverlap=base_noverlap, mode='complex', detrend=False)
        # Ensure common time axis length.
        common_len = min(len(t_spec), len(t_spec2))
        t_spec = t_spec[:common_len]
        Sxx1 = Sxx1[:, :common_len]
        Sxx2 = Sxx2[:, :common_len]
        
        # Compute phase differences and coherence between detectors.
        phase_diff = np.angle(Sxx1) - np.angle(Sxx2)
        phase_coherence = np.abs(np.cos(phase_diff))
        
        # Estimate median PSD per frequency bin from the spectrograms.
        psd1 = np.median(np.abs(Sxx1)**2, axis=1)
        psd2 = np.median(np.abs(Sxx2)**2, axis=1)
        
        # Frequency-conditioned regularization gain (reflection-guided).
        lambda_f = 0.5 * ((np.median(psd1) / (psd1 + eps)) + (np.median(psd2) / (psd2 + eps)))
        lambda_f = np.clip(lambda_f, 1e-4, 1e-2)
        # Regularization denominator integrating detector PSDs and lambda.
        reg_denom = (psd1[:, None] + psd2[:, None] + lambda_f[:, None] + eps)
        
        # Weighted phase coherence that balances phase alignment with noise levels.
        weighted_comp = phase_coherence / reg_denom
        
        # Compute axial (frequency) second derivatives as curvature estimates.
        d2_coh = np.gradient(np.gradient(phase_coherence, axis=0), axis=0)
        avg_curvature = np.mean(np.abs(d2_coh), axis=0)
        
        # Nonlinear activation boost using tanh for regions of high curvature.
        nonlinear_boost = np.tanh(5 * avg_curvature)
        linear_boost = 1.0 + 0.1 * avg_curvature
        
        # Cross-detector synergy: weight derived from global median consistency.
        novel_weight = np.mean((np.median(psd1) + np.median(psd2)) / (psd1[:, None] + psd2[:, None] + eps), axis=0)
        
        # Integrated time-frequency metric combining all enhancements.
        tf_metric = np.sum(weighted_comp * linear_boost * (1.0 + nonlinear_boost), axis=0) * novel_weight
        
        # Adjust the spectrogram time axis to account for window delay.
        metric_times = t_spec + times[0] + (base_nperseg / 2) / fs
        return tf_metric, metric_times

    tf_metric, metric_times = compute_coherent_metric(white_h1, white_l1)

    # -------------------- Stage 4: Multi-Resolution Thresholding with Octave-Spaced Dyadic Wavelet Validation --------------------
    def multi_resolution_thresholding(metric: np.ndarray, times_arr: np.ndarray) -> tuple[np.ndarray, np.ndarray, np.ndarray]:
        # Robust background estimation with median and MAD.
        bg_level = np.median(metric)
        mad_val = np.median(np.abs(metric - bg_level))
        robust_std = 1.4826 * mad_val
        threshold = bg_level + 1.5 * robust_std

        # Identify candidate peaks using prominence and minimum distance criteria.
        peaks, _ = signal.find_peaks(metric, height=threshold, distance=2, prominence=0.8 * robust_std)
        if peaks.size == 0:
            return np.array([]), np.array([]), np.array([])

        # Local uncertainty estimation using a Gaussian-weighted convolution.
        win_range = np.arange(-uncertainty_window, uncertainty_window + 1)
        sigma = uncertainty_window / 2.5
        gauss_kernel = np.exp(-0.5 * (win_range / sigma) ** 2)
        gauss_kernel /= np.sum(gauss_kernel)
        weighted_mean = np.convolve(metric, gauss_kernel, mode='same')
        weighted_sq = np.convolve(metric ** 2, gauss_kernel, mode='same')
        variances = np.maximum(weighted_sq - weighted_mean ** 2, 0.0)
        uncertainties = np.sqrt(variances)
        uncertainties = np.maximum(uncertainties, 0.01)

        valid_times = []
        valid_heights = []
        valid_uncerts = []
        n_metric = len(metric)

        # Compute a simple second derivative for local curvature checking.
        if n_metric > 2:
            second_deriv = np.diff(metric, n=2)
            second_deriv = np.pad(second_deriv, (1, 1), mode='edge')
        else:
            second_deriv = np.zeros_like(metric)

        # Use octave-spaced scales (dyadic wavelet validation) to validate peak significance.
        widths = np.arange(1, 9)  # approximate scales 1 to 8
        for peak in peaks:
            # Skip peaks lacking sufficient negative curvature.
            if second_deriv[peak] > -0.1 * robust_std:
                continue
            local_start = max(0, peak - uncertainty_window)
            local_end = min(n_metric, peak + uncertainty_window + 1)
            local_segment = metric[local_start:local_end]
            if len(local_segment) < 3:
                continue
            try:
                cwt_coeff = signal.cwt(local_segment, signal.ricker, widths)
            except Exception:
                continue
            max_coeff = np.max(np.abs(cwt_coeff))
            # Threshold for validating the candidate using local MAD.
            cwt_thresh = mad_val * np.sqrt(2 * np.log(len(local_segment) + eps))
            if max_coeff >= cwt_thresh:
                valid_times.append(times_arr[peak])
                valid_heights.append(metric[peak])
                valid_uncerts.append(uncertainties[peak])

        if len(valid_times) == 0:
            return np.array([]), np.array([]), np.array([])
        return np.array(valid_times), np.array(valid_heights), np.array(valid_uncerts)

    peak_times, peak_heights, peak_deltat = multi_resolution_thresholding(tf_metric, metric_times)
    return peak_times, peak_heights, peak_deltat
  • Automatically discovering and interpreting the value of nonlinear algorithms
  • Facilitating new knowledge production, guided by accumulated experience

PT Level 5

He Wang | ICTP-AP, UCAS

AI and Cosmology: From Computational Tools to Scientific Discovery


Interpretability Analysis

He Wang | ICTP-AP, UCAS

AI and Cosmology: From Computational Tools to Scientific Discovery

Out-of-distribution (OOD) detection

  • Generalization capability and robustness of the optimized algorithms

MCTS Depth-Stratified Performance Analysis.

  • Analyzed the relationship between MCTS tree depth and algorithm fitness across different optimization phases. The 10-layer MCTS structure was stratified into three depth groups: Depth I (depths 1-4), Depth II (depths 5-7), and Depth III (depths 8-10), representing shallow, intermediate, and deep exploration levels, respectively.

Algorithmic Component Impact Analysis.

  • A comprehensive technique impact analysis using controlled comparative methodology

Interpretability Analysis

He Wang | ICTP-AP, UCAS

AI and Cosmology: From Computational Tools to Scientific Discovery

Algorithmic Component Impact Analysis.

  • A comprehensive technique impact analysis using controlled comparative methodology
Please analyze the following Python code snippet for gravitational wave detection and
extract technical features in JSON format.

The code typically has three main stages:
1. Data Conditioning: preprocessing, filtering, whitening, etc.
2. Time-Frequency Analysis: spectrograms, FFT, wavelets, etc.
3. Trigger Analysis: peak detection, thresholding, validation, etc.

For each stage present in the code, extract:
- Technical methods used
- Libraries and functions called
- Algorithm complexity features
- Key parameters

Code to analyze:
```python
{code_snippet}
```

Please return a JSON object with this structure:
{
  "algorithm_id": "{algorithm_id}",
  "stages": {
    "data_conditioning": {
      "present": true/false,
      "techniques": ["technique1", "technique2"],
      "libraries": ["lib1", "lib2"],
      "functions": ["func1", "func2"],
      "parameters": {"param1": "value1"},
      "complexity": "low/medium/high"
    },
    "time_frequency_analysis": {...},
    "trigger_analysis": {...}
  },
  "overall_complexity": "low/medium/high",
  "total_lines": 0,
  "unique_libraries": ["lib1", "lib2"],
  "code_quality_score": 0.0
}

Only return the JSON object, no additional text.

Interpretability Analysis

He Wang | ICTP-AP, UCAS

AI and Cosmology: From Computational Tools to Scientific Discovery

MCTS Algorithmic Evolution Pathway

  • Complete MCTS tree structure showing all nodes associated with the optimal algorithm (node 486, fitness=5041.4).

Interpretability Analysis

He Wang | ICTP-AP, UCAS

AI and Cosmology: From Computational Tools to Scientific Discovery

Edge robustness analysis for three critical evolutionary transitions.

  • The distributions demonstrate the stochastic nature of LLM-driven code generation while confirming the consistent discovery of high-performance algorithmic variants. 

  • 52.8% of variants achieve superior fitness, with 100% inheriting Tikhonov regularization
  • 89.3% of variants exceed the preceding node's performance
  • 70.7% of variants outperform node 204; 25.0% surpass node 485

Interpretability Analysis

He Wang | ICTP-AP, UCAS

AI and Cosmology: From Computational Tools to Scientific Discovery

MCTS Algorithmic Evolution Pathway

  • Complete MCTS tree structure showing all nodes associated with the optimal algorithm (node 486, fitness=5041.4).

Framework Mechanism Analysis

He Wang | ICTP-AP, UCAS

AI and Cosmology: From Computational Tools to Scientific Discovery

Integrated Architecture Validation

  • A comprehensive comparison of our integrated Evo-MCTS framework against its constituent components operating in isolation.
    • Evo-MCTS: MCTS + self-evolution + reflection mechanism.
    • MCTS-AHD: an MCTS framework for combinatorial optimization (CO).
    • ReEvo: an evolutionary framework for CO.

Contributions of knowledge synthesis

  • Comparison against runs without external knowledge
    • nonlinear vs. linear-only

LLM Model Selection and Robustness Analysis

  • Ablation study of various LLM contributions (code generator) and their robustness.
    • o3-mini-medium
      o1-2024-12-17
      gpt-4o-2024-11-20
      claude-3-7-sonnet-20250219-thinking

59.1%

115%

Framework Mechanism Analysis

He Wang | ICTP-AP, UCAS

AI and Cosmology: From Computational Tools to Scientific Discovery

Contributions of knowledge synthesis

  • Comparison against runs without external knowledge
    • nonlinear vs. linear-only

59.1%

115%

59.1%

### External Knowledge Integration
1. **Non-linear** Processing Core Concepts:
    - Signal Transformation: 
        * Non-linear vs linear decomposition
        * Adaptive threshold mechanisms
        * Multi-scale analysis
    
    - Feature Extraction:
        * Phase space reconstruction
        * Topological data analysis
        * Wavelet-based detection
    
    - Statistical Analysis:
        * Robust estimators
        * Non-Gaussian processes
        * Higher-order statistics

2. Implementation Principles:
    - Prioritize adaptive over fixed parameters
    - Consider local vs global characteristics
    - Balance computational cost with accuracy

Computational Resources and Parallelization

He Wang | ICTP-AP, UCAS

AI and Cosmology: From Computational Tools to Scientific Discovery

The "Dongfang" supercomputing system (ORISE, Beijing)

Third-party LLM inference services

  • Closed-source LLMs; requires outbound network access; billed per token
  • ~ \(10^3\) dollars

Key Takeaways: ... against Symbolic Regression

He Wang | ICTP-AP, UCAS

AI and Cosmology: From Computational Tools to Scientific Discovery

Any algorithm-design problem can be viewed as an optimization problem

  • Many intermediate stages of gravitational-wave data processing can be cast as "algorithm optimization" problems, e.g., filter design, noise modeling, and the construction of detection statistics
  • Much of the analytic modeling and "symbolic regression" in theoretical physics and cosmology can likewise be cast as "algorithm optimization" problems
    • symbolic regression vs. algorithm optimization

  • Other optimization problem examples:
    • AI-driven design of experiments. [Phys. Rev. X 15, 021012 (2025)]
    • RL design for multiple filters in LIGO control system. [Science (2025)]


Key Takeaways: ... against Symbolic Regression

He Wang | ICTP-AP, UCAS

AI and Cosmology: From Computational Tools to Scientific Discovery

What are our thoughts on LLMs?

  • Code 
  • Natural Language Programming
  • Manifold Interpolation

Product vs. Innovation

  • A mixed bag (the good and the bad intermingled)
  • \(\Rightarrow\) Research object
  • \(\Rightarrow\) Method (LLM)
  • \(\Rightarrow\) Solution (new physics / scientific discovery)

AlphaEvolve proposed a method for matrix multiplication that is, in some cases, faster than the fastest previously known algorithm, proposed by the German mathematician Volker Strassen in 1969. (Not yet open-sourced.)

Content

  • LLMs for scientific discovery
    • Key insights:
      • Code
      • Natural Language Programming
      • Manifold Interpolation
    • "Program" vs "Solution" Space
    • Optimization Potentials:
      • Combinatorial optimization
      • Interpretable AI approach
    • Evo-MCTS (arXiv:2508.03661 [cs.AI])
  • AI x Cosmology
    • AI for Science
    • AI Scientist
    • AI for cosmology

LLM for Scientific Discovery

AI x Cosmology

AI for Science

He Wang | ICTP-AP, UCAS

AI and Cosmology: From Computational Tools to Scientific Discovery

In the second half of 2025, everyone began betting on AI for Science!

  • OpenAI's Chief Product Officer (CPO) Kevin Weil officially announced on social media a new initiative, OpenAI for Science, which aims to build the next generation of scientific tools: an AI-powered platform to accelerate scientific discovery.
  • On November 24, 2025, at the White House, Trump formally signed the executive order launching the Genesis Mission, a major initiative likened to an "AI Manhattan Project". Its core goal: accelerate scientific breakthroughs with AI!
  • DeepMind: sending "AI science collaborators" straight into national laboratories


  • NVIDIA: turning scientific problems into AI-infrastructure problems. In NVIDIA's narrative, AI for Science is not just about "models" but an entire stack of compute platforms + simulation systems + automated experiments + engineered workflows; this is the physical foundation that lets Genesis stand.

OpenAI plans to assemble a small team of top scholars who must meet three criteria:

  1. world-class standing in their research field;
  2. deep alignment with the vision of AI;
  3. outstanding scientific communication skills.

AI Scientist

He Wang | ICTP-AP, UCAS

AI and Cosmology: From Computational Tools to Scientific Discovery

PhysMaster: Building an autonomous AI physicist for theoretical and computational physics research

AI Scientist

  • Motivation: Can AI be more than an assistant that processes text, becoming a "physicist" that derives formulas, writes code, and even makes independent scientific discoveries?


What is the ceiling for the AI Scientist?

  • The main limitations of current AI science systems:
    • They still depend heavily on patterns in existing data rather than a genuine grasp of science, and they struggle with purely symbolic mathematical derivation, which demands exact logical chains. (Fragile symbolic derivation)
    • Scientific exploration requires the capacity to question established paradigms, but AI systems typically only optimize within existing knowledge frameworks and lack the intuition to pose fundamentally new questions on their own. (Paradigm-breaking blind spot)
    • A truly disruptive AI scientist has yet to appear, essentially because scientific discovery demands a synthesis of intuition, cross-domain association, and active exploration of the unknown, far beyond current AI. Existing systems are weakest at experimental validation: they can handle simulated data or computations from the literature, but cannot design experiments themselves, operate instruments, or cope with real-world noise and surprises. Scientific theories stand only after experimental validation, and AI lacks this direct interaction with the physical world. (Experimental-validation gap)
    • More importantly, scientific breakthroughs often come from nonlinear thinking, such as Einstein's redefinition of "simultaneity" on the way to relativity; innovation that jumps outside existing logical chains remains, so far, a uniquely human capability. Today's AI is more like an efficient assistant; becoming a true "scientist" will require solving deep challenges of validation reliability, cross-domain transfer, and the nature of creativity. (Creativity gap)

AI for Cosmology

He Wang | ICTP-AP, UCAS

AI and Cosmology: From Computational Tools to Scientific Discovery

AI-based ensembles over cosmological model space and the search for universal fitting laws

Exploring universal laws in observational data:

The "degeneracy" dilemma of cosmological models and a geometric perspective:
use AI to explore "Model Space Geometry". This is a decidedly forward-looking viewpoint: every theoretical model is treated as a point or manifold in a high-dimensional function space. By analyzing the "constraint surfaces" that different observations (LSS, CMB, GW, 21cm) carve out in this space, we can study the topology of these surfaces and how they intersect geometrically.

Traditional regression fits coefficients; symbolic regression fits formulas. Our AI techniques can handle multimodal observational data and automatically search for the mathematical expressions that best describe the data, without presupposing their physical form. We call this approach "pattern inversion". For the dark-energy equation of state w(z), for example, the AI is no longer restricted to the CPL parameterization \(w_0 + w_a z/(1+z)\) but can search freely in function space. If all of the best-fitting models turn out to share a particular mathematical structure (say, a specific exponential-decay term), that shared structure reveals a "universal law" about the nature of dark energy. Such a law might point directly to an unknown unified "dark fluid" theory, or expose a deep symmetry between inflation and late-time accelerated expansion.
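To make the contrast concrete (a minimal sketch; the free-form alternative is purely illustrative):

def w_cpl(z, w0=-1.0, wa=0.0):
    """CPL parameterization of the dark-energy equation of state: w(z) = w0 + wa * z / (1 + z)."""
    return w0 + wa * z / (1.0 + z)

# A "pattern inversion" search instead explores free-form candidates, e.g. the
# (purely illustrative) w(z) = w0 + A * exp(-z / z0), and scores each candidate
# expression against LSS / CMB / GW / 21cm data rather than fixing two parameters.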

The "degeneracy" dilemma of cosmological models: exploring universal laws in observational data

  • Use AI to explore "Model Space Geometry". This is a decidedly forward-looking viewpoint: every theoretical model is treated as a point or manifold in a high-dimensional function space. By analyzing the "constraint surfaces" that different observations (LSS, CMB, GW, 21cm) carve out in this space, we can study the topology of these surfaces and how they intersect geometrically.
  • Use AI to map and analyze, at scale, the fitting relations between different kinds of cosmological observations (LSS, CMB, GW, etc.) and a variety of theoretical models, attempting to reveal universal laws or symmetries of the fits, thereby providing essential hints toward new cosmological models such as a "unified dark-component fluid", and exploring how these methods might transfer to other disciplines.

"Controllable self-evolution" for new-physics discovery

  • Code
  • Natural Language Programming
  • Manifold Interpolation

Introduce a "controllable self-evolution" mechanism so that AI agents, operating in a high-dimensional model space and guided by feedback from observational data, can automatically revise or reconstruct theoretical models (complex nonlinear functions beyond traditional symbolic regression) and automatically optimize simulation algorithms toward faster, more accurate theoretical predictions, enabling automated discovery and interpretation of cosmological models and algorithms.

  • \(\Rightarrow\) Research object
  • \(\Rightarrow\) Method (LLM)
  • \(\Rightarrow\) Solution (new physics / scientific discovery)


arXiv:2511.22512 [astro-ph.CO]

AI Scientist?

He Wang | ICTP-AP, UCAS

AI and Cosmology: From Computational Tools to Scientific Discovery

"Controllable self-evolution" agent technology for designing complex algorithms and models

Introduce a "controllable self-evolution" mechanism so that AI agents, operating in a high-dimensional model space and guided by feedback from observational data, can automatically revise or reconstruct theoretical models (complex nonlinear functions beyond traditional symbolic regression) and automatically optimize simulation algorithms toward faster, more accurate theoretical predictions, enabling automated discovery and interpretation of cosmological models and algorithms, and truly embodying a capability-first, innovation-first principle.

 

AI-based ensembles over cosmological model space and the search for universal fitting laws

Exploring universal laws in observational data: use AI to map and analyze, at scale, the fitting relations between different kinds of cosmological observations (LSS, CMB, GW, etc.) and a variety of theoretical models, attempting to reveal universal laws or symmetries of the fits, thereby providing essential hints toward new cosmological models such as a "unified dark-component fluid", and exploring how these methods might transfer to other disciplines.

 

The "degeneracy" dilemma of cosmological models and a geometric perspective:
use AI to explore "Model Space Geometry". This is a decidedly forward-looking viewpoint: every theoretical model is treated as a point or manifold in a high-dimensional function space. By analyzing the "constraint surfaces" that different observations (LSS, CMB, GW, 21cm) carve out in this space, we can study the topology of these surfaces and how they intersect geometrically.

Traditional regression fits coefficients; symbolic regression fits formulas. Our AI techniques can handle multimodal observational data and automatically search for the mathematical expressions that best describe the data, without presupposing their physical form. We call this approach "pattern inversion". For the dark-energy equation of state w(z), for example, the AI is no longer restricted to the CPL parameterization \(w_0 + w_a z/(1+z)\) but can search freely in function space. If all of the best-fitting models turn out to share a particular mathematical structure (say, a specific exponential-decay term), that shared structure reveals a "universal law" about the nature of dark energy, possibly pointing to an unknown unified "dark fluid" theory or a deep symmetry between inflation and late-time accelerated expansion.

"Controllable self-evolution" for new-physics discovery

 

  • Code 
  • Natural Language Programming
  • Manifold Interpolation
  • \(\Rightarrow\) Research object
  • \(\Rightarrow\) Method (LLM)
  • \(\Rightarrow\) Solution (new physics / scientific discovery)

arXiv:2212.11926 [astro-ph.CO]

He Wang | ICTP-AP, UCAS

AI and Cosmology: From Computational Tools to Scientific Discovery

num_of_audiences = 100  # hypothetical head count, so the snippet actually runs
for _ in range(num_of_audiences):
    print('Thank you for your attention! 🙏')

AI × Cosmology: From Algorithmic Tools to a New Paradigm for Scientific Discovery

By He Wang

AI and Cosmology: From Computational Tools to Scientific Discovery | 2026/01/10 10:00-11:00 | The 5th Houyi Seminar Series