He Wang (王赫)
2026/04/26
ICTP-AP, UCAS
hewang@ucas.ac.cn
Who Am I
— A quick intro and how I got into this field
What Is Machine Learning?
— The basics and why it matters
Deep Learning: When Machines Start to See and Think
— From neural networks to powerful representations
Gravitational Waves Meet Machine Learning
— How ML is reshaping data analysis in GW astronomy
Let’s Get Practical: Searching for Gravitational Waves
— A hands-on look at applying ML in real GW searches
LLMs for Gravitational Waves: My Ongoing Work
— Towards automated and interpretable scientific discovery
Gravitational waves are ripples in spacetime.
Massive objects warp space and time, or "spacetime", much as a bowling ball changes the shape of a trampoline as it rolls across it. Smaller objects therefore move differently: like marbles spiraling toward the bowling-ball-sized dent in the trampoline rather than sitting on a flat surface.
# AI for PE
Einstein proposed general relativity in 1916 and predicted the existence of gravitational waves
Gravitational waves are a strong-field effect of general relativity
2015: First direct detection of gravitational waves from a binary black hole merger
2017: First multi-messenger detection of a binary neutron star merger, opening the era of multi-messenger astronomy
2017: The Nobel Prize in Physics was awarded for the detection of gravitational waves
To date: more than 90 gravitational-wave events have been detected
2024: The University of Chinese Academy of Sciences joined the ground-based LIGO Scientific Collaboration, becoming LIGO's second member institution in mainland China
Future plans:
2024–2025: more, and more diverse, types of gravitational-wave events are expected to be detected
Space-based GW detection programs (LISA/Taiji/TianQin) + next-generation ground-based detectors (XG: CE/ET)
LIGO-VIRGO-KAGRA network
Gravitational waves generated by a binary black hole system
GW detector
Gravitational-wave detection has opened a new window for exploring the Universe
Different sources, spanning 20 orders of magnitude in frequency, and different detectors
The four extrasolar messengers: electromagnetic radiation, gravitational waves, neutrinos, and cosmic rays
Multi-messenger astronomy
DOI:10.1063/1.1629411
The first GW event, GW150914
LISA / Taiji project
LIGO-VIRGO-KAGRA
GW Data Characteristics
LIGO-VIRGO-KAGRA
LISA Project
Noise: non-Gaussian and non-stationary
Signal challenges:
(Earth-based) A low signal-to-noise ratio (SNR): the signal is typically about 1/100 of the noise amplitude (about \(-40\) dB).
(Space-based) A superposition of all GW signals (e.g. \(\sim 10^4\) GBs, \(10\)–\(10^2\) SMBHs, and \(10\)–\(10^3\) EMRIs, etc.) received during the mission's observational run.
Matched Filtering Techniques
In Gaussian, stationary noise, matched filtering is the optimal linear algorithm for extracting weak signals.
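Matched filtering can be sketched in a few lines. The toy below is illustrative only: white, unit-variance noise and a made-up windowed burst (not a real CBC waveform); it slides a unit-normalized template across the data and reads off an SNR time series.

```python
import numpy as np

def matched_filter_snr(data, template):
    """Sliding matched filter for white, unit-variance noise.

    Returns the correlation of the data with the unit-normalized
    template at every lag; for white noise this is an SNR time series.
    """
    template = template / np.sqrt(np.sum(template ** 2))  # unit-norm template
    return np.correlate(data, template, mode="valid")

rng = np.random.default_rng(0)
n, m = 4096, 256
t = np.arange(m)
template = np.sin(2 * np.pi * 0.05 * t) * np.hanning(m)  # toy windowed burst

data = rng.normal(size=n)                   # detector "noise"
inj_at = 1500
data[inj_at:inj_at + m] += 1.2 * template   # inject a weak signal

snr = matched_filter_snr(data, template)
print("peak SNR:", snr.max(), "at lag", snr.argmax())
```

Over this many lags a noise-only stretch typically peaks around SNR ~4, while the injection should stand out near SNR ~8 at the injection time; real pipelines perform the same correlation in the frequency domain, weighted by the noise PSD.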
Statistical Approaches
Frequentist Testing:
Bayesian Testing:
The arbiter of truth: from physical hypotheses to algorithmic validation
Everything begins with physics. Everything ends with algorithms.
# GW: DL
Many vivid examples; works well as a reference book
Model + strategy + algorithm
(from a probabilistic angle)
Machine learning
(from an axiomatic angle)
Covers theory, without derivations
A classic, but lacking recent advances
A legendary book (from a Bayesian angle)
2k+ pages, a tough read; approaches from probabilistic models
The "Flower Book": the deep learning bible
Popular science; builds intuition
Engineering perspective; no advanced math background required
Parametric and non-parametric, + frequentist and Bayesian angles; a statistical perspective
A grand compendium of statistical methods
Covers theory, without derivations
Bayesian angle
DL application angle
A complete introduction from the Bayesian angle
Extensive mathematical derivations
Excellent course resources:
WeChat public accounts worth following:
机器之心 (top-tier)
量子位 (top-tier)
新智元 (top-tier)
专知 (academically oriented)
Microsoft Research Asia (微软亚洲研究院)
将门创投
旷视研究院 (Megvii Research)
DeepTech 深科技 (MIT Technology Review China)
极市平台 (technical sharing)
爱可可-爱生活 (Weibo, WeChat, Zhihu, Bilibili, ...)
— Prof. Chen Guang, PRIS Pattern Recognition Lab, Beijing University of Posts and Telecommunications
Remember to give the course a Star
Chapter 1  Introduction
1.1 Overview
1.2 Multi-messenger astronomy
1.3 Research status, opportunities, and challenges
1.4 Goals and structure of this thesis
Chapter 2  Gravitational-wave detection and data-analysis theory
2.1 Introduction
2.2 GW detection technology
2.3 Signal processing and data-analysis methods
2.4 Matched filtering
Chapter 3  Theoretical foundations of deep learning
3.1 Introduction
3.2 Machine learning theory
3.3 Deep neural networks
3.4 Convolutional neural networks
Chapter 4  Interpretability of neural networks in GW detection
4.1 Introduction
4.2 Neural network architecture
4.3 Dataset preparation and optimization strategies
4.4 Generalization in GW signal recognition
4.5 Visualizing GW signal features
4.6 Sensitivity analysis of GW waveform features
Chapter 5  Effect of CNN architecture on GW signal recognition performance
5.1 Introduction
5.2 GW data preparation and processing pipeline
5.3 Comparative analysis of SNR in GW data analysis
5.4 CNN hyperparameter tuning and performance comparison
5.5 Summary and conclusions
Chapter 6  The matched-filtering convolutional neural network (MF-CNN) model
6.1 Introduction
6.2 Matched filtering in the time domain
6.3 Convolutional neural units for matched filtering
6.4 Constructing the MF-CNN model
6.5 Strategy for searching candidate GW signals
6.6 Data preparation and model fine-tuning
6.7 Search results on real LIGO data
6.8 Summary and conclusions
Chapter 7  Summary and outlook
Appendix
A. The sampling theorem and the Nyquist frequency
B. Mathematical proofs of power spectral density properties
C. Maximum likelihood estimation and cross entropy
Data quality improvement
Credit: Marco Cavaglià
LIGO-Virgo-KAGRA data processing
GW waveform modeling
GW searches
Astrophysical interpretation of GW sources
Space-based GW detection (Taiji program)
Bayesian Inference
Traditional parameter estimation (PE) techniques rely on Bayesian analysis methods (posteriors + evidence)
For CBC signals, LIGO-Virgo-KAGRA parameter-estimation software includes:
Bilby / LALInference / PyCBC Inference / RIFT
Thrane, Eric, and Colm Talbot. “An Introduction to Bayesian Inference in Gravitational-Wave Astronomy: Parameter Estimation, Model Selection, and Hierarchical Models.” Publications of the Astronomical Society of Australia 36 (September 2019): e010. https://doi.org/10.1017/pasa.2019.2.
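As a concrete (toy) illustration of what these codes do at scale, the sketch below runs a hand-rolled Metropolis-Hastings sampler over a two-parameter sinusoid in white noise. The model, priors, and step sizes are all invented for this example; this is not the Bilby/LALInference API.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: a sinusoid with unknown amplitude A and frequency f in white noise
t = np.linspace(0, 1, 256)
A_true, f_true, sigma = 2.0, 5.0, 1.0
data = A_true * np.sin(2 * np.pi * f_true * t) + rng.normal(0, sigma, t.size)

def log_posterior(theta):
    A, f = theta
    if not (0 < A < 10 and 0 < f < 10):      # flat prior on a box
        return -np.inf
    model = A * np.sin(2 * np.pi * f * t)
    return -0.5 * np.sum((data - model) ** 2) / sigma ** 2  # Gaussian log-likelihood

# Metropolis-Hastings random walk (started near the frequency mode for brevity;
# a production sampler would use many chains or nested sampling instead)
theta = np.array([1.0, 5.0])
lp = log_posterior(theta)
samples = []
for _ in range(20000):
    proposal = theta + rng.normal(0, [0.1, 0.01])
    lp_prop = log_posterior(proposal)
    if np.log(rng.uniform()) < lp_prop - lp:  # accept/reject
        theta, lp = proposal, lp_prop
    samples.append(theta.copy())
samples = np.array(samples[5000:])            # discard burn-in

print("posterior mean (A, f):", samples.mean(axis=0))
```

Bilby and the other packages wrap far more capable samplers plus GW-specific likelihoods and priors behind the same likelihood-times-prior structure.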
An example: the posterior probability distribution over the full 15-dimensional parameter set
He Wang+, Big Data Mining and Analytics, 2021
The advance of DINGO ("進撃の DINGO") in the GW inference area.
2002.07656: 5D toy model [1] (PRD)
2008.03312: 15D binary black hole inference [1] (MLST)
2106.12594: Amortized inference and group-equivariant neural posterior estimation [2] (PRL)
2111.13139: Group-equivariant neural posterior estimation [2] (ICLR 2022)
2210.05686: +Importance sampling [2] (PRL)
2211.08801: Noise forecasting [2] (PRD)
2311.12093: Population studies [2] (PRD)
2404.14286: Evidence for eccentric binaries [2] (?)
2407.09602: BNS inference [2] (Nature)
2512.02968: +Transformer, (Dingo-T1) [3] (?)
2603.20431: For LISA [4] (?)
https://github.com/dingo-gw/dingo (2023.03)
https://github.com/dingo-gw/dingo-T1 (2025.11)
https://github.com/AliSword/dingo-lisa (2026.04)
https://github.com/stephengreen/gw-school-corfu-2023 (Tutorial)
https://github.com/annalena-k/tutorial-dingo-introduction (Tutorial)
The main idea of flow-based modeling is to express \(\mathbf{y}\in\mathbb{R}^D\) as a transformation \(T\) of a real vector \(\mathbf{z}\in\mathbb{R}^D\) sampled from \(p_{\mathrm{z}}(\mathbf{z})\):
(Based on 1912.02762)
Note: The invertible and differentiable transformation \(T\) and the base distribution \(p_{\mathrm{z}}(\mathbf{z})\) can have parameters \(\{\boldsymbol{\phi}, \boldsymbol{\psi}\}\) of their own, i.e. \(T_{\boldsymbol{\phi}}\) and \(p_{\mathrm{z},\boldsymbol{\psi}}(\mathbf{z})\).
Change of Variables: \(p_{\mathrm{y}}(\mathbf{y})=p_{\mathrm{z}}(\mathbf{z})\left|\operatorname{det} J_{T}(\mathbf{z})\right|^{-1}\), where \(\mathbf{z}=T^{-1}(\mathbf{y})\).
Equivalently, \(p_{\mathrm{y}}(\mathbf{y})=p_{\mathrm{z}}\left(T^{-1}(\mathbf{y})\right)\left|\operatorname{det} J_{T^{-1}}(\mathbf{y})\right|\).
The Jacobian \(J_{T}(\mathbf{z})\) is the \(D \times D\) matrix of all partial derivatives of \(T\):
\(J_{T}(\mathbf{z})=\begin{bmatrix}\frac{\partial T_{1}}{\partial z_{1}} & \cdots & \frac{\partial T_{1}}{\partial z_{D}} \\ \vdots & \ddots & \vdots \\ \frac{\partial T_{D}}{\partial z_{1}} & \cdots & \frac{\partial T_{D}}{\partial z_{D}}\end{bmatrix}\)
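The change-of-variables formula can be checked numerically. The snippet below uses a one-dimensional affine map \(T(z)=az+b\), so the Jacobian is just \(a\), and compares against the exact closed-form density.

```python
import numpy as np

# Base density: standard normal p_z
def log_p_z(z):
    return -0.5 * (z ** 2 + np.log(2 * np.pi))

# Transformation T(z) = a*z + b (invertible and differentiable),
# with its own parameters phi = (a, b)
a, b = 2.0, 1.0
def T_inv(y):
    return (y - b) / a

# Change of variables: p_y(y) = p_z(T^{-1}(y)) |det J_T(T^{-1}(y))|^{-1}
def log_p_y(y):
    log_det_J = np.log(abs(a))   # dT/dz = a  (1x1 Jacobian)
    return log_p_z(T_inv(y)) - log_det_J

# Check against the exact N(b, a^2) log-density
y = 0.7
exact = -0.5 * (((y - b) / a) ** 2 + np.log(2 * np.pi * a ** 2))
assert np.isclose(log_p_y(y), exact)
print("log p_y(0.7) =", log_p_y(y))
```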
[Video] Machine Learning Whiteboard Derivation Series (33): Flow-based Models
(figure: base density ↔ target density)
(Based on 1912.02762)
Rational Quadratic Neural Spline Flows (RQ-NSF)
(figure: base density ↔ target density)
Objective:
In information theory, given a probability distribution \(p(x), x\in X\), one can define a monotonic function \(h(x)\) of \(p(x)\), called the information content (measure of information) of \(p(x)\): \(h(x) \equiv -\log p(x)\)
The expectation of the information content defines the entropy of the random variable \(x\):
\(\mathrm{H}[x]=-\sum_{x} p(x) \log p(x)\)
If the same random variable \(x\) has two different probability distributions \(p(x)\) and \(q(x)\), their relative entropy, commonly known as the KL divergence (Kullback-Leibler divergence), measures the difference between the two distributions:
\(\mathrm{KL}(p \| q)=-\sum_{x} p(x) \log \frac{q(x)}{p(x)}=\underbrace{\left(-\sum_{x} p(x) \log q(x)\right)}_{\mathrm{H}(p, q)}-\mathrm{H}[p]\)
The smaller the KL divergence, the closer \(p(x)\) and \(q(x)\) are. In the expression above, the cross entropy is defined as \(\mathrm{H}(p, q)=-\sum_{x} p(x) \log q(x)\)
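These identities are easy to verify numerically for a small discrete distribution:

```python
import numpy as np

p = np.array([0.5, 0.3, 0.2])   # "true" distribution
q = np.array([0.4, 0.4, 0.2])   # model distribution

H_p = -np.sum(p * np.log(p))    # entropy H[p]
CE = -np.sum(p * np.log(q))     # cross entropy H(p, q)
KL = np.sum(p * np.log(p / q))  # KL(p || q)

# KL(p || q) = H(p, q) - H[p], and KL >= 0 with equality iff p == q
assert np.isclose(KL, CE - H_p)
print(f"H[p]={H_p:.4f}  H(p,q)={CE:.4f}  KL={KL:.4f}")
```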
Objective:
In the maximum-likelihood view of machine learning, the discrepancy between the empirical distribution \(\hat{p}_\text{data}\) on the training set and the model distribution \(p_\text{model}\) can be measured by the KL divergence:
\(D_{\mathrm{KL}}\left(\hat{p}_{\text{data}} \| p_{\text{model}}\right)=\mathbb{E}_{\mathbf{x} \sim \hat{p}_{\text{data}}}\left[\log \hat{p}_{\text{data}}(\mathbf{x})-\log p_{\text{model}}(\mathbf{x})\right]\)
The first term on the right-hand side involves only the data-generating process, not the model. Hence, training the model to minimize the KL divergence is equivalent to minimizing only the second term:
\(-\mathbb{E}_{\mathbf{x} \sim \hat{p}_{\text{data}}}\left[\log p_{\text{model}}(\mathbf{x})\right]\)
Recall: \(\mathrm{H}(p, q)=-\sum_{x} p(x) \log q(x)\)
It follows that any cost function composed of a negative log-likelihood is the cross entropy between the empirical distribution defined by the training set and the probability distribution defined by the model.
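A quick numerical check of this equivalence, for a toy categorical dataset and an arbitrary model distribution:

```python
import numpy as np

# A toy categorical dataset over 3 classes
data = np.array([0, 1, 1, 2, 1, 0, 1, 2, 1, 1])

# Empirical distribution \hat{p}_data on the training set
p_hat = np.bincount(data, minlength=3) / data.size

# An arbitrary model distribution
q = np.array([0.2, 0.5, 0.3])

# Mean negative log-likelihood of the data under the model ...
nll = -np.mean(np.log(q[data]))
# ... equals the cross entropy between \hat{p}_data and the model
cross_entropy = -np.sum(p_hat * np.log(q))

assert np.isclose(nll, cross_entropy)
print("mean NLL = H(p_hat, q) =", nll)
```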
(figure: base density ↔ target density)
e.g., Autoregressive Flow
The core idea of an autoregressive flow is to transform the variables one dimension at a time, with each step depending only on the dimensions already generated/transformed, so that the overall map is invertible and its (triangular) Jacobian is easy to compute.
More concretely: each dimension \(i\) is passed through an invertible one-dimensional transform whose parameters are produced by a conditioner that only sees the preceding dimensions.
In one sentence:
👉 autoregressive flow = "conditionally invertible transforms applied dimension by dimension, in order", trading causal structure for invertibility + efficient likelihood computation.
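A minimal numpy sketch of the idea, with trivial hand-written conditioners standing in for the neural networks (MADE/MAF-style) a real flow would use:

```python
import numpy as np

# Toy "conditioners": each sees only the already-transformed dimensions
def scale_fn(prev):
    return 1.0 + 0.5 * np.tanh(prev.sum())

def shift_fn(prev):
    return 0.1 * prev.sum()

def forward(z):
    """y_i = s(y_<i) * z_i + m(y_<i): sequential; the Jacobian is
    triangular, so log|det J| is just the sum of log|s_i|."""
    y = np.zeros_like(z)
    log_det = 0.0
    for i in range(z.size):
        s, m = scale_fn(y[:i]), shift_fn(y[:i])
        y[i] = s * z[i] + m
        log_det += np.log(abs(s))
    return y, log_det

def inverse(y):
    """z_i = (y_i - m(y_<i)) / s(y_<i): every conditioner sees known
    values of y, so density evaluation in this direction is cheap."""
    z = np.zeros_like(y)
    for i in range(y.size):
        s, m = scale_fn(y[:i]), shift_fn(y[:i])
        z[i] = (y[i] - m) / s
    return z

z = np.array([0.3, -1.2, 0.8])
y, log_det = forward(z)
assert np.allclose(inverse(y), z)   # round trip confirms invertibility
print("y =", y, " log|det J| =", log_det)
```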
(figure: base density ↔ target density)
(figure: training and testing with the nflow: training maps samples from the target density through the flow to the base density; at test time, samples drawn from the base density are mapped through the inverse flow back to the target density)
(figure: sketch of the conditioner architecture: the flow input and the context each pass through Linear layers into hidden_dims-sized features; num_layers blocks of BN+ReLU+Linear (with Dropout) process them; the context stream is concatenated and merged with the flow stream through a GLU, repeated num_blocks times, yielding the flow output)
(Based on 1912.02762)
(figure: data layout around the event: a 1024 s segment for PSD estimation and an 8 s analysis window with ref_time placed 6 s into the window, all referenced to GPS time)
Step.0: Estimate the PSD around the target event.
Step.1: Generate a reduced basis based on SVD.
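Step 1 can be sketched with plain numpy SVD on a toy waveform family. The shapes, the chirp model, and the 99.99% energy cutoff below are all illustrative assumptions, not the actual DINGO settings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical training set: 1000 toy chirp-like waveforms, 2048 samples each
t = np.linspace(0, 1, 2048)
freqs = rng.uniform(20, 40, size=1000)
waveforms = np.sin(2 * np.pi * freqs[:, None] * t[None, :] ** 2)

# SVD of the (n_waveforms, n_samples) matrix; rows of Vh are the basis vectors
U, s, Vh = np.linalg.svd(waveforms, full_matrices=False)

# Keep enough basis vectors to capture 99.99% of the squared singular values
energy = np.cumsum(s ** 2) / np.sum(s ** 2)
n_basis = int(np.searchsorted(energy, 0.9999)) + 1
V = Vh[:n_basis]

# Project one waveform into the reduced basis and reconstruct it
h = waveforms[0]
coeffs = V @ h            # n_basis coefficients instead of 2048 samples
h_rec = V.T @ coeffs
err = np.linalg.norm(h - h_rec) / np.linalg.norm(h)
print(f"{n_basis} basis vectors, relative reconstruction error {err:.2e}")
```

The network then sees the compact coefficient vector instead of the raw strain, which is what makes the training set tractable.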
Training
(figure: same 1024 s / 8 s data layout as above; the flow is trained to map the target dist. to the base dist.)
Step.2: Train the model.
Coupling architecture:
Rational Quadratic Neural Spline Flows (RQ-NSF)
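The training objective (minimize the negative log-likelihood of the data under the flow) can be demonstrated with the simplest possible flow: a single affine map with hand-derived gradients. A real RQ-NSF coupling flow optimizes exactly the same loss with autodiff; everything below is a pedagogical stand-in.

```python
import numpy as np

rng = np.random.default_rng(2)
data = rng.normal(3.0, 0.5, size=5000)   # samples from the target density

# One-layer "flow": T_phi(z) = a*z + b with base z ~ N(0, 1).
# Training minimizes the mean NLL:
#   -log p_y(y) = 0.5 * ((y - b)/a)**2 + 0.5*log(2*pi) + log|a|
a, b, lr = 1.0, 0.0, 0.05
for step in range(2000):
    z = (data - b) / a
    grad_a = 1.0 / a - np.mean(z ** 2) / a   # d(mean NLL)/da, derived by hand
    grad_b = -np.mean(z) / a                 # d(mean NLL)/db
    a -= lr * grad_a
    b -= lr * grad_b

print(f"learned a={a:.3f} (target 0.5), b={b:.3f} (target 3.0)")
```

The fit recovers the scale and shift of the target density; stacking many such invertible layers, with spline transforms and conditioning on the data, is what turns this into neural posterior estimation.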
Testing
(figure: same 1024 s / 8 s data layout as above)
Step.3: Test the model (inference): draw samples from the base dist. and map them back to the target dist.
Training setup
(figure: same 1024 s / 8 s data layout; embedding output dimensions from the diagram: 200, 200, 200, 800, 128)
Embedding network: (1024, 512, 256, 128); number of residual blocks \(10 \rightarrow 5\); number of flows \(15 \rightarrow 30\)
Noise realizations drawn from the estimated PSD distribution: \(S^{(i)}_n\sim p(S_n)\), \(n\sim p(S_n)\)
Detector time shifts: \(\delta t_I \sim \kappa(\delta t_I)\)
Training cost: ~28 days / ~50 days; 3 models
A check to ensure that the probability distributions we recover are truly representative of the confidence we should hold in the parameters of the signal.
By setting up a large set of test injections we can see if this is statistically true by determining the frequency with which the true parameters lie within a certain confidence level.
For each run we calculate credible intervals from the posterior samples, for each parameter. We can then examine the number of times the injected value falls within a given credible interval. If the posterior samples are an unbiased estimate of the true probability, then 10% of the runs should find the injected values within a 10% credible interval, 50% of runs within the 50% interval, and so on.
(1409.7215)
Median-unbiased estimators involve random errors and no systematic errors.
import numpy as np

def pp_plot_scratch(posterior, true_params,
                    x_values=np.linspace(0, 1, 1001)):
    '''
    posterior   - array of shape (num of injections, num of samples)
    true_params - array of shape (num of injections,)
    '''
    # Credible level of each injection: the fraction of posterior
    # samples that lie below the injected (true) value
    credible_levels = np.array([np.mean(posterior[i] < truth)
                                for i, truth in enumerate(true_params)])
    # The pp-curve is the empirical CDF of the credible levels
    pp = np.array([np.mean(credible_levels < xx) for xx in x_values])
    return pp
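A self-contained sanity check of the pp-plot logic (re-implemented inline so it runs on its own): when the "posterior" really is the distribution the truths are drawn from, the credible levels are uniform and the curve hugs the diagonal.

```python
import numpy as np

rng = np.random.default_rng(3)

# Draw each true parameter from the same Gaussian the "posterior"
# samples come from, so the recovered probabilities are self-consistent.
n_inj, n_samp = 500, 2000
mu = rng.normal(size=n_inj)                               # per-injection location
posterior = rng.normal(mu[:, None], 1.0, size=(n_inj, n_samp))
true_params = rng.normal(mu, 1.0)

# Credible level of each injection: fraction of posterior samples below truth
credible_levels = np.mean(posterior < true_params[:, None], axis=1)

# Empirical CDF of the credible levels = the pp-curve
x = np.linspace(0, 1, 101)
pp = np.array([np.mean(credible_levels < xx) for xx in x])
max_dev = np.max(np.abs(pp - x))
print("max deviation from the diagonal:", max_dev)
```

For 500 injections the maximum deviation should stay within the usual Kolmogorov-Smirnov band (roughly a few percent); a biased posterior would bow the curve away from the diagonal.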
A test for pp-plot:
Some cases: (2008.03312), (2002.07656), (1909.06296)
🚀 To address the challenge of rapid parameter estimation for massive black-hole binaries (MBHBs) against complex noise backgrounds in space-based GW detection, this work proposes a method based on a scalable Normalizing Flow (NF) model.
💡 The method simplifies the data complexity and uses a transformation mapping to overcome the challenge of Taiji's time-dependent response function over its one-year orbit, achieving comprehensive, unbiased estimation of the 11-dimensional MBHB parameters.
✨ Results show the method is orders of magnitude faster than traditional techniques while maintaining high accuracy, and it reveals previously unseen additional multimodality in the arrival-time parameters, greatly improving the efficiency of GW data analysis.
🌌 Addressing the computational challenges that traditional methods (such as Markov chain Monte Carlo) face in GW data analysis, this review explores machine-learning-based simulation-based inference (SBI) as an efficient alternative.
💫 The paper details SBI techniques such as Normalizing Flows, Neural Posterior Estimation (NPE), Neural Ratio Estimation (NRE), and Flow Matching, and demonstrates their application to single-source parameter estimation, overlapping signals, tests of general relativity, and population studies.
🚀 Despite significant speedups, model dependence, sensitivity to prior assumptions, interpretability, and validation challenges remain barriers to wide adoption; future work will focus on hybrid paradigms combining AI with traditional methods.
arXiv:2507.11192.
Literature covered up to early 2025 only.
Let's be honest about our motivations... 😉
The perfectly valid "scientific" reasons:
Credit: Chris Messenger (MLA meeting, Jan 2025)
The core motivations behind nearly all AI+GW research
So much data, so little time!
• Bayesian parameter estimation
• Replaces computationally intensive components
Consistently outperforms traditional approaches
• Unmodelled burst searches
• Continuous GW searches
Provides deeper insights into complex problems
• Reveals patterns through interpretability
• Enables previously impractical approaches
* When properly trained and validated on appropriate datasets
Credit: Chris Messenger (MLA meeting, Jan 2025)
Key question: if an ML (or any other) analysis doesn't do one or more of these things, then from a scientific perspective, what is the point?
When using SBI and other generative models for parameter estimation (PE), the community has gradually split into two paradigms, which can be summarized as validation-driven and discovery-driven:
arXiv:2310.13405, LIGO-P2300306
PRL 127, 24 (2021) 241103.
PRL 130, 17 (2023) 171403.
arXiv:2310.12209
Fast Parameter Inference on Pulsar Timing Arrays with Normalizing Flows
arXiv:2404.14286
DOI:10.1103/PhysRevLett.130.171402
The reality of ML in scientific research is more nuanced
No: We need to think more critically
Twitter: @DeepLearningAI_
The mathematical inevitability and the path to understanding
The existence theorem that guarantees solutions
The solution is mathematically guaranteed — our challenge is finding the path to it
Machine learning will win in the long run
AI models still have vast potential compared to the human brain's efficiency. Beating traditional methods is mathematically inevitable given sufficient resources.
The question is not if AI/ML will win, but how
Understanding AI's inner workings is the real challenge, not proving its capabilities.
That's where we can learn something exciting with Foundation Models.
An unavoidable tension:
A more realistic middle path is therefore: first establish credibility by rigorously aligning with MCMC in controlled settings (including coverage, calibration, and extreme tail behavior); then systematically analyze the sources of any discrepancy in real, complex data (e.g. non-Gaussian noise, model misspecification). Only if a discrepancy remains robust across multiple independent implementations, architectures, and data slices, and can be explained by physical or instrumental effects, does it qualify to be discussed as a "discovery".
In other words: either be validated as an unbiased accelerator, or provide an interpretable physical or statistical account of the sources of bias.
Fast is easy to claim. Better needs an explanation.
Closing Ceremony and Course Summary
Lecturer: He Wang
2024/01/14
ICTP-AP, UCAS
| 但易 | Event planning + computing support |
| ... |
| 田昕峣 | Invited guest |
| 赵俊杰 | Invited guest |
| 高民权 | Invited guest |
# GWData: Bootcamp
Why AI Was Proposed
Earliest Form of AI and Solutions
Similarities between AI and Physics Methodologies
From Symbolic Systems to Machine Learning
Principles of Deep Learning
Breakthroughs Brought by Deep Learning
Typical Deep Learning Scenarios
Pre-trained Models and Large Models
Principles of GPT
Breakthroughs in AIGC (AI Generated Content)
Current Challenges in AI
Frontiers of AI Research
Python: 108 quizzes
Numpy: 10 quizzes
Pandas: 12 quizzes
LeetCode: 5 problems
Matplotlib: 4 datasets
Seaborn: 4 datasets
Git / GitHub: Pull Request
Credit Scoring dataset
Modeling
Finetune
Kaggle competition
Can you find the GW signal?
| Total score | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
| Count | 4 | 5 | 6 | 10 | 7 | 23 | 8 |
| Top-percentile rank | 100.00% | 93.65% | 85.71% | 76.19% | 60.32% | 49.21% | 12.70% |
Some of the award winners:
Overview
Welcome to the final challenge of the "GW Data Exploration: Hands-on Programming and Analysis Bootcamp" course series: the "Can you find the GW signal?" Kaggle data-science competition (hackathon)! The competition is designed to apply the knowledge and skills you learned throughout the course, with a focus on GW data analysis and research.
Objective
The goal of this competition is to develop a model that accurately identifies GW signals. We provide a dataset containing both noise and GW signals; your task is to build a model that accurately distinguishes the two.
Timeline (7 days)
The competition opens at 22:00 Beijing time on 2023-12-29 and closes at 23:59 Beijing time on 2024-01-06. Be sure to submit your solution before the deadline.
Remember to give the course a Star