ALDEN: Dual-Level Disentanglement with Meta-learning for Generalizable Audio Deepfake Detection

Shenzhen University, Politecnico di Milano, Afirstsoft Technology Group Co.
ACM MM 2025

*Corresponding author
ALDEN Concept Illustration

An illustration of the ALDEN framework, structured along two key axes: low-level signal disentanglement (vertical) and high-level semantic disentanglement (horizontal). ALDEN combines dual-level disentangled learning (scissors icon) with meta-learning (recycling icon) to improve generalization across different vocoders. By focusing on vocoder-agnostic features and synthetic-relevant cues, ALDEN generalizes better to unseen vocoders while remaining less sensitive to irrelevant variations.

Framework

ALDEN Framework Diagram

Overall framework of the proposed ALDEN. ALDEN consists of three key components: (a) an adversarial-training-based disentanglement learning (ADL) module, which employs a multi-task learning strategy to disentangle vocoder-specific features f_d from vocoder-agnostic features f_a; (b) a reconstruction-based disentanglement learning (RDL) module, which uses audio reconstruction to disentangle f_a, content features f_c, and speaker features f_s; and (c) a vocoder-agnostic meta-learning (VAML) module, which mitigates overfitting to specific vocoders and facilitates effective updating of the vocoder-agnostic encoder E_a and the forgery classifier C_a.
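To make the meta-learning idea in component (c) more concrete, here is a minimal sketch of one common episodic pattern for domain generalization, in which each vocoder is treated as a domain and a few vocoders are held out as a meta-test set at every step. This is not the paper's Algorithm 1: the function names (split_vocoders, meta_step), the scalar parameter theta, the quadratic toy loss, and the learning rates are all assumptions chosen so the example runs with the Python standard library alone.

```python
import random


def split_vocoders(vocoders, n_meta_test, rng):
    """Randomly hold out n_meta_test vocoders as the meta-test set (hypothetical helper)."""
    if not 0 < n_meta_test < len(vocoders):
        raise ValueError("n_meta_test must leave at least one vocoder on each side")
    shuffled = list(vocoders)
    rng.shuffle(shuffled)
    return shuffled[n_meta_test:], shuffled[:n_meta_test]


def meta_step(theta, targets, meta_train, meta_test,
              inner_lr=0.1, outer_lr=0.1, beta=1.0):
    """One episodic meta-update on a scalar toy parameter.

    Assumption: each vocoder v has the toy loss 0.5 * (theta - targets[v]) ** 2,
    standing in for a real detection loss on audio from that vocoder.
    """
    def mean_target(names):
        return sum(targets[v] for v in names) / len(names)

    # Gradient of the averaged meta-train loss at the current parameter.
    grad_train = theta - mean_target(meta_train)
    # Virtual inner update using the meta-train vocoders only.
    theta_inner = theta - inner_lr * grad_train
    # Gradient of the meta-test loss at theta_inner, taken w.r.t. theta
    # (chain rule: d theta_inner / d theta = 1 - inner_lr for this toy loss).
    grad_test = (theta_inner - mean_target(meta_test)) * (1.0 - inner_lr)
    # The outer update combines both signals, so a step that helps the
    # meta-train vocoders is also checked against the held-out vocoders.
    return theta - outer_lr * (grad_train + beta * grad_test)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical vocoder names and per-vocoder optima for the toy loss.
    targets = {"vocoder_a": 1.0, "vocoder_b": 2.0, "vocoder_c": 3.0}
    theta = 0.0
    for _ in range(200):
        train, test = split_vocoders(list(targets), 1, rng)
        theta = meta_step(theta, targets, train, test)
    print(f"theta after meta-training: {theta:.3f}")
```

In a real detector, theta would be the weights of the encoder and classifier, and the gradients would come from an automatic differentiation framework rather than the closed-form expressions used here.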

Algorithm

ALDEN Algorithm Pseudocode

Algorithm 1: The Proposed ALDEN Framework

Cross-vocoder and In-the-wild Scenarios

Detailed Results on Different Datasets

BibTeX

If you find our work useful, please consider citing:

@inproceedings{xu2025alden,
author = {Xu, Yuxiong and Li, Bin and Li, Weixiang and Mandelli, Sara and Negroni, Viola and Li, Sheng},
title = {ALDEN: Dual-Level Disentanglement with Meta-learning for Generalizable Audio Deepfake Detection},
year = {2025},
url = {https://doi.org/10.1145/3746027.3754741},
doi = {10.1145/3746027.3754741},
booktitle = {Proceedings of the 33rd ACM International Conference on Multimedia},
pages = {7277--7286},
numpages = {10},
}