Fractional Denoising for 3D Molecular Pre-training

Shikun Feng (Co-first Author), Yuyan Ni (Co-first Author), Yanyan Lan*, Zhi-Ming Ma, Weiying Ma

*Corresponding author for this work

Research output: Chapters, Conference Papers, Creative and Literary Works; RGC 32 - Refereed conference paper (with host publication); peer-review

26 Citations (Scopus)

Abstract

Coordinate denoising is a promising 3D molecular pre-training method that has achieved remarkable performance in various downstream drug discovery tasks. Theoretically, the objective is equivalent to learning the force field, which has been shown to be helpful for downstream tasks. Nevertheless, coordinate denoising faces two challenges in learning an effective force field: low sampling coverage and an isotropic force field. The underlying reason is that the molecular distributions assumed by existing denoising methods fail to capture the anisotropic characteristics of molecules. To tackle these challenges, we propose a novel hybrid noise strategy that adds noise to both dihedral angles and coordinates. However, denoising such hybrid noise in the traditional way is no longer equivalent to learning the force field. Through theoretical deductions, we find that the problem is caused by the dependence of the covariance on the input conformation. To this end, we propose to decouple the two types of noise and design a novel fractional denoising method (Frad), which denoises only the latter coordinate part. In this way, Frad enjoys both the merits of sampling more low-energy structures and the force field equivalence. Extensive experiments show the effectiveness of Frad in molecular representation, with a new state-of-the-art on 9 out of 12 tasks of QM9 and on 7 out of 8 targets of MD17. The code is released publicly at https://github.com/fengshikun/Frad.
© 2023 by the author(s).
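
The hybrid noise idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation (that lives in the repository linked above); it is a minimal illustration under stated assumptions. The function names `rotate_about_axis` and `hybrid_noise`, the `torsions` input format, and the noise scales are all hypothetical choices made for this example. The sketch perturbs dihedral angles by rotating one side of each rotatable bond, then adds Gaussian coordinate noise, and returns only the coordinate noise as the regression target, which is the "fractional" part being denoised.

```python
import numpy as np


def rotate_about_axis(points, origin, axis, angle):
    """Rotate `points` by `angle` (radians) about the line through `origin`
    with direction `axis`, using Rodrigues' rotation formula."""
    k = axis / np.linalg.norm(axis)
    p = points - origin
    cos_a, sin_a = np.cos(angle), np.sin(angle)
    rotated = (p * cos_a
               + np.cross(k, p) * sin_a
               + np.outer(p @ k, k) * (1.0 - cos_a))
    return rotated + origin


def hybrid_noise(coords, torsions, sigma_dihedral, sigma_coord, rng):
    """Apply dihedral noise, then coordinate noise.

    coords:   (n_atoms, 3) array of equilibrium positions.
    torsions: list of (j, k, moving) tuples (hypothetical format), where
              j-k is a rotatable bond and `moving` lists the atom indices
              on the k side that rotate with the bond.
    Returns (noisy_coords, coord_noise). Only `coord_noise` is meant to be
    predicted by the model; the dihedral perturbation is not denoised.
    """
    perturbed = coords.copy()
    for j, k, moving in torsions:
        angle = rng.normal(0.0, sigma_dihedral)
        axis = perturbed[k] - perturbed[j]
        perturbed[moving] = rotate_about_axis(
            perturbed[moving], perturbed[k], axis, angle)
    coord_noise = rng.normal(0.0, sigma_coord, size=coords.shape)
    return perturbed + coord_noise, coord_noise


# Toy 4-atom chain with one rotatable bond between atoms 1 and 2.
rng = np.random.default_rng(0)
coords = np.array([[0.0, 0.0, 0.0],
                   [1.0, 0.0, 0.0],
                   [1.0, 1.0, 0.0],
                   [2.0, 1.0, 0.0]])
noisy, target = hybrid_noise(coords, [(1, 2, [3])],
                             sigma_dihedral=0.5, sigma_coord=0.04, rng=rng)
```

In a pre-training loop, a model would take `noisy` as input and be trained to regress `target`, for example with a mean squared error loss. Any scaling or normalization of the target is left out here and would need to follow the paper.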
Original language: English
Title of host publication: Proceedings of the 40th International Conference on Machine Learning
Editors: Andreas Krause, Emma Brunskill, Kyunghyun Cho
Publisher: ML Research Press
Pages: 9938-9961
Publication status: Published - 2023
Externally published: Yes
Event: 40th International Conference on Machine Learning (ICML 2023) - Hawaii Convention Center, Honolulu, United States
Duration: 23 Jul 2023 to 29 Jul 2023
https://icml.cc/

Publication series

Name: Proceedings of Machine Learning Research
Volume: 202
ISSN (Print): 2640-3498

Conference

Conference: 40th International Conference on Machine Learning (ICML 2023)
Abbreviated title: ICML'23
Place: United States
City: Honolulu
Period: 23/07/23 to 29/07/23

Funding

This work is supported by the National Key R&D Program of China (No. 2021YFF1201600), the Vanke Special Fund for Public Health and Health Discipline Development, Tsinghua University (No. 20221080053), and the Beijing Academy of Artificial Intelligence (BAAI).
