Abstract
Missing values and not-yet-released figures are common in financial data, and handling them is critical for backtesting and real-time analysis in the financial industry, yet the problem remains underexplored in the existing literature. In this paper, we focus on empirical asset pricing, where the cross-section of future asset returns is a function of lagged firm characteristics that vary in reporting frequency and missing ratio. Most existing imputation methods cannot fully capture the complex and evolving spatio-temporal relations among firm-level characteristics. In particular, these methods fail to explicitly consider the spatial relations and feature structure in the stock network, where granular data covering thousands of stocks and hundreds of characteristics per stock must be processed. To address these challenges, we propose a spatio-temporal diffusion model (STDM) that gradually recovers the masked financial data conditioned on high-dimensional historical stock-and-characteristic data. We propose a characteristic-specific projection to construct characteristic-level features at both ends of the STDM, while maintaining firm-level features in the middle of the STDM, which substantially reduces memory consumption. Moreover, alongside temporal attention, we design a spatial graph convolutional network that makes it computationally efficient and effective to learn time-varying spatio-temporal interdependence across firms. We further employ an implicit sampler that greatly accelerates inference, so that the STDM can produce high-quality point and density estimates of missing and real-time firm characteristics within a few steps. We evaluate our model on the most comprehensive open-source dataset, 'OSAP', and achieve state-of-the-art performance in extensive experiments. © 2024 ACM.
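As a rough illustration of the few-step inference idea in the abstract, the sketch below runs a deterministic, DDIM-style reverse pass over a short list of timesteps and fills only the missing entries of a firm-by-characteristic panel. It is not the authors' implementation: the abstract does not specify the sampler, so the DDIM-style update, the linear noise schedule, and the names `make_alpha_bar`, `implicit_impute`, and `eps_model` are all assumptions made for illustration. In the actual STDM the noise predictor would be the spatio-temporal network; here it is just a callable.

```python
import numpy as np

def make_alpha_bar(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for an assumed linear noise schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def implicit_impute(x_obs, observed, eps_model, alpha_bar, timesteps, rng):
    """Deterministic (eta = 0) DDIM-style reverse pass over a few timesteps.

    x_obs:     (firms, characteristics) panel; values at missing entries are ignored
    observed:  boolean mask, True where a value is available
    eps_model: hypothetical noise predictor, called as eps_model(x_t, t, x_obs, observed)
    timesteps: decreasing timestep indices, e.g. [999, 749, 499, 249, 0]
    """
    x = rng.standard_normal(x_obs.shape)  # start the reverse process from pure noise
    for i, t in enumerate(timesteps):
        a_t = alpha_bar[t]
        eps_hat = eps_model(x, t, x_obs, observed)
        # Estimate of the clean panel implied by the predicted noise.
        x0_hat = (x - np.sqrt(1.0 - a_t) * eps_hat) / np.sqrt(a_t)
        if i + 1 < len(timesteps):
            # Jump directly to the next (earlier) timestep without adding fresh noise.
            a_prev = alpha_bar[timesteps[i + 1]]
            x = np.sqrt(a_prev) * x0_hat + np.sqrt(1.0 - a_prev) * eps_hat
        else:
            x = x0_hat
    # Observed entries are kept as given; only missing entries are imputed.
    return np.where(observed, x_obs, x)
```

Because no noise is injected between steps, the pass can skip most of the schedule, which is where the speed-up comes from. Drawing several starting noises would give several imputations per missing entry, one simple way to approximate a density estimate under these assumptions.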
| Original language | English |
|---|---|
| Title of host publication | CIKM '24 - Proceedings of the 33rd ACM International Conference on Information and Knowledge Management |
| Place of Publication | New York, NY |
| Publisher | Association for Computing Machinery |
| Pages | 602-611 |
| ISBN (Print) | 9798400704369 |
| Publication status | Published - Oct 2024 |
| Externally published | Yes |
| Event | 33rd ACM International Conference on Information and Knowledge Management (CIKM 2024), Boise Centre, Boise, United States. Duration: 21 Oct 2024 → 25 Oct 2024. https://cikm2024.org/ |
Publication series
| Name | International Conference on Information and Knowledge Management, Proceedings |
|---|---|
| ISSN (Print) | 2155-0751 |
Conference
| Conference | 33rd ACM International Conference on Information and Knowledge Management (CIKM 2024) |
|---|---|
| Abbreviated title | CIKM '24 |
| Place | United States |
| City | Boise |
| Period | 21/10/24 → 25/10/24 |
| Internet address | https://cikm2024.org/ |
Research Keywords
- diffusion model
- financial data processing
- missing value imputation
- real-time nowcasting
Title
A Spatio-Temporal Diffusion Model for Missing and Real-Time Financial Data Inference