News impact analysis in algorithmic trading
新聞在算法交易中的影響分析
Student thesis: Doctoral Thesis
Detail(s)
Award date | 3 Oct 2014 |
Link(s)
Permanent Link | https://scholars.cityu.edu.hk/en/theses/theses(cde33df0-73bb-4b9d-b141-b1d2cbb71008).html |
Abstract
The stock market is one of the most important financial markets. Investors in the
market gather and process market information to enhance their trading decisions.
Among all forms of information, market news that reports the latest market status
is one of the most important sources and is widely believed to affect
stock prices. With the advancement of algorithmic trading, news
agencies such as Bloomberg have greatly increased the speed and
volume of their reporting. However, the news is not published in a
machine-readable format, and the sheer volume of the news stream makes
manual processing increasingly difficult. Therefore, modelling and automatically
processing market news and analyzing its market impact have become challenging
problems in both academic study and industrial practice. In this thesis, news impact
is modelled and analyzed from three perspectives. For each perspective,
one chapter describes the proposed approach and discusses the experimental
setup and results.
Firstly, we study the problem of how news sentiment can help stock price
prediction. The bag-of-words approach analyzes the latent relationship between statistical
patterns of words and stock price movements. In contrast, news sentiment,
which is an important link in the chain that maps word patterns to price
movements, analyzes the news impact in sentiment space. We first implement a
generic stock price prediction framework that can use different external
signals to predict stock prices. We then use the Harvard psychological dictionary
and the Loughran-McDonald financial sentiment dictionary to construct the
sentiment space. Text news articles are then quantitatively measured and projected
onto the sentiment space. Predictions generated by either the bag-of-words
approach or sentiment analysis are evaluated and compared at different market
classification levels. Experiments are conducted on five years of daily historical Hong
Kong Stock Exchange prices and news articles. Results show that: (1) at the individual
stock level, sector index level and market index level, the models with
sentiment analysis outperform the bag-of-words model in both the validation set
and the independent testing set; (2) the models that use sentiment polarity do not
provide useful predictions; (3) there is only a minor difference between the models
using the two different sentiment dictionaries.
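
The projection of an article onto a sentiment space, as described above, can be illustrated with a minimal Python sketch. It is not the thesis implementation: the lexicon `TOY_LEXICON`, its three categories, and the function name `project_to_sentiment_space` are all hypothetical, and the scoring rule (matched tokens divided by total tokens) is only one plausible choice.

```python
import re
from collections import Counter

# Hypothetical toy lexicon for illustration only. The real Harvard and
# Loughran-McDonald dictionaries are far larger and are not reproduced here.
TOY_LEXICON = {
    "positive": {"gain", "profit", "growth", "strong"},
    "negative": {"loss", "decline", "weak", "default"},
    "uncertainty": {"may", "risk", "uncertain", "possible"},
}


def project_to_sentiment_space(text, lexicon=TOY_LEXICON):
    """Return one score per category: matched tokens / total tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return {category: 0.0 for category in lexicon}
    counts = Counter(tokens)
    return {
        category: sum(counts[word] for word in words) / len(tokens)
        for category, words in lexicon.items()
    }


# Each article becomes a low-dimensional vector that a predictor can consume
# instead of a sparse bag-of-words vector.
vector = project_to_sentiment_space(
    "Strong profit growth, but default risk may rise."
)
print(vector)
```

Under these assumptions each dimension is a category frequency, so an article is described by a handful of numbers rather than by one feature per vocabulary word.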
Secondly, we study the problem of how news summarization can help stock
price prediction. A multi-document summarization algorithm is proposed to
summarize the daily news articles. Compared with conventional summarization
methods, the proposed algorithm constructs and preserves sentence relevance
structures during the recursive calculation of sentence significance values. Potentially
important sentences "present" themselves gradually by gaining higher significance
values, and the summary paragraph is then generated by selecting the top-k
scoring sentences. Convergence of the algorithm is proved, and experiments
on two standard data sets (DUC 2006 and DUC 2007) show that
the proposed model gives convincing results. In the second step, we reuse the stock
price prediction framework implemented for the sentiment analysis study. The summarization
model generates summaries from news articles, which are then evaluated
according to whether they improve the prediction of stocks' daily returns. Experiments
are conducted on five years of daily Hong Kong Stock Exchange data, with
the news reported by FINET. Evaluations are carried out at the individual stock level, sector
index level and market index level. Results show that the predictions based on
news article summaries outperform the predictions based on full-length articles in
both the validation and independent testing sets.
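
The general idea of iteratively updating sentence significance values and then keeping the top-k sentences can be sketched as follows. The abstract does not specify the proposed algorithm, so this sketch substitutes a generic graph-based ranking in the style of LexRank or TextRank; the function names, the damping factor, the iteration count, and the sample sentences are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    return re.findall(r"[a-z]+", sentence.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank_sentences(sentences, damping=0.85, iterations=50):
    """Iteratively propagate significance over a sentence similarity graph."""
    n = len(sentences)
    if n == 0:
        return []
    bags = [Counter(tokenize(s)) for s in sentences]
    sim = [
        [0.0 if i == j else cosine(bags[i], bags[j]) for j in range(n)]
        for i in range(n)
    ]
    out_weight = [sum(row) for row in sim]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        # A sentence gains significance from the sentences similar to it.
        scores = [
            (1 - damping) / n
            + damping * sum(
                scores[j] * sim[j][i] / out_weight[j]
                for j in range(n)
                if out_weight[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summarize(sentences, k=2):
    """Keep the k highest-scoring sentences, in their original order."""
    scores = rank_sentences(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


# Hypothetical sentences pooled from several articles on the same day.
sentences = [
    "The bank reported higher quarterly profit.",
    "Quarterly profit at the bank rose on higher lending.",
    "Analysts expect the bank profit to stay higher.",
    "Heavy rain is forecast for the weekend.",
]
summary = summarize(sentences, k=2)
print(summary)
```

In this stand-in, sentences that are similar to many other sentences accumulate higher scores over the iterations, while an off-topic sentence stays near the baseline and is left out of the summary.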
Finally, we study the problem of whether integrating the information from news
and short-term historical prices can help stock price prediction. Previous work
focuses either on market news purely as an exogenous factor that tends to lead the
price process, or on how past stock price processes affect future
stock returns. Taking one step further, we quantitatively integrate information
from both market news and stock prices in order to improve the accuracy of predicting
future stock price returns in an intra-day trading context. We present
the design and architecture of our approach for market information fusion. Using
multiple kernel learning, the hidden information in the two sources
is extracted effectively and, more importantly, integrated seamlessly rather than
simply combined as in a single-kernel approach. We compare our approach
comprehensively against three baseline methods (which use only one
type of information, or naively combine the two sources)
on intra-day tick-by-tick data from the Hong Kong Stock Exchange and market
news archives from the same period. In both cross-validation
and independent testing, our approach achieves the best results.
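
The difference between fusing two information sources through several kernels and feeding them to a single kernel can be sketched in a few lines. The abstract does not say how the kernel weights are learned, so this sketch uses kernel-target alignment, a simple weighting heuristic, as a stand-in for a full multiple kernel learning solver. The feature values, labels, `gamma`, and all function names are hypothetical.

```python
import math


def rbf_kernel(X, gamma):
    """Gram matrix of the RBF kernel exp(-gamma * ||x - z||^2)."""
    return [
        [math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z))) for z in X]
        for x in X
    ]


def alignment(K, y):
    """Kernel-target alignment: <K, yy^T>_F / (||K||_F * ||yy^T||_F)."""
    n = len(y)
    numerator = sum(K[i][j] * y[i] * y[j] for i in range(n) for j in range(n))
    k_norm = math.sqrt(sum(v * v for row in K for v in row))
    return numerator / (k_norm * n)  # ||yy^T||_F equals n for labels in {-1, +1}


def combine_kernels(kernels, y):
    """Weight each kernel by its non-negative alignment, then sum them."""
    raw = [max(alignment(K, y), 0.0) for K in kernels]
    total = sum(raw)
    if total > 0:
        weights = [r / total for r in raw]
    else:
        weights = [1.0 / len(kernels)] * len(kernels)
    n = len(y)
    fused = [
        [sum(w * K[i][j] for w, K in zip(weights, kernels)) for j in range(n)]
        for i in range(n)
    ]
    return weights, fused


# Hypothetical data: one news-derived feature and one price-derived feature
# per sample, with labels +1 (price up) and -1 (price down).
news_features = [[0.9], [0.8], [0.1], [0.2]]
price_features = [[0.5], [0.1], [0.4], [0.2]]
labels = [1, 1, -1, -1]

kernels = [
    rbf_kernel(news_features, gamma=5.0),
    rbf_kernel(price_features, gamma=5.0),
]
weights, fused = combine_kernels(kernels, labels)
print(weights)
```

The fused Gram matrix could then be passed to any kernel method that accepts a precomputed kernel. Giving each source its own kernel and weight lets the more informative source dominate, which a single kernel over concatenated features cannot express directly.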
- Journalism, Commercial, Program trading (Securities), Stock price forecasting