FinSIR: Financial SIR-GCN for Market-Aware Stock Recommendation

In our previous work on Contextualized Messages Boost Graph Representations, we introduced the soft-isomorphic relational graph convolution network (SIR-GCN), which featured an anisotropic (i.e., a function of both the features of the center and neighboring nodes) and dynamic (i.e., a universal function approximator) message function for graph neural networks (GNNs). For more information, refer to the accompanying blog post.

  1. Limitations of Existing Models
  2. Financial SIR-GCN (FinSIR)
    1. Temporal Module 1
    2. Spatial Module
    3. Temporal Module 2
    4. Prediction Module
    5. Loss Function
  3. Backtesting Results on NYSE and NASDAQ
    1. Dataset
      1. Node Features
      2. Edge Features
        1. Wiki Graph
        2. Industry Graph
    2. Backtesting Strategy
    3. NYSE Results
    4. NASDAQ Results
  4. Conclusion

In collaboration with experts from the Chinese University of Hong Kong, we developed a model based on SIR-GCN designed for market-aware stock recommendation. This work was presented at the 2025 International Joint Conference on Neural Networks (IJCNN) in Rome, Italy.

Limitations of Existing Models

Stock recommendation models in the literature largely treat stocks in isolation and use models such as CNNs, RNNs, and Transformers to learn a representation of each stock independently. This approach, however, ignores the rich stock relations (e.g., stocks belonging to the same industry) in a market.

Graph-based models address this issue by treating stock markets as spatio-temporal graphs, where nodes represent stocks and edges represent different types of stock relations. Typically, these models employ a temporal module (e.g., LSTMs) followed by a spatial module (e.g., GNNs) before the final prediction module. Notably, processing the spatial and temporal dimensions of stock market graphs in this decoupled manner potentially limits their performance.

Financial SIR-GCN (FinSIR)

Suppose $G_t = \left(V_t, E_t\right)$ is the spatio-temporal stock market graph at time $t$. Furthermore, suppose $\boldsymbol{x_t^{(s)}}$ represents the features of stock $s$, $\boldsymbol{x_t^{(s,s')}}$ represents the features of the edge connecting stocks $s$ and $s'$, $P_t^{(s)}$ represents the true closing price of stock $s$, and $r_t^{(s)} = \frac{P_{t}^{(s)} - P_{t-1}^{(s)}}{P_{t-1}^{(s)}}$ represents the true one-day (percentage) return of stock $s$.
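
As a quick illustration of the return definition above, here is a minimal Python sketch. The function name `one_day_returns` and the sample prices are hypothetical and are not taken from the FinSIR code.

```python
def one_day_returns(closing_prices):
    """Return r_t = (P_t - P_{t-1}) / P_{t-1} for every day after the first."""
    return [
        (curr - prev) / prev
        for prev, curr in zip(closing_prices, closing_prices[1:])
    ]

# Hypothetical closing prices of one stock on three consecutive trading days.
prices = [100.0, 102.0, 99.96]
returns = one_day_returns(prices)  # roughly +2% followed by -2%
```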

Our proposed Financial SIR-GCN (FinSIR) integrates SIR-GCN with the “sandwich” structure employed in GNN for time series analysis (GNN4TS) to jointly process the two key dimensions of stock market graphs and obtain spatio-temporally contextualized hidden states. It consists of four key modules described below.

[Figure: Financial SIR-GCN (FinSIR) Architecture.]

Temporal Module 1

For every stock $s$, temporal module 1 independently processes the $w$ sequential features $\boldsymbol{x_{t-w+1}^{(s)}}, \boldsymbol{x_{t-w+2}^{(s)}}, \ldots, \boldsymbol{x_{t}^{(s)}}$ to extract the $w$ sequential hidden states $\boldsymbol{f_{t-w+1}^{(s)}}, \boldsymbol{f_{t-w+2}^{(s)}}, \ldots, \boldsymbol{f_{t}^{(s)}}$.

$$\boldsymbol{f_{t-w+1}^{(s)}}, \boldsymbol{f_{t-w+2}^{(s)}}, \ldots, \boldsymbol{f_{t}^{(s)}} = \text{LSTM}\left(\boldsymbol{x_{t-w+1}^{(s)}}, \boldsymbol{x_{t-w+2}^{(s)}}, \ldots, \boldsymbol{x_{t}^{(s)}}\right)$$

Spatial Module

The spatial module then performs message passing at each of the $w$ time steps $t$, using a modified SIR-GCN that accepts edge features. Here, $N_t(s)$ denotes the set of neighbors of stock $s$ in $G_t$.

$$\boldsymbol{g_t^{(s)}} = \phi_s\left(\sum_{s' \in N_t(s)} \dfrac{1}{\sqrt{\left|N_t(s)\right|}\sqrt{\left|N_t(s')\right|}} \boldsymbol{W_R} ~ \phi_s\left(\boldsymbol{W_Q} \boldsymbol{f_t^{(s)}} + \boldsymbol{W_K} \boldsymbol{f_t^{(s')}} + \boldsymbol{W_E} \boldsymbol{x_t^{(s,s')}} + \boldsymbol{b}\right)\right)$$
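
To make the message function concrete, the following is a minimal pure-Python sketch of one message-passing step at a single time $t$. It is not the FinSIR implementation: the helper names (`matvec`, `relu`, `spatial_step`), the dictionary-based graph layout, and the choice of ReLU for $\phi_s$ are all assumptions made for illustration.

```python
import math

def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def relu(v):
    """Element-wise ReLU; an assumed stand-in for phi_s."""
    return [max(0.0, x) for x in v]

def spatial_step(f, x_edge, neighbors, W_Q, W_K, W_E, W_R, b):
    """Compute g[s] for every stock s from hidden states f and edge features.

    f:         dict stock -> hidden state f_t^{(s)}
    x_edge:    dict (s, s') -> edge feature vector x_t^{(s,s')}
    neighbors: dict stock -> list of neighboring stocks N_t(s)
    """
    g = {}
    for s, nbrs in neighbors.items():
        query = matvec(W_Q, f[s])
        agg = [0.0] * len(W_R)
        for sp in nbrs:
            key = matvec(W_K, f[sp])
            edge = matvec(W_E, x_edge[(s, sp)])
            # Inner activation applied to W_Q f(s) + W_K f(s') + W_E x(s,s') + b.
            inner = relu([q + k + e + bi for q, k, e, bi in zip(query, key, edge, b)])
            # Symmetric degree normalization 1 / (sqrt|N(s)| * sqrt|N(s')|).
            norm = 1.0 / (math.sqrt(len(nbrs)) * math.sqrt(len(neighbors[sp])))
            agg = [a + norm * m for a, m in zip(agg, matvec(W_R, inner))]
        g[s] = relu(agg)
    return g

# Toy graph with two connected stocks, 2-dimensional states, 1-dimensional edges.
identity = [[1.0, 0.0], [0.0, 1.0]]
f = {"A": [1.0, 0.0], "B": [0.0, 2.0]}
x_edge = {("A", "B"): [1.0], ("B", "A"): [1.0]}
neighbors = {"A": ["B"], "B": ["A"]}
g = spatial_step(f, x_edge, neighbors, identity, identity,
                 [[1.0], [1.0]], identity, [0.0, 0.0])
```

Because the inner activation is applied to a sum that involves both $\boldsymbol{f_t^{(s)}}$ and $\boldsymbol{f_t^{(s')}}$, the message depends jointly on the center and neighboring stocks, which is the anisotropic property mentioned at the start of this post.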

Temporal Module 2

Motivated by the “sandwich” structure in GNN4TS, FinSIR introduces a second LSTM temporal module. With the two LSTM temporal modules, FinSIR can jointly and effectively process the spatial and temporal dimensions of stock market graphs.

$$\boldsymbol{h_{t}^{(s)}} = \text{LSTM}\left( \left[ \boldsymbol{f_{t-w+1}^{(s)}} ~\Big\Vert~ \boldsymbol{g_{t-w+1}^{(s)}} \right], \left[ \boldsymbol{f_{t-w+2}^{(s)}} ~\Big\Vert~ \boldsymbol{g_{t-w+2}^{(s)}} \right], \ldots, \left[ \boldsymbol{f_{t}^{(s)}} ~\Big\Vert~ \boldsymbol{g_{t}^{(s)}} \right] \right)$$

Prediction Module

Finally, the prediction module predicts the return of every stock on the following trading day based on historical data from the past $w$ trading days.

$$\begin{aligned} \hat{P}_{t+1}^{(s)} &= \phi_p\left(\boldsymbol{w_p}^\top \boldsymbol{h_{t}^{(s)}} + \boldsymbol{b_p}\right), \\ \hat{r}_{t+1}^{(s)} &= \dfrac{\hat{P}_{t+1}^{(s)} - P_{t}^{(s)}}{P_{t}^{(s)}} \end{aligned}$$

Loss Function

FinSIR is then trained using a loss function that combines both point-wise regression loss and pair-wise ranking-aware loss, defined as

$$\ell = \dfrac{1}{T} \sum_{t=1}^T \left[\dfrac{1}{S} \sum_{s=1}^{S} \left(\hat{r}_{t}^{(s)} - r_{t}^{(s)}\right)^2 + \alpha \sum_{s=1}^S \sum_{s'=1}^S \max\left\{0, - \left(\hat{r}_{t}^{(s)} - \hat{r}_{t}^{(s')}\right) \cdot \left(r_{t}^{(s)} - r_{t}^{(s')}\right)\right\}\right].$$
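
The loss can be written out directly from the formula. The sketch below is a plain-Python transcription for small inputs, not the training code from the repository; the function name `finsir_loss` and the toy numbers are hypothetical.

```python
def finsir_loss(pred, true, alpha):
    """Point-wise MSE plus alpha times the pair-wise ranking loss, averaged over days.

    pred, true: lists over T days, each a list of S per-stock returns.
    """
    total = 0.0
    for r_hat, r in zip(pred, true):
        num_stocks = len(r)
        # Point-wise regression term: mean squared error over stocks.
        mse = sum((ph - p) ** 2 for ph, p in zip(r_hat, r)) / num_stocks
        # Pair-wise term: positive only when a pair is ranked in the wrong order.
        rank = sum(
            max(0.0, -(r_hat[i] - r_hat[j]) * (r[i] - r[j]))
            for i in range(num_stocks)
            for j in range(num_stocks)
        )
        total += mse + alpha * rank
    return total / len(pred)

# One day, two stocks: the prediction ranks the pair in the wrong order.
loss = finsir_loss(pred=[[0.01, 0.03]], true=[[0.02, 0.01]], alpha=1.0)
```

In this toy case the point-wise term is 0.00025 and the misordered pair contributes 0.0002 twice, once for each ordering of $(s, s')$, so the total is 0.00065.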

Backtesting Results on NYSE and NASDAQ

To evaluate the performance of FinSIR, we considered the NYSE and NASDAQ. The code to reproduce the results is available at the FinSIR repository.

Dataset

The data was obtained from the Temporal Relational Ranking for Stock Prediction repository. A summary of the market data considered is presented below.

| Market | Stocks | Train Days (Jan 2, 2013 - Dec 31, 2015) | Validation Days (Jan 4, 2016 - Dec 30, 2016) | Test Days (Jan 3, 2017 - Dec 8, 2017) |
|---|---|---|---|---|
| NYSE | 1,737 | 756 | 252 | 237 |
| NASDAQ | 1,026 | 756 | 252 | 237 |

Summary of Market Data.

Node Features

The node features $\boldsymbol{x_t^{(s)}} \in \mathbb{R}^5$ consist of the closing price $P_t^{(s)}$ together with its 5-, 10-, 20-, and 30-day moving averages. These features are then normalized by their average over the past $w$ trading days.
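
A minimal sketch of how such features might be assembled is given below. The function names are hypothetical, and the normalization step reflects one reading of the description above (each feature divided by its own mean over the window); the dataset repository may implement it differently.

```python
def moving_average(prices, t, k):
    """Mean of the k closing prices ending at index t (inclusive)."""
    return sum(prices[t - k + 1 : t + 1]) / k

def node_features(prices, t, horizons=(5, 10, 20, 30)):
    """Unnormalized 5-dimensional feature vector for one stock at day t."""
    if t < max(horizons) - 1:
        raise ValueError("not enough history for the longest moving average")
    return [prices[t]] + [moving_average(prices, t, k) for k in horizons]

def normalized_window(prices, t, w):
    """Features for the last w days, each column divided by its window mean.

    Assumption: 'normalized by their average' is applied per feature.
    """
    rows = [node_features(prices, u) for u in range(t - w + 1, t + 1)]
    means = [sum(col) / w for col in zip(*rows)]
    return [[x / m for x, m in zip(row, means)] for row in rows]

# Hypothetical price series 1.0, 2.0, ..., 40.0 for a single stock.
prices = [float(p) for p in range(1, 41)]
features = node_features(prices, t=39)
window = normalized_window(prices, t=39, w=4)
```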

Edge Features

The edge features $\boldsymbol{x_t^{(s,s')}} \in \{0, 1\}^d$ are static multi-hot binary encodings that indicate the presence or absence of $d$ predefined relations between the two stocks. Moreover, two stocks are connected by an edge if and only if at least one of the $d$ relations is present. Two types of relational graphs are considered.
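
The multi-hot encoding can be sketched as follows. The triple-based input format, the function name `build_edge_features`, and the ticker symbols are hypothetical; the sketch also stores edges per ordered pair, which is an assumption about direction that the description above does not settle.

```python
def build_edge_features(relations, num_relation_types):
    """Build multi-hot edge features from (s, s_prime, relation_index) triples.

    An edge (s, s_prime) exists only if at least one relation is present,
    because a key is created only when a relation is recorded for that pair.
    """
    edge_features = {}
    for s, s_prime, relation_index in relations:
        vector = edge_features.setdefault((s, s_prime), [0] * num_relation_types)
        vector[relation_index] = 1
    return edge_features

# Hypothetical relations among three stocks with d = 3 relation types.
relations = [("AAA", "BBB", 0), ("AAA", "BBB", 2), ("BBB", "CCC", 1)]
edges = build_edge_features(relations, num_relation_types=3)
```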

Wiki Graph

The Wiki graph describes the first-order and second-order company relations based on information from Wikidata. In particular,

  • stocks $s$ and $s'$ share a first-order relation if there exists a statement that has stock $s$ as the subject and stock $s'$ as the object; and
  • stocks $s$ and $s'$ share a second-order relation if there exist statements with stocks $s$ and $s'$ as the subjects sharing a common object.
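
The two definitions above can be illustrated with a small sketch over (subject, object) pairs. It is a simplification: real Wikidata statements also carry a property, which is ignored here, and the function name `wiki_relations` and the sample entities are hypothetical.

```python
def wiki_relations(statements, stocks):
    """Derive first- and second-order stock pairs from (subject, object) statements."""
    stocks = set(stocks)
    # First order: a statement links one stock (subject) directly to another (object).
    first_order = {
        (subj, obj)
        for subj, obj in statements
        if subj in stocks and obj in stocks and subj != obj
    }
    # Second order: two different stocks are subjects of statements with a common object.
    subjects_by_object = {}
    for subj, obj in statements:
        if subj in stocks:
            subjects_by_object.setdefault(obj, set()).add(subj)
    second_order = {
        (a, b)
        for subjects in subjects_by_object.values()
        for a in subjects
        for b in subjects
        if a != b
    }
    return first_order, second_order

# Hypothetical statements: AAA points to BBB; AAA and CCC both point to entity X.
statements = [("AAA", "BBB"), ("AAA", "X"), ("CCC", "X")]
first, second = wiki_relations(statements, stocks=["AAA", "BBB", "CCC"])
```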

Industry Graph

The industry graph describes stocks belonging to the same industry based on the official classifications of NYSE and NASDAQ.

Backtesting Strategy

To evaluate the performance of FinSIR, the daily buy-hold-sell trading strategy was considered, where

  • the top $K$ stocks based on the predicted one-day return $\hat{r}_{t+1}^{(s)}$ are bought when the market closes on every trading day $t$; and
  • these stocks are sold when the market closes on the next trading day $t + 1$.

Moreover, the following assumptions were also made:

  • The total amount invested on every trading day is constant;
  • The market is liquid such that buy and sell orders always get filled at the closing price of every trading day; and
  • The transaction costs are negligible.
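
Under the assumptions above, the strategy reduces to summing daily returns of the selected stocks. The sketch below is one way to write this down and is not the evaluation script from the repository: the function name `cumulative_irr` is hypothetical, and splitting the daily budget equally across the $K$ selected stocks is an additional assumption.

```python
def cumulative_irr(pred_returns, true_returns, k):
    """Cumulative return of the daily buy-hold-sell strategy on the top-k stocks.

    pred_returns[t][s] and true_returns[t][s] are the predicted and realized
    returns of stock s over the holding period that starts at day t.
    """
    cumulative = 0.0
    for r_hat, r in zip(pred_returns, true_returns):
        # Rank stocks by predicted return and keep the k highest.
        top = sorted(range(len(r_hat)), key=lambda s: r_hat[s], reverse=True)[:k]
        # Constant daily budget, split equally: add the mean realized return.
        cumulative += sum(r[s] for s in top) / k
    return cumulative

# Hypothetical two-day backtest with three stocks.
pred = [[0.03, 0.01, 0.02], [0.00, 0.05, 0.01]]
true = [[0.02, -0.01, 0.01], [0.03, -0.02, 0.04]]
irr_top1 = cumulative_irr(pred, true, k=1)
irr_top2 = cumulative_irr(pred, true, k=2)
```

Because the amount invested each day is constant, daily returns are added rather than compounded.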

NYSE Results

The cumulative investment return ratio (IRR), mean reciprocal rank (MRR), and mean squared error (MSE) for the baseline models, FinSIR, and SimpleFinSIR (an ablation model of FinSIR) on NYSE are presented below.

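| Model | IRR$_1$ (↑) | IRR$_5$ (↑) | MRR$_1$ (↑) | MRR$_5$ (↑) | MSE (↓) |
|---|---|---|---|---|---|
| RankLSTM | 0.0140 | 0.0605 | 0.0260 | 0.0168 | $2.27 \times 10^{-4}$ |
| RSR-I (Wiki) | 0.6148 | 0.4465 | 0.0265 | 0.0234 | $2.27 \times 10^{-4}$ |
| RSR-E (Wiki) | 0.9491 | 0.4075 | 0.0339 | 0.0226 | $2.28 \times 10^{-4}$ |
| STGCN (Wiki) | 0.0959 | 0.1558 | 0.0149 | 0.0134 | $2.96 \times 10^{-4}$ |
| DCGRU (Wiki) | -0.8051 | -0.1325 | 0.0265 | 0.0231 | $2.29 \times 10^{-4}$ |
| SimpleFinSIR (Wiki) | 1.3457 | 0.5673 | 0.0369 | 0.0244 | $2.28 \times 10^{-4}$ |
| FinSIR (Wiki) | 1.6034 | 0.4337 | 0.0298 | 0.0200 | $2.27 \times 10^{-4}$ |
| RSR-I (industry) | 1.1937 | 0.4734 | 0.0348 | 0.0229 | $2.27 \times 10^{-4}$ |
| RSR-E (industry) | 1.2093 | 0.4335 | 0.0362 | 0.0235 | $2.27 \times 10^{-4}$ |
| STGCN (industry) | 0.4507 | -0.0605 | 0.0390 | 0.0260 | $3.64 \times 10^{-4}$ |
| DCGRU (industry) | 0.3553 | 0.0571 | 0.0310 | 0.0221 | $2.28 \times 10^{-4}$ |
| SimpleFinSIR (industry) | 1.2739 | 0.4796 | 0.0343 | 0.0221 | $2.28 \times 10^{-4}$ |
| FinSIR (industry) | 1.4761 | 0.5338 | 0.0353 | 0.0246 | $2.28 \times 10^{-4}$ |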

Note: the original table marks the best, second-best, and third-best models, and sets in bold the results that are statistically significant vs. the best baseline model with the same relational graph.

Summary of Performance Metrics for Baseline and Proposed Models on NYSE.

The cumulative IRR$_1$ for the baseline and proposed models across the backtesting period is also presented below.

[Figure: Cumulative IRR$_1$ for Baseline and Proposed Models on NYSE.]

Notably, FinSIR and SimpleFinSIR consistently outperform the baseline models—RankLSTM, RSR-I, RSR-E, STGCN, and DCGRU—in terms of IRR, which is the primary objective of stock recommendation. This highlights the significance of the “sandwich” structure and the second LSTM temporal module in jointly processing the spatial and temporal dimensions of the stock market graphs.

While MRR and MSE suggest that both FinSIR and SimpleFinSIR perform comparably with the baseline models, these metrics do not necessarily translate into IRR performance, as illustrated by Feng et al. In particular, a model may exhibit worse MSE performance yet still achieve better IRR performance due to its ability to correctly rank the stocks. Conversely, a model may also exhibit better MSE performance yet achieve worse IRR performance if it fails to correctly rank the stocks. Thus, while nearly all models perform comparably in terms of accurately predicting future stock returns, the proposed models perform best at ranking stocks based on true future stock returns, thereby providing better investment recommendations.

NASDAQ Results

The performance metrics for the baseline and proposed models on NASDAQ are also presented below.

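| Model | IRR$_1$ (↑) | IRR$_5$ (↑) | MRR$_1$ (↑) | MRR$_5$ (↑) | MSE (↓) |
|---|---|---|---|---|---|
| RankLSTM | 0.2882 | 0.1485 | 0.0340 | 0.0201 | $3.78 \times 10^{-4}$ |
| RSR-I (Wiki) | 0.2476 | 0.0630 | 0.0308 | 0.0175 | $3.79 \times 10^{-4}$ |
| RSR-E (Wiki) | 0.2085 | 0.2044 | 0.0299 | 0.0188 | $3.79 \times 10^{-4}$ |
| STGCN (Wiki) | 0.4171 | 0.3222 | 0.0298 | 0.0242 | $4.27 \times 10^{-4}$ |
| DCGRU (Wiki) | 0.0342 | 0.3039 | 0.0295 | 0.0217 | $3.79 \times 10^{-4}$ |
| SimpleFinSIR (Wiki) | 1.1161 | 0.4460 | 0.0472 | 0.0262 | $3.77 \times 10^{-4}$ |
| FinSIR (Wiki) | 0.7838 | 0.3051 | 0.0408 | 0.0241 | $3.95 \times 10^{-4}$ |
| RSR-I (industry) | 0.5934 | 0.2983 | 0.0309 | 0.0220 | $3.80 \times 10^{-4}$ |
| RSR-E (industry) | 1.1114 | 0.5670 | 0.0462 | 0.0273 | $3.78 \times 10^{-4}$ |
| STGCN (industry) | 0.8669 | 0.1850 | 0.0409 | 0.0272 | $9.66 \times 10^{-4}$ |
| DCGRU (industry) | 0.6423 | 0.3493 | 0.0392 | 0.0247 | $3.81 \times 10^{-4}$ |
| SimpleFinSIR (industry) | 0.9334 | 0.3106 | 0.0429 | 0.0242 | $3.77 \times 10^{-4}$ |
| FinSIR (industry) | 1.2307 | 0.6747 | 0.0487 | 0.0310 | $3.78 \times 10^{-4}$ |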

Note: the original table marks the best, second-best, and third-best models, and sets in bold the results that are statistically significant vs. the best baseline model with the same relational graph.

Summary of Performance Metrics for Baseline and Proposed Models on NASDAQ.

The cumulative IRR$_1$ for the baseline and proposed models across the backtesting period is also presented below.

[Figure: Cumulative IRR$_1$ for Baseline and Proposed Models on NASDAQ.]

Similarly, FinSIR and SimpleFinSIR outperform the baseline models in terms of IRR$_1$. This highlights the role of SIR-GCN in capturing the complex and non-linear stock relations in the market.

Conclusion

Overall, building upon our previous work on SIR-GCN, we developed FinSIR for market-aware stock recommendation. It integrates SIR-GCN with the “sandwich” structure in GNN4TS to jointly process the two key dimensions of stock market graphs and obtain spatio-temporally contextualized representations. This work lies squarely at the intersection of my two research interests, finance and graph theory, and has been a great way to expand my professional and academic network with distinguished experts in both fields.

Let me know your thoughts!