UMYU Scientifica

A periodical of the Faculty of Natural and Applied Sciences, UMYU, Katsina

ISSN: 2955 – 1145 (print); 2955 – 1153 (online)


ORIGINAL RESEARCH ARTICLE

Development of Multivariate Extreme Gradient Boost Technique for Vector Autoregressive Model

Nura Isah1*, Sani Ibrahim Doguwa2, Hussai Garba Dikko3 and Bukar Baba Alhaji4

1Department of Statistics, College of Science & Technology, Jigawa State Polytechnic, Dutse, Nigeria

2&3Department of Statistics, Faculty of Physical Sciences, Ahmadu Bello University, Zaria, Kaduna, Nigeria

4Department of Mathematics, Nigerian Defence Academy, Kaduna, Nigeria

Corresponding Author: nuraisagm@gmail.com

Abstract

Machine learning is a branch of Artificial Intelligence (AI) that enables software to build models with good predictive performance. This research develops a Multivariate Extreme Gradient Boosting (XGBoost) technique for the Vector Autoregressive (VAR) model. Simulated stationary time series data and a real time series dataset were used to fit the VAR model and to compare the forecast performance of the proposed technique with that of the conventional VAR model. The Augmented Dickey-Fuller (ADF) test was applied to check the stationarity of the time series data. The forecast performance of the proposed and existing techniques was compared for short-term and long-term horizons using Mean Absolute Error (MAE) and Root Mean Square Error (RMSE). The proposed technique was developed by applying the multivariate XGBoost method to the VAR model. The unit root tests for the real-life data indicate that all the variables are stationary without differencing. For the simulated data, the proposed multivariate XGBoost technique for the VAR model outperforms the existing technique (the conventional VAR model) in long-term forecasting based on MAE (1.008 vs. 1.316) and RMSE (1.357 vs. 1.669), while the conventional VAR model outperforms the proposed technique in short-term forecasting based on MAE (1.642 vs. 2.625) and RMSE (2.016 vs. 2.652). For the real-life dataset, the proposed technique is superior in short-term forecasting based on MAE (191.96 vs. 719.14) and RMSE (267.14 vs. 1901.49); the long-term result is similar, with RMSE values of 730.57 and 1487.53. The proposed technique for the VAR model is therefore effective for both long-term and short-term forecasting of real-life data.

Keywords: Machine Learning, XGBoost Technique, Ensemble Techniques, Development, Regularization, Vector Autoregressive Model.

INTRODUCTION

Nowadays, machine learning techniques have achieved remarkable success in solving prediction problems due to their robustness to overfitting and their flexibility, especially ensemble techniques such as XGBoost and random forests (Breiman, 2001). Despite their proven performance in univariate and regression settings, the application of XGBoost in the multivariate time series context, particularly as a generalization of the VAR framework, remains underexplored (Chen and Guestrin, 2016).

Recent advances in machine learning, particularly ensemble methods like XGBoost, offer an alternative for modelling nonlinear relationships without strict parametric assumptions. Despite their success in univariate time series forecasting and regression tasks, the application of XGBoost techniques to multivariate time series, particularly in the context of VAR models, remains relatively underexplored. The conventional VAR model plays an important role in modelling and forecasting multivariate time series data and has many applications in different fields, notably finance (Tsay, 2005) and econometrics (Sims, 1980).

The development of the multivariate XGBoost technique for VAR models represents an emerging area in time series modeling, combining the strengths of traditional econometric models with advanced machine learning. Below is a list of related literature that explores this area, providing initial knowledge and recent improvements in the field of time series modeling and forecasting.

Jung et al. (2008) proposed a method for estimating VAR models using the LASSO technique. Its performance was compared with conventional information-criterion-based methods such as AIC and BIC, and with other subset selection methods with parameter constraints, such as top-down and bottom-up strategies, on simulated data and real U.S. macroeconomic data. The results indicate that the LASSO method outperforms the other subset selection methods for small samples in terms of prediction mean squared error and estimation error under various settings.

Korobilis (2009) proposed a method for estimating sparse VAR models using a Bayesian approach. The technique was computationally efficient for stochastic variable selection in linear and nonlinear VAR. The performance of the proposed variable selection method is assessed in a small Monte Carlo experiment, and in forecasting real data for four UK macroeconomic series using time-varying parameter vector auto-regressions (TVP-VARs). The proposed method outperforms the unrestricted counterparts in forecasting.

Nicholson et al. (2015) introduced the VARX-L framework, which applies structured regularization techniques to VAR models with exogenous variables, addressing high-dimensionality challenges in macroeconomic forecasting using simulated and macroeconomic data. The results show that the proposed technique achieves superior forecast accuracy compared with traditional VAR and other regularization approaches in both low- and high-dimensional data.

Billio et al. (2019) proposed a new Bayesian nonparametric LASSO prior (BNP-LASSO) for high-dimensional VAR models, which can improve estimation and prediction accuracy. To validate the new approach, its forecast performance was compared with that of the Elastic-Net (EN), Bayesian LASSO (B-LASSO), and SSVS using simulated and real data. The results indicate that BNP-LASSO outperforms EN, B-LASSO, and SSVS. Based on these findings, they suggest that BNP-LASSO is useful not only for better estimation but also for forecasting purposes in macroeconomics.

Li and Chen (2020) compared the forecast performance of some ensemble techniques, such as random forest, AdaBoost, XGBoost, LightGBM, and Stacking with that of five traditional individual learners (neural network, decision tree, logistic regression, Naïve Bayes, and support vector machine) using real-world credit dataset for Lending Club in the United States. The findings indicate that the forecast performance of ensemble learning is better than individual learners.

Zhai et al. (2021) explored a hybrid approach to forecasting in an industrial setting by combining XGBoost and Gated Recurrent Units (GRU): XGBoost captures complex nonlinear relationships and handles structured data, while GRU models the temporal dependencies in the data. The hybrid was applied to forecasting an industrial heating process, and the results show that it outperformed individual models using only XGBoost or GRU.

Suotsalo et al. (2021) proposed a novel method called the pseudo-likelihood vector autoregressive model (PLVAR) by combining fractional marginal likelihood and pseudo-likelihood. The method can determine the complete VAR model structure, including the lag length, in a single run. The performance of PLVAR was compared with that of the smoothly clipped absolute deviation (SCAD) method, LASSO, and unrestricted VAR models. PLVAR is both faster and more accurate than the other penalized-regression methods, outperforming them on both simulated and real data.

Sun et al. (2022) proposed a hybrid model combining the Informer (a transformer-based deep learning model), XGBoost, and a Genetic Algorithm (GA) for multi-step time series forecasting. The Informer component captures long-term dependencies, XGBoost captures nonlinear relationships, and the GA optimizes the ensemble weights. The proposed approach shows superior forecast accuracy compared with traditional models.

Lubbers (2023) compared the prediction performance of XGBoost and Random Forest using real cash-flow data on transactions of small and medium-sized enterprises. The research aimed to identify the forecast performance of the two algorithms and assess their feasibility for practical use in daily operations. The results showed that Random Forest outperformed XGBoost, although the performance of the two techniques varied depending on the training data used.

Sundari and Mahardika (2024) developed a predictive model for house prices based on relevant features. They adopted several ensemble learning techniques, including Gradient Boosted Regression Trees (GBRT), LASSO, and Extreme Gradient Boosting (XGBoost), using the Ames Housing dataset. Model performance was evaluated using Root Mean Square Error (RMSE). The results indicate that the combination of two ensemble techniques (GBRT and XGBoost) outperforms the other methods in predicting housing prices.

This study proposes a novel extension of the XGBoost technique for VAR models by combining the structure of the conventional VAR model with the multivariate XGBoost method proposed by Guang (2021). The proposed technique aims to bridge the gap between the conventional VAR model and modern machine learning approaches: it captures linear interdependencies across multiple time series while improving forecast accuracy and model flexibility (Rahman and Davis, 2013). The performance of the proposed technique is validated on simulated and real-life financial datasets and compared with the conventional VAR model.

This paper contributes to the literature by developing a Multivariate XGBoost technique within the VAR model framework that captures nonlinear cross-lagged relationships and leverages the ensemble power of XGBoost for robust forecasting.

METHODOLOGY

Vector Autoregressive Model (VAR)

The VAR model is one of the most successful, flexible, and easy-to-use models for the analysis of multivariate time series. It is a natural extension of the univariate autoregressive model to dynamic multivariate time series, and it is useful for describing the dynamic behaviour of economic and financial time series and for forecasting (Lütkepohl, 2005). Consider a column vector of k different time series variables:

\(\mathbf{Y}_{t} = (y_{1t}, y_{2t}, y_{3t}, \ldots, y_{kt})^{\top}\) (1)

and model it in terms of past values of the vector. The result is a vector autoregression, or VAR. The VAR(p) process is of the form:

\(\mathbf{Y}_{t} = \mathbf{M} + \mathbf{A}_{1}\mathbf{Y}_{t - 1} + \mathbf{A}_{2}\mathbf{Y}_{t - 2} + ... + \mathbf{A}_{p}\mathbf{Y}_{t - p} + \mathbf{\varepsilon}_{t}\) (2)

where

\(\mathbf{A}_{r} = \begin{pmatrix} a_{11}^{(r)} & a_{12}^{(r)} & \cdots & a_{1k}^{(r)} \\ a_{21}^{(r)} & a_{22}^{(r)} & \cdots & a_{2k}^{(r)} \\ \vdots & \vdots & \ddots & \vdots \\ a_{k1}^{(r)} & a_{k2}^{(r)} & \cdots & a_{kk}^{(r)} \end{pmatrix},\quad i,j = 1,\ldots,k;\; r = 1,\ldots,p\) (3)

Each \(\mathbf{A}_{r}\) is a \(k \times k\) square matrix of coefficients, \(\mathbf{M}\) is a \(k \times 1\) column vector, and \(\mathbf{\varepsilon}_{t}\) is a \(k \times 1\) vector white noise process with the properties that:

\(E(\mathbf{\varepsilon}_{\mathbf{t}}) = 0\) for all t

\(E(\mathbf{\varepsilon}_{t}\mathbf{\varepsilon}_{s}^{\top}) = \begin{cases} \mathbf{\nu}, & \text{if } s = t \\ \mathbf{0}, & \text{if } s \neq t \end{cases}\) (4)

where the covariance matrix \(\mathbf{\nu}\) is assumed to be positive definite. Thus, the \(\mathbf{\varepsilon}_{t}\) are serially uncorrelated but may be contemporaneously correlated.

For \(k = 3\) and \(p = 2\), Equation (2) becomes:

\(\begin{pmatrix} y_{1t} \\ y_{2t} \\ y_{3t} \end{pmatrix} = \begin{pmatrix} m_{1} \\ m_{2} \\ m_{3} \end{pmatrix} + \begin{pmatrix} a_{11}^{(1)} & a_{12}^{(1)} & a_{13}^{(1)} \\ a_{21}^{(1)} & a_{22}^{(1)} & a_{23}^{(1)} \\ a_{31}^{(1)} & a_{32}^{(1)} & a_{33}^{(1)} \end{pmatrix}\begin{pmatrix} y_{1t - 1} \\ y_{2t - 1} \\ y_{3t - 1} \end{pmatrix} + \begin{pmatrix} a_{11}^{(2)} & a_{12}^{(2)} & a_{13}^{(2)} \\ a_{21}^{(2)} & a_{22}^{(2)} & a_{23}^{(2)} \\ a_{31}^{(2)} & a_{32}^{(2)} & a_{33}^{(2)} \end{pmatrix}\begin{pmatrix} y_{1t - 2} \\ y_{2t - 2} \\ y_{3t - 2} \end{pmatrix} + \begin{pmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \\ \varepsilon_{3t} \end{pmatrix}\) (5)
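To make Equation (2) concrete, the \(k = 3\), \(p = 2\) case in Equation (5) can be simulated directly. The following is a minimal Python sketch; the coefficient values are illustrative choices of ours, not those used in the study:

```python
import numpy as np

rng = np.random.default_rng(42)
k, p, T = 3, 2, 300  # 3 series, 2 lags, 300 time points

# Illustrative intercept M and coefficient matrices A1, A2, chosen small
# enough that the process is stable (stationary).
M = np.array([0.5, -0.2, 0.1])
A1 = np.array([[0.3, 0.1, 0.0],
               [0.0, 0.2, 0.1],
               [0.1, 0.0, 0.3]])
A2 = np.array([[0.1, 0.0, 0.0],
               [0.0, 0.1, 0.0],
               [0.0, 0.0, 0.1]])

Y = np.zeros((T, k))
for t in range(p, T):
    eps = rng.standard_normal(k)  # white-noise vector eps_t with E(eps_t) = 0
    Y[t] = M + A1 @ Y[t - 1] + A2 @ Y[t - 2] + eps
```

Each row of `Y` is one realization of the vector \(\mathbf{Y}_{t}\); because the coefficients keep the companion matrix stable, the simulated series stay bounded around the process mean.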

Multivariate XGBoost Method

The multivariate XGBoost method was proposed by Guang (2021), who generalized the XGBoost method to multi-objective loss functions, forming multi-objective parameter regularized tree boosting.

For a given data set \(\mathbf{D} = \{(\mathbf{x}_{i},\mathbf{y}_{i}),\ i = 1,\ldots,n,\ \mathbf{x}_{i}\in\mathbb{R}^{m},\ \mathbf{y}_{i}\in\mathbb{R}\}\) containing m features and n sample points:

For any sample point \((\mathbf{x}_{i},\mathbf{y}_{i})\), consider \(l(\theta_{1i},\ldots,\theta_{li};y_{i})\) as an l-variable loss function, where \(\theta_{1i},\ldots,\theta_{li}\) are the independent variables and the value range of each \(\theta_{ji}\ (j = 1,2,\ldots,l)\) is a subinterval of \(\mathbb{R}\).

In the multi-objective parameter regularized tree boosting model, consider \(\theta_{1i},\ldots,\theta_{li}\) as the loss-function parameters to be estimated, and add \(K_{j}\) tree functions to obtain the predicted result of the parameter \(\theta_{ji}\ (j = 1,2,\ldots,l)\) of \(l(\theta_{1i},\ldots,\theta_{li};\mathbf{y}_{i})\):

\({\widehat{\mathbf{\theta}}}_{ji} = \phi_{j}(\mathbf{x}_{i}) = \sum_{k = 1}^{K_{j}} f_{\theta_{jk}}(\mathbf{x}_{i}),\) (6)

where \(F = \{ f(\mathbf{x}) = \omega_{q(\mathbf{x})}\}\) is the space of regression trees, \(q\) represents the structure of each tree, which maps a sample to a corresponding leaf index, and \(T\) is the number of leaves in the tree. Each \(f_{\theta_{jk}}\) corresponds to an independent tree structure \(q\) with leaf weights \(\omega\). To learn these tree functions, the following regularized objective is minimized:

\(L(\theta_{1},\ldots,\theta_{l}) = \sum_{i = 1}^{n} l(\theta_{1i},\ldots,\theta_{li};\mathbf{y}_{i}) + \sum_{k_{1}}\Omega_{\theta_{1}}(f_{k_{1}}) + \cdots + \sum_{k_{l}}\Omega_{\theta_{l}}(f_{k_{l}})\) (7)

where

\(\mathbf{\Omega}_{\theta_{r}}(f_{k_{r}}) = \mathbf{\gamma}_{\theta_{r}}\mathbf{T}_{\theta_{r}} + \frac{1}{2}\mathbf{\lambda}_{\theta_{r}}(\omega)^{2},r = 1,2,...,l\) (8)

for example, for \(r = 1\): \(\Omega_{\theta_{1}}(f_{k_{1}}) = \gamma_{\theta_{1}}T_{\theta_{1}} + \frac{1}{2}\lambda_{\theta_{1}}(\omega)^{2}\)

\(\mathbf{\gamma}_{\theta_{j}}\) and \(\mathbf{\lambda}_{\theta_{j}}\) are, respectively, the regularization parameters for the number of leaves \(\mathbf{T}_{\theta_{j}}\) and the squared leaf weights \((\omega)^{2}\).

The objective function for the t-th iteration is:

\(L_{t}(\theta_{1},\ldots,\theta_{l}) = \sum_{i = 1}^{n} l(\theta_{1i},\ldots,\theta_{li};\mathbf{y}_{i}) + \sum_{i = 1}^{n}\Big[ g\{ f(\mathbf{x}_{1}),\ldots,f(\mathbf{x}_{l})\} + \frac{1}{2}h\{ f(\mathbf{x}_{1})^{2},\ldots,f(\mathbf{x}_{l})^{2}\}\Big] + \sum_{j = 1}^{l}\Big[\mathbf{\gamma}_{\theta_{j}}\mathbf{T}_{\theta_{j}} + \frac{1}{2}\mathbf{\lambda}_{\theta_{j}}(\omega)^{2}\Big]\) (9)

A maximum of \(l\) trees can be trained simultaneously in each iteration of training; that is, all the parameters can be estimated simultaneously. Each tree corresponds to one parameter to be estimated and has its own independent hyperparameters.

The Proposed Multivariate XGBoost Technique for the VAR Model

The proposed technique is developed by hybridizing the multivariate XGBoost method of Guang (2021) with the conventional VAR model, estimating all regression equations simultaneously. Model estimation and forecasting can then be more efficient than solving each regression equation independently, as noted by Evgeniou and Pontil (2004). A limitation of the proposed technique is that it applies only to multivariate time series data. For a given training data set \(\mathbf{D} = \{(\mathbf{y}_{it},\mathbf{y}_{jt - r}),\ i,j = 1,\ldots,k;\ r = 1,\ldots,p;\ \mathbf{y}_{it}\in\mathbb{R},\ \mathbf{y}_{jt - r}\in\mathbb{R}\}\) containing k features:

For any sample point \((\mathbf{y}_{it},\mathbf{y}_{jt - r})\), consider \(l(\mathbf{y}_{1t},\ldots,\mathbf{y}_{kt};\mathbf{y}_{it})\) as an l-variable loss function with \(\mathbf{y}_{1t - r},\ldots,\mathbf{y}_{kt - r}\) as the independent variables. Consider \(\theta_{1t},\ldots,\theta_{kt}\) as the loss-function parameters to be estimated in the multi-objective parameter regularized tree boosting model, and add \(K_{j}\) tree functions to obtain the predicted result of the parameter \(\mathbf{\theta}_{it}\ (i = 1,\ldots,k)\) of \(l(\mathbf{y}_{1t},\ldots,\mathbf{y}_{kt};\mathbf{y}_{it})\):

\({\widehat{\mathbf{\theta}}}_{it} = \phi_{it}(\mathbf{y}_{jt - r}) = \sum_{j = 1}^{k}\sum_{r = 1}^{p} f(\mathbf{y}_{jt - r})\) (10)

where \(\{ f(\mathbf{y}_{jt - r}) = \omega_{q(\mathbf{y}_{jt - r})}\}\) is the space of regression trees, \(q\) represents the structure of each tree, which maps a sample to a corresponding leaf index, and \(T\) is the number of leaves in the tree. Each \(f_{\theta_{jk}}\) corresponds to an independent tree structure \(q\) with leaf weights \(\omega\). To learn these tree functions, the following regularized objective is minimized:

\(L(\theta_{1t},\ldots,\theta_{kt}) = \sum_{j = 1}^{k}\sum_{r = 1}^{p} l(\mathbf{y}_{1t - r},\ldots,\mathbf{y}_{kt - r};\mathbf{y}_{jt}) + \sum_{j = 1}^{k}\sum_{r = 1}^{p}\Omega f(\mathbf{y}_{jt - r})\) (11)

where Ω represents the regularization term, a factor used to measure the complexity of the tree \(f(\mathbf{y}_{jt - r})\).

We can obtain the optimal \(f(\mathbf{y}_{jt - r})\) by adding the first- and second-order gradient statistics of each loss function to minimize the objective function:

\(L(\theta_{1t},\ldots,\theta_{kt}) = \sum_{j = 1}^{k}\sum_{r = 1}^{p}\Big[ l(\mathbf{y}_{1t - r},\ldots,\mathbf{y}_{kt - r};\mathbf{y}_{jt}) + g\{ f(\mathbf{y}_{1t - r}),\ldots,f(\mathbf{y}_{kt - r})\} + \frac{1}{2}h\{ f(\mathbf{y}_{1t - r})^{2},\ldots,f(\mathbf{y}_{kt - r})^{2}\}\Big] + \sum_{j = 1}^{k}\sum_{r = 1}^{p}\Omega f(\mathbf{y}_{jt - r})\) (12)

where g and h are the first- and second-order gradients of the loss function, respectively,

and \(\Omega f(\mathbf{y}_{jt - r}) = \mathbf{\gamma}_{jt - r}\mathbf{T}_{jt - r} + \frac{1}{2}\mathbf{\lambda}_{jt - r}\mathbf{\omega}_{jt - r}^{2}\)

where \(\mathbf{\gamma}_{jt - r}\)and \(\mathbf{\lambda}_{jt - r}\)are the degrees of regularization. \(T_{jt - r}\) and \(\mathbf{\omega}_{jt - r}\) are the numbers of leaves and the vector of values attributed to each leaf, respectively.

By removing the constant terms, we have:

\(L(\theta_{1t},\ldots,\theta_{kt}) = \sum_{j = 1}^{k}\sum_{r = 1}^{p}\Big[ g\{ f(\mathbf{y}_{1t - r}),\ldots,f(\mathbf{y}_{kt - r})\} + \frac{1}{2}h\{ f(\mathbf{y}_{1t - r})^{2},\ldots,f(\mathbf{y}_{kt - r})^{2}\}\Big] + \sum_{j = 1}^{k}\sum_{r = 1}^{p}\Big\{\mathbf{\gamma}_{jt - r}\mathbf{T}_{jt - r} + \frac{1}{2}\mathbf{\lambda}_{jt - r}\mathbf{\omega}_{jt - r}^{2}\Big\}\) (13)

This completes the derivation of the proposed Multivariate XGBoost technique for the VAR model, which is used to obtain the best model based on forecast performance.

RESULTS

Simulated Data Results

The simulated data contain 12 features, each with 100 samples; the data points were generated in Python. The conventional VAR model was estimated using information-criterion-based methods (AIC and BIC), alongside the proposed VAR model (using the multivariate XGBoost technique). The forecast performance of the proposed and existing techniques for the estimated VAR models was computed and compared using MAE and RMSE, based on short-term and ten-step-ahead long-term forecasts.

Figure 1: Time series plot for 12 monthly simulated multivariate time series data

The pattern of the simulated series in Figure 1 indicates that the time series data are stationary at level.

Lag order selection

First, we undertake a VAR Lag Order selection process. The results for various selection criteria are presented in Table 1.

Table 1: Lag order selection

Lag Order AIC BIC HQIC
0 -1.096 -0.7249* -0.9476
1 0.05013 5.322 2.426
2 1.732 11.00 5.434
3 2.268 15.55 7.746
4 -0.4128 17.76 6.842
5 -10.74* 11.88 -1.710*

Table 1 shows that, based on AIC and HQIC, lag 5 was selected as the optimal lag for estimating the VAR Models.

Forecast Performance for conventional VAR Models and the Proposed Multivariate XGBoost technique for the VAR model

After estimating the conventional and proposed techniques for the VAR models, the forecast performance of the estimated models was evaluated over short-term and long-term forecast horizons based on MAE and RMSE. The summary results are shown in Table 2.

Summary of Results for Simulated Data

Table 2: Forecast performance for existing and proposed techniques based on short-term and long-term forecasts

Models Short-term Forecast Long-term Forecast
MAE RMSE MAE RMSE
Conventional VAR Model 1.642 2.016 1.316 1.669
VAR (MXGB) Model 2.625 2.652 1.008 1.357

From Table 2, the conventional VAR model outperforms the proposed Multivariate XGBoost technique for the VAR model in short-term forecasting, since the conventional VAR model has smaller MAE and RMSE values than the proposed technique. For long-term forecasting, the proposed Multivariate XGBoost technique for the VAR (MXGB) model demonstrates higher forecast accuracy than the conventional VAR model, since the proposed technique has the smaller MAE and RMSE values for the simulated data.
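The two accuracy metrics used throughout are straightforward to compute; for example (the values below are made up for illustration, not the study's forecasts):

```python
import numpy as np

def mae(actual, forecast):
    """Mean Absolute Error: average magnitude of the forecast errors."""
    return np.mean(np.abs(np.asarray(actual) - np.asarray(forecast)))

def rmse(actual, forecast):
    """Root Mean Square Error: penalizes large errors more heavily than MAE."""
    return np.sqrt(np.mean((np.asarray(actual) - np.asarray(forecast)) ** 2))

actual = np.array([10.0, 12.0, 9.0, 11.0])
forecast = np.array([11.0, 10.0, 9.5, 11.5])
print(mae(actual, forecast))   # 1.0
print(rmse(actual, forecast))  # ~1.1726
```

For the multivariate case, the same formulas are applied with the errors pooled across all series and forecast steps.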

Results for Real Dataset

The real dataset consists of monthly Nigerian financial time series spanning January 2010 to July 2024, a total of 175 data points used to estimate the models. The data for all the variables were obtained from the Central Bank of Nigeria website. Nigeria's trade in goods and services time series data, categorized into visible goods and services, are used in estimating the models. The nine variables are: Oil Sector (OILSE), Log of Financial Sector (LFINS), Industrial Sector (INDSE), Balance of Trade (BOT), Log of Commercial Services (LCOMS), Transport Services (TRANS), Tourism and Travel Services (TTS), Health and Related Social Services (HRSS), and Mining Sector (MINSE).

Figure 2: Time series plot for 9 monthly Trade in Goods and Services data

Figure 2 presents the monthly plot of the financial time series data, while Table 3 reports the corresponding unit root test results. Both visual inspection of the time series plots and formal diagnostic testing confirm stationarity for all nine series at level, as the ADF test results meet the conventional significance threshold (\(p < 0.05\)).

Unit root test

Table 3: Augmented Dickey Fuller Unit root test for real dataset

Variables ADF Statistic p-value Order of Integration
MINSE -3.87861 0.002198 ** I(0)
OILSE -5.516953 1.915 × 10⁻⁶ ** I(0)
HRSS -12.33116 6.422 × 10⁻²³ ** I(0)
TTS -4.08723 0.001017 ** I(0)
INDSE -3.07069 0.02880 ** I(0)
BOT -3.41569 0.010436 ** I(0)
FINS -3.52964 0.007258 ** I(0)
COMS -2.34250 0.158626 * I(0)
TRANS -2.32547 0.163885 * I(0)
LFINS -3.87670 0.002198 ** I(0)
LCOMS -3.98152 0.001510 ** I(0)

Note: ** indicates significance at the 5% level and * indicates significance at the 10% level. The null hypothesis of the ADF test is that a unit root exists in the data; rejecting it implies that the variable is stationary.

A stationary series must be obtained before it can be used to specify and fit a model. The unit root test, which has as its null hypothesis the existence of a unit root against the alternative of no unit root, determines the stationarity of a series. The Augmented Dickey-Fuller test (Dickey and Fuller, 1981) was used to test the stationarity of the series. The results of the unit root test presented in Table 3 indicate that all the variables are stationary at level, except FINS and COMS, which become stationary after a log transformation. Thus, all nine variables are stationary.

Lag order selection

First, we undertake a VAR Lag Order selection process. The results for various selection criteria are presented in Table 4.

Table 4: Lag order selection

Lag Order AIC BIC HQIC
0 44.25 44.47* 44.34
1 43.54 45.77 44.46
2 43.97 48.15 45.67
3 44.55 50.70 47.04
4 45.01 53.14 48.31
5 45.27 55.38 49.37
6 44.98 57.06 49.88
7 45.55 59.61 51.25
8 44.91 60.94 51.41
9 43.00 61.02 50.31
10 39.11 59.10 47.22
11 30.34* 52.31 39.25*

The lag order selection results show that, based on AIC and HQIC, lag 11 is selected as the optimal lag for estimating the VAR Models.

Forecast Performance for conventional VAR Models and the Proposed Multivariate XGBoost technique for the VAR model

After estimating the conventional VAR model based on information criteria and the proposed technique for VAR models, the forecast performance of the estimated models was compared over short-term and long-term forecast horizons based on MAE and RMSE. The summary results are shown in Table 5.

Summary Result for Real Data

Table 5: Comparative Analysis on the forecast performance for existing and proposed techniques based on short-term and long-term forecasts

Models Short-term Forecast Long-term Forecast
MAE RMSE MAE RMSE
Conventional VAR Model 719.14 1901.49 428.81 1487.53
VAR (MXGB) Model 191.96 267.14 451.23 730.57

The results in Table 5 indicate that the proposed Multivariate XGBoost technique for the VAR model outperforms the conventional VAR model in short-term forecasting, since the proposed technique has smaller MAE and RMSE values than the conventional VAR model. For long-term forecasting, the proposed technique also outperforms the conventional VAR model in terms of RMSE, although the conventional model has a slightly smaller MAE for the real dataset.

CONCLUSION

In this study, the proposed multivariate Extreme Gradient Boosting technique for the VAR model was developed by hybridizing the multivariate XGBoost technique with the VAR model. The results for the simulated data indicated that the proposed technique outperforms the existing conventional VAR model in long-term forecasting, while the conventional VAR model achieves better forecast accuracy in short-term forecasting, based on MAE and RMSE. The results for the real dataset indicated that the proposed multivariate XGBoost technique for the VAR model outperformed the conventional VAR model in both short-term and long-term forecasting, as measured by MAE and RMSE.

RECOMMENDATION

Based on the findings of this study, it is recommended that:

The multivariate XGBoost technique should be preferred for short-term and long-term forecasting of real-life time series data with the VAR model.

The proposed technique should be considered as an alternative to the conventional VAR model for modelling financial time series data.

REFERENCES

Billio, M., Casarin, R., & Rossini, L. (2019). Bayesian nonparametric sparse VAR models. Journal of Econometrics, 212(1), 97–115.

Chen, T., & Guestrin, C. (2016). XGBoost: A scalable tree boosting system. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 785–794.

Dickey, D. A., & Fuller, W. A. (1981). Likelihood ratio statistics for autoregressive time series with a unit root. Econometrica, 49(4), 1057–1072.

Evgeniou, T., & Pontil, M. (2004). Regularized multi-task learning. Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 109–117.

Guang, Y. (2021). Generalized XGBoost method. Blue Print.

Jung, N. H., Lin, H. H., & Mei, Y. C. (2008). Subset selection for vector autoregressive processes using LASSO. Computational Statistics and Data Analysis, 52(7), 3645–3657.

Korobilis, D. (2009). VAR forecasting using Bayesian variable selection. University of Strathclyde and Rimini Center for Economic Analysis Review.

Li, Y., & Chen, W. (2020). A comparative performance assessment of ensemble learning for credit scoring. Tianjin University Review, China.

Lubbers, L. (2023). Comparing the performance of XGBoost and random forest models for predicting company cash position [Master's thesis, Utrecht University].

Lütkepohl, H. (2005). New introduction to multiple time series analysis. Springer.

Nicholson, W., Matteson, D., & Bien, J. (2015). Structured regularization for large VAR with exogenous variables. arXiv preprint arXiv:1508.07497.

Rahman, M. G., & Davis, D. N. (2013). Addressing the data imbalance problem in software defect prediction using cost sensitive learning. International Journal of Machine Learning and Computing, 3(4), 333–336.

Sims, C. A. (1980). Macroeconomics and reality. Econometrica, 48(1), 1–48.

Sun, C., Chen, Z., Qin, Y., & Wang, B. (2022). Multi-step time series forecasting based on Informer-XGBoost-GA. Journal of Physics: Conference Series, 2333(1), 012009.

Sundari, P. S., & Mahardika, K. P. (2024). Optimization of a house price prediction model using gradient boosted regression trees (GBRT) and the XGBoost algorithm. Journal of Student Research Exploration, 2(1), 1–10.

Suotsalo, K., Xu, Y., Corander, J., & Pensar, J. (2021). High-dimensional structure learning of sparse vector autoregressive models using fractional marginal pseudo-likelihood. Columbia University Review.

Tsay, R. S. (2005). Analysis of financial time series. John Wiley & Sons.

Zhai, Y. (2021). Multivariate time series forecast in industrial process based on XGBoost and GRU. 2020 9th Joint International Information Technology and Artificial Intelligence Conference (ITAIC), 9.