Wheat futures price prediction based on GA-VMD-CS-LSTM model

Wheat is an important agricultural product. Its price discovery is of great significance to the trading efficiency of the futures market and to price fluctuations driven by supply and demand in the spot market. Because futures price fluctuations are nonlinear and the volume of data is large, this paper uses the GA-VMD method to denoise the data and the CS-LSTM model to predict the price of wheat futures on the Zhengzhou Commodity Exchange. A comparison with the prediction results of SVR, BP and LSTM shows that CS-LSTM has a lower prediction error and a better prediction effect. By studying the changes in wheat futures prices, this paper provides theoretical support for the government to formulate timely and appropriate policies that adjust wheat output and prices and improve farmers' lives.


Introduction
Wheat is the world's largest food crop. In the world grain trade, wheat makes up the main body of trade, covering a wide range and involving many countries. Wheat is the third-largest food crop in China, and China's wheat yield and imports rank among the highest in the world. The futures market plays a role in allocating social resources and provides data support for agriculture. The wheat futures market of the Zhengzhou Commodity Exchange started early, trades actively, has sound institutional guarantees, and has helped integrate China's domestic market with international markets. This paper takes the wheat futures price on the Zhengzhou Commodity Exchange as the research object and carries out a forecast analysis.
Many scholars have carried out relevant research on futures. Pang Zhenyan et al. (2013), based on the discrete wavelet transform and the GARCH model, showed that the futures market has a persistent impact on the volatility of spot market prices, and that the impact differs across varieties [1]. In terms of futures price prediction, the existing research models can be roughly divided into linear and nonlinear prediction models. Wang et al. (2008) predicted the long-term trend of the wheat futures price by constructing a Markov model [2]. This method is simple and practical, and has a certain reference value. Using deep learning, Wang (2009) proposed a neural network based on the fusion of grey correlation analysis (GRA), cuckoo search (CS) and back propagation (BP) to predict futures prices [3]. That work provides a theoretical basis for integrating CS with other methods to predict wheat futures. Wang (2020) proposed a stock index futures price prediction scheme based on an LSTM-SVR hybrid model, which combines the long short-term memory (LSTM) artificial neural network with support vector regression (SVR) [4]. That work extended the LSTM model to the field of stock index futures price prediction and integrated the LSTM and SVR models into a hybrid model whose more accurate results confirm the rationality of the hybrid approach. This paper extends the existing work in two respects: the prediction target, which is the wheat price, and the models used. We choose the GA-VMD model, a noise reduction method that combines a genetic algorithm (GA) with variational mode decomposition (VMD), to preprocess the data. In addition, the hybrid deep learning model CS-LSTM is used for prediction.
The innovation of this paper falls into two points: innovation in research content and innovation in models.
(1) In research on wheat futures, some scholars have studied market effectiveness, price volatility, and so on. However, relatively little work uses deep learning algorithms to predict the price of wheat futures. This paper expands the research in this area.
(2) This paper uses the GA-VMD model to preprocess the data and the CS-LSTM deep learning model to analyze the data and obtain the prediction results. CS is added to optimize the LSTM model, and the results show that the optimized model performs better, which supports its effectiveness.
In this paper, the price of wheat futures is predicted by deep learning, which extends deep learning to the price prediction of agricultural products, taking wheat as an example. In addition, a more accurate prediction can be achieved. More accurate futures price signals can be transmitted to the food sector in time, which helps guide farmers to adjust the variety structure of wheat reasonably, further improves wheat quality and farmers' awareness of futures prices, and thereby raises farmers' income.

GA model
The genetic algorithm (GA) is based on the evolution of organisms in nature. It is a computational model that simulates the natural selection and genetic mechanisms of Darwin's theory of biological evolution in order to search for an optimal solution [5].
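As a minimal sketch of the selection, crossover and mutation loop that a genetic algorithm relies on, the following Python example minimizes a toy one-dimensional function. The function name `genetic_minimize`, the real-valued encoding, the tournament selection and all parameter values are illustrative assumptions, not the configuration used in this paper.

```python
import random

def genetic_minimize(fitness, low, high, pop_size=30, generations=60,
                     mutation_rate=0.2, seed=0):
    """Minimise a one-dimensional function with a simple real-coded GA."""
    rng = random.Random(seed)
    population = [rng.uniform(low, high) for _ in range(pop_size)]
    for _ in range(generations):
        # Elitism: carry the best individual over unchanged.
        children = [min(population, key=fitness)]
        while len(children) < pop_size:
            # Selection: binary tournament, the lower fitness value wins.
            p1 = min(rng.sample(population, 2), key=fitness)
            p2 = min(rng.sample(population, 2), key=fitness)
            # Crossover: random convex combination of the two parents.
            w = rng.random()
            child = w * p1 + (1 - w) * p2
            # Mutation: occasional Gaussian perturbation.
            if rng.random() < mutation_rate:
                child += rng.gauss(0, 0.1 * (high - low))
            # Keep the child inside the search interval.
            children.append(min(max(child, low), high))
        population = children
    return min(population, key=fitness)

# Toy objective with its minimum at x = 3 (illustrative only).
best_x = genetic_minimize(lambda x: (x - 3.0) ** 2, 0.0, 10.0)
```

In a GA-VMD setting, the toy objective would be replaced by a measure of decomposition quality evaluated for a candidate penalty parameter; that measure is not specified here.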

GA-VMD model
The decomposition effect of VMD is strongly affected by the quadratic penalty factor: the larger this parameter, the smaller the bandwidth of each modal component signal, and vice versa. Futures prices are varied and complex, so only a reasonable choice of parameter values can improve the VMD decomposition of futures prices. Because the decomposition parameters determine the decomposition effect of VMD, the most appropriate parameters need to be selected. In the GA-VMD model, a genetic algorithm is used to calculate the optimal penalty parameter of the VMD decomposition model [6].

LSTM model
The long short-term memory neural network (LSTM network) was proposed to better handle the vanishing-gradient and exploding-gradient problems in long-sequence training [7].
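To make the gating idea concrete, here is a minimal sketch of one forward step of a single-unit LSTM cell in plain Python, using the standard forget, input and output gate formulation. The function `lstm_step`, the parameter layout and the numeric weights are illustrative assumptions; the paper's actual network is a two-layer LSTM whose implementation is not given.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, p):
    """One step of a single-unit LSTM cell; p maps gate name -> (w_x, w_h, b)."""
    def pre(gate):
        w_x, w_h, b = p[gate]
        return w_x * x + w_h * h_prev + b
    f = sigmoid(pre("forget"))        # how much of the old cell state to keep
    i = sigmoid(pre("input"))         # how much new information to write
    o = sigmoid(pre("output"))        # how much of the cell state to expose
    g = math.tanh(pre("candidate"))   # candidate cell content
    c = f * c_prev + i * g            # additive cell-state update
    h = o * math.tanh(c)              # new hidden state
    return h, c

# Run the cell over a short sequence of illustrative normalised inputs.
params = {g: (0.5, 0.5, 0.0) for g in ("forget", "input", "output", "candidate")}
h, c = 0.0, 0.0
for x in [0.1, 0.2, 0.3]:
    h, c = lstm_step(x, h, c, params)

# With the forget gate open and the input gate closed, the cell state is preserved.
keep = {"forget": (0.0, 0.0, 20.0), "input": (0.0, 0.0, -20.0),
        "output": (0.0, 0.0, 0.0), "candidate": (0.0, 0.0, 0.0)}
_, c_kept = lstm_step(1.0, 0.0, 0.7, keep)
```

The additive update of the cell state `c` is what lets information, and gradients, pass across many time steps with little attenuation when the forget gate stays close to 1.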

CS (Cuckoo Search Algorithm)
The cuckoo search algorithm, also known as the cuckoo algorithm, is an optimization algorithm based mainly on two strategies: the Lévy flight mechanism and cuckoo brood parasitism [8].
The following three idealized hypotheses are first made:
(1) Each cuckoo lays one egg at a time and places it in a randomly chosen nest for hatching.
(2) Among the randomly selected nests, the best nest is retained for the next generation.
(3) The number of available nests is set to $n$, and the probability that the host bird finds an alien egg in its nest is $P \in [0, 1]$. If the alien egg is found, the host either builds a new nest or destroys the egg [9].
Therefore, the position-update formula for the cuckoo's nest search can be obtained:
$$x_i^{(t+1)} = x_i^{(t)} + \alpha \otimes L(\lambda)$$
where $\alpha$ is the step-size scaling factor, which obeys a normal distribution; $x_i^{(t)}$ is the position of the $i$-th nest at the $t$-th generation; $\otimes$ denotes point-wise multiplication; and $L(\lambda)$ is the random step of the Lévy flight [10]. The specific steps of the cuckoo search algorithm are as follows:
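One common way to organize the cuckoo search loop, which is not necessarily the exact flow used in this paper, is sketched below in Python. Several choices are assumptions made for illustration: the Lévy step is drawn with Mantegna's algorithm, the step-size factor `alpha` is a constant scaled by the width of each search interval rather than a normally distributed quantity, and the names `levy_step` and `cuckoo_search`, the parameter values and the sphere test function are hypothetical.

```python
import math
import random

def levy_step(rng, beta=1.5):
    """Draw one heavy-tailed step with Mantegna's algorithm (an assumed choice)."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma_u = (num / den) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)

def cuckoo_search(objective, bounds, n_nests=15, pa=0.25, alpha=0.01,
                  iterations=200, seed=0):
    """Minimise `objective` over the box `bounds` = [(low, high), ...]."""
    rng = random.Random(seed)

    def random_nest():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def clip(x):
        return [min(max(xi, lo), hi) for xi, (lo, hi) in zip(x, bounds)]

    nests = [random_nest() for _ in range(n_nests)]
    scores = [objective(x) for x in nests]
    for _ in range(iterations):
        for i in range(n_nests):
            # Levy flight around nest i: x_i(t+1) = x_i(t) + alpha * L(lambda)
            cand = clip([xi + alpha * (hi - lo) * levy_step(rng)
                         for xi, (lo, hi) in zip(nests[i], bounds)])
            cand_score = objective(cand)
            # Compare the new egg with a randomly chosen nest; keep the better one.
            j = rng.randrange(n_nests)
            if cand_score < scores[j]:
                nests[j], scores[j] = cand, cand_score
        # Hosts discover alien eggs with probability pa; the best nest is kept.
        best = scores.index(min(scores))
        for i in range(n_nests):
            if i != best and rng.random() < pa:
                nests[i] = random_nest()
                scores[i] = objective(nests[i])
    best = scores.index(min(scores))
    return nests[best], scores[best]

def sphere(x):
    return sum(xi * xi for xi in x)

best_x, best_f = cuckoo_search(sphere, [(-5.0, 5.0), (-5.0, 5.0)])
```

Because the best nest is never abandoned and is only overwritten by a better candidate, the best score cannot get worse from one iteration to the next.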

CS-LSTM model
Before the neural network model can be used for prediction, its hyper-parameters must be set. Here the cuckoo search algorithm is used to search for the hyper-parameters of the neural network model. Because the CS algorithm optimizes the hyper-parameter settings, the trained CS-LSTM model is expected to perform better than the basic LSTM model.
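A search algorithm such as CS works on continuous position vectors, so each nest has to be translated into concrete hyper-parameters before an LSTM can be trained and scored. The sketch below shows one possible decoding. The function `decode_nest`, the unit-interval encoding, the log scale for the learning rate and every bound in `BOUNDS` are hypothetical choices for illustration; the paper does not state its search ranges in this form.

```python
# Hypothetical search ranges, chosen only for illustration.
BOUNDS = {
    "batch_size": (16, 256),
    "log10_learning_rate": (-4.0, -1.0),
    "nodes_layer1": (10, 100),
    "nodes_layer2": (10, 100),
}

def decode_nest(position, bounds=BOUNDS):
    """Map a nest position in [0, 1]^4 to concrete LSTM hyper-parameters."""
    def scale(u, lo, hi):
        u = min(max(u, 0.0), 1.0)  # clamp to the unit interval
        return lo + u * (hi - lo)
    return {
        "batch_size": int(round(scale(position[0], *bounds["batch_size"]))),
        # The learning rate is searched on a log scale.
        "learning_rate": 10 ** scale(position[1], *bounds["log10_learning_rate"]),
        "nodes_layer1": int(round(scale(position[2], *bounds["nodes_layer1"]))),
        "nodes_layer2": int(round(scale(position[3], *bounds["nodes_layer2"]))),
    }

lowest = decode_nest([0.0, 0.0, 0.0, 0.0])
highest = decode_nest([1.0, 1.0, 1.0, 1.0])
```

The objective passed to the search would then train an LSTM with the decoded settings and return its validation error, so that a lower score means a better nest.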

Data selection
The data set selected in this paper comes from the Zhengzhou Commodity Exchange. It consists of the daily closing price, opening price, highest price, lowest price and trading volume of wheat futures from January 4, 2007, to March 15, 2022. Excluding holidays and weekends, a total of 3573 days of data were collected, with a total of 17870 valid data points.

Data processing
In this paper, the 3573 days of closing price data are normalized, which benefits the subsequent prediction. The closing price of the original wheat futures series is normalized to the interval [0, 1] as
$$x^{*} = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{5}$$
where $x$ is the original wheat futures price, $x_{\min}$ and $x_{\max}$ are the minimum and maximum wheat futures prices, respectively, and $x^{*}$ is the normalized value of the original wheat futures price.
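The min-max normalization to [0, 1] can be sketched in a few lines of Python. The helper names `min_max_normalize` and `denormalize` and the sample prices are illustrative assumptions; the inverse transform is included because predictions made on the normalized scale usually need to be mapped back to prices.

```python
def min_max_normalize(prices):
    """Scale a list of prices to [0, 1]; also return the minimum and maximum."""
    lo, hi = min(prices), max(prices)
    if hi == lo:
        raise ValueError("cannot normalise a constant series")
    return [(p - lo) / (hi - lo) for p in prices], lo, hi

def denormalize(values, lo, hi):
    """Map normalised values back to the original price scale."""
    return [v * (hi - lo) + lo for v in values]

# Made-up closing prices, used only to illustrate the transform.
prices = [2500.0, 2650.0, 2400.0, 2800.0]
scaled, p_min, p_max = min_max_normalize(prices)
restored = denormalize(scaled, p_min, p_max)
```

In a forecasting setting, the minimum and maximum are often computed on the training portion only, so that no information from the later period leaks into training; the paper does not say which convention it follows.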

Evaluation indicators
In this paper, three error indicators are used to evaluate the model's performance. The specific formula is as follows:
1) Root mean square error (RMSE):
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$
where $n$ is the total number of samples, $y_i$ is the actual closing price on day $i$, and $\hat{y}_i$ is the predicted closing price on day $i$.
The smaller the values of RMSE and MAE, the smaller the error and the better the prediction effect. The larger $R^2$ is, the smaller the error and the better the prediction effect.
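The indicators named above can be computed as in the following sketch, which uses the standard textbook definitions of RMSE, the mean absolute error (MAE) and the coefficient of determination ($R^2$). The function names and the sample values are illustrative assumptions, and the paper's own evaluation code is not shown.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error."""
    n = len(y_true)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / n)

def mae(y_true, y_pred):
    """Mean absolute error."""
    n = len(y_true)
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / n

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot

# Made-up actual and predicted values, used only to exercise the functions.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 9.0]
scores = {"RMSE": rmse(y_true, y_pred),
          "MAE": mae(y_true, y_pred),
          "R2": r2(y_true, y_pred)}
```

For these sample values the squared errors sum to 2 and the total sum of squares is 20, so RMSE is the square root of 0.5, MAE is 0.5 and $R^2$ is 0.9.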

Comparison parameters of the GA-VMD model
This paper uses the GA algorithm to calculate the optimal penalty parameter in VMD. The result is as shown in Figure 2:

Decomposition model results
The VMD model with the optimal penalty parameter is then used to decompose the closing price of wheat futures. The IMF decomposition results show that the closing price of wheat futures is decomposed into seven components. The optimization process curve is shown in Figure 3:

Figure 4: IMF Time Domain
Over evolution generations 0 to 30, the number of decomposition modes is 7. The IMF time-domain diagram and the IMF spectrum diagram are shown in Figure 4.

The closing price prediction of wheat futures based on the CS-LSTM algorithm
In the data sample, the first 70% of the data are selected as the training set and the last 30% as the validation set. First, this paper uses CS to optimize the hyper-parameters of the LSTM and obtains the optimal set of hyper-parameters through iterative optimization. In addition, this paper sets the number of hidden layers of the LSTM model to two and sets the following hyper-parameters: batch-size, learning-rate and node-number.
First-layer node-number: 10, 10
Second-layer node-number: 100, 100
After CS, the optimal hyper-parameters are substituted into the LSTM neural network model to obtain the CS-LSTM prediction results shown in Figure 6.
The four evaluation indexes of the CS-LSTM model are calculated with the relevant code.
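The 70/30 split described in this section is chronological, so the series must be cut at a single point rather than shuffled. A minimal sketch follows; the function name `chronological_split`, the use of integer truncation for the cut point, and the placeholder series are assumptions.

```python
def chronological_split(series, train_fraction=0.7):
    """Split a time-ordered sequence into a leading and a trailing part."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    cut = int(len(series) * train_fraction)  # truncate to a whole index
    return series[:cut], series[cut:]

# Placeholder values standing in for the 3573 daily closing prices.
closing = list(range(3573))
train, validation = chronological_split(closing)
```

With 3573 observations and truncation, this yields 2501 training points and 1072 validation points; the paper does not state how it rounds the cut point, so the exact counts may differ by one.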

Comparative analysis of four prediction models (SVR, BP, LSTM, CS-LSTM)
The other three prediction models, namely the support vector regression (SVR) model, the BP neural network model and the LSTM neural network model, are each used to predict the sample data. Comparing them with the CS-LSTM model in this paper illustrates the accuracy of the CS-LSTM model.

Figure 6: CS-LSTM Prediction Chart
Secondly, to verify the prediction effect of the four models on the closing price of wheat futures, Figure 7 shows a comparison chart of the four models. To make the comparison clearer, Figure 8 shows a locally magnified view of the 900th to 1000th sample points for the four models. The figures show that the prediction results of CS-LSTM are better than those of the other three models and fit the real values best. The following table shows the evaluation indexes of the four models, calculated with the relevant code. The table shows that the CS-LSTM model performs better overall. Because CS-LSTM adds iterative optimization on top of the LSTM model to find the optimal hyper-parameters, its fitting effect is better. The above empirical verification therefore supports the improvement brought by the CS-LSTM model.