Research on the Returns of SSE STAR Market Chip Index Based on the ARMA Model

Since 2021, China's semiconductor and chip industries have been booming. While both industries have continued to expand, they also face problems of a bubble economy and excessive competition. In the first half of 2022, the stock market entered an adjustment cycle. In order to predict the trend of returns during the adjustment cycle and help investors make reasonable decisions, this paper uses the AIC and BIC criteria to determine the order of the model, tests the significance of the model and its parameters, and then builds an ARMA(2,1) model to predict the closing price of the SSE STAR Market Chip Index, using data from January 2nd, 2020 to February 21st, 2022. The results show that both the fitting effect and the forecasting effect are good, and that the index will maintain a downward trend in the next 30 periods. Based on this empirical analysis, this paper puts forward two suggestions: firstly, in the face of industry differentiation, people need to invest rationally; secondly, semiconductor companies and chip companies should attach importance to mergers and acquisitions.


Introduction
With the advent of the digital economy era, the diversification of consumer demand has led the integrated circuit industry chain to expand into digital and intelligent application fields. The industry chain covers semiconductor material processing, chip manufacturing, network communications and other popular industries. In recent years, it has been increasingly valued by investors because it is not only at the frontier of the information technology industry, but also an emerging industry that highlights national strength, promotes economic development and ensures national security [1]. The semiconductor industry is the hottest segment of this chain, because semiconductor chips serve downstream terminals such as automobiles, communications, medical care, and consumer electronics. Almost every electronic device in daily life depends on semiconductors. With the continuing popularity of new energy vehicles, automotive chips have become an important growth opportunity for semiconductor production. Because chip demand outstrips supply and the industry's profit margins remain high, semiconductor investment has become a focus of the market [2].
However, in the first half of 2022, the semiconductor industry entered a downward cycle. Uneven consumer purchasing enthusiasm has widened the gap between industries. For example, new energy vehicles and energy storage related industries are still improving steadily, while the chips used in smart home appliances are affected by the decline in the real estate industry and their demand fluctuates violently. In general, the market growth rate and dividends are decreasing. External reasons include recurring epidemic waves, inflation, and wars; internal reasons include high valuations and blind competition in the relevant industries, which have formed a bubble economy. The growth of returns has slowed down and an industry reshuffle is emerging. This paper believes that observing an industry-representative index can reflect the long-term fluctuations of the market, predict the trend of returns during the adjustment cycle and help investment institutions proceed with more caution, which is of great significance for making investment decisions.
With the rapid development of the securities market, scholars have conducted extensive research on forecasting individual stocks, industry indices, and funds. Time series models are widely applied to financial time series forecasting. These statistical models include the Autoregressive Moving Average (ARMA) model [3], the Autoregressive Integrated Moving Average (ARIMA) model [4], the Generalized Autoregressive Conditional Heteroskedasticity (GARCH) model, and ARMA-GARCH [5], all of which are based on the relationship between current and historical values.
In terms of stock prices, Yang and Zhang modeled the opening price of Volkswagen public stock with a small prediction error, indicating that the ARIMA model could support stock investment decision-making for relevant institutions [6]. Sun and Zhou used the BIC criterion to determine the ARIMA model's order and predicted the closing price of China Merchants Bank in the next 40 periods [7]. In terms of indexes, Li et al. established an ARIMA model to forecast the closing price of the Shanghai Composite Index in the last two months of 2016, with a very good fitting effect [8]. K Murali et al. established an ARIMA(2,1,2) model to predict the closing BSE Index in the next 12 months, demonstrating the excellent forecasting effect of the ARIMA model [9]. Paulo Rotela Junior et al. used an AR(1) model, an ARIMA(0,2,1) model and an exponential smoothing model to fit the Brazilian stock market index. The MAPE of the AR(1) model was 0.052%, the lowest among the three [10].
In the field of semiconductor-related industries, the existing literature mainly covers the valuation of specific semiconductor and chip companies and the modeling of the prosperity index of the semiconductor industry. There is no time series analysis of a comprehensive index of the semiconductor and chip industries. Therefore, the innovation of this paper is to analyze the SSE STAR Market Chip Index in order to explore returns when the relevant markets are down. This paper uses an ARMA(2,1) model to predict the closing price of the SSE STAR Market Chip Index and thereby explore the future trends of the semiconductor and chip industries during the adjustment period from a comprehensive perspective. The research steps are as follows: firstly, use the AIC and BIC criteria to determine the order of the ARMA model; secondly, perform a significance test on the model and its parameters and obtain the predicted closing prices for the next 30 periods; finally, compare the predictions with the actual values to analyze the fitting and forecasting effect. This paper believes that the ARMA model performs well in predicting the short-term returns of the chip index. In the short term, the demand side of the semiconductor and chip industries will continue to tighten, and the lack of consumer purchasing enthusiasm will not ease soon. In the face of industry differentiation, investors need to return to rationality and semiconductor-related companies should pay attention to mergers and acquisitions.
The remainder of this paper is organized as follows: data and methodology, results, and conclusion.

Data Interpretation
The data studied in this paper come from Wind. The research object is the closing price of the SSE STAR Market Chip Index (Index code: 000685) from January 2nd, 2020 to February 21st, 2022, with a total of 516 observations, denoted as $\{x_t\}$. The training set is the closing price from January 2nd, 2020 to December 31st, 2021, and the test set runs from January 2nd, 2022 to February 21st, 2022, during which the semiconductor industry is undergoing the adjustment cycle. The index selects securities of STAR Market companies related to semiconductor materials and equipment, chip design, chip manufacturing, and chip packaging and testing as index samples, so it reflects the overall performance of representative securities of companies in the chip industry.
This paper then calculates the linear returns and compounded returns of the series and obtains the following data characteristics. As shown in table 1, the maximum and minimum values of the linear returns are 12.480% and -18.549% respectively, with a right-skewed distribution, while the maximum and minimum values of the compounded returns are 11.761% and -20.516% respectively, with a left-skewed distribution. In addition, the distributions of both return series are sharply peaked.
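As a minimal sketch of the two return definitions, assuming linear returns are $P_t/P_{t-1} - 1$ and compounded returns are $\ln(P_t/P_{t-1})$, the calculation could look as follows in Python. The function names and the price list are illustrative and are not taken from the Wind data.

```python
import math

def linear_returns(prices):
    """Linear (simple) returns: P_t / P_{t-1} - 1."""
    return [cur / prev - 1.0 for prev, cur in zip(prices, prices[1:])]

def compounded_returns(prices):
    """Compounded (log) returns: ln(P_t / P_{t-1})."""
    return [math.log(cur / prev) for prev, cur in zip(prices, prices[1:])]

def linear_to_compounded(r):
    """The two definitions are linked by R = ln(1 + r)."""
    return math.log1p(r)

# Hypothetical closing prices, for illustration only.
closes = [1000.0, 1020.0, 990.0, 1005.0]
lin = linear_returns(closes)
comp = compounded_returns(closes)

# The extremes quoted from table 1 are linked by the same identity.
max_comp = linear_to_compounded(0.12480)   # compare with 11.761%
min_comp = linear_to_compounded(-0.18549)  # compare with -20.516%
```

Because $\ln(1 + r)$ is increasing in $r$, the largest and smallest compounded returns fall on the same days as the largest and smallest linear returns.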

Stationarity Test
Before building a time series model, it is necessary to confirm that the series is stationary. There are two main stationarity test methods and both are introduced below [11].
Graph test method: Stationary time series have constant mean and variance, which means that the time series diagram should show that the series always fluctuates around a constant value, without obvious trend and periodicity. In addition, the autocorrelation plot of stationary time series usually decays to 0 in the short term.
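The autocorrelation part of the graph test can be sketched in a few lines of Python. This is an illustrative implementation of the usual sample autocorrelation estimator with the approximate band of two standard deviations, $2/\sqrt{n}$; the trending series is hypothetical and only shows what slow decay looks like.

```python
import math

def sample_acf(x, max_lag):
    """Sample autocorrelation coefficients for lags 1..max_lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev) / n
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / n / c0
        for k in range(1, max_lag + 1)
    ]

def two_sigma_band(n):
    """Approximate two-standard-deviation band for a white noise ACF."""
    return 2.0 / math.sqrt(n)

# Hypothetical series with a strong trend: its ACF decays slowly.
trend = [float(t) for t in range(100)]
acf_trend = sample_acf(trend, 10)
outside = [abs(r) > two_sigma_band(len(trend)) for r in acf_trend]
```

For a stationary series most entries of `outside` would turn false after a few lags, whereas for the trending series every one of the first ten lags stays outside the band.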
DF test/ADF test: The DF test, proposed by the statisticians Dickey and Fuller, is only suitable for testing an AR(1) model, while the ADF test extends it to the AR(p) model. Both are based on the relationship between stationarity and unit roots: the series is judged non-stationary if a characteristic root of the model lies on or outside the unit circle, and stationary if all characteristic roots lie inside it.
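To make the idea concrete, the sketch below computes the statistic of the simplest DF regression, $\Delta x_t = \rho x_{t-1} + \varepsilon_t$, with no constant and no lagged differences. It is a simplified illustration under that assumption, not the ADF routine of any statistical package, and both simulated series are hypothetical. The statistic is compared with Dickey-Fuller critical values (roughly -1.95 at the 5% level for this no-constant case) rather than with the usual t table.

```python
import random

def df_statistic(x):
    """t-ratio of rho in the regression dx_t = rho * x_{t-1} + e_t."""
    lagged = x[:-1]
    diffs = [b - a for a, b in zip(x, x[1:])]
    sxx = sum(v * v for v in lagged)
    rho = sum(v * d for v, d in zip(lagged, diffs)) / sxx
    resid = [d - rho * v for v, d in zip(lagged, diffs)]
    s2 = sum(e * e for e in resid) / (len(diffs) - 1)
    return rho / (s2 / sxx) ** 0.5

random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(500)]

# Hypothetical stationary AR(1) series with coefficient 0.5.
ar1 = [0.0]
for e in noise:
    ar1.append(0.5 * ar1[-1] + e)

# Hypothetical random walk (unit root) driven by the same shocks.
walk = [0.0]
for e in noise:
    walk.append(walk[-1] + e)

stat_ar1 = df_statistic(ar1)
stat_walk = df_statistic(walk)
```

A strongly negative statistic, as for the stationary AR(1) series, leads to rejection of the unit-root null hypothesis.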

White noise test
Modeling is only worthwhile if the time series is not a purely random fluctuation, so the series next undergoes a white noise test. According to Bartlett's theorem, for a white noise sequence with $n$ observations $\{x_t, t = 1, 2, \ldots, n\}$, the sample autocorrelation coefficient at any non-zero lag approximately follows a normal distribution with mean zero and variance $1/n$, which can be written as $\hat{\rho}_k \,\dot{\sim}\, N(0, 1/n), \ \forall k \neq 0$.
From this, the hypotheses for testing whether a sequence is white noise are $H_0: \rho_1 = \rho_2 = \cdots = \rho_m = 0, \ \forall m \geq 1$ and $H_1$: at least one $\rho_k \neq 0$, $\forall m \geq 1, \ k \leq m$. To test this joint hypothesis, the Q statistic for large samples and the LB statistic for small samples can be constructed, both of which approximately follow the chi-square distribution with $m$ degrees of freedom, namely:
$$Q = n \sum_{k=1}^{m} \hat{\rho}_k^2 \ \dot{\sim}\ \chi^2(m), \qquad LB = n(n+2) \sum_{k=1}^{m} \frac{\hat{\rho}_k^2}{n-k} \ \dot{\sim}\ \chi^2(m).$$
If the statistic is greater than the $1 - \alpha$ quantile of the chi-square distribution with $m$ degrees of freedom, the sequence is considered a non-white noise sequence.
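A minimal sketch of the two statistics, assuming the standard forms $Q = n\sum_{k=1}^{m}\hat{\rho}_k^2$ and $LB = n(n+2)\sum_{k=1}^{m}\hat{\rho}_k^2/(n-k)$, is given below. The simulated series is hypothetical, and the critical value 12.592 is the tabulated 0.95 quantile of the chi-square distribution with 6 degrees of freedom, matching a lag-6 test.

```python
import random

def sample_acf(x, max_lag):
    """Sample autocorrelation coefficients for lags 1..max_lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev) / n
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / n / c0
        for k in range(1, max_lag + 1)
    ]

def q_statistic(x, m):
    """Q statistic: n times the sum of squared autocorrelations."""
    n = len(x)
    return n * sum(r * r for r in sample_acf(x, m))

def lb_statistic(x, m):
    """LB statistic: small-sample correction of the Q statistic."""
    n = len(x)
    rho = sample_acf(x, m)
    return n * (n + 2) * sum(r * r / (n - k) for k, r in enumerate(rho, start=1))

CHI2_95_DF6 = 12.592  # tabulated 0.95 quantile, chi-square with 6 d.o.f.

# Hypothetical autocorrelated series (AR(1) with coefficient 0.8).
random.seed(1)
series = [0.0]
for _ in range(199):
    series.append(0.8 * series[-1] + random.gauss(0.0, 1.0))

q6 = q_statistic(series, 6)
lb6 = lb_statistic(series, 6)
reject_white_noise = lb6 > CHI2_95_DF6
```

Since $(n+2)/(n-k) > 1$ for every lag, the LB statistic is always larger than the Q statistic on the same data.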

Definition of the ARMA(p,q) model
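The ARMA model, also known as the Autoregressive Moving Average model, was proposed by Box and Jenkins and has the following structure:
$$x_t = \phi_0 + \phi_1 x_{t-1} + \cdots + \phi_p x_{t-p} + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \cdots - \theta_q \varepsilon_{t-q},$$
where $\phi_p \neq 0$, $\theta_q \neq 0$, and $\{\varepsilon_t\}$ is a zero-mean white noise sequence with variance $\sigma_\varepsilon^2$. This model is called a centralized ARMA(p,q) model when $\phi_0 = 0$, in which case it can be abbreviated as:
$$x_t = \phi_1 x_{t-1} + \cdots + \phi_p x_{t-p} + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \cdots - \theta_q \varepsilon_{t-q}.$$
After introducing the backshift operator $B$, the ARMA(p,q) model can be written as $\Phi(B) x_t = \Theta(B) \varepsilon_t$, where $\Phi(B) = 1 - \phi_1 B - \cdots - \phi_p B^p$ and $\Theta(B) = 1 - \theta_1 B - \cdots - \theta_q B^q$.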
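The recursion behind the ARMA(p,q) structure can be sketched as follows. The sketch assumes the convention in which the moving-average terms enter with a minus sign, treats all pre-sample values as zero, and uses hypothetical coefficients; the function name `arma_path` is illustrative.

```python
def arma_path(phi, theta, eps, phi0=0.0):
    """Generate x_t = phi0 + sum(phi_i * x_{t-i}) + eps_t - sum(theta_j * eps_{t-j}).

    phi   : [phi_1, ..., phi_p]
    theta : [theta_1, ..., theta_q]
    eps   : sequence of shocks; values before the sample are taken as zero
    """
    x = []
    for t, e in enumerate(eps):
        ar = sum(c * x[t - 1 - i] for i, c in enumerate(phi) if t - 1 - i >= 0)
        ma = sum(c * eps[t - 1 - j] for j, c in enumerate(theta) if t - 1 - j >= 0)
        x.append(phi0 + ar + e - ma)
    return x

# Response of a hypothetical ARMA(1,1) model to a single unit shock.
impulse = arma_path(phi=[0.5], theta=[0.4], eps=[1.0, 0.0, 0.0, 0.0])
```

With a single unit shock the path is the impulse response of the model: the second value equals $\phi_1 - \theta_1$ and later values decay geometrically at rate $\phi_1$.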

Significance test of the model
After the AIC and BIC criteria are used to determine the order of the model, the validity of the model needs to be checked, that is, whether the model has extracted all the relevant information from the series. If it has, the fitted residual sequence should be a white noise sequence.
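Order selection by information criteria can be sketched as below, assuming the common definitions $AIC = -2\ln L + 2k$ and $BIC = -2\ln L + k\ln n$, where $k$ is the number of estimated parameters (software packages differ on whether the noise variance is counted). The log-likelihoods, parameter counts and sample size are hypothetical and are not the values behind table 4.

```python
import math

def aic(loglik, k):
    """Akaike information criterion."""
    return -2.0 * loglik + 2.0 * k

def bic(loglik, k, n):
    """Bayesian information criterion for n observations."""
    return -2.0 * loglik + k * math.log(n)

# Hypothetical (log-likelihood, parameter count) pairs, NOT from table 4.
candidates = {
    "AR(2)": (-150.0, 3),
    "ARMA(2,1)": (-147.0, 4),
    "ARMA(2,2)": (-146.8, 5),
}
n_obs = 100  # hypothetical sample size

best_aic = min(candidates, key=lambda name: aic(*candidates[name]))
best_bic = min(candidates, key=lambda name: bic(*candidates[name], n_obs))
```

Both criteria reward a higher likelihood and penalize extra parameters; BIC penalizes them more heavily once $\ln n > 2$, so the two criteria can disagree on close calls.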

Significance test of parameters
To simplify the model, a significance test is also performed on the parameters, and only the parameters that have a significant impact on the dependent variable are retained. The null and alternative hypotheses are $H_0: \beta_j = 0$ and $H_1: \beta_j \neq 0$, $\forall 1 \leq j \leq m$. Under the normality assumption and the null hypothesis, the least squares estimate $\hat{\beta}_j$ of the $j$-th parameter follows a normal distribution with mean 0 and variance $a_{jj}\sigma_\varepsilon^2$, where $a_{jj}$ is the $j$-th element on the diagonal of the matrix $(X'X)^{-1}$. The t-test statistic can then be constructed as:
$$T = \frac{\hat{\beta}_j}{\sqrt{a_{jj}\,\hat{\sigma}_\varepsilon^2}} \sim t(n-m),$$
where $\hat{\sigma}_\varepsilon^2$ is the residual variance estimated with $n - m$ degrees of freedom. If the absolute value of the statistic is greater than the $1 - \alpha/2$ quantile of the t distribution with $n - m$ degrees of freedom, the null hypothesis is rejected. That is to say, the parameter is considered significantly non-zero and the model should retain it.
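The decision rule can be sketched as follows, assuming the statistic takes the form $T = \hat{\beta}_j / \sqrt{a_{jj}\hat{\sigma}_\varepsilon^2}$. Because the sample here has several hundred observations, the sketch replaces the t quantile with the standard normal quantile, which is an approximation that is only reasonable for large degrees of freedom. All numbers are hypothetical and are not taken from table 5.

```python
from statistics import NormalDist

def t_statistic(beta_hat, a_jj, sigma2_hat):
    """Estimate divided by its standard error sqrt(a_jj * sigma2_hat)."""
    return beta_hat / (a_jj * sigma2_hat) ** 0.5

def is_significant(t_stat, alpha=0.05):
    """Two-sided test using the normal approximation to the t distribution."""
    critical = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return abs(t_stat) > critical

# Hypothetical inputs: two estimates sharing the same standard error of 0.2.
t_large = t_statistic(beta_hat=0.5, a_jj=0.01, sigma2_hat=4.0)
t_small = t_statistic(beta_hat=0.009, a_jj=0.01, sigma2_hat=4.0)
keep_large = is_significant(t_large)
keep_small = is_significant(t_small)
```

An estimate that is small relative to its standard error, like the second one, fails the test and is a candidate for removal from the model.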

Forecast
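Next, following Box and Jenkins, the linear minimum variance forecast is used to estimate the value of a future period and obtain a 95% confidence interval. For a centralized ARMA(p,q) model with autoregressive coefficients $\phi_i$ and moving-average coefficients $\theta_i$ (entering with a minus sign), the predicted value $l$ periods ahead can be written as:
$$\hat{x}_t(l) = \begin{cases} \phi_1 \hat{x}_t(l-1) + \cdots + \phi_p \hat{x}_t(l-p) - \sum_{i=l}^{q} \theta_i \varepsilon_{t+l-i}, & l \leq q \\ \phi_1 \hat{x}_t(l-1) + \cdots + \phi_p \hat{x}_t(l-p), & l > q \end{cases} \qquad (6)$$
In this equation,
$$\hat{x}_t(k) = \begin{cases} \hat{x}_t(k), & k \geq 1 \\ x_{t+k}, & k \leq 0 \end{cases}$$
The variance of the predicted value is:
$$Var[e_t(l)] = (G_0^2 + G_1^2 + \cdots + G_{l-1}^2)\,\sigma_\varepsilon^2,$$
where $e_t(l)$ is the $l$-step forecast error and $G_j$ are the coefficients of the infinite moving-average representation (Green's function) of the model, with $G_0 = 1$. Finally, it is common to calculate the average error to check the accuracy of the model and to make appropriate decisions according to the future trend of the time series.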
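The forecast recursion and its variance can be sketched as below. The sketch assumes a centralized ARMA(p,q) model whose moving-average terms enter with a minus sign, residuals aligned one-to-one with the observations, and a forecast variance of the form $(G_0^2 + \cdots + G_{l-1}^2)\sigma_\varepsilon^2$ with Green's function coefficients $G_j$. The function names and all numerical inputs are hypothetical.

```python
def arma_forecast(x, eps, phi, theta, steps):
    """Recursive forecasts for a centralized ARMA(p, q) model.

    x     : observed values, most recent last (at least p values)
    eps   : fitted residuals, same length as x and aligned with it
    phi   : [phi_1, ..., phi_p]
    theta : [theta_1, ..., theta_q], entering with a minus sign
    """
    n, q = len(x), len(theta)
    path = list(x)  # observations followed by forecasts
    for l in range(1, steps + 1):
        ar = sum(c * path[n + l - 2 - i] for i, c in enumerate(phi))
        # Only residuals up to time t are known; future shocks have mean zero.
        ma = sum(theta[i - 1] * eps[n - 1 + l - i] for i in range(l, q + 1))
        path.append(ar - ma)
    return path[n:]

def green_function(phi, theta, count):
    """First `count` coefficients G_0, G_1, ... of the MA(infinity) form."""
    g = [1.0]
    for j in range(1, count):
        val = sum(phi[k - 1] * g[j - k] for k in range(1, min(j, len(phi)) + 1))
        if j <= len(theta):
            val -= theta[j - 1]
        g.append(val)
    return g

def forecast_variance(phi, theta, sigma2, l):
    """Variance of the l-step forecast error."""
    return sigma2 * sum(g * g for g in green_function(phi, theta, l))

def interval_95(forecast, variance):
    """Approximate 95% interval under normally distributed errors."""
    half = 1.96 * variance ** 0.5
    return forecast - half, forecast + half

# Hypothetical ARMA(1,1) example.
f = arma_forecast(x=[1.0, 2.0], eps=[0.5, 1.0], phi=[0.5], theta=[0.4], steps=3)
v = [forecast_variance([0.5], [0.4], 1.0, l) for l in (1, 2, 3)]
bands = [interval_95(fc, var) for fc, var in zip(f, v)]
```

Because each additional step adds a non-negative term $G_{l-1}^2\sigma_\varepsilon^2$, the variance never decreases with the horizon, which is why the confidence band widens for longer forecasts.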

Stationarity test
First, this paper examines the stationarity of the index closing price series by the graph test method. Figure 1 shows that the time series plot of the closing price series $\{x_t\}$ has no significant non-stationary characteristics, that is, no obvious trend or periodicity. However, figure 2 shows that the autocorrelation coefficient of $\{x_t\}$ falls within two standard deviations only after 30 lags and decays slowly, indicating persistent autocorrelation, which is characteristic of a non-stationary series. It is therefore difficult to judge the stationarity of this series from the plots alone.
Because the graph test is highly subjective, this paper next carries out the ADF test. Table 2 shows that the t statistic of the ADF test is -2.881 and the p-value is significant, which means that the null hypothesis of a unit root is rejected at the 5% significance level; that is, the series is considered stationary. The fitted test equation is obtained from the coefficient estimates reported in table 2.

White noise test
The Q statistic and LB statistic at lag 6 and their p-values are calculated (table 3). The results significantly reject the null hypothesis that the sequence is white noise, so the sequence is considered a non-white noise sequence.

Model order determination
Next, this paper plots the partial autocorrelation coefficients of the sequence (figure 3) and reads them together with figure 2. The autocorrelation plot tails off, while the partial autocorrelation plot cuts off after lag 2. This paper therefore considers AR(2), ARMA(2,1) and ARMA(2,2) as candidate models for the data.
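The cut-off pattern used for order determination can be illustrated with the Durbin-Levinson recursion, which turns autocorrelations into partial autocorrelations. The sketch below is an illustration on the theoretical autocorrelations of a hypothetical AR(2) model with coefficients 0.5 and 0.3; it does not use the index data.

```python
def pacf_from_acf(r):
    """Partial autocorrelations for lags 1..len(r) via Durbin-Levinson.

    r[k - 1] holds the lag-k autocorrelation.
    """
    pacf = []
    prev = []  # coefficients phi_{k-1,1}, ..., phi_{k-1,k-1}
    for k in range(1, len(r) + 1):
        if k == 1:
            phi_kk = r[0]
            cur = [phi_kk]
        else:
            num = r[k - 1] - sum(prev[j] * r[k - 2 - j] for j in range(k - 1))
            den = 1.0 - sum(prev[j] * r[j] for j in range(k - 1))
            phi_kk = num / den
            cur = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)]
            cur.append(phi_kk)
        pacf.append(phi_kk)
        prev = cur
    return pacf

# Theoretical ACF of a hypothetical AR(2) model with phi_1 = 0.5, phi_2 = 0.3.
phi1, phi2 = 0.5, 0.3
acf = [phi1 / (1.0 - phi2)]
acf.append(phi1 * acf[0] + phi2)
for _ in range(4):
    acf.append(phi1 * acf[-1] + phi2 * acf[-2])

pacf = pacf_from_acf(acf)
```

For this AR(2) example the partial autocorrelation at lag 2 equals $\phi_2$ and all later values vanish, while the autocorrelations decay gradually, which is the pattern of tailing off and second-order cut-off described above.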

Significance test of the model
This paper then compares the AIC and BIC values of AR(2), ARMA(2,1), and ARMA(2,2) to determine the order of the model. Table 4 shows that the AIC value of the ARMA(2,1) model is 5060.544, the smallest among the three models, and its BIC value is very close to that of the AR(2) model. Therefore, the ARMA(2,1) model is finally used to fit the time series. As shown in figure 4(c), the white noise test of the residual sequence gives p-values that lie above the dotted line at every lag, that is, greater than 0.05, so the residuals can be regarded as white noise. Figure 4(d) shows that the residuals are densely distributed around the diagonal and can be considered approximately normally distributed. In summary, the ARMA(2,1) model is significantly effective.
Figure 4. Significance Test of the Model

Significance test of parameters
This paper then estimates the coefficients of the ARMA(2,1) model. As shown in table 5, the estimated coefficient of AR(1) is 0.009, the estimated coefficient of AR(2) is 0.953, and the estimated coefficient of MA(1) is 0.999. Except for AR(1), the p-values of the coefficients are significant and the fitting effect is good. The final expression of the model is built from these estimates.
Finally, the variance of the predicted values is calculated to obtain the 80% and 95% confidence intervals. Table 7 shows that the error values are all low and that the fitted values are very close to the real values. As shown in figure 5, the 30-period forecast values completely fall within the 80% confidence interval and almost fall within the 95% confidence interval. The model predicts that the index will continue to decline sharply in the future, which is consistent with the actual trend, but the predicted decline is smaller than the actual decline. In general, the prediction effect of the model is good.
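A forecast evaluation of this kind can be sketched with two simple measures: the mean absolute percentage error (MAPE) and the share of actual values that fall inside a confidence band. The paper does not state which error measure table 7 uses, so MAPE is an assumption here, and all numbers are hypothetical.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)

def coverage(actual, lower, upper):
    """Fraction of actual values lying inside [lower, upper]."""
    inside = sum(1 for a, lo, hi in zip(actual, lower, upper) if lo <= a <= hi)
    return inside / len(actual)

# Hypothetical actual values, forecasts and interval bounds.
actual = [100.0, 200.0, 150.0]
predicted = [110.0, 190.0, 150.0]
lower = [95.0, 160.0, 155.0]
upper = [125.0, 220.0, 170.0]

error_pct = mape(actual, predicted)
share_inside = coverage(actual, lower, upper)
```

A low MAPE together with a coverage close to the nominal confidence level would support the claim that the forecasts and their intervals are reliable.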

Forecast of returns
The linear returns are calculated from the predicted values of the closing price. As shown in figure 6, the forecasted returns over the next 30 periods rise and fall in a similar number of periods, but the declines are larger than the increases, so the overall trend is a sharp decline. As shown in figure 7, the real returns decline in 19 periods and also show a downward trend.

Conclusion
This paper finds that although investors poured into the semiconductor and chip industries in 2021, lifting the share prices of related companies together, 2022 is an adjustment cycle for these industries and it will last longer than previously expected. This paper uses the ARMA(2,1) model to predict the closing price of the SSE STAR Market Chip Index. According to the final expression of the best model, the chip index will still decline sharply in the next 30 periods, indicating that the demand side of the semiconductor and chip industries will continue to tighten, and that the lack of consumer purchasing enthusiasm will not ease anytime soon.
Based on the empirical results of the model, this paper proposes the following two suggestions for industry development and investment, so that investors can better cope with the stock market downturn brought about by the slowdown in the growth of the semiconductor and chip industries.
Firstly, in the face of industry differentiation, investors need to return to rationality. Demand has weakened to varying degrees, so downstream industries differ in prosperity. For example, chips in the general and consumer electronics fields are affected more seriously and their price reductions are more obvious, while automotive chips and semiconductor equipment and materials remain relatively strong. This shows that prices in this industry sector are being pulled back to reasonable valuations.
Secondly, semiconductor companies and chip companies should pay attention to mergers and acquisitions. In recent years, the capital market has been optimistic about these industries. Large numbers of entrepreneurs have poured in, making homogeneous competition between enterprises fierce, which will push some companies facing operational difficulties to seek opportunities for mergers and acquisitions. A wave of mergers and acquisitions is expected in the domestic semiconductor industry in the near future.
There are still some shortcomings in this paper. Although the predicted values are consistent with the future trend of the real values, the prediction error grows as the forecast horizon lengthens. In addition, the ARMA model alone cannot determine the specific factors that affect the returns. For follow-up studies, this paper recommends using the Granger causality test to explore the relationship between the rate of return and other microeconomic and macroeconomic indicators.