Research on the investment strategy of new energy vehicle industry based on multi-factor stock selection

With the development of the Chinese stock market and the continuous improvement of financial engineering technology, quantitative investment, represented by the multi-factor stock selection strategy, has been growing steadily in the Chinese stock market. This paper takes the quarterly data of the China Securities New Energy Vehicle Index from January 2017 to March 2021 as the research object, uses the Fama-Macbeth test to select factors, and constructs a regression equation by multiple linear regression. The regression equation is used to predict the stock return of the next quarter from the factor data of each quarter in 2020. The 7 stocks with the highest predicted returns are selected to construct an equal-weighted portfolio, and the portfolio is updated quarterly. Through the empirical test, this paper draws three main conclusions. First, the driving factors of stock return in the new energy vehicle industry mainly include the ratio of long-term debt to working capital (LTDWC), total profit growth rate (TPG), price/earnings to growth ratio (PEG), and equity turnover ratio (ET). Second, the return of the portfolio based on the stock selection model in this paper is higher than the industry average and is relatively stable, which makes the portfolio a suitable investment target. Third, the new energy vehicle industry is in a period of rapid development, and the portfolio return is higher than the average level of the A-share market, so the industry has huge potential investment value. This paper provides suggestions and solutions for investors' stock investment in the new energy vehicle industry.


Introduction
The multi-factor stock selection model has become a hot spot in academic research and practical application in quantitative investment strategy due to its high stability and wide coverage. Under the background of the strong randomness of the rise and fall of the Chinese stock market, the main content of this paper is how to combine the specific industries and use the multi-factor stock selection model to provide investors with a higher return portfolio.
The multi-factor stock selection model is mainly based on the three-factor model proposed by Fama and French, which is composed of a market factor, a value factor, and a size factor, and it has been continually extended with new factors and new model constructions [1]. Griffin ranked portfolios by return, took the return difference between the high-ranked and low-ranked portfolios as a momentum factor, and thus constructed a four-factor model. His empirical research concluded that the momentum effect can achieve better returns regardless of the macroeconomic situation [2]. Fama and French introduced an investment factor and a profitability factor and proposed a five-factor model by expanding the set of independent variables [3]. Zhao Zhihui theoretically derived a binary-classification multi-factor stock selection model and, combining market experience with mathematical mining knowledge, proposed a multi-factor stock selection model with three-layer filtering [4]. Ouyang Zhigang constructed a four-factor model with historical CSI 300 data as samples and verified the reversal effect and the momentum effect of stocks in the trading market, respectively. The research found that the four-factor model with the momentum factor lagged by 6 months could be applied to the domestic financial market [5]. Liu Shuai filled in the missing data of the Shanghai Stock Exchange 50 Index from the two aspects of A-share stock characteristics and accounting information quality, constructed a new multi-factor stock selection model suited to the A-share market, scored more than 500 market factors one by one with the multi-factor strategy, and carried out an empirical analysis. The results show that the return of the new model is far higher than that of the index over the same period [6].
The multi-factor stock selection model needs appropriate factors to judge whether a stock is worth investing in. On the construction of the factor pool, Liu Yang, Xia Siyu, and Hu Sirui conducted an empirical study of the GARP quantitative stock selection model and a quantitative timing strategy based on the constituent stock data of the Shanghai Composite Index from 2012 to 2015. They found that not all growth attributes are driving factors of stock return; only six indicators, such as the price/earnings ratio and the price/book ratio, were effective stock selection factors at that stage [7]. Liu Shujun subdivided the market value factor when testing the effectiveness of value investment in the A-share market and took the price/earnings ratio as the index of the investment value of listed groups and enterprises [8]. Che Yang considered three factor analysis methods together with three stock classification prediction methods and analyzed the strategy performance of the different combinations of factor screening and stock classification methods [9]. Shi Yue considered the impact of market sentiment, introduced sentiment factors, and combined them with fundamental factors to verify the effectiveness of both kinds of factors in stock selection [10]. Zhang Tiancheng used multiple linear regression to analyze the factors influencing stock return. The results showed significant differences between the independent determinants identified by the multi-factor model and the significant factors found by the single-factor model [11].
For the construction of the stock selection model, machine learning algorithms are the main research trend. Zhang Weinan, Lu Tongyu, and Sun Jianming took the data of CSI 500 constituent stocks as the empirical research object, aimed at improving the predictive ability of the model, and verified that using a support vector machine to predict and classify stocks can improve the precision of stock selection [12]. Zhang Hu, Shen Hanlei, and Liu Yecheng constructed a stock selection model based on a multi-head self-attention neural network structure and, using sample data of CSI 300 stocks, verified the model's superiority in predictive performance, profitability, and risk [13]. Zhang Ning, Shi Hongwei, and Zheng Lang studied the application of the PCANet deep architecture to quantitative stock selection. Backtesting on two years of actual data showed that this method could obtain a higher excess return and Sharpe ratio than the traditional linear regression model and a deep learning CNN model [14].
At present, research on stock selection models tends to take the constituent stocks of broad indexes as the research object, while research on multi-factor stock selection for new energy vehicles is scarce. This paper addresses that gap. Taking 37 new energy vehicle sector constituent stocks in China's A-share market as the research object, this paper selects a number of financial indicators to construct a factor pool and, by showing that the resulting stock selection strategy can beat the industry average, confirms the rationality and practicability of the model.

Data
The data in this paper are all from the Ricequant platform, and all the constituent stocks (50) of the China Securities New Energy Vehicle Index from January 2017 to March 2021 are taken as the initial sample. Python is used to preprocess the data, excluding ST stocks, restructuring stocks, delisted stocks, and newly added stocks, as well as stocks with a large amount of missing data. The remaining 37 stocks are used as the research sample. Then, the authors obtain 12 indicators for each stock in each quarter of the period: quick ratio, long-term debt to working capital ratio, current ratio, total assets turnover, equity turnover ratio, book-to-market ratio, fixed asset turnover, net asset per share, non-current assets ratio, total profit growth, revenue growth, and price/earnings to growth ratio. Finally, extreme values are removed from these indicators and the indicators are standardized.
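The paper does not state how extreme values are removed or how the indicators are standardized. The following is a minimal sketch under two assumptions that are not taken from the source: extreme values are clipped to the median plus or minus three times the median absolute deviation (MAD), and standardization is a cross-sectional z-score. The function names and the sample values are hypothetical.

```python
import numpy as np

def clip_by_mad(values, k=3.0):
    """Clip values to median +/- k * MAD (assumed rule, not from the paper)."""
    x = np.asarray(values, dtype=float)
    median = np.median(x)
    mad = np.median(np.abs(x - median))
    if mad == 0:
        return x.copy()  # no dispersion to measure, leave values unchanged
    return np.clip(x, median - k * mad, median + k * mad)

def standardize(values):
    """Cross-sectional z-score: zero mean and unit standard deviation."""
    x = np.asarray(values, dtype=float)
    std = x.std()
    if std == 0:
        return np.zeros_like(x)
    return (x - x.mean()) / std

# Made-up values of one indicator for eight stocks in one quarter.
raw = np.array([0.8, 1.1, 0.9, 1.3, 1.0, 25.0, 1.2, 0.7])
clipped = clip_by_mad(raw)
z = standardize(clipped)
print(clipped)
print(z)
```

Clipping before standardizing keeps a single outlier (25.0 here) from dominating the mean and standard deviation used in the z-score.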

Fama-Macbeth Test
The Fama-Macbeth model is used as a single factor screening method. The steps are as follows: (a) for each period, the regression coefficient is estimated by a linear regression of the stock return lagged by one period (that is, the return of period t+1) on the single factor value of period t; (b) the t statistic of the time series of regression coefficients is calculated and compared with the critical value.
In the regression, we use the stock return of period t+1 as the dependent variable and the factor value of the current period t as the independent variable, as shown in formula 1:

$$y_{t+1} = a_t + b_t x_t + e_t \tag{1}$$

In the formula, $y_{t+1}$ is the stock return of period $t+1$, $x_t$ is the factor value of period $t$, $a_t$ is the constant term, $b_t$ is the coefficient value, and $e_t$ is the error term.
After calculating the coefficient $b_t$ of each period for each factor, we perform the Fama-Macbeth test. In this case, we need to calculate the t statistic, as shown in formula 2:

$$t = \frac{\mu(b_t)}{\sigma(b_t)/\sqrt{T}} \tag{2}$$

In this model, $\mu(b_t)$ is the mean value of the $b_t$ sequence, $\sigma(b_t)$ is the standard deviation of the $b_t$ sequence, and $T$ is the number of periods. After all the factors are regressed, the t statistic of each factor is calculated. To avoid missing factors as much as possible, the significance level alpha of the two-sided test is set to 0.1. In this case, the corresponding critical value of t is 1.78, which means that a factor passes the test if the absolute value of its t statistic is greater than 1.78.
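The two steps of the single factor test can be sketched in Python as follows. This is an illustration on simulated data, not the authors' code: the function name `fama_macbeth_t`, the simulated factor and return arrays, and the use of the sample standard deviation (`ddof=1`) are assumptions. The t statistic is computed as the mean of the per-period slopes divided by their standard error.

```python
import numpy as np

def fama_macbeth_t(factor, next_returns):
    """Single factor Fama-Macbeth test.

    factor:       array of shape (T, N), factor values in periods 1..T
    next_returns: array of shape (T, N), row t holds the returns of period t+1
    Returns the per-period slopes b_t and their t statistic.
    """
    factor = np.asarray(factor, dtype=float)
    next_returns = np.asarray(next_returns, dtype=float)
    slopes = []
    for x, y in zip(factor, next_returns):
        # Cross-sectional regression y = a + b * x; polyfit returns [b, a].
        b, a = np.polyfit(x, y, 1)
        slopes.append(b)
    slopes = np.array(slopes)
    n_periods = len(slopes)
    t_stat = slopes.mean() / (slopes.std(ddof=1) / np.sqrt(n_periods))
    return slopes, t_stat

# Simulated data: 12 quarters, 37 stocks (sizes follow the paper, values do not).
rng = np.random.default_rng(0)
T, N = 12, 37
factor = rng.normal(size=(T, N))
next_returns = 0.05 * factor + rng.normal(scale=0.1, size=(T, N))

slopes, t_stat = fama_macbeth_t(factor, next_returns)
passes = abs(t_stat) > 1.78  # critical value quoted in the text
print(t_stat, passes)
```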

Regression Method
In the regression method, the stock return with one lag period is used as the dependent variable. The current value of the factor is used as the independent variable to construct the multiple linear regression equation. In the back-testing stage, the multiple linear regression equation is used to calculate the expected rate of return of each stock with the current value of each factor so that we can construct the portfolio. The steps are as follows:

Modeling Process
Because the factors that pass the single factor validity test may be correlated with one another, we first eliminate similar factors with high correlation. Then, a multiple linear regression is performed on the remaining factors, as shown in formula 3:

$$y_{t+1} = \beta_0 + \beta_1 x_{1t} + \beta_2 x_{2t} + \cdots + \beta_k x_{kt} + \varepsilon_t \tag{3}$$

Then, the factor values and the stock return data lagged by one period are substituted into the regression equation to calculate the coefficient values $\beta_k$.
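As an illustration of how the coefficients of the multiple linear regression can be estimated, the sketch below fits an intercept and k slopes by ordinary least squares with `numpy.linalg.lstsq`. The function name `fit_multifactor` and the simulated data are assumptions, and the example is noise-free so that the fitted coefficients can be compared directly with the ones used to generate the data.

```python
import numpy as np

def fit_multifactor(X, y):
    """Ordinary least squares fit of y = beta_0 + X @ beta_1..k.

    X: array of shape (n_obs, k), factor values in period t
    y: array of shape (n_obs,), stock returns in period t+1
    Returns the coefficient vector [beta_0, beta_1, ..., beta_k].
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(X)), X])  # prepend intercept column
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta

# Simulated, noise-free data with four factors (hypothetical coefficients).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))
true_beta = np.array([0.02, -0.03, 0.04, -0.01, 0.02])
y = true_beta[0] + X @ true_beta[1:]

beta = fit_multifactor(X, y)
print(beta)
```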

Back-Testing Process
Suppose that the regression relationship still holds in the next period. In that case, the factor values of the latest period can be substituted into the regression equation to calculate the expected return of the next period. All the stocks are then sorted by expected return, and the top-ranked stocks are selected as the portfolio to hold. In this way, n portfolios are obtained by testing n periods. If the return of the portfolios is higher than the average return of all the stocks, the model is considered effective.
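The selection and evaluation step can be sketched as below. The tickers and return figures are made up for illustration, and the helper names `select_top` and `equal_weight_return` are hypothetical. The paper holds the top 7 of 37 stocks, while the toy example holds the top 2 of 5.

```python
def select_top(expected_returns, n=7):
    """Return the n tickers with the highest expected return, best first."""
    ranked = sorted(expected_returns, key=expected_returns.get, reverse=True)
    return ranked[:n]

def equal_weight_return(holdings, realized_returns):
    """Realized return of an equal-weighted portfolio of the given holdings."""
    return sum(realized_returns[s] for s in holdings) / len(holdings)

# Hypothetical expected (model) and realized returns for one quarter.
expected = {"A": 0.12, "B": -0.03, "C": 0.08, "D": 0.01, "E": 0.15}
realized = {"A": 0.10, "B": -0.05, "C": 0.02, "D": 0.04, "E": 0.06}

portfolio = select_top(expected, n=2)
portfolio_return = equal_weight_return(portfolio, realized)
universe_return = sum(realized.values()) / len(realized)

# The model is judged effective in this period if the portfolio beats the average.
print(portfolio, portfolio_return, universe_return)
```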

Results and Discussion
In the model-building stage, this paper first selects 12 factors from four major categories as the primary selection factors. The quarterly factor values of the 12 periods from the first quarter of 2017 to the fourth quarter of 2019 and the quarterly returns of the 12 periods from the second quarter of 2017 to the first quarter of 2020 are selected as the sample data. Among the 12 factors, 6 pass the Fama-Macbeth test. After the factor correlation test and the stepwise regression, two factors are eliminated (one for its high correlation with another factor and one for insignificance), and the remaining four effective factors are used to construct the multiple regression equation.
In the empirical test stage, we select the second quarter of 2020 to the first quarter of 2021 as the back-testing period and use the constructed multiple regression equation to screen out the best seven new energy vehicle industry stocks (about 1/5 of the research sample) for each holding period. The holding period covers four consecutive quarters. This paper calculates the return of the dynamic portfolio and compares it with the average return of the China Securities New Energy Vehicle Index and the average return of the A-share market. It also backtests the stock portfolio and calculates its maximum drawdown and Sharpe ratio to verify the model's effectiveness.

Determining Class Factors
Considering the current situation of the new energy vehicle market and economic logic, we select four categories of factors (debt repayment factor, asset operation factor, value factor, and growth factor) so that factors with strong predictive ability for stock return can be selected in the next step.

Determining Subdivision Factors
After the four categories of factors closely related to the new energy vehicle industry are determined, three representative factors are selected for each category, giving 12 factors in total. They are shown in Table Ⅰ. A total of 6 factors pass the single factor test: Long Term Debt to Working Capital ratio (LTDWC), Total Assets Turnover ratio (TAT), Equity Turnover ratio (ET), Non-Current Assets ratio (NCA), Total Profit Growth rate (TPG), and Price/Earnings to Growth ratio (PEG).

Redundancy Factor Elimination
Because there is more than one factor of the same type, some factors are inevitably highly correlated and have similar effects. To avoid this redundancy, we perform a correlation test among the factors that pass the FM test and eliminate the highly correlated factors according to the results. The results are shown in Table Ⅲ. According to the correlation coefficient table, the correlation between TAT and ET is very high, and the correlation among the other factors is relatively low. Therefore, one of these two factors should be eliminated. Each of the two factors is regressed separately on the stock return lagged by one period. According to the regression test results of the two single factors, ET performs better, so TAT is eliminated. After excluding TAT, 5 factors remain.
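The correlation screen can be sketched as follows. The paper does not report the threshold it uses, so the value 0.8 is an assumption, as are the function name `highly_correlated_pairs` and the simulated factor data (in which ET is deliberately generated as a noisy copy of TAT).

```python
import numpy as np

def highly_correlated_pairs(data, names, threshold=0.8):
    """List factor pairs whose absolute Pearson correlation exceeds threshold.

    data:  array of shape (n_obs, n_factors), one column per factor
    names: factor names in column order
    """
    corr = np.corrcoef(np.asarray(data, dtype=float), rowvar=False)
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) > threshold:
                pairs.append((names[i], names[j], float(corr[i, j])))
    return pairs

# Simulated factor values; only TAT and ET are constructed to be correlated.
rng = np.random.default_rng(2)
n = 300
tat = rng.normal(size=n)
et = tat + rng.normal(scale=0.1, size=n)
ltdwc = rng.normal(size=n)
peg = rng.normal(size=n)
data = np.column_stack([ltdwc, tat, et, peg])

pairs = highly_correlated_pairs(data, ["LTDWC", "TAT", "ET", "PEG"])
print(pairs)
```

For each flagged pair, one factor is then dropped; the paper keeps the one with the better single factor regression result.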

Regression Results
After the single factor test and the redundancy factor elimination, five factors remain. The panel data stability test shows that the panel data of the five factors are stable. Next, stepwise regression is used to eliminate the factors that are not significant and to build the multi-factor stock selection model. The factors remaining in the equation are LTDWC, TPG, PEG, and ET. The t value and P value of LTDWC are -2.31 and 0.01, respectively; those of TPG are 2.12 and 0.02; those of PEG are -1.93 and 0.05; and those of ET are 1.95 and 0.04.
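The paper does not describe its stepwise procedure in detail. The sketch below shows one common variant, backward elimination, which is an assumption rather than the authors' method: the factor with the smallest absolute t value is dropped repeatedly until every remaining factor clears a threshold. The threshold of 1.65, the function names, and the simulated data are all hypothetical.

```python
import numpy as np

def ols_t_values(X, y):
    """OLS coefficients and their t values (intercept first)."""
    design = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    dof = design.shape[0] - design.shape[1]
    sigma2 = resid @ resid / dof                      # residual variance
    cov = sigma2 * np.linalg.inv(design.T @ design)   # coefficient covariance
    return beta, beta / np.sqrt(np.diag(cov))

def backward_eliminate(X, y, names, t_threshold=1.65):
    """Drop the weakest factor until all remaining |t| >= t_threshold."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    names = list(names)
    while names:
        _, t = ols_t_values(X, y)
        factor_t = np.abs(t[1:])                      # skip the intercept
        weakest = int(np.argmin(factor_t))
        if factor_t[weakest] >= t_threshold:
            break
        X = np.delete(X, weakest, axis=1)
        names.pop(weakest)
    return names

# Simulated data: four factors drive returns, NCA is pure noise.
rng = np.random.default_rng(3)
names = ["LTDWC", "ET", "NCA", "TPG", "PEG"]
X = rng.normal(size=(400, 5))
y = (-0.5 * X[:, 0] + 0.5 * X[:, 1] + 0.5 * X[:, 3] - 0.5 * X[:, 4]
     + rng.normal(scale=0.1, size=400))

kept = backward_eliminate(X, y, names)
print(kept)
```

Whether the noise factor is dropped in a given run depends on the random draw; the four factors with real explanatory power are always retained in this setup.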
Finally, the factors that pass the test are LTDWC, TPG, PEG, and ET. The coefficient of each factor is shown in Table Ⅴ (C is a constant).

Build a dynamic stock portfolio
This paper assigns equal weight to each stock and applies the multiple linear regression equation to the corresponding data of each quarter to form a dynamic stock portfolio. The portfolio is held from the second quarter of 2020 to the first quarter of 2021, a total holding period of 4 quarters. In other words, the dynamic portfolio consists of 4 static portfolios, and its composition is adjusted from quarter to quarter. The specific changes are as follows:

Portfolio 1 (held in the second quarter of 2020)
Substituting the actual factor values of the first quarter of 2020 into the regression equation gives the expected return of each stock in the second quarter of 2020. After ranking the stocks by expected return from high to low, the top 7 stocks are used to construct the initial stock portfolio 1. The stock selection results, ranked by expected return, are shown in Table Ⅵ. The 7 stocks of portfolio 1 should be opened in the second quarter of 2020.

Portfolio 2 (held in the third quarter of 2020)
Substituting the actual factor values of the second quarter of 2020 into the regression equation gives the expected return of each stock in the third quarter of 2020. The top 7 stocks by expected return have changed, and the stocks that should be held change accordingly. The top 7 stocks by expected return in the third quarter of 2020 are shown in Table Ⅶ. GOTION, NBSS, SUNWODA, GANFENG LITHI, and CHANGAN AUTOM, which have fallen out of the top seven, should be closed. YTCO, DFAC, WOLONG ELECTR, CAPCHEM, and Yahua Group enter the top seven and become new holdings in the third quarter of 2020.
The 7 stocks of portfolio 2 should be opened in the third quarter of 2020.

Portfolio 3 (held in the fourth quarter of 2020)
Substituting the actual factor values of the third quarter of 2020 into the regression equation gives the expected return of each stock in the fourth quarter of 2020. The top 7 stocks have changed again. The top 7 stocks by expected return in the fourth quarter of 2020 are shown in Table Ⅷ. Of the previous holdings, only Yahua Group and WOLONG ELECTR remain in the top 7. These two stocks should be kept, and the other five stocks, which no longer rank in the top 7, should be liquidated. At the same time, NBSS, DMEGC, Zhong Ke San, LEAD INTELLIG, and SAIC MOTOR, whose expected returns rise into the top 7, become the new holdings in the fourth quarter of 2020.
The 7 stocks of portfolio 3 should be opened in the fourth quarter of 2020.

Portfolio 4 (held in the first quarter of 2021)
Substituting the actual factor values of the fourth quarter of 2020 into the regression equation gives the expected return of each stock in the first quarter of 2021. The top 7 stocks have changed, and the stocks that should be held change accordingly. The top 7 stocks by expected return in the first quarter of 2021 are shown in Table Ⅸ. LEAD INTELLIG, SAIC MOTOR, and NBSS should be retained in the portfolio, and all the remaining stocks, whose expected returns have fallen out of the top seven, should be liquidated. Correspondingly, EASPRING, XTC, SUNWODA, and GEM, whose expected returns rise into the top seven, become the new holdings in the first quarter of 2021. The 7 stocks of portfolio 4 should be opened in the first quarter of 2021.

FIBA 2022, Volume 26 (2022)
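The quarterly rebalancing amounts to comparing two sets of holdings. The sketch below applies that comparison to the transition from portfolio 3 to portfolio 4, using the stock names given in the text; the helper name `rebalance` is hypothetical.

```python
def rebalance(previous, current):
    """Split a portfolio change into positions to keep, close, and open."""
    previous, current = set(previous), set(current)
    return {
        "keep": sorted(previous & current),
        "close": sorted(previous - current),
        "open": sorted(current - previous),
    }

# Holdings as described in the text for Q4 2020 and Q1 2021.
portfolio_3 = ["Yahua Group", "WOLONG ELECTR", "NBSS", "DMEGC",
               "Zhong Ke San", "LEAD INTELLIG", "SAIC MOTOR"]
portfolio_4 = ["LEAD INTELLIG", "SAIC MOTOR", "NBSS", "EASPRING",
               "XTC", "SUNWODA", "GEM"]

trades = rebalance(portfolio_3, portfolio_4)
for action, names in trades.items():
    print(action, names)
```

Because the portfolio is equal-weighted, each of the 7 positions receives one seventh of the capital after every rebalance.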

Comparison of the investment portfolio
The portfolio return is calculated with the equal weight method and compared with the average return of the new energy vehicle market, as shown in Table Ⅹ. The dynamic portfolio constructed in this article outperforms the average return of the new energy vehicle industry in all four holding quarters and obtains a certain amount of excess return.
We use the same method to compare the stock portfolio return with the average return of the A-share market. This article uses the quarterly return of the CSI 300 index as the average return of the A-share market, as shown in Table Ⅺ. A more direct comparison can be drawn from the data in that table, as shown in Figure 2. As Figure 2 shows, the dynamic portfolio constructed in this article outperformed the average return of the A-share market in the first three holding periods (the second, third, and fourth quarters of 2020). In the last period (the first quarter of 2021), it underperformed the market, losing 9.53% against the market's loss of 3.13%. Although the portfolio does not outperform the market in every period, its overall return is much greater than the average return of the A-share market, so the model remains effective. In addition, owing to the promotion of national policies, the steady development of the hydrogen fuel supply system, and the leadership of the three major markets of China, Europe, and the United States, the new energy vehicle industry is in a golden age of rapid growth in demand. The industry predicts that pure electric vehicles will become the mainstream of vehicle sales within 15 years, so the potential investment value of the new energy vehicle industry is huge.

Fig. 3 Portfolio backtest results
The authors backtested the stock portfolio in Python. As shown in Figure 3, the portfolio was adjusted four times, at the beginning of each of the four quarters (the second, third, and fourth quarters of 2020 and the first quarter of 2021). Over this backtest interval, the maximum return of the stock portfolio constructed in this article reached 128.26%, the Sharpe ratio was 2.2718, and the maximum drawdown was 17.57%, which is a good performance. This shows that the return on the portfolio is considerable and relatively stable, and that the portfolio is suitable as an investment target for investors.
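The two risk indicators reported for the backtest can be computed as in the sketch below. The paper does not state its risk-free rate, return frequency, or annualization convention, so the defaults here (zero risk-free rate, 252 periods per year, sample standard deviation) are assumptions, and the net asset value series is made up.

```python
import numpy as np

def max_drawdown(nav):
    """Largest peak-to-trough decline of a net asset value series, as a fraction."""
    nav = np.asarray(nav, dtype=float)
    running_peak = np.maximum.accumulate(nav)
    return float(np.max(1.0 - nav / running_peak))

def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=252):
    """Annualized Sharpe ratio of periodic returns (assumed conventions)."""
    r = np.asarray(returns, dtype=float)
    excess = r - risk_free_rate / periods_per_year
    return float(np.sqrt(periods_per_year) * excess.mean() / excess.std(ddof=1))

# Hypothetical net asset values of a portfolio on six consecutive days.
nav = np.array([1.00, 1.10, 1.05, 1.20, 0.96, 1.08])
period_returns = nav[1:] / nav[:-1] - 1.0

mdd = max_drawdown(nav)          # decline from the 1.20 peak to the 0.96 trough
sr = sharpe_ratio(period_returns)
print(mdd, sr)
```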

Analysis of empirical test results
The return of the stock selection portfolio constructed in this article is better than the average performance of the new energy vehicle industry and the average performance of the A-share market. Although the portfolio underperforms the market in an individual holding period, its overall return is still much higher. If the portfolio described above is held over a long period, a good excess return can be obtained and the effect becomes more significant. This shows that the multi-factor stock selection model in this article has a certain practicality in the new energy vehicle industry and can be used in the real trading market.

Conclusion
In this paper, a total of 12 representative factors from four categories of factors (debt repayment factor, asset operation factor, value factor, and growth factor) are selected. The Fama-Macbeth model and multiple linear regression analysis are used to screen the factors, and stepwise regression is used to further eliminate the factors that are not significant. Finally, a multiple linear regression equation containing four factors is obtained to form the stock investment strategy. In the empirical test, we construct the portfolio from the seven new energy vehicle stocks with the highest expected returns in each holding period, compare its returns with the average returns of the new energy vehicle industry and the A-share market, and then analyze the Sharpe ratio and the maximum drawdown of the portfolio. The main conclusions of this paper are as follows:
(1) In the new energy vehicle industry, the 12 fundamental factors that reflect corporate debt repayment, asset operation, value, and growth are not all driving factors of stock return. Only the long-term debt to working capital ratio (LTDWC), total profit growth rate (TPG), price/earnings to growth ratio (PEG), and equity turnover ratio (ET) pass the test and are effective stock selection factors.
(2) The return of the stock portfolio constructed with the multiple linear regression model established in this paper has an overall advantage over the average return of stocks in the new energy vehicle industry and in the A-share market. Together with the Sharpe ratio, the maximum drawdown, and other indicators, this illustrates the effectiveness of the model for stock selection in the new energy vehicle industry.
(3) The investment return of the new energy vehicle industry is significantly higher than the average level of the stock market. Combined with the policy support and technical improvement at home and abroad, the industry has huge potential investment value and is an appropriate investment choice for investors.