Exploring the Application of Machine Learning and the Python LSTM Model in Predicting the Tesla Stock Price

Machine learning draws on many fields, such as probability theory, statistics, and algorithmic complexity theory. It uses existing algorithms and models to make judgments or predictions without explicit instructions, relying instead on existing data, patterns, and inference. As a subset of artificial intelligence, it combines statistical techniques and knowledge to handle datasets with vast amounts of data, and it can be applied to daily life, such as investment decisions involving stocks, options, futures, and other derivatives. In this article, the author predicts price performance using an LSTM model. The goal is to use past historical price data to make accurate predictions about Tesla's future stock price. Because Tesla is a high-volatility stock, this problem is extremely challenging and has become a popular research topic in stock price forecasting. From a machine learning perspective, it is an important but difficult prediction problem, not least because Tesla is already a leader in electric vehicle (EV) production. A linear regression model and an LSTM (long short-term memory) model are used to forecast Tesla's stock price, and the RMSE value is used to evaluate the accuracy of the two models. The results show that the prediction accuracy of the LSTM model is much higher than that of the linear regression model.


Introduction
There are various methods and models for forecasting the future price of a stock, for example, the linear regression model, the random forest model, and the LSTM model. Machine learning also has several subdivisions. In supervised learning, the model learns a function from training data collected for a specific purpose, so the user knows what form the result should take; the trained model can then make a good prediction of the corresponding output for any given input. Typical examples are KNN and SVM. Unsupervised learning trains without a predefined target, so the user cannot anticipate the results in advance. In short, data must be labeled for supervised learning but not for unsupervised learning.
Tesla's car production is in a leading position. Beyond that, the company also works on new energy, autonomous driving, artificial intelligence, and the integration of software, hardware, and services, so Tesla is a complex company. Today, the company is worth $846.53 billion, yet some financial reports set Tesla's price target at $1,300. There are clear financial motivations for applying analytical tools to stock market data: to generate profit and to avoid investment risk as far as possible, making such tools helpful for traders' personal investment decisions. The complexity of Tesla's stock and the imprecision of its valuation make its price challenging to predict, which is precisely what makes the prediction meaningful. The research topic of this project is therefore to use different machine learning models to predict the stock price of Tesla. Using an existing dataset covering 2010 to 2017, we make short-term predictions on Tesla's stock price by building a simple linear regression model and a higher-complexity LSTM model. After the prediction results are obtained, they are analyzed according to the plotted curves and the prediction accuracy.
The remainder of the paper is organized as follows: Section 2 provides the source of the dataset, describes the theoretical background of several alternative methods, and discusses their advantages and disadvantages; Section 3 builds the linear regression and LSTM models and forecasts the stock price; Section 4 analyzes the prediction results through the prediction accuracy and RMSE values.

Dataset
The selected dataset comes from Yahoo Finance and covers the period from June 29, 2010, to March 17, 2017. Stock price analysis generally converts the data into OHLC form: O is the open price, H and L are the highest and lowest prices, respectively, and C is the close price. Volume is the stock trading volume, and Adj Close is the adjusted closing price. The dataset has 1692 rows and 7 columns in total.

Linear Regression
The regression prediction method is based on the correlation principle: it applies the regression analysis of mathematical statistics to forecasting. Starting from the causal relationships among market phenomena, a regression prediction model is established [1]. In economic forecasting, the prediction target is usually treated as the dependent variable and the factors related to it as independent variables. After collecting sufficient independent-variable data, the regression equation is obtained through correlation and regression analysis and is then used for prediction [2]. In regression prediction, the prediction object y is a random variable and the explanatory variable X is called the independent variable [3]; in correlation analysis the independent variables are also random. Because regression analysis has a relatively rigorous theoretical basis and mature calculation methods, and because social phenomena are generally related to various factors to differing degrees, regression prediction methods are widely used in stock market forecasting [4].
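As a minimal sketch of the method described above, the snippet below fits a least-squares line with scikit-learn and uses the resulting equation to predict a new value. The variable names and the noise-free synthetic data following y = 2x + 1 are assumptions made for the example, not the paper's actual dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic, noise-free data following y = 2x + 1 (an assumption for the demo).
X = np.arange(10, dtype=float).reshape(-1, 1)  # independent variable
y = 2.0 * X.ravel() + 1.0                      # dependent variable

model = LinearRegression()
model.fit(X, y)  # with noise-free linear data this recovers slope 2, intercept 1

# Use the fitted regression equation to predict for an unseen input.
pred = model.predict(np.array([[12.0]]))
print(round(pred[0], 3))  # 25.0, i.e. 2*12 + 1
```

With real stock data the fit is of course not exact; the point is only that prediction reduces to evaluating the estimated regression equation.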

KNN
KNN (k-nearest neighbors) is a typical supervised learning method, mentioned above alongside SVM. It stores the labeled training examples and, for any new input, finds the k training points closest to it (for example, by Euclidean distance) and predicts the majority class among those neighbors, or their average value in the regression setting. Like SVM, it therefore requires labeled data, in contrast to unsupervised methods, which train without a predefined target.

RNN
A recurrent neural network (RNN) takes sequence data as input, recurses along the direction in which the sequence evolves, and connects all of its nodes (also known as recurrent units) in a chain, so that information loops back from one step to the next.
Applications and use cases of RNNs include:
(1) Text generation: like a fill-in-the-blank question, predicting the word that should go in the blank from the surrounding context.
(2) Machine translation: another common sequence problem, in which word order directly affects the translation results.
(3) Speech recognition: determining the text that corresponds to the input audio.
(4) Image captioning: given a picture, describing its contents, much like looking at a photo and talking about it; this frequently combines CNN and RNN.
RNN's ability to process sequential data effectively sets it apart from other techniques; examples of such data are stock price histories, voice and audio signals, and article text. It can handle sequence data because each input in a series influences the outputs that follow it, which amounts to a "memory function." However, RNNs suffer from significant short-term memory problems: long-past data has minimal effect on the output, even when it carries important information. As a result, RNN-based variant algorithms such as LSTM and GRU emerged. The primary traits of these variants are: (1) long-term information can be retained effectively; (2) only important information is kept, while unimportant information is "forgotten." An RNN is in effect a deep neural network in which all layers share the same weights, which makes it hard to learn and store information over long spans. This problem can be addressed by adding storage to the network: the LSTM, with its special hidden unit, is designed to preserve inputs over long periods [5].

LSTM
LSTM has a "gate" structure that controls whether data is updated or discarded through the logic of its gate units. It allows the network to converge better because it overcomes RNN's tendency toward vanishing gradients, and it can effectively improve prediction accuracy. LSTM has three logic gates, namely the forget, input, and output gates, which determine what information is remembered or forgotten at each moment [6]. It is widely used in deep learning; its deep neural structure can be treated as a method of solving the corresponding difference equations, so that large models can be stabilized at low computational cost [7]. The following traits make LSTM models particularly well suited to processing and forecasting financial time series data: (1) deep learning models are not constrained by dimensionality and can incorporate all feature data associated with the dependent variable for improved performance; (2) LSTM has good nonlinear fitting capability and is ideal for fitting time series with significant irregular fluctuations; (3) it is less prone to overfitting and to getting stuck in local optima; (4) it requires less manual feature engineering than typical machine learning or linear regression, because its multi-layer neural networks, with their more expressive deep representations, create features and extract hidden patterns from the data.

Comparison between RNN model and LSTM model
Unlike RNN, which can only capture short-term dependencies, LSTM can handle both short-term and long-term dependencies. Therefore, in the final stage of this project, a simpler linear regression model and a more complex LSTM model were selected for use. By using these two methods to analyze and forecast the stock, the accuracy and error of the two models are discussed and summarized.
Long short-term memory has a special mechanism for capturing long-term dependencies. LSTM was introduced by Hochreiter & Schmidhuber [8]. It has a "gate" structure and decides whether to update or discard data through the logic control of its gate units. It overcomes RNN's tendency toward vanishing and exploding gradients during weight updates, so the network converges faster and prediction accuracy improves. LSTM has three gates that determine whether to memorize or forget information at each time step: the input gate determines how much new information is added to the cell, the forget gate controls whether existing information is forgotten at each moment, and the output gate determines whether information is output at each moment. Time series prediction is an inherently difficult class of prediction problem.
LSTM's key change is to replace rigid logic with flexible gating: only important information is retained. Because of vanishing gradients, a plain RNN can only have short-term memory. LSTM networks connect short-term and long-term memory through carefully designed gate control, and because the model has this enhanced memory capacity, it can learn long-term dependencies. In practice, remembering long-term information is the default behavior of LSTMs, not an expensive ability to acquire. All RNNs take the form of a chain of repeating neural network modules and predict future values from pre-existing data patterns; what makes the LSTM more effective is, crucially, the inclusion of memory as a central element in the sequence prediction.

Operating principles of LSTM
To reduce the probability of gradient explosion and vanishing, LSTM combines the sigmoid function, used in its three gates (the input, forget, and output gates), with the tanh function.
LSTM has three gates, as shown in Fig. 1 below: the input gate, the forget gate, and the output gate. The forget gate decides which information is unimportant for the prediction and removes it from the cell state: a forget-gate value of 1 keeps the stored information, while 0 means the information can be forgotten. The cell state is comparable to a conveyor belt: it runs directly along the whole chain with only a few linear interactions, so unchanged information propagates easily. With only this horizontal path, information could not be added or removed; that is done through the structures known as gates. Through a sigmoid neural layer followed by pointwise multiplication, a gate selectively lets information pass. Each element of the sigmoid layer's output vector is a real number between 0 and 1 representing the proportion of the corresponding information allowed through: 0 means "no information is permitted to flow" and 1 means "all information is allowed to pass." Fig. 1 to Fig. 4 show the key components of LSTM and how they work.
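The gate mechanics described above can be sketched as a single LSTM time step in plain NumPy. This is a minimal illustration, not the paper's implementation: the dimensions, weight layout, and random initialization are assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step. W stacks the four gate weight matrices
    (forget, input, candidate, output); shapes here are illustrative."""
    z = np.concatenate([h_prev, x])     # gate inputs: previous hidden state + new input
    f = sigmoid(W[0] @ z + b[0])        # forget gate: 0 = discard, 1 = keep
    i = sigmoid(W[1] @ z + b[1])        # input gate: how much new info enters
    c_tilde = np.tanh(W[2] @ z + b[2])  # candidate cell content
    c = f * c_prev + i * c_tilde        # cell state: the "conveyor belt"
    o = sigmoid(W[3] @ z + b[3])        # output gate
    h = o * np.tanh(c)                  # new hidden state
    return h, c

# Tiny demo: 2-dim hidden state, 3-dim input, small random weights.
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 2, 5)) * 0.1
b = np.zeros((4, 2))
h, c = lstm_step(rng.standard_normal(3), np.zeros(2), np.zeros(2), W, b)
print(h.shape, c.shape)  # (2,) (2,)
```

Note how the forget gate multiplies the old cell state elementwise, which is exactly the 0-to-1 "proportion allowed through" behavior described in the text.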

Background and related works
Firstly, we collected past data for Tesla's price and volume from Kaggle, spanning 29th June 2010 to 3rd February 2020. Then we used a pandas DataFrame to read and store the data. After that, we imported a plotting library in order to represent and visualize the historical data. The graph below is the result of the visualization.
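A minimal sketch of the loading-and-plotting step, with assumptions flagged: "TSLA.csv" is a placeholder filename for the Kaggle download, and the five inline prices are illustrative stand-ins so the snippet runs without the file.

```python
import pandas as pd

# Real project would read the Kaggle file, e.g.:
#   df = pd.read_csv("TSLA.csv", parse_dates=["Date"], index_col="Date")
# A tiny inline frame stands in for it here (prices are illustrative).
df = pd.DataFrame(
    {"Date": pd.date_range("2010-06-29", periods=5),
     "Close": [23.89, 23.83, 21.96, 19.20, 16.11]}
).set_index("Date")

# Visualization as described above (uncomment with matplotlib installed):
#   import matplotlib.pyplot as plt
#   df["Close"].plot(title="TSLA closing price")
#   plt.show()

print(len(df))  # 5 rows in this toy frame
```

The real dataset would of course have thousands of rows and the full OHLC and volume columns.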
Data pre-processing is needed before training. For example, we separated the stock prices into the training sets X_train and Y_train, as well as the testing sets X_test and Y_test. The training dataset contained about 85% of the total data, and the remaining 15% formed the test dataset; this ratio keeps look-ahead bias small and yields a more accurate result.
After that, we normalized the data to the range 0 to 1 using the MinMaxScaler function and then created the training and testing datasets separately. Two data frames are needed to store the close-price column and convert it into NumPy arrays. Finally, the LSTM model can be built and trained.
Later, we imported the model we wanted to use, in this case the long short-term memory (LSTM) model. LSTMs are commonly used in sequence prediction; the model is among the most effective because it stores past data while ignoring information that is unimportant for future prediction.

Time Series & Stationarity
A time series is a collection of data points, each of which has a timestamp; the price of a stock on the exchange during the day is one illustration. Here, the stock price is represented by time-sequence data used to train a neural network to learn patterns from the trend of the existing data. We will presume that our data are stationary: if the process generating a time series is stable, its statistical characteristics do not change over time, and the properties of the series are independent of the observation time. Consequently, a stationary time series will not exhibit trend-like patterns over time, but the LSTM model can still manage long-term prediction. Time-series analysis has two main purposes: (1) to identify the nature of the observation sequence, and (2) to accurately predict or forecast the time-series variable. A time series has two main components [9]:
(1) Trend: a broad course in which something evolves or transforms. In the stock market, an overall price gain characterizes an uptrend, whereas an overall price fall signifies a downtrend.
(2) Seasonality: a predictable pattern or change that occurs on a regular basis; for example, the data pattern could repeat every 3 months or every 12 months.
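The two components can be made concrete with a small synthetic series, an illustration assumed for this sketch: a linear trend plus a 12-step seasonal cycle, where differencing at the seasonal lag isolates the trend.

```python
import numpy as np

t = np.arange(120)
trend = 0.5 * t                                 # steady upward course
seasonality = 3.0 * np.sin(2 * np.pi * t / 12)  # pattern repeating every 12 steps
series = trend + seasonality

# The seasonal component repeats exactly with period 12...
print(np.allclose(series[:12] - trend[:12], series[12:24] - trend[12:24]))  # True

# ...so differencing at lag 12 removes it, leaving only the trend's
# constant 12-step increment of 0.5 * 12 = 6.0.
diff = series[12:] - series[:-12]
print(np.allclose(diff, 6.0))  # True
```

Seasonal differencing of this kind is a standard way to move a series with trend and seasonality closer to the stationarity assumed above.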

Build the model and Results
Many models can realize stock price prediction: relatively simple, traditional machine learning models such as linear regression; RNNs (recurrent neural networks); and more complex recurrent networks such as the LSTM. An RNN has no cell state, whereas the LSTM remembers information through its cell state; the RNN's activation function is only tanh; and RNNs can only handle short-term dependencies. Therefore, in the final stage of this project, a simpler linear regression model and a more complex LSTM model were selected for use. By using these two methods to analyze and forecast the stock, the accuracy and error of the two models are discussed and summarized.

Linear Regression
First, a relatively simple linear regression model was used to predict the stock price over the next 30 days. After importing the data, a label is created whose value is the closing price 30 days later. The data then undergo a series of preprocessing and standardization steps, and the training set is separated from the test set. After preprocessing, a fitted model is built. As Figure 7 shows, our accuracy rate is 92%, which is very high but may still not be enough for stock forecasting. Finally, matplotlib is used to draw the prediction curve, with a for loop and DateTime used to import the 30 days of data one by one. The prediction curve is shown in Figure 7; the blue part is the prediction for the next 30 days.
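A sketch of the labeling step described above: each day's label is the close price 30 days later, a linear model is fitted on the labeled rows, and the unlabeled tail is predicted. The synthetic prices and column names are assumptions for the example, not the paper's data.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

horizon = 30
df = pd.DataFrame({"Close": np.linspace(20.0, 120.0, 200)})  # placeholder prices
df["Label"] = df["Close"].shift(-horizon)  # label = close price 30 days ahead

known = df.dropna()                        # rows whose 30-day-ahead close is known
X = known[["Close"]].to_numpy()
y = known["Label"].to_numpy()

model = LinearRegression().fit(X, y)

# The last `horizon` rows have no label yet; predict their future closes.
future = model.predict(df[["Close"]].tail(horizon).to_numpy())
print(len(future))  # 30 predicted values, one per future day
```

On this noise-free placeholder series the fit is exact; on real prices, accuracy would be assessed on the held-out test set as in the text.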

LSTM
The data pretreatment and feature-vector creation stages for the LSTM model are essentially the same as for the linear regression model. However, to create the model, LSTM must employ Keras and its Sequential model: network layers are stacked with add(), and the model is then compiled with its compile() method. After compilation (as shown in Figure 8 below), we train the network by running a set number of training iterations over the training set, batch by batch. The average difference between the extrapolated values and the actual values under the LSTM model is acceptable: the RMSE is 32.088 (as shown in Figure 9 below). Figure 10 shows the outcome of applying the LSTM model to forecast the stock price. The first 85% of the series, the front blue area, is the training portion. Our prediction is clearly accurate, since the orange line of the valid set and the green prediction line essentially coincide. For the prediction, I calculated the RMSE, the Root Mean Square Error: the square root of the mean of the squared differences between the predicted and actual values. It is a decent indicator of how effectively the model predicts the response [10]. In this dataset and LSTM project, the RMSE equals 32.08852652036192. RMSE lets us quantify the difference between the estimated and actual values of a model parameter, so the efficiency and correctness of the model can be assessed clearly: the smaller the RMSE, the better the model predicts or fits the dataset.
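The build-compile-train sequence and the RMSE metric can be sketched as follows. The Keras layer sizes shown in the comments are assumptions rather than the paper's exact architecture, and the runnable part demonstrates only the RMSE formula on toy numbers.

```python
import numpy as np

# Keras build sketch, per the text (layer sizes are assumptions):
#   from keras.models import Sequential
#   from keras.layers import LSTM, Dense
#   model = Sequential()
#   model.add(LSTM(50, return_sequences=True, input_shape=(lookback, 1)))
#   model.add(LSTM(50))
#   model.add(Dense(1))
#   model.compile(optimizer="adam", loss="mean_squared_error")
#   model.fit(X_train, y_train, batch_size=32, epochs=10)

def rmse(y_true, y_pred):
    """Root Mean Square Error: square root of the mean squared difference."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))

# Toy check: a constant error of 3 on every point gives an RMSE of exactly 3.
print(rmse([10.0, 20.0, 30.0], [13.0, 23.0, 33.0]))  # 3.0
```

Applied to the test-set predictions and the actual closes, this function yields the 32.088 figure reported above.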

Conclusion
Using these two models, we discovered that overfitting is common but can be mitigated by dimension reduction. The linear regression model oversimplifies real-world problems by assuming a linear relationship among the variables, so it is not recommended for the majority of real-world applications. The cells in the LSTM model add long-term memory in a more principled way and permit more parameters to be learned, which makes it the more effective method for prediction, particularly when the data show a long-term trend. Furthermore, in machine learning, RMSE is frequently used to assess how well regression models perform; the RMSE values obtained with linear regression and LSTM are shown in Figure 8. A quality model should have a low RMSE value: the lower the RMSE, the closer the prediction is to the real-life situation. The RMSE of linear regression is roughly 151, compared with 32.088 for the LSTM model, so we can conclude that the LSTM model has a much higher prediction accuracy than linear regression [10][11]. This shows that although the LSTM (long short-term memory network) model is more complex to construct, its accuracy is much higher than that of a linear regression model; for a simple prediction, a simpler linear regression model can still be chosen.
In this project, we applied only the linear regression and LSTM models, evaluated with RMSE. To examine prediction accuracy more thoroughly, further Python machine learning models could be compared against one another: for example, the random forest model, time-series analysis methods based on forecasts or projections of the data, or a graph-based approach that treats the stock market as a network of components affecting one another and the price. This is the point that could be developed if we were to do this project again.