Time Series Forecasting with TensorFlow.js

Pull stock prices from an online API and make predictions using a Recurrent Neural Network (RNN) with Long Short-Term Memory (LSTM) in the TensorFlow.js framework.

Machine learning is becoming increasingly popular, and a growing number of people see it as a magic crystal ball that predicts when and what will happen in the future. This experiment uses artificial neural networks to reveal stock market trends and demonstrates how time series forecasting can predict future stock prices from historical data.

Disclaimer: As stock market fluctuations are dynamic and unpredictable owing to multiple factors, this experiment is 100% educational and by no means a trading prediction tool.

There are 4 parts to this experiment:

  • get historical stock price data
  • prepare training data for our neural network model
  • train the neural network
  • make some predictions

Also, do check out this repo for the PyTorch version, where we dig deeper into the model and the data processing steps.

    Get Stocks Data

    Before we can train the neural network and make any predictions, we first need data. The type of data we are looking for is a time series: a sequence of numbers in chronological order. A good place to fetch this data is the Alpha Vantage Stock API. This API allows us to retrieve chronological data on a specific company's stock prices from the last 20 years. You may also refer to this article that explains adjusted stock prices, which is an important technical concept for working with historical market data.

    You can pick either the daily adjusted or the weekly adjusted series. Each returns open/high/low/close/volume values, adjusted close values, and historical split/dividend events for the specified global equity, covering 20+ years of historical data. As suggested by desduvauchelle, the adjusted close price is more robust to stock splits than the raw closing price.

    The API yields the following fields:

  • open price
  • highest price of that day
  • lowest price of that day
  • closing price
  • adjusted close price (this is used in this project)
  • volume

    To prepare the training dataset for our neural network, we will use the adjusted close price, which also means that we are aiming to predict future closing prices.
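
    As a minimal sketch of this step, the function below pulls the adjusted close prices out of a response object and sorts them oldest first. The key names ('Time Series (Daily)' and '5. adjusted close') are assumptions about the JSON layout of the daily adjusted endpoint, and the sample object is hypothetical, so check both against a real response before relying on them.

```javascript
// Sketch: extract (timestamp, adjusted close) pairs from a response object.
// ASSUMPTION: the key names below follow the daily adjusted JSON layout;
// the weekly series is expected to use a different top-level key.
function extractAdjustedClose(response, seriesKey = 'Time Series (Daily)') {
  const series = response[seriesKey];
  if (!series) {
    throw new Error('Response has no "' + seriesKey + '" field');
  }
  return Object.keys(series)
    .sort() // ISO dates (YYYY-MM-DD) sort chronologically as plain strings
    .map((date) => ({
      timestamp: date,
      price: parseFloat(series[date]['5. adjusted close']),
    }));
}

// Hypothetical two-day sample shaped like the assumed response.
const sampleResponse = {
  'Time Series (Daily)': {
    '2020-01-03': { '4. close': '158.62', '5. adjusted close': '155.10' },
    '2020-01-02': { '4. close': '160.62', '5. adjusted close': '157.05' },
  },
};

const adjustedPrices = extractAdjustedClose(sampleResponse);
console.log(adjustedPrices.map((p) => p.timestamp)); // oldest date first
```

    Sorting by date matters because the series is returned as an object keyed by date, and the later steps expect the prices in chronological order.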

    Try It

    Use the demo API key to fetch Microsoft Corporation prices, or get your own API key for other stocks.

    The demo API key only allows 'MSFT'.
    You can claim your own API key from alphavantage.co.

    Simple Moving Average

    For this experiment, we are using supervised learning, which means feeding data to the neural network so that it learns by mapping input data to the output label. One way to prepare the training dataset is to extract the Simple Moving Average from the time series data.

    Simple Moving Average (SMA) is a method to identify trend direction over a certain period of time, by looking at the average of all the values within that time window. The number of prices in a time window is selected experimentally. For example, if the closing prices for the past 5 days were 13, 15, 14, 16, 17, the SMA would be (13+15+14+16+17)/5 = 15. So the input for our training dataset is the set of prices within a single time window, and the label is the computed moving average of those prices.
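
    The windowing described above can be sketched as follows. The function name and the { set, avg } field names are illustrative choices, not taken from the original project.

```javascript
// Sketch: slide a fixed-size window over the prices and pair each window
// (the model input) with its average (the label).
function computeSMA(prices, windowSize) {
  if (!Number.isInteger(windowSize) || windowSize < 1) {
    throw new RangeError('windowSize must be a positive integer');
  }
  const samples = [];
  for (let end = windowSize; end <= prices.length; end++) {
    const windowPrices = prices.slice(end - windowSize, end);
    const sum = windowPrices.reduce((acc, p) => acc + p, 0);
    samples.push({ set: windowPrices, avg: sum / windowSize });
  }
  return samples;
}

// The five prices from the example above, plus one extra day (18).
const smaSamples = computeSMA([13, 15, 14, 16, 17, 18], 5);
console.log(smaSamples[0].avg); // 15, matching the worked example
console.log(smaSamples[1].avg); // 16, the window shifted by one day
```

    A series of n prices yields n - windowSize + 1 training samples, so a larger window leaves fewer samples to train on.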

    Train Neural Network

    Now that you have the training data, it is time to create a model for time series prediction. To achieve this, we will use the TensorFlow.js framework.

    A sequential model is selected, which simply connects each layer and passes the data from input to output during the training process. For the model to learn time series data, which are sequential, a recurrent neural network (RNN) layer is created and a number of LSTM cells are added to it.

    The model will be trained using Adam, a popular optimisation algorithm for machine learning. The loss is the root mean squared error, which measures the difference between the predicted values and the actual values, so the model learns by minimising this error during the training process.
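
    A minimal sketch of such a model is shown below; it is not necessarily the exact architecture used in the demo. The function takes the TensorFlow.js namespace as its tf argument (for example from import * as tf from '@tensorflow/tfjs'), and the layer sizes and default values are assumptions. It uses the built-in 'meanSquaredError' loss, since minimising the mean squared error also minimises its square root.

```javascript
// Sketch: a sequential model with a stack of LSTM cells inside one RNN layer.
// `tf` is the TensorFlow.js namespace, passed in by the caller.
// ASSUMPTION: the defaults (4 cells, 16 units, learning rate 0.01) are
// illustrative values, not the demo's settings.
function buildModel(tf, { windowSize, lstmLayers = 4, lstmUnits = 16, learningRate = 0.01 }) {
  const model = tf.sequential();

  // Turn each window of prices into a sequence of single-feature time steps.
  model.add(tf.layers.reshape({
    inputShape: [windowSize],
    targetShape: [windowSize, 1],
  }));

  // One RNN layer that wraps several LSTM cells stacked on top of each other.
  const cells = [];
  for (let i = 0; i < lstmLayers; i++) {
    cells.push(tf.layers.lstmCell({ units: lstmUnits }));
  }
  model.add(tf.layers.rnn({ cell: cells, returnSequences: false }));

  // A single output unit: the predicted value for the window.
  model.add(tf.layers.dense({ units: 1 }));

  model.compile({
    optimizer: tf.train.adam(learningRate),
    loss: 'meanSquaredError',
  });
  return model;
}
```

    Training would then be a call such as model.fit(inputs, labels, { epochs }) with tensors built from the windows and their averages.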

    These are the hyperparameters (parameters used in the training process) available for tweaking:

  • Training Dataset Size (%): the percentage of the data used for training; the remaining data is used for validation
  • Epochs: the number of times the dataset is used to train the model
  • Learning Rate: the amount of change in the weights at each training step
  • Hidden LSTM Layers: the number of LSTM layers, which increases the model complexity so it can learn in a higher dimensional space

    Try It

    You may tweak the hyperparameters and then hit the Begin Training Model button to train the model.

    Validation

    Now that you have trained your model, it is time to use the model.predict function from TFJS to predict future values. We have split the data into 2 sets: a subset of the data is the training set and the rest is the validation set. The training set has been used for training the model, so we will use the validation set to validate it. Since the model has not seen the data in the validation set before, it is a good sign if the model is able to predict values that are close to the actual values.
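
    The split can be sketched as a simple chronological cut, assuming the samples are already ordered oldest first. The function name and the trainPercent parameter are illustrative.

```javascript
// Sketch: keep the first trainPercent% of samples for training and the
// rest for validation. No shuffling, so the validation set is the most
// recent part of the series.
function splitDataset(samples, trainPercent) {
  if (!(trainPercent > 0 && trainPercent < 100)) {
    throw new RangeError('trainPercent must be between 0 and 100 (exclusive)');
  }
  const trainCount = Math.floor((samples.length * trainPercent) / 100);
  return {
    train: samples.slice(0, trainCount),
    validation: samples.slice(trainCount),
  };
}

const splitExample = splitDataset([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 70);
console.log(splitExample.train.length, splitExample.validation.length); // 7 3
```

    Keeping the order intact means the model is validated on data that comes after everything it was trained on, which mirrors how it would be used to forecast.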

    Try It

    So let us use the remaining data for prediction, which allows us to see how close our predicted values are to the actual values.

    If the model did not predict values that map closely to the true values, check the training loss graph. Generally, this model should converge with a loss of less than 1. You can increase the number of epochs, or tweak the other learning hyperparameters.
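
    Besides reading the graph, the gap between predicted and actual values can be summarised as a single number. The helper below is a sketch that computes the root mean squared error over two plain arrays; it is independent of TensorFlow.js and is not part of the demo.

```javascript
// Sketch: root mean squared error between actual and predicted values.
function rmse(actual, predicted) {
  if (actual.length === 0 || actual.length !== predicted.length) {
    throw new RangeError('arrays must be non-empty and of equal length');
  }
  const sumOfSquares = actual.reduce(
    (acc, value, i) => acc + (value - predicted[i]) ** 2,
    0
  );
  return Math.sqrt(sumOfSquares / actual.length);
}

console.log(rmse([1, 2, 3], [1, 2, 3])); // 0, a perfect prediction
console.log(rmse([1, 2], [2, 3]));       // 1, every prediction is off by one
```

    The result is in the same unit as the values passed in, so it is only comparable with the training loss if both are computed on the same scale.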

    Make Prediction

    Finally, now that the model has been validated and the predicted values map closely to the true values, we can use it to predict the future. We will apply the same model.predict function and use the most recent data points as the input, taking as many points as our window size. This means that, if our training data is in daily increments, we will use the most recent window of days as input to predict the next day.
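
    A minimal sketch of this step is shown below. It assumes the model takes a single window of prices with shape [1, windowSize], and, as in the training step, tf stands for the TensorFlow.js namespace passed in by the caller. The function name is illustrative.

```javascript
// Sketch: feed the most recent window of prices to the trained model.
// ASSUMPTION: the model expects input of shape [1, windowSize] and returns
// a single tensor holding one predicted value.
function predictNext(tf, model, prices, windowSize) {
  if (!Number.isInteger(windowSize) || windowSize < 1 || prices.length < windowSize) {
    throw new RangeError('need at least windowSize prices');
  }
  const lastWindow = prices.slice(-windowSize);
  const input = tf.tensor2d([lastWindow], [1, windowSize]);
  const output = model.predict(input);
  const nextValue = output.dataSync()[0];

  // Release the memory held by both tensors.
  input.dispose();
  output.dispose();
  return nextValue;
}
```

    If the training inputs were scaled in any way, the same scaling has to be applied to the window here, and reversed on the returned value.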

    Conclusion

    Why isn't my Model Performing?
    The model has never seen similar data in the past. In March 2020, the market dipped and recovered within a month or two, which had never happened before. The model is likely to fail to predict drastic changes in stock prices during such periods.
    We can add more features. In general, more relevant features tend to help the model perform better. We can include trading indicators such as Moving Average Convergence Divergence (MACD), Relative Strength Index (RSI), or Bollinger Bands.
    Add even more features. One amazing thing that the Alpha Vantage API provides is Fundamental Data. This means that you can also include annual and quarterly income statements and cash flows for the company of interest. Who knows, those features might be useful.
    There could be many other reasons why the model fails to learn and predict. This is the challenge of machine learning; it is both an art and a science to build well-performing models.
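
    As one example of the indicators mentioned above, Bollinger Bands can be derived from the same kind of sliding window used for the SMA. The sketch below is not part of the demo; it assumes the common convention of a middle band at the moving average with upper and lower bands k population standard deviations away, with a 20-period window and k = 2 as defaults.

```javascript
// Sketch: Bollinger Bands over a sliding window.
// ASSUMPTION: population standard deviation, default window 20 and k = 2.
function bollingerBands(prices, windowSize = 20, k = 2) {
  if (!Number.isInteger(windowSize) || windowSize < 1) {
    throw new RangeError('windowSize must be a positive integer');
  }
  const bands = [];
  for (let end = windowSize; end <= prices.length; end++) {
    const windowPrices = prices.slice(end - windowSize, end);
    const mean = windowPrices.reduce((acc, p) => acc + p, 0) / windowSize;
    const variance =
      windowPrices.reduce((acc, p) => acc + (p - mean) ** 2, 0) / windowSize;
    const deviation = Math.sqrt(variance);
    bands.push({
      middle: mean,
      upper: mean + k * deviation,
      lower: mean - k * deviation,
    });
  }
  return bands;
}

const bandExample = bollingerBands([2, 4, 4, 4, 5, 5, 7, 9], 8, 2);
console.log(bandExample[0]); // { middle: 5, upper: 9, lower: 1 }
```

    Each band value could be added as an extra input feature next to the price window, although whether it helps the model is something to test rather than assume.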

    There are many ways to do time series prediction other than using a simple moving average. Do check out this repo for the PyTorch version, where we attempt to predict the stock price itself instead of the SMA. Other possible future work is to implement this with more data from various sources.
    With TensorFlow.js, machine learning in a web browser is possible, and it is actually pretty cool.
    Explore the demo on GitHub. This experiment is 100% educational and by no means a trading prediction tool.