How to create an RNN (Recurrent Neural Network) capable of predicting the behavior of the stock markets and cryptocurrencies
Juan David Tuta Botero
Data Science | Machine Learning | Artificial Intelligence
In this article, we will see how to build an artificial intelligence using an RNN (Recurrent Neural Network) to predict the future price of cryptocurrencies. Due to the highly volatile and speculative behavior of these assets, the precision of the network is not optimal. Still, it gives us an academic approach for creating more complex networks with different uses, whether that is buying and selling shares, forecasting weather events, or even modeling biological processes.
Time Series Forecasting
First of all, we are going to try to understand the mathematical foundation of these methods. For that, we need to define what a time series is: it is a series of data points indexed (or listed or graphed) in time order. Most commonly, a time series is a sequence taken at successive, equally spaced points in time; thus it is a sequence of discrete-time data. Examples of time series are heights of ocean tides, counts of sunspots, and the daily closing value of the Dow Jones Industrial Average.
Time series analysis comprises methods for analyzing time-series data in order to extract meaningful statistics and other characteristics of the data. Time series forecasting is the use of a model to predict future values based on previously observed values. While regression analysis is often employed to test relationships between one or more different time series, that type of analysis is not usually called "time series analysis", which refers in particular to relationships between different points in time within a single series. Interrupted time series analysis is used to detect changes in the evolution of a time series from before to after some intervention that may affect the underlying variable.
Preprocessing Data
For this guide, we are going to use a cryptocurrency database covering the period from 2012 to 2018; you can download it from this link. The first thing is to look at our database and understand what type of data is relevant for our analysis and what is not, and then re-interpret it so that the network can understand it much better. In this database we observe several columns, including the Open, High, Low, and Close prices of each period, and we need to understand what each of them means.
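Assuming the CSV has been downloaded locally, a minimal sketch for loading and inspecting it could look like the following (the filename is a placeholder, and the presence of an "Unnamed: 0" index column is an assumption based on the cleaning function used later):

import pandas as pd

# Hypothetical filename: use whichever CSV you downloaded from the link above.
df = pd.read_csv('crypto_prices_2012_2018.csv')

# Inspect the available columns and the first few rows.
print(df.columns)
print(df.head())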
Clean the database
The first step is to see which data is useful and which is not. At a first look at the database, we can see several fields filled with NaN; these values mean the data was not reported, and they could affect our neural network in a significant way. We can decide to do one of three things.
1. We can replace all the unreported values with the mean of their respective columns. The big problem with this solution is that, when plotting values such as Open, High, and Low, we see that the initial values have a mean well below the final values, which would introduce a significant deviation in the first values as well as in the last ones; therefore, this solution is discarded.
2. The second proposal is to set these values to 0, or to leave them as they are. Converting them to 0 would seriously distort the interpretation of the network in these ranges, since a price of zero is a real (and wrong) value, while leaving them as np.nan would propagate undefined values through every later computation. Neither option is acceptable here.
3. The third option is to eliminate the rows with missing data, or to interpolate to estimate them. Interpolation is not a great idea either, because we do not know what happened in those ranges and, given the highly volatile behavior of these assets, we would likely introduce errors that could affect our network. So the best decision is to eliminate those rows.
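Before dropping anything, it can help to quantify how much data is actually missing; a quick check (assuming the raw data has already been loaded into df) might be:

# Number of missing (NaN) values per column.
print(df.isna().sum())

# Fraction of rows that contain at least one NaN.
print(df.isna().any(axis=1).mean())

With that confirmed, the function used to drop those rows is the following.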
def removing_empty_data(data_frame):
  """
  Function that removes the rows with empty values.
  Args:
    data_frame: a pandas.DataFrame with the crypto price information
  Return:
    new_data_frame: the same DataFrame without the rows that contain NaN
  """
  columns = data_frame.columns
  # Rows where the price column is NaN fail the > 0 comparison and are dropped.
  not_nan = data_frame[columns[1]] > 0
  new_data_frame = data_frame.loc[not_nan].copy()
  new_data_frame.reset_index(inplace=True)
  new_data_frame.drop(columns=["index", "Unnamed: 0"],
                      inplace=True)
  return new_data_frame
If we apply the recently created function to our dataset, we will see the new, filtered data.
df = removing_empty_data(df)
df
Great, now that all our data is cleansed, let's look at some summary information about it using pandas' DataFrame.describe method.
df.describe().transpose()
Well, looking at this information, nothing strange comes to mind, such as unreasonable minimum values in any of the columns, but it is good practice to check this kind of information before doing anything else.
Better interpretation
Now we must remember that this information is going to pass through a machine, and we must help it as much as we can to get a global understanding of the variables we are using. In the field of forecasting there is an interesting concept called seasonality, which shows up in many cases, for example in stock prices: a company that sells winter clothes will probably post better numbers in winter than in summer. We will try to check whether this concept applies to our cryptocurrency problem using the Fourier transform.
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

fft = tf.signal.rfft(df['Open'])
f_per_dataset = np.arange(0, len(fft))

n_samples_h = len(df['Open'])
hours_per_year = 24 * 365.2524
years_per_dataset = n_samples_h / hours_per_year

f_per_year = f_per_dataset / years_per_dataset
plt.step(f_per_year, np.abs(fft))
plt.xscale('log')
Due to the erratic behavior of the market, we see no specific points with a strong trend; rather, we see a greater escalation as we move along the frequency axis. This is to be expected, because the price of cryptocurrencies depends heavily on speculation. If we looked instead at, for example, a company that sells winter clothing, we would get a graph with a peak at a frequency of one year and another at one day.
If we did find a graph of that style, it would be much easier for the network to work with sine and cosine features than with a raw time value, so we could make the following change.
# timestamp_s is assumed to hold the timestamp of each row, expressed in seconds.
day = 24 * 60 * 60
year = 365.2425 * day

df['Day sin'] = np.sin(timestamp_s * (2 * np.pi / day))
df['Day cos'] = np.cos(timestamp_s * (2 * np.pi / day))
df['Year sin'] = np.sin(timestamp_s * (2 * np.pi / year))
df['Year cos'] = np.cos(timestamp_s * (2 * np.pi / year))

plt.plot(np.array(df['Day sin'])[:25])
plt.plot(np.array(df['Day cos'])[:25])
plt.xlabel('Time [h]')
plt.title('Time of day signal')
In the end, we would have some time graphs of this style.
Since that is not the case here, we continue. Next, it is good to create and separate the training, validation, and test sets. We will use a (70%, 20%, 10%) split. Note that the data is not randomly shuffled before splitting. This is for two reasons:
1. It ensures that it is still possible to split the data into consecutive sample windows.
2. It ensures that the validation/test results are more realistic, being evaluated on the data collected after the model is trained.
column_indices = {name: i for i, name in enumerate(df.columns)}

n = len(df)
train_df = df[0:int(n*0.7)]
val_df = df[int(n*0.7):int(n*0.9)]
test_df = df[int(n*0.9):]

num_features = df.shape[1]
It is important to scale features before training a neural network. Normalization is a common way to do this scaling: subtract the mean and divide by the standard deviation of each feature. The mean and standard deviation should only be calculated using the training data, so that the models do not have access to the values in the validation and test sets. It is also arguable that the model should not have access to future values in the training set during training, and that this normalization should be done using moving averages. That is not the focus of this tutorial, and the validation and test sets ensure that you get (somewhat) honest metrics. So, for the sake of simplicity, this tutorial uses a simple average.
import seaborn as sns

train_mean = train_df.mean()
train_std = train_df.std()

train_df = (train_df - train_mean) / train_std
val_df = (val_df - train_mean) / train_std
test_df = (test_df - train_mean) / train_std

df_std = (df - train_mean) / train_std
df_std = df_std.melt(var_name='Column', value_name='Normalized')
plt.figure(figsize=(12, 6))
ax = sns.violinplot(x='Column', y='Normalized', data=df_std)
_ = ax.set_xticklabels(df.keys(), rotation=90)
Now take a look at the distribution of the features. Some features have long tails, but there are no obvious errors such as unrealistic values.
Data windowing
The models in this tutorial will make a set of predictions based on a window of consecutive samples from the data. The main features of the input windows are the width (number of time steps) of the input and label windows, the time offset between them, and which features are used as inputs, labels, or both.
This tutorial builds a variety of models (including Linear, DNN, CNN, and RNN models) and uses them both for single-output and multi-output predictions, and for single-time-step and multi-time-step predictions.
This section focuses on implementing the data windowing so that it can be reused for all of those models.
Depending on the task and type of model, you may want to generate a variety of data windows: for example, a single prediction 24 hours into the future given the last 24 hours of history, or a prediction one hour into the future given the last six hours of history.
The rest of this section defines a WindowGenerator class. This class can handle the indexes and offsets, split windows of features into (features, labels) pairs, plot the content of the resulting windows, and efficiently generate batches of these windows from the training, evaluation, and test data using tf.data.Datasets.
1. Indexes and offsets
Start by creating the WindowGenerator class. The __init__ method includes all the necessary logic for the input and label indices. It also takes the training, evaluation, and test DataFrames as input. These will be converted to tf.data.Datasets of windows later.
class WindowGenerator():
  def __init__(self, input_width, label_width, shift,
               train_df=train_df, val_df=val_df, test_df=test_df,
               label_columns=None):
    # Store the raw data.
    self.train_df = train_df
    self.val_df = val_df
    self.test_df = test_df

    # Work out the label column indices.
    self.label_columns = label_columns
    if label_columns is not None:
      self.label_columns_indices = {name: i for i, name in
                                    enumerate(label_columns)}
    self.column_indices = {name: i for i, name in
                           enumerate(train_df.columns)}

    # Work out the window parameters.
    self.input_width = input_width
    self.label_width = label_width
    self.shift = shift

    self.total_window_size = input_width + shift

    self.input_slice = slice(0, input_width)
    self.input_indices = np.arange(self.total_window_size)[self.input_slice]

    self.label_start = self.total_window_size - self.label_width
    self.labels_slice = slice(self.label_start, None)
    self.label_indices = np.arange(self.total_window_size)[self.labels_slice]

  def __repr__(self):
    return '\n'.join([
        f'Total window size: {self.total_window_size}',
        f'Input indices: {self.input_indices}',
        f'Label indices: {self.label_indices}',
        f'Label column name(s): {self.label_columns}'])
Here is the code to create the 2 windows shown in the diagrams at the start of this section:
w1 = WindowGenerator(input_width=24, label_width=1, shift=24,
                     label_columns=['Open'])
w1
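This article also uses a second, narrower window called w2 that is never defined explicitly; based on the shapes printed later (a total window size of 7, split into 6 input steps and 1 label step), a definition consistent with the rest of the code would be:

w2 = WindowGenerator(input_width=6, label_width=1, shift=1,
                     label_columns=['Open'])
w2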
2. Split
Given a list of consecutive inputs, the split_window method will convert them to a window of inputs and a window of labels. The example w2 you defined earlier will be split like this:
This diagram doesn't show the features axis of the data, but this split_window function also handles the label_columns, so it can be used for both the single-output and multi-output examples.
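The split_window method itself is not shown here; a sketch of it, following the TensorFlow time-series tutorial that this section is based on, looks like this:

def split_window(self, features):
  # Slice every window into an input part and a label part.
  inputs = features[:, self.input_slice, :]
  labels = features[:, self.labels_slice, :]
  if self.label_columns is not None:
    # Keep only the requested label columns.
    labels = tf.stack(
        [labels[:, :, self.column_indices[name]] for name in self.label_columns],
        axis=-1)

  # Slicing doesn't preserve static shape information, so set the shapes
  # manually. This way the tf.data.Datasets are easier to inspect.
  inputs.set_shape([None, self.input_width, None])
  labels.set_shape([None, self.label_width, None])

  return inputs, labels

WindowGenerator.split_window = split_window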
# Stack three slices, each the length of the total window.
example_window = tf.stack([np.array(train_df[:w2.total_window_size]),
                           np.array(train_df[100:100+w2.total_window_size]),
                           np.array(train_df[200:200+w2.total_window_size])])

example_inputs, example_labels = w2.split_window(example_window)

print('All shapes are: (batch, time, features)')
print(f'Window shape: {example_window.shape}')
print(f'Inputs shape: {example_inputs.shape}')
print(f'Labels shape: {example_labels.shape}')
Typically, data in TensorFlow is packed into arrays where the outermost index is across examples (the "batch" dimension). The middle indices are the "time" or "space" (width, height) dimension(s). The innermost indices are the features.
The code above took a batch of three 7-time-step windows, with all of our features at each time step. It splits them into a batch of 6-time-step multi-feature inputs and a 1-time-step, 1-feature label. The label only has one feature because the WindowGenerator was initialized with label_columns=['Open']. Initially, this tutorial will build models that predict single output labels.
3. Plot
Here is a plot method that allows simple visualization of the split window:
w2.example = example_inputs, example_labels
def plot(self, model=None, plot_col='Open', max_subplots=3):
  # Default to plotting the 'Open' column, which is the label used in this tutorial.
  inputs, labels = self.example
  plt.figure(figsize=(12, 8))
  plot_col_index = self.column_indices[plot_col]
  max_n = min(max_subplots, len(inputs))
  for n in range(max_n):
    plt.subplot(max_n, 1, n+1)
    plt.ylabel(f'{plot_col} [normed]')
    plt.plot(self.input_indices, inputs[n, :, plot_col_index],
             label='Inputs', marker='.', zorder=-10)

    if self.label_columns:
      label_col_index = self.label_columns_indices.get(plot_col, None)
    else:
      label_col_index = plot_col_index

    if label_col_index is None:
      continue

    plt.scatter(self.label_indices, labels[n, :, label_col_index],
                edgecolors='k', label='Labels', c='#2ca02c', s=64)
    if model is not None:
      predictions = model(inputs)
      plt.scatter(self.label_indices, predictions[n, :, label_col_index],
                  marker='X', edgecolors='k', label='Predictions',
                  c='#ff7f0e', s=64)

    if n == 0:
      plt.legend()

  plt.xlabel('Time [h]')
WindowGenerator.plot = plot
w2.plot()
This plot aligns inputs, labels, and (later) predictions based on the time that each item refers to.
4. Create tf.data.Datasets
Finally, this make_dataset method will take a time series DataFrame and convert it to a tf.data.Dataset of (input_window, label_window) pairs using the tf.keras.utils.timeseries_dataset_from_array function:
def make_dataset(self, data):
  data = np.array(data, dtype=np.float32)
  ds = tf.keras.utils.timeseries_dataset_from_array(
      data=data,
      targets=None,
      sequence_length=self.total_window_size,
      sequence_stride=1,
      shuffle=True,
      batch_size=32,)

  ds = ds.map(self.split_window)

  return ds
WindowGenerator.make_dataset = make_dataset
The?WindowGenerator?object holds training, validation, and test data.
Add properties for accessing them as?tf.data.Datasets using the?make_dataset?method you defined earlier. Also, add a standard example batch for easy access and plotting:
@property
def train(self):
  return self.make_dataset(self.train_df)

@property
def val(self):
  return self.make_dataset(self.val_df)

@property
def test(self):
  return self.make_dataset(self.test_df)

@property
def example(self):
  """Get and cache an example batch of `inputs, labels` for plotting."""
  result = getattr(self, '_example', None)
  if result is None:
    # No example batch was found, so get one from the `.train` dataset.
    result = next(iter(self.train))
    # And cache it for next time.
    self._example = result
  return result

WindowGenerator.train = train
WindowGenerator.val = val
WindowGenerator.test = test
WindowGenerator.example = example
Now, the?WindowGenerator?object gives you access to the?tf.data.Dataset?objects, so you can easily iterate over the data.
The Dataset.element_spec property tells you the structure, data types, and shapes of the dataset elements.
# Each element is an (inputs, label) pair.
w2.train.element_spec
Iterating over a Dataset yields concrete batches:
for example_inputs, example_labels in w2.train.take(1):
? print(f'Inputs shape (batch, time, features): {example_inputs.shape}')
? print(f'Labels shape (batch, time, features): {example_labels.shape}')
Single-step models
The simplest model you can build on this sort of data is one that predicts a single feature's value—1 time step (one hour) into the future based only on the current conditions.
So, start by building models to predict the?Open?value one hour into the future.
Recurrent neural network
A Recurrent Neural Network (RNN) is a type of neural network well-suited to time series data. RNNs process a time series step-by-step, maintaining an internal state from time-step to time step.
You can learn more in the Text generation with an RNN tutorial and the Recurrent Neural Networks (RNN) with Keras guide.
In this tutorial, you will use an RNN layer called Long Short-Term Memory (tf.keras.layers.LSTM).
An important constructor argument for all Keras RNN layers, such as tf.keras.layers.LSTM, is the return_sequences argument. This setting can configure the layer in one of two ways: if it is False (the default), the layer only returns the output of the final time step, giving the model time to warm up its internal state before making a single prediction; if it is True, the layer returns an output for each input, which is useful for stacking RNN layers or for training a model on multiple time steps simultaneously.
lstm_model = tf.keras.models.Sequential([
    # Shape [batch, time, features] => [batch, time, lstm_units]
    tf.keras.layers.LSTM(32, return_sequences=True),
    # Shape => [batch, time, features]
    tf.keras.layers.Dense(units=1)])
With return_sequences=True, the model can be trained on 24 hours of data at a time.
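The code below relies on a wide_window, a compile_and_fit helper, and the val_performance/performance dictionaries that are not defined in the article. A minimal sketch of them, following the pattern of the TensorFlow time-series tutorial (the 24-step window width, the patience value, and MAX_EPOCHS are assumptions), could be:

MAX_EPOCHS = 20

def compile_and_fit(model, window, patience=2):
  # Stop training early when the validation loss stops improving.
  early_stopping = tf.keras.callbacks.EarlyStopping(monitor='val_loss',
                                                    patience=patience,
                                                    mode='min')

  model.compile(loss=tf.keras.losses.MeanSquaredError(),
                optimizer=tf.keras.optimizers.Adam(),
                metrics=[tf.keras.metrics.MeanAbsoluteError()])

  history = model.fit(window.train, epochs=MAX_EPOCHS,
                      validation_data=window.val,
                      callbacks=[early_stopping])
  return history

# A wide window: 24 input steps and 24 label steps, shifted by one step.
wide_window = WindowGenerator(input_width=24, label_width=24, shift=1,
                              label_columns=['Open'])

# Dictionaries to collect the evaluation metrics of each model.
val_performance = {}
performance = {}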
import IPython.display

print('Input shape:', wide_window.example[0].shape)
print('Output shape:', lstm_model(wide_window.example[0]).shape)

history = compile_and_fit(lstm_model, wide_window)

IPython.display.clear_output()
val_performance['LSTM'] = lstm_model.evaluate(wide_window.val)
performance['LSTM'] = lstm_model.evaluate(wide_window.test, verbose=0)

wide_window.plot(lstm_model)
Multi-step models
Both the single-output and multiple-output models in the previous sections made?single time step predictions, one hour into the future.
This section looks at how to expand these models to make?multiple time step predictions.
In a multi-step prediction, the model needs to learn to predict a range of future values. Thus, unlike a single-step model, where only a single future point is predicted, a multi-step model predicts a sequence of the future values.
There are two rough approaches to this: single-shot predictions, where the entire time series is predicted at once, and autoregressive predictions, where the model only makes single-step predictions and its output is fed back as its input.
In this section, all the models will predict?all the features across all output time steps.
For the multi-step model, the training data again consists of hourly samples. However, here, the models will learn to predict 24 hours into the future, given 24 hours of the past.
Here is a WindowGenerator object that generates these slices from the dataset:
OUT_STEPS = 24
multi_window = WindowGenerator(input_width=24,
                               label_width=OUT_STEPS,
                               shift=OUT_STEPS)
multi_window.plot()
multi_window
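The article stops at the window definition; as an illustration of the single-shot approach, a model along the following lines (the layer sizes and the multi_lstm_model name are assumptions, following the same pattern as the single-step LSTM above) could be trained on multi_window:

multi_lstm_model = tf.keras.Sequential([
    # Shape [batch, time, features] => [batch, lstm_units].
    # return_sequences=False keeps only the last output of the sequence.
    tf.keras.layers.LSTM(32, return_sequences=False),
    # Shape => [batch, out_steps*features].
    tf.keras.layers.Dense(OUT_STEPS*num_features,
                          kernel_initializer=tf.initializers.zeros()),
    # Shape => [batch, out_steps, features].
    tf.keras.layers.Reshape([OUT_STEPS, num_features])
])

history = compile_and_fit(multi_lstm_model, multi_window)
multi_window.plot(multi_lstm_model)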
Conclusions
With the results we obtained, we can get a clear sense of the incredible randomness in the markets for currencies and assets, even more so in the case of cryptocurrencies. But using this work as a base, we can reproduce it for more stable markets, where we could obtain much better results.
In the analysis of the data, we could also use PCA (Principal Component Analysis), which would almost certainly keep the first three columns of our database, but it is also interesting to see whether the closing price of an asset could influence its behavior on the next day. These are interesting considerations to analyze in the future. If you are interested in reviewing the code, you can access this notebook in Google Colab, where all the executed code is, or my GitHub, where you will see the main functions.
Bibliography
https://www.tensorflow.org/tutorials/structured_data/time_series