
Regression using Neural Network

Gautam Karmakar

Leader in Oracle Cloud Engineering (AI/ML -Healthcare) | Ex-Microsoft | Driving Digital Transformation by harnessing the power of Cloud, Analytics & AI

Published: January 23, 2018

Keras, a wrapper API that runs on top of TensorFlow, is very popular and easy to use. scikit-learn is also a very popular machine learning library. In this post I will show how to use Keras and scikit-learn together to build a neural network in Python and train it as a regression model with a linear output layer.

Jump straight to the code to save reading time:

https://github.com/GKarmakar/RegressionUsingNN

Define a base model that the scikit-learn wrapper KerasRegressor will use to build the regression model.

from keras.models import Sequential
from keras.layers import Dense, Dropout, BatchNormalization

def baseline_model_1057(optimizer='adam'):
    # create model: 1057 input features -> 1058 -> 529 -> 1 output
    model = Sequential()
    model.add(Dense(1058, activation='relu',
                    kernel_regularizer='l2',
                    kernel_initializer='normal',
                    input_shape=(1057,)))
    model.add(BatchNormalization())
    model.add(Dropout(0.5))
    model.add(Dense(529, activation='relu',
                    kernel_regularizer='l2',
                    kernel_initializer='normal'))
    model.add(BatchNormalization())
    model.add(Dropout(0.5))
    # single linear unit for the regression target
    model.add(Dense(1, activation='linear',
                    kernel_regularizer='l2',
                    kernel_initializer='normal'))
    # accuracy is not meaningful for regression, so track mean absolute error instead
    model.compile(loss='mse', optimizer=optimizer, metrics=['mae'])
    return model
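To get a feel for the size of this network, the layer sizes above can be turned into a parameter count. The sketch below is plain Python and does not call Keras. The helper names dense_params and batchnorm_params are made up for illustration, and it assumes the usual accounting: one weight per input-output pair plus one bias per unit for a Dense layer, and four values per feature (scale, shift, moving mean, moving variance) for BatchNormalization.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units


def batchnorm_params(n_features):
    """Scale, shift, moving mean and moving variance per feature."""
    return 4 * n_features


# layer sizes taken from baseline_model_1057
layers = [
    ("dense_1", dense_params(1057, 1058)),
    ("batchnorm_1", batchnorm_params(1058)),
    ("dense_2", dense_params(1058, 529)),
    ("batchnorm_2", batchnorm_params(529)),
    ("output", dense_params(529, 1)),
]
total = sum(count for _, count in layers)

for name, count in layers:
    print(f"{name:12s} {count:>10,d}")
print(f"{'total':12s} {total:>10,d}")
```

Most of the parameters sit in the first Dense layer, which may be one reason the model combines L2 regularization with dropout.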

Now we write a method for training the model we created above:

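import numpy as np
from sklearn.model_selection import KFold, cross_val_score
# this wrapper ships with Keras 2.x; later releases moved it to the separate scikeras package
from keras.wrappers.scikit_learn import KerasRegressor

def train_data_nn(X_train, y_train):
    np.random.seed(42)
    # wrap the Keras model so scikit-learn can treat it as an estimator
    estimator = KerasRegressor(build_fn=baseline_model_1057, epochs=100, batch_size=10, verbose=0)
    # shuffle=True is needed for random_state to have any effect
    kfold = KFold(n_splits=10, shuffle=True, random_state=42)
    results = cross_val_score(estimator, X_train, y_train, cv=kfold)
    # the wrapper reports the negated loss, so values closer to zero are better
    print("Standardized: %.2f (%.2f) MSE" % (results.mean(), results.std()))
    return estimator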
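For readers unfamiliar with what KFold does behind cross_val_score, here is a minimal plain-Python sketch of 10-fold splitting. The function name kfold_indices is hypothetical and this is not scikit-learn's implementation; it only mirrors the idea that every sample lands in the held-out fold exactly once.

```python
import random


def kfold_indices(n_samples, n_splits=10, seed=42):
    """Yield (train_idx, test_idx) pairs for shuffled k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # the first (n_samples % n_splits) folds receive one extra sample
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        size = base + (1 if fold < extra else 0)
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


folds = list(kfold_indices(n_samples=25, n_splits=10))
for i, (train_idx, test_idx) in enumerate(folds):
    print(f"fold {i}: train={len(train_idx)} test={len(test_idx)}")
```

The model is trained once per fold on train_idx and scored on test_idx; the mean and standard deviation of those ten scores are what the print statement in train_data_nn reports.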

Define a method to visualize the loss. We use mean squared error (MSE) as the regression loss.

import matplotlib.pyplot as plt

def visualize_learning_curve(history):
    # summarize history for training and validation loss
    plt.plot(history.history['loss'])
    plt.plot(history.history['val_loss'])
    plt.title('model loss')
    plt.ylabel('loss')
    plt.xlabel('epoch')
    plt.legend(['train', 'validation'], loc='upper left')
    plt.show()

The main method preprocesses the data (drops empty and constant columns, splits off a validation set, standardizes the features), then trains the model and predicts on the test set.

import math
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_squared_error
from keras.callbacks import EarlyStopping

def train_and_predict(Xtrain, Xtest):
    X = Xtrain
    y = X['rank']
    X.drop('rank', inplace=True, axis=1)
    # drop columns that are entirely null or hold a single constant value
    null_cols = X.columns[X.isnull().all()]
    X.drop(null_cols, inplace=True, axis=1)
    nunique = X.apply(pd.Series.nunique)
    null_col_uni = nunique[nunique == 1].index
    X.drop(null_col_uni, inplace=True, axis=1)
    # apply the same column drops to the test set
    Xtest.drop(null_cols, inplace=True, axis=1)
    Xtest.drop(null_col_uni, inplace=True, axis=1)
    print('Train size:', X.shape, ' Test size:', Xtest.shape)
    seed = 7
    np.random.seed(seed)
    X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.1, random_state=42)
    # fit the scaler on the training split only, then reuse it for validation and test
    scaler = StandardScaler().fit(X_train)
    X_train = scaler.transform(X_train)
    X_val = scaler.transform(X_val)
    X_test = scaler.transform(Xtest)
    estimator = train_data_nn(X_train, y_train)
    early_stopping = EarlyStopping(monitor='loss', patience=1, verbose=1)
    history = estimator.fit(X_train, y_train, validation_split=0.1,
                            epochs=100, batch_size=10,
                            callbacks=[early_stopping],
                            verbose=1)
    visualize_learning_curve(history)
    # X_val is already a NumPy array after scaling, so no .values here
    rmse = math.sqrt(mean_squared_error(y_val.values, estimator.predict(X_val)))
    print(rmse)
    pred = estimator.predict(X_test)
    test_df = pd.DataFrame({'y_pred': pred})
    return test_df
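The scaling step in the function above fits StandardScaler on the training split and reuses the same statistics for the other splits. A minimal sketch of that idea in plain Python follows. The helpers fit_scaler and transform are hypothetical names, and the numbers are toy values, not data from this post.

```python
import statistics


def fit_scaler(column):
    """Return (mean, std) of one training feature column, using the population std."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    # a constant column has std 0; fall back to 1.0 to avoid dividing by zero
    return mean, (std if std > 0 else 1.0)


def transform(column, mean, std):
    """Standardize values with statistics learned elsewhere."""
    return [(x - mean) / std for x in column]


train_feature = [2.0, 4.0, 6.0, 8.0]  # toy values
val_feature = [5.0, 10.0]             # toy values

mean, std = fit_scaler(train_feature)
train_scaled = transform(train_feature, mean, std)
val_scaled = transform(val_feature, mean, std)  # reuses the training mean and std

print(train_scaled)
print(val_scaled)
```

Only the training column ends up with zero mean and unit variance. The validation values are shifted and scaled by the training statistics, so no information from held-out data leaks into preprocessing.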

Load and prepare the train and test data files:

import pandas as pd
from sklearn.utils import shuffle

df_train = pd.read_csv("train.csv")
df_test = pd.read_csv("test.csv")
train_num = len(df_train)
# add a placeholder target so both frames share the same columns
df_test.insert(0, 'rank', 0)
dataset = pd.concat(objs=[df_train, df_test], axis=0)
dataset.fillna(0, inplace=True)
# split back before shuffling; shuffling the combined frame first
# would mix test rows (with the placeholder rank) into the training set
df_train = shuffle(dataset[:train_num])
df_test = dataset[train_num:].drop('rank', axis=1)
print("Train Data:", df_train.shape)
print("Test Data:", df_test.shape)

Create predictions and a submission file in the style of a Kaggle submission.

test_df = train_and_predict(df_train, df_test)
submission = test_df
submission.sort_index(inplace=True)
# clamp predictions to the range [0, 100]
submission.loc[submission['y_pred'] < 0, 'y_pred'] = 0
submission.loc[submission['y_pred'] > 100, 'y_pred'] = 100
submission.to_csv("submission.csv", index=False)
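The two .loc assignments above clamp predictions into the interval from 0 to 100. The same rule can be written as a small plain-Python function; clip_prediction is a hypothetical name and the inputs are toy values.

```python
def clip_prediction(value, low=0.0, high=100.0):
    """Clamp a single prediction into [low, high]."""
    return max(low, min(high, value))


toy_predictions = [-3.2, 0.0, 41.7, 100.0, 128.9]  # illustrative values only
clipped = [clip_prediction(p) for p in toy_predictions]
print(clipped)
```

With pandas, submission['y_pred'].clip(lower=0, upper=100) expresses the same thing in one call.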

Grid Search Deep Learning Model Parameters

The previous example showed how easy it is to wrap your deep learning model from Keras and use it in functions from the scikit-learn library.

In this example, we go a step further. The function that we specify to the build_fn argument when creating the KerasRegressor wrapper can take arguments. We can use these arguments to further customize the construction of the model. In addition, we know we can provide arguments to the fit() function.

In this example, we use a grid search to evaluate different configurations for our neural network model and report on the combination that provides the best-estimated performance.

The model-building function passed as build_fn (baseline_model in the code below) is defined to take arguments such as optimizer and init, each of which must have a default value. This allows us to evaluate the effect of using different optimization algorithms and weight initialization schemes for our network.

After creating our model, we define arrays of values for the parameters we wish to search, specifically:

Optimizers for searching different weight values.

Initializers for preparing the network weights using different schemes.

Epochs for training the model for a different number of exposures to the training dataset.

Batches for varying the number of samples before a weight update.

The options are specified in a dictionary and passed to the GridSearchCV scikit-learn class. This class evaluates a version of our neural network model for each combination of parameters (2 x 3 x 3 x 3 = 54 for the combinations of optimizers, initializations, epochs and batches; the dropout rates and weight constraints in the code below multiply this to 2,700). Each combination is then evaluated using k-fold cross-validation (3 folds by default in older scikit-learn releases, 5 folds since version 0.22; the folds are stratified only for classifiers).

That is a lot of models and a lot of computation. This is not a scheme that you want to use lightly because of the time it will take. It may be useful for you to design small experiments with a smaller subset of your data that will complete in a reasonable time. This is reasonable in this case because of the small network and the small dataset (less than 1000 instances and 9 attributes).
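To make the cost concrete, the size of a grid can be counted before running it. The sketch below uses the value lists from the grid search code in this post and only the standard library. The fold count of 3 is an assumption, since the default depends on the scikit-learn version.

```python
from itertools import islice, product
from math import prod

# value lists copied from the grid search function in this post
param_grid = {
    "optimizer": ["rmsprop", "adam"],
    "dropout_rate": [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9],
    "init": ["glorot_uniform", "normal", "uniform"],
    "epochs": [50, 100, 150],
    "batch_size": [5, 10, 20],
    "weight_constraint": [1, 2, 3, 4, 5],
}

n_combinations = prod(len(values) for values in param_grid.values())
n_folds = 3  # assumption: older scikit-learn default; newer releases use 5
n_fits = n_combinations * n_folds

print("combinations:", n_combinations)
print("model fits:", n_fits)

# enumerate the first few combinations, the way a grid search walks them
names = list(param_grid)
for combo in islice(product(*param_grid.values()), 3):
    print(dict(zip(names, combo)))
```

GridSearchCV also refits the best configuration once on the full training set by default, so the real number of training runs is one higher.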

Finally, the performance and combination of configurations for the best model are displayed, followed by the performance of all combinations of parameters.

This might take about 5 minutes to complete on your workstation when executed on the CPU (rather than the GPU). Running the example prints these results.

We can see that the grid search discovered that using a uniform initialization scheme, rmsprop optimizer, 150 epochs and a batch size of 5 achieved the best cross-validation score of approximately 75% on this problem.

import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
# this wrapper ships with Keras 2.x; later releases moved it to the separate scikeras package
from keras.wrappers.scikit_learn import KerasRegressor

def gridSearch_neural_network(df_train, ytrain):
    # fix random seed for reproducibility
    seed = 7
    np.random.seed(seed)
    X_train, X_val, y_train, y_val = train_test_split(df_train, ytrain, test_size=0.1, random_state=42)
    print("Train Data:", X_train.shape)
    print("Train label:", y_train.shape)
    # baseline_model must accept optimizer, dropout_rate, weight_constraint and init
    # as keyword arguments with default values
    estimator = KerasRegressor(build_fn=baseline_model, epochs=100, batch_size=5, verbose=0)
    # grid search epochs, batch size, optimizer and the model arguments
    optimizers = ['rmsprop', 'adam']
    dropout_rate = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
    init = ['glorot_uniform', 'normal', 'uniform']
    epochs = [50, 100, 150]
    batches = [5, 10, 20]
    weight_constraint = [1, 2, 3, 4, 5]
    param_grid = dict(optimizer=optimizers,
                      dropout_rate=dropout_rate,
                      epochs=epochs,
                      batch_size=batches,
                      weight_constraint=weight_constraint,
                      init=init)
    grid = GridSearchCV(estimator=estimator, param_grid=param_grid)
    grid_result = grid.fit(X_train.values, y_train.values)
    # summarize results
    print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
    means = grid_result.cv_results_['mean_test_score']
    stds = grid_result.cv_results_['std_test_score']
    params = grid_result.cv_results_['params']
    for mean, stdev, param in zip(means, stds, params):
        print("%f (%f) with: %r" % (mean, stdev, param))

Summary

In this post, you discovered how you can wrap your Keras deep learning models and use them in the scikit-learn general machine learning library.

You can see that using scikit-learn for standard machine learning operations such as model evaluation and model hyperparameter optimization can save a lot of time over implementing these schemes yourself.

Wrapping your model allows you to leverage powerful tools from scikit-learn and fit your deep learning models into your general machine learning process.
