Artificial Neural Network¶

Importing the libraries¶

In [45]:
import numpy as np
import pandas as pd
import tensorflow as tf
In [46]:
tf.__version__
Out[46]:
'2.20.0'

Part 1 - Data Preprocessing¶

Importing the dataset¶
In [47]:
df = pd.read_csv('Churn_Modelling.csv')
df.head(10)
Out[47]:
   RowNumber  CustomerId  Surname   CreditScore  Geography  Gender  Age  Tenure  Balance    NumOfProducts  HasCrCard  IsActiveMember  EstimatedSalary  Exited
0  1          15634602    Hargrave  619          France     Female  42   2       0.00       1              1          1               101348.88        1
1  2          15647311    Hill      608          Spain      Female  41   1       83807.86   1              0          1               112542.58        0
2  3          15619304    Onio      502          France     Female  42   8       159660.80  3              1          0               113931.57        1
3  4          15701354    Boni      699          France     Female  39   1       0.00       2              0          0               93826.63         0
4  5          15737888    Mitchell  850          Spain      Female  43   2       125510.82  1              1          1               79084.10         0
5  6          15574012    Chu       645          Spain      Male    44   8       113755.78  2              1          0               149756.71        1
6  7          15592531    Bartlett  822          France     Male    50   7       0.00       2              1          1               10062.80         0
7  8          15656148    Obinna    376          Germany    Female  29   4       115046.74  4              1          0               119346.88        1
8  9          15792365    He        501          France     Male    44   4       142051.07  2              0          1               74940.50         0
9  10         15592389    H?        684          France     Male    27   2       134603.88  1              1          1               71725.73         0
In [48]:
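# Features: CreditScore through EstimatedSalary (skips RowNumber, CustomerId, Surname)
X = df.iloc[:, 3:-1].values
# Target: the last column, Exited
y = df.iloc[:, -1].values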
In [49]:
y[0:10]
Out[49]:
array([1, 0, 1, 0, 0, 1, 0, 1, 0, 0])
In [50]:
X[0:10]
Out[50]:
array([[619, 'France', 'Female', 42, 2, 0.0, 1, 1, 1, 101348.88],
       [608, 'Spain', 'Female', 41, 1, 83807.86, 1, 0, 1, 112542.58],
       [502, 'France', 'Female', 42, 8, 159660.8, 3, 1, 0, 113931.57],
       [699, 'France', 'Female', 39, 1, 0.0, 2, 0, 0, 93826.63],
       [850, 'Spain', 'Female', 43, 2, 125510.82, 1, 1, 1, 79084.1],
       [645, 'Spain', 'Male', 44, 8, 113755.78, 2, 1, 0, 149756.71],
       [822, 'France', 'Male', 50, 7, 0.0, 2, 1, 1, 10062.8],
       [376, 'Germany', 'Female', 29, 4, 115046.74, 4, 1, 0, 119346.88],
       [501, 'France', 'Male', 44, 4, 142051.07, 2, 0, 1, 74940.5],
       [684, 'France', 'Male', 27, 2, 134603.88, 1, 1, 1, 71725.73]],
      dtype=object)

Encoding categorical data¶

Label Encoding the "Gender" column

In [51]:
from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
X[:, 2] = le.fit_transform(X[:, 2])
In [52]:
print(X[0:10])
[[619 'France' 0 42 2 0.0 1 1 1 101348.88]
 [608 'Spain' 0 41 1 83807.86 1 0 1 112542.58]
 [502 'France' 0 42 8 159660.8 3 1 0 113931.57]
 [699 'France' 0 39 1 0.0 2 0 0 93826.63]
 [850 'Spain' 0 43 2 125510.82 1 1 1 79084.1]
 [645 'Spain' 1 44 8 113755.78 2 1 0 149756.71]
 [822 'France' 1 50 7 0.0 2 1 1 10062.8]
 [376 'Germany' 0 29 4 115046.74 4 1 0 119346.88]
 [501 'France' 1 44 4 142051.07 2 0 1 74940.5]
 [684 'France' 1 27 2 134603.88 1 1 1 71725.73]]
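
The 0/1 codes printed above follow from how label encoding assigns integers: the distinct labels are sorted and each label is replaced by its position in that sorted list, so "Female" becomes 0 and "Male" becomes 1. Here is a minimal pure-Python sketch of that idea. The helper name `label_encode` is made up for illustration and is not part of scikit-learn.

```python
def label_encode(values):
    """Replace each label by its index in the sorted list of distinct labels."""
    classes = sorted(set(values))
    index = {label: i for i, label in enumerate(classes)}
    return [index[v] for v in values], classes

# Gender values of the first ten rows shown earlier
genders = ['Female', 'Female', 'Female', 'Female', 'Female',
           'Male', 'Male', 'Female', 'Male', 'Male']
codes, classes = label_encode(genders)
print(classes)
print(codes)
```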

One Hot Encoding the "Geography" column

In [53]:
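from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
# One-hot encode column 1 (Geography); remainder='passthrough' keeps every other column unchanged
ct = ColumnTransformer(transformers=[('encoder', OneHotEncoder(), [1])], remainder='passthrough')
# The encoded columns come first in the result, followed by the passthrough columns
X = np.array(ct.fit_transform(X))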
In [54]:
print(X[0:10])
[[1.0 0.0 0.0 619 0 42 2 0.0 1 1 1 101348.88]
 [0.0 0.0 1.0 608 0 41 1 83807.86 1 0 1 112542.58]
 [1.0 0.0 0.0 502 0 42 8 159660.8 3 1 0 113931.57]
 [1.0 0.0 0.0 699 0 39 1 0.0 2 0 0 93826.63]
 [0.0 0.0 1.0 850 0 43 2 125510.82 1 1 1 79084.1]
 [0.0 0.0 1.0 645 1 44 8 113755.78 2 1 0 149756.71]
 [1.0 0.0 0.0 822 1 50 7 0.0 2 1 1 10062.8]
 [0.0 1.0 0.0 376 0 29 4 115046.74 4 1 0 119346.88]
 [1.0 0.0 0.0 501 1 44 4 142051.07 2 0 1 74940.5]
 [1.0 0.0 0.0 684 1 27 2 134603.88 1 1 1 71725.73]]
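
To see where the three leading columns come from, here is a small pure-Python sketch of one-hot encoding. It assumes the categories are ordered alphabetically, which matches the output above (France, Germany, Spain). The helper `one_hot` is hypothetical and only mimics the idea, not the scikit-learn implementation.

```python
def one_hot(values):
    """Return one indicator column per distinct value, in sorted category order."""
    categories = sorted(set(values))
    rows = [[1.0 if v == c else 0.0 for c in categories] for v in values]
    return rows, categories

# Geography values of the first ten rows shown earlier
countries = ['France', 'Spain', 'France', 'France', 'Spain',
             'Spain', 'France', 'Germany', 'France', 'France']
rows, categories = one_hot(countries)
print(categories)
for country, row in zip(countries[:3], rows[:3]):
    print(country, row)
```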

Splitting the dataset into the Training set and Test set¶

In [55]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)

Feature Scaling¶

In [56]:
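from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
# Learn the mean and standard deviation from the training set only
X_train = sc.fit_transform(X_train)
# Reuse the training statistics on the test set so no test information leaks into training
X_test = sc.transform(X_test)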
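
Standardization replaces each value x by (x - mean) / std, where the mean and standard deviation are computed per feature on the training data. The sketch below illustrates this on a single column using only the standard library. The split of the ten previewed Age values into eight "training" and two "test" values is an assumption made for the example and is not the split produced by `train_test_split`.

```python
from statistics import fmean, pstdev

def fit_scaler(column):
    """Return the mean and population standard deviation of a training column."""
    return fmean(column), pstdev(column)

def transform(column, mean, std):
    """Standardize a column with statistics learned elsewhere."""
    return [(x - mean) / std for x in column]

train_age = [42, 41, 42, 39, 43, 44, 50, 29]  # illustrative training values
test_age = [44, 27]                            # illustrative test values

mean, std = fit_scaler(train_age)              # statistics come from training data only
scaled_train = transform(train_age, mean, std)
scaled_test = transform(test_age, mean, std)   # test data reuses the training statistics
print(scaled_train)
print(scaled_test)
```

After this step the training column has mean 0 and standard deviation 1, while the test column generally does not, because it was scaled with the training statistics.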

Part 2 - Building the ANN¶

Initializing the ANN¶

In [57]:
ann = tf.keras.models.Sequential()

Adding the input layer and the first hidden layer¶

In [58]:
ann.add(tf.keras.layers.Dense(units=6, activation='relu'))

Adding the second hidden layer¶

In [59]:
ann.add(tf.keras.layers.Dense(units=6, activation='relu'))

Adding the output layer¶

In [60]:
ann.add(tf.keras.layers.Dense(units=1, activation='sigmoid'))
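
As a quick sanity check on the size of this network, the number of trainable parameters can be counted by hand: each dense layer has one weight per input-unit pair plus one bias per unit. The sketch below assumes 12 input features, which is the number of columns in X after encoding.

```python
def dense_params(n_inputs, units):
    """Weights (n_inputs * units) plus one bias per unit."""
    return n_inputs * units + units

n_inputs = 12            # assumption: columns of X after one-hot and label encoding
layer_units = [6, 6, 1]  # two hidden layers and the output layer

counts = []
for units in layer_units:
    counts.append(dense_params(n_inputs, units))
    n_inputs = units     # the next layer receives this layer's outputs

print(counts)       # [78, 42, 7]
print(sum(counts))  # 127
```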

Part 3 - Training the ANN¶

Compiling the ANN¶

In [61]:
ann.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
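
The `binary_crossentropy` loss penalizes confident wrong probabilities much more than hesitant ones. For a label y in {0, 1} and a predicted probability p, the per-sample loss is -(y log p + (1 - y) log(1 - p)), averaged over the batch. Below is a minimal sketch of that formula. The clipping constant `eps` is an assumption of this sketch to avoid log(0), and the sample probabilities are made up.

```python
import math

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -(y*log(p) + (1-y)*log(1-p)) over all samples."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # keep p strictly inside (0, 1)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

y_true = [1, 0, 1, 0]
good = binary_crossentropy(y_true, [0.9, 0.1, 0.8, 0.3])  # mostly right
bad = binary_crossentropy(y_true, [0.2, 0.7, 0.4, 0.9])   # mostly wrong
print(good, bad)
```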

Training the ANN on the Training set¶

In [62]:
from keras.callbacks import Callback

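class InlineLogger(Callback):
    """Print a single progress line that is overwritten after every epoch."""
    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        # Read the epoch count from the fit() parameters instead of hard-coding it
        total = (self.params or {}).get('epochs', '?')
        print(
            f"\rEpoch {epoch + 1}/{total} "
            f"- loss: {logs.get('loss', 0):.4f} "
            f"- accuracy: {logs.get('accuracy', 0):.4f}",
            end=""
        )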

# Use it in fit()
ann.fit(X_train, y_train, batch_size=32, epochs=100, callbacks=[InlineLogger()], verbose=0)
Epoch 100/100 - loss: 0.3310 - accuracy: 0.8630
Out[62]:
<keras.src.callbacks.history.History at 0x1a24fb8a990>
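
For orientation, the number of weight updates per epoch follows from the training-set size and the batch size. The sketch below assumes the dataset has 10,000 rows. That figure is not printed in this notebook but is inferred from the 2,000 test rows in the confusion matrix together with `test_size = 0.2`.

```python
import math

total_rows = 10_000  # assumption: inferred from 2,000 test rows at test_size = 0.2
test_size = 0.2
batch_size = 32

n_test = round(total_rows * test_size)
n_train = total_rows - n_test
train_batches = math.ceil(n_train / batch_size)  # weight updates per epoch
test_batches = math.ceil(n_test / batch_size)    # batches when predicting on the test set
print(n_train, n_test, train_batches, test_batches)
```

Under that assumption the test set needs 63 batches of 32, which agrees with the `63/63` progress bar shown when predicting the test set results.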

Part 4 - Making the predictions and evaluating the model¶

Predicting the result of a single observation¶

Question

Use our ANN model to predict whether the customer with the following information will leave the bank:

Geography: France

Credit Score: 600

Gender: Male

Age: 40 years old

Tenure: 3 years

Balance: $ 60000

Number of Products: 2

Does this customer have a credit card? Yes

Is this customer an Active Member: Yes

Estimated Salary: $ 50000

So, should we say goodbye to that customer?

Solution

In [63]:
print(ann.predict(sc.transform([[1, 0, 0, 600, 1, 40, 3, 60000, 2, 1, 1, 50000]])) > 0.5)
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 74ms/step
[[False]]

Therefore, our ANN model predicts that this customer stays with the bank!

Important note 1: Notice that the feature values were all entered inside a double pair of square brackets. That's because the "predict" method expects a 2D array as input, with one row per observation, and wrapping our values in double square brackets turns the single observation into exactly such a 2D array.

Important note 2: Notice also that the country "France" was not entered as a string but as "1, 0, 0" in the first three columns. That's because the predict method expects the one-hot-encoded values of the country, and as we see in the first row of the matrix of features X, "France" was encoded as "1, 0, 0". Be careful to put these values in the first three columns, because the ColumnTransformer places the one-hot-encoded columns ahead of the passthrough columns. Likewise, "Male" is entered as 1, matching the label encoding of the "Gender" column.
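
The two notes above can be combined into a small helper that builds the input row from readable values, so the encoding order does not have to be remembered each time. This is only a sketch: `build_row` and the two lookup tables are hypothetical names, and the mappings are copied from the encoded output printed earlier in this notebook.

```python
# Mappings taken from the encoded output shown earlier
COUNTRY_ONE_HOT = {'France': [1, 0, 0], 'Germany': [0, 1, 0], 'Spain': [0, 0, 1]}
GENDER_CODE = {'Female': 0, 'Male': 1}

def build_row(geography, credit_score, gender, age, tenure,
              balance, num_products, has_card, is_active, salary):
    """Return one observation in the column order of the encoded matrix X."""
    return COUNTRY_ONE_HOT[geography] + [
        credit_score, GENDER_CODE[gender], age, tenure,
        balance, num_products, has_card, is_active, salary,
    ]

row = build_row('France', 600, 'Male', 40, 3, 60000, 2, 1, 1, 50000)
batch = [row]  # the outer list makes this a 2D input with a single row
print(batch)
```

The resulting `batch` could then be passed through `sc.transform` and `ann.predict` in the same way as the hand-written list in the solution cell.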

Predicting the Test set results¶

In [64]:
y_pred = ann.predict(X_test)
y_pred = (y_pred > 0.5)
#print(np.concatenate((y_pred.reshape(len(y_pred),1), y_test.reshape(len(y_test),1)),1))
63/63 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step

Confusion Matrix¶

In [65]:
from sklearn.metrics import confusion_matrix, accuracy_score
cm = confusion_matrix(y_test, y_pred)
print(cm)
[[1504   91]
 [ 191  214]]
In [66]:
# Accuracy: fraction of test-set predictions that match y_test
accuracy_score(y_test, y_pred)
Out[66]:
0.859
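
The accuracy can be recovered from the confusion matrix by hand, and the same four counts also give precision and recall for the "exited" class, which tell us more than accuracy alone when the classes are imbalanced. A minimal sketch using the matrix printed above, where rows are true labels and columns are predicted labels:

```python
cm = [[1504, 91],
      [191, 214]]
(tn, fp), (fn, tp) = cm  # true negatives, false positives, false negatives, true positives

total = tn + fp + fn + tp
accuracy = (tn + tp) / total  # share of all predictions that are correct
precision = tp / (tp + fp)    # of the customers predicted to leave, how many did
recall = tp / (tp + fn)       # of the customers who left, how many were caught

print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
# accuracy=0.859 precision=0.702 recall=0.528
```

So although about 86% of predictions are correct, the model identifies only a little over half of the customers who actually left.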