Capstone 9 Scope
Capstone 9 converts the copied churn-prediction assignment into an executed TensorFlow ANN workflow with saved training history, confusion-matrix evidence, and sample-customer scoring.
Primary staged dataset: Churn_Modeling.csv.
Training history, prediction samples, and summary outputs are staged under outputs/.
Original Project PDF
The copied project directions are embedded here for direct comparison against the notebook and output artifacts.
Requirement Walkthrough
Each walkthrough block maps the copied PDF requirements to the executed notebook cells, exported outputs, and reviewable evidence staged with this capstone.
9a
Prepare The Churn Dataset For ANN Training
Notebook section: Load, drop, encode, and split cells
Requirement: Drop personal-data columns, encode Geography and Gender, and split the dataset 80:20 with random_state 0.
The notebook removes RowNumber, CustomerId, and Surname, standard-scales the eight numeric columns, one-hot encodes Geography and Gender, and assembles the processed feature matrix for the ANN.
Results Capture
- Dataset shape is (10000, 14).
- Processed feature count is 13: dropping the three identifier columns and the Exited target leaves 10 raw features, which expand to 8 scaled numeric columns plus 3 Geography and 2 Gender one-hot columns.
working_df = df.drop(columns=['RowNumber', 'CustomerId', 'Surname']).copy()
preprocessor = ColumnTransformer([...])
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0, stratify=y)
9b
Build And Train The Required ANN
Notebook section: Sequential-model and fit cells
Requirement: Create the ANN with a 6-neuron ReLU dense layer, a 1-neuron sigmoid output layer, and train it with Adam and binary_crossentropy.
The notebook trains the ANN for the copied Session 9 workflow and exports a training-history CSV plus the line chart used by the site page.
Results Capture
- The training-history CSV is exported as session_9_training_history.csv.
- Accuracy and loss histories are saved as plot artifacts for the page.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(X_train_processed.shape[1],)),
    tf.keras.layers.Dense(6, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
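As a quick architecture sanity check that is not in the notebook: with the 13 processed features, the two dense layers carry (13×6 + 6) + (6×1 + 1) = 91 trainable parameters, which `model.summary()` should confirm.

# Hand count of trainable parameters for the required architecture:
# 13 inputs -> 6 ReLU units: 13*6 weights + 6 biases = 84
# 6 units -> 1 sigmoid output: 6*1 weights + 1 bias = 7
expected_params = (13 * 6 + 6) + (6 * 1 + 1)
assert expected_params == 91
model.summary()  # Total params should read 91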
Associated Artifact
Training History
Saved accuracy and loss curves across epochs.
Associated Artifact
Confusion Matrix
Saved confusion-matrix heatmap for the held-out test set.
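To read the saved matrix beyond the heatmap, the exported summary can be turned into precision and recall for the churn class. A minimal sketch, assuming the outputs/ layout described above and scikit-learn's default ordering, where the 2×2 matrix is [[TN, FP], [FN, TP]] with Exited=1 as the positive class:

import json
import numpy as np

# Path assumes the staged outputs/ folder described on this page.
with open('outputs/session_9_summary.json', encoding='utf-8') as handle:
    cm = np.array(json.load(handle)['confusion_matrix'])

tn, fp, fn, tp = cm.ravel()  # [[TN, FP], [FN, TP]]
precision = tp / (tp + fp)   # flagged churners who actually churned
recall = tp / (tp + fn)      # actual churners the model caught
print(f'precision={precision:.3f} recall={recall:.3f}')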
9c
Evaluate The Test Set And Score The Sample Customer
Notebook section: Evaluation and sample-prediction cells
Requirement: Evaluate the ANN on the test set and predict whether the specified customer should be allowed to go.
The notebook exports prediction samples for the held-out split and also scores the exact sample customer required by the copied PDF.
Results Capture
- Test accuracy is 0.8545.
- Sample-customer decision: Allow to stay.
test_probabilities = model.predict(X_test_processed, verbose=0).ravel()
sample_probability = float(model.predict(sample_processed, verbose=0).ravel()[0])
sample_decision = 'Do not allow to go' if sample_prediction == 1 else 'Allow to stay'
Data And Artifact Links
The links below open the copied project files, executed notebook, generated outputs, and staged evidence artifacts for this capstone.
Artifact
Project PDF
Open the copied project directions PDF for this capstone.
Artifact
Notebook Evidence
View the notebook as a readable page or download the original file.
Artifact
Requirements File
Open the generated requirements file for the website workflow.
Artifact
Original CSV Dataset
View the original source CSV staged for this capstone or download the raw file.
Artifact
JSON Output
Open the generated JSON artifact or download the original file.
Artifact
CSV Output
Open the generated CSV handoff or download the original file.
Artifact
Training History CSV
Exported epoch-by-epoch loss and accuracy history.
Artifact
Prediction Samples CSV
Exported sample predictions for the held-out test set.
Artifact
Summary JSON
Structured summary of evaluation metrics and sample-customer scoring.
Interactive Neural Network Lab
This TensorFlow Playground embed is a concept simulator for the ANN ideas behind the churn project. It does not load `Churn_Modeling.csv`; instead, it preloads small synthetic classification datasets and network settings so you can watch how hidden layers, activations, learning rate, and regularization change the learned decision boundary.
What This Is
- The embedded lab is not the graded Session 9 model and does not use the bank churn dataset from the notebook.
- What is preloaded depends on the preset button you click: each preset swaps in a synthetic dataset, activation, network shape, learning rate, train split, and noise setting.
- The right-hand plot shows the model output over the 2D feature space, while the loss values at the top tell you whether training is improving.
How To Use It
- Click any preset button above the embed to load that preconfigured scenario into the playground frame.
- Press the Play button in the top-left corner of the playground to start training the network.
- Watch the epoch counter, loss readout, and colored output panel update as the model learns.
- Switch to another preset to compare how a different dataset or network design changes the training behavior.
What To Look For
- Decision Boundary Basics: expect a smooth boundary forming around the center cluster as the tanh network converges.
- Hidden Layers On Spiral Data: expect a harder problem that needs more epochs and more capacity to untangle the spiral arms.
- ReLU On XOR: expect the network to learn a nonlinear separation that a simple linear separator cannot produce; the sketch after this list demonstrates the same contrast in code.
- Regularization Under Noise: expect noisier points and a smoother boundary, which helps explain overfitting control.
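The XOR claim is easy to verify outside the embed. Below is a minimal sketch, separate from the graded notebook, that fits a linear-only model and a one-hidden-layer ReLU network on the four XOR points; outcomes vary with random initialization, but a linear separator can never get more than three of the four points right, while the small MLP typically reaches all four.

import numpy as np
import tensorflow as tf

# The four XOR points: the label is 1 exactly when the two inputs differ.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([0., 1., 1., 0.])

def fit_and_score(model):
    model.compile(optimizer=tf.keras.optimizers.Adam(0.05),
                  loss='binary_crossentropy', metrics=['accuracy'])
    model.fit(X, y, epochs=500, verbose=0)
    return model.evaluate(X, y, verbose=0)[1]

# A single sigmoid unit is a linear separator: at best 3/4 correct on XOR.
linear = tf.keras.Sequential([tf.keras.layers.Input(shape=(2,)),
                              tf.keras.layers.Dense(1, activation='sigmoid')])
# One ReLU hidden layer supplies the nonlinearity XOR needs.
mlp = tf.keras.Sequential([tf.keras.layers.Input(shape=(2,)),
                           tf.keras.layers.Dense(6, activation='relu'),
                           tf.keras.layers.Dense(1, activation='sigmoid')])

print('linear accuracy:', fit_and_score(linear))  # capped at 0.75
print('mlp accuracy:', fit_and_score(mlp))        # typically 1.0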
Preset 1
Decision Boundary Basics
Preloads a circle-classification toy dataset with `x` and `y` inputs, a tanh network shaped `4,2`, learning rate `0.03`, zero noise, and 50% train split. Press Play to watch the network learn a nonlinear boundary around the center cluster, which is the same classification idea used by the churn ANN even though this demo uses synthetic 2D points instead of bank-customer rows.
Preset 2
Hidden Layers On Spiral Data
Preloads the spiral dataset with a deeper `8,6,4` tanh network so you can see why extra hidden-layer capacity helps on more complex class boundaries. Use it to compare a shallow ANN with a deeper one: training takes longer, but the network can represent more complicated separation patterns.
Preset 3
ReLU On XOR
Preloads the XOR dataset with a `6,3` ReLU network so you can compare activation choice and topology. Press Play and watch the network solve a pattern that a linear model cannot separate, which mirrors why hidden layers and nonlinear activations matter in ANN-based classification.
Preset 4
Regularization Under Noise
Preloads Gaussian classification data with 15% noise, visible test points, and regularization rate `0.001`. Press Play and compare how the learned boundary stays smoother under noisy data, which is useful for explaining generalization and overfitting risk in the churn project.
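Under the hood, buttons like these can be wired without any playground API: the playground stores its entire configuration in the URL hash, so loading a preset is just pointing the embed at a new fragment. A minimal sketch of that wiring follows; the parameter keys and dataset values (`circle`, `spiral`, `xor`, `gauss`) mirror the hash the playground writes into the address bar as you change settings, and should be verified against a live URL before reuse.

# Builds playground URLs for the four presets described above.
BASE = 'https://playground.tensorflow.org/#'

PRESETS = {
    'Decision Boundary Basics': {
        'dataset': 'circle', 'activation': 'tanh', 'networkShape': '4,2',
        'learningRate': '0.03', 'noise': '0', 'percTrainData': '50',
    },
    'Hidden Layers On Spiral Data': {
        'dataset': 'spiral', 'activation': 'tanh', 'networkShape': '8,6,4',
    },
    'ReLU On XOR': {
        'dataset': 'xor', 'activation': 'relu', 'networkShape': '6,3',
    },
    'Regularization Under Noise': {
        'dataset': 'gauss', 'noise': '15', 'regularizationRate': '0.001',
        'showTestData': 'true',
    },
}

def preset_url(name: str) -> str:
    params = PRESETS[name]
    return BASE + '&'.join(f'{key}={value}' for key, value in params.items())

print(preset_url('ReLU On XOR'))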
These four buttons are preset loaders, not dead tabs. Clicking one reloads the embedded playground with a different preconfigured dataset and network. The actual graded evidence for Session 9 still comes from the notebook, the training-history plot, the confusion matrix, and the exported churn-prediction outputs.
Colab Notebook
This section provides the notebook preview, launch link, and project file links.
The notebook opens in Google Colab when a launch URL is configured, and the project files and outputs remain available here on the site.
Embedded Notebook Preview
Cell 1 Markdown
Capstone Session 9
This notebook is generated from the copied Capstone_Session_9.pdf directions and the staged Churn_Modeling.csv dataset.
Cell 2 Markdown
Objective
Build the required artificial neural network for customer churn prediction, evaluate it on the held-out test set, and score the specified sample customer.
Cell 3 Code · python
from pathlib import Path
import json
import os
import sys
from urllib.parse import quote
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
import tensorflow as tf
from IPython.display import display
from sklearn.compose import ColumnTransformer
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
tf.keras.utils.set_random_seed(42)
IS_COLAB = 'google.colab' in sys.modules
GITHUB_REPO_OWNER = 'FrancisBurnet'
GITHUB_REPO_NAME = 'francisburnet'
GITHUB_REPO_BRANCH = 'main'
CAPSTONE_ROOT = Path('Incremental Capstones/Deep Learning Specialization/Capstone Session 9')
DATASET_FILENAME = 'Churn_Modeling.csv'
def build_raw_github_url(relative_path: Path) -> str:
    encoded_path = quote(relative_path.as_posix(), safe='/')
    return (
        f"https://raw.githubusercontent.com/{GITHUB_REPO_OWNER}/{GITHUB_REPO_NAME}/"
        f"{GITHUB_REPO_BRANCH}/{encoded_path}"
    )
def resolve_capstone_dir() -> Path | None:
    current = Path.cwd().resolve()
    for candidate in [current, *current.parents]:
        if candidate.name == CAPSTONE_ROOT.name and (candidate / DATASET_FILENAME).exists():
            return candidate
        nested_candidate = candidate / CAPSTONE_ROOT
        if nested_candidate.exists():
            return nested_candidate
    return None
CAPSTONE_DIR = resolve_capstone_dir()
DATASET_URL = build_raw_github_url(CAPSTONE_ROOT / DATASET_FILENAME)
if CAPSTONE_DIR is not None:
    OUTPUT_ROOT = CAPSTONE_DIR
    OUTPUT_MODE = 'permanent capstone outputs'
else:
    runtime_root = Path('/content/capstone-session-9-runtime') if IS_COLAB else Path.cwd().resolve() / 'capstone-session-9-runtime'
    OUTPUT_ROOT = runtime_root
    OUTPUT_MODE = 'runtime scratch outputs; export final artifacts back into the capstone outputs folder'
OUTPUTS_DIR = (OUTPUT_ROOT / 'outputs').resolve()
PLOTS_DIR = OUTPUTS_DIR / 'plots'
OUTPUTS_DIR.mkdir(parents=True, exist_ok=True)
PLOTS_DIR.mkdir(parents=True, exist_ok=True)
sns.set_theme(style='whitegrid')
pd.set_option('display.max_columns', 100)
print('Runtime:', 'Google Colab' if IS_COLAB else 'Local / notebook runtime')
print('Capstone directory:', CAPSTONE_DIR if CAPSTONE_DIR is not None else 'Not available in current runtime')
print('Dataset source:', DATASET_URL)
print('Output mode:', OUTPUT_MODE)
print('Outputs directory:', OUTPUTS_DIR)
Cell 4 Code · python
from io import StringIO
df = pd.read_csv(DATASET_URL)
missing_summary = pd.DataFrame({
    'missing_count': df.isna().sum(),
    'missing_pct': (df.isna().mean() * 100).round(2),
})
info_buffer = StringIO()
df.info(buf=info_buffer)
print('Dataset source used:', DATASET_URL)
print('Shape:', df.shape)
print('Duplicate rows:', int(df.duplicated().sum()))
print(info_buffer.getvalue())
display(df.head())
display(df.describe().transpose())
display(missing_summary)
print('Target distribution:', df['Exited'].value_counts().to_dict())
Cell 5 Code · python
working_df = df.drop(columns=['RowNumber', 'CustomerId', 'Surname']).copy()
X = working_df.drop(columns=['Exited'])
y = working_df['Exited']
categorical_columns = ['Geography', 'Gender']
numeric_columns = [column for column in X.columns if column not in categorical_columns]
preprocessor = ColumnTransformer([
    ('num', StandardScaler(), numeric_columns),
    ('cat', OneHotEncoder(handle_unknown='ignore', sparse_output=False), categorical_columns),
])
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0, stratify=y)
X_train_processed = preprocessor.fit_transform(X_train)
X_test_processed = preprocessor.transform(X_test)
print('Processed train shape:', X_train_processed.shape)
print('Processed test shape:', X_test_processed.shape)
Cell 6 Code · python
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(X_train_processed.shape[1],)),
    tf.keras.layers.Dense(6, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
history = model.fit(
    X_train_processed,
    y_train,
    epochs=10,
    batch_size=10,
    validation_split=0.2,
    verbose=0,
)
pd.DataFrame(history.history).head()
Cell 7 Code · python
test_probabilities = model.predict(X_test_processed, verbose=0).ravel()
test_predictions = (test_probabilities >= 0.5).astype(int)
test_accuracy = float(accuracy_score(y_test, test_predictions))
test_confusion = confusion_matrix(y_test, test_predictions)
print('Test accuracy:', round(test_accuracy, 4))
print('Confusion matrix:', test_confusion.tolist())
Cell 8 Code · python
fig, axes = plt.subplots(1, 2, figsize=(12, 4))
axes[0].plot(history.history['accuracy'], label='train')
axes[0].plot(history.history['val_accuracy'], label='validation')
axes[0].set_title('Accuracy by Epoch')
axes[0].legend()
axes[1].plot(history.history['loss'], label='train')
axes[1].plot(history.history['val_loss'], label='validation')
axes[1].set_title('Loss by Epoch')
axes[1].legend()
fig.tight_layout()
fig.savefig(PLOTS_DIR / 'training_history.png', dpi=150)
plt.show()
plt.close(fig)
fig, ax = plt.subplots(figsize=(5, 4))
sns.heatmap(test_confusion, annot=True, fmt='d', cmap='Blues', ax=ax)
ax.set_title('Confusion Matrix')
ax.set_xlabel('Predicted')
ax.set_ylabel('Actual')
fig.tight_layout()
fig.savefig(PLOTS_DIR / 'confusion_matrix.png', dpi=150)
plt.show()
plt.close(fig)
Cell 9 Code · python
sample_customer = pd.DataFrame([{
    'CreditScore': 600,
    'Geography': 'France',
    'Gender': 'Male',
    'Age': 40,
    'Tenure': 3,
    'Balance': 60000,
    'NumOfProducts': 2,
    'HasCrCard': 1,
    'IsActiveMember': 1,
    'EstimatedSalary': 50000,
}])
sample_processed = preprocessor.transform(sample_customer)
sample_probability = float(model.predict(sample_processed, verbose=0).ravel()[0])
sample_prediction = int(sample_probability >= 0.5)
sample_decision = 'Do not allow to go' if sample_prediction == 1 else 'Allow to stay'
{
    'sample_probability': round(sample_probability, 4),
    'sample_prediction': sample_prediction,
    'sample_decision': sample_decision,
}
Cell 10 Code · python
history_df = pd.DataFrame(history.history)
history_df.to_csv(OUTPUTS_DIR / 'session_9_training_history.csv', index=False)
prediction_frame = pd.DataFrame({
    'actual': y_test.reset_index(drop=True),
    'predicted_probability': test_probabilities,
    'predicted_label': test_predictions,
})
prediction_frame.head(100).to_csv(OUTPUTS_DIR / 'session_9_prediction_samples.csv', index=False)
summary = {
    'dataset_shape': list(df.shape),
    'target_distribution': df['Exited'].value_counts().to_dict(),
    'processed_feature_count': int(X_train_processed.shape[1]),
    'test_accuracy': test_accuracy,
    'confusion_matrix': test_confusion.tolist(),
    'sample_customer_probability': sample_probability,
    'sample_customer_prediction': sample_prediction,
    'sample_customer_decision': sample_decision,
}
with open(OUTPUTS_DIR / 'session_9_summary.json', 'w', encoding='utf-8') as handle:
    json.dump(summary, handle, indent=2)
summary
Project Notes
- Feature preparation and encoded churn inputs.
- ANN architecture and training-history outputs.
- Held-out confusion-matrix and prediction-sample exports.
- Sample-customer decision for the copied PDF prompt.
Launch Controls
Notebook Launch
Open the matching notebook in Google Colab or review the tracked notebook source in GitHub.