{
"cells": [
{
"cell_type": "markdown",
"id": "f8a8990c-d09a-4fac-8e67-6a37ebaf056f",
"metadata": {},
"source": [
"# Programming Exercise: Comparison of Regularization Methods\n",
"Your task is to train an MLP for the classification of the Iris data set using different regularization methods.\n",
"You can use the given libraries, but you can also use other libraries.\n",
"Set all seeds to 42."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ac54e46e-ce27-4d84-a29b-21442baee5f1",
"metadata": {},
"outputs": [],
"source": [
"# Please enter your names\n",
"name = \"Fabian Langer, Yannik Bretschneider\""
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "62ec6feb-72cc-4431-9822-005a2200b217",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import random\n",
"import matplotlib.pyplot as plt\n",
"import numpy as np\n",
"import tensorflow as tf\n",
"from sklearn.datasets import load_iris\n",
"from sklearn.model_selection import train_test_split\n",
"from sklearn.preprocessing import StandardScaler\n",
"\n",
"# Seed every random number generator in use for reproducibility\n",
"SEED = 42\n",
"os.environ[\"PYTHONHASHSEED\"] = str(SEED)\n",
"random.seed(SEED)\n",
"np.random.seed(SEED)\n",
"tf.random.set_seed(SEED)\n",
"\n",
"# Load the Iris data set (150 samples, 4 features, 3 classes)\n",
"iris = load_iris()\n",
"X = iris.data\n",
"y = iris.target\n",
"\n",
"# Standardize each feature to zero mean and unit variance\n",
"# (the scaler is fit on the full data set, before the split)\n",
"X = StandardScaler().fit_transform(X)\n",
"\n",
"# Stratified 70/30 train/validation split\n",
"X_train, X_val, y_train, y_val = train_test_split(\n",
"    X, y, test_size=0.3, stratify=y, random_state=SEED\n",
")"
]
},
{
"cell_type": "markdown",
"id": "ea6afe4b-18ce-4778-a786-1494ece331c3",
"metadata": {},
"source": [
"# MLP (2 pts)"
]
},
{
"cell_type": "markdown",
"id": "546ce278-188e-40eb-b5d8-d625b372cf3a",
"metadata": {},
"source": [
"First, implement an MLP with an input size of 4, a hidden size of 16, and an output size of 3 (a two-layer MLP).\n",
"For the first layer, use the ReLU function, and for the last layer, the softmax function."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "96acf91f-b9f2-413a-9f23-7a677d6ef0d3",
"metadata": {},
"outputs": [],
"source": [
"from tensorflow.keras.models import Sequential\n",
"from tensorflow.keras.layers import Dense, Dropout\n",
"from tensorflow.keras.regularizers import l2\n",
"from tensorflow.keras.callbacks import EarlyStopping\n",
"\n",
"def create_mlp(regularization_type=None, rate=0.0, gamma=0.0):\n",
"    \"\"\"\n",
"    Creates and compiles a two-layer MLP with optional regularization.\n",
"    Resets random seeds each time to ensure models start with identical weights\n",
"    for a fair comparison.\n",
"\n",
"    Args:\n",
"        regularization_type (str, optional): 'dropout' or 'l2'. Defaults to None.\n",
"        rate (float, optional): Dropout rate if type is 'dropout'. Defaults to 0.0.\n",
"        gamma (float, optional): L2 regularization factor if type is 'l2'. Defaults to 0.0.\n",
"\n",
"    Returns:\n",
"        tf.keras.Model: A compiled Keras Sequential model.\n",
"    \"\"\"\n",
"    # Reset all seeds to ensure models initialize identically before training\n",
"    os.environ[\"PYTHONHASHSEED\"] = str(SEED)\n",
"    random.seed(SEED)\n",
"    np.random.seed(SEED)\n",
"    tf.random.set_seed(SEED)\n",
"\n",
"    model = Sequential()\n",
"\n",
"    # Input layer and the first hidden layer (4 -> 16)\n",
"    # L2 regularization is applied via the kernel_regularizer argument\n",
"    model.add(Dense(16,\n",
"                    input_shape=(X_train.shape[1],),\n",
"                    activation='relu',\n",
"                    kernel_regularizer=l2(gamma) if regularization_type == 'l2' else None))\n",
"\n",
"    # Add a dropout layer if specified\n",
"    if regularization_type == 'dropout':\n",
"        model.add(Dropout(rate, seed=SEED))\n",
"\n",
"    # Output layer (16 -> 3)\n",
"    model.add(Dense(3, activation='softmax'))\n",
"\n",
"    # Compile the model\n",
"    model.compile(optimizer='adam',\n",
"                  loss='sparse_categorical_crossentropy',  # Use sparse for integer labels\n",
"                  metrics=['accuracy'])\n",
"\n",
"    return model"
]
},
{
"cell_type": "markdown",
"id": "9db21233-a43b-4a3f-8d26-847c5b9381ad",
"metadata": {},
"source": [
"# Train MLP without any regularization (1 pt)"
]
},
{
"cell_type": "markdown",
"id": "bbd429f2-fee9-47d7-aee8-d6263d42b845",
"metadata": {},
"source": [
"Train an MLP with the given train/validation split for 200 epochs with a batch size of 16.\n",
"Track the train loss, train accuracy, validation loss, and validation accuracy for each epoch (e.g., in four arrays)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3038c352-b657-4d2c-a916-174c25582858",
"metadata": {},
"outputs": [],
"source": [
"mlp_plain = create_mlp()\n",
"\n",
"# Train the model and store its history\n",
"# verbose=0 is used to keep the notebook output clean\n",
"history_plain = mlp_plain.fit(\n",
"    X_train, y_train,\n",
"    epochs=200,\n",
"    batch_size=16,\n",
"    validation_data=(X_val, y_val),\n",
"    verbose=0\n",
")"
]
},
{
"cell_type": "markdown",
"id": "95be0757-fb8a-4532-9272-e2a0cd2ca704",
"metadata": {},
"source": [
"# Train an MLP with dropout (1 pt)"
]
},
{
"cell_type": "markdown",
"id": "fb615a7b-06d0-416e-92df-a359408796cc",
"metadata": {},
"source": [
"Train another MLP on the given train/validation split for 200 epochs with a batch size of 16 and a dropout rate of 0.6.\n",
"Track the train loss, train accuracy, validation loss, and validation accuracy for each epoch (e.g., in four arrays)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "82e0d61e-f58d-46a5-addd-587ab92bd5b3",
"metadata": {},
"outputs": [],
"source": [
"# Create the MLP with dropout regularization\n",
"mlp_dropout = create_mlp(regularization_type='dropout', rate=0.6)\n",
"\n",
"# Train the model and store its history\n",
"history_dropout = mlp_dropout.fit(\n",
"    X_train, y_train,\n",
"    epochs=200,\n",
"    batch_size=16,\n",
"    validation_data=(X_val, y_val),\n",
"    verbose=0\n",
")"
]
},
{
"cell_type": "markdown",
"id": "b4f5d2b1-d9a8-4b2c-9f04-2c9d6fe744f4",
"metadata": {},
"source": [
"# Train an MLP with L2 regularization (1 pt)"
]
},
{
"cell_type": "markdown",
"id": "4dd894be-0a6e-4fa3-b905-a36ecb67aabb",
"metadata": {},
"source": [
"Train another MLP on the given train/validation split for 200 epochs with a batch size of 16 and L2 regularization with gamma = 0.02.\n",
"Track the train loss, train accuracy, validation loss, and validation accuracy for each epoch (e.g., in four arrays)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9ea07dcc-a49d-48ad-b343-ebbaee68e960",
"metadata": {},
"outputs": [],
"source": [
"mlp_l2 = create_mlp(regularization_type='l2', gamma=0.02)\n",
"\n",
"# Train the model and store its history\n",
"history_l2 = mlp_l2.fit(\n",
"    X_train, y_train,\n",
"    epochs=200,\n",
"    batch_size=16,\n",
"    validation_data=(X_val, y_val),\n",
"    verbose=0\n",
")"
]
},
{
"cell_type": "markdown",
"id": "38c93494-8ba1-479e-93ac-8c53577737ac",
"metadata": {},
"source": [
"# Train an MLP with early stopping and a patience of 20 (1 pt)"
]
},
{
"cell_type": "markdown",
"id": "3accc819-5712-4e78-90ef-de89a33e7917",
"metadata": {},
"source": [
"Train another MLP on the given train/validation split for up to 200 epochs with a batch size of 16 and use early stopping with a patience of 20.\n",
"Track the train loss, train accuracy, validation loss, and validation accuracy for each epoch (e.g., in four arrays). Here, training may stop before epoch 200."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0eacc33c-8549-490d-9000-2cd1aef62a46",
"metadata": {},
"outputs": [],
"source": [
"# Create the MLP for early stopping\n",
"mlp_early_stopping = create_mlp()\n",
"\n",
"# Define the EarlyStopping callback\n",
"# It monitors the validation loss and stops if it doesn't improve for 20 epochs.\n",
"# restore_best_weights=True ensures the final model has the weights from its best epoch.\n",
"early_stopping_callback = EarlyStopping(\n",
"    monitor='val_loss',\n",
"    patience=20,\n",
"    restore_best_weights=True\n",
")\n",
"\n",
"# Train the model with the early stopping callback\n",
"history_early_stopping = mlp_early_stopping.fit(\n",
"    X_train, y_train,\n",
"    epochs=200,\n",
"    batch_size=16,\n",
"    validation_data=(X_val, y_val),\n",
"    callbacks=[early_stopping_callback],\n",
"    verbose=0\n",
")"
]
},
{
"cell_type": "markdown",
"id": "02cafc93-6498-43fa-954e-cc531cce7a6f",
"metadata": {},
"source": [
"# Plot the training of all four MLPs (2 pts)"
]
},
{
"cell_type": "markdown",
"id": "f380b6eb-2950-424c-991f-6259a837d48e",
"metadata": {},
"source": [
"Create a plot for each MLP that shows its train loss, train accuracy, validation loss, and validation accuracy for each epoch. The plot for the MLP with early stopping can stop earlier."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "23cea99d-c226-4429-8884-dd2cc20c76c8",
"metadata": {},
"outputs": [],
"source": [
"def plot_history(history, title):\n",
"    \"\"\"\n",
"    Plots the training and validation loss and accuracy from a Keras history object.\n",
"\n",
"    Args:\n",
"        history (tf.keras.callbacks.History): The history object returned by model.fit().\n",
"        title (str): The main title for the plot.\n",
"    \"\"\"\n",
"    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6))\n",
"    fig.suptitle(title, fontsize=16)\n",
"\n",
"    # Plot Loss\n",
"    ax1.plot(history.history['loss'], label='Train Loss')\n",
"    ax1.plot(history.history['val_loss'], label='Validation Loss')\n",
"    ax1.set_title('Model Loss')\n",
"    ax1.set_xlabel('Epoch')\n",
"    ax1.set_ylabel('Loss')\n",
"    ax1.legend(loc='upper right')\n",
"    ax1.grid(True)\n",
"\n",
"    # Plot Accuracy\n",
"    ax2.plot(history.history['accuracy'], label='Train Accuracy')\n",
"    ax2.plot(history.history['val_accuracy'], label='Validation Accuracy')\n",
"    ax2.set_title('Model Accuracy')\n",
"    ax2.set_xlabel('Epoch')\n",
"    ax2.set_ylabel('Accuracy')\n",
"    ax2.legend(loc='lower right')\n",
"    ax2.grid(True)\n",
"\n",
"    plt.show()\n",
"\n",
"# Generate the plots for all four training scenarios\n",
"plot_history(history_plain, 'MLP without Regularization')\n",
"plot_history(history_dropout, 'MLP with Dropout (rate=0.6)')\n",
"plot_history(history_l2, 'MLP with L2 Regularization (gamma=0.02)')\n",
"plot_history(history_early_stopping, 'MLP with Early Stopping (patience=20)')"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 5
}