{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Feature engineering and feature learning in a linear system\n",
"\n",
"The [bivariate linear regression tutorial](LinearRegressionBivariate.html) ([Notebook](LinearRegressionBivariate.ipynb)), has provided the tools to perform linear regression on multiple variables. \n",
"\n",
"In the [bivariate Keras linear regression tutorial](LinearRegressionBivariate-Keras.html) ([Notebook](LinearRegressionBivariate-Keras.ipynb)), most of the plumbing of the optimization was delegated to Keras.\n",
"\n",
"In this tutorial, we will use the [Galton dataset of people height](https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/T0HSJ1). Starting from a simple regression using parents' heights to estimate the child's height, the model will be improved to take into account for the child's sex. Eventually, we will try to get the model to learn the feature through a two level neural net based on Keras.\n",
"\n",
"An introductory article about this notebook has been [published on Medium](https://medium.com/analytics-vidhya/machine-learning-from-feature-engineering-to-feature-learning-1d81fbf2dc23)\n",
"\n",
"\n",
"### Learning goals:\n",
"- [Feature engineering](https://en.wikipedia.org/wiki/Feature_engineering) to manually improve the model\n",
"- Observe the extraction and combination of features by a neural net, that's called [feature learning](https://en.wikipedia.org/wiki/Feature_learning)\n",
"- Prescale training data and adjust scaling on predictions"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"from tensorflow import keras # TF 2.0 required\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"from sklearn import metrics, linear_model, model_selection, preprocessing\n",
"import pandas as pd"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"usingTensorBoard = False"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Helpers"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"def twinPlot(features, series, legend):\n",
" \"\"\" Plot subplots horizontally \"\"\"\n",
" numPlots = len(features.columns)\n",
" fig, axs = plt.subplots(1, numPlots, sharey=True, figsize=(16, 6))\n",
" if numPlots == 1:\n",
" t, v = features.iteritems()[0]\n",
" for y in series:\n",
" axs.scatter(v, y)\n",
" # axs[i].set_aspect('equal')\n",
" axs.set_xlabel(t)\n",
" axs.legend(legend)\n",
" axs.set_title('Height [inch]')\n",
" else:\n",
" for i, c in enumerate(features.iteritems()):\n",
" for y in series:\n",
" axs[i].scatter(c[1], y)\n",
" # axs[i].set_aspect('equal')\n",
" axs[i].set_xlabel(c[0])\n",
" axs[0].legend(legend)\n",
" axs[0].set_title('Height [inch]')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Reading data\n",
"\n",
"The choosen data model is the Galton dataset of human heights with parents' heights. It also contains the gender of each kid, that's the feature will attempt to automatically process."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
Family
\n",
"
Father
\n",
"
Mother
\n",
"
Gender
\n",
"
Height
\n",
"
Kids
\n",
"
\n",
" \n",
" \n",
"
\n",
"
0
\n",
"
1
\n",
"
78.5
\n",
"
67.0
\n",
"
M
\n",
"
73.2
\n",
"
4
\n",
"
\n",
"
\n",
"
1
\n",
"
1
\n",
"
78.5
\n",
"
67.0
\n",
"
F
\n",
"
69.2
\n",
"
4
\n",
"
\n",
"
\n",
"
2
\n",
"
1
\n",
"
78.5
\n",
"
67.0
\n",
"
F
\n",
"
69.0
\n",
"
4
\n",
"
\n",
"
\n",
"
3
\n",
"
1
\n",
"
78.5
\n",
"
67.0
\n",
"
F
\n",
"
69.0
\n",
"
4
\n",
"
\n",
"
\n",
"
4
\n",
"
2
\n",
"
75.5
\n",
"
66.5
\n",
"
M
\n",
"
73.5
\n",
"
4
\n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" Family Father Mother Gender Height Kids\n",
"0 1 78.5 67.0 M 73.2 4\n",
"1 1 78.5 67.0 F 69.2 4\n",
"2 1 78.5 67.0 F 69.0 4\n",
"3 1 78.5 67.0 F 69.0 4\n",
"4 2 75.5 66.5 M 73.5 4"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df = pd.read_csv('Galton.txt', sep=\"\\t\") # http://www.randomservices.org/random/data/Galton.txt \n",
"df.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Splitting train (80%) and test (20%)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"(df_train, df_test) = model_selection.train_test_split(df, test_size=0.2)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Data Preparation\n",
"\n",
"We will first normalize the data before fitting the linear regression, in order to compare with the linear regression with neural net afterward.\n",
"\n",
"Normalization is, for each feature, removing the mean and scaling the variance to 1. It is important in order to get equivalent convergence speeds on all features and proper regularization.\n",
"\n",
"Inverse transformation will be necessary after prediction on train or test data.\n",
"\n",
"Two scalers are used to handle separately the features (X) and the labels (Y)"
]
},
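{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a hedged illustration of the scaling round trip described above, the following sketch standardizes a few made-up heights and checks that the inverse transformation recovers them. The `toy_heights` values and the `toy_*` names are assumptions introduced for this example only; they are not taken from the Galton data."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"from sklearn import preprocessing\n",
"\n",
"# Hypothetical heights in inches, for illustration only\n",
"toy_heights = np.array([[60.0], [65.0], [70.0], [75.0]])\n",
"\n",
"toy_scaler = preprocessing.StandardScaler()\n",
"toy_scaled = toy_scaler.fit_transform(toy_heights)\n",
"\n",
"# After scaling, the column has zero mean and unit variance\n",
"print(toy_scaled.mean(axis=0), toy_scaled.std(axis=0))\n",
"\n",
"# The inverse transform maps the scaled values back to inches\n",
"toy_restored = toy_scaler.inverse_transform(toy_scaled)\n",
"print(np.allclose(toy_restored, toy_heights))"
]
},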
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"scalerX = preprocessing.StandardScaler(copy=True, with_mean=True, with_std=True)\n",
"scalerX.fit(df_train[['Mother', 'Father']])\n",
"trainX_scaled = scalerX.transform(df_train[['Mother', 'Father']])\n",
"testX_scaled = scalerX.transform(df_test[['Mother', 'Father']])\n",
"\n",
"scalerY = preprocessing.StandardScaler(copy=True, with_mean=True, with_std=True)\n",
"scalerY.fit(df_train[['Height']])\n",
"trainY_scaled = scalerY.transform(df_train[['Height']])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Linear regression on the parents' heights"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Linear regression with 2 features, intercept = 0.000, weights = 0.167, 0.252\n"
]
}
],
"source": [
"model1 = linear_model.LinearRegression()\n",
"\n",
"model1.fit(trainX_scaled, trainY_scaled)\n",
"b1 = model1.intercept_\n",
"w1 = model1.coef_.reshape(-1)\n",
"\n",
"print('Linear regression with 2 features, intercept = %.3f, weights = %.3f, %.3f' % (b1, w1[0], w1[1]))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Model evaluation"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"scrolled": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Mean squared error for global model : 11.440\n"
]
},
{
"data": {
"image/png": "\n",
"text/plain": [
"
"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"height_est1_scaled = model1.predict(testX_scaled)\n",
"height_est1 = scalerY.inverse_transform(height_est1_scaled)\n",
"\n",
"mse1 = metrics.mean_squared_error(df_test['Height'], height_est1)\n",
"twinPlot(df_test[['Mother', 'Father']], [ df_test['Height'], height_est1], ['Reference', 'Estimated'])\n",
"print(\"Mean squared error for global model : %.3f\" % mse1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We may guess or check on the residus that we are missing some fundamental information : the sex of the kid.\n",
"Let's redo the linear regression with separate models for boys and girls."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Linear regression on the parents' heights with gendered models\n",
"\n",
"Let's create a specific linear regression for each sex and combine the results\n",
"\n",
"Note: the scalers fitted on the full dataset are used for both models, in order to align with what is done for neural nets. It implies that intercepts might not be null."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"girls_train = df_train['Gender'] == 'F'\n",
"girls_test = df_test['Gender'] == 'F'\n",
"\n",
"trainX_girls_scaled = scalerX.transform(df_train[['Mother', 'Father']][girls_train])\n",
"trainY_girls_scaled = scalerY.transform(df_train[['Height']][girls_train])\n",
"\n",
"model2_girl = linear_model.LinearRegression()\n",
"\n",
"model2_girl.fit(trainX_girls_scaled, trainY_girls_scaled)\n",
"b2_girl = model2_girl.intercept_\n",
"w2_girl = model2_girl.coef_.reshape(-1)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"boys_train = df_train['Gender'] == 'M'\n",
"boys_test = df_test['Gender'] == 'M'\n",
"\n",
"trainX_boys_scaled = scalerX.transform(df_train[['Mother', 'Father']][boys_train])\n",
"trainY_boys_scaled = scalerY.transform(df_train[['Height']][boys_train])\n",
"\n",
"model2_boy = linear_model.LinearRegression()\n",
"\n",
"model2_boy.fit(trainX_boys_scaled, trainY_boys_scaled)\n",
"b2_boy = model2_boy.intercept_\n",
"w2_boy = model2_boy.coef_.reshape(-1)"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Linear regression with 2 features, fitting girls, intercept = -0.773, weights = 0.207, 0.283\n",
"Linear regression with 2 features, fitting boys, intercept = 0.710, weights = 0.209, 0.268\n"
]
}
],
"source": [
"print('Linear regression with 2 features, fitting girls, intercept = %.3f, weights = %.3f, %.3f' % (b2_girl, w2_girl[0], w2_girl[1]))\n",
"print('Linear regression with 2 features, fitting boys, intercept = %.3f, weights = %.3f, %.3f' % (b2_boy, w2_boy[0], w2_boy[1]))"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Mean squared error for split boy/girl model : 5.123\n"
]
}
],
"source": [
"testX_girls_scaled = scalerX.transform(df_test[['Mother', 'Father']][girls_test])\n",
"testX_boys_scaled = scalerX.transform(df_test[['Mother', 'Father']][boys_test])\n",
"\n",
"# Predict\n",
"height_est2_boy_scaled = model2_boy.predict(testX_boys_scaled)\n",
"height_est2_girl_scaled = model2_girl.predict(testX_girls_scaled)\n",
"# Inverse scale\n",
"height_est2_boy = scalerY.inverse_transform(height_est2_boy_scaled).reshape(-1)\n",
"height_est2_girl = scalerY.inverse_transform(height_est2_girl_scaled).reshape(-1)\n",
"\n",
"mse2 = 1 / len(df_test) * (np.sum((df_test['Height'][boys_test] - height_est2_boy)**2) +\n",
" np.sum((df_test['Height'][girls_test] - height_est2_girl)**2))\n",
"\n",
"print(\"Mean squared error for split boy/girl model : %.3f\" % mse2)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We observe a significant change in the coefficients, and a sharp decrease of the MSE"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Baseline model with Keras\n",
"\n",
"Let's build the equivalent gradient descent version of the _model1_, in order to perform a fair and verified comparison between the methods: inversion and gradient descent."
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Scaled intercept = -0.011, Weights = [0.1839134 0.21746412]\n"
]
}
],
"source": [
"# Number of epochs\n",
"nEpoch3 = 512\n",
"nBatch3 = 128 # 32 is default\n",
"nFeatures3 = 2\n",
"\n",
"# Model\n",
"model3 = keras.models.Sequential([\n",
" keras.layers.Dense(1, activation='linear', input_shape=[nFeatures3],\n",
" kernel_regularizer=keras.regularizers.l1(0.001))\n",
"])\n",
"model3.compile(optimizer='adam',\n",
" loss=keras.losses.mean_squared_error,\n",
" metrics=['mse'])\n",
"\n",
"# Tensor board\n",
"callbacks = []\n",
"if usingTensorBoard:\n",
" ks = keras.callbacks.TensorBoard(log_dir=\"./logs/\", \n",
" histogram_freq=1, write_graph=True, write_grads=True, batch_size=1)\n",
" callbacks = [ks]\n",
" \n",
"# Fit\n",
"hist3 = model3.fit(trainX_scaled, trainY_scaled, \n",
" epochs=nEpoch3, batch_size=nBatch3, validation_split = 0.2, verbose=0, callbacks=callbacks)\n",
"\n",
"w3, b3 = model3.get_weights()\n",
"print('Scaled intercept = %.3f, Weights = ' % b3[0], w3.reshape(-1))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Verification, the intercepts and weights computed by the gradient descent are equal to the ones compted by Scikit Learn's linear regression. "
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"
"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"# Predict\n",
"height_est3_scaled = model3.predict(testX_scaled)\n",
"# Inverse scale\n",
"height_est3 = scalerY.inverse_transform(height_est3_scaled)\n",
"\n",
"mse3 = metrics.mean_squared_error(df_test['Height'], height_est3)\n",
"\n",
"twinPlot(df_test[['Mother', 'Father']], [df_test['Height'], height_est3], ['Reference', 'Estimated'])\n",
"print(\"Mean squared error for Keras baseline model : %.3f, original mse: %.3f\" % (mse3, mse1))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The MSE on the test set is similar to the one of linear regression.\n",
"\n",
"Keras baseline model and method are validated."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Model with gender in Keras\n",
"\n",
"Let's add the gender into the Keras model. In order to get symetrical handling of both sexes, we are first encoding the sex with a one hot encoder that is going to create two binary features corresponding to the sex:\n",
"\n",
"|sex|enc1 = \"Is a girl?\"|enc2 = \"Is a boy ?\"|\n",
"|---|----|----|\n",
"| F | 1 | 0 |\n",
"| M | 0 | 1 |"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([['F', 1.0, 0.0],\n",
" ['M', 0.0, 1.0],\n",
" ['M', 0.0, 1.0]], dtype=object)"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"genderEncoder = preprocessing.OneHotEncoder(sparse=False)\n",
"genderEncoder.fit(df_train[['Gender']])\n",
"np.concatenate((df_train['Gender'].to_numpy().reshape(-1,1), \n",
" genderEncoder.transform(df_train[['Gender']])), axis=1)[:3]"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [],
"source": [
"# Number of epochs\n",
"nEpoch4 = 1024\n",
"nBatch4 = 128 # 32 is default\n",
"nFeatures4 = 4\n",
"\n",
"# Model\n",
"model4 = keras.models.Sequential([\n",
" keras.layers.Dense(2, activation='linear', input_shape=[nFeatures4],\n",
" kernel_regularizer=keras.regularizers.l1(0.05)),\n",
" keras.layers.Dense(1, activation='linear', input_shape=[nFeatures4],\n",
" kernel_regularizer=keras.regularizers.l1(0.05))\n",
"])\n",
"model4.compile(optimizer='adam',\n",
" loss=keras.losses.mean_squared_error,\n",
" metrics=['mse'])\n",
"\n",
"# Tensor board\n",
"callbacks = []\n",
"if usingTensorBoard:\n",
" ks = keras.callbacks.TensorBoard(log_dir=\"./logs/\", \n",
" histogram_freq=1, write_graph=True, write_grads=True, batch_size=1)\n",
" callbacks = [ks]\n",
" \n",
"# Fit\n",
"xTrain4 = np.concatenate((trainX_scaled, \n",
" genderEncoder.transform(df_train[['Gender']])), axis=1)\n",
"hist4 = model4.fit(xTrain4, trainY_scaled,\n",
" epochs=nEpoch4, batch_size=nBatch4, validation_split = 0.2, verbose=0, callbacks=callbacks)"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"scrolled": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Model: \"sequential_1\"\n",
"_________________________________________________________________\n",
"Layer (type) Output Shape Param # \n",
"=================================================================\n",
"dense_1 (Dense) (None, 2) 10 \n",
"_________________________________________________________________\n",
"dense_2 (Dense) (None, 1) 3 \n",
"=================================================================\n",
"Total params: 13\n",
"Trainable params: 13\n",
"Non-trainable params: 0\n",
"_________________________________________________________________\n"
]
}
],
"source": [
"model4.summary()"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"[array([[ 1.3161868e-01, -1.1788163e-04],\n",
" [ 1.9276769e-01, -6.0178681e-06],\n",
" [-4.5193085e-01, 4.2015861e-05],\n",
" [ 6.8445295e-01, 2.8250279e-04]], dtype=float32),\n",
" array([-0.08908255, -0.01057821], dtype=float32),\n",
" array([[1.2722054e+00],\n",
" [4.7708898e-05]], dtype=float32),\n",
" array([-0.07207449], dtype=float32)]"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"weights4 = model4.get_weights()\n",
"weights4"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"
"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"plt.figure(figsize=(15,4))\n",
"plt.subplot(1,3,1)\n",
"plt.semilogy(hist4.history['loss'])\n",
"plt.semilogy(hist4.history['val_loss'])\n",
"plt.grid()\n",
"plt.legend(('train', 'validation'))\n",
"plt.title('Loss (MSE + reg)');\n",
"plt.subplot(1,3,2)\n",
"plt.semilogy(hist4.history['mse'])\n",
"plt.semilogy(hist4.history['val_mse'])\n",
"plt.grid()\n",
"plt.legend(('train', 'validation'))\n",
"plt.title('MSE');"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Beware that this MSE results are on scaled data, thus lower than if on original data"
]
},
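{
"cell_type": "markdown",
"metadata": {},
"source": [
"The link between the two scales can be checked numerically: since the labels are divided by their standard deviation, an MSE measured on scaled data equals the MSE in inches divided by the variance of the labels. Here is a minimal sketch, assuming made-up labels and predictions (`toy_y`, `toy_pred` and `toy_scalerY` are hypothetical names and values, not part of the models above)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"from sklearn import metrics, preprocessing\n",
"\n",
"# Hypothetical labels and predictions in inches, for illustration only\n",
"toy_y = np.array([[62.0], [66.0], [70.0], [74.0]])\n",
"toy_pred = np.array([[63.0], [65.0], [71.0], [72.0]])\n",
"\n",
"toy_scalerY = preprocessing.StandardScaler().fit(toy_y)\n",
"\n",
"mse_inch = metrics.mean_squared_error(toy_y, toy_pred)\n",
"mse_scaled = metrics.mean_squared_error(toy_scalerY.transform(toy_y),\n",
"                                        toy_scalerY.transform(toy_pred))\n",
"\n",
"# StandardScaler keeps the variance used for scaling in var_\n",
"print(mse_inch, mse_scaled * toy_scalerY.var_[0])"
]
},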
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"scrolled": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Mean squared error for Keras model with genre : 5.167, original MSE : 5.123\n"
]
},
{
"data": {
"image/png": "\n",
"text/plain": [
"
"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"testX_scaled4 = np.concatenate((scalerX.transform(df_test[['Mother', 'Father']]), \n",
" genderEncoder.transform(df_test[['Gender']])), axis=1)\n",
"height_est_scaled4 = model4.predict(testX_scaled4)\n",
"\n",
"height_est4 = scalerY.inverse_transform(height_est_scaled4)\n",
"mse4 = metrics.mean_squared_error(df_test['Height'], height_est4)\n",
"\n",
"twinPlot(df_test[['Mother', 'Father']], [df_test['Height'], height_est4], ['Reference', 'Estimated'])\n",
"print(\"Mean squared error for Keras model with genre : %.3f, original MSE : %.3f\" % (mse4, mse2))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Comparison of the two gendered models\n",
"\n",
"The mode2 in which prediction is using either girl or boy regression model is equivalent as a two layer network : \n",
"1. compute the predictions using both the girl and boy regression models, weigh each of the outputs with the one hot encoded value of actual gender\n",
"2. sum the two contributions\n",
"\n",
"Using this topology, we may compare model2 and model4 in following tables.\n",
"\n",
"__NOTE:__ actual values of the coefficients will change if gradient based optimization is ran again. It comes from the fact the Keras is randomly initializing the coefficients on the neurons.\n",
"\n",
"### Stage 1 : two neurons \n",
"\n",
"\n",
"| Neuron 1 | Mother | Father | Is a girl | Is a boy | Intercept |\n",
"|----------------------------|--------|--------|-----------|----------|-----------|\n",
"| Linear regressions combined| 0.212 | 0.293 | 1.0 | 0.0 | -0.734 |\n",
"| Auto trained neural net | 0.131 | 0.193 | -0.452 | 0.684 | -0.0891 |\n",
"\n",
"| Neuron 2 | Mother | Father | Is a girl | Is a boy | Intercept |\n",
"|----------------------------|--------|--------|-----------|----------|-----------|\n",
"| Linear regressions combined| 0.230 | 0.299 | 0.0 | 1.0 | 0.715 |\n",
"| Auto trained neural net | -1.18e-04 | -6.02e-06 | 4.20e-05 | 2.83e-04 | -1.05e-02 |\n",
"\n",
"\n",
"### Stage 2 : combiner\n",
"\n",
"| Neuron 3 | In1 | In2 | Intercept |\n",
"|----------------------------|--------|--------|-----------|\n",
"| Linear regressions combined| 1 | 1 | 0.0 |\n",
"| Auto trained neural net | 1.27 | 4.77e-05 | -7.21e-02 |\n",
"\n",
"\n",
"The figures are quite different in magnitude and in sign!\n",
"\n",
"With the neural net, the 2nd neuron has very small coefficients, it acts like a correction on the estimate of the 1st neuron."
]
},
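{
"cell_type": "markdown",
"metadata": {},
"source": [
"Because both layers use a linear activation, the two stages can be folded into a single linear map, which makes the comparison with the gendered regressions easier. The sketch below does this with the weights printed above, copied by hand as constants (they would differ on another run). The names `nn_W1`, `nn_b1`, `nn_W2`, `nn_b2`, `w_eff` and `b_eff` are introduced here for illustration only. On these figures, the folded network amounts to shared slopes for the parents' heights plus a sex-dependent intercept."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# Weights of model4 as printed above (rows: Mother, Father, is a girl, is a boy)\n",
"nn_W1 = np.array([[ 1.3161868e-01, -1.1788163e-04],\n",
"                  [ 1.9276769e-01, -6.0178681e-06],\n",
"                  [-4.5193085e-01,  4.2015861e-05],\n",
"                  [ 6.8445295e-01,  2.8250279e-04]])\n",
"nn_b1 = np.array([-0.08908255, -0.01057821])\n",
"nn_W2 = np.array([[1.2722054e+00], [4.7708898e-05]])\n",
"nn_b2 = np.array([-0.07207449])\n",
"\n",
"# Two stacked linear layers are equivalent to a single linear layer\n",
"w_eff = (nn_W1 @ nn_W2).reshape(-1)\n",
"b_eff = (nn_b1 @ nn_W2 + nn_b2)[0]\n",
"\n",
"print('Effective weights (Mother, Father, girl, boy):', np.round(w_eff, 3))\n",
"print('Effective intercept for girls: %.3f, for boys: %.3f' % (b_eff + w_eff[2], b_eff + w_eff[3]))"
]
},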
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Step by step example on the first person\n",
"\n",
"#### With \"2 stage\" gendered linear regression"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
Family
\n",
"
Father
\n",
"
Mother
\n",
"
Gender
\n",
"
Height
\n",
"
Kids
\n",
"
\n",
" \n",
" \n",
"
\n",
"
269
\n",
"
69
\n",
"
70.0
\n",
"
65.0
\n",
"
M
\n",
"
73.0
\n",
"
8
\n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" Family Father Mother Gender Height Kids\n",
"269 69 70.0 65.0 M 73.0 8"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"s1 = df_test.head(1)\n",
"s1"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[0.39927185, 0.27232284]])"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"s1_scaled = scalerX.transform(s1[['Mother', 'Father']])\n",
"s1_scaled"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1st Stage outputs = -0.613 * 0, 0.867 * 1\n"
]
}
],
"source": [
"s1_encoded = genderEncoder.transform(s1[['Gender']]).reshape(-1)\n",
"p2_out1_girl = model2_girl.predict(s1_scaled).reshape(-1)[0]\n",
"p2_out1_boy = model2_boy.predict(s1_scaled).reshape(-1)[0]\n",
"print(\"1st Stage outputs = %.3f * %d, %.3f * %d\" % (p2_out1_girl, s1_encoded[0], p2_out1_boy, s1_encoded[1]))"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Prediction with combined linear regressors = 69.896 inch\n"
]
}
],
"source": [
"p2_out = np.sum(np.matmul([p2_out1_girl, p2_out1_boy], s1_encoded))\n",
"# Inverse scaling\n",
"p2_out_unscaled = scalerY.inverse_transform([p2_out]).reshape(-1)[0]\n",
"print(\"Prediction with combined linear regressors = %.3f inch\" % p2_out_unscaled)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### With the 2 layer neural net"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[0.39927185, 0.27232284, 0. , 1. ]])"
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"s1_in = np.concatenate((s1_scaled, s1_encoded.reshape(1,-1)), axis=1)\n",
"s1_in"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 0.70041708, -0.01034441]])"
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"p4_out1 = np.matmul(s1_in, weights4[0]) + weights4[1]\n",
"p4_out1"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.8189993828432794"
]
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"p4_out2 = (np.matmul(p4_out1, weights4[2]) + weights4[3]).reshape(-1)[0]\n",
"p4_out2"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Prediction with 2 stage neural net = 69.726 inch\n"
]
}
],
"source": [
"p4_out_unscaled = scalerY.inverse_transform([p4_out2]).reshape(-1)[0]\n",
"print(\"Prediction with 2 stage neural net = %.3f inch\" % p4_out_unscaled)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Almost the same prediction is made but following rather different paths, showing that the gradient descent optimization's result is not unique."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Conclusion\n",
"\n",
"Using a little feature engineering (one hot encoding) and feature learning capabilities of the neural net, we have been able to learn a model taking into acount for the gender of the target child.\n",
"\n",
"This simple example is extended in the large classification frameworks like AlexNet, VGG or GoogLeNet: they are able to combine features extracted from images to eventually classify objects and animals.\n",
"\n",
"The drawbacks are :\n",
"- More complex models : the provided model has 2 layers and 13 parameters, whereas the manually engineered model has 2*3=6 parameters\n",
"- Complex and tricky optimizations : the gradient descent on multi-layer neural networks requires attention and skills to get a repeatable and generalizable results\n",
"\n",
"### Where to go from here :\n",
"\n",
"Compare with the [two feature binary classification using logistic regression using Keras](../classification/ClassificationContinuous2Features-Keras.html) ([Notebook](../classification/ClassificationContinuous2Features-Keras.ipynb]))\n",
"\n",
"[Multi-class classification with Keras](../classification/ClassificationMulti2Features-Keras.html) ([Notebook](../classification/ClassificationMulti2Features-Keras.ipynb))"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.0"
}
},
"nbformat": 4,
"nbformat_minor": 2
}