深度学习识别手势

原创

已于 2025-06-17 14:14:01 修改 · 935 阅读

9 ·

CC 4.0 BY-SA版权

文章标签：

#深度学习 #人工智能

于 2024-01-02 10:39:12 首次发布

Assignment 3

Sign Language Image Classification using Deep Learning

Scenario

A client is interested in having you (or rather the company that you work for) investigate whether it is possible to develop an app that would enable American sign language to be translated for people that do not sign, or those that sign in different languages/styles. They have provided you with a labelled data of images related to signs (hand positions) that represent individual letters in order to do a preliminary test of feasibility.

Your manager has asked you to do this feasibility assessment, but subject to a constraint on the computational facilities available. More specifically, you are asked to do no more than 100 training runs in total (including all models and hyperparameter settings that you consider).

The task requires you to create a Jupyter Notebook to perform 22 steps. These steps involve loading the dataset, fixing data problems, converting labels to one-hot encoding, plotting sample images, creating, training, and evaluating two sequential models with 20 Dense layers with 100 neurons each, checking for better accuracy using MC Dropout, retraining the first model with performance scheduling, evaluating both models, using transfer learning to create a new model using pre-trained weights, freezing the weights of the pre-trained layers, adding new Dense layers, training and evaluating the new model, predicting and converting sign language to text using the best model.

IMPORTANT

Train all the models locally on your own machine. No model training should occur on Gradescope (GS).
After completing the training, upload the trained models' h5 files and their training histories along with your notebook to GS.
- best_dnn_bn_model.h5
- best_dnn_bn_perf_model.h5
- best_dnn_selu_model.h5
- best_mobilenet_model.h5
- history1
- history2
- history1_perf
- historymb
To avoid any confusion and poor training on GS, please remember to comment out the training code in your notebook before uploading it to GS.

In [1]:

# import the necessary libraries (TensorFlow, sklearn NumPy, Pandas, and Matplotlib)

import tensorflow as tf
from tensorflow import keras
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import os

# Install opencv using "pip install opencv-python" in order to use cv2.
import cv2

Step0 This test is for checking whether all the required files are submitted. Once you submit all the required files to the autograder, you will be able to pass this step.

IMPORTANT: Run this step to determine whether you have created all the required files correctly.

Points: 1

In [2]:

# Don't change this cell code
required_files = ['best_dnn_bn_model.h5','best_dnn_bn_perf_model.h5','best_dnn_selu_model.h5','best_mobilenet_model.h5','history1','history2','history1_perf','historymb']
step0_files = True
for file in required_files:
    if os.path.exists(file) == False:
        step0_files = False
        print("One or more files are missing!")
        break

# The following code is used by the autograder. Do not modify it.   
step0_data = step0_files

grader.check("step00")

Data Preprocessing

STEP1 Load the dataset (train and test) using Pandas from the CSV file.

Points: 1

In [3]:

# Load the dataset using Pandas from the CSV file
train_df = pd.read_csv("sign_mnist_train.csv")
test_df = pd.read_csv("sign_mnist_test.csv")

# The following code is used by the autograder. Do not modify it.
step1_sol = test_df.shape

train_df.head()

Out[3]:

	label	pixel1	pixel2	pixel3	pixel4	pixel5	pixel6	pixel7	pixel8	pixel9	...	pixel775	pixel776	pixel777	pixel778	pixel779	pixel780	pixel781	pixel782	pixel783	pixel784
0	3	107	118	127	134	139	143	146	150	153	...	207	207	207	207	206	206	206	204	203	202
1	6	155	157	156	156	156	157	156	158	158	...	69	149	128	87	94	163	175	103	135	149
2	2	187	188	188	187	187	186	187	188	187	...	202	201	200	199	198	199	198	195	194	195
3	2	211	211	212	212	211	210	211	210	210	...	235	234	233	231	230	226	225	222	229	163
4	13	164	167	170	172	176	179	180	184	185	...	92	105	105	108	133	163	157	163	164	179

5 rows × 785 columns

grader.check("step01")

STEP2 Examine the data and fix any problems. It is important that you don't have gaps in the number of classes, therefore, check the classes which are not available and shift the labels in order to ensure 24 classes, starting from class 0. In addition, normalize the values of your images in a range of 0 and 1.

Points: 1

In [4]:

# Separate labels and pixel values in training and testing sets
train_df = train_df[train_df.label <= 24]

train_labels = train_df.iloc[:, 0].apply(lambda x: x-1 if x > 9 else x).values
test_labels = test_df.iloc[:, 0].apply(lambda x: x-1 if x > 9 else x).values
train_images = (train_df.iloc[:, 1:] / 255.0).values
test_images = (test_df.iloc[:, 1:] / 255.0).values

# The following code is used by the autograder. Do not modify it.
step2_sol = (np.max(train_images)-np.min(train_images), np.max(train_labels)-np.min(train_labels), np.max(test_images)-np.min(test_images), np.max(test_labels)-np.min(test_labels))
step2_sol
# train_labels.value_counts().sort_index()
# # train_images.head()
# type(train_labels), train_images.dtypes, step2_sol
# type(train_labels)

Out[4]:

(1.0, 23, 1.0, 23)

grader.check("step02")

STEP3 Convert Labels to One-Hot Encoding both train and test.

Points: 1

In [5]:

# Convert labels to one-hot encoding
train_labels_encoded = keras.utils.to_categorical(train_labels)
test_labels_encoded = keras.utils.to_categorical(test_labels)

# The following code is used by the autograder. Do not modify it.
step3_sol = (len(np.unique(train_labels_encoded)), len(np.unique(test_labels_encoded)))

test_labels_encoded[:5]

Out[5]:

array([[0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0.],
       [1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0.]], dtype=float32)

grader.check("step03")

STEP4 Plot one sample image for each letter in the dataset given in the training set. To solve this step you should use the fu

最低0.47元/天解锁文章

200万优质内容无限畅学

	label	pixel1	pixel2	pixel3	pixel4	pixel5	pixel6	pixel7	pixel8	pixel9	...	pixel775	pixel776	pixel777	pixel778	pixel779	pixel780	pixel781	pixel782	pixel783	pixel784
0	3	107	118	127	134	139	143	146	150	153	...	207	207	207	207	206	206	206	204	203	202
1	6	155	157	156	156	156	157	156	158	158	...	69	149	128	87	94	163	175	103	135	149
2	2	187	188	188	187	187	186	187	188	187	...	202	201	200	199	198	199	198	195	194	195
3	2	211	211	212	212	211	210	211	210	210	...	235	234	233	231	230	226	225	222	229	163
4	13	164	167	170	172	176	179	180	184	185	...	92	105	105	108	133	163	157	163	164	179

	label	pixel1	pixel2	pixel3	pixel4	pixel5	pixel6	pixel7	pixel8	pixel9	...	pixel775	pixel776	pixel777	pixel778	pixel779	pixel780	pixel781	pixel782	pixel783	pixel784
0	3	107	118	127	134	139	143	146	150	153	...	207	207	207	207	206	206	206	204	203	202
1	6	155	157	156	156	156	157	156	158	158	...	69	149	128	87	94	163	175	103	135	149
2	2	187	188	188	187	187	186	187	188	187	...	202	201	200	199	198	199	198	195	194	195
3	2	211	211	212	212	211	210	211	210	210	...	235	234	233	231	230	226	225	222	229	163
4	13	164	167	170	172	176	179	180	184	185	...	92	105	105	108	133	163	157	163	164	179

	label	pixel1	pixel2	pixel3	pixel4	pixel5	pixel6	pixel7	pixel8	pixel9	...	pixel775	pixel776	pixel777	pixel778	pixel779	pixel780	pixel781	pixel782	pixel783	pixel784
0	3	107	118	127	134	139	143	146	150	153	...	207	207	207	207	206	206	206	204	203	202
1	6	155	157	156	156	156	157	156	158	158	...	69	149	128	87	94	163	175	103	135	149
2	2	187	188	188	187	187	186	187	188	187	...	202	201	200	199	198	199	198	195	194	195
3	2	211	211	212	212	211	210	211	210	210	...	235	234	233	231	230	226	225	222	229	163
4	13	164	167	170	172	176	179	180	184	185	...	92	105	105	108	133	163	157	163	164	179