Udacity Deep Learning Course: Assignment 1

Posted by 天才XLM on 2017-08-17

Udacity's Deep Learning course is an online course created by Google and built around TensorFlow. It is short and to the point, consisting of 4 chapters (intro to ML/DL, DNNs, CNNs, RNNs), 6 small assignments (delivered as ipynb notebooks, which is very convenient), and 1 final project (building a real-time camera application).

If you already have an ML/DL background, you can get through the videos quickly, so the real value of the course lies in its hands-on projects, which are quite interesting. Coming from Google, it is arguably one of the more authoritative TensorFlow tutorials.

Course link: here
Assignment link: here

Below is my code for Assignment 1.

Problem 1

Use IPython.display to visualize a few samples:

import os
import numpy as np
from IPython.display import display, Image

def visualize(folders):
    # Show one randomly chosen image from each letter folder.
    for folder_path in folders:
        fnames = os.listdir(folder_path)
        fname = fnames[np.random.randint(len(fnames))]
        display(Image(filename=os.path.join(folder_path, fname)))

print("train_folders")
visualize(train_folders)
print("test_folders")
visualize(test_folders)

Problem 2

Use matplotlib.pyplot to visualize samples:

import pickle
import numpy as np
import matplotlib.pyplot as plt

def visualize_datasets(datasets):
    # Show one randomly chosen image from each pickled letter dataset.
    for dataset in datasets:
        with open(dataset, 'rb') as f:
            letter = pickle.load(f)
        sample_idx = np.random.randint(len(letter))
        plt.figure()
        plt.imshow(letter[sample_idx, :, :])
        plt.show()

visualize_datasets(train_datasets)
visualize_datasets(test_datasets)

Problem 3

Check whether the classes are balanced (i.e., each class has roughly the same number of samples):

import pickle

def check_dataset_is_balanced(datasets, title=None):
    print(title)
    for pickle_file in datasets:
        with open(pickle_file, 'rb') as f:
            ds = pickle.load(f)
        print("label {} has {} samples".format(pickle_file, len(ds)))

check_dataset_is_balanced(train_datasets, "training set")
check_dataset_is_balanced(test_datasets, "test set")

Problem 5

Count the samples that are duplicated across the training, test, and validation sets:

import hashlib
import pickle

def count_duplicates(dataset1, dataset2):
    # Hash every image in dataset1 once; a set makes each membership test O(1).
    hashes = {hashlib.sha1(x.tobytes()).hexdigest() for x in dataset1}
    return sum(1 for x in dataset2
               if hashlib.sha1(x.tobytes()).hexdigest() in hashes)

with open('notMNIST.pickle', 'rb') as f:
    data = pickle.load(f)
print(count_duplicates(data['test_dataset'], data['valid_dataset']))
print(count_duplicates(data['valid_dataset'], data['train_dataset']))
print(count_duplicates(data['test_dataset'], data['train_dataset']))

Problem 6

Train an off-the-shelf model on 50, 100, 1000, 5000, and all of the training samples, using LogisticRegression from sklearn.linear_model:

from sklearn.linear_model import LogisticRegression

def train_and_predict(X_train, y_train, X_test, y_test):
    lr = LogisticRegression()
    # Flatten each 28x28 image into a 784-dimensional feature vector.
    lr.fit(X_train.reshape(X_train.shape[0], 28 * 28), y_train)
    print(lr.score(X_test.reshape(X_test.shape[0], 28 * 28), y_test))

def main():
    # `data` was loaded from notMNIST.pickle above.
    X_train, y_train = data["train_dataset"], data["train_labels"]
    X_test, y_test = data["test_dataset"], data["test_labels"]
    for size in [50, 100, 1000, 5000, None]:
        # Train on the first `size` samples (None means the full set),
        # but always evaluate on the complete test set.
        train_and_predict(X_train[:size], y_train[:size], X_test, y_test)

main()
