1. InfoNCE loss (from Zhihu: https://zhuanlan.zhihu.com/p/506544456)
1. Introduction
Contrastive learning can be viewed as a dictionary lookup task: we train an encoder so that it can perform dictionary queries. Suppose we already have an encoded query q and a series of encoded samples k0, k1, k2, ...; these k0, k1, k2, ... can be regarded as the keys of the dictionary. Assume that only one key, k+, matches q. Then q and k+ form a positive pair, and all the remaining keys are negative samples for q. Once the positive and negative pairs are defined, a contrastive loss function is needed to guide the model's learning.
2. Objective
When the query is similar to its unique positive sample k+ and dissimilar to all the other (negative) keys, the loss should be low. Conversely, if the query is not similar to k+, or is similar to some negative key, the loss should be large, penalizing the model and driving it to update its parameters.
3. Formula
\[
\mathcal{L}_q = -\log \frac{\exp(q \cdot k_+ / \tau)}{\sum_{i=0}^{k} \exp(q \cdot k_i / \tau)}
\]
Here k is the number of negative samples, and \(\tau\) is a temperature hyperparameter.
The sum in the denominator runs over the 1 positive sample and the k negative samples, i.e. over indices 0 to k, so k+1 samples in total: all the keys in the dictionary.
InfoNCE loss is therefore just a cross-entropy loss over a (k+1)-way classification task, whose goal is to classify the query q into the \(k_+\) class.
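A minimal PyTorch sketch of this view (my own illustration, not taken from the Zhihu article): the dot products between q and the k+1 keys are used as logits, and the positive key is treated as class 0 of a (k+1)-way cross-entropy.

import torch
import torch.nn.functional as F

def info_nce(q, k_pos, k_neg, tau=0.07):
    """q: (d,), k_pos: (d,), k_neg: (k, d); all assumed L2-normalized."""
    l_pos = (q * k_pos).sum().unsqueeze(0)    # similarity with the positive key, shape (1,)
    l_neg = k_neg @ q                         # similarities with the k negative keys, shape (k,)
    logits = torch.cat([l_pos, l_neg]) / tau  # k+1 logits, scaled by the temperature
    # cross-entropy with the positive key as "class 0"
    return F.cross_entropy(logits.unsqueeze(0), torch.zeros(1, dtype=torch.long))

q = F.normalize(torch.randn(128), dim=0)
k_pos = F.normalize(torch.randn(128), dim=0)
k_neg = F.normalize(torch.randn(16, 128), dim=1)
loss = info_nce(q, k_pos, k_neg)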
2. Instance discrimination (the concepts below are from Bilibili: https://www.bilibili.com/video/BV1C3411s7t9/?spm_id_from=333.1007.top_right_bar_window_history.content.click&vd_source=4afdb0bf8f80389d3492b886b5277ddc)
Rough definition
The example Mr. Zhu explains on Bilibili is a pretext task (when the original task is too complex or abstract, a simpler task that is easier for the model to learn is used to help it on the downstream task); here the example is image classification. Take one image from, say, the ImageNet dataset and apply two separate sets of operations such as rotation, cropping, and interpolation; the two resulting images are defined as a positive pair, while all other images serve as negative samples. An encoder extracts features from these images, and contrastive learning pulls the features of the positive pair closer together while pushing the features of negative pairs apart (a minimal sketch follows below). The original MoCo paper (https://arxiv.org/abs/1911.05722) uses exactly this kind of unsupervised learning and, across several downstream applications, comes close to matching supervised models, which is quite impressive! I hope to finish Mr. Zhu's videos when I get the chance; I still feel I know very little.
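A minimal sketch of how the positive pair is built (the transform choices and file name below are my own assumptions, not from the video): two independent random augmentations of the same image give the positive pair, and views of any other image act as negatives.

import torchvision.transforms as T
from PIL import Image

# two independent random augmentations of one image -> a positive pair
augment = T.Compose([
    T.RandomResizedCrop(224),    # random crop, resized back via interpolation
    T.RandomHorizontalFlip(),
    T.RandomRotation(15),        # small random rotation
    T.ToTensor(),
])

img = Image.open('some_image.jpg')           # hypothetical image path
view1, view2 = augment(img), augment(img)    # positive pair: two views of the same instance
# a view of any *different* image would serve as a negative sample for view1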
3. Thoughts on a multi-label binary classification problem
While working through the machine learning competition course on Kaggle today, I wanted to use a BP neural network to solve the steel defect detection problem. Its test data and the csv file that has to be submitted look like this:
The targets are all binary 0/1 values, so a BP neural network seemed like a convenient way to implement it.
1. ChatGPT's approach
Below is the sample code given by ChatGPT, adapted to this case:
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset
from sklearn.model_selection import train_test_split
import pandas as pd  # needed later for building the submission DataFrame

# X, y, test_data and submission_id are assumed to be pandas objects
# loaded earlier from the competition's train/test csv files.
train_X, val_X, train_labels, val_labels = train_test_split(X, y, random_state=1)
train_X_np = train_X.values
val_X_np = val_X.values
train_labels_np = train_labels.values
val_labels_np = val_labels.values
# convert directly to float32 tensors (avoids re-wrapping a tensor with torch.tensor)
train_X_tensor = torch.from_numpy(train_X_np).float()
val_X_tensor = torch.from_numpy(val_X_np).float()
train_labels_tensor = torch.from_numpy(train_labels_np).float()
val_labels_tensor = torch.from_numpy(val_labels_np).float()
n_features = train_X_tensor.shape[1]
n_labels = train_labels_tensor.shape[1]
batch_size = 32
train_dataset = TensorDataset(train_X_tensor, train_labels_tensor)
val_dataset = TensorDataset(val_X_tensor, val_labels_tensor)
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False)
class MultiLabelClassifier(nn.Module):
    def __init__(self, n_features, n_labels):
        super(MultiLabelClassifier, self).__init__()
        self.layer1 = nn.Linear(n_features, 64)
        self.layer2 = nn.Linear(64, 128)
        self.layer3 = nn.Linear(128, 512)
        self.layer4 = nn.Linear(512, 512)
        self.layer5 = nn.Linear(512, n_labels)
        self.dropout = nn.Dropout(p=0.5)

    def forward(self, x):
        x = torch.relu(self.layer1(x))
        x = self.dropout(x)
        x = torch.relu(self.layer2(x))
        x = self.dropout(x)
        x = torch.relu(self.layer3(x))
        x = self.dropout(x)
        x = torch.relu(self.layer4(x))
        x = self.dropout(x)
        x = torch.sigmoid(self.layer5(x))  # sigmoid so each output is an independent label probability
        return x
model = MultiLabelClassifier(n_features, n_labels)
loss_fn = nn.BCELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
def binary_accuracy(output, target, threshold=0.5):
    """Compute the accuracy for multi-label classification."""
    # threshold the output probabilities to get binary predictions
    preds = (output > threshold).float()
    # compare predictions with the ground-truth labels
    correct = (preds == target).float()
    # per-sample accuracy: fraction of labels predicted correctly
    acc = correct.sum(1) / target.size(1)
    # return the mean accuracy over the batch
    return acc.mean()
epochs = 20
best = 0
# Train and evaluate, and save the parameters of the best-performing model
for epoch in range(epochs):
    total_acc = 0
    total_loss = 0
    total_acc_val = 0
    total_loss_val = 0
    # training phase
    for batch_inputs, batch_labels in train_loader:
        model.train()
        optimizer.zero_grad()
        outputs = model(batch_inputs)
        loss = loss_fn(outputs, batch_labels)
        acc = binary_accuracy(outputs, batch_labels)
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
        total_acc += acc.item()
    train_loss, train_acc = total_loss / len(train_loader), total_acc / len(train_loader)
    if train_acc >= best:
        best = train_acc
        torch.save(model.state_dict(), 'best.pth')
        print("save best model param")
    print(f'Epoch {epoch+1}/{epochs}, train_Loss: {train_loss:.4f}, train_Accuracy: {train_acc:.4f}')
    # validation phase
    for batch_inputs, batch_labels in val_loader:
        model.eval()
        with torch.no_grad():
            outputs = model(batch_inputs)
            loss_val = loss_fn(outputs, batch_labels)
            acc_val = binary_accuracy(outputs, batch_labels)
            total_loss_val += loss_val.item()
            total_acc_val += acc_val.item()
    average_loss = total_loss_val / len(val_loader)
    average_acc = total_acc_val / len(val_loader)
    print(f'val_loss:{average_loss:.4f}, val_acc:{average_acc:.4f}')
best_net = MultiLabelClassifier(n_features, n_labels)
# print(best_net.state_dict())
best_param = torch.load('best.pth')
best_net.load_state_dict(best_param)
# print(best_net.state_dict())
test_np = test_data.values
test_tensor = torch.from_numpy(test_np).float()
best_net.eval()  # switch to eval mode so dropout is disabled at inference time
with torch.no_grad():
    preds = (best_net(test_tensor) > 0.5).int()
preds = preds.numpy()
submission_new = pd.DataFrame(preds, columns=['Pastry',
'Z_Scratch',
'K_Scatch',
'Stains',
'Dirtiness',
'Bumps',
'Other_Faults'])
submission_new.insert(0, 'id', submission_id)
submission_new.to_csv('pytorch test.csv', index=False)