num_workers = 0: only the main process (the training program's own process) loads data. The main process finishes the forward and backward pass for one batch, then fetches the next batch from disk into CPU memory, and finally moves it to the GPU.
num_workers = 1: besides the main process, one extra worker process moves data from disk into memory, so after finishing a batch's forward and backward pass the main process can get the next batch directly from memory instead of from disk.
So with num_workers = 1 (two processes in total: the main process plus one worker), the improvement is noticeable, roughly a 20% speedup here.
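A quick way to see this effect is to time one full pass over the loader at different num_workers settings. The sketch below is illustrative only; the dataset, batch size, and worker counts tried are my own choices, not the post's:

import time
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader

train_set = torchvision.datasets.FashionMNIST(
    root='./data', train=True, download=True,
    transform=transforms.ToTensor()
)

for workers in [0, 1, 2]:
    loader = DataLoader(train_set, batch_size=1000, num_workers=workers)
    start = time.time()
    for images, labels in loader:
        pass  # in real training, the forward/backward pass would go here
    print(f'num_workers={workers}: {time.time() - start:.2f}s')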
Why normalize the inputs: a neural network fits best when its inputs lie roughly in the 0-1 range.
train_set = torchvision.datasets.FashionMNIST(
    root='./data'
    , train=True
    , download=True
    , transform=transforms.Compose([  # combine several transforms into one: convert to tensor, then standardize
        transforms.ToTensor()  # the PIL.Image / ndarray must be converted to a tensor before it can be normalized
        , transforms.Normalize(mean, std)  # standardization: normalized_image = (image - mean) / std.
                                           # FashionMNIST is grayscale, so mean and std are single values;
                                           # for RGB data each would be a 3-element vector.
    ])
)
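The mean and std passed to Normalize are never defined in the post. A common way to obtain them (an assumed reconstruction, not shown by the author) is to load the whole training set, transformed only with ToTensor, as one big batch and take its statistics:

import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader

# Plain ToTensor here: the statistics must be computed before Normalize exists.
raw_set = torchvision.datasets.FashionMNIST(
    root='./data', train=True, download=True,
    transform=transforms.ToTensor()
)
loader = DataLoader(raw_set, batch_size=len(raw_set))
data = next(iter(loader))[0]  # one batch of shape (60000, 1, 28, 28)
mean, std = data.mean(), data.std()  # roughly 0.2860 and 0.3530 for FashionMNIST
print(mean.item(), std.item())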
How to normalize a raw input batch by hand:
batch = batch.float()
batch /= 255.0  # scale pixel values from 0-255 down to 0-1
n_channels = batch.shape[1]  # number of channels
for c in range(n_channels):
    mean = torch.mean(batch[:, c])  # statistics of channel c only
    std = torch.std(batch[:, c])
    batch[:, c] = (batch[:, c] - mean) / std
# The results show that after the same number of epochs the normalized data reaches higher accuracy, i.e. it converges faster.
# This does not always hold, though: some data may train better without normalization, so it has to be tested.
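The run loop below indexes into a trainsets dict that the post never defines. A plausible reconstruction (assumed; the mean/std values are the FashionMNIST statistics computed as above) is two copies of the dataset, with and without standardization:

import torchvision
import torchvision.transforms as transforms

trainsets = {
    'not_normal': torchvision.datasets.FashionMNIST(
        root='./data', train=True, download=True,
        transform=transforms.Compose([transforms.ToTensor()])
    ),
    'normal': torchvision.datasets.FashionMNIST(
        root='./data', train=True, download=True,
        transform=transforms.Compose([
            transforms.ToTensor(),
            transforms.Normalize((0.2860,), (0.3530,))  # assumed FashionMNIST mean/std
        ])
    ),
}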
from collections import OrderedDict
import torch
import torch.nn.functional as F
import torch.optim as optim
from torch.utils.data import DataLoader

params = OrderedDict(
    lr=[.01]
    , batch_size=[1000]
    , num_workers=[1]
    , device=['cuda']
    , trainset=['not_normal', 'normal']
)
m = RunManager()
for run in RunBuilder.get_runs(params):
    device = torch.device(run.device)
    network = Network().to(device)  # the CNN defined earlier in the series
    loader = DataLoader(
        trainsets[run.trainset]
        , batch_size=run.batch_size
        , num_workers=run.num_workers
    )
    optimizer = optim.Adam(network.parameters(), lr=run.lr)

    m.begin_run(run, network, loader)
    for epoch in range(20):
        m.begin_epoch()
        for batch in loader:
            images = batch[0].to(device)
            labels = batch[1].to(device)
            preds = network(images)  # Pass Batch
            loss = F.cross_entropy(preds, labels)  # Calculate Loss
            optimizer.zero_grad()  # Zero Gradients
            loss.backward()  # Calculate Gradients
            optimizer.step()  # Update Weights

            m.track_loss(loss, batch)
            m.track_num_correct(preds, labels)
        m.end_epoch()
    m.end_run()
m.save('results')
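RunBuilder and RunManager are helper classes the post uses without defining; the code appears to follow the deeplizard PyTorch series. RunBuilder is conventionally a cartesian product over the OrderedDict values; a minimal sketch under that assumption:

from collections import namedtuple
from itertools import product

class RunBuilder:
    @staticmethod
    def get_runs(params):
        # One Run per combination of hyperparameter values, with each
        # value reachable by attribute: run.lr, run.batch_size, ...
        Run = namedtuple('Run', params.keys())
        return [Run(*v) for v in product(*params.values())]

RunManager is assumed to record per-epoch loss and accuracy (track_loss / track_num_correct) and write the collected results to disk in save(); its much longer implementation is not reconstructed here.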
Original post: https://www.cnblogs.com/Henry-ZHAO/p/13086589.html