Homework Thread | Baidu Deep Learning Camp
DJ星尘, posted 2019-12-09 23:08. Views: 29086. Replies: 949
Last edited 2020-06-28

The Baidu Deep Learning Camp has officially opened. Each stage's homework comes with its own rewards. Happy learning!

PS: If the thread has expired or your post fails review, first copy your content into a Word document, then follow the prompts to complete personal identity verification; after refreshing, paste the copied content back in and submit.

Everyone is welcome to sign up!

Homework for January 9:

Homework 9-1: Chapter 2 covered how to schedule learning-rate decay; piecewise decay with a decay factor of 0.1 is recommended here. Given how ResNet is currently training, at roughly which step should the decay kick in? Configure the learning-rate decay accordingly and retrain the ResNet model on the iChallenge-PM eye-disease dataset.

Homework 9-1 reward: 5 participants drawn at random each receive a PaddlePaddle notebook + data cable + PaddlePaddle stickers

Reply format: Homework 9-1: XXX

Lottery deadline: before 12:00 noon on January 13, 2020

Homework 9-2 reward: 5 participants drawn at random each receive a PaddlePaddle notebook + data cable + PaddlePaddle stickers

Reply format: Homework 9-2: XXX

Lottery deadline: before 12:00 noon on January 13, 2020

 

Homework for January 7:

Homework 8: If the Sigmoid activation in LeNet's intermediate layers is replaced with ReLU, what results do you get on the fundus screening dataset? Does the loss converge? Is the difference between ReLU and Sigmoid the cause of the different results? Share your view.

Homework 8 reward: 5 participants drawn at random each receive a PaddlePaddle notebook + data cable + PaddlePaddle stickers

Reply format: Homework 8: XXX

Winners: #820 thunder95, #819 你还说不想我吗, #818 百度用户#0762194095, #817 呵赫 he, #816 星光1dl

Homework for January 2:

Homework 7-1: count the total number of multiply and add operations in a convolution.

The input has shape [10, 3, 224, 224]; the kernel is kh = kw = 3 with 64 output channels, stride stride=1, and padding ph = pw = 1.

How many multiplies and adds does this convolution take in total?

Hint: first work out how many multiplies and adds one output pixel needs, then scale up to the total number of operations.

Submission: reply with the multiply and add counts, e.g. multiplies 1000, adds 1000
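Following the hint, the count can be checked with a short Python sketch (my own illustration, not part of the assignment; note the per-pixel add count depends on whether the bias add is included):

```python
# Counting multiply/add operations for the Homework 7-1 convolution.
N, C_in, H, W = 10, 3, 224, 224        # input shape [10, 3, 224, 224]
C_out, k, stride, pad = 64, 3, 1, 1    # 64 output channels, 3x3 kernel

H_out = (H + 2 * pad - k) // stride + 1    # 224
W_out = (W + 2 * pad - k) // stride + 1    # 224

# One output pixel multiplies a C_in x k x k input window by the kernel.
mults_per_pixel = C_in * k * k             # 27
# Summing 27 products takes 26 adds; including the bias add makes it 27.
adds_per_pixel = mults_per_pixel - 1 + 1

total_mults = N * C_out * H_out * W_out * mults_per_pixel
total_adds = N * C_out * H_out * W_out * adds_per_pixel
print(total_mults, total_adds)             # 867041280 867041280
```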

Homework 7-1 reward: 5 winners drawn for a custom PaddlePaddle notebook + data cable; deadline: before 12:00 noon on January 6, 2020

Reply format: Homework 7-1: XXX

Homework 7-2 reward: 5 drawn from the correct answers win a custom PaddlePaddle notebook + 50-yuan JD gift card; deadline: before 12:00 noon on January 6, 2020

 

Homework for December 31:

Homework 6-1:

1. Print each layer's output of the plain neural network and inspect it
2. Plot the classification-accuracy metric with the matplotlib (plt) library
3. Using classification accuracy, judge how well models trained with different loss functions perform
4. Plot and compare the model's loss curves on the training and test sets as training progresses
5. Tune the regularization weight, observe how the curves from step 4 change, and analyze why
Homework 6-1 reward: 5 winners drawn for a custom PaddlePaddle notebook + data cable. Reply format: Homework 6-1: XXX

Homework 6-2:

Correctly run the minimal version of Homework 3 from the AI Studio course 《百度架构师手把手教深度学习》, analyze problems or room for optimization in the training process, and optimize along these lines:

(1) Samples: data-augmentation methods

(2) Hypothesis: improve the network model

(3) Loss: try different loss functions

(4) Optimization: try different optimizers and learning rates

Goal: make the model's classification accuracy on the MNIST test set as high as possible.

Submit the code and model that achieve your best accuracy; the top 10 results will be judged for prizes.

Homework 6-2 reward: custom PaddlePaddle notebook + 50-yuan JD gift card

 

Homework for December 25:

Homework for December 23:

Homework 4-1: run Homework 2 on AI Studio and build the house-price prediction model with deep learning

Homework 4-1 reward: custom PaddlePaddle notebook + the textbook 《深度学习导论与应用实践》, awarded to participants ranked 2, 3, 23, 123, 223, 323, ...

Homework 4-2: answer the question below in a reply to this thread:

Having written house-price prediction in different ways, with plain Python and with a deep-learning framework, how do the hand-written Python model and the PaddlePaddle-based model compare, e.g. in program structure, ease of coding, prediction quality, and training time?

Reply format: Homework 4-2: XXX

Homework 4-2 reward: among submissions before 12:00 noon on Friday, December 27, we will pick the best five and send a custom Baidu data cable + the textbook 《深度学习导论与应用实践》


Homework for December 17:

Answer the two questions below in a reply to this thread. Reply format: Homework 3-1 (1) XX (2) XX

Homework reward: among submissions before 12:00 noon on December 20, 2019, 5 participants will be drawn at random for feedback; the prize is a notebook + data cable

Homework for December 12:

Winner: 12th place: 飞天雄者

Homework for December 10:
Homework 1-1: run the house-price prediction example on the AI Studio platform at https://aistudio.baidu.com/aistudio/education/group/info/888

Homework 1-1 reward: the first 3 to finish, plus those finishing 6th, 66th, 166th, 266th, 366th, 466th, 566th, and 666th, each receive a custom PaddlePaddle gift pack: a PaddlePaddle cap, data cable, and custom logo pen

Winners of Homework 1-1 are shown in the image:

Homework 1-2: answer the two questions below in a reply to this thread
① By analogy with the Newton's second law case, what other problems in your work and life could be solved within a supervised-learning framework? What are the hypothesis and parameters? What is the optimization objective?
② Why do AI engineers have good career prospects? How would you explain it from an economics (supply-and-demand) perspective?
Homework 1-2 reward: the top 5 replies by likes receive the textbook 《深度学习导论与应用实践》 + a custom PaddlePaddle notebook

Top 5 winners by likes: 1. 飞天雄者  2. God_s_apple  3. 177*******62  4. 学痞龙  5. 故乡237、qq526557820

Homework deadline: January 10, 2020. Only those who finish before then are eligible for the final Mac grand-prize drawing

 

How to sign up:

1. Join QQ group 726887660; the class adviser shares study materials, Q&A, and prize activities in the group

2. Click this link to enroll in the course and practice: https://aistudio.baidu.com/aistudio/course/introduce/888

Note: course recordings are uploaded to the AI Studio course 《百度架构师手把手教深度学习》 within 3 business days

 

949 replies in total; last reply by superbusiness0 on 2020-06-28 20:29
#873 百度用户#0762194095 replied on 2020-01-12 16:19:39

Homework 9-2: the modified parts of the program are as follows

1. Network structure

2. Training process

Run results after the changes:

#872 儒雅的sagapo replied on 2020-01-12 15:36:49

Homework 8:

After replacing Sigmoid with ReLU, the training loss clearly drops and converges, and accuracy clearly rises to over 90%. With the large images scaled down to small ones, the cause of this result is that the ReLU activation does not suffer from vanishing gradients.
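The vanishing-gradient point can be illustrated with a small NumPy sketch (my own addition, not from the reply): Sigmoid's derivative never exceeds 0.25 and decays quickly away from zero, so stacking layers multiplies many small factors, while ReLU's derivative is exactly 1 for any positive input.

```python
# Comparing the local gradients of Sigmoid and ReLU.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([-4.0, -1.0, 0.0, 1.0, 4.0])

# Sigmoid's derivative s(x) * (1 - s(x)) peaks at 0.25 (at x = 0)
# and shrinks rapidly away from zero.
sig_grad = sigmoid(x) * (1 - sigmoid(x))

# ReLU's derivative is 1 for positive inputs, 0 otherwise, so gradients
# pass through active units unattenuated.
relu_grad = (x > 0).astype(float)

print(sig_grad.max())   # 0.25
print(relu_grad)        # [0. 0. 0. 1. 1.]
```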

#871 万国风云 replied on 2020-01-12 11:21:20

Homework 6-1:

The printed network structure and parameters:

The loss curve with the Adam optimizer and no regularization term:

Performance on the test set:

 

With a regularization term added and penalty coefficients of 0, 0.05, 0.1, and 0.15, performance on the training and test sets is:

 

Note that without the regularization term the test-set loss stays below 0.25, averaging around 0.15, while with it the test-set loss exceeds 0.25 and climbs as high as 0.8. So the model was not yet overfitting without regularization; adding it over-penalized the model's capacity and pushed it into underfitting, which is why the test-set loss grew.

#870 星光ld1 replied on 2020-01-11 22:31:57

Homework 9-1

# 9-1
def train_optimizer(model, optimizer):
    with fluid.dygraph.guard():
        print('start training ... ')
        model.train()
        epoch_num = 20
        opt = optimizer
        BATCH_SIZE = 32
        # Create the data readers: one for training, one for validation
        train_loader = data_loader(DATADIR, batch_size=BATCH_SIZE, mode='train')
        valid_loader = valid_data_loader(DATADIR2, CSVFILE)
        iter_list = []
        loss_list = []
        iters = 0
        for epoch in range(epoch_num):
            for batch_id, data in enumerate(train_loader()):
                x_data, y_data = data
                img = fluid.dygraph.to_variable(x_data)
                label = fluid.dygraph.to_variable(y_data)
                # Forward pass to get predictions
                logits = model(img)
                # Compute the loss
                loss = fluid.layers.sigmoid_cross_entropy_with_logits(logits, label)
                avg_loss = fluid.layers.mean(loss)

                if batch_id % 10 == 0:
                    print("epoch: {}, batch_id: {}, loss is: {}".format(epoch, batch_id, avg_loss.numpy()))
                # Backpropagate, update the weights, clear the gradients
                avg_loss.backward()
                opt.minimize(avg_loss)
                model.clear_gradients()

                iters += BATCH_SIZE
                iter_list.append(iters)
                loss_list.append(avg_loss.numpy())

            model.eval()
            accuracies = []
            losses = []
            for batch_id, data in enumerate(valid_loader()):
                x_data, y_data = data
                img = fluid.dygraph.to_variable(x_data)
                label = fluid.dygraph.to_variable(y_data)
                # Forward pass to get predictions
                logits = model(img)
                # Binary classification: threshold the sigmoid output at 0.5
                # Compute the sigmoid probabilities, then the loss
                pred = fluid.layers.sigmoid(logits)
                loss = fluid.layers.sigmoid_cross_entropy_with_logits(logits, label)
                # Probability of the negative class
                pred2 = pred * (-1.0) + 1.0
                # Concatenate both class probabilities along axis 1
                pred = fluid.layers.concat([pred2, pred], axis=1)
                acc = fluid.layers.accuracy(pred, fluid.layers.cast(label, dtype='int64'))
                accuracies.append(acc.numpy())
                losses.append(loss.numpy())
            print("[validation] accuracy/loss: {}/{}".format(np.mean(accuracies), np.mean(losses)))
            model.train()

        # Save the model parameters
        fluid.save_dygraph(model.state_dict(), 'mnist')
        # Save the optimizer state
        fluid.save_dygraph(opt.state_dict(), 'mnist')
    return iter_list, loss_list

Testing on this basis with a fixed 0.001 learning rate gives the results below, so the learning-rate decay point is set at step 3000//32.

The code is therefore modified as follows:

boundaries = [3000//32,5000//32]
values = [1e-3, 1e-4, 1e-5]
opt = fluid.optimizer.Momentum(
    learning_rate=fluid.layers.piecewise_decay(boundaries=boundaries, values=values),
    momentum=0.9
)
with fluid.dygraph.guard():
    model = ResNet("ResNet")
iter_list, loss_list = train_optimizer(model, opt)

 

#869 星光ld1 replied on 2020-01-11 20:46:49

Homework 9-2:

Code:

# 9-2
# GoogLeNet_full model code
import numpy as np
import paddle
import paddle.fluid as fluid
from paddle.fluid.layer_helper import LayerHelper
from paddle.fluid.dygraph.nn import Conv2D, Pool2D, BatchNorm, FC
from paddle.fluid.dygraph.base import to_variable

# Define the Inception block
class Inception(fluid.dygraph.Layer):
    def __init__(self, name_scope, c1, c2, c3, c4, **kwargs):
        '''
        Implementation of the Inception block.
        name_scope: module name, string
        c1: output channels of the 1x1 conv on branch 1 in figure (b), int
        c2: output channels of the convs on branch 2 in figure (b), tuple or list,
            where c2[0] is for the 1x1 conv and c2[1] for the 3x3 conv
        c3: output channels of the convs on branch 3 in figure (b), tuple or list,
            where c3[0] is for the 1x1 conv and c3[1] for the 5x5 conv
        c4: output channels of the 1x1 conv on branch 4 in figure (b), int
        '''
        super(Inception, self).__init__(name_scope)
        # Create the operations used on each branch of the Inception block
        self.p1_1 = Conv2D(self.full_name(), num_filters=c1,
                           filter_size=1, act='relu')
        self.p2_1 = Conv2D(self.full_name(), num_filters=c2[0],
                           filter_size=1, act='relu')
        self.p2_2 = Conv2D(self.full_name(), num_filters=c2[1],
                           filter_size=3, padding=1, act='relu')
        self.p3_1 = Conv2D(self.full_name(), num_filters=c3[0],
                           filter_size=1, act='relu')
        self.p3_2 = Conv2D(self.full_name(), num_filters=c3[1],
                           filter_size=5, padding=2, act='relu')
        self.p4_1 = Pool2D(self.full_name(), pool_size=3,
                           pool_stride=1, pool_padding=1,
                           pool_type='max')
        self.p4_2 = Conv2D(self.full_name(), num_filters=c4,
                           filter_size=1, act='relu')

    def forward(self, x):
        # Branch 1: a single 1x1 conv
        p1 = self.p1_1(x)
        # Branch 2: 1x1 conv + 3x3 conv
        p2 = self.p2_2(self.p2_1(x))
        # Branch 3: 1x1 conv + 5x5 conv
        p3 = self.p3_2(self.p3_1(x))
        # Branch 4: max pooling + 1x1 conv
        p4 = self.p4_2(self.p4_1(x))
        # Concatenate the branch outputs along the channel axis
        return fluid.layers.concat([p1, p2, p3, p4], axis=1)
    
class GoogLeNet_Full(fluid.dygraph.Layer):
    def __init__(self, name_scope):
        super(GoogLeNet_Full, self).__init__(name_scope)
        # GoogLeNet has five stages, each followed by a pooling layer
        self.conv1 = Conv2D(self.full_name(), num_filters=64, filter_size=7, 
                            stride=2, padding=3, act='relu')
        self.pool1 = Pool2D(self.full_name(), pool_size=3, pool_stride=2,  
                            pool_padding=1, pool_type='max')
        self.conv2_1 = Conv2D(self.full_name(), num_filters=64, filter_size=1, act='relu')
        self.conv2_2 = Conv2D(self.full_name(), num_filters=192, filter_size=3,
                            stride=1, padding=1, act='relu')
        self.pool2 = Pool2D(self.full_name(), pool_size=3, pool_stride=2,  
                            pool_padding=1, pool_type='max')

        self.inception3a = Inception(self.full_name(), 64, (96, 128), (16, 32), 32)
        self.inception3b = Inception(self.full_name(), 128, (128, 192), (32, 96), 64)
        self.pool3 = Pool2D(self.full_name(), pool_size=3, pool_stride=2,  
                            pool_padding=1, pool_type='max')
        self.inception4a = Inception(self.full_name(), 192, (96, 208), (16, 48), 64)
        self.inception4b = Inception(self.full_name(), 160, (112, 224), (24, 64), 64)
        self.inception4c = Inception(self.full_name(), 128, (128, 256), (24, 64), 64)
        self.inception4d = Inception(self.full_name(), 112, (144, 288), (32, 64), 64)
        self.inception4e = Inception(self.full_name(), 256, (160, 320), (32, 128), 128)
        self.pool4 = Pool2D(self.full_name(), pool_size=3, pool_stride=2,  
                            pool_padding=1, pool_type='max')
        self.inception5a = Inception(self.full_name(), 256, (160, 320), (32, 128), 128)
        self.inception5b = Inception(self.full_name(), 384, (192, 384), (48, 128), 128)
        self.pool5 = Pool2D(self.full_name(), pool_stride=1,
                            global_pooling=True, pool_type='avg')
        self.drop_ratio = 0.7
        self.fc = FC(self.full_name(),  size=1)
        
        # Branch1
        self.br1_pool = Pool2D(self.full_name(), pool_stride=1,
                               global_pooling=True, pool_type='avg')
        self.br1_fc1 = FC(self.full_name(),  size=128)
        self.br1_fc2 = FC(self.full_name(),  size=1024)
        self.br1_drop_ratio = 0.7
        self.br1_fc3 = FC(self.full_name(),  size=1)
        # Branch2
        self.br2_pool = Pool2D(self.full_name(), pool_stride=1,
                               global_pooling=True, pool_type='avg')
        self.br2_fc1 = FC(self.full_name(),  size=128)
        self.br2_fc2 = FC(self.full_name(),  size=1024)
        self.br2_drop_ratio = 0.7
        self.br2_fc3 = FC(self.full_name(),  size=1)

    def forward(self, x):
        # Main_branch
        x = self.pool1(self.conv1(x))
        x = self.pool2(self.conv2_2(self.conv2_1(x)))
        
        x = self.inception4a(self.pool3(self.inception3b(self.inception3a(x))))
        branch1_input = x
        x = self.inception4d(self.inception4c(self.inception4b(x)))
        branch2_input = x
        x = self.pool4(self.inception4e(x))
        
        x = self.pool5(self.inception5b(self.inception5a(x)))
        x = fluid.layers.dropout(x, self.drop_ratio)
        out = self.fc(x)
        
        
        # Branch 1 
        out1 = self.br1_fc3(fluid.layers.dropout(self.br1_fc2(self.br1_fc1(self.br1_pool(branch1_input))),self.br1_drop_ratio))
        # Branch 2
        out2 = self.br2_fc3(fluid.layers.dropout(self.br2_fc2(self.br2_fc1(self.br2_pool(branch2_input))),self.br2_drop_ratio))
        return out, out1, out2
		
# Define the training loop
def new_train(model):
    with fluid.dygraph.guard():
        print('start training ... ')
        model.train()
        epoch_num = 5
        # Define the optimizer
        opt = fluid.optimizer.Momentum(learning_rate=0.001, momentum=0.9)
        # Create the data readers: one for training, one for validation
        train_loader = data_loader(DATADIR, batch_size=10, mode='train')
        valid_loader = valid_data_loader(DATADIR2, CSVFILE)
        for epoch in range(epoch_num):
            for batch_id, data in enumerate(train_loader()):
                x_data, y_data = data
                img = fluid.dygraph.to_variable(x_data)
                label = fluid.dygraph.to_variable(y_data)
                # Forward pass: main output plus the two auxiliary heads
                logits, logits1, logits2 = model(img)
                # Compute the loss for each head and combine them
                loss = fluid.layers.sigmoid_cross_entropy_with_logits(logits, label)
                loss1 = fluid.layers.sigmoid_cross_entropy_with_logits(logits1, label)
                loss2 = fluid.layers.sigmoid_cross_entropy_with_logits(logits2, label)
                loss_total = 0.6*loss + 0.2*loss1 + 0.2*loss2
                avg_loss = fluid.layers.mean(loss_total)

                if batch_id % 10 == 0:
                    print("epoch: {}, batch_id: {}, loss is: {}".format(epoch, batch_id, avg_loss.numpy()))
                # Backpropagate, update the weights, clear the gradients
                avg_loss.backward()
                opt.minimize(avg_loss)
                model.clear_gradients()

            model.eval()
            accuracies = []
            losses = []
            for batch_id, data in enumerate(valid_loader()):
                x_data, y_data = data
                img = fluid.dygraph.to_variable(x_data)
                label = fluid.dygraph.to_variable(y_data)
                # Forward pass: main output plus the two auxiliary heads
                logits, logits1, logits2 = model(img)
                # Binary classification: threshold the sigmoid output at 0.5
                # Compute the sigmoid probabilities, then the loss
                pred = fluid.layers.sigmoid(logits)
                pred1 = fluid.layers.sigmoid(logits1)
                pred2 = fluid.layers.sigmoid(logits2)
                pred_total = 0.6*pred + 0.2*pred1 + 0.2*pred2
                loss = fluid.layers.sigmoid_cross_entropy_with_logits(logits, label)
                loss1 = fluid.layers.sigmoid_cross_entropy_with_logits(logits1, label)
                loss2 = fluid.layers.sigmoid_cross_entropy_with_logits(logits2, label)
                loss_total = 0.6*loss + 0.2*loss1 + 0.2*loss2
                # Probability of the negative class
                pred_total2 = pred_total * (-1.0) + 1.0
                # Concatenate both class probabilities along axis 1
                pred = fluid.layers.concat([pred_total2, pred_total], axis=1)
                acc = fluid.layers.accuracy(pred, fluid.layers.cast(label, dtype='int64'))
                accuracies.append(acc.numpy())
                losses.append(loss_total.numpy())
            print("[validation] accuracy/loss: {}/{}".format(np.mean(accuracies), np.mean(losses)))
            model.train()

        # Save the model parameters
        fluid.save_dygraph(model.state_dict(), 'mnist')
        # Save the optimizer state
        fluid.save_dygraph(opt.state_dict(), 'mnist')

Training results:

#868 Casla711 replied on 2020-01-11 13:55:53

Homework 6-1:
1. Print each layer's output of the plain neural network and inspect it

########## print network layer's superparams ##############
conv1-- kernel_size:[20, 1, 5, 5], padding:[2, 2], stride:[1, 1]
conv2-- kernel_size:[20, 20, 5, 5], padding:[2, 2], stride:[1, 1]
pool1-- pool_type:max, pool_size:[2, 2], pool_stride:[2, 2]
pool2-- pool_type:max, poo2_size:[2, 2], pool_stride:[2, 2]
fc-- weight_size:[980, 10], bias_size_[10], activation:softmax

########## print shape of features of every layer ###############
inputs_shape: [100, 1, 28, 28]
outputs1_shape: [100, 20, 28, 28]
outputs2_shape: [100, 20, 14, 14]
outputs3_shape: [100, 20, 14, 14]
outputs4_shape: [100, 20, 7, 7]
outputs5_shape: [100, 10]

########## print outputs of every layer ###############
inputs: name tmp_10, dtype: VarType.FP32, shape: [28, 28], layout: NCHW
	data: [0 0 0 ... 0.664062 0.847656 0.0625 ... 0]  (28x28 input pixels, truncated)

outputs1: name tmp_12, dtype: VarType.FP32, shape: [28, 28], layout: NCHW
	data: [0 0 0 ... 0.107743 -0.0425535 -0.396745 ... 0]  (conv1 feature map, truncated)

outputs2: name tmp_14, dtype: VarType.FP32, shape: [14, 14], layout: NCHW
	data: [0 0 0 ... 0.275182 -0.0425535 0.963963 ... 0]  (pool1 feature map, truncated)

outputs3: name tmp_16, dtype: VarType.FP32, shape: [14, 14], layout: NCHW
	data: [1.86204e-08 6.96134e-09 ... 0.0738617 -0.0882374 ... -1.61186e-08]  (conv2 feature map, truncated)

outputs4: name tmp_18, dtype: VarType.FP32, shape: [7, 7], layout: NCHW
	data: [2.72922e-08 0.0953356 0.501779 ... 0.126023 3.21026e-08]  (pool2 feature map, truncated)

outputs5: name tmp_20, dtype: VarType.FP32, shape: [1], layout: NCHW
	data: [0.0378569]

2. Plot the classification-accuracy metric with the matplotlib (plt) library

paramnames = ['Momentum', 'SGD', 'Adagrad', 'Adam']
with fluid.dygraph.guard():
    print('start evaluation .......')
    # Load the model parameters
    model = MNIST("mnist")
    eval_loader = load_data('eval')
    acc_set_all = []
    iters=[]
    for paramname in paramnames:
        model_state_dict, _ = fluid.load_dygraph(paramname + 'mnist')
        model.load_dict(model_state_dict)
    
        model.eval()
    
        acc_set = []
        iter=0
        iters=[]
        avg_loss_set = []
        for batch_id, data in enumerate(eval_loader()):
            x_data, y_data = data
            img = fluid.dygraph.to_variable(x_data)
            label = fluid.dygraph.to_variable(y_data)
            prediction, acc = model(img, label)
            loss = fluid.layers.cross_entropy(input=prediction, label=label)
            avg_loss = fluid.layers.mean(loss)
            iters.append(iter)
            # losses.append(avg_loss.numpy())
            iter = iter + 1
            acc_set.append(float(acc.numpy()))
            avg_loss_set.append(float(avg_loss.numpy()))
    
        # Average loss and accuracy over multiple batches
        acc_val_mean = np.array(acc_set).mean()
        avg_loss_val_mean = np.array(avg_loss_set).mean()
    
        print('loss={}, acc={}'.format(avg_loss_val_mean, acc_val_mean))
        acc_set_all.append(acc_set)

    import matplotlib.pyplot as plt
    colors = ['red', 'green', 'yellow', 'blue']
    markers = ['o', 'v', 's', '+']
    legends = ['Momentum', 'SGD', 'Adagrad', 'Adam']
    plt.figure()
    plt.title("ACC", fontsize=24)
    plt.xlabel('iter', fontsize=24)
    plt.ylabel('acc', fontsize=24)
    for i, acc in enumerate(acc_set_all):
        plt.plot(iters, acc, color=colors[i], marker=markers[i])
    plt.legend(legends)
    plt.show()

3. Using classification accuracy, judge how well models trained with different loss functions perform
     As the plot above shows, Adam performs best.

4. Plot and compare the model's loss curves on the training and test sets as training progresses

# Import matplotlib
import matplotlib.pyplot as plt

with fluid.dygraph.guard(place):
    model = MNIST("mnist")

    # Four optimizer options; try them one at a time
    #optimizer = fluid.optimizer.SGDOptimizer(learning_rate=0.01)
    #optimizer = fluid.optimizer.MomentumOptimizer(learning_rate=0.01)
    #optimizer = fluid.optimizer.AdagradOptimizer(learning_rate=0.01)
    optimizer = fluid.optimizer.AdamOptimizer(learning_rate=0.01)
    #optimizer = fluid.optimizer.AdamOptimizer(learning_rate=0.01, regularization=fluid.regularizer.L2Decay(regularization_coeff=0.1))

    EPOCH_NUM = 5
    iter = 0
    iters = []
    train_losses = []
    eval_losses = []
    eval_loader = load_data('eval')
    for epoch_id in range(EPOCH_NUM):
        for batch_id, data in enumerate(train_loader()):
            model.train()
            # Prepare the batch
            image_data, label_data = data
            image = fluid.dygraph.to_variable(image_data)
            label = fluid.dygraph.to_variable(label_data)

            # Forward pass; returns both predictions and accuracy
            predict, avg_acc = model(image, label)

            # Average the loss over the batch
            loss = fluid.layers.cross_entropy(predict, label)
            avg_loss = fluid.layers.mean(loss)

            # Log the training loss every 100 batches
            if batch_id % 100 == 0:
                print("train epoch: {}, batch: {}, loss is: {}, acc is {}".format(epoch_id, batch_id, avg_loss.numpy(), avg_acc.numpy()))
                train_losses.append(avg_loss.numpy())

            # Backward pass and parameter update
            avg_loss.backward()
            optimizer.minimize(avg_loss)
            model.clear_gradients()

            if batch_id % 100 == 0:
                # Evaluate (use a separate loop variable so the outer
                # batch_id is not shadowed)
                model.eval()
                for eval_batch_id, eval_data in enumerate(eval_loader()):
                    x_data, y_data = eval_data
                    img = fluid.dygraph.to_variable(x_data)
                    eval_label = fluid.dygraph.to_variable(y_data)
                    prediction, acc = model(img, eval_label)
                    eval_loss = fluid.layers.cross_entropy(input=prediction, label=eval_label)
                    avg_eval_loss = fluid.layers.mean(eval_loss)
                    # eval has only about 100 batches; log the first one
                    if eval_batch_id % 100 == 0:
                        print("eval epoch: {}, loss is: {}, acc is {}".format(epoch_id, avg_eval_loss.numpy(), acc.numpy()))
                        iters.append(iter)
                        eval_losses.append(avg_eval_loss.numpy())
                        iter = iter + 100


    # Plot the loss curves over training
    plt.figure()
    plt.title("LOSS", fontsize=24)
    plt.xlabel("iter", fontsize=14)
    plt.ylabel("loss", fontsize=14)
    plt.plot(iters, train_losses, color='red', marker='s')
    plt.plot(iters, eval_losses, color='blue', marker='v')
    plt.legend(['train loss', 'eval loss'])
    plt.grid()
    plt.show()

5. Tune the regularization weight, observe how the curves from step 4 change, and analyze why

    The code is as above, with regularization_coeff = 0.1, 0.5, 0.9.

As regularization_coeff grows, the loss oscillates more and more strongly, and the test-set loss drops below the training-set loss. A larger weight penalizes model complexity more heavily, so the trade-off between "minimizing training loss" and "preserving generalization ability" tilts further toward generalization.

#867 awesomezzzz000 replied on 2020-01-11 13:47:13

Homework 6-1:

1. Print each layer's output of the plain neural network and inspect it

########## print network layer's superparams ##############
conv1-- kernel_size:[20, 1, 5, 5], padding:[2, 2], stride:[1, 1]
conv2-- kernel_size:[20, 20, 5, 5], padding:[2, 2], stride:[1, 1]
pool1-- pool_type:max, pool_size:[2, 2], pool_stride:[2, 2]
pool2-- pool_type:max, poo2_size:[2, 2], pool_stride:[2, 2]
fc-- weight_size:[980, 10], bias_size_[10], activation:softmax

########## print shape of features of every layer ###############
inputs_shape: [100, 1, 28, 28]
outputs1_shape: [100, 20, 28, 28]
outputs2_shape: [100, 20, 14, 14]
outputs3_shape: [100, 20, 14, 14]
outputs4_shape: [100, 20, 7, 7]
outputs5_shape: [100, 10]
epoch: 0, batch: 0, loss is: [2.6448748], acc is [0.97]
epoch: 0, batch: 200, loss is: [0.42326134], acc is [0.86]
epoch: 0, batch: 400, loss is: [0.2978367], acc is [0.92]
2. Plot the classification-accuracy metric with the matplotlib (plt) library

Loss function: cross entropy

3. Using classification accuracy, judge how well models trained with different loss functions perform

Loss function: mean squared error

Comparing with the result from question 2, MSE really is a poor fit for classification problems; it is better suited to regression.
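One way to see why (a sketch of my own, not from the reply): with a sigmoid output unit, the MSE gradient carries a factor of the sigmoid's derivative, which vanishes exactly where the prediction is confidently wrong, while the cross-entropy gradient does not.

```python
# Gradients w.r.t. the pre-activation z for a sigmoid output unit
# that is confidently wrong (target y = 1, output p near 0).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

z = -6.0            # pre-activation: confidently wrong (p ~ 0.0025)
y = 1.0             # true label
p = sigmoid(z)

# d/dz of the MSE loss (p - y)^2 / 2 is (p - y) * p * (1 - p): tiny,
# because the sigmoid saturates exactly where the error is largest.
grad_mse = (p - y) * p * (1 - p)

# d/dz of the cross-entropy loss -y*log(p) - (1-y)*log(1-p) is p - y:
# it stays near -1, so learning continues.
grad_ce = p - y

print(abs(grad_mse) < 0.01, abs(grad_ce) > 0.9)  # True True
```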

4. Plot and compare the model's loss curves on the training and test sets as training progresses

The test data are 2000 images sampled at random from the test set.

5. Tune the regularization weight, observe how the curves from step 4 change, and analyze why

The L2 coefficient was raised from 0.1 to 0.6. Compared with question 4, the loss is higher and the model is harder to train. A main reason is that after training it is hard to keep the cross-entropy loss small while also keeping the parameters uniformly small, so a large share of the total loss comes from the L2 term.
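The claim that the L2 term comes to dominate can be sanity-checked with a toy calculation (hypothetical numbers, not the poster's actual model): for a fixed data loss and fixed weights, the penalty's share of the total loss grows with the coefficient.

```python
# Toy check: share of the total loss contributed by the L2 penalty
# as the coefficient grows, with the data loss and weights held fixed.
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=(980, 10))   # weights of a small fc layer
data_loss = 0.05                             # an assumed cross-entropy loss

shares = []
for coeff in (0.1, 0.6):
    l2_term = coeff * np.sum(w ** 2)
    shares.append(l2_term / (data_loss + l2_term))

# Raising the coefficient from 0.1 to 0.6 raises the L2 share.
print(shares[0] < shares[1])  # True
```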

 

 

#866 自在e鹤悠 replied on 2020-01-11 10:16:12

Homework 6-1:

1. Running the 2-8 code prints the network's parameter sizes, output feature shapes, and parameter values.

2. Modifying the 2-8 code plots how classification accuracy changes during training.

3. Below is the classification accuracy obtained with cross entropy as the loss function:

Below is the classification accuracy obtained with the noise-contrastive estimation (NCE) loss:

4. Below are the training-set and test-set curves:

5. Regularization weight 0.01, other hyperparameters unchanged:

Regularization weight 0.04, other hyperparameters unchanged:

Regularization weight 0.07, other hyperparameters unchanged:

Regularization weight 0.1, other hyperparameters unchanged:

#865 hyy永勇 replied on 2020-01-10 23:29:55

Homework 2-2:

import numpy as np
import matplotlib.pyplot as plt

class Network(object):
    def __init__(self, num_of_weights):
        # Random seed (left disabled)
        # np.random.seed(0)
        self.w1 = np.random.randn(num_of_weights, num_of_weights)
        self.w2 = np.random.randn(num_of_weights, 1)
        self.b1 = np.random.randn(num_of_weights)
        self.b2 = 0.

    def forward1(self, x):
        z = np.dot(x, self.w1) + self.b1
        return z

    def forward2(self, x):
        z = np.dot(x, self.w2) + self.b2
        return z

    def loss(self, z, y):
        error = z - y
        num_samples = error.shape[0]
        cost = error * error
        cost = np.sum(cost) / num_samples
        return cost

    def gradient(self, z, x, y):
        N = x.shape[0]
        gradient_w = 1. / N * np.sum((z - y) * x, axis=0)
        gradient_w = gradient_w[:, np.newaxis]
        gradient_b = 1. / N * np.sum(z - y)
        return gradient_w, gradient_b

    def update1(self, gradient_w, gradient_b, eta=0.01):
        self.w1 = self.w1 - eta * gradient_w
        self.b1 = self.b1 - eta * gradient_b

    def update2(self, gradient_w, gradient_b, eta=0.01):
        self.w2 = self.w2 - eta * gradient_w
        self.b2 = self.b2 - eta * gradient_b

    def train(self, training_data, num_epoches, batch_size=10, eta=0.01):
        n = len(training_data)
        losses = []
        for epoch_id in range(num_epoches):
            # Shuffle, then take batch_size records at a time
            np.random.shuffle(training_data)
            mini_batches = [training_data[k:k + batch_size] for k in range(0, n, batch_size)]
            for iter_id, mini_batch in enumerate(mini_batches):
                x = mini_batch[:, :-1]
                y = mini_batch[:, -1:]
                a1 = self.forward1(x)
                a2 = self.forward2(a1)

                loss = self.loss(a2, y)
                gradient_w2, gradient_b2 = self.gradient(a2, a1, y)
                self.update2(gradient_w2, gradient_b2, eta)

                gradient_w1, gradient_b1 = self.gradient(a1, x, a1)
                self.update1(gradient_w1, gradient_b1, eta)
                losses.append(loss)
                print('Epoch {:3d} / iter {:3d}, loss = {:.4f}'.format(epoch_id, iter_id, loss))

        return losses

# Load the data
train_data, test_data = load_data()

# Build the network
net = Network(13)
# Start training
losses = net.train(train_data, num_epoches=50, batch_size=100, eta=0.1)

plot_x = np.arange(len(losses))
plot_y = np.array(losses)
plt.plot(plot_x, plot_y)
plt.show()

#864 hyy永勇 replied on 2020-01-10 23:26:02

Homework 2-1

#863 hyy永勇 replied on 2020-01-10 23:11:57

Homework 1-2: answer the two questions below

① By analogy with the Newton's second law example, what other problems in your work and life could be solved within the supervised learning framework? What are the hypothesis and parameters? What is the optimization objective?
② Why do AI engineers have good career prospects? How can this be interpreted from an economic (market supply-and-demand) perspective?

Answer:

① F = P * S: the force is proportional to the contact area. This problem can also be framed as supervised learning. Assume the pressure is fixed, so force is a linear function of area; the parameter is the pressure, and the optimization objective is to find the pressure that best fits the data.
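As a sketch of that framing (with made-up numbers; `S`, `F`, and `true_P` are hypothetical), the pressure can be estimated from (area, force) observations by least squares:

```python
import numpy as np

np.random.seed(0)
S = np.array([1.0, 2.0, 3.0, 4.0])                        # contact areas
true_P = 5.0                                              # pressure to recover
F = true_P * S + np.random.normal(0, 0.01, size=S.shape)  # noisy force readings

# Closed-form least-squares estimate for F = P * S: P = sum(S*F) / sum(S*S)
P_hat = np.sum(S * F) / np.sum(S * S)
print(f"estimated pressure: {P_hat:.3f}")
```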

② With the explosion of big data, growing compute power, and steadily maturing algorithms, there is much to explore in the AI industry and large amounts of capital are flowing in, while domestic AI engineers remain relatively few and uneven in quality. AI engineers therefore have very good prospects.

 

#862 儒雅的sagapo replied on 2020-01-10 22:18:32

Homework 7-1: multiplications: 867041280, additions: 867041280
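Those figures check out with a quick calculation (counting, per output value, 3×3×3 = 27 multiplies, and 26 sums plus 1 bias add = 27 additions):

```python
# Operation count for the Jan 2 conv problem:
# input [10, 3, 224, 224], 3x3 kernel, 64 output channels, stride 1, padding 1.
N, C_in, H, W = 10, 3, 224, 224
C_out, kh, kw = 64, 3, 3

# Stride 1 with padding 1 keeps the output spatial size at 224x224.
out_values = N * C_out * H * W
ops_per_value = C_in * kh * kw      # 27 multiplies; also 26 sums + 1 bias add

mults = out_values * ops_per_value
adds = out_values * ops_per_value
print(mults, adds)  # 867041280 867041280
```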

Homework 7-2:

#861 Casla711 replied on 2020-01-10 22:08:29

Homework 5-1:

Adam optimizer, accuracy 0.99

use_gpu = True
place = fluid.CUDAPlace(0) if use_gpu else fluid.CPUPlace()

with fluid.dygraph.guard(place):
    print('start evaluation .......')
    # Load the model parameters
    model = MNIST("mnist")
    model_state_dict, _ = fluid.load_dygraph('mnist')
    model.load_dict(model_state_dict)

    model.eval()
    eval_loader = load_data('eval')

    acc_set = []
    avg_loss_set = []
    for batch_id, data in enumerate(eval_loader()):
        x_data, y_data = data
        img = fluid.dygraph.to_variable(x_data)
        label = fluid.dygraph.to_variable(y_data)
        prediction, acc = model(img, label)
        loss = fluid.layers.cross_entropy(input=prediction, label=label)
        avg_loss = fluid.layers.mean(loss)
        acc_set.append(float(acc.numpy()))
        avg_loss_set.append(float(avg_loss.numpy()))

    # Average loss and accuracy over all evaluation batches
    acc_val_mean = np.array(acc_set).mean()
    avg_loss_val_mean = np.array(avg_loss_set).mean()

    print('loss={}, acc={}'.format(avg_loss_val_mean, acc_val_mean))

Homework 5-2: common convolutional neural networks include VGGNet, GoogLeNet, AlexNet, YOLO, SSD, Fast R-CNN, Faster R-CNN, and so on.
Homework 5-3: Adam works best; learning rates between 0.01 and 0.001 are optimal, while 0.1 oscillates.

use_gpu = True
place = fluid.CUDAPlace(0) if use_gpu else fluid.CPUPlace()

with fluid.dygraph.guard(place):
    # Call the data-loading function
    train_loader = load_data('train')
    EPOCH_NUM = 5
    BATCH_SIZE = 100

    all_losses = []
    lrs = [0.1, 0.01, 0.001, 0.0001]
    for i in range(4):
        f_losses = []
        for lr in lrs:
            # Start each run from a freshly initialized model so the
            # optimizer / learning-rate comparison is fair
            model = MNIST("mnist")
            model.train()
            if i == 0:
                optimizer = fluid.optimizer.AdamOptimizer(learning_rate=lr)
            elif i == 1:
                optimizer = fluid.optimizer.SGDOptimizer(learning_rate=lr)
            elif i == 2:
                optimizer = fluid.optimizer.MomentumOptimizer(learning_rate=lr, momentum=0.1)
            else:
                optimizer = fluid.optimizer.AdagradOptimizer(learning_rate=lr)
            iters = []
            step = 0
            l_losses = []
            for epoch_id in range(EPOCH_NUM):
                for batch_id, data in enumerate(train_loader()):
                    # Prepare the data (much more concise now)
                    image_data, label_data = data
                    image = fluid.dygraph.to_variable(image_data)
                    label = fluid.dygraph.to_variable(label_data)
                    # Forward pass; get model outputs and accuracy together
                    predict, acc = model(image, label)
                    avg_acc = fluid.layers.mean(acc)

                    # Compute the loss, averaged over the batch
                    loss = fluid.layers.cross_entropy(predict, label)
                    avg_loss = fluid.layers.mean(loss)

                    # Print the current loss every 200 batches
                    if batch_id % 200 == 0:
                        print("epoch: {}, batch: {}, loss is: {}, acc is {}".format(epoch_id, batch_id, avg_loss.numpy(), avg_acc.numpy()))
                        iters.append(step)
                        l_losses.append(avg_loss.numpy())
                        step = step + 200

                    # Backward pass and parameter update
                    avg_loss.backward()
                    optimizer.minimize(avg_loss)
                    model.clear_gradients()
            f_losses.append(l_losses)
        all_losses.append(f_losses)
    # fluid.save_dygraph(model.state_dict(), 'mnist')

# Plot the loss curves from training
titles = ['AdamOptimizer', 'SGDOptimizer', 'MomentumOptimizer', 'AdagradOptimizer']
colors = ['yellow', 'green', 'blue', 'red']
markers = ['o', 'v', 'x', '+']
legends = ['lr_0.1', 'lr_0.01', 'lr_0.001', 'lr_0.0001']
for i, losses in enumerate(all_losses):
    # plt.figure()
    plt.title(titles[i], fontsize=24)
    plt.xlabel("iter", fontsize=14)
    plt.ylabel("loss", fontsize=14)
    for j, ls in enumerate(losses):
        plt.plot(iters, ls, color=colors[j], marker=markers[j])
    plt.legend(legends, loc='upper right')
    plt.grid()
    plt.show()

 

#860 万国风云 replied on 2020-01-10 21:32:19

Homework 5-3:

With the learning rate fixed at 0.01 for all optimizers, the Adam algorithm achieves the best results.

Adam's training results at different learning rates are as follows:

A learning rate of 0.015 works best.

#859 180******83 replied on 2020-01-10 21:24:18

Homework 8

After replacing the Sigmoid activations in LeNet's intermediate layers with ReLU, the loss still converges on the fundus screening dataset.


ReLU converges faster during training.

#858 yuzaihuan replied on 2020-01-10 20:54:00

Homework 9-1:

The figure below shows the training results with a learning rate of 0.001.

With a learning rate of 0.1:

The learning rate is adjusted every 10 batch ids, with a step of -0.0045.

Final test results:

In the last two batches the loss shows an increasing trend; I am not sure why.
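The stepped schedule described in this post can be sketched in plain Python (the `floor` clamp is my own addition to keep the rate positive; the function name is hypothetical):

```python
# Stepped decay: start from 0.1 and subtract 0.0045 every 10 batches.
def stepped_lr(batch_id, base_lr=0.1, step=0.0045, every=10, floor=1e-4):
    lr = base_lr - (batch_id // every) * step
    return max(lr, floor)   # clamp so the rate never goes negative

for b in [0, 50, 100, 200, 250]:
    print(b, round(stepped_lr(b), 4))
```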

#857 xlwan11 replied on 2020-01-10 20:18:13

Homework 8:

With ReLU as the activation function the loss converges. ReLU builds a nonlinear function out of piecewise-linear segments, which makes the network sparse, reduces interdependence among parameters, and can alleviate overfitting.

#856 wangyf童鞋 replied on 2020-01-10 19:49:30

Homework 8:

Using the sigmoid function:

Using the ReLU function:

For ReLU, the gradient is constantly 1 whenever x > 0, which benefits backpropagation and avoids the vanishing-gradient problem that occurs with sigmoid.
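A numeric sketch of that difference (assuming 10 stacked layers evaluated at x = 0, where the sigmoid derivative peaks at 0.25):

```python
import numpy as np

def sigmoid_grad(x):
    s = 1.0 / (1.0 + np.exp(-x))
    return s * (1.0 - s)

# Multiply the per-layer derivatives through 10 layers, as
# backpropagation does along a chain of activations.
grad = 1.0
for _ in range(10):
    grad *= sigmoid_grad(0.0)       # 0.25, the sigmoid derivative's maximum
print(f"after 10 sigmoid layers: {grad:.2e}")   # ~9.54e-07

# ReLU's derivative is exactly 1 for x > 0, so the signal is preserved.
print(f"after 10 ReLU layers (x > 0): {1.0 ** 10}")
```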

#855 儒雅的sagapo replied on 2020-01-10 17:08:31

Homework 6-1:

(1)

(2)

(3) Comparing these different loss functions, cross entropy reaches high accuracy quickly and converges fast.

(4) With the cross-entropy loss, the loss and accuracy on the training and test sets move in step and converge quickly.

(5) As the regularization weight increases, the final converged accuracy drops and the loss grows. A possible reason: with a coefficient of 0 there is essentially no overfitting (as the figure in (4) shows), and as the coefficient increases the model is pushed toward stronger generalization, so the converged loss rises and accuracy falls.
