Author: Sam (甄峰) sam_code@hotmail.com
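Paddle's 7-day object detection check-in camp includes an assignment, hands-on training of YOLO-series models, in which a PCB defect detector is trained.
The training process is recorded below: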
1. Training based on Yolov3_darknet_baseline.yml:
The settings are:
max_iters: 11000
base_lr: 0.00025
batch_size=8
worker_num=8
The last two are deliberately left unchanged to see what happens.
python -u tools/train.py \
    -c ../zienon_config/yolov3_darknet_baseline_sam.yml -o use_gpu=true \
    --eval --use_vdl=true --vdl_log_dir=vdl_dir/scalar3/
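Notes:
-c ../zienon_config/yolov3_darknet_baseline_sam.yml
Uses the baseline yml file.
--eval
Runs validation during training. The interval is set by snapshot_iter: 200 in the yml file, i.e. every 200 iterations the model is saved and validated.
--use_vdl=true --vdl_log_dir=vdl_dir/scalar3/
Enables the visualization tool and sets its log directory. With these set, VisualDL can be used to view how loss and mAP change.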
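As a quick check of what the snapshot interval implies, the arithmetic can be sketched in a few lines of Python. The values come from the settings above; the assumption that a snapshot is written at every exact multiple of snapshot_iter is mine and is not confirmed by the trainer's source.

```python
# Values taken from the settings described above.
max_iters = 11000
snapshot_iter = 200

# Assumption: a snapshot (and an evaluation, with --eval) happens at every
# multiple of snapshot_iter up to and including max_iters.
snapshot_points = list(range(snapshot_iter, max_iters + 1, snapshot_iter))

print(len(snapshot_points))    # number of save/eval rounds in one run
print(snapshot_points[:3])     # first few snapshot iterations
print(snapshot_points[-1])     # last snapshot iteration
```

Under that assumption a full run produces 55 save-and-evaluate rounds, so the mAP curve in VisualDL would have 55 points.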
The results are as follows:
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.110
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.422
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.019
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.213
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.121
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.006
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.055
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.237
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.239
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.237
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.258
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.006
2021-04-15 14:00:30,146-INFO: Best test box ap: 0.14457301539635092, in iter: 9000
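To compare runs without reading the summary by eye, the COCO-style lines can be parsed into a dictionary. The following is a minimal sketch, not part of PaddleDetection: the function name parse_coco_summary and the key layout are hypothetical, and the sample text is a subset of the log above.

```python
import re

# A few lines copied from the evaluation summary above.
log_text = """
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.110
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.422
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.055
"""

# One summary line: metric type, IoU range, object area, maxDets, value.
PATTERN = re.compile(
    r"Average\s+(?:Precision|Recall)\s+\((AP|AR)\)\s+@\[\s*IoU=([\d.:]+)\s*\|"
    r"\s*area=\s*(\w+)\s*\|\s*maxDets=\s*(\d+)\s*\]\s*=\s*(-?[\d.]+)"
)

def parse_coco_summary(text):
    """Return {(metric, iou, area, max_dets): value} for each summary line."""
    results = {}
    for match in PATTERN.finditer(text):
        metric, iou, area, max_dets, value = match.groups()
        results[(metric, iou, area, int(max_dets))] = float(value)
    return results

metrics = parse_coco_summary(log_text)
print(metrics[("AP", "0.50", "all", 100)])  # the mAP at IoU=0.5
```

Keying on the full (metric, IoU, area, maxDets) tuple keeps the twelve summary lines distinct, since several of them differ only in area or maxDets.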
VisualDL shows:
As can be seen, the mAP (IoU=0.5) value is 0.422.
Running the evaluation directly:
python -u tools/eval.py \
    -c ../zienon_config/yolov3_darknet_baseline_sam.yml -o use_gpu=true \
    weights=output/yolov3_darknet_baseline_sam/best_model.pdparams
gives a similar mAP.
2. Changing the learning rate:
base_lr: 0.001
max_iters: 11000
batch_size=8
worker_num=1
I am not sure what worker_num means on a single GPU. For now, the learning rate is increased to see the effect.
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.173
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.567
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.036
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.252
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.192
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.059
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.071
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.282
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.282
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.300
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.301
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.057
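What puzzles me here is this:
Judging from the mAP curve of the first run, with a learning rate of 0.00025 the mAP also hit a plateau and stopped improving as the number of iterations increased.
Yet in the second run, with lr raised to 0.001, the mAP did in fact improve.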
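To put numbers on that improvement, the two evaluation summaries can be compared directly. This is only a small sketch: the metric values are copied from the two result listings above, while the short key names (AP, AP50, AP75, AR100) and the helper compare are my own labels, not PaddleDetection output.

```python
# Metrics copied from the two evaluation summaries (area=all).
run1 = {"AP": 0.110, "AP50": 0.422, "AP75": 0.019, "AR100": 0.239}  # base_lr = 0.00025
run2 = {"AP": 0.173, "AP50": 0.567, "AP75": 0.036, "AR100": 0.282}  # base_lr = 0.001

def compare(before, after):
    """Return {metric: (absolute gain, relative gain in percent)}."""
    return {
        key: (
            round(after[key] - before[key], 3),
            round(100 * (after[key] - before[key]) / before[key], 1),
        )
        for key in before
    }

for name, (abs_gain, rel_gain) in compare(run1, run2).items():
    print(f"{name}: {abs_gain:+.3f} ({rel_gain:+.1f}%)")
```

For mAP at IoU=0.5 this gives a gain of 0.145, from 0.422 to 0.567. Note that worker_num also changed between the two runs (8 to 1), so the learning rate was not the only difference.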