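PaddleMobile Examples

Last updated: 2020-11-02

1 Example overview

1.1 classification example directory structure

1.2 detection example directory structure

1.3 yolov3 example directory structure

1.4 sample_video example directory structure
2 Connecting the device
3 Running the examples

3.1 Local image prediction

3.2 Camera video prediction
4 C++ interface

4.1 FPGA preprocessing interface

4.2 Prediction library interface
5 Python interface

5.1 Python interface directory structure

5.2 Installation and usage

5.3 Local image prediction example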

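Example overview

The PaddleMobile examples work only with softcore version 1.4.0 and earlier. For version 1.5.1 and later, use the corresponding PaddleLite example project.

To download the example project for a given version, see the softcore upgrade instructions.

EdgeBoard hardware products ship with a range of deep learning examples. Once familiar with the basic framework of the EdgeBoard softcore, users can work through the examples to learn the basic model deployment flow on EdgeBoard and run their own models in them, which speeds up deployment and project delivery. The example projects bundled with the EdgeBoard system are located at /home/root/workspace/sample

The system includes the following examples:

Example	Description
classification	Image classification example project; supports image prediction and USB camera video prediction
detection	Object detection example project; supports image prediction and USB camera video prediction
yolov3	yolov3 network example project; supports image prediction and USB camera video prediction
sample_video	Video prediction example project; runs model prediction on input from USB/MIPI/BT1120 cameras and on compressed video files (the EdgeBoard FZ5 supports H264/H265 video compression; other models do not). Limitation: only one model is supported, currently the mobilenetssd network.
script	Configuration script for the EdgeBoard internal video channel. The default is BT1120 video input; to use a MIPI camera, switch the video channel to MIPI input
ebvideo	Video source library that wraps video source tools commonly used in project development, such as gstreamer and SDKs, behind a unified interface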

classification example directory structure

├── CMakeLists.txt // CMake project configuration file
├── include
│   ├── io // paddle_mobile header directory
│   │   ├── paddle_inference_api.h
│   │   ├── type_define.h
│   │   └── types.h
│   ├── commom.h
│   ├── float16.h
│   ├── image_preprocess.h // image preprocessing header
│   ├── preprocess_conf.hpp
│   └── zynqmp_api.h
├── configs // configuration file directory
│   ├── Inceptionv2
│   │   └── zebra.json // Inceptionv2 configuration (10,000-class model, preset zebra recognition)
│   ├── Inceptionv3
│   │   └── zebra.json // Inceptionv3 configuration (1,000-class model, preset zebra recognition)
│   ├── mobilenetv1
│   │   └── zebra.json // mobilenetv1 configuration (1,000-class model, preset zebra recognition)
│   └── resnet50
│       └── drink.json // resnet50 configuration (3-class model, preset mineral water recognition)
├── lib
│   ├── ln.sh // script that creates the prediction library symlinks
│   └── libpaddle-mobile.so.1.5.0 // paddlemobile prediction library
├── models // model file directory
│   ├── Inceptionv2
│   ├── Inceptionv3
│   ├── mobilenetv1
│   └── resnet50
├── src
│   ├── json.hpp // JSON parsing library
│   ├── video_detection.cpp // video inference example (CPU preprocessing)
│   ├── video_detection_fpga_preprocess.cpp // video inference example (FPGA preprocessing)
│   ├── image_detection.cpp // image inference example (CPU preprocessing)
│   └── image_detection_fpga_preprocess.cpp // image inference example (FPGA preprocessing)
└── README.md

detection example directory structure

├── CMakeLists.txt // CMake project configuration file
├── include
│   ├── io // paddle_mobile header directory
│   │   ├── paddle_inference_api.h
│   │   ├── type_define.h
│   │   └── types.h
│   ├── commom.h
│   ├── float16.h
│   ├── image_preprocess.h // image preprocessing header
│   ├── preprocess_conf.hpp
│   └── zynqmp_api.h
├── configs // configuration file directory
│   ├── mobilenet-ssd
│   │   └── screw.json // mobilenetssd_300 configuration (screw and nut detection)
│   ├── mobilenet-ssd-640
│   │   └── screw.json // mobilenetssd_640 configuration (screw and nut detection)
│   └── vgg-ssd
│       └── screw.json // vggssd_300 configuration (screw and nut detection)
├── lib
│   ├── ln.sh // script that creates the prediction library symlinks
│   └── libpaddle-mobile.so.1.5.0 // paddlemobile prediction library
├── models // model file directory
│   ├── mobilenet-ssd
│   ├── mobilenet-ssd-640
│   └── vgg-ssd
├── src
│   ├── json.hpp // JSON parsing library
│   ├── video_detection.cpp // video inference example (CPU preprocessing)
│   ├── video_detection_fpga_preprocess.cpp // video inference example (FPGA preprocessing)
│   ├── image_detection.cpp // image inference example (CPU preprocessing)
│   └── image_detection_fpga_preprocess.cpp // image inference example (FPGA preprocessing)
└── README.md

yolov3 example directory structure

├── CMakeLists.txt // CMake project configuration file
├── include
│   ├── io // paddle_mobile header directory
│   │   ├── paddle_inference_api.h
│   │   ├── type_define.h
│   │   └── types.h
│   ├── commom.h
│   ├── float16.h
│   ├── image_preprocess.h // image preprocessing header
│   ├── preprocess_conf.hpp
│   └── zynqmp_api.h
├── configs // configuration file directory
│   ├── config.json // yolov3_608 configuration (vehicle detection)
│   └── config_easydl_416.json // yolov3_416 model configuration (vehicle detection)
├── lib
│   ├── ln.sh // script that creates the prediction library symlinks
│   └── libpaddle-mobile.so.1.5.0 // paddlemobile prediction library
├── models // model file directory
│   ├── easydl_yolov3
│   └── easydl_yolov3_416
├── src
│   ├── json.hpp // JSON parsing library
│   ├── video_detection.cpp // video inference example (CPU preprocessing)
│   ├── video_detection_fpga_preprocess.cpp // video inference example (FPGA preprocessing)
│   ├── image_detection.cpp // image inference example (CPU preprocessing)
│   └── image_detection_fpga_preprocess.cpp // image inference example (FPGA preprocessing)
└── README.md

sample_video example directory structure

├── CMakeLists.txt // CMake project configuration file
├── include // header directory
├── image // prediction image directory
├── lib
│   ├── ln.sh // script that creates the prediction library symlinks
│   └── libpaddle-mobile.so.1.5.0 // paddlemobile prediction library
├── models // model file directory (vggssd model preset)
├── src
│   ├── capturer.cpp
│   ├── dev_hw.hpp
│   ├── main.cpp
│   ├── paddle_interface.cpp
│   ├── protocol.cpp
│   └── ssd_detect.cpp
└── README.md

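Connecting the device

Before connecting, get familiar with the device's peripheral interfaces; see the hardware introduction for your device model.

Steps to connect the device:

1. Make sure the bundled system TF card is inserted into the device's SD card slot.
2. Power the EdgeBoard with the bundled power supply to boot it. For details, see "EdgeBoard startup, shutdown and restart".
3. Plug in a network cable or the bundled serial debug cable. EdgeBoard supports two debugging methods, network and serial. The factory default static IP is 192.168.1.254, with login root and password root. An SSH connection over the network is recommended because it is faster and more convenient. For details, see "Common EdgeBoard connection methods".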

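Running the examples

For simplicity and generality, the EdgeBoard examples are run in a largely uniform way: each model has a configuration file (JSON) from which the model and image information is read, loaded, and executed. Each example project has its own configuration directory, so the relevant configuration file must be specified when running an example.

Example configuration file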

{
	"model":"../models/vgg-ssd",
	"combined_model":true,
	"input_width":224,
	"input_height":224,
	"image":"../models/vgg-ssd/screw.jpg",
	"mean":[104,117,124],
	"scale":1,
	"format":"BGR",
	"threshold":0.5
}

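Configuration file parameters

key	value
model	Location of the model directory
combined_model	Whether the model is a combined model; a model consisting of only two files is a combined model
input_width	Width of the image fed to the network; the input image is scaled to this size
input_height	Height of the image fed to the network
image	Input image to classify
mean	Mean value
scale	Preprocessing applied before the network: ( x - mean ) * scale
format	Format the network expects; OpenCV defaults to BGR
threshold	Confidence threshold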
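
As a minimal sketch of how these fields fit together, the Python snippet below parses a configuration of this shape with the standard json module, checks that the documented keys are present, and applies the ( x - mean ) * scale formula to one pixel. The helper names load_config and normalize_pixel are illustrative assumptions and are not part of the example projects.

```python
import json

# Keys documented in the parameter table ("threshold" is only needed by detection models).
REQUIRED_KEYS = ("model", "combined_model", "input_width", "input_height",
                 "image", "mean", "scale", "format")


def load_config(text):
    """Parse a JSON configuration string and check the documented keys (hypothetical helper)."""
    config = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError("missing configuration keys: " + ", ".join(missing))
    return config


def normalize_pixel(pixel, mean, scale):
    """Apply (x - mean) * scale to each channel of one pixel."""
    return [(x - m) * scale for x, m in zip(pixel, mean)]


EXAMPLE = """
{
    "model": "../models/vgg-ssd",
    "combined_model": true,
    "input_width": 224,
    "input_height": 224,
    "image": "../models/vgg-ssd/screw.jpg",
    "mean": [104, 117, 124],
    "scale": 1,
    "format": "BGR",
    "threshold": 0.5
}
"""

config = load_config(EXAMPLE)
print(config["input_width"], config["format"])
print(normalize_pixel([110, 120, 130], config["mean"], config["scale"]))
```

A missing comma or key in the JSON file surfaces here as an exception at load time, before any model is loaded.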

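Local image prediction

The steps for image prediction are:

1. Connect to the device

2. Load the driver. This is needed only once after the system boots (the system loads it automatically by default, so this step can be skipped)

3. Add the prediction library to the example project's lib directory (skip if the system already has it)

4. Build the example

5. Run the image prediction executable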

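classification model local image prediction

To run local image prediction with a classification model:

1. Connect to the device

2. Load the driver. This is needed only once after the system boots (the system loads it automatically by default, so this step can be skipped)

insmod /home/root/workspace/driver/fpgadrv.ko

3. Add the prediction library to the example project's lib directory (skip if the system already has it)

For example, use samba or ftp to copy libpaddlemobile.so.1.5.0 and ln.sh into the lib directory of the classification example project, /home/root/workspace/sample/classification/lib, then run the script in the lib directory:

sh ln.sh

4. Enter the classification example project and build it

# Enter the classification project directory
cd /home/root/workspace/sample/classification/
# Create the build directory if it does not exist
mkdir -p build
# Enter the build directory
cd build
# Run cmake to generate the Makefile
cmake ..
# Build the project
make

After the build completes, two executables, image_classify and video_classify, appear in the build directory. Use image_classify for image prediction.

Note: the samples bundled with the system are prebuilt, so the executables in the build directory can be used directly.

5. Run the example

cd /home/root/workspace/sample/classification/build

# image_classify uses CPU preprocessing; image_classify_fpga_preprocess uses FPGA preprocessing
./image_classify {path to JSON file}

For example:

a. Image prediction with the resnet50 model using CPU preprocessing

./image_classify ../configs/resnet50/drink.json

Execution result: [screenshot]

b. Image prediction with the resnet50 model using FPGA preprocessing

./image_classify_fpga_preprocess ../configs/resnet50/drink.json

Execution result: [screenshot]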

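detection model local image prediction

To run local image prediction with a detection model:

1. Connect to the device

2. Load the driver. This is needed only once after the system boots (the system loads it automatically by default, so this step can be skipped)

insmod /home/root/workspace/driver/fpgadrv.ko

3. Add the prediction library to the example project's lib directory (skip if the system already has it)

For example, use samba or ftp to copy libpaddlemobile.so.1.5.0 and ln.sh into the lib directory of the detection example project, /home/root/workspace/sample/detection/lib, then run the script in the lib directory:

sh ln.sh

4. Enter the example project and build it

# Enter the detection project directory
cd /home/root/workspace/sample/detection
# Create the build directory if it does not exist
mkdir -p build
# Enter the build directory
cd build
# Run cmake to generate the Makefile
cmake ..
# Build the project
make

5. Run the example

cd /home/root/workspace/sample/detection/build

# image_detect uses CPU preprocessing; image_detect_fpga_preprocess uses FPGA preprocessing
./image_detect {path to JSON file}

For example:

a. Image prediction with the vgg-ssd model using CPU preprocessing

./image_detect ../configs/vgg-ssd/screw.json

Execution result: [screenshot]

b. Image prediction with the vgg-ssd model using FPGA preprocessing

./image_detect_fpga_preprocess ../configs/vgg-ssd/screw.json

Execution result: [screenshot]

Result image: [image]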

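yolov3 model local image prediction

To run local image prediction with the yolov3 model:

1. Connect to the device

2. Load the driver. This is needed only once after the system boots (the system loads it automatically by default, so this step can be skipped)

insmod /home/root/workspace/driver/fpgadrv.ko

3. Add the prediction library to the example project's lib directory (skip if the system already has it)

For example, use samba or ftp to copy libpaddlemobile.so.1.5.0 and ln.sh into the lib directory of the yolov3 example project, /home/root/workspace/sample/yolov3/lib, then run the script in the lib directory:

sh ln.sh

4. Enter the example project and build it

# Enter the yolov3 project directory
cd /home/root/workspace/sample/yolov3
# Create the build directory if it does not exist
mkdir -p build
# Enter the build directory
cd build
# Run cmake to generate the Makefile
cmake ..
# Build the project
make

5. Run the example

cd /home/root/workspace/sample/yolov3/build

# image_detect uses CPU preprocessing; image_detect_fpga_preprocess uses FPGA preprocessing
./image_detect {path to JSON file}

For example, image prediction with the yolov3 model using CPU preprocessing:

./image_detect ../configs/config.json

Execution result: [screenshot]

Result image: [image]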

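Camera video prediction

Using a USB camera

To run prediction with a USB camera:

1. Connect the USB camera and a monitor to the EdgeBoard. For notes on USB cameras and monitors, see "EdgeBoard video input and output"; for related system tools, see "tools".

2. Connect to the device

3. After the system boots, open a system command window and start the desktop on the monitor

# Start the desktop on the monitor
startx

4. Load the driver. This is needed only once after the system boots (the system loads it automatically by default, so this step can be skipped)

insmod /home/root/workspace/driver/fpgadrv.ko

5. Add the prediction library to the example project's lib directory (skip if the system already has it)

For example, use samba or ftp to copy libpaddlemobile.so.1.5.0 and ln.sh into the example project's lib directory (for the classification project, /home/root/workspace/sample/classification/lib), then run the script in the lib directory:

sh ln.sh

6. Build the example (skip if everything was already built for image prediction)

7. Run the example

Video prediction with an image classification model:

cd /home/root/workspace/sample/classification/build

# video_classify uses CPU preprocessing; video_classify_fpga_preprocess uses FPGA preprocessing
./video_classify {path to JSON file}

For example, video prediction with the resnet50 model:

./video_classify ../configs/resnet50/drink.json

Video prediction with an object detection model:

cd /home/root/workspace/sample/detection/build

# video_detection uses CPU preprocessing; video_detection_fpga_preprocess uses FPGA preprocessing
./video_detection {path to JSON file}

For example, video prediction with the vgg-ssd model:

./video_detection ../configs/vgg-ssd/screw.json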

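Using a BT1120 camera

* BT1120 is supported only by EdgeBoard softcore 1.4.0 and earlier; 1.5.0 does not support it

To run prediction with a BT1120 camera:

1. Connect the BT1120 camera and a monitor to the EdgeBoard. For notes on BT1120 cameras and monitors, see "EdgeBoard video input and output"; for related system tools, see "tools".

2. Connect to the device

3. After the system boots, open a system command window and start the desktop on the monitor

# Start the desktop on the monitor
startx

4. Load the driver. This is needed only once after the system boots (the system loads it automatically by default, so this step can be skipped)

insmod /home/root/workspace/driver/fpgadrv.ko

5. Make sure the vdma video device is configured correctly. Some system image versions need it to be initialized with the following command

media-ctl -v --set-format '"a0010000.v_tpg":0 [RBG24 1920x1080 field:none]'

6. Add the prediction library and the video source library to the example project's lib directory

a. Use samba or ftp to copy libpaddlemobile.so.1.4.0 and ln.sh into the example project's lib directory (/home/root/workspace/sample_video/lib), then run the script in the lib directory:

sh ln.sh

b. Build the video source library ebvideo and put the generated libebvideo.so into the lib directory of sample_video

mkdir build
cd build
cmake ..
make

7. Build the example

# Create a build directory inside the project directory
mkdir build
# Enter the build directory
cd build
# Generate the Makefile and build
cmake ..
make

8. Run the example

Capture from the BT1120 camera, run prediction, and show the result on the desktop display

./paddle_edgeboard -c 0 -i /dev/video1 -s

For more options, see the help output

./paddle_edgeboard -h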

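Using a MIPI camera

To run prediction with a MIPI camera:

1. Connect the MIPI camera and a monitor to the EdgeBoard. For notes on MIPI cameras and monitors, see "EdgeBoard video input and output"; for related system tools, see "tools".

2. Connect to the device

3. After the system boots, open a system command window and start the desktop on the monitor

# Start the desktop on the monitor
startx

4. Load the driver. This is needed only once after the system boots (the system loads it automatically by default, so this step can be skipped)

insmod /home/root/workspace/driver/fpgadrv.ko

5. Switch the video input

The default video input for video1 on EdgeBoard is BT1120, so the input must be switched before using MIPI. Copy the script directory to the EdgeBoard and run video_config.sh

# Switch to MIPI video input
./video_config.sh 2

Script output: [screenshot]

6. Make sure the vdma video device is configured correctly. Some system image versions need it to be initialized with the following command

media-ctl -v --set-format '"a0010000.v_tpg":0 [RBG24 1920x1080 field:none]'

7. Add the prediction library and the video source library to the example project's lib directory

a. Use samba or ftp to copy libpaddlemobile.so.1.4.0 and ln.sh into the example project's lib directory (/home/root/workspace/sample_video/lib), then run the script in the lib directory:

sh ln.sh

b. Build the video source library ebvideo and put the generated libebvideo.so into the lib directory of sample_video

mkdir build
cd build
cmake ..
make

8. Build the example

# Create a build directory inside the project directory
mkdir build
# Enter the build directory
cd build
# Generate the Makefile and build
cmake ..
make

9. Run the example

Capture from the MIPI camera, run prediction, and show the result on the desktop display

./paddle_edgeboard -c 0 -i /dev/video1 -s

For more options, see the help output

./paddle_edgeboard -h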

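C++ interface

FPGA preprocessing interface

The preprocessing interface uses the FPGA to perform image scaling, color space conversion, and the mean/std operation. The header is include/image_preprocess.h
Image scaling: supports images up to 1080p; larger images must first be scaled on the CPU with OpenCV
Color space conversion: conversion among YUV422, RGB, and BGR
mean/std operation: runs on the FPGA, 2 to 5 times faster than on the CPU; the variant mean'/scale operation is also supported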
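
The relationship between the two normalization forms can be sketched as follows. This is an illustration under a stated assumption, not the library's documented formula: it assumes the variant form multiplies by scale = 1 / std with the same mean, which matches the ( x - mean ) * scale expression used in the configuration files. The function names are hypothetical.

```python
def normalize_std(x, mean, std):
    """Standard form: subtract the mean, then divide by std."""
    return (x - mean) / std


def normalize_scale(x, mean, scale):
    """Variant form: subtract the mean, then multiply by scale."""
    return (x - mean) * scale


def std_to_scale(std):
    """Assumed conversion between the two forms: scale = 1 / std."""
    return 1.0 / std


pixel, mean, std = 200.0, 104.0, 4.0
a = normalize_std(pixel, mean, std)
b = normalize_scale(pixel, mean, std_to_scale(std))
print(a, b)
```

Under this assumption a model trained with mean/std normalization can be described in a configuration file by setting scale to the reciprocal of std.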

Interfaces

    /**
     * Check whether the input image is 16-aligned in WC (width * channel)
     * width   input image width
     * channel number of input image channels
     **/
    bool img_is_align(int width, int channel);

    /**
     * Size after alignment
     * width   input image width
     * channel number of input image channels
     **/
    int align_size(int width, int channel);

    /**
     * Allocate memory for the image; the destructor frees it automatically.
     * (Currently supported: BGR->RGB, RGB->BGR, YUV422->BGR, YUV->RGB.) Maximum image resolution: 1080p
     * img_height input image height
     * img_width  input image width
     * in_format  input image format, see image_format
     * return uint8_t*, laid out as an OpenCV Mat of type CV_8UC3
     **/
    uint8_t* mem_alloc(int img_height, int img_width, image_format in_format);

    /**
     * Manually free the input image memory if it was allocated with new or fpga_malloc
     * data memory to free, for example the allocated image data
     **/
    void freeImage(void* data);

    /**
     * Scale the image according to the setImage settings, then subtract mean and
     * multiply by scale. Maximum image resolution: 1080p
     * input_width  output image width (network input width)
     * input_height output image height (network input height)
     * out_format   output image format, see image_format
     * means        mean subtracted during preprocessing
     * scales       scale multiplied during preprocessing
     * tensor       network input PaddleTensor
     **/
    void image2Tensor(int input_width, int input_height,
        image_format out_format, float means[], float scales[], PaddleTensor* tensor);

    /**
     * Scale the image, subtract mean and divide by std, and align WC (width and channel)
     * to 16. Only 3-channel image input is supported.
     * (Currently supported: BGR->RGB, RGB->BGR, YUV422->BGR, YUV->RGB.) Maximum image resolution: 1080p
     * input_width  output image width (network input width)
     * input_height output image height (network input height)
     * out_format   output image format, see image_format
     * means        mean subtracted during preprocessing
     * stds         std divided during preprocessing
     * tensor       network input PaddleTensor
     **/
    void image2TensorStd(int input_width, int input_height, image_format out_format,
        float means[], float stds[], PaddleTensor* tensor);

    /**
     * Scale the image, subtract mean and multiply by scale, and align WC (width and channel) to 16.
     * (Currently supported: BGR->RGB, RGB->BGR, YUV422->BGR, YUV->RGB.) Maximum image resolution: 1080p
     * in           input data, uint8_t, OpenCV Mat CV_8UC3
     * in_width     input image width
     * in_height    input image height
     * out          output image data, float16
     * input_width  output image width (network input width)
     * input_height output image height (network input height)
     * in_format    input image format, see image_format
     * out_format   output image format, see image_format
     * means        mean subtracted during preprocessing
     * scales       scale multiplied during preprocessing
     **/
    void resize_preprocess_align(uint8_t* in, int in_width, int in_height, float16* out, int input_width, int input_height,
        image_format in_format, image_format out_format, float means[], float scales[]);

    /**
     * Scale the image, subtract mean and divide by std, and align WC (width and channel)
     * to 16. Only 3-channel image input is supported.
     * in         input data, uint8_t, OpenCV Mat CV_8UC3
     * in_width   input image width
     * in_height  input image height
     * out        output image data, float16
     * out_width  output image width
     * out_height output image height
     * in_format  input image format, see image_format
     * out_format output image format, see image_format
     * means      mean subtracted during preprocessing
     * stds       std divided during preprocessing
     **/
    void resize_preprocess_std_align(uint8_t* in, int in_width, int in_height, float16* out, int out_width, int out_height,
        image_format in_format, image_format out_format, float means[], float stds[]);

    /**
     * Convert the image color space. Currently supported: BGR->RGB, RGB->BGR,
     * YUV422->BGR, YUV->RGB. Maximum image resolution: 1080p
     * in         input data, uint8_t, OpenCV Mat CV_8UC3
     * in_width   input image width
     * in_height  input image height
     * out        output image data, uint8_t
     * in_format  input image format, see image_format
     * out_format output image format, see image_format
     **/
    void color_transfer(uint8_t* in, int in_width, int in_height, uint8_t* out, image_format in_format, image_format out_format);

    /**
     * Scale the image and convert its color space. (Currently supported: BGR->RGB,
     * RGB->BGR, YUV422->BGR, YUV->RGB.) Maximum image resolution: 1080p
     * in         input data, uint8_t, OpenCV Mat CV_8UC3
     * in_width   input image width
     * in_height  input image height
     * out        output image data, float
     * out_width  output image width
     * out_height output image height
     * in_format  input image format, see image_format
     * out_format output image format, see image_format
     **/
    void resize(uint8_t* in, int in_width, int in_height, float* out, int out_width, int out_height,
        image_format in_format, image_format out_format);

FPGA preprocessing steps

1. Initialize the input and output image formats for FPGA preprocessing

	static paddle_mobile::zynqmp::image_format in_format = paddle_mobile::zynqmp::BGR;
	static paddle_mobile::zynqmp::image_format out_format = paddle_mobile::zynqmp::BGR;
	
	std::string format = value["format"];
    std::transform(format.begin(), format.end(),format.begin(), ::toupper);
    if (format == "RGB") {
      out_format = paddle_mobile::zynqmp::RGB;
    }

2. Build mean/std or mean'/scale; this example uses mean'/scale

    std::vector<float> mean = value["mean"];
    float scale = value["scale"];

    means[0] = mean[0];
    means[1] = mean[1];
    means[2] = mean[2];
    scales[0] = scale;
    scales[1] = scale;
    scales[2] = scale;

3. Build the preprocessing object and pass in the output image format, mean/std or mean'/scale, and the network input object PaddleTensor. Because FPGA preprocessing requires the product of W and C (width and channel) of the input data to be a multiple of 16, first compute the WC-aligned size, then compute mod, the number of padding elements to skip at the end of each row. After image2Tensor returns, the preprocessed data is stored in the PaddleTensor in HWC layout as float16.

    int width = mat.cols;
    int height = mat.rows;
    int channel = 3;
    image_shape[0] = height;
    image_shape[1] = width;
    // If the input image is larger than 1080p, first scale it on the CPU to the
    // network input size, then run FPGA preprocessing.
    if (width * height > 1920 * 1080) {
        Mat resize_mat;
        // dsize is given explicitly, so the fx/fy scale factors are not needed
        resize(mat, resize_mat, Size(input_width, input_height));
        mat = resize_mat;
        height = mat.rows;
        width = mat.cols;
        std::cout << "read_image resize" << std::endl;
    }

    paddle_mobile::zynqmp::ImagePreprocess imagePreprocess;

    // Row size after aligning width * channel to 16
    int align_size = imagePreprocess.align_size(width, channel);
    uchar* img = (uchar *) imagePreprocess.mem_alloc(height, width, in_format);

    // Padding elements at the end of each row
    int mod = align_size - width * channel;
    int count = width * channel;
    for (int i = 0; i < height; i++) {
        uchar* data = mat.ptr<uchar>(i);
        uchar* dst = img + i * (width * channel + mod);
        memcpy(dst, data, count * sizeof(uchar));
    }
    // Scale, convert the color space, and apply mean/scale in one step
    imagePreprocess.image2Tensor(input_width, input_height, out_format, means, scales, tensor);
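
The row alignment in step 3 can be sketched in plain Python. This is an illustration only: it assumes align_size rounds width * channel up to the next multiple of 16, which is consistent with how the C++ code above uses the value as a row stride but is not confirmed by the header. The pad_rows helper is hypothetical.

```python
ALIGNMENT = 16


def align_size(width, channel, alignment=ALIGNMENT):
    """Round width * channel up to the next multiple of `alignment` (assumed behaviour)."""
    row_bytes = width * channel
    return (row_bytes + alignment - 1) // alignment * alignment


def pad_rows(rows, width, channel):
    """Copy each row into a buffer whose row stride is the aligned size."""
    stride = align_size(width, channel)
    count = width * channel
    buffer = bytearray(stride * len(rows))
    for i, row in enumerate(rows):
        buffer[i * stride:i * stride + count] = bytes(row[:count])
    return buffer, stride


# A 5-pixel-wide, 2-row, 3-channel image: 15 bytes per row, padded to 16.
rows = [list(range(15)), list(range(100, 115))]
buffer, stride = pad_rows(rows, width=5, channel=3)
print(stride, len(buffer), stride - 5 * 3)
```

A 224-pixel-wide, 3-channel row is 672 bytes, already a multiple of 16, so no padding is needed; a 300-pixel-wide row (900 bytes) would be padded to 912.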

CPU preprocessing steps

FPGA preprocessing is new in 1.5.0; earlier versions use CPU preprocessing

	auto image = value["image"];
    Mat img = imread(image);
    image_shape[0] = img.rows;
    image_shape[1] = img.cols;

    std::string format = value["format"];
    std::transform(format.begin(), format.end(),format.begin(), ::toupper);

    input_width = value["input_width"];
    input_height = value["input_height"];
    std::vector<float> mean = value["mean"];
    float scale = value["scale"];

    auto start_time = time();
    Mat img2;
    // dsize is given explicitly, so the fx/fy scale factors are not needed
    resize(img, img2, Size(input_width, input_height));
    
    Mat sample_float;
    img2.convertTo(sample_float, CV_32FC3);

    int index = 0;
    for (int row = 0; row < sample_float.rows; ++row) {
        float* ptr = (float*)sample_float.ptr(row);
        for (int col = 0; col < sample_float.cols; col++) {
            float* uc_pixel = ptr;
            float b = uc_pixel[0];
            float g = uc_pixel[1];
            float r = uc_pixel[2];

            if (format == "RGB") {
                data[index] = (r - mean[0]) * scale ;
                data[index + 1] = (g - mean[1]) * scale ;
                data[index + 2] = (b - mean[2]) * scale ;
            } else {
                data[index] = (b - mean[0]) * scale ;
                data[index + 1] = (g - mean[1]) * scale ;
                data[index + 2] = (r - mean[2]) * scale ;
            }
            ptr += 3;
            index += 3;
        }
    }
}
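
The per-pixel loop above can be mirrored in a few lines of Python to check expected values by hand. This is a sketch on nested lists rather than an OpenCV Mat; the function name preprocess is an illustrative assumption. As in the C++ code, mean[0] is applied to whichever channel comes first after the optional BGR to RGB swap.

```python
def preprocess(pixels_bgr, mean, scale, fmt="BGR"):
    """Flatten HWC pixels to a list, applying (x - mean) * scale per channel.

    pixels_bgr is a list of rows, each row a list of (b, g, r) tuples,
    matching the channel order OpenCV uses when it reads an image.
    """
    data = []
    for row in pixels_bgr:
        for b, g, r in row:
            channels = (r, g, b) if fmt.upper() == "RGB" else (b, g, r)
            data.extend((c - m) * scale for c, m in zip(channels, mean))
    return data


image = [[(10, 20, 30), (40, 50, 60)]]  # one row, two pixels, BGR order
print(preprocess(image, mean=[1, 2, 3], scale=2, fmt="BGR"))
print(preprocess(image, mean=[1, 2, 3], scale=2, fmt="RGB"))
```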

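Prediction library interface

The prediction library interface handles model initialization, input construction, prediction, and result retrieval. The headers are in include/io/

Interfaces

PaddleMobileConfig is used for initialization
CreatePaddlePredictor builds the prediction interface PaddlePredictor
PaddlePredictor runs the prediction
PaddleTensor holds the input and output parameters

Using the prediction library

1. Initialize the model and build the predictor object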

	std::unique_ptr<paddle_mobile::PaddlePredictor> g_predictor;
    PaddleMobileConfig config;
    std::string model_dir = j["model"];
    config.precision = PaddleMobileConfig::FP32;
    config.device = PaddleMobileConfig::kFPGA;
    config.prog_file = model_dir + "/model";
    config.param_file = model_dir + "/params";
    config.thread_num = 4;
    g_predictor = CreatePaddlePredictor<PaddleMobileConfig,
                    PaddleEngineKind::kPaddleMobile>(config);

2. Set up the input and output parameters

    std::vector<PaddleTensor> paddle_tensor_feeds;
    PaddleTensor tensor;
    tensor.shape = std::vector<int>({1, 3, input_height, input_width});
    tensor.data = PaddleBuf(input, sizeof(input));
    tensor.dtype = PaddleDType::FLOAT32;
    paddle_tensor_feeds.push_back(tensor);

    PaddleTensor tensor_imageshape;
    tensor_imageshape.shape = std::vector<int>({1, 2});
    tensor_imageshape.data = PaddleBuf(image_shape, 1 * 2 * sizeof(float));
    tensor_imageshape.dtype = PaddleDType::FLOAT32;
    paddle_tensor_feeds.push_back(tensor_imageshape);

    PaddleTensor tensor_out;
    tensor_out.shape = std::vector<int>({});
    tensor_out.data = PaddleBuf();
    tensor_out.dtype = PaddleDType::FLOAT32;
    std::vector<PaddleTensor> outputs(1, tensor_out);

3. Run the prediction

	g_predictor->Run(paddle_tensor_feeds, &outputs);

4. Get the results

	float *data = static_cast<float *>(outputs[0].data.data());
    int size = outputs[0].shape[0];
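
For a classification model, the output buffer is typically reduced to the index of the highest score. The sketch below shows that reduction on a plain list, in the same spirit as the classify function in the Python example later in this document. The function name top1 is hypothetical, and skipping non-finite values is a choice made for this sketch.

```python
import math


def top1(scores):
    """Return (index, score) of the largest finite score, or (None, None) if there is none."""
    best_index, best_score = None, None
    for i, s in enumerate(scores):
        if not math.isfinite(s):
            continue  # skip inf and nan values
        if best_score is None or s > best_score:
            best_index, best_score = i, s
    return best_index, best_score


print(top1([0.1, 0.7, float("inf"), 0.2]))
```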

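Python interface

Python interface directory structure

The EdgeBoard system ships with a Python environment that can be used directly. The Python interface also provides a paddlemobile Python installation package and an example project (available from the download link), which contains the following files:

File	Description
paddlemobile-0.0.1.linux-aarch64-py2.tar.gz	paddlemobile installation package for Python 2
edgeboard.py	Python-based model prediction example
api.py	API example for edgeboard.py
configs.classification	Configuration directory for classification models; same as the C++ example configuration files
configs.detection	Configuration directory for detection models; same as the C++ example configuration files
models.classification	Model directory for classification models; same as the C++ example model files
models.detection	Model directory for detection models; same as the C++ example model files

The code of edgeboard.py is as follows: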

#!/usr/bin/python
# -*- coding: UTF-8 -*- 

import errno
import math
import os
import sys
import json

import cv2
import numpy as np

import paddlemobile as pm

predictor = None
labels = []

def init(configs):
    global predictor

    model_dir = configs['model']
    pm_config = pm.PaddleMobileConfig()   # create the paddlemobile configuration object
    pm_config.precision = pm.PaddleMobileConfig.Precision.FP32  # set the precision to FP32
    pm_config.device = pm.PaddleMobileConfig.Device.kFPGA  # select the FPGA device
    pm_config.prog_file = model_dir + "/model"
    pm_config.param_file = model_dir + '/params'
    pm_config.thread_num = 4

    print('')
    print('configuration for predictor is :')
    print('\tPrecision: ' + str(pm_config.precision))
    print('\t   Device: ' + str(pm_config.device))
    print('\t    Model: ' + str(pm_config.prog_file))
    print('\t   Params: ' + str(pm_config.param_file))
    print('\tThreadNum: ' + str(pm_config.thread_num))
    print('')

    predictor = pm.CreatePaddlePredictor(pm_config)  # create the predictor

# Read the label_list.txt file
def read_labels(configs):
    global labels
    if not 'labels' in configs:
        return
    label_path = configs['labels']
    if label_path is None or label_path == '':
        return
    with open(label_path) as label_file:
        line = label_file.readline()
        while line:
            labels.append(line.strip().split(':')[-1])
            line = label_file.readline()

# Read a local image
def read_image(configs):
    image_input = cv2.imread(configs['image'], cv2.IMREAD_COLOR)
    return image_input

# Image preprocessing
def preprocess_image(image_input, configs):
    # resizing image
    print('image shape input: ' + str(image_input.shape))
    width = configs['input_width']
    height = configs['input_height']
    # interpolation must be passed by keyword; the third positional argument of cv2.resize is dst
    image_resized = cv2.resize(image_input, (width, height), interpolation=cv2.INTER_CUBIC)
    print('image shape resized: ' + str(image_resized.shape))

    # to float32
    image = image_resized.astype('float32')

    # transpose to channel-first format
    image_transposed = np.transpose(image, (2, 0, 1))
    print('image shape transposed: ' + str(image_transposed.shape))

    # mean and scale preprocessing
    mean  = np.array(configs['mean']).reshape((3, 1, 1))
    scale_number = configs['scale']
    scale = np.array([scale_number, scale_number, scale_number]).reshape((3, 1, 1))

    # RGB or BGR formatting
    format = configs['format'].upper()
    if format == 'RGB':
        b = image_transposed[0]
        g = image_transposed[1]
        r = image_transposed[2]
        image_transposed = np.stack([r, g, b])
        print('image shape formatted transposed: ' + str(image_transposed.shape))

    # mean and scale
    print 'subtract mean', mean.flatten(), ' and multiply by scale', scale.flatten()
    image_transposed -= mean
    image_transposed *= scale

    # transpose back
    image_result = np.transpose(image_transposed, (1, 2, 0))
    print('image shape transposed-back: ' + str(image_result.shape))

    print('')
    return image_result

# Draw detection results and write the result image
def draw_results(image, output, threshold):
    height, width, _ = image.shape
    print 'boxes with scores above the threshold (%f): \n' % threshold
    i = 1
    for box in output:
        if box[1] > threshold:
            print '\t', i, '\t', box[0], '\t', box[1], '\t', box[2], '\t', box[3], '\t', box[4], '\t', box[5]
            x_min = int(box[2]  * width )
            y_min = int(box[3]  * height)
            x_max = int(box[4]  * width )
            y_max = int(box[5]  * height)
            cv2.rectangle(image, (x_min, y_min), (x_max, y_max), (0, 255, 0), 3)
            i += 1
    cv2.imwrite('result.jpg', image)
    print('')

# Print the classification prediction result
def classify(output, configs):
    data = output.flatten()
    max_index = 0
    score = 0.0
    for i in range(len(data)):
        if data[i] > score and not data[i] == float('inf'):
            max_index = i
            score = data[i]

    print 'label: ', labels[max_index]
    print 'index: ', max_index
    print 'score: ', score
    print ''

# Output the detection prediction result
def detect(output, configs):
    image = read_image(configs)
    draw_results(image, output, configs['threshold'])

# Run model prediction
def predict(configs, detection):
    global predictor
    width = configs['input_width']
    height = configs['input_height']

    image = read_image(configs)
    input = preprocess_image(image, configs)

    tensor = pm.PaddleTensor()
    tensor.dtype = pm.PaddleDType.FLOAT32
    tensor.shape = (1, 3, height, width)  # same order as the C++ example: height before width
    tensor.data = pm.PaddleBuf(input)

    paddle_data_feeds = [tensor]

    print('prediction is running ...')
    outputs = predictor.Run(paddle_data_feeds)
    assert len(outputs) == 1, 'error numbers of tensor returned from Predictor.Run function !!!'

    output = np.array(outputs[0], copy = False)

    print('\nprediction result :')
    print('\t nDim: ' + str(output.ndim))
    print('\tShape: ' + str(output.shape))
    print('\tDType: ' + str(output.dtype))
    print('')
    # print(output)
    # print('')
    if detection:
        detect(output, configs)
    else:
        classify(output, configs)
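
The box handling in draw_results can be isolated from OpenCV for testing. The sketch below assumes the row layout that draw_results reads: index 0 is the label, index 1 the score, and indices 2 to 5 the normalized x_min, y_min, x_max, y_max. The function name boxes_above_threshold is hypothetical.

```python
def boxes_above_threshold(output, width, height, threshold):
    """Return (label, score, x_min, y_min, x_max, y_max) in pixels for rows scoring above threshold."""
    results = []
    for box in output:
        label, score = box[0], box[1]
        if score > threshold:
            results.append((int(label), score,
                            int(box[2] * width), int(box[3] * height),
                            int(box[4] * width), int(box[5] * height)))
    return results


output = [
    [1.0, 0.9, 0.25, 0.5, 0.75, 1.0],   # kept: score above the threshold
    [2.0, 0.3, 0.0, 0.0, 1.0, 1.0],     # dropped: score below the threshold
]
print(boxes_above_threshold(output, width=640, height=480, threshold=0.5))
```

Keeping the filtering separate from the drawing makes it possible to check thresholds and coordinates without a display or an image file.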

The code of api.py is as follows:

#!/usr/bin/python
# -*- coding: UTF-8 -*- 

import errno
import math
import os
import sys
import argparse
import json

import cv2
import numpy as np

import edgeboard

# Define the command-line arguments
def parse_args():
    parser = argparse.ArgumentParser(description='API implementation for Paddle-Mobile')
    parser.add_argument('-d', '--detection',
                        help='flag indicating detections',
                        action="store_true")
    parser.add_argument('-j', '--json',
                        help='configuration file for the prediction')
    return parser.parse_args()

# Print the arguments
def print_args(args):
    print 'Arguments: '
    print '\t', '    detection flag: ', args.detection
    print '\t', 'json configuration: ', args.json

# Main function
def main():
    args = parse_args()
    print_args(args)
    if args.json is None or args.json == '':
        print '\nFor usage, please use the -h switch.\n\n'
        sys.exit(0)

    with open(args.json) as json_file:
        configs = json.load(json_file)

    edgeboard.init(configs)
    edgeboard.read_labels(configs)
    edgeboard.predict(configs, args.detection)

if __name__ == '__main__':
    sys.exit(main())

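Installation and usage

1. Copy paddlemobile-0.0.1.linux-aarch64-py2.tar.gz to a user directory, for example /home/root/workspace/

2. Install the paddlemobile Python SDK by extracting the tar.gz archive from the root directory

# Go to the system root directory
cd /

# Extract the tar.gz archive
tar -xzvf home/root/workspace/paddlemobile-0.0.1.linux-aarch64-py2.tar.gz

# Return to the user's home directory
cd

# Check that the package paddlemobile (0.0.1) is installed
pip list

3. Copy the EdgeBoard paddle-mobile prediction library to /usr/lib

The files in an EdgeBoard softcore upgrade (fpgadrv.ko, paddlemobile.so, image.ub and BOOT.BIN) must all be the same version to run correctly; upgrade the softcore if they are not.

For version 1.3.0 and later, the prediction library file name includes the version number. Copy it together with the ln.sh script to /usr/lib and run the script with sh ln.sh.

4. Review the configuration file and call the Python interface api.py. The Python configuration files have the same structure as the C++ example configuration files. To run your own model, change the model files and the corresponding configuration file; the details are not repeated here.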

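Local image prediction example

1. Classification model prediction

python api.py -j {classification model JSON file}

For example, to run the preset Inceptionv3 model:

python api.py -j configs.classification/Inceptionv3/zebra.json

Result: [screenshot]

2. Detection model prediction

python api.py -d -j {detection model JSON file}

For example, to run the preset vgg-ssd model:

python api.py -d -j configs.detection/vgg-ssd/screw.json

Result: [screenshot]

Prediction result image: [image]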