PaddleOCR hubserving timeout
goingbuy, posted 2021-01, views: 1520, replies: 2

Docker setup: the official base images all ship with CUDA, so I built my own CPU-only image on top of centos:8.3. Deployment and startup work without any problems.

W0107 05:13:49.063356     1 analysis_predictor.cc:1058] Deprecated. Please use CreatePredictor instead.
[2021-01-07 05:13:49 +0000] [1] [INFO] Starting gunicorn 20.0.4
[2021-01-07 05:13:49 +0000] [1] [INFO] Listening at: http://0.0.0.0:8868 (1)
[2021-01-07 05:13:49 +0000] [1] [INFO] Using worker: sync
[2021-01-07 05:13:49 +0000] [32] [INFO] Booting worker with pid: 32
[2021-01-07 05:13:49 +0000] [33] [INFO] Booting worker with pid: 33
[2021-01-07 05:13:49 +0000] [34] [INFO] Booting worker with pid: 34
[2021-01-07 05:13:49 +0000] [35] [INFO] Booting worker with pid: 35

However, when calling the service over HTTP with a larger image (higher resolution, e.g. 1K x 2K), the request always times out:

[2021-01-07 05:14:35 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:38)
[2021-01-07 05:14:37 +0000] [50] [INFO] Booting worker with pid: 50
[2021-01-07 05:15:19 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:37)
[2021-01-07 05:15:20 +0000] [61] [INFO] Booting worker with pid: 61

After shrinking the image with a tool (to roughly 600x800), it is processed normally (with concurrent calls):

[2021-01-07 12:32:55 +0000] [35] [INFO] Worker exiting (pid: 35)
[2021-01-07 12:32:55 +0000] [33] [INFO] Worker exiting (pid: 33)
ocr_system == 1.0.0
ocr_system == 1.0.0
dt_boxes num : 16, elapse : 1.0802247524261475
cls num  : 16, elapse : 0.3384115695953369
rec_res num  : 16, elapse : 17.49129319190979
dt_boxes num : 4, elapse : 24.130218744277954
cls num  : 4, elapse : 0.06459522247314453
rec_res num  : 4, elapse : 0.899388313293457

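Is there somewhere in the code where I can raise the timeout? (I'm a Java developer, not familiar with Python, and still finding my way around.)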

Also, is there detailed documentation for the parameter file in the source (deploy\hubserving\ocr_system\params.py)?

def read_params():
    cfg = Config()

    #params for text detector
    cfg.det_algorithm = "DB"
    cfg.det_model_dir = "./inference/ch_ppocr_mobile_v1.1_det_infer/"
    cfg.det_max_side_len = 960   # what does this parameter mean?

    #DB params
    cfg.det_db_thresh =0.3
    cfg.det_db_box_thresh =0.5
    cfg.det_db_unclip_ratio =2.0

    #EAST params
    cfg.det_east_score_thresh = 0.8
    cfg.det_east_cover_thresh = 0.1
    cfg.det_east_nms_thresh = 0.2

    #params for text recognizer
    cfg.rec_algorithm = "CRNN"
    cfg.rec_model_dir = "./inference/ch_ppocr_mobile_v1.1_rec_infer/"

    cfg.rec_image_shape = "3, 32, 320"  # what does this parameter mean?
    cfg.rec_char_type = 'ch'
    cfg.rec_batch_num = 30              # what does this parameter mean?
    cfg.max_text_length = 25            # what does this parameter mean?

    cfg.rec_char_dict_path = "./ppocr/utils/ppocr_keys_v1.txt"
    cfg.use_space_char = True

    #params for text classifier
    cfg.use_angle_cls = True
    cfg.cls_model_dir = "./inference/ch_ppocr_mobile_v1.1_cls_infer/"
    cfg.cls_image_shape = "3, 48, 192" # what does this parameter mean?
    cfg.label_list = ['0', '180']
    cfg.cls_batch_num = 30             # what does this parameter mean?
    cfg.cls_thresh = 0.9

    cfg.use_zero_copy_run = False
    cfg.use_pdserving = False

    return cfg
2 replies in total; last reply by zi76226 in 2022-04
#4 xizaohaoduopao replied 2022-01

The worker timeout is not caused by PaddleOCR itself but by gunicorn, which runs in the same environment. gunicorn's default worker timeout is 30s; find the timeout setting in config.py and increase it.
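
For reference, here is a minimal sketch of the gunicorn settings involved, written as a gunicorn config file (which is plain Python). The file name gunicorn.conf.py and the value 300 are illustrative assumptions. bind, workers, worker_class and timeout are standard gunicorn settings, and the first three simply mirror the startup log in the question. This thread does not show how hubserving builds its gunicorn options, so this file may not be picked up as-is; in that case the same timeout value has to be set wherever those options are defined, for example the config.py mentioned above.

```python
# gunicorn.conf.py (hypothetical file name, shown only to illustrate the settings)

bind = "0.0.0.0:8868"    # address from "Listening at: http://0.0.0.0:8868"
workers = 4              # the startup log boots four workers
worker_class = "sync"    # from "Using worker: sync"

# A sync worker that stays silent longer than this many seconds is killed
# and restarted, which is what "WORKER TIMEOUT" in the log reports.
# gunicorn's default is 30; 300 is an arbitrary larger value, not a recommendation.
timeout = 300
```

With the sync worker class, the timeout has to cover the full processing time of one request, so on CPU it needs to exceed the slowest expected OCR call.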

#2 qq4603009 replied 2021-02

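You can also ask under the PaddleOCR issues. As for the parameters: shape is the dimensions of the input image, and batch_num is the number of images processed in one batch.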
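
The two meanings described above can be illustrated with a small sketch. It assumes that the shape string is ordered channels, height, width, and that batch_num caps how many cropped text regions are passed to the model in one call. The helper names parse_shape and split_into_batches are hypothetical and are not part of PaddleOCR.

```python
def parse_shape(shape_str):
    """Parse a "C, H, W" string such as "3, 32, 320" into a tuple of ints."""
    return tuple(int(part) for part in shape_str.split(","))


def split_into_batches(items, batch_num):
    """Group items into consecutive batches of at most batch_num elements."""
    return [items[i:i + batch_num] for i in range(0, len(items), batch_num)]


# rec_image_shape = "3, 32, 320": 3 channels, 32 pixels high, 320 pixels wide
channels, height, width = parse_shape("3, 32, 320")
print(channels, height, width)

# With rec_batch_num = 30, 70 cropped text regions would be processed
# as three batches of 30, 30 and 10 items.
crops = list(range(70))  # stand-in for 70 cropped text images
batches = split_into_batches(crops, 30)
print([len(batch) for batch in batches])
```

In the logs from the question only 16 and 4 boxes were detected, so under this reading each request fits into a single batch of 30.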
