Python

Last updated: 2024-01-26

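Table Text Recognition (Synchronous API)

Automatically detects table lines and table content, and returns structured output containing the text of the table header, the table footer, and every cell.

    # `client` is assumed to be an already-initialized AipOcr instance.

    def get_file_content(filePath):
        """Read an image file and return its raw bytes."""
        with open(filePath, 'rb') as fp:
            return fp.read()

    image = get_file_content('example.jpg')
    url = "https://www.x.com/sample.jpg"

    # Call table text recognition (synchronous API)
    res_image = client.form(image)
    res_url = client.formUrl(url)
    print(res_image)
    print(res_url)

    # With optional parameters
    options = {}
    options["table_border"] = "none"
    res_image = client.form(image, options)
    res_url = client.formUrl(url, options)
    print(res_image)
    print(res_url)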

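Table text recognition (synchronous API): response fields

Field	Required	Type	Description
log_id	Yes	long	Unique log ID, used for troubleshooting
forms_result_num	Yes	uint32	Number of elements in the recognition result
forms_result	Yes	array[]	Recognition results
+ body	Yes	array[]	Table body region
+ footer	Yes	array[]	Table footer region
+ header	Yes	array[]	Table header region
+ vertexes_location	Yes	array[]	Vertices of the table boundary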

Table text recognition (synchronous API): response example

    {
        "log_id": 3445697108,
        "forms_result_num": 1,
        "forms_result": [
            {
                "body": [
                    {
                        "column": 0,
                        "probability": 0.99855202436447,
                        "row": 0,
                        "vertexes_location": [
                            {
                                "x": -2,
                                "y": 260
                            },
                            {
                                "x": 21,
                                "y": 244
                            },
                            {
                                "x": 35,
                                "y": 266
                            },
                            {
                                "x": 12,
                                "y": 282
                            }
                        ],
                        "words": "目"
                    },
                    {
                        "column": 3,
                        "probability": 0.99960500001907,
                        "row": 5,
                        "vertexes_location": [
                            {
                                "x": 603,
                                "y": 52
                            },
                            {
                                "x": 634,
                                "y": 32
                            },
                            {
                                "x": 646,
                                "y": 50
                            },
                            {
                                "x": 615,
                                "y": 71
                            }
                        ],
                        "words": "66"
                    },
                    {
                        "column": 3,
                        "probability": 0.99756097793579,
                        "row": 6,
                        "vertexes_location": [
                            {
                                "x": 634,
                                "y": 73
                            },
                            {
                                "x": 648,
                                "y": 63
                            },
                            {
                                "x": 657,
                                "y": 77
                            },
                            {
                                "x": 643,
                                "y": 86
                            }
                        ],
                        "words": "4"
                    },
                    {
                        "column": 3,
                        "probability": 0.96489900350571,
                        "row": 10,
                        "vertexes_location": [
                            {
                                "x": 699,
                                "y": 178
                            },
                            {
                                "x": 717,
                                "y": 167
                            },
                            {
                                "x": 727,
                                "y": 183
                            },
                            {
                                "x": 710,
                                "y": 194
                            }
                        ],
                        "words": "3,"
                    },
                    {
                        "column": 3,
                        "probability": 0.99809801578522,
                        "row": 14,
                        "vertexes_location": [
                            {
                                "x": 751,
                                "y": 296
                            },
                            {
                                "x": 786,
                                "y": 273
                            },
                            {
                                "x": 797,
                                "y": 289
                            },
                            {
                                "x": 761,
                                "y": 312
                            }
                        ],
                        "words": "206"
                    }
                ],
                "footer": [
                    {
                        "column": 0,
                        "probability": 0.99853301048279,
                        "row": 0,
                        "vertexes_location": [
                            {
                                "x": 605,
                                "y": 698
                            },
                            {
                                "x": 632,
                                "y": 680
                            },
                            {
                                "x": 643,
                                "y": 696
                            },
                            {
                                "x": 616,
                                "y": 714
                            }
                        ],
                        "words": "22"
                    }
                ],
                "header": [
                    {
                        "column": 0,
                        "probability": 0.94802802801132,
                        "row": 0,
                        "vertexes_location": [
                            {
                                "x": 183,
                                "y": 96
                            },
                            {
                                "x": 286,
                                "y": 29
                            },
                            {
                                "x": 301,
                                "y": 52
                            },
                            {
                                "x": 199,
                                "y": 120
                            }
                        ],
                        "words": "29月"
                    }
                ],
                "vertexes_location": [
                    {
                        "x": -154,
                        "y": 286
                    },
                    {
                        "x": 512,
                        "y": -153
                    },
                    {
                        "x": 953,
                        "y": 513
                    },
                    {
                        "x": 286,
                        "y": 953
                    }
                ]
            }
        ]
    }
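
The `row`, `column` and `words` fields of each `body` entry are enough to lay the recognized cells out as a grid. The following is a minimal sketch, assuming the response has the shape shown in the example above and that `row` and `column` are zero-based. `cells_to_grid` is a hypothetical helper written for illustration, not part of the SDK.

```python
def cells_to_grid(cells):
    """Arrange recognized cells into a list of rows, indexed by row/column."""
    if not cells:
        return []
    # Assumption: row and column indices are zero-based, as in the example.
    n_rows = max(c["row"] for c in cells) + 1
    n_cols = max(c["column"] for c in cells) + 1
    grid = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    for c in cells:
        grid[c["row"]][c["column"]] = c["words"]
    return grid


# Trimmed-down response containing only the fields used here
response = {
    "forms_result_num": 1,
    "forms_result": [
        {
            "body": [
                {"row": 0, "column": 0, "words": "a"},
                {"row": 0, "column": 1, "words": "b"},
                {"row": 1, "column": 1, "words": "d"},
            ]
        }
    ],
}

for form in response["forms_result"]:
    for row in cells_to_grid(form["body"]):
        print(row)
```

Positions for which no cell was recognized are left as empty strings, so sparse results such as the example above still yield rectangular rows.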

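Table Text Recognition (Asynchronous API): Submit Request

Automatically detects table lines and table content, and returns structured output containing the text of the table header, the table footer, and every cell. Table text recognition is an asynchronous service made up of two APIs: the submit-request API and the get-result API.

    # `client` is assumed to be an already-initialized AipOcr instance.

    def get_file_content(filePath):
        """Read an image file and return its raw bytes."""
        with open(filePath, 'rb') as fp:
            return fp.read()

    image = get_file_content('example.jpg')

    # Call table text recognition (asynchronous API): submit request
    res_image = client.tableRecognitionAsync(image)
    print(res_image)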

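Table text recognition: request parameters

Parameter	Required	Type	Description
image	Yes	string	Image data, base64-encoded. The base64-encoded size must not exceed 4M; the shortest side must be at least 15px and the longest side at most 4096px. Supported formats: jpg/png/bmp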
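
Some of the constraints above can be checked locally before a request is submitted. The sketch below is an illustration under stated assumptions: "4M" is read as 4 MiB, the format is guessed from the file extension (`.jpeg` is accepted as an alias of `.jpg`), and the pixel-dimension limits are not checked because that would require an image library. `check_image` is a hypothetical helper, not part of the SDK.

```python
import base64
import os

MAX_BASE64_BYTES = 4 * 1024 * 1024  # assumption: "4M" means 4 MiB
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def check_image(file_path, data):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    ext = os.path.splitext(file_path)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append("unsupported extension: %r" % ext)
    # The limit applies to the base64-encoded size, not the raw file size.
    encoded_size = len(base64.b64encode(data))
    if encoded_size > MAX_BASE64_BYTES:
        problems.append(
            "base64 size %d exceeds %d bytes" % (encoded_size, MAX_BASE64_BYTES)
        )
    return problems


print(check_image("example.jpg", b"\xff\xd8\xff" + b"\x00" * 100))
print(check_image("example.gif", b"GIF89a"))
```

Because base64 grows data by roughly one third, a raw file a little over 3 MiB already exceeds the encoded limit.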

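Table text recognition: response fields

Field	Required	Type	Description
log_id	Yes	long	Unique log ID, used for troubleshooting
result	Yes	list	List of returned results
+request_id	Yes	string	The request_id generated for this request; use it later to fetch the recognition result

Table text recognition: response example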

    {
        "result" : [
            {
                "request_id" : "1234_6789"
            }
        ],
        "log_id":149689853984104
    }

Example failure response (see the bottom of this document for a detailed description of the error codes):

    {
        "log_id": 149319909347709,
        "error_code": 282000,
        "error_msg":"internal error"
    }
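
Since a failure is reported inside the response body, callers may want to turn it into an exception before using the result. A minimal sketch, assuming a failed response can be recognized by the presence of an `error_code` field as in the example above. `OcrError` and `ensure_success` are hypothetical names introduced for illustration, not part of the SDK.

```python
class OcrError(Exception):
    """Raised when a response carries an error_code field."""

    def __init__(self, code, message, log_id=None):
        super().__init__("OCR error %s: %s (log_id=%s)" % (code, message, log_id))
        self.code = code
        self.message = message
        self.log_id = log_id


def ensure_success(response):
    """Return the response unchanged, or raise OcrError if it reports a failure."""
    if "error_code" in response:
        raise OcrError(
            response["error_code"],
            response.get("error_msg", ""),
            response.get("log_id"),
        )
    return response


failure = {"log_id": 149319909347709, "error_code": 282000, "error_msg": "internal error"}
try:
    ensure_success(failure)
except OcrError as err:
    print(err.code, err.message)
```

Keeping `log_id` on the exception makes it easier to report the failing request when troubleshooting.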

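Table Text Recognition (Asynchronous API): Get Result

Fetches the result of a table text recognition request.

    # `client` is assumed to be an already-initialized AipOcr instance.
    requestId = "23454320-23255"

    # Call table text recognition (asynchronous API): get result
    res_image = client.getTableRecognitionResult(requestId)
    print(res_image)

    # With optional parameters
    options = {}
    options["result_type"] = "json"
    res_image = client.getTableRecognitionResult(requestId, options)
    print(res_image)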

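Table recognition result: request parameters

Parameter	Required	Type	Allowed values	Default	Description
request_id	Yes	string			The request id returned when the table text recognition request was submitted
result_type	No	string	json excel	excel	Desired result type. "excel" returns the URL of an xls file; "json" returns a JSON-formatted string. Defaults to "excel"

Table recognition result: response fields

Field	Required	Type	Description
log_id	Yes	long	Unique log ID, used for troubleshooting
result	Yes	object	The returned result
+result_data	Yes	string	Recognition result string. If result_type is excel, this is the download URL of the Excel file; if result_type is json, this is a JSON-formatted string
+percent	Yes	int	Table recognition progress (percentage)
+request_id	Yes	string	The request_id of the request for this image
+ret_code	Yes	int	Recognition status: 1 = not started, 2 = in progress, 3 = completed
+ret_msg	Yes	string	Recognition status message: 任务未开始 (not started), 进行中 (in progress), 已完成 (completed)

Table recognition result: response example

Example success response: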

    {
        "result" : {
            "result_data" : "",
            "percent": 100,
            "request_id": "149691317905102",
            "ret_code": 3,
            "ret_msg": "已完成"
        },
        "log_id":149689853984104
    }

When result_type is excel, result_data has the following format:

    {
        "file_url":"https://ai.baidu.com/file/xxxfffddd"
    }

When result_type is json, result_data has the following format:

    {
        "form_num": 1,
        "forms": [
            {
                "header": [
                    {
                        "row": [
                            1
                        ],
                        "column": [
                            1,
                            2
                        ],
                        "word": "header text 1"
                    }
                ],
                "footer": [
                    {
                        "row": [
                            1
                        ],
                        "column": [
                            1,
                            2
                        ],
                        "word": "footer text 1"
                    }
                ],
                "body": [
                    {
                        "row": [
                            1
                        ],
                        "column": [
                            1,
                            2
                        ],
                        "word": "cell text"
                    }
                ]
            }
        ]
    }

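Field descriptions (when the result is returned as json):

Field	Required	Type	Description
form_num	Yes	int	Number of tables (a single image may contain several tables)
forms	Yes	list	List of table content information
+header	Yes	list	Header data for each table
+footer	Yes	list	Footer data for each table
+body	Yes	list	Data of the table body
++row	Yes	list	Row numbers occupied by this cell
++column	Yes	list	Column numbers occupied by this cell
++word	Yes	string	Text contained in this cell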
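
Because `row` and `column` are lists, a single cell can span several positions. The sketch below shows one way to decode `result_data` and expand each cell to every position it occupies, assuming `result_data` arrives as a JSON string with the structure described above. `expand_cells` is a hypothetical helper, not part of the SDK.

```python
import json


def expand_cells(cells):
    """Map every (row, column) position a cell occupies to its text."""
    positions = {}
    for cell in cells:
        for r in cell["row"]:
            for c in cell["column"]:
                positions[(r, c)] = cell["word"]
    return positions


# Stand-in for result["result_data"] when result_type is "json"
result_data = json.dumps({
    "form_num": 1,
    "forms": [{
        "header": [],
        "footer": [],
        "body": [{"row": [1], "column": [1, 2], "word": "merged cell"}],
    }],
})

parsed = json.loads(result_data)  # result_data is a string, so decode it first
for form in parsed["forms"]:
    print(expand_cells(form["body"]))
```

The same helper can be applied to the `header` and `footer` lists, since their entries use the same three fields.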

Example failure response (see the bottom of this document for a detailed description of the error codes):

    {
        "log_id": 149319909347709,
        "error_code": 282000,
        "error_msg":"internal error"
    }

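Table Recognition

Submits a table recognition request and, once the request id has been obtained, polls the get-result API until the result is available.

    # `client` is assumed to be an already-initialized AipOcr instance.

    def get_file_content(filePath):
        """Read an image file and return its raw bytes."""
        with open(filePath, 'rb') as fp:
            return fp.read()

    image = get_file_content('example.jpg')

    # Call table recognition
    options = {}
    options["result_type"] = "json"
    res_image = client.tableRecognition(image, options)
    print(res_image)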

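Request parameters

tableRecognition(image, option, timeout)

Parameter	Required	Type	Default	Description
image	Yes	string		Base64-encoded image data
+result_type	Yes	string	excel	Desired result type, passed in the options dict; allowed values: json, excel. "excel" returns the URL of an xls file; "json" returns a JSON-formatted string
timeout	Yes	number	10000	Timeout for polling the tableGetresult API for the result, in milliseconds

The response fields are the same as those returned by the table recognition get-result API.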
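
The submit-then-poll behavior described in this section can be sketched as a small loop around the get-result call. This illustrates the idea and is not the SDK's actual implementation. It assumes that `ret_code` 3 means completed (as documented for the get-result API) and that failures carry an `error_code` field. `poll_table_result`, `interval_s` and `fake_get_result` are hypothetical names; in real use the callable would be `client.getTableRecognitionResult`.

```python
import time


def poll_table_result(get_result, request_id, timeout_ms=10000, interval_s=1.0):
    """Call get_result(request_id) until ret_code is 3 or the timeout elapses."""
    deadline = time.monotonic() + timeout_ms / 1000.0
    while True:
        response = get_result(request_id)
        if "error_code" in response:
            return response  # let the caller inspect the failure
        if response["result"]["ret_code"] == 3:  # 3 = completed
            return response
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "request %s not finished in %d ms" % (request_id, timeout_ms)
            )
        time.sleep(interval_s)


# Stand-in for client.getTableRecognitionResult so the sketch runs offline
_replies = iter([
    {"result": {"ret_code": 1, "percent": 0}},
    {"result": {"ret_code": 2, "percent": 50}},
    {"result": {"ret_code": 3, "percent": 100, "result_data": ""}},
])


def fake_get_result(request_id):
    return next(_replies)


final = poll_table_result(fake_get_result, "1234_6789", interval_s=0.01)
print(final["result"]["percent"])
```

Passing the get-result function in as an argument keeps the loop independent of the client object, which makes it straightforward to test without network access.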
