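Handwritten Composition Recognition (Multimodal)

Last updated: 2026-01-30

API Description

This API uses a multimodal large model to accurately recognize handwritten compositions. It supports multiple layouts, including single-page, cross-page, and multi-column compositions; filters out interference such as shadows, smudges, and extra annotations; and accurately recognizes handwritten Chinese and English. The output is structured text that is easy to process, with coordinates at the character, line, and paragraph levels, so it connects directly to downstream grading workflows.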

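The Handwritten Composition Recognition (Multimodal) API is asynchronous. First call the submit request API to obtain a task_id, then call the get result API to poll for the result. We recommend polling 5 to 10 seconds after submitting the request. The submit request API has a QPS limit of 2, and the get result API has a QPS limit of 10.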
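
The submit-then-poll flow described above can be sketched as a small polling helper. This is a minimal sketch, not an official client: `poll_result`, its parameters, and the injected `fetch` callable (which would wrap a POST to the get result API) are hypothetical names introduced here. The status comparison is case-insensitive as an assumption, because the field table lists `success` while the sample response shows `Success`.

```python
import time
from typing import Any, Callable, Dict

# Terminal task states, taken from the status field description.
TERMINAL_STATES = {"success", "failed"}


def poll_result(fetch: Callable[[str], Dict[str, Any]],
                task_id: str,
                interval: float = 5.0,
                max_attempts: int = 24,
                sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Call fetch(task_id) until the task reaches a terminal state.

    fetch is a caller-supplied function that returns the parsed JSON
    response of the get result API (hypothetical helper, not part of the API).
    """
    for _ in range(max_attempts):
        response = fetch(task_id)
        # error_code appears as 0 or "0" in the samples, so compare as text.
        if str(response.get("error_code", 0)) != "0":
            raise RuntimeError(f"get_result failed: {response}")
        result = response.get("result") or {}
        if str(result.get("status", "")).lower() in TERMINAL_STATES:
            return result
        # Wait between polls; 5 s also stays far below the 10 QPS limit.
        sleep(interval)
    raise TimeoutError(f"task {task_id} did not finish after {max_attempts} polls")
```

Passing `fetch` and `sleep` as arguments keeps the loop independent of any HTTP library and makes it easy to exercise with canned responses.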

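Online Debugging

You can debug this API in the Sample Code Center, where you can verify signatures, view the request and response of online calls, and generate sample code automatically.

Apply for Trial

This API is currently in invitation-only testing. Please submit a cooperation inquiry or a support ticket first, providing your company name, appid, use case, and related information. The API becomes available after staff have helped enable your access.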

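Submit Request API

Request Description

Request Example

HTTP method: POST

Request URL: https://aip.baidubce.com/rest/2.0/ocr/v1/handwriting_composition/create_task

URL parameters:

Parameter	Value
access_token	The access_token obtained with your API Key and Secret Key; see "Obtaining an Access Token"

Headers:

Parameter	Value
Content-Type	application/json

Place the request parameters in the body. Parameter details: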

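Request Parameters

Parameter	Required	Type	Allowed values	Description
image	One of image, url, or pdf_file	string	-	Image data, base64-encoded and then URL-encoded. The size after base64 and URL encoding must not exceed 10 MB; the shortest side must be at least 15 px and the longest side at most 4096 px. Supported formats: jpg/jpeg/png/bmp. Priority: image > url > pdf_file; when the image field is present, the url field is ignored.
url	One of image, url, or pdf_file	string	-	Full URL of the image. The URL must not exceed 1024 bytes, and the image it points to must not exceed 10 MB after base64 encoding; the shortest side must be at least 15 px and the longest side at most 4096 px. Supported formats: jpg/jpeg/png/bmp. Priority: image > url > pdf_file; when the image field is present, the url field is ignored. Make sure hotlink protection is disabled for the URL.
pdf_file	One of image, url, or pdf_file	string	-	PDF file, base64-encoded and then URL-encoded. The size after base64 and URL encoding must not exceed 10 MB; the shortest side must be at least 15 px and the longest side at most 4096 px. Priority: image > url > pdf_file; when the image field is present, the url field is ignored.
recognize_granularity	No	string	line/word/none	Recognition granularity, which controls the coordinates returned. Options: • line: return line-level coordinates • word: return line-level and character-level coordinates • none: return no coordinates
pdf_file_num	No	string	-	-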
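
The parameter rules above (exactly one effective input source, the image > url > pdf_file priority, the size and URL-length limits, and the allowed granularity values) can be checked on the client before a request is sent. The following is a minimal sketch under stated assumptions: `build_create_task_body` and `_encode` are hypothetical helper names, the size limit is interpreted as 10 * 1024 * 1024 bytes, and, like the Python sample in this document, the base64 string is placed directly in the JSON body without an extra URL-encoding step.

```python
import base64
from typing import Dict, Optional

# Size limit from the parameter table, interpreted as 10 * 1024 * 1024 bytes (assumption).
MAX_ENCODED_BYTES = 10 * 1024 * 1024
MAX_URL_BYTES = 1024
GRANULARITIES = {"line", "word", "none"}


def _encode(data: bytes) -> str:
    """Base64-encode raw bytes and enforce the encoded-size limit."""
    encoded = base64.b64encode(data).decode("ascii")
    if len(encoded) > MAX_ENCODED_BYTES:
        raise ValueError("encoded payload exceeds the size limit")
    return encoded


def build_create_task_body(image_bytes: Optional[bytes] = None,
                           url: Optional[str] = None,
                           pdf_bytes: Optional[bytes] = None,
                           recognize_granularity: Optional[str] = None) -> Dict[str, str]:
    """Build the JSON body for create_task, honouring image > url > pdf_file."""
    body: Dict[str, str] = {}
    if image_bytes is not None:
        body["image"] = _encode(image_bytes)
    elif url is not None:
        if len(url.encode("utf-8")) > MAX_URL_BYTES:
            raise ValueError("url exceeds 1024 bytes")
        body["url"] = url
    elif pdf_bytes is not None:
        body["pdf_file"] = _encode(pdf_bytes)
    else:
        raise ValueError("one of image_bytes, url, or pdf_bytes is required")
    if recognize_granularity is not None:
        if recognize_granularity not in GRANULARITIES:
            raise ValueError("recognize_granularity must be line, word, or none")
        body["recognize_granularity"] = recognize_granularity
    return body
```

Only the highest-priority source is sent, which mirrors the documented behaviour that lower-priority fields are ignored when `image` is present.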

Request Code Examples

Note: before using the sample code, replace the sample token, document URL, or Base64 data with your own.

curl -i -k 'https://aip.baidubce.com/rest/2.0/ocr/v1/handwriting_composition/create_task?access_token=[token obtained from the auth API]' \
 -H 'Content-Type: application/json' \
--data '{
    "url": "https://ai.bdstatic.com/file/088749BAB26D4809B8A0B96FE100E7F0"
}'

# encoding:utf-8

import base64
import json

import requests

'''
Submit a handwritten composition recognition request
'''

request_url = "https://aip.baidubce.com/rest/2.0/ocr/v1/handwriting_composition/create_task"
# Open the image file in binary mode and base64-encode its contents
with open('[local file]', 'rb') as f:
    # b64encode returns bytes; decode to str so json.dumps can serialize it
    img = base64.b64encode(f.read()).decode('utf-8')

params = json.dumps({
  "image": img
})

access_token = '[token obtained from the auth API]'
request_url = request_url + "?access_token=" + access_token
headers = {'content-type': 'application/json'}
response = requests.post(request_url, data=params, headers=headers)
if response:
    print(response.json())

package com.baidu.ai.aip;

import com.baidu.ai.aip.utils.Base64Util;
import com.baidu.ai.aip.utils.FileUtil;
import com.baidu.ai.aip.utils.HttpUtil;
import com.google.gson.Gson;
import java.util.HashMap;
import java.util.Map;

/**
* Submit a handwritten composition recognition request
*/
public class HandwritingCompositionCreateTask {

    /**
    * Important: the utility classes required by this code
    * (FileUtil, Base64Util, HttpUtil, GsonUtils) can be downloaded from
    * https://ai.baidu.com/file/658A35ABAB2D404FBF903F64D47C1F72
    * https://ai.baidu.com/file/C8D81F3301E24D2892968F09AE1AD6E2
    * https://ai.baidu.com/file/544D677F5D4E4F17B4122FBD60DB82B3
    * https://ai.baidu.com/file/470B3ACCA3FE43788B5A963BF0B625F3
    */
    public static String handwritingCompositionCreateTask() {
        // Request URL
        String url = "https://aip.baidubce.com/rest/2.0/ocr/v1/handwriting_composition/create_task";
        try {
            // Local file path
            String filePath = "[local file path]";
            byte[] imgData = FileUtil.readFileByBytes(filePath);
            String imgStr = Base64Util.encode(imgData);

            // Build the request body
            Map<String, Object> map = new HashMap<>();
            map.put("image", imgStr); // or use the url parameter instead
            String param = new Gson().toJson(map);

            // Note: to keep the code simple, the access_token is obtained for every request here. In production the access_token has an expiry time, so the client can cache it and obtain a new one after it expires.
            String accessToken = "[token obtained from the auth API]";

            String result = HttpUtil.post(url, accessToken, "application/json", param);
            System.out.println(result);
            return result;
        } catch (Exception e) {
            e.printStackTrace();
        }
        return null;
    }

    public static void main(String[] args) {
        HandwritingCompositionCreateTask.handwritingCompositionCreateTask();
    }
}

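Response Description

Response Parameters

Field	Type	Description
log_id	uint64	Unique log id, used for troubleshooting
error_code	int	Error code
error_msg	string	Error description
result	dict	Returned result
+ task_id	string	The task_id generated for this request; use it later to fetch the recognition result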

Response Examples

Successful response example:

{
    "error_code": 0,
    "error_msg": "",
    "log_id": "10138598131137362685273505665433",
    "result": {
        "task_id": "task-3zy9Bg8CHt1M4pPOcX2q5bg28j26801S"
    }
}

Failed response example (for detailed error code descriptions, see API Documentation - Error Codes):

{
    "log_id": 1965746008642488944,
    "error_msg": "并发超限",
    "error_code": 15
}
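
The two response shapes above differ only in whether `error_code` is zero and whether `result.task_id` is present, so a caller can fold them into one check. A minimal sketch follows; `OcrApiError` and `extract_task_id` are hypothetical names introduced for illustration, and the response is assumed to be the already-parsed JSON dictionary.

```python
from typing import Any, Dict


class OcrApiError(Exception):
    """Raised when the API response carries a non-zero error_code."""

    def __init__(self, code: Any, message: str, log_id: Any = None):
        super().__init__(f"error_code={code}, error_msg={message}, log_id={log_id}")
        self.code = code
        self.message = message
        self.log_id = log_id


def extract_task_id(response: Dict[str, Any]) -> str:
    """Return result.task_id from a create_task response or raise OcrApiError."""
    code = response.get("error_code", 0)
    # error_code may arrive as an int or as a string, so compare as text.
    if str(code) != "0":
        raise OcrApiError(code, response.get("error_msg", ""), response.get("log_id"))
    result = response.get("result")
    if not isinstance(result, dict) or "task_id" not in result:
        raise OcrApiError(code, "response has no result.task_id", response.get("log_id"))
    return result["task_id"]
```

Keeping `log_id` on the exception preserves the value the documentation recommends for troubleshooting.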

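Get Result API

Request Description

Request Example

HTTP method: POST

Request URL: https://aip.baidubce.com/rest/2.0/ocr/v1/handwriting_composition/get_result

URL parameters:

Parameter	Value
access_token	The access_token obtained with your API Key and Secret Key; see "Obtaining an Access Token"

Headers:

Parameter	Value
Content-Type	application/json

Place the request parameters in the body. Parameter details:

Request Parameters

Parameter	Required	Type	Description
task_id	Yes	string	The task_id returned by the submit request API

Request Code Examples

Note: before using the sample code, replace the sample token and task_id with your own.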

curl --location 'https://aip.baidubce.com/rest/2.0/ocr/v1/handwriting_composition/get_result?access_token=[token obtained from the auth API]' \
--header 'Content-Type: application/json' \
--data '{
    "task_id": "1965376138007096888"
}'

# encoding:utf-8

import json

import requests

'''
Get the handwritten composition recognition result
'''

request_url = "https://aip.baidubce.com/rest/2.0/ocr/v1/handwriting_composition/get_result"

# task_id comes from the response of the submit request API
params = json.dumps({
  "task_id":  "1965376138007096888"
})
access_token = '[token obtained from the auth API]'
request_url = request_url + "?access_token=" + access_token
headers = {'content-type': 'application/json'}
response = requests.post(request_url, data=params, headers=headers)
if response:
    print(response.json())

package com.baidu.ai.aip;

import com.baidu.ai.aip.utils.HttpUtil;
import com.google.gson.Gson;

import java.util.HashMap;
import java.util.Map;

/**
* Get the handwritten composition recognition result
*/
public class HandwritingCompositionGetResult {

    /**
    * Important: the utility classes required by this code
    * (HttpUtil, GsonUtils) can be downloaded from
    * https://ai.baidu.com/file/544D677F5D4E4F17B4122FBD60DB82B3
    * https://ai.baidu.com/file/470B3ACCA3FE43788B5A963BF0B625F3
    */
    public static String handwritingCompositionGetResult() {
        // Request URL
        String url = "https://aip.baidubce.com/rest/2.0/ocr/v1/handwriting_composition/get_result";
        try {
            // task_id comes from the response of the submit request API
            Map<String, Object> map = new HashMap<>();
            map.put("task_id", "1965376138007096888");
            String param = new Gson().toJson(map);

            String accessToken = "[token obtained from the auth API]";

            String result = HttpUtil.post(url, accessToken, "application/json", param);
            System.out.println(result);
            return result;
        } catch (Exception e) {
            e.printStackTrace();
        }
        return null;
    }

    public static void main(String[] args) {
        HandwritingCompositionGetResult.handwritingCompositionGetResult();
    }
}



#include <cstdio>
#include <iostream>
#include <curl/curl.h>
#include <string>

const static std::string get_request_url = "https://aip.baidubce.com/rest/2.0/ocr/v1/handwriting_composition/get_result";
static std::string get_result_str;

/**
 * Write callback invoked by curl as response data arrives.
 * curl may call it several times per response, so each chunk is appended.
 */
static size_t get_callback(void *ptr, size_t size, size_t nmemb, void *stream) {
    get_result_str.append((char *) ptr, size * nmemb);
    return size * nmemb;
}

/**
 * Composition recognition - get the task result
 * @return 0 on success, a non-zero error code otherwise
 */
int handwriting_composition_get_result(std::string &json_result, const std::string &access_token, const std::string &task_id) {
    std::string url = get_request_url + "?access_token=" + access_token;

    CURL *curl = NULL;
    CURLcode result_code;
    int is_success;

    // Build the JSON request body
    std::string json_body = "{\"task_id\":\"" + task_id + "\"}";

    curl = curl_easy_init();
    if (curl) {
        // Discard any data left over from a previous call
        get_result_str.clear();

        curl_easy_setopt(curl, CURLOPT_URL, url.c_str());
        curl_easy_setopt(curl, CURLOPT_POST, 1L);

        // Set the request header Content-Type: application/json
        struct curl_slist *headers = NULL;
        headers = curl_slist_append(headers, "Content-Type: application/json");
        curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headers);

        curl_easy_setopt(curl, CURLOPT_POSTFIELDS, json_body.c_str());

        curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, get_callback);
        result_code = curl_easy_perform(curl);

        if (result_code != CURLE_OK) {
            fprintf(stderr, "curl_easy_perform() failed: %s\n",
                    curl_easy_strerror(result_code));
            is_success = 1;
        } else {
            json_result = get_result_str;
            is_success = 0;
        }

        curl_slist_free_all(headers);
        curl_easy_cleanup(curl);
    } else {
        fprintf(stderr, "curl_easy_init() failed.\n");
        is_success = 1;
    }
    return is_success;
}

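Response Description

Response Parameters

Field	Type	Description
log_id	uint64	Unique log id, used for troubleshooting
error_code	int	Error code
error_msg	string	Error description
result	dict	Returned result
+ task_id	string	Task ID
+ status	string	Task status. pending: queued; processing: running; success: succeeded; failed: failed
+ created_time	string	Task creation time
+ started_time	string	Task start time
+ finished_time	string	Task finish time
+ duration	string	Task execution duration
+ result	dict	Composition recognition result
++ recognize_granularity	string	Recognition granularity, which controls the coordinates returned. Options: • line: return line-level coordinates • word: return line-level and character-level coordinates • none: return no coordinates
++ essayOverall	dict	Overall recognized text
+++ titleText	string	Composition title text
+++ contentText	string	Composition body text
++ title	dict	Title details
+++ bbox	list	Returned only at character-level and line-level granularity; bounding rectangle of the title
+++ text	string	Title text
+++ chars	list	Returned only at character-level granularity; per-character details of the title
++++ isPunctuation	string	Whether the character is a punctuation mark
++++ bbox	dict	Character coordinates
++++ char	string	A single character
++++ index	string	Character index
++ content	dict	Body details
+++ lines	list	Returned only at character-level and line-level granularity; list of line-level information
++++ lineId	string	Unique identifier of the line
++++ text	string	Text of the line
++++ bbox	string	Coordinates of the line
++++ paragraphId	string	ID of the paragraph the line belongs to; refers to paragraphs
++++ chars	list	Returned only at character-level granularity; per-character details within the line
+++++ char	string	The character in the body line
+++++ index	string	Index of the character within the body line
+++++ bbox	string	Coordinates of the character in the body line
+++++ isPunctuation	string	Whether the character in the body line is a punctuation mark
+++ paragraphs	list	List of paragraph-level logical information
++++ bbox	list	Returned only at character-level and line-level granularity; outline coordinates of the paragraph, which may contain multiple boxes
++++ paragraphId	string	Unique paragraph identifier (e.g. "p1")
++++ isColumn	string	Returned only at character-level and line-level granularity; whether the layout is multi-column (1: multi-column, 0: single column)
++++ text	string	Full paragraph text
++++ sentences	list	List of sentences in the paragraph
+++++ bbox	list	Returned only at character-level and line-level granularity; outline coordinates of the sentence
+++++ sentenceId	string	Unique sentence identifier
+++++ text	string	Sentence text
+++++ lineSegments	list	Returned only at character-level and line-level granularity; sentence-to-line mapping segments that describe which lines the sentence occupies and where it starts and ends
++++++ lineId	string	Corresponds to a lineId in lines
++++++ startIndex	string	Start position of the sentence within the line
++++++ endIndex	string	End position of the sentence within the line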

Response Examples

General successful response example:

{
    "error_code": "0",
    "error_msg": "",
    "result": {
        "task_id": "1965376138007096888",
        "status": "Success",
        "created_time": 1757417161000,
        "started_time": 1757417162000,
        "finished_time": 1757497412914,
        "duration": 80250914,
        "result": {
            "recognize_granularity": "none",
            "essayOverall": {
                "titleText": "我的童年",
                "contentText": "每个人都有难忘的童年。我的童年充满了欢声笑语，\n夏天会和小伙伴去河边捉鱼。冬天则围在火炉旁\n听奶奶讲故事。"
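            },
            "title": { // Minimal title structure: text and coordinates only (characters added on demand)
                "text": "我的童年",
                "bbox": { "x": 100, "y": 50, "w": 95, "h": 20 } // xywh bounding box of the whole title
                // A chars array is added at character-level granularity; otherwise this field is absent
            },
            "content": { // The body keeps the full hierarchy: lines → paragraphs → sentences
                "paragraphs": [], // Logical paragraphs (with coordinates and sentence associations)
                "lines": [] // Physical lines (with coordinates and paragraph membership)
            }
        }
    }
}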

Line-level coordinates

"result": {
  "recognize_granularity": "line",
  "essayOverall": { /* same as above */ },
  "title": {
    "text": "我的童年",
    "bbox": [{ "x": 100, "y": 50, "w": 95, "h": 20 }] // Coordinates of the whole title; no line/sentence split
  },
  "content": {
    "paragraphs": [
      {
        "paragraphId": "p1",
        "text": "每个人都有难忘的童年。我的童年充满了欢声笑语，夏天会和小伙伴去河边捉鱼。",
        "bbox": [{ "x": 80, "y": 100, "w": 350, "h": 40 }], // Minimal bounding box of the paragraph
        "isColumn": 0,
        "sentences": [
          {
            "sentenceId": "s1-p1",
            "text": "每个人都有难忘的童年。",
            "bbox":[{ "x": 80, "y": 100, "w": 260, "h": 20 }], // Minimal bounding box of the sentence
            "lineSegments": [ // Line segments covered by the sentence (handles sentences that span lines or columns)
              { "lineId": "l1", "startIndex": 0, "endIndex": 10 }
            ]
          },
          {
            "sentenceId": "s2-p1",
            "text": "我的童年充满了欢声笑语，夏天会和小伙伴去河边捉鱼。",
            "bbox": [{ "x": 80, "y": 100, "w": 350, "h": 40 }],
            "lineSegments": [
              { "lineId": "l1", "startIndex": 11, "endIndex": 25 },
              { "lineId": "l2", "startIndex": 0, "endIndex": 12 }
            ]
          }
        ]
      },
      {
        "paragraphId": "p2",
        "text": "冬天则围在火炉旁听奶奶讲故事。",
        "bbox": [{ "x": 80, "y": 130, "w": 350, "h": 40 }],
        "sentences": [
          {
            "sentenceId": "s1-p2",
            "text": "冬天则围在火炉旁听奶奶讲故事。",
            "bbox":[{ "x": 80, "y": 130, "w": 350, "h": 40 }],
            "lineSegments": [
              { "lineId": "l2", "startIndex": 13, "endIndex": 20 },
              { "lineId": "l3", "startIndex": 0, "endIndex": 8 }
            ]
          }
        ]
      }
    ],
    "lines": [
      {
        "lineId": "l1",
        "text": "每个人都有难忘的童年。我的童年充满了欢声笑语，",
        "bbox": { "x": 80, "y": 100, "w": 350, "h": 20 },
        "paragraphId": "p1" // Linked to a single paragraph, not an array
      },
      {
        "lineId": "l2",
        "text": "夏天会和小伙伴去河边捉鱼。冬天则围在火炉旁",
        "bbox": { "x": 80, "y": 130, "w": 350, "h": 20 },
        "paragraphId": "p1" // Linked to a single paragraph, not an array
      },
      {
        "lineId": "l3",
        "text": "听奶奶讲故事。",
        "bbox": { "x": 80, "y": 160, "w": 120, "h": 20 },
        "paragraphId": "p2" // Linked to a single paragraph, not an array
      }
    ]
  }
}
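
Because each sentence carries `lineSegments` that point at entries in `lines` by `lineId`, the logical and physical views can be joined with a dictionary lookup. The sketch below is one possible way to do that; `sentence_lines` is a hypothetical helper name, and it deliberately uses only the `lineId` links, since the exact semantics of `startIndex` and `endIndex` (for example, whether the end is inclusive) are not spelled out here.

```python
from typing import Any, Dict, List


def sentence_lines(content: Dict[str, Any]) -> Dict[str, List[str]]:
    """Map each sentenceId to the text of the physical lines it touches."""
    # Index the physical lines by their identifier for constant-time lookup.
    lines_by_id = {line["lineId"]: line for line in content.get("lines", [])}
    mapping: Dict[str, List[str]] = {}
    for paragraph in content.get("paragraphs", []):
        for sentence in paragraph.get("sentences", []):
            texts = []
            for segment in sentence.get("lineSegments", []):
                line = lines_by_id.get(segment["lineId"])
                if line is not None:  # skip segments that reference unknown lines
                    texts.append(line["text"])
            mapping[sentence["sentenceId"]] = texts
    return mapping
```

For the line-level sample, a sentence such as `s2-p1` would map to the texts of lines `l1` and `l2`, which is useful when highlighting a sentence on the original page.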

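Character-level coordinates

"result": {
  "recognize_granularity": "word",
  "essayOverall": { /* same as above */ },
  "title": {
    "text": "我的童年",
    "bbox": { "x": 100, "y": 50, "w": 95, "h": 20 }, // Coordinates of the whole title
    "chars": [ // Per-character coordinates of the title (returned only at character-level granularity)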
      { "char": "我", "index": 0, "bbox": { "x": 100, "y": 50, "w": 20, "h": 20 }, "isPunctuation": false },
      { "char": "的", "index": 1, "bbox": { "x": 125, "y": 50, "w": 20, "h": 20 }, "isPunctuation": false },
      { "char": "童", "index": 2, "bbox": { "x": 150, "y": 50, "w": 20, "h": 20 }, "isPunctuation": false },
      { "char": "年", "index": 3, "bbox": { "x": 175, "y": 50, "w": 20, "h": 20 }, "isPunctuation": false }
    ]
  },
  "content": {
    "paragraphs": [ /* same paragraph/sentence structure as at line granularity */ ],
    "lines": [
      {
        "lineId": "l1",
        "text": "每个人都有难忘的童年。我的童年充满了欢声笑语，",
        "bbox": { "x": 80, "y": 100, "w": 350, "h": 20 },
        "paragraphId": "p1",
        "chars": [ // Per-character coordinates of the body line (returned only at character-level granularity)
          { "char": "每", "index": 0, "bbox": { "x": 80, "y": 100, "w": 20, "h": 20 }, "isPunctuation": false },
          { "char": "个", "index": 1, "bbox": { "x": 105, "y": 100, "w": 20, "h": 20 }, "isPunctuation": false },
          // Remaining characters omitted...
          { "char": "，", "index": 24, "bbox": { "x": 425, "y": 100, "w": 10, "h": 20 }, "isPunctuation": true }
        ]
      },
      // Remaining lines follow the same structure (including the chars array)
    ]
  }
}
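
At character-level granularity each entry in `chars` carries an `x`/`y`/`w`/`h` box, which many drawing and cropping libraries expect as corner coordinates instead. The following is a minimal sketch of that conversion; `char_boxes` and `_is_true` are hypothetical helper names. The punctuation flag is parsed leniently as an assumption, because the field table types `isPunctuation` as a string while the sample shows JSON booleans.

```python
from typing import Any, Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def _is_true(flag: Any) -> bool:
    """Accept True, "true", "True", 1, or "1" as a positive flag."""
    return str(flag).strip().lower() in ("true", "1")


def char_boxes(line: Dict[str, Any], skip_punctuation: bool = True) -> List[Tuple[str, Box]]:
    """Return (char, corner box) pairs for one line of the word-level result."""
    boxes: List[Tuple[str, Box]] = []
    for entry in line.get("chars", []):
        if skip_punctuation and _is_true(entry.get("isPunctuation", False)):
            continue
        b = entry["bbox"]
        # Convert x/y/width/height into left/top/right/bottom corners.
        boxes.append((entry["char"], (b["x"], b["y"], b["x"] + b["w"], b["y"] + b["h"])))
    return boxes
```

Skipping punctuation is optional; it is often convenient when per-character feedback should target written characters only.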
    
No coordinates

"result": {
  "recognize_granularity": "none",
  "essayOverall": { /* same as above */ },
  "title": {
    "text": "我的童年" // no bbox, no chars
  },
  "content": {
    "paragraphs": [
      {
        "paragraphId": "p1",
        "text": "每个人都有难忘的童年。我的童年充满了欢声笑语，夏天会和小伙伴去河边捉鱼。",
        "sentences": [
          {
            "sentenceId": "s1-p1", 
            "text": "每个人都有难忘的童年。" },
          { 
            "sentenceId": "s2-p1", 
            "text": "我的童年充满了欢声笑语，夏天会和小伙伴去河边捉鱼。" }
        ]
      },
      {
        "paragraphId": "p2",
        "text": "冬天则围在火炉旁听奶奶讲故事。",
        "sentences": [ 
          { 
            "sentenceId": "s1-p2",
            "text": "冬天则围在火炉旁听奶奶讲故事。" } 
        ]
      }
    ]
    // no lines field
  }
}

Failed response example (for detailed error code descriptions, see API Documentation - Error Codes):

{
    "log_id": 1965712846932687146,
    "error_msg": "the input image is not a composition",
    "error_code": 216100
}
