Token counting for open-source models

How to obtain token lengths

Qianfan provides a token calculator; you can log in to the token calculator page to obtain the token length of text and images.

Token counting method for open-source models

  1. Taking deepseek-v3 as an example, download the two files related to token counting from the model's Hugging Face repository:

    • tokenizer.json
    • tokenizer_config.json
  2. Create a model_tokenizer.py file in the same directory as the two files downloaded above.
  3. The code of model_tokenizer.py is as follows:
# pip3 install transformers
# python3 model_tokenizer.py
import transformers

chat_tokenizer_dir = "./"

# Load the tokenizer from the directory containing tokenizer.json
# and tokenizer_config.json
tokenizer = transformers.AutoTokenizer.from_pretrained(
    chat_tokenizer_dir, trust_remote_code=True
)

text = "开源模型token计算说明"

# encode() returns the list of token ids for the text
result = tokenizer.encode(text)
print("ids:", result)
count = len(result)
print("token数量:", count)
  4. Run the model_tokenizer.py file; the output is as follows:
ids: [83649, 8842, 33912, 4339, 6977]
token数量: 5
  5. This shows that "开源模型token计算说明" has a token count of 5, and the id of each token is given; you can look up the character(s) corresponding to each id in the tokenizer.json file.
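The id-to-token lookup described in step 5 can also be sketched without transformers by reading the `"model" -> "vocab"` mapping inside tokenizer.json directly. The two-entry vocab below is a hypothetical stand-in for the real file's contents (real vocabularies contain over a hundred thousand entries), purely to illustrate the structure:

```python
import json

# In practice: vocab = json.load(open("tokenizer.json"))["model"]["vocab"].
# Here a hypothetical two-entry vocab stands in for the real file.
tokenizer_json = json.loads('{"model": {"vocab": {"hello": 100, "world": 101}}}')

# Invert the mapping so an id produced by tokenizer.encode() can be looked up.
id_to_token = {tid: tok for tok, tid in tokenizer_json["model"]["vocab"].items()}

for tid in [100, 101]:
    print(tid, "->", id_to_token[tid])
```

Looking up the real ids from the output above (83649, 8842, ...) in the deepseek-v3 tokenizer.json works the same way, only against a much larger vocab.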

Token counting for complex inputs

  1. When your input contains both multi-turn dialogue and tools definitions, computing the token length requires the chat_template defined in the tokenizer_config.json file.
  2. Taking qwen3-8b as an example, its input is as follows:
{
    "model": "qwen3-8b",
    "messages": [
        {
            "role": "user",
            "content": "查一下上海和北京现在的天气"
        }
    ],
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "天气查询工具",
            "parameters": {
                "properties": {
                    "location": {
                        "description": "地理位置,精确到区县级别",
                        "type": "string"
                    },
                    "time": {
                        "description": "时间,格式为YYYY-MM-DD",
                        "type": "string"
                    }
                },
                "type": "object"
            }
        }

    }],
    "stream": false,
    "enable_thinking":false,
    "tool_choice" : "auto",
    "tool_options" : {"thoughts_output" : true}
}
  3. After conversion by the chat_template, the input becomes the following structure:
<|im_start|>system
# Tools

You may call one or more functions to assist with the user query.

You are provided with function signatures within <tools></tools> XML tags:
<tools>
{"type": "function", "function": {"name": "get_current_weather", "description": "天气查询工具", "parameters": {"properties": {"location": {"description": "地理位置,精确到区县级别", "type": "string"}, "time": {"description": "时间,格式为YYYY-MM-DD", "type": "string"}}, "type": "object"}}}
</tools>

For each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:
<tool_call>
{"name": <function-name>, "arguments": <args-json-object>}
</tool_call><|im_end|>
<|im_start|>user
查一下上海和北京现在的天气<|im_end|>
<|im_start|>assistant
<think>

</think>
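The rendered text above is normally produced by executing the Jinja2 chat_template stored in tokenizer_config.json (for example via transformers' `tokenizer.apply_chat_template`). As a rough illustration of what that template does for this single-user-message request, here is a pure-Python sketch; it only reproduces the structure shown above for this one case, and the system preamble strings are copied from that rendered output:

```python
import json

def render_qwen3_prompt(messages, tools):
    """Pure-Python sketch of the qwen3 chat_template for one user turn.

    The real template is Jinja2 inside tokenizer_config.json and handles
    many more cases (assistant turns, tool results, thinking mode, ...).
    """
    parts = [
        "<|im_start|>system\n",
        "# Tools\n\n",
        "You may call one or more functions to assist with the user query.\n\n",
        "You are provided with function signatures within <tools></tools> XML tags:\n",
        "<tools>\n",
    ]
    # Each tool definition is serialized as one JSON object per line.
    for tool in tools:
        parts.append(json.dumps(tool, ensure_ascii=False) + "\n")
    parts.append(
        "</tools>\n\n"
        "For each function call, return a json object with function name and arguments "
        "within <tool_call></tool_call> XML tags:\n"
        "<tool_call>\n"
        '{"name": <function-name>, "arguments": <args-json-object>}\n'
        "</tool_call><|im_end|>\n"
    )
    # Conversation turns, each wrapped in <|im_start|>role ... <|im_end|>.
    for msg in messages:
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n")
    # Generation prompt plus an empty think block (enable_thinking = false).
    parts.append("<|im_start|>assistant\n<think>\n\n</think>")
    return "".join(parts)

messages = [{"role": "user", "content": "查一下上海和北京现在的天气"}]
tools = [{"type": "function", "function": {
    "name": "get_current_weather",
    "description": "天气查询工具",
    "parameters": {
        "properties": {
            "location": {"description": "地理位置,精确到区县级别", "type": "string"},
            "time": {"description": "时间,格式为YYYY-MM-DD", "type": "string"},
        },
        "type": "object",
    },
}}]

print(render_qwen3_prompt(messages, tools))
```

For real inputs, prefer `apply_chat_template` so that the model's own template, not a hand-written approximation, determines the rendered text.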
  4. The code to compute the token length of the above text is as follows:
# pip3 install transformers
# python3 model_tokenizer.py
import transformers

chat_tokenizer_dir = "./"

# Load the tokenizer from the directory containing tokenizer.json
# and tokenizer_config.json
tokenizer = transformers.AutoTokenizer.from_pretrained(
    chat_tokenizer_dir, trust_remote_code=True
)

text = """<|im_start|>system
# Tools

You may call one or more functions to assist with the user query.

You are provided with function signatures within <tools></tools> XML tags:
<tools>
{"type": "function", "function": {"name": "get_current_weather", "description": "天气查询工具", "parameters": {"properties": {"location": {"description": "地理位置,精确到区县级别", "type": "string"}, "time": {"description": "时间,格式为YYYY-MM-DD", "type": "string"}}, "type": "object"}}}
</tools>

For each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:
<tool_call>
{"name": <function-name>, "arguments": <args-json-object>}
</tool_call><|im_end|>
<|im_start|>user
查一下上海和北京现在的天气<|im_end|>
<|im_start|>assistant
<think>

</think>"""

result = tokenizer.encode(text)

print("ids:", result)
count = len(result)
print("token数量:", count)
  5. Run the model_tokenizer.py file; the output is as follows:
ids: [151644, 8948, 198, 2, 13852, 271, 2610, 1231, 1618, 825, 476, 803, 5746, 311, 7789, 448, 279, 1196, 3239, 382, 2610, 525, 3897, 448, 729, 32628, 2878, 366, 15918, 1472, 15918, 29, 11874, 9492, 510, 27, 15918, 397, 4913, 1313, 788, 330, 1688, 497, 330, 1688, 788, 5212, 606, 788, 330, 455, 11080, 69364, 497, 330, 4684, 788, 330, 104307, 51154, 102011, 497, 330, 13786, 788, 5212, 13193, 788, 5212, 2527, 788, 5212, 4684, 788, 330, 111692, 3837, 108639, 26939, 23836, 24342, 105972, 497, 330, 1313, 788, 330, 917, 14345, 330, 1678, 788, 5212, 4684, 788, 330, 20450, 3837, 68805, 17714, 28189, 18506, 40175, 497, 330, 1313, 788, 330, 917, 9207, 2137, 330, 1313, 788, 330, 1700, 30975, 532, 522, 15918, 1339, 2461, 1817, 729, 1618, 11, 470, 264, 2951, 1633, 448, 729, 829, 323, 5977, 2878, 220, 151657, 151658, 11874, 9492, 510, 151657, 198, 4913, 606, 788, 366, 1688, 11494, 8066, 330, 16370, 788, 366, 2116, 56080, 40432, 31296, 151658, 151645, 198, 151644, 872, 198, 32876, 100158, 100633, 33108, 68990, 104718, 104307, 151645, 198, 151644, 77091, 198, 151667, 271, 151668]
token数量: 181
  6. The user input token length is 181, which matches the token length returned by the model's inference result.

Special notes

  1. Token growth from built-in prompt templates: for some models (e.g. deepseek-r1), the chat_template does not define how tools are handled; such models implement function calling by splicing in a built-in prompt template. In this case a hidden prompt is sent to the model along with your input, so the token length returned by inference will be larger than your input's token length. This is normal behavior.
  2. Token growth from tool triggering: deepseek models support web search; once web search is triggered, the input token count grows accordingly. For billing details, refer to the web search (联网搜索) documentation.
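The overhead described in note 1 can be estimated by comparing the locally measured token count against the `prompt_tokens` value the inference service reports. The numbers below are purely hypothetical, to illustrate the arithmetic:

```python
# All numbers here are hypothetical, purely to illustrate the overhead.
user_input_tokens = 181     # token count measured locally with the tokenizer
billed_prompt_tokens = 230  # hypothetical prompt_tokens returned by the service
hidden_prompt_tokens = billed_prompt_tokens - user_input_tokens
print("hidden prompt overhead:", hidden_prompt_tokens)  # 49
```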