Token counting for open-source models

How to obtain token lengths

Qianfan provides a token calculator; you can log in to the token calculator page to obtain the token length of text and images.

Token counting method for open-source models

  1. Taking deepseek-v3 as an example, download the two files related to token counting from the model's Hugging Face repository:

    • tokenizer.json
    • tokenizer_config.json
  2. Create a model_tokenizer.py file in the same directory as the two files downloaded above.
  3. The code of model_tokenizer.py is as follows:
# pip3 install transformers
# python3 model_tokenizer.py
import transformers

chat_tokenizer_dir = "./"

# Load the tokenizer from the directory containing tokenizer.json
# and tokenizer_config.json
tokenizer = transformers.AutoTokenizer.from_pretrained(
    chat_tokenizer_dir, trust_remote_code=True
)

text = "开源模型token计算说明"

# encode() returns the list of token ids for the text
result = tokenizer.encode(text)
print("ids:", result)
count = len(result)
print("token数量:", count)
  4. Run the model_tokenizer.py file; the output is as follows:
ids: [83649, 8842, 33912, 4339, 6977]
token数量: 5
  5. This shows that "开源模型token计算说明" has a token count of 5, and the id of each token is given; you can look up the character(s) corresponding to each id in the tokenizer.json file.
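The id-to-token lookup described in step 5 can also be sketched without transformers by reading the `"model" -> "vocab"` mapping inside tokenizer.json directly. The two-entry vocab below is a hypothetical stand-in for the real file's contents (real vocabularies contain over a hundred thousand entries), purely to illustrate the structure:

```python
import json

# In practice: vocab = json.load(open("tokenizer.json"))["model"]["vocab"].
# Here a hypothetical two-entry vocab stands in for the real file.
tokenizer_json = json.loads('{"model": {"vocab": {"hello": 100, "world": 101}}}')

# Invert the mapping so an id produced by tokenizer.encode() can be looked up.
id_to_token = {tid: tok for tok, tid in tokenizer_json["model"]["vocab"].items()}

for tid in [100, 101]:
    print(tid, "->", id_to_token[tid])
```

Looking up the real ids from the output above (83649, 8842, ...) in the deepseek-v3 tokenizer.json works the same way, only against a much larger vocab.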

Token counting for complex inputs

  1. When your input contains both multi-turn dialogue and tools definitions, computing the token length requires the chat_template defined in the tokenizer_config.json file.
  2. Taking qwen3-8b as an example, its input is as follows:
{
    "model": "qwen3-8b",
    "messages": [
        {
            "role": "user",
            "content": "查一下上海和北京现在的天气"
        }
    ],
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "天气查询工具",
            "parameters": {
                "properties": {
                    "location": {
                        "description": "地理位置,精确到区县级别",
                        "type": "string"
                    },
                    "time": {
                        "description": "时间,格式为YYYY-MM-DD",
                        "type": "string"
                    }
                },
                "type": "object"
            }
        }

    }],
    "stream": false,
    "enable_thinking":false,
    "tool_choice" : "auto",
    "tool_options" : {"thoughts_output" : true}
}
  3. After conversion by the chat_template, the input becomes the following structure:
<|im_start|>system
# Tools

You may call one or more functions to assist with the user query.

You are provided with function signatures within <tools></tools> XML tags:
<tools>
{"type": "function", "function": {"name": "get_current_weather", "description": "天气查询工具", "parameters": {"properties": {"location": {"description": "地理位置,精确到区县级别", "type": "string"}, "time": {"description": "时间,格式为YYYY-MM-DD", "type": "string"}}, "type": "object"}}}
</tools>

For each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:
<tool_call>
{"name": <function-name>, "arguments": <args-json-object>}
</tool_call><|im_end|>
<|im_start|>user
查一下上海和北京现在的天气<|im_end|>
<|im_start|>assistant
<think>

</think>
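The rendered text above is normally produced by executing the Jinja2 chat_template stored in tokenizer_config.json (for example via transformers' `tokenizer.apply_chat_template`). As a rough illustration of what that template does for this single-user-message request, here is a pure-Python sketch; it only reproduces the structure shown above for this one case, and the system preamble strings are copied from that rendered output:

```python
import json

def render_qwen3_prompt(messages, tools):
    """Pure-Python sketch of the qwen3 chat_template for one user turn.

    The real template is Jinja2 inside tokenizer_config.json and handles
    many more cases (assistant turns, tool results, thinking mode, ...).
    """
    parts = [
        "<|im_start|>system\n",
        "# Tools\n\n",
        "You may call one or more functions to assist with the user query.\n\n",
        "You are provided with function signatures within <tools></tools> XML tags:\n",
        "<tools>\n",
    ]
    # Each tool definition is serialized as one JSON object per line.
    for tool in tools:
        parts.append(json.dumps(tool, ensure_ascii=False) + "\n")
    parts.append(
        "</tools>\n\n"
        "For each function call, return a json object with function name and arguments "
        "within <tool_call></tool_call> XML tags:\n"
        "<tool_call>\n"
        '{"name": <function-name>, "arguments": <args-json-object>}\n'
        "</tool_call><|im_end|>\n"
    )
    # Conversation turns, each wrapped in <|im_start|>role ... <|im_end|>.
    for msg in messages:
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n")
    # Generation prompt plus an empty think block (enable_thinking = false).
    parts.append("<|im_start|>assistant\n<think>\n\n</think>")
    return "".join(parts)

messages = [{"role": "user", "content": "查一下上海和北京现在的天气"}]
tools = [{"type": "function", "function": {
    "name": "get_current_weather",
    "description": "天气查询工具",
    "parameters": {
        "properties": {
            "location": {"description": "地理位置,精确到区县级别", "type": "string"},
            "time": {"description": "时间,格式为YYYY-MM-DD", "type": "string"},
        },
        "type": "object",
    },
}}]

print(render_qwen3_prompt(messages, tools))
```

For real inputs, prefer `apply_chat_template` so that the model's own template, not a hand-written approximation, determines the rendered text.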
  4. The code to compute the token length of the above text is as follows:
# pip3 install transformers
# python3 model_tokenizer.py
import transformers

chat_tokenizer_dir = "./"

# Load the tokenizer from the directory containing tokenizer.json
# and tokenizer_config.json
tokenizer = transformers.AutoTokenizer.from_pretrained(
    chat_tokenizer_dir, trust_remote_code=True
)

text = """<|im_start|>system
# Tools

You may call one or more functions to assist with the user query.

You are provided with function signatures within <tools></tools> XML tags:
<tools>
{"type": "function", "function": {"name": "get_current_weather", "description": "天气查询工具", "parameters": {"properties": {"location": {"description": "地理位置,精确到区县级别", "type": "string"}, "time": {"description": "时间,格式为YYYY-MM-DD", "type": "string"}}, "type": "object"}}}
</tools>

For each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:
<tool_call>
{"name": <function-name>, "arguments": <args-json-object>}
</tool_call><|im_end|>
<|im_start|>user
查一下上海和北京现在的天气<|im_end|>
<|im_start|>assistant
<think>

</think>"""

result = tokenizer.encode(text)

print("ids:", result)
count = len(result)
print("token数量:", count)
  5. Run the model_tokenizer.py file; the output is as follows:
ids: [151644, 8948, 198, 2, 13852, 271, 2610, 1231, 1618, 825, 476, 803, 5746, 311, 7789, 448, 279, 1196, 3239, 382, 2610, 525, 3897, 448, 729, 32628, 2878, 366, 15918, 1472, 15918, 29, 11874, 9492, 510, 27, 15918, 397, 4913, 1313, 788, 330, 1688, 497, 330, 1688, 788, 5212, 606, 788, 330, 455, 11080, 69364, 497, 330, 4684, 788, 330, 104307, 51154, 102011, 497, 330, 13786, 788, 5212, 13193, 788, 5212, 2527, 788, 5212, 4684, 788, 330, 111692, 3837, 108639, 26939, 23836, 24342, 105972, 497, 330, 1313, 788, 330, 917, 14345, 330, 1678, 788, 5212, 4684, 788, 330, 20450, 3837, 68805, 17714, 28189, 18506, 40175, 497, 330, 1313, 788, 330, 917, 9207, 2137, 330, 1313, 788, 330, 1700, 30975, 532, 522, 15918, 1339, 2461, 1817, 729, 1618, 11, 470, 264, 2951, 1633, 448, 729, 829, 323, 5977, 2878, 220, 151657, 151658, 11874, 9492, 510, 151657, 198, 4913, 606, 788, 366, 1688, 11494, 8066, 330, 16370, 788, 366, 2116, 56080, 40432, 31296, 151658, 151645, 198, 151644, 872, 198, 32876, 100158, 100633, 33108, 68990, 104718, 104307, 151645, 198, 151644, 77091, 198, 151667, 271, 151668]
token数量: 181
  6. The user input token length is 181, which matches the token length returned by the model's inference result.

Special notes

  1. Token growth from built-in prompt templates: for some models (e.g. deepseek-r1), the chat_template does not define how tools are handled; such models implement function calling by splicing in a built-in prompt template. In this case a hidden prompt is sent to the model along with your input, so the token length returned by inference will be larger than your input's token length. This is normal behavior.
  2. Token growth from tool triggering: deepseek models support web search; once web search is triggered, the input token count grows accordingly. For billing details, refer to the web search (联网搜索) documentation.
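The overhead described in note 1 can be estimated by comparing the locally measured token count against the `prompt_tokens` value the inference service reports. The numbers below are purely hypothetical, to illustrate the arithmetic:

```python
# All numbers here are hypothetical, purely to illustrate the overhead.
user_input_tokens = 181     # token count measured locally with the tokenizer
billed_prompt_tokens = 230  # hypothetical prompt_tokens returned by the service
hidden_prompt_tokens = billed_prompt_tokens - user_input_tokens
print("hidden prompt overhead:", hidden_prompt_tokens)  # 49
```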