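Model Support Reference

This document lists the models, training modes, and hyperparameters that are supported. Consult it for the supported values of each parameter before calling any of the model fine-tuning V2 APIs.

Chat and Text Continuation

SFT

ERNIE Series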

model trainMode parameterScale hyperParameterConfig
ERNIE-Lite-8K-0308 SFT FullFineTuning, LoRA, LoRA-GA · epoch: [1,50], default 1
· learningRate:
FullFineTuning: [0.0000001,0.01], default 0.00003, step 0.000001
LoRA, LoRA-GA: [0.000001,0.001], default 0.0003, step 0.000001
· maxSeqLen:
FullFineTuning, LoRA: single choice of 512, 1024, 2048, 4096, or 8192; default 4096
LoRA-GA: single choice of 4096 or 8192; default 4096
· loggingSteps: 1
· warmupRatio: [0.01,0.5], default 0.1, step 0.01
· weightDecay: [0.0001,0.1], default 0.01, step 0.0001
· globalBatchSize:
FullFineTuning: [1,10000], default 16, step 4
LoRA: [1,10000], default 16, step 8 (recommended step 4 when maxSeqLen=8192)
LoRA-GA: [1,10000], default 16, step 4 (recommended step 8 when maxSeqLen=8192)
· pseudoSamplingProb: [0,1], default 0, step 0.1
· checkpointSaveStrategy: single choice of step or epoch; default step
· seed: [1,2147483647], default 42
· lrSchedulerType: single choice of linear, cosine, polynomial, constant, or constant_with_warmup; default linear
· numCycles: [0.1,0.5], default 0.5, step 0.1
· lrEnd: [0.00000001,0.000001], default 0.0000001, step 0.00000001
· power: [1,3], default 1
· checkpointCount: [1,10], default 1
· saveStep: [1,50000], default 64; takes effect only when checkpointSaveStrategy=step
· validationStep: [0,1000000], default 16, step 1
· Early-stopping parameters:
earlyStopping: True or False, default False; takes effect only when checkpointSaveStrategy=step
earlyStopMetric: ValidationLoss; takes effect only when earlyStopping is True
earlyStoppingThreshold: [0,5], default 0.01, step 0.01; takes effect only when earlyStopping is True
earlyStoppingPatience: [1,50], default 3, step 1; takes effect only when earlyStopping is True
· LoRA only:
loraRank: single choice of 2, 4, or 8; default 8
loraAllLinear: single choice of True or False; default True
· LoRA-GA only:
loraRank: single choice of 8 or 64; default 64
loragaInitIters: [0,10000000], default 4
loragaStableGamma: [0,10000000], default 64
loragaGradientOffload: string, False or True; default False
loraAllLinear: single choice of True or False; default True
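The ranges in the ERNIE-Lite-8K-0308 row can be checked on the client side before a job is submitted. The following is a minimal sketch, not part of any official SDK: `check_config`, `LORA_LIMITS`, and `LORA_CHOICES` are hypothetical names, and only a handful of LoRA-mode parameters are covered. Step sizes are deliberately not enforced, because the table does not say which value the step grid is anchored to.

```python
# Hypothetical client-side check; values copied from the ERNIE-Lite-8K-0308 row (LoRA mode).
LORA_LIMITS = {
    "epoch": (1, 50),
    "learningRate": (0.000001, 0.001),
    "warmupRatio": (0.01, 0.5),
    "weightDecay": (0.0001, 0.1),
    "globalBatchSize": (1, 10000),
    "seed": (1, 2147483647),
}
LORA_CHOICES = {
    "maxSeqLen": {512, 1024, 2048, 4096, 8192},
    "checkpointSaveStrategy": {"step", "epoch"},
    "loraRank": {2, 4, 8},
}


def check_config(config):
    """Return a list of problems; an empty list means every checked value is allowed."""
    problems = []
    for name, value in config.items():
        if name in LORA_LIMITS:
            low, high = LORA_LIMITS[name]
            if not low <= value <= high:
                problems.append(f"{name}={value} is outside [{low}, {high}]")
        elif name in LORA_CHOICES:
            if value not in LORA_CHOICES[name]:
                problems.append(f"{name}={value} is not an allowed choice")
        # Parameters not listed above are skipped, not rejected.
    return problems


print(check_config({"epoch": 3, "learningRate": 0.0003, "maxSeqLen": 4096}))
print(check_config({"epoch": 51, "loraRank": 16}))
```

The same pattern extends to other rows by swapping in that row's ranges and choices.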
ERNIE-Lite-128K-0419 SFT FullFineTuning · epoch: [1,50], default 1
· maxSeqLen: single choice of 8192, 16384, 32768, 65536, or 131072; default 32768
· warmupRatio: [0.01,0.5], default 0.1, step 0.01
· weightDecay: [0.0001,0.1], default 0.01, step 0.0001
· globalBatchSize: [1,10000], default 16, step 1
· pseudoSamplingProb: [0,1], default 0, step 0.1
· checkpointSaveStrategy: single choice of step or epoch; default step
· seed: [1,2147483647], default 42
· lrSchedulerType: single choice of linear, cosine, polynomial, constant, or constant_with_warmup; default linear
· numCycles: [0.1,0.5], default 0.5, step 0.1
· lrEnd: [0.00000001,0.000001], default 0.0000001, step 0.00000001
· power: [1,3], default 1
· checkpointCount: [1,10], default 1
· saveStep: [1,50000], default 64; takes effect only when checkpointSaveStrategy=step
· validationStep: [0,1000000], default 16, step 1
· Early-stopping parameters:
earlyStopping: True or False, default False; takes effect only when checkpointSaveStrategy=step
earlyStopMetric: ValidationLoss; takes effect only when earlyStopping is True
earlyStoppingThreshold: [0,5], default 0.01, step 0.01; takes effect only when earlyStopping is True
earlyStoppingPatience: [1,50], default 3, step 1; takes effect only when earlyStopping is True
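The table gives ranges for the early-stopping parameters but does not spell out how earlyStoppingThreshold and earlyStoppingPatience interact. The sketch below shows one common convention, offered as an assumption and not as the platform's documented behaviour: training stops once ValidationLoss has failed to improve by more than the threshold for `patience` consecutive evaluations. The function name `should_stop` is hypothetical.

```python
def should_stop(val_losses, threshold=0.01, patience=3):
    """Assumed rule: stop after `patience` evaluations in a row without an
    improvement in validation loss larger than `threshold`."""
    best = float("inf")
    stale = 0  # consecutive evaluations without a sufficient improvement
    for loss in val_losses:
        if best - loss > threshold:
            best = loss
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return True
    return False


# Loss plateaus after the second evaluation, so the assumed rule fires.
print(should_stop([1.0, 0.9, 0.895, 0.894, 0.893]))
# Loss keeps dropping by 0.1 per evaluation, so training continues.
print(should_stop([1.0, 0.9, 0.8, 0.7]))
```

Under this reading, a larger earlyStoppingThreshold makes stopping more eager, and a larger earlyStoppingPatience makes it more tolerant of noisy validation loss.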
ERNIE-Lite-128K-0722 SFT FullFineTuning, LoRA, LoRA-GA · epoch: [1,50], default 1
· learningRate:
FullFineTuning: [0.0000001,0.01], default 0.00003, step 0.000001
LoRA, LoRA-GA: [0.000001,0.001], default 0.0003, step 0.000001
· maxSeqLen: single choice of 8192, 16384, 32768, 65536, or 131072; default 32768
· loggingSteps: 1
· warmupRatio: [0.01,0.5], default 0.1, step 0.01
· weightDecay: [0.0001,0.1], default 0.01, step 0.0001
· globalBatchSize: [1,10000], default 16, step 1
· pseudoSamplingProb: [0,1], default 0, step 0.1
· checkpointSaveStrategy: single choice of step or epoch; default step
· seed: [1,2147483647], default 42
· lrSchedulerType: single choice of linear, cosine, polynomial, constant, or constant_with_warmup; default linear
· numCycles: [0.1,0.5], default 0.5, step 0.1
· lrEnd: [0.00000001,0.000001], default 0.0000001, step 0.00000001
· power: [1,3], default 1
· checkpointCount: [1,10], default 1
· saveStep: [1,50000], default 64; takes effect only when checkpointSaveStrategy=step
· validationStep: [0,1000000], default 16, step 1
· Early-stopping parameters:
earlyStopping: True or False, default False; takes effect only when checkpointSaveStrategy=step
earlyStopMetric: ValidationLoss; takes effect only when earlyStopping is True
earlyStoppingThreshold: [0,5], default 0.01, step 0.01; takes effect only when earlyStopping is True
earlyStoppingPatience: [1,50], default 3, step 1; takes effect only when earlyStopping is True
· LoRA-GA only:
loragaInitIters: [0,10000000], default 4
loragaStableGamma: [0,10000000], default 64
loragaGradientOffload: string, False or True; default False
ERNIE-Speed-8K SFT FullFineTuning、LoRA、LoRA-GA · epoch:[1,50],默认值1
· learningRate:
FullFineTuning:[0.0000001,0.01],默认值0.00003,步长0.000001
LoRA、LoRA-GA:[0.000001,0.001],默认值0.0003,步长0.000001
· maxSeqLen:
FullFineTuning、LoRA:单选,512 或 1024 或 2048 或 4096 或 8192,默认值4096
LoRA-GA:单选,4096 或 8192,默认值4096
· loggingSteps:1
· warmupRatio:[0.01,0.5],默认值0.1,步长0.01
· globalBatchSize:
FullFineTuning:[1,10000],默认值16,步长2(当maxSeqLen=8192时,推荐步长1)
LoRA、LoRA-GA:[1,10000],默认值16,步长4(当maxSeqLen=8192时,推荐步长2)
· pseudoSamplingProb:[0,1],默认值0,步长0.1
· checkpointSaveStrategy: 单选,step 或 epoch,默认step
· seed:[1,2147483647],默认值42
· lrSchedulerType: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值linear
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001,0.000001],默认值0.0000001,步长0.00000001
· power:[1,3],默认值1
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认为64,当参数checkpointSaveStrategy=step时,此参数有效
· validationStep:[0,1000000],默认值16,步长1
· 早停策略相关参数:
earlyStopping:True 或 False,默认False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:ValidationLoss,当参数earlyStopping为True时,此参数有效
earlyStoppingThreshold:[0,5] ,默认值 0.01,步长0.01,当参数earlyStopping为True时,此参数有效
earlyStoppingPatience:[1,50],默认值 3,步长1,当参数earlyStopping为True时,此参数有效
· 仅LoRA支持:
loraRank: 单选,8 或 64,默认为64
loraAllLinear: 单选,True 或 False,默认为True
· 仅LoRA-GA支持:
loraRank:单选,8 或 64,默认为64
loragaInitIters:[0,10000000],默认值4
loragaStableGamma:[0,10000000],默认值64
loragaGradientOffload:字符串,False 或 True,默认值False
loraAllLinear:单选,True 或 False,默认为True
ERNIE-Character-8K-0321 SFT FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:
FullFineTuning:[0.0000001,0.01],默认值0.00003,步长0.000001
LoRA:[0.000001,0.001],默认值0.0003,步长0.000001
· maxSeqLen: 单选,512 或 1024 或 2048 或 4096 或 8192,默认值4096
· loggingSteps:1
· warmupRatio:[0.01,0.5],默认值0.1,步长0.01
· weightDecay:[0.0001,0.1],默认值0.01,步长0.0001
· globalBatchSize:
FullFineTuning:[1,10000],默认值16,步长1(当maxSeqLen=4096时,推荐步长4;当maxSeqLen=8192时,推荐步长2)
LoRA:[1,10000],默认值16,步长2(当maxSeqLen=4096时,推荐步长2;当maxSeqLen=8192时,推荐步长1)
· pseudoSamplingProb:[0,1],默认值0,步长0.1
· checkpointSaveStrategy: 单选,step 或 epoch,默认step
· seed:[1,2147483647],默认值42
· lrSchedulerType: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值linear
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001,0.000001],默认值0.0000001,步长0.00000001
· power:[1,3],默认值1
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认为64,当参数checkpointSaveStrategy=step时,此参数有效
· validationStep:[0,1000000],默认值16,步长1
· 早停策略相关参数:
earlyStopping:True 或 False,默认False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:ValidationLoss,当参数earlyStopping为True时,此参数有效
earlyStoppingThreshold:[0,5] ,默认值 0.01,步长0.01,当参数earlyStopping为True时,此参数有效
earlyStoppingPatience:[1,50],默认值 3,步长1,当参数earlyStopping为True时,此参数有效
· 仅LoRA支持:
loraRank: 单选,2 或 4 或 8,默认为8
loraAllLinear: 单选,True 或 False,默认为True
ERNIE-Tiny-8K SFT FullFineTuning、LoRA、LoRA-GA · epoch:[1,50],默认值1
· learningRate:
FullFineTuning:[0.0000001,0.01],默认值0.00003,步长0.000001
LoRA、LoRA-GA:[0.000001,0.001],默认值0.0003,步长0.000001
· maxSeqLen:
FullFineTuning、LoRA:单选,512 或 1024 或 2048 或 4096 或 8192,默认值4096
LoRA-GA:单选,4096 或 8192,默认值4096
· loggingSteps:1
· warmupRatio:[0.01,0.5],默认值0.1,步长0.01
· weightDecay:[0.0001,0.1],默认值0.01,步长0.0001
· globalBatchSize:
FullFineTuning:[1,10000],默认值32,步长16(当maxSeqLen=8192时,推荐步长8)
LoRA、LoRA-GA:[1,10000],默认值32,步长16
· pseudoSamplingProb:[0,1],默认值0,步长0.1
· checkpointSaveStrategy:单选,step或epoch,默认值step
· seed:[1,2147483647],默认值42
· lrSchedulerType: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值linear
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001,0.000001],默认值0.0000001,步长0.00000001
· power:[1,3],默认值1
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认为64,当参数checkpointSaveStrategy=step时,此参数有效
· validationStep:[0,1000000],默认值16,步长1
· 早停策略相关参数:
earlyStopping:True 或 False,默认False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:ValidationLoss,当参数earlyStopping为True时,此参数有效
earlyStoppingThreshold:[0,5] ,默认值 0.01,步长0.01,当参数earlyStopping为True时,此参数有效
earlyStoppingPatience:[1,50],默认值 3,步长1,当参数earlyStopping为True时,此参数有效
· 仅LoRA支持:
loraRank: 单选,2 或 4 或 8,默认为8
loraAllLinear: 单选,True 或 False,默认为True
· 仅LoRA-GA支持:
loraRank:单选,8 或 64,默认为64
loragaInitIters:[0,10000000],默认值4
loragaStableGamma:[0,10000000],默认值64
loragaGradientOffload:字符串,False 或 True,默认值False
loraAllLinear: single choice of True or False; default True
ERNIE-4.0-Turbo-8K SFT LoRA · epoch:[1,50],默认值1
· learningRate:[0.0000001,0.001],默认0.000001,步长0.000001
· maxSeqLen: 单选,512 或 1024 或 2048 或 4096 或 8192,默认值4096
· loggingSteps:1
· warmupRatio:[0.01,0.5],默认值0.1,步长0.01
· weightDecay:[0.0001,0.1],默认值0.01,步长0.0001
· globalBatchSize:[1,10000],默认值18,步长2(当maxSeqLen=8192时,推荐步长1)
· pseudoSamplingProb:[0,1],默认值0,步长0.1
· checkpointSaveStrategy: 单选,step 或 epoch,默认step
· seed:[1,2147483647],默认值42
· lrSchedulerType:单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认constant
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001,0.000001],默认值0.0000001,步长0.00000001
· power:[1,3],默认值1
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认为64,当参数checkpointSaveStrategy=step时,此参数有效
· validationStep:[0,1000000],默认值16,步长1
· 早停策略相关参数:
earlyStopping:True 或 False,默认False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:ValidationLoss,当参数earlyStopping为True时,此参数有效
earlyStoppingThreshold:[0,5] ,默认值 0.01,步长0.01,当参数earlyStopping为True时,此参数有效
earlyStoppingPatience:[1,50],默认值 3,步长1,当参数earlyStopping为True时,此参数有效
· loraRank:单选,2、4、8、16、32 或 64,默认为64
· loraAllLinear: 单选,True 或 False,默认为True
ERNIE-Speed-Pro-128K SFT FullFineTuning、LoRA、LoRA-GA · epoch:[1,50],默认值1
· learningRate:
FullFineTuning:[0.0000001,0.01],默认值0.00003,步长0.000001
LoRA、LoRA-GA:[0.000001,0.001],默认值0.0003,步长0.000001
· maxSeqLen:
FullFineTuning、LoRA:单选,512 或 1024 或 2048 或 4096 或 8192,默认值4096
LoRA-GA:单选,4096 或 8192,默认值8192
· loggingSteps:1
· warmupRatio:[0.01,0.5],默认值0.1,步长0.01
· weightDecay:[0.0001,0.1],默认值0.01,步长0.0001
· globalBatchSize:
FullFineTuning:[1,10000],默认值16,步长2(当maxSeqLen=131072时,推荐步长1)
LoRA:[1,10000],默认值16,步长4(当maxSeqLen=131072时,推荐步长1)
· pseudoSamplingProb:[0,1],默认值0,步长0.1
· checkpointSaveStrategy:单选,step或epoch,默认值step
· seed:[1,2147483647],默认值42
· lrSchedulerType: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值linear
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001,0.000001],默认值0.0000001,步长0.00000001
· power:[1,3],默认值1
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认为64,当参数checkpointSaveStrategy=step时,此参数有效
· validationStep:[0,1000000],默认值16,步长1
· 早停策略相关参数:
earlyStopping:True 或 False,默认False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:ValidationLoss,当参数earlyStopping为True时,此参数有效
earlyStoppingThreshold:[0,5] ,默认值 0.01,步长0.01,当参数earlyStopping为True时,此参数有效
earlyStoppingPatience:[1,50],默认值 3,步长1,当参数earlyStopping为True时,此参数有效
· 仅LoRA支持:
loraRank: single choice of 2, 4, or 8; default 8
loraAllLinear: 单选,True 或 False,默认为True
· 仅LoRA-GA支持:
loraRank:单选,8 或 64,默认为64
loragaInitIters:[0,10000000],默认值4
loragaStableGamma:[0,10000000],默认值64
loragaGradientOffload:字符串,False 或 True,默认值False
loraAllLinear:单选,True 或 False,默认为True
ERNIE-Tiny-128K-0929 SFT FullFineTuning、LoRA、LoRA-GA · epoch:[1,50],默认值1
· learningRate:
FullFineTuning:[0.0000001,0.01],默认值0.00003,步长0.000001
LoRA、LoRA-GA:[0.000001,0.001],默认值0.0003,步长0.000001
· maxSeqLen: 单选,8192 或 16384 或 32768 或 65536 或 131072,默认值32768
· loggingSteps:1
· warmupRatio:[0.01,0.5],默认值0.1,步长0.01
· weightDecay:[0.0001,0.1],默认值0.01,步长0.0001
· globalBatchSize:
FullFineTuning、LoRA:[1,10000],默认值16,步长4(当maxSeqLen=65536时,推荐步长2,当maxSeqLen=131072时,推荐步长8)
LoRA-GA:[1,10000],默认值16,步长4(当maxSeqLen=131072时,推荐步长1)
· pseudoSamplingProb:[0,1],默认值0,步长0.1
· checkpointSaveStrategy:单选,step或epoch,默认值step
· seed:[1,2147483647],默认值42
· lrSchedulerType: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值linear
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001,0.000001],默认值0.0000001,步长0.00000001
· power:[1,3],默认值1
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认为64,当参数checkpointSaveStrategy=step时,此参数有效
· validationStep:[0,1000000],默认值16,步长1
· 早停策略相关参数:
earlyStopping:True 或 False,默认False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:ValidationLoss,当参数earlyStopping为True时,此参数有效
earlyStoppingThreshold:[0,5] ,默认值 0.01,步长0.01,当参数earlyStopping为True时,此参数有效
earlyStoppingPatience:[1,50],默认值 3,步长1,当参数earlyStopping为True时,此参数有效
· 仅LoRA支持:
loraRank: 单选,2 或 4 或 8,默认为8
loraAllLinear: 单选,True 或 False,默认为True
· 仅LoRA-GA支持:
loraRank:单选,8 或 64,默认为64
loragaInitIters:[0,10000000],默认值4
loragaStableGamma:[0,10000000],默认值64
loragaGradientOffload:字符串,False 或 True,默认值False
loraAllLinear:单选,True 或 False,默认为True
ERNIE-3.5-8K SFT FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:
FullFineTuning:[0.0000001,0.01],默认0.00003,步长0.000001
LoRA:[0.0000001,0.001],默认0.0003,步长0.000001
· maxSeqLen: 单选,512 或 1024 或 2048 或 4096 或 8192,默认值 4096
· globalBatchSize:
FullFineTuning:[1,10000],默认64,步长1
LoRA:[1,10000],默认64,步长1
· loggingSteps:1
· warmupRatio:[0.01,0.5],默认值0.1,步长0.01
· weightDecay:[0.0001,0.1],默认值0.01,步长0.0001
· pseudoSamplingProb:[0,1],默认值0,步长0.1
· checkpointSaveStrategy: 单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64,当参数checkpointSaveStrategy=step时,此参数有效
· seed:[1,2147483647],默认值42
· lrSchedulerType:
FullFineTuning:单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值linear
LoRA:单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值constant
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001,0.000001],默认值0.0000001,步长0.00000001
· power:[1,3],默认值1
· validationStep:[0,1000000],默认值16,步长1
· 早停策略相关参数:
earlyStopping:True 或 False,默认False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:ValidationLoss,当参数earlyStopping为True时,此参数有效
earlyStoppingThreshold:[0,5] ,默认值 0.01,步长0.01,当参数earlyStopping为True时,此参数有效
earlyStoppingPatience:[1,50],默认值 3,步长1,当参数earlyStopping为True时,此参数有效
· 仅LoRA支持:
loraRank: 单选,2 或 4 或 8 或 16 或 32 或 64,默认为64
loraAllLinear: 单选,True 或 False,默认为True
ERNIE-Character-Fiction-8K-1028 SFT FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:
FullFineTuning:[0.0000001,0.01],默认值0.00003
LoRA:[0.000001,0.001],默认值0.0003
· maxSeqLen: 单选,512 或 1024 或 2048 或 4096 或 8192,默认值4096
· loggingSteps:1
· warmupRatio:[0.01,0.5],默认值0.1,步长0.01
· weightDecay:[0.0001,0.1],默认值0.01,步长0.0001
· globalBatchSize:
FullFineTuning:[1,10000],默认值2,步长1(当maxSeqLen=4096时,推荐步长2)
LoRA:[1,10000],默认值4,步长1(当maxSeqLen=4096时,推荐步长4;当maxSeqLen=8192时,推荐步长2)
· pseudoSamplingProb:[0,1],默认值0,步长0.1
· checkpointSaveStrategy:单选,step或epoch,默认值step
· seed:[1,2147483647],默认值42
· lrSchedulerType: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值linear
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001,0.000001],默认值0.0000001,步长0.00000001
· power:[1,3],默认值1
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认为64,当参数checkpointSaveStrategy=step时,此参数有效
· validationStep:[0,1000000],默认值16,步长1
· 早停策略相关参数:
earlyStopping:True 或 False,默认False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:ValidationLoss,当参数earlyStopping为True时,此参数有效
earlyStoppingThreshold:[0,5] ,默认值 0.01,步长0.01,当参数earlyStopping为True时,此参数有效
earlyStoppingPatience:[1,50],默认值 3,步长1,当参数earlyStopping为True时,此参数有效
· 仅LoRA支持:
loraRank: 单选,2 或 4 或 8,默认值8
loraAllLinear: 单选,True 或 False,默认为True
ERNIE-Code-3-128K SFT FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:
FullFineTuning:[0.0000001,0.01],默认值0.00003
LoRA:[0.000001,0.001],默认值0.0003
· maxSeqLen: single choice of 8192, 16384, 32768, 65536, or 131072; default 32768
· loggingSteps:1
· warmupRatio:[0.01,0.5],默认值0.1,步长0.01
· weightDecay:[0.0001,0.1],默认值0.01,步长0.0001
· globalBatchSize:[1,10000],默认值16,步长1
· checkpointSaveStrategy:单选,step或epoch,默认值step
· seed:[1,2147483647],默认值42
· lrSchedulerType: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值linear
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001,0.000001],默认值0.0000001,步长0.00000001
· power:[1,3],默认值1
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认为64,当参数checkpointSaveStrategy=step时,此参数有效
· validationStep:[0,1000000],默认值16,步长1
· 早停策略相关参数:
earlyStopping:True 或 False,默认False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:ValidationLoss,当参数earlyStopping为True时,此参数有效
earlyStoppingThreshold:[0,5] ,默认值 0.01,步长0.01,当参数earlyStopping为True时,此参数有效
earlyStoppingPatience:[1,50],默认值 3,步长1,当参数earlyStopping为True时,此参数有效
· tensorParallelDegree:[1,8],默认值8
· 仅LoRA支持:
loraRank: 单选,8 或 64,默认值64
loraAllLinear: 单选,True 或 False,默认为True
Qianfan-Sug SFT FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:
FullFineTuning:[0.0000001,0.01],默认值0.00003,步长0.000001
LoRA:[0.000001,0.001],默认值0.0003,步长0.000001
· maxSeqLen: single choice of 512, 1024, 2048, 4096, or 8192; default 4096
· loggingSteps:1
· warmupRatio:[0.01,0.5],默认值0.1,步长0.01
· weightDecay:[0.0001,0.1],默认值0.01,步长0.0001
· globalBatchSize:
FullFineTuning:[1,10000],默认值32,步长16(当maxSeqLen=8192时,推荐步长8)
LoRA:[1,10000],默认值32, 步长16
· pseudoSamplingProb:[0,1],默认值0,步长0.1
· checkpointSaveStrategy:单选,step或epoch,默认值step
· seed:[1,2147483647],默认值42
· lrSchedulerType: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值linear
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001,0.000001],默认值0.0000001,步长0.00000001
· power:[1,3],默认值1
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认为64,当参数checkpointSaveStrategy=step时,此参数有效
· validationStep:[0,1000000],默认值16,步长1
· 早停策略相关参数:
earlyStopping:True 或 False,默认False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:ValidationLoss,当参数earlyStopping为True时,此参数有效
earlyStoppingThreshold:[0,5] ,默认值 0.01,步长0.01,当参数earlyStopping为True时,此参数有效
earlyStoppingPatience:[1,50],默认值 3,步长1,当参数earlyStopping为True时,此参数有效
· 仅LoRA支持:
loraRank: 单选,2 或 4 或 8,默认为8
loraAllLinear: 单选,True 或 False,默认为True
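The ERNIE rows above list lrSchedulerType together with numCycles, lrEnd, and power, without saying how these shape the learning rate. The sketch below assumes the usual warmup-then-decay formulas, in which numCycles applies to the cosine schedule and lrEnd and power apply to the polynomial schedule (the same shape as the like-named schedule helpers in Hugging Face Transformers). The source does not confirm that the platform uses these exact formulas, and both function names are hypothetical.

```python
import math


def lr_multiplier_cosine(step, total_steps, warmup_ratio=0.1, num_cycles=0.5):
    """Factor applied to learningRate: linear warmup, then cosine decay (assumed form)."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return max(0.0, 0.5 * (1.0 + math.cos(math.pi * num_cycles * 2.0 * progress)))


def lr_polynomial(step, total_steps, lr_init, warmup_ratio=0.1, lr_end=1e-7, power=1.0):
    """Learning rate: linear warmup, then polynomial decay down to lr_end (assumed form)."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return lr_init * step / max(1, warmup_steps)
    if step >= total_steps:
        return lr_end
    remaining = 1.0 - (step - warmup_steps) / (total_steps - warmup_steps)
    return (lr_init - lr_end) * remaining ** power + lr_end


# Sample both schedules over a 100-step run with the table defaults.
for s in (0, 10, 55, 100):
    print(s, lr_multiplier_cosine(s, 100), lr_polynomial(s, 100, lr_init=0.0003))
```

With the default numCycles of 0.5 the cosine factor falls from 1 to 0 exactly once over the run, and with the default power of 1 the polynomial schedule reduces to a straight line from learningRate down to lrEnd.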

Open-Source Series

model trainMode parameterScale hyperParameterConfig
Meta-Llama-3.1-8B SFT FullFineTuning · epoch:[1,50],默认值1
· learningRate:[0.0000000001,0.0002],默认值0.000001,步长0.000001
· validationStep:[0,1000000],默认值16,步长1
· batchSize:[1,4],默认值1
· Packing:字符串,true 或 false 或 auto,默认值auto
· schedulerName: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值cosine
· warmupRatio:[0.01,0.1],默认值0.03,步长0.001
· weightDecay:[0.001,1],默认值0.01,步长0.001
· maxSeqLen: 单选,512 或 1024 或 2048 或 4096 或 8192,默认值4096
· checkpointSaveStrategy:单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep:当参数checkpointSaveStrategy=step时,此参数有效
FullFineTuning:[64,4096],默认值64
LoRA:[64,4096],默认值256
· 仅LoRA支持:
loraRank:单选,8 或 16 或 32 或 64,默认值32
loraAlpha:单选,8 或 16 或 32 或 64,默认值32
loraDropout:[0.01,0.5],默认值0.1,步长0.001
Meta-Llama-3-8B SFT FullFineTuning · epoch:[1,50],默认值1
· learningRate:[0.0000000001,0.0002],默认值0.000001,步长0.000001
· batchSize:[1,2],默认值1
· checkpointSaveStrategy:单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep:[64,4096],默认值256,当参数checkpointSaveStrategy=step时,此参数有效
· validationStep:[0,1000000],默认值16, 步长1
Meta-Llama-3.2-1B-128K SFT FullFineTuning · epoch: [1,50], default 1
· learningRate: [0.0000000001,0.0002], default 0.000001, step 0.000001
· validationStep: [0,1000000], default 16, step 1
· Packing: string, true, false, or auto; default auto
· schedulerName: single choice of linear, cosine, polynomial, constant, or constant_with_warmup; default cosine
· warmupRatio: [0.01,0.1], default 0.03, step 0.001
· weightDecay: [0.001,1], default 0.01, step 0.001
· maxSeqLen: single choice of 8192, 16384, 32768, 65536, or 131072; default 8192
· batchSize: [1,N], default 1, where the upper bound N depends on maxSeqLen as follows:
maxSeqLen = 131072: N=1
maxSeqLen = 65536: N=2
maxSeqLen = 32768: N=4
maxSeqLen = 16384: N=8
maxSeqLen = 8192: N=16
· checkpointSaveStrategy: single choice of step or epoch; default step
· checkpointCount: [1,10], default 1
· saveStep: [64,4096], default 64; takes effect only when checkpointSaveStrategy=step
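The relationship between maxSeqLen and the batchSize upper bound N in the Meta-Llama-3.2-1B-128K row can be expressed as a small lookup. This is an illustrative sketch; `MAX_BATCH_SIZE`, `max_batch_size`, and `clamp_batch_size` are hypothetical names, not part of any API.

```python
# Upper bound N for batchSize, keyed by maxSeqLen (Meta-Llama-3.2-1B-128K row).
MAX_BATCH_SIZE = {131072: 1, 65536: 2, 32768: 4, 16384: 8, 8192: 16}


def max_batch_size(max_seq_len):
    """Return N for a supported maxSeqLen, or raise ValueError for any other value."""
    try:
        return MAX_BATCH_SIZE[max_seq_len]
    except KeyError:
        raise ValueError(f"unsupported maxSeqLen: {max_seq_len}") from None


def clamp_batch_size(requested, max_seq_len):
    """Pull a requested batchSize into the allowed range [1, N]."""
    return max(1, min(requested, max_batch_size(max_seq_len)))


print(clamp_batch_size(8, 32768))
print(clamp_batch_size(8, 8192))
```

In every listed case N multiplied by maxSeqLen equals 131072, so the bound amounts to a fixed budget of tokens per batch.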
Qianfan-Chinese-Llama-2-1.3B SFT FullFineTuning · epoch:[1,50],默认值1
· learningRate:[0.0000000001,0.0002],默认值0.000001,步长0.000001
· batchSize:[1,4],默认值1
· Packing:字符串,true 或 false 或 auto,默认值auto
· schedulerName: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值cosine
· warmupRatio:[0.01,0.1],默认值0.03,步长0.001
· weightDecay:[0.001,1],默认值0.01,步长0.001
· maxSeqLen: 单选,512 或 1024 或 2048 或 4096,默认值4096
· checkpointSaveStrategy:单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep:[64,4096],默认值256,当参数checkpointSaveStrategy=step时,此参数有效
· validationStep:[0,1000000],默认值16,步长1
Qianfan-Chinese-Llama-2-7B SFT FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:[0.0000000001,0.0002],默认值0.000001,步长0.000001
· batchSize:[1,8],默认值1
· Packing:字符串,true 或 false 或 auto,默认值auto
· schedulerName: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值cosine
· warmupRatio:[0.01,0.1],默认值0.03,步长0.001
· weightDecay:[0.001,1],默认值0.01,步长0.001
· maxSeqLen: 单选,512 或 1024 或 2048 或 4096,默认值4096
· checkpointSaveStrategy:单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep:[64,4096],默认值256,当参数checkpointSaveStrategy=step时,此参数有效
· validationStep:[0,1000000],默认值16,步长1
· 仅LoRA支持:
loraRank:单选,8 或 16 或 32 或 64,默认值32
loraAlpha:单选,8 或 16 或 32 或 64,默认值32
loraDropout:[0.01,0.5],默认值0.1,步长0.001
Qianfan-Chinese-Llama-2-7B-32K SFT FullFineTuning、LoRA · epoch:[1,50],默认值3
· learningRate:[0.0000000001,0.0002],默认值0.000001, 步长0.000001
· batchSize:1
· Packing:字符串,true 或 false 或 auto,默认值auto
· schedulerName: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值cosine
· warmupRatio:[0.01,0.1],默认值0.03,步长0.001
· weightDecay:[0.001,1],默认值0.01,步长0.001
· maxSeqLen:单选,4096 或 8192 或 16384 或 32768,默认值32768
· checkpointSaveStrategy:单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep:[64,4096],默认值64
· validationStep:[0,1000000],默认值16,步长1
· saveStep:[64,4096],默认值256,当参数checkpointSaveStrategy=step时,此参数有效
· 仅LoRA支持:
loraRank:单选,8 或 16 或 32 或 64,默认值32
loraAlpha:单选,8 或 16 或 32 或 64,默认值32
loraDropout:[0.01,0.5],默认值0.1,步长0.001
Qianfan-Chinese-Llama-2-13B-v1 SFT FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:[0.0000000001,0.0002],默认值0.000001,步长0.000001
· batchSize:[1,8],默认1
· Packing:字符串,true 或 false 或 auto,默认值auto
· schedulerName: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值cosine
· warmupRatio:[0.01,0.1],默认值0.03,步长0.001
· weightDecay:[0.001,1],默认值0.01,步长0.001
· maxSeqLen: 单选,512 或 1024 或 2048 或 4096,默认值4096
· checkpointSaveStrategy:单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep:[64,4096],默认值256,当参数checkpointSaveStrategy=step时,此参数有效
· validationStep:[0,1000000],默认值16,步长1
· 仅LoRA支持:
loraRank:单选,8 或 16 或 32 或 64,默认值32
loraAlpha:单选,8 或 16 或 32 或 64,默认值32
loraDropout:[0.01,0.5],默认值0.1,步长0.001
Qianfan-Chinese-Llama-2-13B-v2 SFT FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:[0.0000000001,0.0002],默认值0.000001,步长0.000001
· batchSize:[1,8],默认值1
· Packing:字符串,true 或 false 或 auto,默认值auto
· schedulerName: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值cosine
· warmupRatio:[0.01,0.1],默认值0.03,步长0.001
· weightDecay:[0.001,1],默认值0.01,步长0.001
· maxSeqLen: 单选,512 或 1024 或 2048 或 4096,默认值4096
· checkpointSaveStrategy:单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep:[64,4096],默认值256,当参数checkpointSaveStrategy=step时,此参数有效
· validationStep:[0,1000000],默认值16,步长1
· 仅LoRA支持:
loraRank:单选,8 或 16 或 32 或 64,默认值32
loraAlpha:单选,8 或 16 或 32 或 64,默认值32
loraDropout:[0.01,0.5],默认值0.1,步长0.001
Mixtral-8x7B SFT FullFineTuning · epoch:[1,20],默认值1
· learningRate:[0.0000000001,0.0002],默认值0.00001,步长0.000001
· batchSize:[1,4],默认值1
· Packing:字符串,true 或 false 或 auto,默认值auto
· schedulerName: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值cosine
· warmupRatio:[0.01,0.1],默认值0.03,步长0.001
· weightDecay:[0.001,1],默认值0.01,步长0.001
· maxSeqLen: 单选,512 或 1024 或 2048 或 4096,默认值4096
· checkpointSaveStrategy:单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep:[64,4096],默认值256,当参数checkpointSaveStrategy=step时,此参数有效
· validationStep:[0,1000000],默认值16,步长1
· 仅LoRA支持:
loraRank:单选,8 或 16 或 32 或 64,默认值32
loraAlpha:单选,8 或 16 或 32 或 64,默认值32
loraDropout:[0.01,0.5],默认值0.1,步长0.001
SQLCoder-7B SFT FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:[0.0000000001,0.0002],默认值0.000001,步长0.000001
· batchSize:[1,4],默认值1
· Packing:字符串,true 或 false 或 auto,默认值auto
· schedulerName: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值cosine
· warmupRatio:[0.01,0.1],默认值0.03,步长0.001
· weightDecay:[0.001,1],默认值0.01,步长0.001
· maxSeqLen: 单选,512 或 1024 或 2048 或 4096,默认值4096
· checkpointSaveStrategy:单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep:[64,4096],默认值256,当参数checkpointSaveStrategy=step时,此参数有效
· validationStep:[0,1000000],默认值16,步长1
· 仅LoRA支持:
loraRank:单选,8 或 16 或 32 或 64,默认值32
loraAlpha:单选,8 或 16 或 32 或 64,默认值32
loraDropout:[0.01,0.5],默认值0.1,步长0.001
ChatGLM2-6B-32K SFT FullFineTuning · epoch: [1,50], default 1
· maxSeqLen: single choice of 4096, 8192, 16384, or 32768; default 32768
· batchSize32k: 1; requires maxSeqLen=32768
· batchSize16k: [1,2], default 1; requires maxSeqLen=16384
· batchSize8k: [1,6], default 1; requires maxSeqLen=8192
· batchSize4k: [1,12], default 1; requires maxSeqLen=4096
· Packing: string, true, false, or auto; default auto
· learningRate: [0.0000000001,0.0002], default 0.000001, step 0.000001
· schedulerName: single choice of linear, cosine, polynomial, constant, or constant_with_warmup; default cosine
· warmupRatio: [0.01,0.1], default 0.03, step 0.001
· weightDecay: [0.001,1], default 0.01, step 0.001
· checkpointSaveStrategy: single choice of step or epoch; default step
· checkpointCount: [1,10], default 1
· validationStep: [0,1000000], default 16, step 1
· saveStep: [64,4096], default 256
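For ChatGLM2-6B-32K the name of the batch-size parameter itself changes with maxSeqLen. The sketch below picks the matching key and checks the value against that key's range. It assumes that only the key matching the selected maxSeqLen should be sent. The source states each key's precondition but not how non-matching keys are handled, and `batch_size_entry` is a hypothetical helper.

```python
# maxSeqLen -> (parameter name, upper bound), from the ChatGLM2-6B-32K row.
BATCH_PARAM = {
    32768: ("batchSize32k", 1),
    16384: ("batchSize16k", 2),
    8192: ("batchSize8k", 6),
    4096: ("batchSize4k", 12),
}


def batch_size_entry(max_seq_len, batch_size=1):
    """Build the maxSeqLen and batch-size fragment of a hyperparameter config."""
    if max_seq_len not in BATCH_PARAM:
        raise ValueError(f"unsupported maxSeqLen: {max_seq_len}")
    name, upper = BATCH_PARAM[max_seq_len]
    if not 1 <= batch_size <= upper:
        raise ValueError(f"{name} must be in [1, {upper}], got {batch_size}")
    return {"maxSeqLen": max_seq_len, name: batch_size}


print(batch_size_entry(8192, 4))
print(batch_size_entry(32768))
```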
ChatGLM2-6B SFT FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:[0.0000000001,0.0002],默认值0.000001,步长0.000001
· batchSize:[1,2],默认值1
· Packing:字符串,true 或 false 或 auto,默认值auto
· schedulerName: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值cosine
· warmupRatio:[0.01,0.1],默认值0.03,步长0.001
· weightDecay:[0.001,1],默认值0.01,步长0.001
· maxSeqLen: 单选,512 或 1024 或 2048 或 4096,默认值4096
· checkpointSaveStrategy:单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep:[64,4096],默认值256,当参数checkpointSaveStrategy=step时,此参数有效
· validationStep:[0,1000000],默认值16,步长1
· 仅LoRA支持:
loraRank:单选,8 或 16 或 32 或 64,默认值32
loraAlpha:单选,8 或 16 或 32 或 64,默认值32
loraDropout:[0.01,0.5],默认值0.1,步长0.001
ChatGLM3-6B SFT FullFineTuning、LoRA · epoch:
FullFineTuning:[1,50],默认值3
LoRA:[1,50],默认值1
· learningRate:[0.0000000001,0.0002],默认值0.000001,步长0.000001
· batchSize:16 或 32 或 64,默认值16
· Packing:字符串,true 或 false 或 auto,默认值auto
· schedulerName: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值cosine
· warmupRatio:[0.01,0.1],默认值0.03,步长0.001
· weightDecay:[0.001,1],默认值0.01,步长0.001
· maxSeqLen:
FullFineTuning:单选,4096 或 8192,默认值4096
LoRA:单选,512 或 1024 或 2048 或 4096,默认值4096
· checkpointSaveStrategy:单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep:[64,4096],默认值256,当参数checkpointSaveStrategy=step时,此参数有效
· validationStep:[0,1000000],默认值16,步长1
· 仅LoRA支持:
loraRank:单选,8 或 16 或 32 或 64,默认值32
loraAlpha:单选,8 或 16 或 32 或 64,默认值32
loraDropout:[0.01,0.5],默认值0.1,步长0.001
Baichuan2-7B-Chat SFT FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:[0.0000000001,0.0002],默认值0.000001,步长0.000001
· batchSize:[1,4],默认值1
· Packing:字符串,true 或 false 或 auto,默认值auto
· schedulerName: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值cosine
· warmupRatio:[0.01,0.1],默认值0.03,步长0.001
· weightDecay:[0.001,1],默认0.01,步长0.001
· maxSeqLen: 单选,512 或 1024 或 2048 或 4096,默认值4096
· checkpointSaveStrategy:单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep:[64,4096],默认值256,当参数checkpointSaveStrategy=step时,此参数有效
· validationStep:[0,1000000],默认值16,步长1
· 仅LoRA支持:
loraRank:单选,8 或 16 或 32 或 64,默认值32
loraAlpha:单选,8 或 16 或 32 或 64,默认值32
loraDropout:[0.01,0.5],默认值0.1,步长0.001
Baichuan2-13B-Chat SFT FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:[0.0000000001,0.0002],默认值0.000001,步长0.000001
· batchSize:[1,2],默认值1
· Packing:字符串,true 或 false 或 auto,默认值auto
· schedulerName: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值cosine
· warmupRatio:[0.01,0.1],默认值0.03,步长0.001
· weightDecay:[0.001,1],默认0.01,步长0.001
· maxSeqLen: 单选,512 或 1024 或 2048 或 4096,默认值4096
· checkpointSaveStrategy:单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep:[64,4096],默认值256,当参数checkpointSaveStrategy=step时,此参数有效
· validationStep:[0,1000000],默认值16,步长1
· 仅LoRA支持:
loraRank:单选,8 或 16 或 32 或 64,默认值32
loraAlpha:单选,8 或 16 或 32 或 64,默认值32
loraDropout:[0.01,0.5],默认值0.1,步长0.001
BLOOMZ-7B SFT FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:[0.0000000001,0.0002],默认值0.000001,步长0.000001
· batchSize:[1,4],默认值1
· Packing:字符串,true 或 false 或 auto,默认值auto
· schedulerName: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值cosine
· warmupRatio:[0.01,0.1],默认值0.03,步长0.001
· weightDecay:[0.001,1],默认值0.01,步长0.001
· checkpointSaveStrategy:单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep:[64,4096],默认值256,当参数checkpointSaveStrategy=step时,此参数有效
· validationStep:[0,1000000],默认值16,步长1
· 仅LoRA支持:
loraRank:单选,8 或 16 或 32 或 64,默认值32
loraAlpha:单选,8 或 16 或 32 或 64,默认值32
loraDropout:[0.01,0.5],默认值0.1,步长0.001
CodeLlama-7B SFT FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:[0.0000000001,0.0002],默认值0.000001,步长0.000001
· batchSize:[1,4],默认值1
· Packing:字符串,true 或 false 或 auto,默认值auto
· schedulerName: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值cosine
· warmupRatio:[0.01,0.1],默认值0.03,步长0.001
· weightDecay:[0.001,1],默认值0.01,步长0.001
· maxSeqLen: 单选,512 或 1024 或 2048 或 4096,默认值4096
· checkpointSaveStrategy:单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep:[64,4096],默认值256,当参数checkpointSaveStrategy=step时,此参数有效
· validationStep:[0,1000000],默认值16,步长1
· LoRA only:
loraAlpha:单选,8 或 16 或 32 或 64,默认值32
loraDropout:[0.01,0.5],默认值0.1
loraTargetModules:多选,self_attn.q_proj、self_attn.k_proj、self_attn.v_proj、self_attn.o_proj、mlp.gate_proj、mlp.up_proj、mlp.down_proj,默认值self_attn.q_proj + self_attn.v_proj
Custom-Model(自定义模型) SFT FullFineTuning · epoch:[1,50],默认值1
· learningRate:[0.0000000001,0.0002],默认值0.000001,步长0.000001
· schedulerName: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值cosine
· warmupRatio:[0.01,0.1],默认值0.03,步长0.001
· weightDecay:[0.001,1],默认值0.01,步长0.001
Qwen2.5-7B-Instruct SFT FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:[0.0000000001,0.0002],默认值0.000001,步长0.000001
· Packing:字符串,true 或 false 或 auto,默认值auto
· schedulerName: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值cosine
· warmupRatio:[0.01,0.1],默认值0.03,步长0.001
· weightDecay:[0.001,1],默认值0.01,步长0.001
· globalBatchSize:[8,100000],默认值16,步长8
· maxSeqLen:512 或 1024 或 2048 或 4096 或 8192 或 16384 或 32768,默认值4096
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64
· validationStep:[0,1000000],默认值16,步长1
· 仅LoRA支持:
loraRank:单选,8 或 16 或 32 或 64,默认值32
loraAlpha:单选,8 或 16 或 32 或 64,默认值32
loraDropout:[0.01,0.5],默认值0.1,步长0.001
QwQ-32B SFT FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:[0.0000000001,0.0002],默认值0.000001,步长0.000001
· Packing:字符串,true 或 false 或 auto,默认值auto
· schedulerName: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值cosine
· warmupRatio:[0.01,0.1],默认值0.03,步长0.001
· weightDecay:[0.001,1],默认值0.01,步长0.001
· globalBatchSize:[8,100000],默认值16,步长8
· maxSeqLen:512 或 1024 或 2048 或 4096 或 8192 或 16384 或 32768,默认值4096
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64
· validationStep:[0,1000000],默认值16,步长1
· 仅LoRA支持:
loraRank:单选,8 或 16 或 32 或 64,默认值32
loraAlpha:单选,8 或 16 或 32 或 64,默认值32
loraDropout:[0.01,0.5],默认值0.1,步长0.001
ChatGLM4-9B SFT FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:[0.0000000001,0.0002],默认值0.000001,步长0.000001
· batchSize:[1,4],默认值1
· Packing:字符串,true 或 false 或 auto,默认值auto
· schedulerName: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值cosine
· warmupRatio:[0.01,0.1],默认值0.03,步长0.001
· weightDecay:[0.001,1],默认值0.01,步长0.001
· maxSeqLen: 单选,512 或 1024 或 2048 或 4096 或 8192,默认值4096
· checkpointSaveStrategy:单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64
· validationStep:[0,1000000],默认值16,步长1
· 仅LoRA支持:
loraRank:单选,8 或 16 或 32 或 64,默认值32
loraAlpha:单选,8 或 16 或 32 或 64,默认值32
loraDropout:[0.01,0.5],默认值0.1,步长0.001
DeepSeek-R1-Distill-Qwen-32B SFT FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:[0.0000000001,0.0002],默认值0.000001,步长0.000001
· globalBatchSize:[8,100000],默认值16,步长8
· Packing:字符串,true 或 false 或 auto,默认值auto
· schedulerName:单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值cosine
· warmupRatio:[0.01,0.1],默认值0.03,步长0.001
· weightDecay:[0.001,1],默认值0.01,步长0.001
· maxSeqLen:单选,512 或 1024 或 2048 或 4096 或 8192 或 16384 或 32768,默认值4096
· checkpointSaveStrategy:单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1,步长1
· validationStep:[0,1000000],默认值16,步长1
· saveStep:[1,50000],默认值64
· LoRA only:
loraRank: single choice, 8 or 16 or 32 or 64, default 32
loraAlpha: single choice, 8 or 16 or 32 or 64, default 32
loraDropout: [0.01,0.5], default 0.1, step 0.001
DeepSeek-R1-Distill-Qwen-7B SFT FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:[0.0000000001,0.0002],默认值0.000001,步长0.000001
· globalBatchSize:[8,100000],默认值16,步长8
· Packing:字符串,true 或 false 或 auto,默认值auto
· schedulerName:单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值cosine
· warmupRatio:[0.01,0.1],默认值0.03,步长0.001
· weightDecay:[0.001,1],默认值0.01,步长0.001
· maxSeqLen:单选,512 或 1024 或 2048 或 4096 或 8192 或 16384 或 32768,默认值4096
· checkpointSaveStrategy:单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1,步长1
· validationStep:[0,1000000],默认值16,步长1
· saveStep:[1,50000],默认值64
· LoRA only:
loraRank: single choice, 8 or 16 or 32 or 64, default 32
loraAlpha: single choice, 8 or 16 or 32 or 64, default 32
loraDropout: [0.01,0.5], default 0.1, step 0.001
DeepSeek-R1 SFT LoRA · epoch:[1,50],默认值1
· learningRate:[0.0000000001,0.0002],默认值0.000001,步长0.000001
· batchSize:[1,4],默认值1
· schedulerName:单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值cosine
· warmupRatio:[0.01,0.1],默认值0.03,步长0.001
· weightDecay:[0.001,1],默认值0.01,步长0.001
· loraRank:8 或 16 或 32 或 64,默认值32
· loraAlpha:8 或 16 或 32 或 64,默认值32
· loraDropout:[0.01, 0.5],默认值0.1,步长0.001
· maxSeqLen:单选,512 或 1024 或 2048 或 4096 或 8192,默认值4096
DeepSeek-R1-Distill-Qwen-14B SFT FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:[0.0000000001,0.0002],默认值0.000001,步长0.000001
· globalBatchSize:[8,100000],默认值16,步长8
· Packing:字符串,true 或 false 或 auto,默认值auto
· schedulerName:单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值cosine
· warmupRatio:[0.01,0.1],默认值0.03,步长0.001
· weightDecay:[0.001,1],默认值0.01,步长0.001
· maxSeqLen: single choice, 512 or 1024 or 2048 or 4096 or 8192 or 16384 or 32768, default 4096
· checkpointSaveStrategy:单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1,步长1
· validationStep:[0,1000000],默认值16,步长1
· saveStep:当参数checkpointSaveStrategy=step时,此参数有效
FullFineTuning:[64,4096],默认值64
LoRA:[64,4096],默认值256
· 仅LoRA支持:
loraRank:8 或 16 或 32 或 64,默认值32
loraAlpha: 8 或 16 或 32 或 64,默认值32
loraDropout: [0.01, 0.5],默认值0.1,步长0.001
DeepSeek-R1-Distill-Qianfan-Llama-8B SFT FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:[0.0000000001,0.0002],默认值0.000001,步长0.000001
· globalBatchSize:[8,100000],默认值16,步长8
· Packing:字符串,true 或 false 或 auto,默认值auto
· schedulerName:单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值cosine
· warmupRatio:[0.01,0.1],默认值0.03,步长0.001
· weightDecay:[0.001,1],默认值0.01,步长0.001
· maxSeqLen: single choice, 512 or 1024 or 2048 or 4096 or 8192 or 16384 or 32768, default 4096
· checkpointSaveStrategy:单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1,步长1
· validationStep:[0,1000000],默认值16,步长1
· saveStep:[1,50000],默认值64
DeepSeek-R1-Distill-Qwen-1.5B SFT FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:[0.0000000001,0.0002],默认值0.000001,步长0.000001
· globalBatchSize:[8,100000],默认值16,步长8
· Packing:字符串,true 或 false 或 auto,默认值auto
· schedulerName:单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值cosine
· warmupRatio:[0.01,0.1],默认值0.03,步长0.001
· weightDecay:[0.001,1],默认值0.01,步长0.001
· maxSeqLen: single choice, 512 or 1024 or 2048 or 4096 or 8192 or 16384 or 32768, default 4096
· checkpointSaveStrategy:单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1,步长1
· validationStep:[0,1000000],默认值16,步长1
· saveStep:[64,4096],默认值64,当参数checkpointSaveStrategy=step时,此参数有效
· 仅LoRA支持:
loraRank:8 或 16 或 32 或 64,默认值32
loraAlpha:8 或 16 或 32 或 64,默认值32
loraDropout:[0.01,0.5],默认值0.1,步长0.001
DeepSeek-R1-Distill-Qianfan-Llama-70B SFT FullFineTuning, LoRA · epoch: [1,50], default 1
· learningRate: [0.0000000001,0.0002], default 0.000001, step 0.000001
· globalBatchSize: [8,100000], default 16, step 8
· Packing: string, true or false or auto, default auto
· schedulerName: single choice, linear or cosine or polynomial or constant or constant_with_warmup, default cosine
· warmupRatio: [0.01,0.1], default 0.03, step 0.001
· weightDecay: [0.001,1], default 0.01, step 0.001
· maxSeqLen: single choice, 512 or 1024 or 2048 or 4096 or 8192 or 16384 or 32768, default 4096
· checkpointCount: [1,10], default 1
· saveStep: [1,50000], default 64; saveStep must be an integer multiple of validationStep
· validationStep: [0,1000000], default 16, step 1
· LoRA only:
loraRank: single choice, 8 or 16 or 32 or 64, default 32
loraAlpha: single choice, 8 or 16 or 32 or 64, default 32
loraDropout: [0.01,0.5], default 0.1, step 0.001
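Most entries in the SFT table take the form "[min,max], default, step". The sketch below shows one way to check a value against such an entry before submitting a job, using the ranges listed for DeepSeek-R1-Distill-Qianfan-Llama-70B. The `RANGES` table and the `check_range` helper are hypothetical names introduced for illustration, not part of any SDK, and the step grid is assumed to start at the lower bound, which the table does not state.

```python
from decimal import Decimal

# (min, max, step) copied from the DeepSeek-R1-Distill-Qianfan-Llama-70B SFT row.
# Strings keep the decimal values exact.
RANGES = {
    "globalBatchSize": ("8", "100000", "8"),
    "warmupRatio": ("0.01", "0.1", "0.001"),
    "weightDecay": ("0.001", "1", "0.001"),
    "loraDropout": ("0.01", "0.5", "0.001"),
}


def check_range(name, value):
    """Return None if the value is acceptable, otherwise a short error string."""
    lo, hi, step = (Decimal(x) for x in RANGES[name])
    v = Decimal(str(value))
    if not lo <= v <= hi:
        return f"{name}={value} is outside [{lo}, {hi}]"
    # Assumption: steps are counted from the lower bound of the range.
    if (v - lo) % step != 0:
        return f"{name}={value} is not on the step grid of {step}"
    return None
```

Decimal arithmetic is used here because a step such as 0.001 cannot be represented exactly as a binary float, so a plain `%` on floats could reject valid values.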

RFT

model trainMode parameterScale hyperParameterConfig
DeepSeek-R1-Distill-Qwen-14B RFT FullFineTuning · epoch: [1,50], default 1
· criticLearningRate: [0.0000001,0.00001], default 0.000009, step 0.0000001; takes effect when RlMethod=PPO
· actorLearningRate: [0.0000001,0.00001], default 0.0000005, step 0.0000001
· maxSeqLen: 4096 or 8192 or 16384 or 32768, default 4096
· globalBatchSize: [1,10000], default 64 (recommended step 4 when maxSeqLen=4096; recommended step 1 when maxSeqLen=8192, 16384, or 32768)
· rolloutBatchSize: [1,10000], default 64, step 4 (recommended step 1 when maxSeqLen=8192, 16384, or 32768)
· numSamplesPerPrompt: [1,32], default 8; takes effect when RlMethod=GRPO
· maxPromptLen4k: [512,3072], default 1024; takes effect when maxSeqLen=4096
· maxPromptLen8k: [512,8092], default 1024; takes effect when maxSeqLen=8192
· maxPromptLen16k: [512,15360], default 1024; takes effect when maxSeqLen=16384
· maxPromptLen32k: [512,15360], default 1024; takes effect when maxSeqLen=32768
· maxLength4k: [512,3072], default 1024; takes effect when maxSeqLen=4096
· maxLength8k: [512,8092], default 1024; takes effect when maxSeqLen=8192
· maxLength16k: [512,15360], default 1024; takes effect when maxSeqLen=16384
· maxLength32k: [512,30720], default 1024; takes effect when maxSeqLen=32768
· loggingSteps: [1,1], default 1
· klCoeff: [0.00001,0.01], default 0.001
· checkpointSaveStrategy: string, default step
· checkpointCount: [1,20], default 1, step 1; checkpointCount must not exceed the number of epochs
· saveStep: 2 or 4 or 8 or 16 or 32 or 64 or 128 or 256, default 16; takes effect when checkpointSaveStrategy=step
· seed: [1,2147483647], default 42
DeepSeek-R1-Distill-Qwen-7B RFT FullFineTuning · epoch: [1,50], default 1
· criticLearningRate: [0.0000001,0.00001], default 0.000009, step 0.0000001; takes effect when RlMethod=PPO
· actorLearningRate: [0.0000001,0.00001], default 0.0000005, step 0.0000001
· maxSeqLen: 4096 or 8192 or 16384 or 32768, default 4096
· globalBatchSize: [1,10000], default 64 (recommended step 4 when maxSeqLen=4096; recommended step 1 when maxSeqLen=8192, 16384, or 32768)
· rolloutBatchSize: [1,10000], default 64, step 4 (recommended step 1 when maxSeqLen=8192, 16384, or 32768)
· numSamplesPerPrompt: [1,32], default 8; takes effect when RlMethod=GRPO
· maxPromptLen4k: [512,3072], default 1024; takes effect when maxSeqLen=4096
· maxPromptLen8k: [512,8092], default 1024; takes effect when maxSeqLen=8192
· maxPromptLen16k: [512,15360], default 1024; takes effect when maxSeqLen=16384
· maxPromptLen32k: [512,15360], default 1024; takes effect when maxSeqLen=32768
· maxLength4k: [512,3072], default 1024; takes effect when maxSeqLen=4096
· maxLength8k: [512,8092], default 1024; takes effect when maxSeqLen=8192
· maxLength16k: [512,15360], default 1024; takes effect when maxSeqLen=16384
· maxLength32k: [512,30720], default 1024; takes effect when maxSeqLen=32768
· loggingSteps: [1,1], default 1
· klCoeff: [0.00001,0.01], default 0.001
· checkpointSaveStrategy: string, default step
· checkpointCount: [1,20], default 1, step 1; checkpointCount must not exceed the number of epochs
· saveStep: 2 or 4 or 8 or 16 or 32 or 64 or 128 or 256, default 16; takes effect when checkpointSaveStrategy=step
· seed: [1,2147483647], default 42
Qwen2.5-7B-Instruct RFT FullFineTuning · epoch: [1,50], default 1
· criticLearningRate: [0.0000001,0.00001], default 0.000009, step 0.0000001; takes effect when RlMethod=PPO
· actorLearningRate: [0.0000001,0.00001], default 0.0000005, step 0.0000001
· maxSeqLen: 4096 or 8192 or 16384 or 32768, default 4096
· globalBatchSize: [1,10000], default 64 (recommended step 4 when maxSeqLen=4096; recommended step 1 when maxSeqLen=8192, 16384, or 32768)
· rolloutBatchSize: [1,10000], default 64, step 4 (recommended step 1 when maxSeqLen=8192, 16384, or 32768)
· numSamplesPerPrompt: [1,32], default 8; takes effect when RlMethod=GRPO
· maxPromptLen4k: [512,3072], default 1024; takes effect when maxSeqLen=4096
· maxPromptLen8k: [512,8092], default 1024; takes effect when maxSeqLen=8192
· maxPromptLen16k: [512,15360], default 1024; takes effect when maxSeqLen=16384
· maxPromptLen32k: [512,15360], default 1024; takes effect when maxSeqLen=32768
· maxLength4k: [512,3072], default 1024; takes effect when maxSeqLen=4096
· maxLength8k: [512,8092], default 1024; takes effect when maxSeqLen=8192
· maxLength16k: [512,15360], default 1024; takes effect when maxSeqLen=16384
· maxLength32k: [512,30720], default 1024; takes effect when maxSeqLen=32768
· loggingSteps: [1,1], default 1
· klCoeff: [0.00001,0.01], default 0.001
· checkpointSaveStrategy: string, default step
· checkpointCount: [1,20], default 1, step 1; checkpointCount must not exceed the number of epochs
· saveStep: 2 or 4 or 8 or 16 or 32 or 64 or 128 or 256, default 16; takes effect when checkpointSaveStrategy=step
· seed: [1,2147483647], default 42
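In the RFT table, the prompt-length and generation-length parameters carry a suffix (4k, 8k, 16k, 32k) that depends on the chosen maxSeqLen. A minimal sketch of picking the matching pair of keys is shown below. The helper name `rft_length_params` is hypothetical, and sending only the pair that matches maxSeqLen is an assumption drawn from the "takes effect when" notes, not a documented requirement.

```python
# Suffix used by maxPromptLen*/maxLength* for each maxSeqLen option in the RFT table.
SUFFIX_BY_SEQ_LEN = {4096: "4k", 8192: "8k", 16384: "16k", 32768: "32k"}


def rft_length_params(max_seq_len, max_prompt_len=1024, max_length=1024):
    """Build the length-related hyperparameters for one maxSeqLen choice.

    The defaults of 1024 mirror the default listed for every
    maxPromptLen*/maxLength* entry in the table.
    """
    suffix = SUFFIX_BY_SEQ_LEN.get(max_seq_len)
    if suffix is None:
        raise ValueError(f"unsupported maxSeqLen: {max_seq_len}")
    return {
        "maxSeqLen": max_seq_len,
        f"maxPromptLen{suffix}": max_prompt_len,
        f"maxLength{suffix}": max_length,
    }
```

For example, `rft_length_params(16384, max_prompt_len=2048)` yields the keys `maxSeqLen`, `maxPromptLen16k`, and `maxLength16k`.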

PostPretrain

model trainMode parameterScale hyperParameterConfig
ERNIE-Speed-8K PostPretrain - · epoch:[1,50],默认值1
· learningRate:[0.0000001,0.01],默认值0.00003,步长0.000001
· maxSeqLen: 单选,4096 或 8192,默认值4096
· globalBatchSize:[1,10000],默认值32,步长1(当maxSeqLen=4096时,推荐步长2)
· checkpointSaveStrategy: 单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64,saveStep需要是validationStep的整数倍,当参数checkpointSaveStrategy=step时,此参数有效
· seed:[1,2147483647],默认值42
· lrSchedulerType: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值linear
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001,0.000001],默认值0.00000001
· power:[1,3],默认值1
· validationStep:[0,1000000],默认值16,步长1,saveStep需要是validationStep的整数倍
· 早停策略相关参数:
earlyStopping:True 或 False,默认False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:ValidationLoss,当参数earlyStopping为True时,此参数有效
earlyStoppingThreshold:[0,5] ,默认值 0.01,步长0.01,当参数earlyStopping为True时,此参数有效
earlyStoppingPatience:[1,50],默认值 3,步长1,当参数earlyStopping为True时,此参数有效
ERNIE-Tiny-8K PostPretrain - · epoch:[1,50],默认值1
· learningRate:[0.0000001,0.01],默认值0.00003,步长0.000001
· maxSeqLen: 单选,4096 或 8192,默认值4096
· globalBatchSize:[1,10000],默认值32,步长8(当maxSeqLen=4096时,推荐步长16)
· checkpointSaveStrategy: 单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64,saveStep需要是validationStep的整数倍,当参数checkpointSaveStrategy=step时,此参数有效
· seed:[1,2147483647],默认值42
· lrSchedulerType: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值linear
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001,0.000001],默认值0.00000001
· power:[1,3],默认值1
· validationStep:[0,1000000],默认值16,步长1,saveStep需要是validationStep的整数倍
· 早停策略相关参数:
earlyStopping:True 或 False,默认False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:ValidationLoss,当参数earlyStopping为True时,此参数有效
earlyStoppingThreshold:[0,5] ,默认值 0.01,步长0.01,当参数earlyStopping为True时,此参数有效
earlyStoppingPatience:[1,50],默认值 3,步长1,当参数earlyStopping为True时,此参数有效
Qianfan-Chinese-Llama-2-13B-v1 PostPretrain - · epoch:1
· learningRate:[0.0000002,0.0002],默认值0.00002,步长0.000001
· batchSize:[48,960],默认值192,步长48
· weightDecay:[0.0001,0.05],默认值0.01,步长0.001
· checkpointCount:[1,10],默认值1
· saveStep:[64,8192],默认值64
· validationStep:[0, 1000000],默认值16,步长1
ERNIE-Speed-Pro-128K PostPretrain - · epoch:[1,50],默认值1
· learningRate:[0.0000001,0.01],默认值0.00003,步长0.000001
· maxSeqLen: 单选,8192 或 16384 或 32768 或 65536 或 131072,默认值32768
· globalBatchSize:[1,10000],默认值16,步长1(当maxSeqLen=16384时,推荐步长2;当maxSeqLen=32768时,推荐步长2;当maxSeqLen=65536时,推荐步长2)
· checkpointSaveStrategy: 单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64,saveStep需要是validationStep的整数倍,当参数checkpointSaveStrategy=step时,此参数有效
· seed:[1,2147483647],默认值42
· lrSchedulerType: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值linear
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001,0.000001],默认值0.00000001
· power:[1,3],默认值1
· validationStep:[0,1000000],默认值16,步长1,saveStep需要是validationStep的整数倍
· 早停策略相关参数:
earlyStopping:True 或 False,默认False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:ValidationLoss,当参数earlyStopping为True时,此参数有效
earlyStoppingThreshold:[0,5] ,默认值 0.01,步长0.01,当参数earlyStopping为True时,此参数有效
earlyStoppingPatience:[1,50],默认值 3,步长1,当参数earlyStopping为True时,此参数有效
ERNIE-Tiny-128K-0929 PostPretrain - · epoch:[1,50],默认值1
· learningRate:[0.0000001,0.01],默认值0.00003,步长0.000001
· maxSeqLen: 单选,16384 或 32768 或 65536 或 131072,默认值32768
· globalBatchSize:[1,10000],默认值16,步长2(当maxSeqLen=16384时,推荐步长4;当maxSeqLen=32768时,推荐步长4;当maxSeqLen=131072时,推荐步长8)
· checkpointSaveStrategy: 单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64,saveStep需要是validationStep的整数倍,当参数checkpointSaveStrategy=step时,此参数有效
· seed:[1,2147483647],默认值42
· lrSchedulerType: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值linear
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001,0.000001],默认值0.00000001
· power:[1,3],默认值1
· validationStep:[0,1000000],默认值16,步长1,saveStep需要是validationStep的整数倍
· 早停策略相关参数:
earlyStopping:True 或 False,默认False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:ValidationLoss,当参数earlyStopping为True时,此参数有效
earlyStoppingThreshold:[0,5] ,默认值 0.01,步长0.01,当参数earlyStopping为True时,此参数有效
earlyStoppingPatience:[1,50],默认值 3,步长1,当参数earlyStopping为True时,此参数有效
ERNIE-Lite-128K-0722 PostPretrain - · epoch:[1,50],默认值1
· learningRate:[0.0000001,0.01],默认值0.00003,步长0.000001
· maxSeqLen: 单选,8192 或 16384 或 32768 或 65536 或 131072, 默认值32768
· globalBatchSize:[1,10000], 默认值16, 步长2(当maxSeqLen=16384时,推荐步长4;当maxSeqLen=32768时,推荐步长4;当maxSeqLen=65536时,推荐步长4)
· checkpointSaveStrategy: 单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64,saveStep需要是validationStep的整数倍,当参数checkpointSaveStrategy=step时,此参数有效
· seed:[1,2147483647],默认值42
· lrSchedulerType: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值linear
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001,0.000001],默认值0.00000001
· power:[1,3],默认值1
· validationStep:[0,1000000],默认值16,步长1,saveStep需要是validationStep的整数倍
· 早停策略相关参数:
earlyStopping:True 或 False,默认False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:ValidationLoss,当参数earlyStopping为True时,此参数有效
earlyStoppingThreshold:[0,5] ,默认值 0.01,步长0.01,当参数earlyStopping为True时,此参数有效
earlyStoppingPatience:[1,50],默认值 3,步长1,当参数earlyStopping为True时,此参数有效
· tensorParallelDegree:[1,8],默认值4
· shardingParallelDegree:[1,64],默认值2
· sharding:stage1 或 stage2 或 stage3,默认值stage2
· recompute:0 或 1,默认值1
ERNIE-Character-Fiction-8K PostPretrain - · epoch:[1,50],默认值1
· learningRate: [0.0000001,0.01], default 0.00003, step 0.000001
· maxSeqLen: 单选, 可选项4096、8192, 默认值 4096
· globalBatchSize:[1,10000],默认值32,步长2(当maxSeqLen=8192时,推荐步长1)
· checkpointSaveStrategy: 单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64,当参数checkpointSaveStrategy=step时,此参数有效
· seed:[1, 2147483647],默认值 42
· lrSchedulerType:单选,可选项linear、cosine、polynomial、constant、constant_with_warmup,默认值 linear
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd: [0.00000001,0.000001], default 0.0000001, step 0.00000001
· power:[1,3],默认值1
· validationStep:[0,1000000],默认值16,步长1
· 早停策略相关参数:
earlyStopping:单选, 可选项False、True,默认值 False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:单选,可选项validationLoss,默认值 validationLoss
earlyStoppingThreshold:[0,5],默认值 0.01,步长0.01
earlyStoppingPatience:[1,50],默认值 3,步长1
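Several PostPretrain rows above tie parameters together: saveStep must be an integer multiple of validationStep, saveStep only takes effect when checkpointSaveStrategy=step, and earlyStopping only takes effect under the same strategy. The sketch below checks these cross-parameter rules on a plain dictionary. The function name `checkpoint_problems` is hypothetical, the fallback values are the defaults from the table, and skipping the multiple check when validationStep is 0 is an assumption, since the table does not say how that case is handled.

```python
def checkpoint_problems(cfg):
    """Return a list of human-readable problems found in a hyperparameter dict."""
    problems = []
    strategy = cfg.get("checkpointSaveStrategy", "step")
    save_step = cfg.get("saveStep", 64)
    validation_step = cfg.get("validationStep", 16)

    # saveStep is only meaningful under the step strategy.
    # Assumption: validationStep=0 disables the multiple-of rule.
    if strategy == "step" and validation_step > 0:
        if save_step % validation_step != 0:
            problems.append("saveStep must be an integer multiple of validationStep")

    # earlyStopping is documented as effective only with checkpointSaveStrategy=step.
    if cfg.get("earlyStopping", False) and strategy != "step":
        problems.append("earlyStopping has no effect unless checkpointSaveStrategy=step")

    return problems
```

Returning a list rather than raising on the first failure lets a caller report every conflicting setting at once.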

DPO

model trainMode parameterScale hyperParameterConfig
ERNIE-Lite-8K-0308 DPO FullFineTuning · epoch:[1,50],默认值1
· learningRate:[0.0000001,0.01],默认值0.000001,步长0.0000001
· maxSeqLen: 单选,512 或 1024 或 2048 或 4096 或 8192,默认值4096
· globalBatchSize:[1,10000],默认值16,步长4(当maxSeqLen=8192时,推荐步长8)
· loggingSteps:1
· warmupRatio:[0.01,0.5],默认值0.1,步长0.01
· weightDecay:[0.0001,0.1],默认值0.01,步长0.0001
· dpoBeta:[0.01,1],默认值0.1,步长0.001
· checkpointSaveStrategy: 单选,step 或 epoch,默认step
· seed:[1,2147483647],默认值42
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64,saveStep需要是validationStep的整数倍,当参数checkpointSaveStrategy=step时,此参数有效
· lrSchedulerType: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值linear
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001,0.000001],默认值0.0000001,步长0.00000001
· power:[1,3],默认值1
· validationStep:[0, 1000000],默认值16,步长1
· lossType:sigmoid 或 ipo 或 kto_pair,默认值sigmoid
· 早停策略相关参数:
earlyStopping:True 或 False,默认False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:ValidationLoss,当参数earlyStopping为True时,此参数有效
earlyStoppingThreshold:[0,5] ,默认值 0.01,步长0.01,当参数earlyStopping为True时,此参数有效
earlyStoppingPatience:[1,50],默认值 3,当参数earlyStopping为True时,此参数有效
ERNIE-Lite-128K-0722 DPO FullFineTuning · epoch:[1,50],默认值1
· learningRate:[0.0000001,0.01],默认值0.000001,步长0.0000001
· maxSeqLen: 单选,8192 或 16384 或 32768 或 65536 或 131072,默认值32768
· globalBatchSize: [1,10000], default 16, step 1 (recommended step 2 when maxSeqLen=16384, 32768, or 65536)
· loggingSteps:1
· warmupRatio:[0.01,0.5],默认值0.1,步长0.01
· weightDecay:[0.0001,0.1],默认值0.01,步长0.0001
· dpoBeta:[0.01,1],默认值0.1,步长0.001
· checkpointSaveStrategy: 单选,step 或 epoch,默认step
· seed:[1,2147483647],默认值42
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64,saveStep需要是validationStep的整数倍,当参数checkpointSaveStrategy=step时,此参数有效
· lrSchedulerType: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值linear
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001,0.000001],默认值0.0000001,步长0.00000001
· power:[1,3],默认值1
· validationStep:[0, 1000000],默认值16,步长1
· lossType:sigmoid 或 ipo 或 kto_pair,默认值sigmoid
· 早停策略相关参数:
earlyStopping:True 或 False,默认False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:ValidationLoss,当参数earlyStopping为True时,此参数有效
earlyStoppingThreshold:[0,5] ,默认值 0.01,步长0.01,当参数earlyStopping为True时,此参数有效
earlyStoppingPatience:[1,50],默认值 3,当参数earlyStopping为True时,此参数有效
ERNIE-Lite-128K-0419 DPO FullFineTuning · epoch:[1,50],默认值1
· learningRate:[0.0000001,0.01],默认值0.000001,步长0.0000001
· maxSeqLen: 单选,8192 或 16384 或 32768 或 65536 或 131072,默认值32768
· globalBatchSize:[1,10000],默认值16,步长1(当maxSeqLen=16384时,推荐步长2;当maxSeqLen=32768时,推荐步长2;当maxSeqLen=65536时,推荐步长2)
· loggingSteps:1
· warmupRatio:[0.01,0.5],默认值0.1,步长0.01
· weightDecay:[0.0001,0.1],默认值0.01,步长0.0001
· dpoBeta:[0.01,1],默认值0.1,步长0.001
· checkpointSaveStrategy: 单选,step 或 epoch,默认step
· seed:[1,2147483647],默认值42
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64,saveStep需要是validationStep的整数倍,当参数checkpointSaveStrategy=step时,此参数有效
· lrSchedulerType: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值linear
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001,0.000001],默认值0.0000001,步长0.00000001
· power:[1,3],默认值1
· validationStep:[0, 1000000],默认值16,步长1
· lossType:sigmoid 或 ipo 或 kto_pair,默认值sigmoid
· 早停策略相关参数:
earlyStopping:True 或 False,默认False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:ValidationLoss,当参数earlyStopping为True时,此参数有效
earlyStoppingThreshold:[0,5] ,默认值 0.01,步长0.01,当参数earlyStopping为True时,此参数有效
earlyStoppingPatience:[1,50],默认值 3,当参数earlyStopping为True时,此参数有效
ERNIE-Speed-8K DPO FullFineTuning · epoch:[1,50],默认值1
· learningRate:[0.0000001,0.01],默认值0.000001,步长0.0000001
· maxSeqLen: 单选,512 或 1024 或 2048 或 4096 或 8192,默认值4096
· globalBatchSize:[1,10000],默认值16,步长1(当maxSeqLen=4096时,推荐步长2)
· loggingSteps:1
· warmupRatio:[0.01,0.5],默认值0.1,步长0.01
· weightDecay:[0.0001,0.1],默认值0.01,步长0.0001
· dpoBeta:[0.01,1],默认值0.1,步长0.001
· checkpointSaveStrategy: 单选,step 或 epoch,默认step
· seed:[1,2147483647],默认值42
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64,saveStep需要是validationStep的整数倍,当参数checkpointSaveStrategy=step时,此参数有效
· lrSchedulerType: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值linear
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001,0.000001],默认值0.0000001,步长0.00000001
· power:[1,3],默认值1
· validationStep:[0, 1000000],默认值16,步长1
· lossType:sigmoid 或 ipo 或 kto_pair,默认值sigmoid
· 早停策略相关参数:
earlyStopping:True 或 False,默认False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:ValidationLoss,当参数earlyStopping为True时,此参数有效
earlyStoppingThreshold:[0,5] ,默认值 0.01,步长0.01,当参数earlyStopping为True时,此参数有效
earlyStoppingPatience:[1,50],默认值 3,当参数earlyStopping为True时,此参数有效
ERNIE-Tiny-8K DPO FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:[0.0000001,0.01],默认值0.000001,步长0.0000001
· maxSeqLen: 单选,512 或 1024 或 2048 或 4096 或 8192,默认值4096
· globalBatchSize:
FullFineTuning:[1,10000],默认值32,步长8(当maxSeqLen=4096时,推荐步长16)
LoRA:[1,10000],默认值32,步长8(当maxSeqLen=4096时,推荐步长16;当maxSeqLen=8192时,推荐步长16)
· loggingSteps:1
· warmupRatio:[0.01,0.5],默认值0.1,步长0.01
· weightDecay:[0.0001,0.1],默认值0.01,步长0.0001
· dpoBeta:[0.01,1],默认值0.1,步长0.001
· checkpointSaveStrategy: 单选,step 或 epoch,默认step
· seed:[1,2147483647],默认值42
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64,saveStep需要是validationStep的整数倍,当参数checkpointSaveStrategy=step时,此参数有效
· lrSchedulerType: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值linear
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001,0.000001],默认值0.0000001,步长0.00000001
· power:[1,3],默认值1
· validationStep:[0, 1000000],默认值16,步长1
· lossType:sigmoid 或 ipo 或 kto_pair,默认值sigmoid
· 早停策略相关参数:
earlyStopping:True 或 False,默认False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:ValidationLoss,当参数earlyStopping为True时,此参数有效
earlyStoppingThreshold:[0,5] ,默认值 0.01,步长0.01,当参数earlyStopping为True时,此参数有效
earlyStoppingPatience:[1,50],默认值 3,当参数earlyStopping为True时,此参数有效
· 仅LoRA支持:
loraRank:单选,2 或 4 或 8 ,默认值8
ERNIE-Speed-Pro-128K DPO FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:[0.0000001,0.01],默认值0.000001,步长0.0000001
· maxSeqLen: 单选,8192 或 16384 或 32768 或 65536 或 131072,默认值32768
· globalBatchSize:
FullFineTuning:[1,10000],默认值16,步长1
LoRA:[1,10000],默认值16,步长1(当maxSeqLen=16384时,推荐步长4;当maxSeqLen=32768时,推荐步长4;当maxSeqLen=65536时,推荐步长4)
· loggingSteps:1
· warmupRatio:[0.01,0.5],默认值0.1,步长0.01
· weightDecay:[0.0001,0.1],默认值0.01,步长0.0001
· dpoBeta:[0.01,1],默认值0.1,步长0.001
· checkpointSaveStrategy: 单选,step 或 epoch,默认step
· seed:[1,2147483647],默认值42
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64,saveStep需要是validationStep的整数倍,当参数checkpointSaveStrategy=step时,此参数有效
· lrSchedulerType: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值linear
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001,0.000001],默认值0.0000001,步长0.00000001
· power:[1,3],默认值1
· validationStep:[0, 1000000],默认值16,步长1
· lossType:sigmoid 或 ipo 或 kto_pair,默认值sigmoid
· 早停策略相关参数:
earlyStopping:True 或 False,默认False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:ValidationLoss,当参数earlyStopping为True时,此参数有效
earlyStoppingThreshold:[0,5] ,默认值 0.01,步长0.01,当参数earlyStopping为True时,此参数有效
earlyStoppingPatience:[1,50],默认值 3,当参数earlyStopping为True时,此参数有效
· 仅LoRA支持:
loraRank:单选,8 或 64 ,默认值64
ERNIE-Tiny-128K-0929 DPO FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:[0.0000001,0.01],默认值0.000001,步长0.0000001
· maxSeqLen: 单选,8192 或 16384 或 32768 或 65536 或 131072,默认值32768
· globalBatchSize:[1,10000],默认值16,步长2(当maxSeqLen=16384时,推荐步长4;当maxSeqLen=32768时,推荐步长4;当maxSeqLen=131072时,推荐步长8)
· loggingSteps:1
· warmupRatio:[0.01,0.5],默认值0.1,步长0.01
· weightDecay:[0.0001,0.1],默认值0.01,步长0.0001
· dpoBeta:[0.01,1],默认值0.1,步长0.001
· checkpointSaveStrategy: 单选,step 或 epoch,默认step
· seed:[1,2147483647],默认值42
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64,saveStep需要是validationStep的整数倍,当参数checkpointSaveStrategy=step时,此参数有效
· lrSchedulerType: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值linear
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001,0.000001],默认值0.0000001,步长0.00000001
· power:[1,3],默认值1
· validationStep:[0, 1000000],默认值16,步长1
· lossType:sigmoid 或 ipo 或 kto_pair,默认值sigmoid
· 早停策略相关参数:
earlyStopping:True 或 False,默认False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:ValidationLoss,当参数earlyStopping为True时,此参数有效
earlyStoppingThreshold:[0,5] ,默认值 0.01,步长0.01,当参数earlyStopping为True时,此参数有效
earlyStoppingPatience:[1,50],默认值 3,当参数earlyStopping为True时,此参数有效
· 仅LoRA支持:
loraRank:单选,2 或 4 或 8 ,默认值8
ERNIE-Character-8K-0321 DPO FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:
FullFineTuning:[0.0000001,0.01],默认值0.000001,步长0.0000001
LoRA:[0.000001,0.001],默认值0.0003,步长0.000001
· maxSeqLen: 单选,512 或 1024 或 2048 或 4096 或 8192,默认值4096
· globalBatchSize:
FullFineTuning:[1,10000],默认值16,步长1
LoRA:[1,10000],默认值16,步长2
· loggingSteps:1
· warmupRatio:[0.01,0.5],默认值0.1,步长0.01
· weightDecay:[0.0001,0.1],默认值0.01,步长0.0001
· dpoBeta:
FullFineTuning:[0.01,1],默认值0.1,步长0.001
· checkpointSaveStrategy: 单选,step 或 epoch,默认step
· seed:[1,2147483647],默认值42
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64,saveStep需要是validationStep的整数倍,当参数checkpointSaveStrategy=step时,此参数有效
· lrSchedulerType: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值linear
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001,0.000001],默认值0.0000001,步长0.00000001
· power:[1,3],默认值1
· validationStep:[0, 1000000],默认值16,步长1
· lossType:
FullFineTuning:sigmoid 或 ipo 或 kto_pair,默认值sigmoid
· 早停策略相关参数:
earlyStopping:True 或 False,默认False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:ValidationLoss,当参数earlyStopping为True时,此参数有效
earlyStoppingThreshold:[0,5] ,默认值 0.01,步长0.01,当参数earlyStopping为True时,此参数有效
earlyStoppingPatience:[1,50],默认值 3,当参数earlyStopping为True时,此参数有效
· 仅LoRA支持:
pseudoSamplingProb:[0,0.9],默认值0,步长0.1
loraAllLinear:True 或 False,默认值True
loraRank:2 或 4 或 8,默认值8
Meta-Llama-3.1-8B DPO FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:
FullFineTuning:[0.0000000001,0.0002],默认值0.000001,步长0.000001
LoRA:[0.000001,0.001],默认值0.0003,步长0.000001
· maxSeqLen: 单选,1024 或 2048 或 4096 或 8192,默认值4096
· globalBatchSize:[1,10000],默认值16,步长1
· warmupRatio:[0.01,0.1],默认值0.03,步长0.001
· weightDecay:[0.001,1],默认值0.01,步长0.001
· dpoBeta:[0.01,1],默认值0.1,步长0.001
· seed:[1,2147483647],默认值42
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64,saveStep需要是validationStep的整数倍
· schedulerName: 单选:linear、cosine、polynomial、constant、constant_with_warmup,默认值cosine
· validationStep:[0, 1000000],默认值16,步长1
· lossType:sigmoid 或 ipo 或 kto_pair,默认值sigmoid
· perDeviceTrainBatchSize:[1,8],默认值1
· 仅LoRA支持:
loraRank:[8,64], 默认值64
ERNIE-3.5-8K DPO FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:[0.0000001,0.01],默认值0.000001,步长0.0000001
· maxSeqLen: 单选,4096 或 8192,默认值8192
· maxPromptLen:[1,131062],默认值2048
· maxSteps:[0,10000000],默认值0
· recompute:
FullFineTuning:0或1,默认值1
LoRA:0或1,默认值0
· globalBatchSize:[1,10000],默认值16,步长1
· loggingSteps:1
· warmupRatio:[0.01,0.5],默认值0.1,步长0.01
· weightDecay:[0.0001,0.1],默认值0.01,步长0.0001
· dpoBeta:[0.01,1],默认值0.1,步长0.001
· checkpointSaveStrategy: 单选,step 或 epoch,默认step
· seed:[1,2147483647],默认值42
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64,saveStep需要是validationStep的整数倍,当参数checkpointSaveStrategy=step时,此参数有效
· lrSchedulerType: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值linear
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001,0.000001],默认值0.0000001,步长0.00000001
· power:[1,3],默认值1
· validationStep:[0, 1000000],默认值16,步长1
· lossType:sigmoid 或 ipo 或 kto_pair,默认值sigmoid
· 早停策略相关参数:
earlyStopping:True 或 False,默认False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:ValidationLoss,当参数earlyStopping为True时,此参数有效
earlyStoppingThreshold:[0,5] ,默认值 0.01,步长0.01,当参数earlyStopping为True时,此参数有效
earlyStoppingPatience:[1,50],默认值 3,当参数earlyStopping为True时,此参数有效
· 仅LoRA支持:
loraRank:[2,4,8], 默认值8
Baichuan2-7B-Chat DPO FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:
FullFineTuning:[0.0000000001,0.0002],默认值0.000001,步长0.000001
LoRA:[0.000001,0.001],默认值0.0003,步长0.000001
· maxSeqLen: 单选,1024 或 2048 或 4096 或 8192,默认值4096
· globalBatchSize:[1,10000],默认值16,步长1
· warmupRatio:[0.01,0.1],默认值0.03,步长0.001
· weightDecay:[0.001,1],默认值0.01,步长0.001
· dpoBeta:[0.01,1],默认值0.1,步长0.001
· seed:[1,2147483647],默认值42
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64,saveStep需要是validationStep的整数倍
· schedulerName: 单选:linear、cosine、polynomial、constant、constant_with_warmup,默认值cosine
· validationStep:[0, 1000000],默认值16,步长1
· lossType:sigmoid 或 ipo 或 kto_pair,默认值sigmoid
· perDeviceTrainBatchSize:[1,8],默认值1
· 仅LoRA支持:
loraRank:[8,64], 默认值64
Qianfan-Sug DPO FullFineTuning, LoRA · epoch: [1,50], default 1
· learningRate:[0.0000001,0.01],默认值0.000001,步长0.000001
· maxSeqLen:单选,512 或 1024 或 2048 或 4096 或 8192,默认值4096
· globalBatchSize:
FullFineTuning:[1,10000],默认值32,步长16(当maxSeqLen=4096时,步长为8)
LoRA:[1,10000],默认值32,步长16
· loggingSteps:1
· warmupRatio:[0.01,0.5],默认值0.1,步长0.01
· weightDecay:[0.0001,0.1],默认值0.01,步长0.0001
· dpoBeta:[0.01,1],默认值0.1,步长0.001
· checkpointSaveStrategy:单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1,步长1
· saveStep: [1,50000], default 64
· seed:[1,2147483647],默认值42
· lrSchedulerType:单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值linear
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001,0.000001],默认值0.0000001,步长0.00000001
· power:[1,3],默认值1
· validationStep:[0,1000000],默认值16,步长1
· lossType: string, sigmoid or ipo or kto_pair, default sigmoid
· 早停策略相关参数:
earlyStopping:True 或 False,默认False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:ValidationLoss,当参数checkpointSaveStrategy=step时,此参数有效
earlyStoppingThreshold:[0,5] ,默认值0.01,步长0.01,当参数earlyStopping为True时,此参数有效
earlyStoppingPatience:[1,50],默认值3,步长1,当参数earlyStopping为True时,此参数有效
· 仅LoRA支持:
loraRank:2 或 4 或 8,默认值8
DeepSeek-R1-Distill-Qwen-14B DPO FullFineTuning, LoRA · epoch: [1,50], default 1
· learningRate:[0.0000000001,0.0002],默认值0.000001,步长0.000001
· maxSeqLen:单选,512 或 1024 或 2048 或 4096 或 8192 或 16384 或 32768,默认值4096
· globalBatchSize:[8,100000],默认值16,步长8
· warmupRatio:[0.01,0.1],默认值0.03,步长0.001
· weightDecay:[0.001,1],默认值0.01,步长0.001
· dpoBeta:[0.01,1],默认值0.1,步长0.001
· checkpointCount:[1,10],默认值1,步长1
· saveStep:[1,50000],默认值64
· seed:[1,2147483647],默认值42
· schedulerName:单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值cosine
· validationStep:[0,1000000],默认值16,步长1
· lossType:sigmoid 或 ipo 或 kto_pair,默认值sigmoid
· 仅LoRA支持:
loraRank:8 或 64,默认值64
ERNIE-4.0-Turbo-8K DPO FullFineTuning · epoch:[1,50],默认值1
· learningRate:[0.0000001,0.01],默认值0.000001,步长0.0000001
· maxSeqLen: 4096 或 8192,默认值8192(最长提示长度不得大于序列长度减10)
· globalBatchSize:[1,10000],默认值18,步长1
· maxPromptLen:[1,131062],默认值2048(最长提示长度不得大于序列长度减10)
· maxSteps:[0,10000000], 默认值0
· [1,8],默认值8
· shardingParallelDegree:[1,64],默认值1
· loggingSteps:1
· warmupRatio:[0.01,0.5],默认值0.1,步长0.01
· weightDecay:[0.0001,0.1],默认值0.01,步长0.0001
· dpoBeta:[0.01,1],默认值0.1,步长0.001
· checkpointSaveStrategy: 单选,step 或 epoch,默认step
· seed:[1,2147483647],默认值42
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64,saveStep需要是validationStep的整数倍,当参数checkpointSaveStrategy=step时,此参数有效
· lrSchedulerType: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值linear
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001,0.000001],默认值0.0000001,步长0.00000001
· power:[1,3],默认值1
· validationStep:[0, 1000000],默认值16,步长1
· lossType:sigmoid 或 ipo 或 kto_pair,默认值sigmoid
· 早停策略相关参数:
earlyStopping:True 或 False,默认False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:ValidationLoss,当参数earlyStopping为True时,此参数有效
earlyStoppingThreshold:[0,5] ,默认值 0.01,步长0.01,当参数earlyStopping为True时,此参数有效
earlyStoppingPatience:[1,50],默认值 3,当参数earlyStopping为True时,此参数有效
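The ERNIE-4.0-Turbo-8K row above notes that the maximum prompt length must not exceed the sequence length minus 10. A minimal sketch of that check, using the maxSeqLen options and maxPromptLen range listed for that row, could look like the following. The function name `prompt_len_problem` is hypothetical and the error messages are illustrative only.

```python
def prompt_len_problem(max_seq_len, max_prompt_len):
    """Return None if the pair is acceptable, otherwise a short error string."""
    # maxSeqLen options listed for ERNIE-4.0-Turbo-8K DPO.
    if max_seq_len not in (4096, 8192):
        return f"maxSeqLen must be 4096 or 8192, got {max_seq_len}"
    # Overall maxPromptLen range from the table.
    if not 1 <= max_prompt_len <= 131062:
        return f"maxPromptLen must be in [1, 131062], got {max_prompt_len}"
    # The prompt must leave at least 10 tokens of the sequence free.
    if max_prompt_len > max_seq_len - 10:
        return f"maxPromptLen must be at most maxSeqLen - 10 = {max_seq_len - 10}"
    return None
```

With the defaults from the table (maxSeqLen 8192, maxPromptLen 2048) the check passes; a prompt length of 4090 with maxSeqLen 4096 is rejected because only 6 tokens would remain.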

KTO

model trainMode parameterScale hyperParameterConfig
ERNIE-Speed-8K KTO FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:[0.0000001,0.01],默认值0.000001,步长0.0000001
· maxSeqLen:单选,512 或 1024 或 2048 或 4096 或 8192,默认值4096
· globalBatchSize:
FullFineTuning:[1,10000],默认值16,步长1(当maxSeqLen=4096时,推荐步长2)
LoRA:[1,10000],默认值16,步长2(当maxSeqLen=4096时,推荐步长4)
· loggingSteps:1
· warmupRatio:[0.01,0.5],默认值0.1,步长0.01
· weightDecay:[0.0001,0.1],默认值0.01,步长0.0001
· ktoBeta:[0.01,1],默认值0.1,步长0.001
· checkpointSaveStrategy: 单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64,当参数checkpointSaveStrategy=step时,此参数有效
· seed:[1,2147483647],默认值42
· lrSchedulerType: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值linear
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001,0.000001],默认值0.0000001,步长0.00000001
· validationStep:[0,1000000],默认值16,步长1
· power:[1,3],默认值1
· 早停策略相关参数:
earlyStopping:True 或 False,默认False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:ValidationLoss,当参数earlyStopping为True时,此参数有效
earlyStoppingThreshold:[0,5] ,默认值 0.01,步长0.01,当参数earlyStopping为True时,此参数有效
earlyStoppingPatience:[1,50],默认值 3,步长1,当参数earlyStopping为True时,此参数有效
· 仅LoRA支持:
loraRank:单选,8 或 64 ,默认值64
ERNIE-Lite-128K-0419 KTO FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:[0.0000001,0.01],默认值0.000001,步长0.0000001
· maxSeqLen:单选,8192 或 16384 或 32768 或 65536 或 131072,默认值32768
· globalBatchSize:
FullFineTuning:[1,10000],默认值16,步长1(当maxSeqLen=16384时,推荐步长2;当maxSeqLen=32768时,推荐步长2;当maxSeqLen=65536时,推荐步长2)
LoRA:[1,10000],默认值16,步长1(当maxSeqLen=16384时,推荐步长8;当maxSeqLen=32768时,推荐步长8;当maxSeqLen=65536时,推荐步长4;当maxSeqLen=131072时,推荐步长2)
· loggingSteps:1
· warmupRatio:[0.01,0.5],默认值0.1,步长0.01
· weightDecay:[0.0001,0.1],默认值0.01,步长0.0001
· ktoBeta:[0.01,1],默认值0.1,步长0.001
· checkpointSaveStrategy: 单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64,当参数checkpointSaveStrategy=step时,此参数有效
· seed:[1,2147483647],默认值42
· lrSchedulerType: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值linear
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001,0.000001],默认值0.0000001,步长0.00000001
· validationStep:[0,1000000],默认值16,步长1
· power:[1,3],默认值1
· 早停策略相关参数:
earlyStopping:True 或 False,默认False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:ValidationLoss,当参数earlyStopping为True时,此参数有效
earlyStoppingThreshold:[0,5] ,默认值 0.01,步长0.01,当参数earlyStopping为True时,此参数有效
earlyStoppingPatience:[1,50],默认值 3,步长1,当参数earlyStopping为True时,此参数有效
· 仅LoRA支持:
loraRank:单选,2 或 4 或 8 ,默认值8
ERNIE-Lite-8K-0308 KTO FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:[0.0000001,0.01],默认值0.000001,步长0.0000001
· maxSeqLen:单选,512 或 1024 或 2048 或 4096 或 8192,默认值4096
· globalBatchSize:
FullFineTuning:[1,10000],默认值16,步长4(当maxSeqLen=8192时,推荐步长8)
LoRA:[1,10000],默认值16,步长4(当maxSeqLen=4096时,推荐步长8)
· loggingSteps:1
· warmupRatio:[0.01,0.5],默认值0.1,步长0.01
· weightDecay:[0.0001,0.1],默认值0.01,步长0.0001
· ktoBeta:[0.01,1],默认值0.1,步长0.001
· checkpointSaveStrategy: 单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64,当参数checkpointSaveStrategy=step时,此参数有效
· seed:[1,2147483647],默认值42
· lrSchedulerType: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值linear
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001,0.000001],默认值0.0000001,步长0.00000001
· validationStep:[0,1000000],默认值16,步长1
· power:[1,3],默认值1
· 早停策略相关参数:
earlyStopping:True 或 False,默认False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:ValidationLoss,当参数earlyStopping为True时,此参数有效
earlyStoppingThreshold:[0,5] ,默认值 0.01,步长0.01,当参数earlyStopping为True时,此参数有效
earlyStoppingPatience:[1,50],默认值 3,步长1,当参数earlyStopping为True时,此参数有效
· 仅LoRA支持:
loraRank: 单选,2 或 4 或 8 ,默认值8
ERNIE-Character-Fiction-8K KTO FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:[0.0000001,0.01],默认值0.000001,步长0.0000001
· maxSeqLen:单选,512 或 1024 或 2048 或 4096 或 8192,默认值4096
· globalBatchSize:
FullFineTuning:[1,10000],默认值16,步长1(当maxSeqLen=4096时,推荐步长2)
LoRA:[1,10000],默认值16,步长2(当maxSeqLen=4096时,推荐步长4)
· loggingSteps:1
· warmupRatio:[0.01,0.5],默认值0.1,步长0.01
· weightDecay:[0.0001,0.1],默认值0.01,步长0.0001
· ktoBeta:[0.01,1],默认值0.1,步长0.001
· checkpointSaveStrategy: 单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64,当参数checkpointSaveStrategy=step时,此参数有效
· seed:[1,2147483647],默认值42
· lrSchedulerType: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值linear
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001,0.000001],默认值0.0000001,步长0.00000001
· validationStep:[0,1000000],默认值16,步长1
· power:[1,3],默认值1
· 早停策略相关参数:
earlyStopping:True 或 False,默认False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:ValidationLoss,当参数earlyStopping为True时,此参数有效
earlyStoppingThreshold:[0,5] ,默认值 0.01,步长0.01,当参数earlyStopping为True时,此参数有效
earlyStoppingPatience:[1,50],默认值 3,步长1,当参数earlyStopping为True时,此参数有效
· 仅LoRA支持:
loraRank:单选,2 或 4 或 8 ,默认值8
ERNIE-Character-8K-0321 KTO FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:[0.0000001,0.01],默认值0.000001,步长0.0000001
· maxSeqLen:单选,512 或 1024 或 2048 或 4096 或 8192,默认值4096
· globalBatchSize:
FullFineTuning:[1,10000],默认值16,步长1(当maxSeqLen=4096时,推荐步长2)
LoRA:[1,10000],默认值16,步长2(当maxSeqLen=4096时,推荐步长4)
· loggingSteps:1
· warmupRatio:[0.01,0.5],默认值0.1,步长0.01
· weightDecay:[0.0001,0.1],默认值0.01,步长0.0001
· ktoBeta:[0.01,1],默认值0.1,步长0.001
· checkpointSaveStrategy: 单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64,当参数checkpointSaveStrategy=step时,此参数有效
· seed:[1,2147483647],默认值42
· lrSchedulerType: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值linear
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001,0.000001],默认值0.0000001,步长0.00000001
· validationStep:[0,1000000],默认值16,步长1
· power:[1,3],默认值1
· 早停策略相关参数:
earlyStopping:True 或 False,默认False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:ValidationLoss,当参数earlyStopping为True时,此参数有效
earlyStoppingThreshold:[0,5] ,默认值 0.01,步长0.01,当参数earlyStopping为True时,此参数有效
earlyStoppingPatience:[1,50],默认值 3,步长1,当参数earlyStopping为True时,此参数有效
· 仅LoRA支持:
loraRank:单选,2 或 4 或 8 ,默认值8
ERNIE-Tiny-8K KTO FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:[0.0000001,0.01],默认值0.000001,步长0.0000001
· maxSeqLen:单选,512 或 1024 或 2048 或 4096 或 8192,默认值4096
· globalBatchSize:
FullFineTuning:[1,10000],默认值32,步长8(当maxSeqLen=4096时,推荐步长16)
LoRA: [1,10000], default 32, step 8 (when maxSeqLen=4096, recommended step 16; when maxSeqLen=8192, recommended step 16)
· loggingSteps:1
· warmupRatio:[0.01,0.5],默认值0.1,步长0.01
· weightDecay:[0.0001,0.1],默认值0.01,步长0.0001
· ktoBeta:[0.01,1],默认值0.1,步长0.001
· checkpointSaveStrategy: 单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64,当参数checkpointSaveStrategy=step时,此参数有效
· seed:[1,2147483647],默认值42
· lrSchedulerType: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值linear
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001,0.000001],默认值0.0000001,步长0.00000001
· validationStep:[0,1000000],默认值16,步长1
· power:[1,3],默认值1
· 早停策略相关参数:
earlyStopping:True 或 False,默认False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:ValidationLoss,当参数earlyStopping为True时,此参数有效
earlyStoppingThreshold:[0,5] ,默认值 0.01,步长0.01,当参数earlyStopping为True时,此参数有效
earlyStoppingPatience:[1,50],默认值 3,步长1,当参数earlyStopping为True时,此参数有效
· 仅LoRA支持:
loraRank:单选,2 或 4 或 8 ,默认值8
ERNIE-Tiny-128K-0929 KTO FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:[0.0000001,0.01],默认值0.000001,步长0.0000001
· maxSeqLen:单选,8192 或 16384 或 32768 或 65536 或 131072,默认值32768
· globalBatchSize:[1,10000],默认值16,步长2(当maxSeqLen=16384时,推荐步长4;当maxSeqLen=32768时,推荐步长4;当maxSeqLen=131072时,推荐步长8)
· loggingSteps:1
· warmupRatio:[0.01,0.5],默认值0.1,步长0.01
· weightDecay:[0.0001,0.1],默认值0.01,步长0.0001
· ktoBeta:[0.01,1],默认值0.1,步长0.001
· checkpointSaveStrategy: 单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64,当参数checkpointSaveStrategy=step时,此参数有效
· seed:[1,2147483647],默认值42
· lrSchedulerType: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值linear
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001,0.000001],默认值0.0000001,步长0.00000001
· validationStep:[0,1000000],默认值16,步长1
· power:[1,3],默认值1
· 早停策略相关参数:
earlyStopping:True 或 False,默认False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:ValidationLoss,当参数earlyStopping为True时,此参数有效
earlyStoppingThreshold:[0,5] ,默认值 0.01,步长0.01,当参数earlyStopping为True时,此参数有效
earlyStoppingPatience:[1,50],默认值 3,步长1,当参数earlyStopping为True时,此参数有效
· 仅LoRA支持:
loraRank:单选,2 或 4 或 8 ,默认值8
ERNIE-Speed-Pro-128K KTO FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:[0.0000001,0.01],默认值0.000001,步长0.0000001
· maxSeqLen:单选,8192 或 16384 或 32768 或 65536 或 131072,默认值32768
· globalBatchSize:
FullFineTuning:[1,10000],默认值16,步长1
LoRA:[1,10000],默认值16,步长1(当maxSeqLen=16384时,推荐步长4;当maxSeqLen=32768时,推荐步长4;当maxSeqLen=65536时,推荐步长4)
· loggingSteps:1
· warmupRatio:[0.01,0.5],默认值0.1,步长0.01
· weightDecay:[0.0001,0.1],默认值0.01,步长0.0001
· ktoBeta:[0.01,1],默认值0.1,步长0.001
· checkpointSaveStrategy: 单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64,当参数checkpointSaveStrategy=step时,此参数有效
· seed:[1,2147483647],默认值42
· lrSchedulerType: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值linear
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001,0.000001],默认值0.0000001,步长0.00000001
· validationStep:[0,1000000],默认值16,步长1
· power:[1,3],默认值1
· 早停策略相关参数:
earlyStopping:True 或 False,默认False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:ValidationLoss,当参数earlyStopping为True时,此参数有效
earlyStoppingThreshold:[0,5] ,默认值 0.01,步长0.01,当参数earlyStopping为True时,此参数有效
earlyStoppingPatience:[1,50],默认值 3,步长1,当参数earlyStopping为True时,此参数有效
· 仅LoRA支持:
loraRank:单选,8 或 64 ,默认值64
ERNIE-3.5-8K KTO FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:[0.0000001,0.01],默认值0.000001,步长0.0000001
· maxSeqLen:单选,512 或 1024 或 2048 或 4096 或 8192,默认值4096
· globalBatchSize:
FullFineTuning:[1,10000],默认值16,步长1(当maxSeqLen=4096时,推荐步长2)
LoRA:[1,10000],默认值16,步长2(当maxSeqLen=4096时,推荐步长4)
· loggingSteps:1
· warmupRatio:[0.01,0.5],默认值0.1,步长0.01
· weightDecay:[0.0001,0.1],默认值0.01,步长0.0001
· ktoBeta:[0.01,1],默认值0.1,步长0.001
· checkpointSaveStrategy: 单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64,当参数checkpointSaveStrategy=step时,此参数有效
· seed:[1,2147483647],默认值42
· lrSchedulerType: 单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值linear
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001,0.000001],默认值0.0000001,步长0.00000001
· validationStep:[0,1000000],默认值16,步长1
· power:[1,3],默认值1
· 早停策略相关参数:
earlyStopping:True 或 False,默认False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:ValidationLoss,当参数earlyStopping为True时,此参数有效
earlyStoppingThreshold:[0,5] ,默认值 0.01,步长0.01,当参数earlyStopping为True时,此参数有效
earlyStoppingPatience:[1,50],默认值 3,步长1,当参数earlyStopping为True时,此参数有效
· 仅LoRA支持:
loraRank:单选,8 或 64 ,默认值64
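
A KTO hyperParameterConfig can be validated against the table before it is sent. The following is a minimal sketch, assuming the ranges and options of the ERNIE-Speed-8K KTO row above; the helper `validate_kto_config` and the dictionaries are hypothetical and not part of any SDK, and parameter names that are not in the dictionaries are simply ignored.

```python
# Ranges and options copied from the ERNIE-Speed-8K KTO row; the helper is hypothetical.
RANGES = {
    "epoch": (1, 50),
    "learningRate": (0.0000001, 0.01),
    "globalBatchSize": (1, 10000),
    "warmupRatio": (0.01, 0.5),
    "weightDecay": (0.0001, 0.1),
    "ktoBeta": (0.01, 1),
    "checkpointCount": (1, 10),
    "saveStep": (1, 50000),
    "seed": (1, 2147483647),
    "numCycles": (0.1, 0.5),
    "lrEnd": (0.00000001, 0.000001),
    "validationStep": (0, 1000000),
    "power": (1, 3),
    "earlyStoppingThreshold": (0, 5),
    "earlyStoppingPatience": (1, 50),
}
CHOICES = {
    "maxSeqLen": {512, 1024, 2048, 4096, 8192},
    "checkpointSaveStrategy": {"step", "epoch"},
    "lrSchedulerType": {"linear", "cosine", "polynomial",
                        "constant", "constant_with_warmup"},
}
LORA_ONLY = {"loraRank": {8, 64}}


def validate_kto_config(parameter_scale, config):
    """Return human-readable errors for values outside the documented limits."""
    errors = []
    for name, value in config.items():
        if name in RANGES:
            low, high = RANGES[name]
            if not low <= value <= high:
                errors.append(f"{name}={value} is outside [{low}, {high}]")
        elif name in CHOICES:
            if value not in CHOICES[name]:
                errors.append(f"{name}={value} is not an allowed option")
        elif name in LORA_ONLY:
            if parameter_scale != "LoRA":
                errors.append(f"{name} is only supported with LoRA")
            elif value not in LORA_ONLY[name]:
                errors.append(f"{name}={value} is not an allowed option")
    return errors
```

Other rows differ mainly in the maxSeqLen options and the loraRank options, so the same structure could be reused with per-model dictionaries.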

RLHF

model trainMode parameterScale hyperParameterConfig
ERNIE-Lite-8K-0308 RM FullFineTuning · epoch:[1, 50],默认值 1
· learningRate:[0.00000010, 0.01],默认值0.0000010,步长0.00000010
· maxSeqLen:单选,可选项4096、8192,默认值4096
· globalBatchSize:[1, 10000],默认值16,步长2(当maxSeqLen=8192时,推荐步长4)
· useCls:单选,可选项true、false,默认值true
· warmupRatio:[0.01, 0.5],默认值0.1,步长0.01
· weightDecay:[0.0001, 0.1],默认值0.01,步长0.0001
· pseudoSamplingProb:[0, 0.9],默认值0,步长0.1
· seed:[1, 2147483647],默认值42
· lrSchedulerType:单选,可选项linear、cosine、polynomial、constant、constant_with_warmup,默认值linear
· numCycles:[0.1, 0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001, 0.0000010],默认值0.00000010,步长0.00000001
· validationStep:[0, 1000000],默认值16,步长1
· power:[1, 3],默认值1
ERNIE-Lite-8K-0308 PPO FullFineTuning · epoch:[1, 50],默认值 1
· critic_learning_rate:[0.00000010, 0.00001],默认值0.000002,步长0.00000010
· learningRate:[0.00000010, 0.00001],默认值0.0000010,步长0.00000010
· maxSeqLen:单选,可选项4096、8192,默认值4096
· globalBatchSize:[1, 10000],默认值16,步长1(当maxSeqLen=4096时,推荐步长4)
· clip_range_score:[5, 50],默认值10
· clip_range_value:[5, 50],默认值5
· clip_range_ratio:[0.01, 0.3],默认值0.2
· loggingSteps:[1, 1],默认值1
· warmupRatio:[0.01, 0.5],默认值0.1,步长0.01
· weightDecay:[0.0001, 0.1],默认值0.01,步长0.0001
· top_p:[0, 1],默认值0.9
· validationStep:[0, 1000000],默认值16,步长1
· repetition_penalty:[1, 2],默认值1
· temperature:[0, 1],默认值1
· kl_coeff:[0.001, 0.1],默认值0.02
· checkpointSaveStrategy: 单选,step 或 epoch,默认step
· checkpointCount:[1, 10],默认值1
· saveStep:单选,可选项64、128、256、512、1024、2048、4096,默认值256,当参数checkpointSaveStrategy=step时,此参数有效
· seed:[1, 2147483647],默认值42
· lrSchedulerType:单选,可选项linear、cosine、polynomial、constant、constant_with_warmup,默认值linear
· numCycles:[0.1, 0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001, 0.0000010],默认值0.00000010,步长0.00000001
· power:[1, 3],默认值1
ERNIE-Tiny-8K RM FullFineTuning · epoch:[1, 50],默认值 1
· learningRate:[0.00000010, 0.01],默认值0.0000010,步长0.00000010
· maxSeqLen:单选,可选项4096、8192,默认值4096
· globalBatchSize:[1, 10000],默认值16,步长2(当maxSeqLen=8192时,推荐步长4)
· useCls:单选,可选项true、false,默认值true
· warmupRatio:[0.01, 0.5],默认值0.1,步长0.01
· weightDecay:[0.0001, 0.1],默认值0.01,步长0.0001
· pseudoSamplingProb:[0, 0.9],默认值0,步长0.1
· seed:[1, 2147483647],默认值42
· lrSchedulerType:单选,可选项linear、cosine、polynomial、constant、constant_with_warmup,默认值linear
· numCycles:[0.1, 0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001, 0.0000010],默认值0.00000010,步长0.00000001
· validationStep:[0, 1000000],默认值16,步长1
· power:[1, 3],默认值1
ERNIE-Tiny-8K PPO FullFineTuning · epoch:[1, 50],默认值 1
· critic_learning_rate:[0.00000010, 0.00001],默认值0.000002,步长0.00000010
· learningRate:[0.00000010, 0.00001],默认值0.0000010,步长0.00000010
· maxSeqLen:单选,可选项4096、8192,默认值4096
· globalBatchSize:[1, 10000],默认值16,步长1(当maxSeqLen=4096时,推荐步长4)
· clip_range_score:[5, 50],默认值10
· clip_range_value:[5, 50],默认值5
· clip_range_ratio:[0.01, 0.3],默认值0.2
· loggingSteps:[1, 1],默认值1
· warmupRatio:[0.01, 0.5],默认值0.1,步长0.01
· weightDecay:[0.0001, 0.1],默认值0.01,步长0.0001
· top_p:[0, 1],默认值0.9
· validationStep:[0, 1000000],默认值16,步长1
· repetition_penalty:[1, 2],默认值1
· temperature:[0, 1],默认值1
· kl_coeff:[0.001, 0.1],默认值0.02
· checkpointSaveStrategy: 单选,step 或 epoch,默认step
· checkpointCount:[1, 10],默认值1
· saveStep:单选,可选项64、128、256、512、1024、2048、4096,默认值256,当参数checkpointSaveStrategy=step时,此参数有效
· seed:[1, 2147483647],默认值42
· lrSchedulerType:单选,可选项linear、cosine、polynomial、constant、constant_with_warmup,默认值linear
· numCycles:[0.1, 0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001, 0.0000010],默认值0.00000010,步长0.00000001
· power:[1, 3],默认值1
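
To make the scheduler parameters concrete, the sketch below computes a learning rate per step from warmupRatio, lrSchedulerType, numCycles, lrEnd and power. The parameter names and defaults resemble the schedule helpers in Hugging Face Transformers, and the formulas here follow those helpers; whether the platform uses exactly these formulas is an assumption, and the function name `learning_rate` is hypothetical.

```python
import math


def learning_rate(step, total_steps, base_lr=0.000001, scheduler="linear",
                  warmup_ratio=0.1, num_cycles=0.5, lr_end=0.0000001, power=1.0):
    """Learning rate at `step` (0-based) for the scheduler types in the table."""
    if scheduler == "constant":
        return base_lr
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear warmup from 0 to base_lr.
        return base_lr * step / max(1, warmup_steps)
    if scheduler == "constant_with_warmup":
        return base_lr
    # Fraction of the post-warmup phase that has elapsed, capped at 1.
    progress = min(1.0, (step - warmup_steps) / max(1, total_steps - warmup_steps))
    if scheduler == "linear":
        return base_lr * (1.0 - progress)
    if scheduler == "cosine":
        # numCycles=0.5 gives a single half-cosine decay to 0.
        return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * 2.0 * num_cycles * progress)))
    if scheduler == "polynomial":
        # Decays from base_lr to lr_end; power=1 is a straight line.
        return (base_lr - lr_end) * (1.0 - progress) ** power + lr_end
    raise ValueError(f"unknown lrSchedulerType: {scheduler}")
```

Under these formulas numCycles only matters for cosine, and lrEnd and power only matter for polynomial.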

SimPO

model trainMode parameterScale hyperParameterConfig
ERNIE-Character-Fiction-8K-1028 SimPO FullFineTuning · epoch:[1,50],默认值1
· learningRate:[0.0000001,0.01],默认值0.000001,步长0.0000001
· maxSeqLen:单选,512 或 1024 或 2048 或 4096 或 8192,默认值4096
· globalBatchSize:[1,10000],默认值16,步长1(当maxSeqLen=4096时,推荐步长2)
· loggingSteps:1
· warmupRatio:[0.01,0.5],默认值0.1,步长0.01
· weightDecay:[0.0001,0.1],默认值0.01,步长0.0001
· simpoBeta:[2,2.5],默认值2,步长0.001
· simpoGamma:[0.01,1.5],默认值0.5,步长0.001
· checkpointSaveStrategy: 单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64,当参数checkpointSaveStrategy=step时,此参数有效
· seed:[1,2147483647],默认值42,步长1
· lrSchedulerType:单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值linear
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001,0.000001],默认值0.0000001,步长0.00000001
· power:[1,3],默认值1
· validationStep:[0, 1000000],默认值16,步长1
· 早停策略相关参数:
earlyStopping:True 或 False,默认False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:ValidationLoss,当参数earlyStopping为True时,此参数有效
earlyStoppingThreshold:[0,5] ,默认值 0.01,步长0.01,当参数earlyStopping为True时,此参数有效
earlyStoppingPatience:[1,50],默认值 3,步长1,当参数earlyStopping为True时,此参数有效
ERNIE-Lite-128K-0722 SimPO FullFineTuning · epoch:[1,50],默认值1
· learningRate:[0.0000001, 0.01],默认值0.00003,步长0.000001
· maxSeqLen:单选,8192 或 16384 或 32768 或 65536 或 131072,默认值32768
· globalBatchSize:[1,10000],默认值16,步长1(当maxSeqLen=16384时,推荐步长2;当maxSeqLen=65536时,推荐步长2)
· loggingSteps:1
· warmupRatio:[0.01,0.5],默认值0.1,步长0.01
· weightDecay:[0.0001,0.1],默认值0.01,步长0.0001
· simpoBeta:[2,2.5],默认值2,步长0.001
· simpoGamma:[0.01,1.5],默认值0.5,步长0.001
· checkpointSaveStrategy: 单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64,当参数checkpointSaveStrategy=step时,此参数有效
· seed:[1,2147483647],默认值42,步长1
· lrSchedulerType:单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值linear
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001,0.000001],默认值0.0000001,步长0.00000001
· power:[1,3],默认值1
· validationStep:[0, 1000000],默认值16,步长1
· 早停策略相关参数:
earlyStopping:True 或 False,默认False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:ValidationLoss,当参数earlyStopping为True时,此参数有效
earlyStoppingThreshold:[0,5] ,默认值 0.01,步长0.01,当参数earlyStopping为True时,此参数有效
earlyStoppingPatience:[1,50],默认值 3,步长1,当参数earlyStopping为True时,此参数有效
· tensorParallelDegree:[1,8],默认值4
· shardingParallelDegree:[1,64],默认值2
· sharding:stage1 或 stage2 或 stage3,默认值stage2
· recompute:0 或 1,默认值1
ERNIE-Lite-8K-0308 SimPO FullFineTuning · epoch:[1,50],默认值1
· learningRate:[0.0000001,0.01],默认值0.000001,步长0.0000001
· maxSeqLen:单选,512 或 1024 或 2048 或 4096 或 8192,默认值4096
· globalBatchSize:[1,10000],默认值16,步长4
· loggingSteps:1
· warmupRatio:[0.01,0.5],默认值0.1,步长0.01
· weightDecay:[0.0001,0.1],默认值0.01,步长0.0001
· simpoBeta:[2,2.5],默认值2,步长0.001
· simpoGamma:[0.01,1.5],默认值0.5,步长0.001
· checkpointSaveStrategy: 单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64,当参数checkpointSaveStrategy=step时,此参数有效
· seed:[1,2147483647],默认值42,步长1
· lrSchedulerType:单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值linear
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001,0.000001],默认值0.0000001,步长0.00000001
· power:[1,3],默认值1
· validationStep:[0, 1000000],默认值16,步长1
· 早停策略相关参数:
earlyStopping:True 或 False,默认False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:ValidationLoss,当参数earlyStopping为True时,此参数有效
earlyStoppingThreshold:[0,5] ,默认值 0.01,步长0.01,当参数earlyStopping为True时,此参数有效
earlyStoppingPatience:[1,50],默认值 3,步长1,当参数earlyStopping为True时,此参数有效
· tensorParallelDegree:[1,8],默认值2
· shardingParallelDegree:[1,64],默认值4
· sharding:stage1 或 stage2 或 stage3,默认值stage2
ERNIE-Speed-8K SimPO FullFineTuning · epoch:[1,50],默认值1
· learningRate:[0.0000001,0.01],默认值0.000001,步长0.0000001
· maxSeqLen:单选,512 或 1024 或 2048 或 4096 或 8192,默认值4096
· globalBatchSize:[1,10000],默认值16,步长1(当maxSeqLen=4096时,推荐步长2)
· loggingSteps:1
· warmupRatio:[0.01,0.5],默认值0.1,步长0.01
· weightDecay:[0.0001,0.1],默认值0.01,步长0.0001
· simpoBeta:[2,2.5],默认值2,步长0.001
· simpoGamma:[0.01,1.5],默认值0.5,步长0.001
· checkpointSaveStrategy: 单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64,当参数checkpointSaveStrategy=step时,此参数有效
· seed:[1,2147483647],默认值42,步长1
· lrSchedulerType:单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值linear
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001,0.000001],默认值0.0000001,步长0.00000001
· power:[1,3],默认值1
· validationStep:[0, 1000000],默认值16,步长1
· 早停策略相关参数:
earlyStopping:True 或 False,默认False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:ValidationLoss,当参数earlyStopping为True时,此参数有效
earlyStoppingThreshold:[0,5] ,默认值 0.01,步长0.01,当参数earlyStopping为True时,此参数有效
earlyStoppingPatience:[1,50],默认值 3,步长1,当参数earlyStopping为True时,此参数有效
ERNIE-Speed-Pro-128K SimPO FullFineTuning · epoch:[1,50],默认值1
· learningRate:[0.0000001,0.01],默认值0.00003,步长0.000001
· maxSeqLen:单选,8192 或 16384 或 32768 或 65536 或 131072,默认值32768
· globalBatchSize:[1,10000],默认值16,步长1(当maxSeqLen=16384时,推荐步长2;当maxSeqLen=32768时,推荐步长2;当maxSeqLen=65536时,推荐步长2)
· loggingSteps:1
· warmupRatio:[0.01,0.5],默认值0.1,步长0.01
· weightDecay:[0.0001,0.1],默认值0.01,步长0.0001
· simpoBeta:[2,2.5],默认值2,步长0.001
· simpoGamma:[0.01,1.5],默认值0.5,步长0.001
· checkpointSaveStrategy: 单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64,当参数checkpointSaveStrategy=step时,此参数有效
· seed:[1,2147483647],默认值42,步长1
· lrSchedulerType:单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值linear
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001,0.000001],默认值0.0000001,步长0.00000001
· power:[1,3],默认值1
· validationStep:[0, 1000000],默认值16,步长1
· 早停策略相关参数:
earlyStopping:True 或 False,默认False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:ValidationLoss,当参数earlyStopping为True时,此参数有效
earlyStoppingThreshold:[0,5] ,默认值 0.01,步长0.01,当参数earlyStopping为True时,此参数有效
earlyStoppingPatience:[1,50],默认值 3,步长1,当参数earlyStopping为True时,此参数有效
· sharding:stage1 或 stage2 或 stage3,默认值stage2
· recompute:0 或 1,默认值 1
ERNIE-Tiny-8K SimPO FullFineTuning · epoch:[1,50],默认值1
· learningRate:[0.0000001,0.01],默认值0.000001,步长0.0000001
· maxSeqLen:单选,512 或 1024 或 2048 或 4096 或 8192,默认值4096
· globalBatchSize:[1,10000],默认16,步长8(当maxSeqLen=4096时,推荐步长16)
· loggingSteps:1
· warmupRatio:[0.01,0.5],默认值0.1,步长0.01
· weightDecay:[0.0001,0.1],默认值0.01,步长0.0001
· simpoBeta:[2,2.5],默认值2,步长0.001
· simpoGamma:[0.01,1.5],默认值0.5,步长0.001
· checkpointSaveStrategy: 单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64,当参数checkpointSaveStrategy=step时,此参数有效
· seed:[1,2147483647],默认值42,步长1
· lrSchedulerType:单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值linear
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001,0.000001],默认值0.0000001,步长0.00000001
· power:[1,3],默认值1
· validationStep:[0, 1000000],默认值16,步长1
· 早停策略相关参数:
earlyStopping:True 或 False,默认False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:ValidationLoss,当参数earlyStopping为True时,此参数有效
earlyStoppingThreshold:[0,5] ,默认值 0.01,步长0.01,当参数earlyStopping为True时,此参数有效
earlyStoppingPatience:[1,50],默认值 3,步长1,当参数earlyStopping为True时,此参数有效
· tensorParallelDegree:[1,8],默认1
· shardingParallelDegree:[1,64],默认8
ERNIE-Tiny-128K-0929 SimPO FullFineTuning · epoch:[1,50],默认值1
· learningRate:[0.0000001,0.01],默认值0.00003,步长0.000001
· maxSeqLen:单选,8192 或 16384 或 32768 或 65536 或 131072,默认值32768
· globalBatchSize:[1,10000],默认值16,步长1(当maxSeqLen=16384时,推荐步长4;当maxSeqLen=32768时,推荐步长4;当maxSeqLen=65536时,推荐步长2;当maxSeqLen=131072时,推荐步长8)
· loggingSteps:1
· warmupRatio:[0.01,0.5],默认值0.1,步长0.01
· weightDecay:[0.0001,0.1],默认值0.01,步长0.0001
· simpoBeta:[2,2.5],默认值2,步长0.001
· simpoGamma:[0.01,1.5],默认值0.5,步长0.001
· checkpointSaveStrategy: 单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64,当参数checkpointSaveStrategy=step时,此参数有效
· seed:[1,2147483647],默认值42,步长1
· lrSchedulerType:单选,linear 或 cosine 或 polynomial 或 constant 或 constant_with_warmup,默认值linear
· numCycles:[0.1,0.5],默认值0.5,步长0.1
· lrEnd:[0.00000001,0.000001],默认值0.0000001,步长0.00000001
· power:[1,3],默认值1
· validationStep:[0,1000000],默认值16,步长1
· 早停策略相关参数:
earlyStopping:True 或 False,默认False,当参数checkpointSaveStrategy=step时,此参数有效
earlyStopMetric:ValidationLoss,当参数earlyStopping为True时,此参数有效
earlyStoppingThreshold:[0,5] ,默认值 0.01,步长0.01,当参数earlyStopping为True时,此参数有效
earlyStoppingPatience:[1,50],默认值 3,步长1,当参数earlyStopping为True时,此参数有效
· 仅LoRA支持:
loraRank:单选,2 或 4 或 8 ,默认值8
· tensorParallelDegree:[1,8],默认值1
· shardingParallelDegree:[1,64],默认值8
· sharding:stage1 或 stage2 或 stage3,默认值stage2
· recompute:0 或 1,默认值1
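
As a rough guide to what simpoBeta and simpoGamma control, the sketch below evaluates the SimPO loss from the published method for one preference pair: a length-normalized log-probability reward scaled by beta, with gamma as the target margin. Mapping simpoBeta to beta and simpoGamma to the margin is an assumption; this page does not document the platform's exact loss, and the function name is hypothetical.

```python
import math


def simpo_loss(chosen_logprob_sum, chosen_len, rejected_logprob_sum, rejected_len,
               simpo_beta=2.0, simpo_gamma=0.5):
    """SimPO loss for one (chosen, rejected) pair.

    *_logprob_sum: sum of token log-probabilities of the response under the policy.
    *_len: number of tokens in the response (used for length normalization).
    """
    chosen_reward = simpo_beta * chosen_logprob_sum / chosen_len
    rejected_reward = simpo_beta * rejected_logprob_sum / rejected_len
    margin = chosen_reward - rejected_reward - simpo_gamma
    # -log(sigmoid(margin)), written in a numerically stable form.
    return math.log1p(math.exp(-abs(margin))) + max(-margin, 0.0)
```

A larger simpoGamma demands a wider gap between the chosen and rejected responses before the loss becomes small; a larger simpoBeta scales that gap.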

图像生成类

model trainMode parameterScale hyperParameterConfig
WENXIN-YIGE SFT FullFineTuning · epoch:[1,100],默认值20
· learningRate:[0.00000001,0.01],默认值0.00001
· batchSize:[1,8],默认值8
Stable-Diffusion-XL-Base-1.0 SFT LoRA · epoch:[1,100],默认值20
· learningRate:[0.00001,0.0001],默认值0.00005
· batchSize:[2,8],默认值8

图像理解类型

model trainMode parameterScale hyperParameterConfig
LLAVA-V1.6-13B SFT FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:
FullFineTuning:[0.0000000001,0.001],默认值0.00001,递增步长0.000001
LoRA:[0.0000000001,0.001],默认值0.0001,递增步长0.00004
· validationStep:[0,1000000],默认值16,递增步长1
· batchSize:默认值1,步长1,取值范围如下:
当maxSeqLen为4096时,取值范围为[1,8]
当maxSeqLen为2048、1024、512时,取值范围为[1,16]
· checkpointSaveStrategy:单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64
· schedulerName:单选:linear、cosine、polynomial、constant、constant_with_warmup,默认值cosine
· warmupRatio:[0.01,0.1],默认值0.05,递增步长0.001
· weightDecay:[0.001,1],默认值0.1,递增步长0.001
· maxSeqLen:单选:512、1024、2048、4096,默认值2048
· freezeViT:布尔值,True 或 False,默认False
· 仅LoRA支持:
loraRank:单选,8 或 16 或 32 或 64 或 128 或 256,默认值8
InternVL2-2B SFT FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:
FullFineTuning: [0.0000000001,0.001], default 0.00001, increment step 0.000001
LoRA:[0.0000000001,0.001],默认值0.0001,递增步长0.00004
· validationStep:[0, 1000000],默认值16,递增步长1
· batchSize:默认值1,步长1,取值范围如下:
当maxSeqLen为8192时,取值范围为[1,4]
当maxSeqLen为4096时,取值范围为[1,8]
当maxSeqLen为2048、1024、512时,取值范围为[1,16]
· checkpointSaveStrategy:单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64,当参数checkpointSaveStrategy=step时,此参数有效
· schedulerName:单选:linear、cosine、polynomial、constant、constant_with_warmup,默认值cosine
· warmupRatio:[0.01, 0.1],默认值0.05,递增步长0.001
· weightDecay:[0.001, 1],默认值0.1,递增步长0.001
· maxSeqLen:单选:512、1024、2048、4096、8192,默认值2048
· freezeViT:布尔值,True 或 False,默认False
· 仅LoRA支持:
loraRank:单选,8 或 16 或 32 或 64 或 128 或 256,默认值8
Qwen2-VL-7B SFT FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:
FullFineTuning:[0.0000000001,0.001],默认值0.00001,递增步长0.000001
LoRA:[0.0000000001,0.001],默认值0.0001,递增步长0.00004
· batchSize:默认值1,步长1,取值范围如下:
当maxSeqLen为8192时,取值范围为[1,4]
当maxSeqLen为4096时,取值范围为[1,8]
当maxSeqLen为2048、1024、512时,取值范围为[1,16]
· checkpointSaveStrategy:单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep: [1,50000], default 64; this parameter takes effect when checkpointSaveStrategy=step
· validationStep:[0, 1000000],默认值16,递增步长1
· schedulerName:单选:linear、cosine、polynomial、constant、constant_with_warmup,默认值cosine
· warmupRatio:[0.01, 0.1],默认值0.05,递增步长0.001
· weightDecay:[0.001, 1],默认值0.1,递增步长0.001
· maxSeqLen:单选:512、1024、2048、4096、8192,默认值2048
· freezeViT:布尔值,true 或 false,默认false
· 仅LoRA支持:
loraRank:单选,8 或 16 或 32 或 64 或 128 或 256,默认值8
InternVL2-8B SFT FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:
FullFineTuning:[0.0000000001,0.001],默认值0.00001,递增步长0.000001
LoRA: [0.0000000001,0.001],默认值0.0001,步长0.00004
· validationStep:[0, 1000000], 默认值16,递增步长1
· batchSize:默认值1,步长1,取值范围如下:
当maxSeqLen为8192时,取值范围为[1,4]
当maxSeqLen为4096时,取值范围为[1,8]
当maxSeqLen为2048、1024、512时,取值范围为[1,16]
· checkpointSaveStrategy:单选,step 或 epoch,默认step
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64,当参数checkpointSaveStrategy=step时,此参数有效
· schedulerName:单选:linear、cosine、polynomial、constant、constant_with_warmup,默认值cosine
· warmupRatio:[0.01, 0.1],默认值0.05,递增步长0.001
· weightDecay:[0.001, 1],默认值0.1,递增步长0.001
· maxSeqLen:单选:512、1024、2048、4096、8192,默认值2048
· freezeViT:布尔值,True 或 False,默认False
· 仅LoRA支持:
loraRank:单选,8 或 16 或 32 或 64 或 128 或 256,默认值8
InternLM-XComposer2.5 SFT FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:[0.0000000001,0.001],默认值0.00001,步长0.00001
· batchSize:默认值1,步长1,取值范围如下:
当maxSeqLen为8192时,取值范围为[1,4]
当maxSeqLen为4096时,取值范围为[1,8]
当maxSeqLen为2048、1024、512时,取值范围为[1,16]
· checkpointSaveStrategy:单选,step 或 epoch,默认step
· validationStep:[0, 1000000],默认值16,递增步长1
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64,当参数checkpointSaveStrategy=step时,此参数有效
· schedulerName:单选:linear、cosine、polynomial、constant、constant_with_warmup,默认值cosine
· warmupRatio:[0.01,0.1],默认值0.05,递增步长0.001
· weightDecay:[0.001, 1],默认值0.1,递增步长0.001
· maxSeqLen:单选:512、1024、2048、4096、8192,默认值2048
· freezeViT:布尔值,True 或 False,默认False
· 仅LoRA支持:
loraRank:单选,8 或 16 或 32 或 64 或 128 或 256,默认值8
Qwen2-VL-2B SFT FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:
FullFineTuning:[0.0000000001,0.001],默认值0.00001,递增步长0.00004
LoRA:[0.0000000001,0.001],默认值0.0001,递增步长0.00004
· batchSize:默认值1,步长1,取值范围如下:
当maxSeqLen为8192时,取值范围为[1,4]
当maxSeqLen为4096时,取值范围为[1,8]
当maxSeqLen为2048、1024、512时,取值范围为[1,16]
· checkpointSaveStrategy:step 或 epoch,默认值step
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64,当checkpointSaveStrategy=step时,此参数有效
· validationStep:[0, 1000000],默认值16,递增步长1
· schedulerName:单选:linear、cosine、polynomial、constant、constant_with_warmup,默认值cosine
· warmupRatio:[0.01, 0.1],默认值0.05,递增步长0.001
· weightDecay:[0.001, 1],默认值0.1,递增步长0.001
· maxSeqLen:单选:512、1024、2048、4096、8192,默认值2048
· freezeViT:布尔值,true 或 false,默认false
· Supported for LoRA only:
loraRank:单选,8 或 16 或 32 或 64 或 128 或 256,默认值8
InternVL2.5-8B SFT FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:
FullFineTuning:[0.0000000001,0.001],默认值0.00001,递增步长0.00004
LoRA:[0.0000000001,0.001],默认值0.0001,递增步长0.00004
· batchSize:默认值1,步长1,取值范围如下:
当maxSeqLen为8192时,取值范围为[1,4]
当maxSeqLen为4096时,取值范围为[1,8]
当maxSeqLen为2048、1024、512时,取值范围为[1,16]
· checkpointSaveStrategy:单选,step 或 epoch,默认step
· validationStep:[0, 1000000],默认值16,递增步长1
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64,当参数checkpointSaveStrategy=step时,此参数有效
· schedulerName:单选:linear、cosine、polynomial、constant、constant_with_warmup,默认值cosine
· warmupRatio:[0.01,0.1],默认值0.05,递增步长0.001
· weightDecay:[0.001, 1],默认值0.1,递增步长0.001
· maxSeqLen:单选:512、1024、2048、4096、8192,默认值2048
· freezeViT:布尔值,True 或 False,默认False
· 仅LoRA支持:
loraRank:单选,8 或 16 或 32 或 64 或 128 或 256,默认值8
Qwen2.5-VL-7B-Instruct SFT FullFineTuning、LoRA · epoch:[1,50],默认值1
· learningRate:
FullFineTuning:[0.0000000001,0.001],默认值0.00001,递增步长0.00004
LoRA:[0.0000000001,0.001],默认值0.0001,递增步长0.00004
· batchSize:默认值1,步长1,取值范围如下:
当maxSeqLen为8192时,取值范围为[1,4]
当maxSeqLen为4096时,取值范围为[1,8]
当maxSeqLen为2048、1024、512时,取值范围为[1,16]
· checkpointSaveStrategy:step 或 epoch,默认值step
· checkpointCount:[1,10],默认值1
· saveStep:[1,50000],默认值64,当checkpointSaveStrategy=step时,此参数有效
· validationStep:[0, 1000000],默认值16,递增步长1
· schedulerName:单选:linear、cosine、polynomial、constant、constant_with_warmup,默认值cosine
· warmupRatio:[0.01, 0.1],默认值0.05,递增步长0.001
· weightDecay:[0.001, 1],默认值0.1,递增步长0.001
· maxSeqLen:单选:512、1024、2048、4096、8192,默认值2048
· freezeViT:布尔值,true 或 false,默认false
· Supported for LoRA only:
loraRank:单选,8 或 16 或 32 或 64 或 128 或 256,默认值8
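
For the image understanding models, the allowed batchSize depends on maxSeqLen. The sketch below encodes the ranges listed in the rows above; the function names are hypothetical, and LLAVA-V1.6-13B does not offer maxSeqLen=8192, so a caller would need to reject that value for that model separately.

```python
def batch_size_range(max_seq_len):
    """Return the (min, max) batchSize allowed for a given maxSeqLen."""
    if max_seq_len == 8192:
        return (1, 4)
    if max_seq_len == 4096:
        return (1, 8)
    if max_seq_len in (512, 1024, 2048):
        return (1, 16)
    raise ValueError(f"unsupported maxSeqLen: {max_seq_len}")


def is_valid_batch_size(batch_size, max_seq_len):
    """True if batch_size lies inside the range for max_seq_len."""
    low, high = batch_size_range(max_seq_len)
    return low <= batch_size <= high
```

With the default maxSeqLen of 2048, any batchSize from 1 to 16 is accepted; raising maxSeqLen to 8192 narrows that to 1 through 4.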