A problem using Knover
白羊Nealnice, posted 2020-11, views: 1646, replies: 2
-----------------------------------------------
INFO 2020-11-15 17:16:07,120 launch.py:216] get cluster from args:job_server:None pods:['rank:0 id:None addr:127.0.0.1 port:None visible_gpu:[] trainers:["gpu:[\'0\'] endpoint:127.0.0.1:59297 rank:0"]'] job_stage_flag:None hdfs:None
INFO 2020-11-15 17:16:07,121 utils.py:379] start trainer proc:['/opt/conda/envs/python35-paddle120-env/bin/python', '-u', './train.py', '--is_distributed', 'true', '--model', 'UnifiedTransformer\r', '--task', 'DialogGeneration\r', '--vocab_path', './config/vocab.txt\r', '--do_lower_case', 'false', '--spm_model_file', './config/spm.model\r', '--init_pretraining_params', '12L', '--init_checkpoint', '', '--train_file', '/home/aistudio/data/input/train.txt\r', '--valid_file', '/home/aistudio/data/input/valid.txt\r', '--data_format', 'numerical\r', '--file_format', 'file\r', '--config_path', './config/12L.json\r', '--max_src_len', '384', '--max_tgt_len', '128', '--max_seq_len', '512', '--in_tokens', 'true', '--batch_size', '8192', '--learning_rate', '1e-5', '--warmup_steps', '1000', '--weight_decay', '0.01', '--use_amp', 'true', '--use_recompute', 'false', '--num_epochs', '20', '--log_steps', '100', '--validation_steps', '1000', '--save_steps', '1000', '--save_path', '/home/aistudio/work/output\r', '--random_seed', '11'] env:{'FLAGS_selected_gpus': '0', 'PADDLE_TRAINER_ID': '0', 'PADDLE_CURRENT_ENDPOINT': '127.0.0.1:59297', 'PADDLE_TRAINERS_NUM': '1', 'PADDLE_TRAINER_ENDPOINTS': '127.0.0.1:59297'}
You are using Paddle compiled with TensorRT, but TensorRT dynamic library is not found. Ignore this if TensorRT is not needed.
Traceback (most recent call last):
  File "./train.py", line 173, in <module>
    args = setup_args()
  File "./train.py", line 50, in setup_args
    models.add_cmdline_args(parser)
  File "/home/aistudio/Knover/models/__init__.py", line 65, in add_cmdline_args
    raise ValueError(f"Unknown model type: {args.model}")
ValueError: Unknown model type: UnifiedTransformer
INFO 2020-11-15 17:16:13,140 utils.py:275] terminate all the procs
ERROR 2020-11-15 17:16:13,140 utils.py:445] ABORT!!! Out of all 1 trainers, the trainer process with rank=[0] was aborted. Please check its log.
INFO 2020-11-15 17:16:16,144 utils.py:275] terminate all the procs
+ exit_code=1
+ exit 1
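
One detail in the log may be worth checking: in the "start trainer proc" line, several arguments end in a carriage return ('UnifiedTransformer\r', 'DialogGeneration\r', './config/vocab.txt\r'). That pattern usually means the config file was saved with Windows (CRLF) line endings, in which case the model name passed to train.py would be 'UnifiedTransformer\r' rather than 'UnifiedTransformer'. This is an assumption drawn from the log alone and is not confirmed anywhere in this thread. A minimal sketch for spotting such arguments and converting a file to LF endings follows; the function names and the config path are hypothetical and not part of Knover.

```python
from pathlib import Path


def find_cr_args(argv):
    """Return the arguments that end with a carriage return ('\\r')."""
    return [arg for arg in argv if arg.endswith("\r")]


def normalize_newlines(data: bytes) -> bytes:
    """Convert CRLF and lone CR line endings to LF."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def convert_file(path) -> bool:
    """Rewrite a file in place with LF endings; return True if it changed."""
    p = Path(path)
    raw = p.read_bytes()
    fixed = normalize_newlines(raw)
    if fixed != raw:
        p.write_bytes(fixed)
    return fixed != raw


if __name__ == "__main__":
    # A few arguments copied from the log above.
    argv = ["--model", "UnifiedTransformer\r", "--task", "DialogGeneration\r",
            "--do_lower_case", "false"]
    print(find_cr_args(argv))
    # Hypothetical path: replace with the config file actually passed to the
    # training script, then uncomment.
    # convert_file("./config/my_train.conf")
```

If find_cr_args returns anything for the arguments shown in the log, converting the config file with convert_file (or a tool such as dos2unix) and re-running is a cheap first test.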

 

 

2 replies in total, last reply by 白羊Nealnice, 2020-12
#3 白羊Nealnice replied 2020-12

Solved. Windows does not support parallel training for now, so this cannot be run locally.

#2 野小桌子 replied 2020-11

Have you solved this problem yet?
