
PaddleDetection's ppyoloe_seg README.md lacks ONNX conversion + trtexec testing #9289

Open
zjykzj opened this issue Jan 21, 2025 · 8 comments


zjykzj commented Jan 21, 2025

Document Links & Description

The PP-YOLOE Instance Segmentation documentation only provides the training records for this algorithm; it does not cover ONNX conversion or trtexec benchmarking, unlike the PP-YOLOE implementation:

# Export the model
python tools/export_model.py -c configs/ppyoloe/ppyoloe_plus_crn_s_80e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_s_80e_coco.pdparams exclude_nms=True trt=True

# Convert to ONNX format
paddle2onnx --model_dir output_inference/ppyoloe_plus_crn_s_80e_coco --model_filename model.pdmodel --params_filename model.pdiparams --opset_version 12 --save_file ppyoloe_plus_crn_s_80e_coco.onnx

# Benchmark speed, FP16, batch_size=1
trtexec --onnx=./ppyoloe_plus_crn_s_80e_coco.onnx --saveEngine=./ppyoloe_s_bs1.engine --workspace=1024 --avgRuns=1000 --shapes=image:1x3x640x640,scale_factor:1x2 --fp16

# Benchmark speed, FP16, batch_size=32
trtexec --onnx=./ppyoloe_plus_crn_s_80e_coco.onnx --saveEngine=./ppyoloe_s_bs32.engine --workspace=1024 --avgRuns=1000 --shapes=image:32x3x640x640,scale_factor:32x2 --fp16

# With the scripts above, on a T4 with TensorRT 7.2, PP-YOLOE-plus-s runs at:
# batch_size=1, 2.80ms, 357fps
# batch_size=32, 67.69ms, 472fps

Could this part of the documentation be completed? The ppyoloe_seg algorithm is applicable to most real-time instance segmentation scenarios. Thank you very much!

Please give your suggestion

No response

@Bobholamovic (Member)

Hello! At present we cannot guarantee that every model can be exported to ONNX and run with TensorRT inference. We suggest following the PP-YOLOE documentation to try exporting and running inference with the ppyoloe_seg model; if you run into problems, feel free to discuss them here.


zjykzj commented Jan 22, 2025

> Hello! At present we cannot guarantee that every model can be exported to ONNX and run with TensorRT inference. We suggest following the PP-YOLOE documentation to try exporting and running inference with the ppyoloe_seg model; if you run into problems, feel free to discuss them here.

@Bobholamovic Thank you very much for your reply. In fact, I have already tried the PP-YOLOE ONNX export and trtexec test scripts; the log is as follows:

python tools/export_model.py -c configs/ppyoloe_seg/ppyoloe_seg_s_80e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_seg_s_80e_coco.pdparams exclude_nms=True trt=True
grep: warning: GREP_OPTIONS is deprecated; please use an alias or script
Warning: Unable to use numba in PP-Tracking, please install numba, for example(python3.7): `pip install numba==0.56.4`
Warning: Unable to use numba in PP-Tracking, please install numba, for example(python3.7): `pip install numba==0.56.4`
Warning: import ppdet from source directory without installing, run 'python setup.py install' to install ppdet firstly
[01/22 05:49:39] ppdet.utils.download INFO: Downloading ppyoloe_seg_s_80e_coco.pdparams from https://paddledet.bj.bcebos.com/models/ppyoloe_seg_s_80e_coco.pdparams
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 35276/35276 [00:15<00:00, 2273.62KB/s]
[01/22 05:49:55] ppdet.utils.checkpoint INFO: Finish loading model weights: /root/.cache/paddle/weights/ppyoloe_seg_s_80e_coco.pdparams
loading annotations into memory...
Done (t=0.80s)
creating index...
index created!
[01/22 05:49:57] ppdet.engine INFO: Export inference config file to output_inference/ppyoloe_seg_s_80e_coco/infer_cfg.yml
Traceback (most recent call last):
  File "/data/zj/paddle/PaddleDetection/tools/export_model.py", line 118, in <module>
    main()
  File "/data/zj/paddle/PaddleDetection/tools/export_model.py", line 114, in main
    run(FLAGS, cfg)
  File "/data/zj/paddle/PaddleDetection/tools/export_model.py", line 80, in run
    trainer.export(FLAGS.output_dir, for_fd=FLAGS.for_fd)
  File "/data/zj/paddle/PaddleDetection/ppdet/engine/trainer.py", line 1282, in export
    static_model, pruned_input_spec = self._get_infer_cfg_and_input_spec(
  File "/data/zj/paddle/PaddleDetection/ppdet/engine/trainer.py", line 1233, in _get_infer_cfg_and_input_spec
    input_spec, static_model.forward.main_program,
  File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/program_translator.py", line 1062, in main_program
    concrete_program = self.concrete_program
  File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/program_translator.py", line 941, in concrete_program
    return self.concrete_program_specify_input_spec(input_spec=None)
  File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/program_translator.py", line 986, in concrete_program_specify_input_spec
    concrete_program, _ = self.get_concrete_program(
  File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/program_translator.py", line 875, in get_concrete_program
    concrete_program, partial_program_layer = self._program_cache[
  File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/program_translator.py", line 1648, in __getitem__
    self._caches[item_id] = self._build_once(item)
  File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/program_translator.py", line 1575, in _build_once
    concrete_program = ConcreteProgram.from_func_spec(
  File "/usr/local/lib/python3.10/dist-packages/decorator.py", line 232, in fun
    return caller(func, *(extras + args), **kw)
  File "/usr/local/lib/python3.10/dist-packages/paddle/base/wrapped_decorator.py", line 26, in __impl__
    return wrapped_func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/paddle/base/dygraph/base.py", line 68, in __impl__
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/program_translator.py", line 1346, in from_func_spec
    error_data.raise_new_exception()
  File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/error.py", line 452, in raise_new_exception
    raise new_exception from None
TypeError: In transformed code:

    File "/data/zj/paddle/PaddleDetection/ppdet/modeling/architectures/meta_arch.py", line 59, in forward
        if self.training:
    File "/data/zj/paddle/PaddleDetection/ppdet/modeling/architectures/meta_arch.py", line 69, in forward
        for inp in inputs_list:
    File "/data/zj/paddle/PaddleDetection/ppdet/modeling/architectures/meta_arch.py", line 76, in forward
        outs.append(self.get_pred())
    File "/data/zj/paddle/PaddleDetection/ppdet/modeling/architectures/ppyoloe.py", line 144, in get_pred
        return self._forward()
    File "/data/zj/paddle/PaddleDetection/ppdet/modeling/architectures/ppyoloe.py", line 99, in _forward
        if self.training or self.is_teacher:
    File "/data/zj/paddle/PaddleDetection/ppdet/modeling/architectures/ppyoloe.py", line 114, in _forward
        if self.post_process is not None:
    File "/data/zj/paddle/PaddleDetection/ppdet/modeling/architectures/ppyoloe.py", line 120, in _forward
        if not self.with_mask:
    File "/data/zj/paddle/PaddleDetection/ppdet/modeling/architectures/ppyoloe.py", line 124, in _forward
        bbox, bbox_num, mask, nms_keep_idx = self.yolo_head.post_process(
    File "/data/zj/paddle/PaddleDetection/ppdet/modeling/heads/ppyoloe_ins_head.py", line 694, in post_process
        if self.exclude_post_process:
    File "/data/zj/paddle/PaddleDetection/ppdet/modeling/heads/ppyoloe_ins_head.py", line 705, in post_process
        if bbox_num.sum() > 0:
    File "/data/zj/paddle/PaddleDetection/ppdet/modeling/heads/ppyoloe_ins_head.py", line 707, in post_process
        if bbox_num.sum() > 0:
            pred_mask_coeffs = pred_mask_coeffs.transpose([0, 2, 1])
            mask_coeffs = paddle.gather(
            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
                pred_mask_coeffs.reshape([-1, self.num_masks]), keep_idxs)

    File "/usr/local/lib/python3.10/dist-packages/paddle/tensor/manipulation.py", line 3010, in gather
        check_variable_and_dtype(index, 'index', ['int32', 'int64'], 'gather')
    File "/usr/local/lib/python3.10/dist-packages/paddle/base/data_feeder.py", line 170, in check_variable_and_dtype
        check_type(input, input_name, Variable, op_name, extra_message)
    File "/usr/local/lib/python3.10/dist-packages/paddle/base/data_feeder.py", line 201, in check_type
        raise TypeError(

    TypeError: The type of 'index' in gather must be (<class 'paddle.base.framework.Variable'>, <class 'paddle.Tensor'>), but received <class 'NoneType'>.
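One plausible reading of this traceback: the mask branch of `post_process` calls `paddle.gather` with `keep_idxs`, but with `exclude_nms=True` the NMS step that would produce those indices never runs, so `keep_idxs` is `None`. A minimal, paddle-free sketch of this failure pattern (the function name and list-based "tensors" are illustrative, not the actual PaddleDetection code):

```python
def gather_mask_coeffs(pred_mask_coeffs, keep_idxs):
    """Stand-in for paddle.gather: select rows of mask coefficients by index."""
    if keep_idxs is None:
        # paddle.gather rejects a None index; with exclude_nms=True the NMS
        # step that would have produced keep_idxs is skipped, so it stays None.
        raise TypeError("The type of 'index' in gather must be a Tensor, "
                        "but received NoneType")
    return [pred_mask_coeffs[i] for i in keep_idxs]

coeffs = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
print(gather_mask_coeffs(coeffs, [0, 2]))  # normal path: rows 0 and 2
try:
    gather_mask_coeffs(coeffs, None)       # mirrors the export failure above
except TypeError as err:
    print("reproduced:", err)
```

If this reading is right, the head's mask post-processing would need an `exclude_nms` guard around the gather, similar to the one the detection-only PP-YOLOE head already has.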


zjykzj commented Jan 22, 2025

I used the Baidu Docker container ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/paddle:2.6.2-gpu-cuda11.2-cudnn8.2-trt8.0 and checked out PaddleDetection's latest branch, release/2.8.

No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 20.04.6 LTS
Release:        20.04
Codename:       focal

>>> import paddle
>>> paddle.__version__
'2.6.2'

$ git branch -vv
  release/2.6 7fde274c2 [origin/release/2.6] update paddlex of readme (#8770)
* release/2.8 7a4fc2578 [origin/release/2.8] update install command (#9230)
  v2.8.0      666f597fd Face det (#9179)

@Bobholamovic (Member)

It looks like the dynamic-to-static graph conversion failed. Please try the latest Paddle 3.0 release.


zjykzj commented Feb 5, 2025

> It looks like the dynamic-to-static graph conversion failed. Please try the latest Paddle 3.0 release.

@Bobholamovic Thank you very much. With the Paddle 3.0 container (paddlepaddle/paddle:3.0.0b1-gpu-cuda11.8-cudnn8.6-trt8.5), the ppyoloe_seg instance segmentation model can be successfully converted to ONNX format:

λ b2a8e6f217f3 /data/zj/paddle/PaddleDetection python tools/export_model.py -c configs/ppyoloe_seg/ppyoloe_seg_s_80e_xfy.yml -o exclude_nms=True
grep: warning: GREP_OPTIONS is deprecated; please use an alias or script
Warning: Unable to use numba in PP-Tracking, please install numba, for example(python3.7): `pip install numba==0.56.4`
Warning: Unable to use numba in PP-Tracking, please install numba, for example(python3.7): `pip install numba==0.56.4`
Warning: import ppdet from source directory without installing, run 'python setup.py install' to install ppdet firstly
[02/05 10:20:18] ppdet.utils.checkpoint INFO: Finish loading model weights: output/ppyoloe_seg_s_80e_xfy/model_final.pdparams
loading annotations into memory...
Done (t=1.22s)
creating index...
index created!
[02/05 10:20:19] ppdet.engine INFO: Export inference config file to output_inference/ppyoloe_seg_s_80e_xfy/infer_cfg.yml
I0205 10:20:24.771257   104 program_interpreter.cc:243] New Executor is Running.
[02/05 10:20:25] ppdet.engine INFO: Export model and saved in output_inference/ppyoloe_seg_s_80e_xfy


λ b2a8e6f217f3 /data/zj/paddle/PaddleDetection paddle2onnx --model_dir output_inference/ppyoloe_seg_s_80e_xfy --model_filename model.pdmodel --params_filename model.pdiparams --opset_version 13 --save_file ppyoloe_seg_s_80e_xfy.onnx
grep: warning: GREP_OPTIONS is deprecated; please use an alias or script
[Paddle2ONNX] Start to parse PaddlePaddle model...
[Paddle2ONNX] Model file path: output_inference/ppyoloe_seg_s_80e_xfy/model.pdmodel
[Paddle2ONNX] Parameters file path: output_inference/ppyoloe_seg_s_80e_xfy/model.pdiparams
[Paddle2ONNX] Start to parsing Paddle model...
[Paddle2ONNX] [reduce_mean: mean_0.tmp_0] Requires the minimal opset version of 11.
[Paddle2ONNX] [reduce_mean: mean_1.tmp_0] Requires the minimal opset version of 11.
[Paddle2ONNX] [reduce_mean: mean_2.tmp_0] Requires the minimal opset version of 11.
[Paddle2ONNX] [reduce_mean: mean_3.tmp_0] Requires the minimal opset version of 11.
[Paddle2ONNX] [multiclass_nms3: multiclass_nms3_0.tmp_1] Requires the minimal opset version of 10.
[Paddle2ONNX] [reduce_sum: sum_0.tmp_0] Requires the minimal opset version of 13.
[Paddle2ONNX] Detected there's control flow op('conditional_block/select_input') in your model, this requires the minimal opset version of 11.
[Paddle2ONNX] Detected there's control flow op('conditional_block/select_input') in your model, this requires the minimal opset version of 11.
[Paddle2ONNX] Detected there's control flow op('conditional_block/select_input') in your model, this requires the minimal opset version of 11.
[Paddle2ONNX] [gather: gather_0.tmp_0] While rank of index is 2, Requires the minimal opset version of 11.
[Paddle2ONNX] [range: range_0.tmp_0] Requires the minimal opset version of 11.
[Paddle2ONNX] [range: range_1.tmp_0] Requires the minimal opset version of 11.
[Paddle2ONNX] [round: round_0.tmp_0] Requires the minimal opset version of 11.
[Paddle2ONNX] [round: round_1.tmp_0] Requires the minimal opset version of 11.
[Paddle2ONNX] [slice: bilinear_interp_v2_1.tmp_0_slice_0] While has input StartsTensorList/EndsTensorListStridesTensorList, Requires the minimal opset version of 10.
[Paddle2ONNX] Use opset_version = 13 for ONNX export.
[WARN][Paddle2ONNX] [multiclass_nms3: multiclass_nms3_0.tmp_1] [WARNING] Due to the operator multiclass_nms3, the exported ONNX model will only supports inference with input batch_size == 1.
[Paddle2ONNX] PaddlePaddle model is exported as ONNX format now.

However, after I copied the ONNX model to an NVIDIA Jetson Xavier NX edge device, converting it to TRT format fails with an error. How should this be resolved?

nvidia@linux:~/zj/paddle$ trtexec --onnx=ppyoloe_seg_s_80e_xfy.onnx --saveEngine=ppyoloe_seg_s_80e_xfy.engine --workspace=1024 --avgRuns=1000 --shapes=image:1x3x640x640,scale_factor:1x2 --fp16
&&&& RUNNING TensorRT.trtexec # trtexec --onnx=ppyoloe_seg_s_80e_xfy.onnx --saveEngine=ppyoloe_seg_s_80e_xfy.engine --workspace=1024 --avgRuns=1000 --shapes=image:1x3x640x640,scale_factor:1x2 --fp16
[02/05/2025-18:28:15] [I] === Model Options ===
[02/05/2025-18:28:15] [I] Format: ONNX
[02/05/2025-18:28:15] [I] Model: ppyoloe_seg_s_80e_xfy.onnx
[02/05/2025-18:28:15] [I] Output:
[02/05/2025-18:28:15] [I] === Build Options ===
[02/05/2025-18:28:15] [I] Max batch: explicit
[02/05/2025-18:28:15] [I] Workspace: 1024 MB
[02/05/2025-18:28:15] [I] minTiming: 1
[02/05/2025-18:28:15] [I] avgTiming: 8
[02/05/2025-18:28:15] [I] Precision: FP32+FP16
[02/05/2025-18:28:15] [I] Calibration:
[02/05/2025-18:28:15] [I] Safe mode: Disabled
[02/05/2025-18:28:15] [I] Save engine: ppyoloe_seg_s_80e_xfy.engine
[02/05/2025-18:28:15] [I] Load engine:
[02/05/2025-18:28:15] [I] Builder Cache: Enabled
[02/05/2025-18:28:15] [I] NVTX verbosity: 0
[02/05/2025-18:28:15] [I] Inputs format: fp32:CHW
[02/05/2025-18:28:15] [I] Outputs format: fp32:CHW
[02/05/2025-18:28:15] [I] Input build shape: image=1x3x640x640+1x3x640x640+1x3x640x640
[02/05/2025-18:28:15] [I] Input build shape: scale_factor=1x2+1x2+1x2
[02/05/2025-18:28:15] [I] Input calibration shapes: model
[02/05/2025-18:28:15] [I] === System Options ===
[02/05/2025-18:28:15] [I] Device: 0
[02/05/2025-18:28:15] [I] DLACore:
[02/05/2025-18:28:15] [I] Plugins:
[02/05/2025-18:28:15] [I] === Inference Options ===
[02/05/2025-18:28:15] [I] Batch: Explicit
[02/05/2025-18:28:15] [I] Input inference shape: scale_factor=1x2
[02/05/2025-18:28:15] [I] Input inference shape: image=1x3x640x640
[02/05/2025-18:28:15] [I] Iterations: 10
[02/05/2025-18:28:15] [I] Duration: 3s (+ 200ms warm up)
[02/05/2025-18:28:15] [I] Sleep time: 0ms
[02/05/2025-18:28:15] [I] Streams: 1
[02/05/2025-18:28:15] [I] ExposeDMA: Disabled
[02/05/2025-18:28:15] [I] Spin-wait: Disabled
[02/05/2025-18:28:15] [I] Multithreading: Disabled
[02/05/2025-18:28:15] [I] CUDA Graph: Disabled
[02/05/2025-18:28:15] [I] Skip inference: Disabled
[02/05/2025-18:28:15] [I] Inputs:
[02/05/2025-18:28:15] [I] === Reporting Options ===
[02/05/2025-18:28:15] [I] Verbose: Disabled
[02/05/2025-18:28:15] [I] Averages: 1000 inferences
[02/05/2025-18:28:15] [I] Percentile: 99
[02/05/2025-18:28:15] [I] Dump output: Disabled
[02/05/2025-18:28:15] [I] Profile: Disabled
[02/05/2025-18:28:15] [I] Export timing to JSON file:
[02/05/2025-18:28:15] [I] Export output to JSON file:
[02/05/2025-18:28:15] [I] Export profile to JSON file:
[02/05/2025-18:28:15] [I]
----------------------------------------------------------------
Input filename:   ppyoloe_seg_s_80e_xfy.onnx
ONNX IR version:  0.0.7
Opset version:    13
Producer name:
Producer version:
Domain:
Model version:    0
Doc string:
----------------------------------------------------------------
[02/05/2025-18:28:18] [W] [TRT] onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
terminate called after throwing an instance of 'std::out_of_range'
  what():  Attribute not found: axes
Aborted
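The `Attribute not found: axes` abort is consistent with a known ONNX opset change (this is an inference from the log, not a verified diagnosis): in opset 13, ops such as `ReduceSum`, `Squeeze` and `Unsqueeze` moved `axes` from a node attribute to an input tensor, while TensorRT-7-era ONNX parsers (as shipped on older Jetson/JetPack releases) still look it up as an attribute. A dict-based sketch of the mismatch (the node dicts are illustrative, not real ONNX structures):

```python
# Opset 12 style: "axes" is a node attribute.
reduce_sum_opset12 = {
    "op_type": "ReduceSum",
    "inputs": ["x"],
    "attrs": {"axes": [1], "keepdims": 1},
}
# Opset 13 style: "axes" has become a tensor input.
reduce_sum_opset13 = {
    "op_type": "ReduceSum",
    "inputs": ["x", "axes"],
    "attrs": {"keepdims": 1},
}

def old_parser_axes(node):
    # TRT-7-style parsing: still expects "axes" as an attribute, so it
    # fails on an opset-13 node exactly like trtexec above.
    if "axes" not in node["attrs"]:
        raise KeyError("Attribute not found: axes")
    return node["attrs"]["axes"]

print(old_parser_axes(reduce_sum_opset12))  # [1]
try:
    old_parser_axes(reduce_sum_opset13)     # mirrors the trtexec abort
except KeyError as err:
    print("reproduced:", err)
```

Since the paddle2onnx log above reports that `reduce_sum` requires a minimal opset of 13, forcing `--opset_version 12` may not be possible for this model; upgrading the device's TensorRT (via a newer JetPack) or its onnx-tensorrt parser seems the more likely fix, though this is a guess from the log rather than a confirmed resolution.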

@Bobholamovic (Member)

We suggest first checking whether the Paddle static-graph model (pdmodel or json format) runs inference correctly (see the tutorial). If the static-graph model infers correctly, the problem most likely lies with the ONNX model, i.e. it was introduced in the paddle2onnx step; in that case we'd recommend opening an issue in the Paddle2ONNX repository.


zjykzj commented Feb 6, 2025

> We suggest first checking whether the Paddle static-graph model (pdmodel or json format) runs inference correctly (see the tutorial). If the static-graph model infers correctly, the problem most likely lies with the ONNX model, i.e. it was introduced in the paddle2onnx step; in that case we'd recommend opening an issue in the Paddle2ONNX repository.

Judging from the test results, inference with tools/infer.py plus the trained ppyoloe_seg_s segmentation model works fine on the GPU server:

λ b2a8e6f217f3 /data/zj/paddle/PaddleDetection CUDA_VISIBLE_DEVICES=1 python tools/infer.py -c configs/ppyoloe_seg/ppyoloe_seg_s_80e_xfy.yml -o use_gpu=true --infer_img=./dataset/xfy/images/val/646291.jpg --output_dir=./output/xfy/
grep: warning: GREP_OPTIONS is deprecated; please use an alias or script
Warning: Unable to use numba in PP-Tracking, please install numba, for example(python3.7): `pip install numba==0.56.4`
Warning: Unable to use numba in PP-Tracking, please install numba, for example(python3.7): `pip install numba==0.56.4`
Warning: import ppdet from source directory without installing, run 'python setup.py install' to install ppdet firstly
W0206 07:06:38.705178   408 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 12.3, Runtime API Version: 11.8
W0206 07:06:38.706279   408 gpu_resources.cc:164] device: 0, cuDNN Version: 8.9.
[02/06 07:06:39] ppdet.utils.checkpoint INFO: Finish loading model weights: output/ppyoloe_seg_s_80e_xfy/model_final.pdparams
loading annotations into memory...
Done (t=1.21s)
creating index...
index created!
loading annotations into memory...
Done (t=1.22s)
creating index...
index created!
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:04<00:00,  4.23s/it]

[02/06 07:06:49] ppdet.engine INFO: Detection bbox results save in ./output/xfy/646291.jpg

@Bobholamovic Does this confirm that the problem is in the ONNX format conversion rather than in Paddle's implementation of the ppyoloe_seg algorithm?

@Bobholamovic (Member)

Yes, in that case the problem most likely occurred in the paddle2onnx step.
