The need for complete code #1
Hi @JiacliUstc ,

Thank you for your interest in our work. You can find part of our changes to the TensorFlow source here.

I would like to kindly remind you that our modifications were based on an older version of TFLite. Since TFLite is regularly updated, the measurements may differ in the latest version. Our modifications mainly include: (1) updating the TFLite benchmark GPU delegate to report operation-wise latency; (2) repeating the dispatching of GPU kernels to acquire stable measurements. However, as far as I know, recent versions of TFLite have already introduced both GPU profiling and the benchmark option gpu_invoke_loop_times to enhance the stability of measurements. Therefore, I would suggest you download the latest pre-built native command-line binaries for the TFLite benchmark tools.
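For reference, a minimal sketch of driving a pre-built benchmark binary from Python over adb. The binary and model paths are placeholders, and the flag names (use_gpu, enable_op_profiling, num_runs, warmup_runs, plus the gpu_invoke_loop_times option mentioned above) follow the TFLite benchmark tool documentation; verify them against your binary version:

```python
# Sketch: run the pre-built TFLite benchmark binary on a device via adb and
# collect per-op GPU latency. Paths below are placeholders (push the binary
# and model with `adb push` first).
import subprocess

DEVICE_BIN = "/data/local/tmp/benchmark_model"   # placeholder path
DEVICE_MODEL = "/data/local/tmp/model.tflite"    # placeholder path

cmd = [
    "adb", "shell", DEVICE_BIN,
    f"--graph={DEVICE_MODEL}",
    "--use_gpu=true",               # run with the GPU delegate
    "--enable_op_profiling=true",   # report operation-wise latency
    "--gpu_invoke_loop_times=100",  # repeat GPU dispatch for stable numbers
    "--num_runs=50",
    "--warmup_runs=5",
]

# With op profiling enabled, stdout includes a per-op latency table.
result = subprocess.run(cmd, capture_output=True, text=True, check=True)
print(result.stdout)
```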
Hi Zhuojin,
I really appreciate you taking time out of your busy schedule to reply. In the meantime, I found the model_generator code in the nn-meter branch and got it running after some debugging. Before that, I had tried converting the TF models provided by the imgclsmob repository mentioned in your paper to TFLite. Unfortunately, those TF models were themselves converted from another framework, so the resulting tflite models contain many transpose operators, which clearly hurts inference efficiency. I have decided to use the nn-meter approach to generate the variant dataset, even though it covers far fewer model types. I may try to add more model types on top of nn-meter later, although that might be complicated for me. Anyway, thanks for your reply.
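As a quick way to check a converted model for such transpose operators, here is a minimal sketch using tf.lite.experimental.Analyzer, which is available in recent TensorFlow releases (the model path is a placeholder):

```python
# Sketch: inspect a converted .tflite file for layout-conversion overhead.
import tensorflow as tf

# Prints the op graph; models imported from PyTorch typically show many
# TRANSPOSE ops inserted to bridge NCHW and NHWC memory layouts.
tf.lite.experimental.Analyzer.analyze(model_path="model.tflite")
```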
Hi @JiacliUstc ,

Correct. Importing models from another framework can introduce additional operations, such as transposes, into your tflite models. This is primarily due to differences in memory formats (e.g., NHWC in TensorFlow vs. NCHW in PyTorch). It is recommended to implement your models directly in TensorFlow rather than converting them from PyTorch. For example, you should use the TensorFlow implementations provided by imgclsmob.
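To illustrate the recommended path, a minimal sketch that converts a natively defined TensorFlow (NHWC) model with TFLiteConverter, so no layout transposes are inserted. The Keras MobileNetV2 stand-in is illustrative only; in practice you would use the TensorFlow implementations from imgclsmob:

```python
# Sketch: define the model natively in TensorFlow (NHWC) and convert it
# directly, avoiding the cross-framework layout conversions discussed above.
import tensorflow as tf

model = tf.keras.applications.MobileNetV2(weights=None)  # native NHWC model

converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```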
Hello zhuojinl,

I am working on DNN inference and latency prediction on mobile devices, and I was lucky to come across your paper titled "A Benchmark for ML Inference Latency on Mobile Devices". However, I cannot find in this repository the modified tflite benchmark tool you mentioned, or the method for measuring stable inference latency. Could you provide this part of the code?