Paddleocr #108

Open
cagnulein wants to merge 34 commits into master from paddleocr
@cagnulein
Owner

No description provided.

@cagnulein
Owner Author

@victorypoint it's ready for you to test when you come back (I should probably modify the return value of the Paddle engine, because it's returning some verbose output that QZ probably won't handle yet, but the biggest part is done).
Have a nice trip!

@victorypoint
Collaborator

@cagnulein, I'm back from Scotland and just tested this latest APK. I made a number of speed and incline changes. Here is the logcat:

stream-logcat.txt

@cagnulein
Owner Author

Welcome back @victorypoint! I hope you had a great vacation in Scotland!
Thanks for the test. It did work, but only with words; I can't see any numbers. There are two possible causes: the Chinese asset that I'm using (though the digits should be the same in Chinese), or the reduced image that I was using to improve performance.

I built a new one with the full image https://github.com/cagnulein/QZCompanionNordictrackTreadmill/actions/runs/10678955998/artifacts/1884857824

Let's collect another log from this one and see if it's better. I saw in your previous log that Paddle took about 1 second for the small image: not bad, but not perfect either.

@victorypoint
Collaborator

@cagnulein, I just tested the new apk and have attached the logcat. Again I made a number of speed and incline changes.
stream-logcat.txt

@cagnulein
Owner Author

Thanks @victorypoint. There is an issue I have to address that I didn't see on my Android emulator: it seems it can't process numbers at all. That's very strange. Maybe it's because it's a different ARM type? Maybe a 32-bit one?
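
One way to check the 32-bit question is to read the device's primary ABI over adb. The sketch below is only an illustration and assumes `adb` is on the PATH with the tablet connected; `abi_bits` and `device_abi` are hypothetical helper names, not part of this branch.

```python
import subprocess

# Standard Android ABI names mapped to their word size in bits.
ABI_BITS = {
    "armeabi-v7a": 32,
    "arm64-v8a": 64,
    "x86": 32,
    "x86_64": 64,
}


def abi_bits(abi: str) -> int:
    """Return 32 or 64 for a known Android ABI string, else raise ValueError."""
    try:
        return ABI_BITS[abi.strip()]
    except KeyError:
        raise ValueError(f"unknown ABI: {abi!r}") from None


def device_abi() -> str:
    """Ask the connected device for its primary ABI (assumes adb is installed)."""
    result = subprocess.run(
        ["adb", "shell", "getprop", "ro.product.cpu.abi"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Calling `abi_bits(device_abi())` on the treadmill tablet would then tell whether the native Paddle libraries need a 32-bit build.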

@victorypoint
Collaborator

victorypoint commented Sep 6, 2024

Ok, no worries. I've been using your last APK that uses Google OCR, as it's working fine for me with iFit Beta regardless of the OCR delay. The only problem is that I have to do some fiddling around to get the APK to show the "start ocr recording" prompt. Could this be modified so that OCR recording is on at startup?

Also, is it worth doing a build that uses AI.Server for OCR much like what you've done with QZ?

@cagnulein
Owner Author

cagnulein commented Sep 6, 2024 via email

@victorypoint
Collaborator

victorypoint commented Sep 6, 2024

Right, the prompt doesn't appear on TM boot. I have to stop and start iFit and Companion a few times to get the prompt to appear.

For ai.server, we're sending a request to an OCR server like: `ocr = requests.post("http://localhost:32168/v1/image/ocr", files={"image":image_data}).json()`. The 3 Python scripts in QZ handled the cropping and image processing of Zwift screens. So something like that in Companion. UDP may work?
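
The request described above could be wrapped roughly as follows. This is a minimal sketch, not the QZ scripts themselves: the URL and the `files={"image": ...}` field come from the comment, while the function name `ocr_image`, the `post` parameter, and the timeout value are assumptions added for illustration. The shape of the returned JSON is not specified in this thread, so the function returns it unparsed.

```python
from typing import Callable, Optional

# Endpoint quoted in the comment above; swap localhost for the server's IP.
OCR_URL = "http://localhost:32168/v1/image/ocr"


def ocr_image(
    image_data: bytes,
    url: str = OCR_URL,
    post: Optional[Callable] = None,
    timeout: float = 5.0,
) -> dict:
    """POST raw image bytes to the OCR server and return its decoded JSON reply."""
    if post is None:
        # requests is a third-party package; imported lazily so the module
        # loads without it (for example when a stub is injected in tests).
        import requests

        post = requests.post
    response = post(url, files={"image": image_data}, timeout=timeout)
    response.raise_for_status()  # surface HTTP errors instead of parsing them
    return response.json()
```

A caller would read a screenshot with `open(path, "rb").read()` and pass the bytes in; the `url` argument lets the same function target a server on another machine.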

@cagnulein
Owner Author

cagnulein commented Sep 6, 2024 via email

@victorypoint
Collaborator

Let's use 192.168.1.4 for ai.server IP.

@cagnulein
Owner Author

This is for my own reference:

image

@cagnulein
Owner Author

> Right, the prompt doesn't appear on TM boot. I have to stop and start iFit and Companion a few times to get the prompt to appear.
>
> For ai.server, we're sending a request to an OCR server like: `ocr = requests.post("http://localhost:32168/v1/image/ocr", files={"image":image_data}).json()`. The 3 Python scripts in QZ handled the cropping and image processing of Zwift screens. So something like that in Companion. UDP may work?

I will use this as a prompt for the Cursor AI editor; I would like to see how it works :)

@cagnulein
Owner Author

@victorypoint I found the root issue with the PaddleOCR build: it can't recognize the small fonts of the incline and speed metrics at the bottom. Is there any way to make them bigger in the UI settings? Otherwise, can you try Paddle on a PC, if you can get it?

In the meantime I'm adding the ai.server support.
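
If the UI cannot be resized, one possible workaround (not something this thread confirms was tried) is to crop the bottom strip that holds the metrics and enlarge it before OCR. The sketch below only computes the geometry; the 25% strip height and 2x factor are assumed placeholder values, and both function names are hypothetical.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (left, upper, right, lower) in pixels


def bottom_strip_box(width: int, height: int, fraction: float = 0.25) -> Box:
    """Return the crop box covering the bottom `fraction` of the screen."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    upper = height - round(height * fraction)
    return (0, upper, width, height)


def upscaled_size(box: Box, factor: int = 2) -> Tuple[int, int]:
    """Return the (width, height) of the cropped strip after enlarging it."""
    left, upper, right, lower = box
    return ((right - left) * factor, (lower - upper) * factor)
```

With Pillow, for instance, the box could be passed to `Image.crop` and the size to `Image.resize`, trading some extra processing time for larger glyphs.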

@victorypoint
Collaborator

> @victorypoint I found the root issue with the PaddleOCR build: it can't recognize the small fonts of the incline and speed metrics at the bottom. Is there any way to make them bigger in the UI settings? Otherwise, can you try Paddle on a PC, if you can get it?
>
> In the meantime I'm adding the ai.server support.

@cagnulein, no sorry, there appears to be no way currently to resize UI elements in iFit beta. I'll keep investigating.

@cagnulein
Owner Author

cagnulein commented Sep 13, 2024 via email

@victorypoint
Collaborator

@cagnulein, yes, absolutely, we can try PaddleOCR on Windows with the iFit images. How would I get the images you're generating in Companion?

@cagnulein
Owner Author

I guess you can use this screenshot:

image

It's what I'm using to test it.

@victorypoint
Collaborator

@cagnulein, I ran both PaddleOCR Windows and ai.server against the above screenshot. Both recognize all the screen elements with very little error. Note that the raw OCR output in the attached logs is formatted a bit differently between the two.

ai.server-log.txt
paddleocr-log.txt
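
Since the two engines format their raw output differently, a common step is to reduce both to plain text strings and then pull out the numeric values. The sketch below assumes that reduction has already happened; `extract_numbers` is a hypothetical helper and the sample strings are invented for illustration, not taken from the attached logs.

```python
import re
from typing import Iterable, List

# Matches integers and simple decimals such as "3" or "3.5".
NUMBER_RE = re.compile(r"\d+(?:\.\d+)?")


def extract_numbers(texts: Iterable[str]) -> List[float]:
    """Return every number found in the recognized strings, in reading order."""
    values: List[float] = []
    for text in texts:
        values.extend(float(match) for match in NUMBER_RE.findall(text))
    return values
```

An empty result on a screenshot that clearly shows speed and incline is a quick sign that the engine is returning words only, as happened with the first Android builds.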

@cagnulein
Owner Author

cagnulein commented Sep 13, 2024 via email

@cagnulein
Owner Author

I also tried the paddleocr4android app as-is with our screenshot, and it doesn't work either. I'm going to open a new issue on their repository.

@cagnulein
Owner Author

Ticket here: equationl/paddleocr4android#42

@cagnulein
Owner Author

Hi @victorypoint, are you aware of this: equationl/paddleocr4android#42 (comment)?
I'm going to try porting this to FastDeploy then.

@victorypoint
Collaborator

@cagnulein, looks like good news on that ticket. Very helpful fellow. Since I'm using the Windows python deploy, it uses the PP-OCRv4 English model by default. I haven't tried the other models but read they are very good as well. https://github.com/PaddlePaddle/PaddleOCR/blob/main/doc/doc_en/quickstart_en.md

@cagnulein
Owner Author

Yes, with the model proposed by the maintainer I was already able to see speed and incline in the demo project. I will port it to this branch.
The only thing I noticed is that it's very slow on the emulator, about 3 seconds per image, so I guess it can only be slower on your tablet.
But let's see once I complete the port.
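
To compare per-image latency across engines or image sizes (the roughly 1 second and 3 seconds figures mentioned in this thread), a small timing wrapper is enough. This is a generic sketch; `timed` is a hypothetical helper, and whatever OCR call is being measured would be passed in as `fn`.

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def timed(fn: Callable[..., T], *args, **kwargs) -> Tuple[T, float]:
    """Call fn(*args, **kwargs) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start
```

Averaging the elapsed value over several screenshots gives a steadier number than a single run, since the first call often includes model loading.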

@cagnulein
Owner Author

@stale

stale Bot commented Jun 5, 2025

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale Bot added the wontfix label Jun 5, 2025
@stale stale Bot closed this Jun 12, 2025
@cagnulein cagnulein reopened this Jun 13, 2025
@stale stale Bot closed this Jun 20, 2025
@cagnulein cagnulein reopened this Jun 20, 2025
@stale stale Bot removed the wontfix label Jun 20, 2025
@stale

stale Bot commented Jul 5, 2025

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale Bot added the wontfix label Jul 5, 2025
@stale stale Bot closed this Jul 12, 2025
@cagnulein cagnulein reopened this Jul 12, 2025
@stale stale Bot removed the wontfix label Jul 12, 2025
Development

Successfully merging this pull request may close these issues.

SYSTEM_ALERT_WINDOW permission

2 participants