Replies: 3 comments
-
Hey @Rat228420, I worked on a similar project. For your goals, I recommend using object detection (like YOLOv8) for elements such as traffic lights, and segmentation (like DeepLabV3) for detecting pedestrian paths or stairs. You can also combine both with a multi-task model. Useful datasets:
To combine camera and GPS, look into basic sensor fusion, or simply use GPS for route guidance and the visual data to validate the surroundings. Start with basic features like detecting traffic lights, then expand. Tools like Detectron2 and Label Studio are also very helpful.
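For the detection part, here is a minimal sketch with the ultralytics package, assuming the pretrained COCO weights (which already include a "traffic light" class) and a placeholder image path:
from ultralytics import YOLO

model = YOLO("yolov8n.pt")            # pretrained COCO weights as a starting point
results = model("street_frame.jpg")   # run inference on one camera frame (placeholder path)

for box in results[0].boxes:
    label = model.names[int(box.cls)]             # class index -> class name
    conf = float(box.conf)
    if label == "traffic light" and conf > 0.5:
        print(f"traffic light at {box.xyxy.tolist()} (confidence {conf:.2f})")
You can later fine-tune the same model on your own classes (crossings, stairs, etc.). Good luck.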
-
Hi @fawern, right now I'm having trouble understanding the model's output. It has the shape (1, 84, 8400), and each bounding box is predicting something different. Also, the raw logits at indices 4 to 83 of each prediction are strangely high. Have you worked on a similar project, or do you know how to fix this? Any advice would be greatly appreciated! Thanks again!
-
Hello @Rat228420 again, I've worked with YOLOv8 before.
In the (1, 84, 8400) output, each of the 8400 columns is one candidate box: indices 0–3 are the box coordinates (cx, cy, w, h, in pixels) and indices 4–83 are the scores for the 80 COCO classes; unlike YOLOv5, YOLOv8 has no separate objectness score. The box coordinates can be used as they are. The class scores may be raw logits or already probabilities depending on how you run or export the model, so check the value range: if they are not in [0, 1], pass them through a sigmoid. Here is a basic Python script:
import torch

output = model_output.squeeze(0)                   # (84, 8400)
boxes = output[:4, :].transpose(0, 1)              # (8400, 4) -> cx, cy, w, h in pixels
class_scores = output[4:, :].transpose(0, 1)       # (8400, 80) class scores
class_scores = torch.sigmoid(class_scores)         # only needed if these are raw logits
confidences, class_ids = class_scores.max(dim=1)   # best class and its score per candidate
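From there, you would typically keep only confident predictions and run non-maximum suppression. Here is a quick sketch building on the variables above, assuming torchvision is available (class-agnostic NMS to keep it simple; the thresholds are just example values):
from torchvision.ops import nms

conf_thresh, iou_thresh = 0.25, 0.45
keep_mask = confidences > conf_thresh                  # drop low-confidence candidates
boxes_kept = boxes[keep_mask]
scores_kept = confidences[keep_mask]
classes_kept = class_ids[keep_mask]

xyxy = torch.empty_like(boxes_kept)                    # convert cx, cy, w, h -> x1, y1, x2, y2
xyxy[:, 0] = boxes_kept[:, 0] - boxes_kept[:, 2] / 2
xyxy[:, 1] = boxes_kept[:, 1] - boxes_kept[:, 3] / 2
xyxy[:, 2] = boxes_kept[:, 0] + boxes_kept[:, 2] / 2
xyxy[:, 3] = boxes_kept[:, 1] + boxes_kept[:, 3] / 2

keep = nms(xyxy, scores_kept, iou_thresh)              # indices that survive NMS
final_boxes, final_scores, final_classes = xyxy[keep], scores_kept[keep], classes_kept[keep]
If you need any more help, feel free to ask.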
-
Hey everyone!
I'm currently working on a project for my bachelor's degree, and I'd love to get some feedback and guidance.
The idea is to develop an iOS navigation app for blind users, which uses the phone’s camera and machine learning to analyze the environment in real time, and provide audio guidance to help users navigate safely.
How it works (in theory):
The app will use the phone camera to detect important objects and features in the environment — things like:
Pedestrian paths
Traffic lights
Dangerous areas (e.g., stairs, escalators)
Braille signs/text (if possible)
It will also combine this visual input with the user's GPS location to help guide them toward their destination.
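To make that combination concrete, here is a toy Python sketch of the decision logic I have in mind, with made-up labels and thresholds (none of this is implemented yet):
HAZARDS = {"stairs", "escalator"}

def choose_announcement(detections, route_instruction):
    """detections: list of (label, confidence) pairs; returns the next audio message."""
    for label, conf in detections:
        if label in HAZARDS and conf > 0.5:
            return f"Caution: {label} ahead"    # safety warnings take priority
    return route_instruction                    # otherwise keep giving route directions

print(choose_announcement([("stairs", 0.8)], "Turn left in 20 meters"))
# -> "Caution: stairs ahead"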
I’m still defining what the ML model should be able to recognize and how to structure everything. Right now, I’m trying to figure out:
Should I use object detection, image segmentation, or a combination of both?
How can I find or build datasets for training?
Is this realistic as a single project, or should I drop some features for now?
If anyone has experience with similar projects, or knows good resources, tutorials, or datasets — I’d be super grateful for any tips or advice!
Thanks a lot in advance!