This project, developed for the course INT2041 - Human & Computer Interaction, aims to help visually impaired people (VIPs) interact with their environment and take part in social interactions more easily.
- Introduction
- Features
  - Exploring Mode
    - Object Detection
    - Explore Surroundings
  - Socializing Mode
    - Mood Tracking
    - Face Recognition
- Built with
- Contributors
Our team developed this project to help VIPs interact with their surroundings more proactively. The app consists of two primary modes, each offering a different set of features.
The first mode, Exploring Mode, lets users actively identify objects in their surroundings and gather detailed information about them, such as color and size, through PaliGemma. It includes two features: Object Detection and Explore Surroundings.
The second mode, Socializing Mode, focuses on enhancing social interactions and includes two features: Mood Tracking and Face Recognition.
Users switch between the two modes by swiping vertically on the screen, and switch between the features of the current mode by swiping horizontally.
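Below is a minimal Jetpack Compose sketch of how this gesture scheme could be wired up. The `Mode` enum, the two-feature assumption, and the 150-pixel swipe threshold are illustrative placeholders rather than the project's actual code:

```kotlin
import androidx.compose.foundation.gestures.detectDragGestures
import androidx.compose.foundation.layout.Box
import androidx.compose.foundation.layout.fillMaxSize
import androidx.compose.runtime.*
import androidx.compose.ui.Modifier
import androidx.compose.ui.input.pointer.pointerInput
import kotlin.math.abs

// Hypothetical mode names; each mode is assumed to hold two features.
enum class Mode { EXPLORING, SOCIALIZING }

@Composable
fun ModeNavigator(content: @Composable (Mode, Int) -> Unit) {
    var mode by remember { mutableStateOf(Mode.EXPLORING) }
    var feature by remember { mutableStateOf(0) } // index of the feature within the mode
    var dragX by remember { mutableStateOf(0f) }
    var dragY by remember { mutableStateOf(0f) }

    Box(
        Modifier
            .fillMaxSize()
            .pointerInput(Unit) {
                detectDragGestures(
                    onDragStart = { dragX = 0f; dragY = 0f },
                    onDragEnd = {
                        if (abs(dragY) > abs(dragX) && abs(dragY) > 150f) {
                            // Vertical swipe: toggle between the two modes.
                            mode = if (mode == Mode.EXPLORING) Mode.SOCIALIZING else Mode.EXPLORING
                            feature = 0
                        } else if (abs(dragX) > 150f) {
                            // Horizontal swipe: switch feature inside the current mode.
                            feature = (feature + 1) % 2
                        }
                    }
                ) { change, amount ->
                    change.consume()
                    dragX += amount.x
                    dragY += amount.y
                }
            }
    ) {
        content(mode, feature)
    }
}
```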
This is the default feature when the app starts. It lets users identify objects in their surroundings in real time by pointing the camera at them.
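A feature like this is typically backed by a MobileNet-based TensorFlow Lite model (MobileNets is listed under Built with). The sketch below uses the TensorFlow Lite Task Library's `ObjectDetector`; the asset file name and the thresholds are assumptions, not the project's actual configuration:

```kotlin
import android.content.Context
import android.graphics.Bitmap
import org.tensorflow.lite.support.image.TensorImage
import org.tensorflow.lite.task.vision.detector.ObjectDetector

// Hypothetical asset name for the bundled MobileNet detection model.
private const val MODEL_PATH = "mobilenet_ssd.tflite"

class ObjectDetectionHelper(context: Context) {
    private val detector: ObjectDetector = ObjectDetector.createFromFileAndOptions(
        context,
        MODEL_PATH,
        ObjectDetector.ObjectDetectorOptions.builder()
            .setMaxResults(3)        // read at most three objects aloud per frame
            .setScoreThreshold(0.5f) // skip low-confidence detections
            .build()
    )

    /** Runs the detector on one camera frame and returns the detected labels. */
    fun detect(frame: Bitmap): List<String> =
        detector.detect(TensorImage.fromBitmap(frame))
            .flatMap { detection -> detection.categories }
            .map { category -> category.label }
}
```

The labels returned here would then be handed to the text-to-speech engine so the user hears what the camera sees.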
This feature lets users explore their surroundings by taking a picture of the environment. The app then analyzes the image and returns a detailed description based on the user's request.
1. Switch to the Explore Surroundings feature.
2. Double-tap the screen to take a picture of the surroundings.
3. Double-tap again to turn on speech recognition and speak a request.
4. The system processes the image together with the user's request.
5. The system returns the result and reads it aloud.
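One plausible way to wire steps 4 and 5 together is sketched below, assuming a hosted PaliGemma endpoint; the URL and JSON schema are placeholders, and Android's built-in `TextToSpeech` reads the answer aloud. The network call must run off the main thread (e.g. inside a coroutine):

```kotlin
import android.content.Context
import android.graphics.Bitmap
import android.speech.tts.TextToSpeech
import android.util.Base64
import okhttp3.MediaType.Companion.toMediaType
import okhttp3.OkHttpClient
import okhttp3.Request
import okhttp3.RequestBody.Companion.toRequestBody
import org.json.JSONObject
import java.io.ByteArrayOutputStream

// Placeholder endpoint: the real PaliGemma deployment and payload shape may differ.
private const val PALIGEMMA_URL = "https://example.com/paligemma/describe"

class ExploreSurroundings(context: Context) {
    private val client = OkHttpClient()
    private val tts = TextToSpeech(context) { /* init status ignored in this sketch */ }

    /** Sends the photo plus the user's spoken request, then reads the answer aloud. */
    fun describe(photo: Bitmap, spokenRequest: String) {
        val jpeg = ByteArrayOutputStream()
            .also { photo.compress(Bitmap.CompressFormat.JPEG, 85, it) }
            .toByteArray()

        val payload = JSONObject()
            .put("image", Base64.encodeToString(jpeg, Base64.NO_WRAP))
            .put("prompt", spokenRequest)
            .toString()
            .toRequestBody("application/json".toMediaType())

        val request = Request.Builder().url(PALIGEMMA_URL).post(payload).build()
        client.newCall(request).execute().use { response ->
            // "description" is an assumed field name in the response JSON.
            val answer = JSONObject(response.body!!.string()).optString("description")
            tts.speak(answer, TextToSpeech.QUEUE_FLUSH, null, "explore-result")
        }
    }
}
```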
This feature recognizes the emotions of the person in front of the camera in real time and plays a sound corresponding to each emotion.
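Since the project lists Google's face detection API for emotion tracking, a rough ML Kit sketch could look like the following. ML Kit itself only exposes smiling and eyes-open probabilities, so the three mood classes and the thresholds here are assumptions:

```kotlin
import android.graphics.Bitmap
import com.google.mlkit.vision.common.InputImage
import com.google.mlkit.vision.face.FaceDetection
import com.google.mlkit.vision.face.FaceDetectorOptions

// Hypothetical coarse moods derived from the smiling probability.
enum class Mood { HAPPY, NEUTRAL, SAD }

private val detector = FaceDetection.getClient(
    FaceDetectorOptions.Builder()
        .setClassificationMode(FaceDetectorOptions.CLASSIFICATION_MODE_ALL)
        .setPerformanceMode(FaceDetectorOptions.PERFORMANCE_MODE_FAST)
        .build()
)

/** Classifies the first detected face and forwards the mood to [playSoundFor]. */
fun trackMood(frame: Bitmap, playSoundFor: (Mood) -> Unit) {
    detector.process(InputImage.fromBitmap(frame, 0))
        .addOnSuccessListener { faces ->
            val smile = faces.firstOrNull()?.smilingProbability
                ?: return@addOnSuccessListener
            val mood = when {
                smile > 0.7f -> Mood.HAPPY
                smile < 0.2f -> Mood.SAD
                else -> Mood.NEUTRAL
            }
            playSoundFor(mood) // e.g. play a distinct earcon per mood
        }
}
```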
This feature recognizes, in real time, the faces of people around the user. The app then announces the name of the recognized person, or reports that there is no match if the face is not in the database.
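Recognition itself boils down to comparing a MobileFaceNet embedding of the detected face against the stored embeddings. Below is a minimal matching sketch, assuming the embeddings have already been computed by the .tflite model and using cosine similarity with an assumed threshold:

```kotlin
import kotlin.math.sqrt

// A MobileFaceNet embedding, e.g. a 192-dimensional float vector.
typealias Embedding = FloatArray

fun cosineSimilarity(a: Embedding, b: Embedding): Float {
    var dot = 0f
    var normA = 0f
    var normB = 0f
    for (i in a.indices) {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    return dot / (sqrt(normA) * sqrt(normB))
}

/** Returns the best-matching name, or null when no stored face is close enough. */
fun identify(
    query: Embedding,
    database: Map<String, Embedding>, // person name -> stored embedding
    threshold: Float = 0.6f           // assumed cutoff; the app's tuned value may differ
): String? =
    database
        .mapValues { (_, stored) -> cosineSimilarity(query, stored) }
        .maxByOrNull { it.value }
        ?.takeIf { it.value >= threshold }
        ?.key
```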
- Android Studio - The IDE used for developing the app
- Jetpack Compose - The UI toolkit used for building the app
- Google Text-to-Speech - The API used for text-to-speech
- Google Speech-to-Text - The API used for speech-to-text
- MobileNets - The model used for object detection
- PaliGemma - The vision-language model used for exploring surroundings by generating detailed descriptions
- Google ML Kit Face Detection - The API used for face detection and emotion tracking
- MobileFaceNet - The model used for face recognition on faces detected by the Google ML Kit Face Detection API
- Nguyễn Hữu Thế - 22028155
- Tăng Vĩnh Hà - 22028129
- Vũ Minh Thư - 22028116
- Chu Quang Cần - 22028093
- Lê Xuân Hùng - 22028172