10/26/2025
A Project Made By: Jordan Serna, Elina Kocarslan, Jonathan Deng, Mansa Patel, and Raul Rojas (Engineers)
Built at Gator Hack IV
What is the problem you are trying to solve? Who does it affect?
Art Beyond Sight was born from a simple belief: art is for everyone. After reflecting on what art truly means, we realized that art isn’t just something to look at; it’s something to feel, hear, and experience. Our project redefines how people engage with art by transforming images into immersive auditory experiences.
While we built this app with accessibility in mind, it isn’t only for people with visual impairments. It’s for anyone who wants to experience art in a new, multi-sensory way: to hear the emotion behind every brushstroke or feel the atmosphere of a monument through sound. When we met Jose, a visually impaired student, his feedback helped us refine the interface, narration flow, and audio design to make the experience truly inclusive. His perspective reminded us that accessibility doesn’t just help a specific group; it enhances the experience for everyone.
What is your idea? How does it fix the problem?
We created Art Beyond Sight, a mobile app (available on Android and iOS) that translates images such as paintings, landscapes, and monuments into melodies, letting users immerse themselves in art. For accessibility, the app includes full text-to-speech narration and is VoiceOver friendly. By turning visual art into auditory experiences, we’re giving everyone the chance to connect with creativity in a way that’s deeply personal and inclusive.
How do all the pieces fit together? Does your frontend make requests to your backend? Where does your database fit in?
When a user takes or uploads an image, our AI analyzes it to generate a vivid visual and historical description, create a melody that reflects the image’s mood, emotion, and color palette, and narrate the description aloud with a human-sounding text-to-speech voice.
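As a rough illustration of the middle step, the sketch below shows one way a description’s mood and color words could be distilled into a melody prompt. The keyword table and function name are illustrative placeholders, not our exact production logic.

```python
# Illustrative sketch (not our exact production code) of how an image
# description could be distilled into a prompt for music generation.

MOOD_HINTS = {
    "serene": "slow tempo, soft piano, warm pads",
    "stormy": "driving tempo, low strings, rolling percussion",
    "golden": "major key, gentle acoustic guitar, airy texture",
    "ancient": "modal harmony, sparse bells, long reverb tails",
}

def compose_melody_prompt(description: str) -> str:
    """Pick musical directions from mood/color words found in the description."""
    hints = [cue for word, cue in MOOD_HINTS.items() if word in description.lower()]
    if not hints:
        hints = ["moderate tempo, reflective solo piano"]
    return "Instrumental piece: " + "; ".join(hints)

if __name__ == "__main__":
    demo = ("A serene golden sunset washes over the ancient stone arches "
            "of the monument, casting long, quiet shadows.")
    print(compose_melody_prompt(demo))
```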
The app is powered by a modern, scalable tech stack. The frontend, built with React Native and Expo, allows users to capture or select images and experience the generated sound and narration. The backend, developed using FastAPI and MongoDB, handles AI processing, Suno audio generation, and database storage. Through AI integration, the system identifies and describes each artwork or object, while the Suno API transforms these descriptions into harmonized audio that mirrors the emotion of the image. All data, including images, captions, and audio references, is stored in MongoDB Atlas for easy access and reusability. Together, these components connect seamlessly to deliver an inclusive, multi-sensory experience.
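To make the request flow concrete, here is a minimal FastAPI sketch of the upload path under a few assumptions: a MongoDB Atlas URI in a MONGO_URI environment variable, an `artworks` collection, and a placeholder `generate_experience` step standing in for the AI description, Suno audio, and text-to-speech calls. The endpoint, field names, and schema are illustrative rather than our exact code.

```python
# Minimal sketch of the upload flow; collection and field names are
# illustrative assumptions, not the exact production schema.
import os
from bson import ObjectId
from fastapi import BackgroundTasks, FastAPI, File, UploadFile
from motor.motor_asyncio import AsyncIOMotorClient

app = FastAPI()
db = AsyncIOMotorClient(os.environ["MONGO_URI"])["art_beyond_sight"]

async def generate_experience(artwork_id: ObjectId) -> None:
    # Placeholder for the AI steps: describe the image, request Suno audio,
    # run text-to-speech, then save the caption and audio reference to MongoDB.
    await db.artworks.update_one(
        {"_id": artwork_id},
        {"$set": {"status": "ready", "caption": "...", "audio_url": "..."}},
    )

@app.post("/artworks")
async def upload_artwork(background: BackgroundTasks, image: UploadFile = File(...)):
    doc = {"filename": image.filename, "image": await image.read(), "status": "processing"}
    result = await db.artworks.insert_one(doc)
    # Kick off description, melody, and narration generation without blocking the app.
    background.add_task(generate_experience, result.inserted_id)
    return {"artwork_id": str(result.inserted_id), "status": "processing"}
```

The mobile client then checks back on the artwork’s status rather than waiting on a callback, which is the polling approach described below.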
What did you struggle with? How did you overcome it?
We were introduced to a lot of new technologies during this project, which pushed us to learn quickly and adapt. From exploring unfamiliar APIs and libraries to developing in React Native for the first time, there was a lot to figure out. One of our biggest technical challenges was establishing reliable callbacks to our server URL, something we ultimately solved using a polling approach. We also experimented with a ResNet-based machine learning model to identify images in real time through the camera. However, when the accuracy didn’t meet our expectations, we pivoted to using the Navigator UF API, which provided more consistent results for our use case.
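For context on that polling workaround, the snippet below sketches the pattern in Python with httpx for brevity (the real client is the React Native app); the endpoint path and response fields are assumptions, not our exact API.

```python
# Illustrative polling loop: ask the server for the artwork's status every few
# seconds until the generated audio is ready, instead of relying on a callback.
import time
import httpx

def wait_for_audio(base_url: str, artwork_id: str,
                   interval_s: float = 2.0, timeout_s: float = 120.0) -> str:
    """Poll the status endpoint until the audio is ready, then return its URL."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        resp = httpx.get(f"{base_url}/artworks/{artwork_id}/status", timeout=10.0)
        resp.raise_for_status()
        body = resp.json()
        if body.get("status") == "ready":
            return body["audio_url"]
        time.sleep(interval_s)  # wait briefly before asking again
    raise TimeoutError("Audio generation did not finish in time")
```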
As our vision expanded to include accessibility for people with sensory disabilities, we faced new challenges in understanding how visually impaired users navigate digital interfaces. That’s when we met Jose, a visually impaired student, whose honest feedback helped us refine the app’s accessibility, narration flow, and user experience.
Throughout all the ups and downs, we stayed grounded by focusing on our main goal: creating something that truly makes art accessible to everyone. By communicating openly, dividing tasks effectively, and supporting each other, we were able to move past our challenges.
What did you learn? What did you accomplish?
We are incredibly grateful for this experience and feel that we’ve grown both personally and technically throughout the process. On the technical side, we gained valuable hands-on experience working with REST APIs, accessibility frameworks, text-to-speech tools, and AI-based image analysis. We also explored machine learning models such as ResNet for object detection and image recognition, which deepened our understanding of how AI can interpret the visual world. Beyond the technical growth, this project taught us the importance of designing with empathy, of building technology that doesn’t just function, but truly connects with people. Seeing our app create moments of joy and curiosity for users like Jose reminded us why we build: to make technology inclusive, meaningful, and human-centered.
What are the next steps for your project? How can you improve it?
Our next goal is to expand the app’s capabilities by introducing a free-roam mode that allows users to take a picture of anything and instantly hear its story. Moving forward, we plan to enhance our image recognition models for broader real-world use, improve our audio generation system to dynamically adapt tone and tempo, and continue collaborating with visually impaired users for ongoing feedback. Ultimately, our vision is to deploy Art Beyond Sight for real-world use in museums, schools, and public spaces, creating an inclusive platform where everyone can experience art, wherever they are.