10/26/2025
A Project Made By
Claire Kelleter (Engineer)
Amanda Tantalean (Engineer)
Arossa Adhikary (Engineer)
Haneen Mustafa (Engineer)
Victoria Arosemena (Other)
Submitted for
Built At
Gator Hack IV
What is the problem you are trying to solve? Who does it affect?
Finding the perfect song to match a photo, outfit, or moment can be surprisingly time-consuming... Whether it’s planning an outfit that fits a certain vibe, curating a playlist for a party, or choosing background music for a social media post, people often spend far too much time scrolling and second-guessing what feels “right.” This problem affects two main groups of people: social media users who want their photos or videos to have the perfect soundtrack that reflects their aesthetic, and individuals who want to align their daily experiences, like dressing up, studying, or hosting events, with music that matches their mood. In both cases, the process of finding a song that truly fits can lead to decision fatigue and creative frustration. Our goal is to simplify this process by using AI to automatically identify the vibe of one or more images and suggest songs that match it. By doing so, we make it easier for users to express themselves, save time, and discover music that complements their visual and emotional world. Because we all know music moves the world.
What is your idea? How does it fix the problem?
Our idea is Tunetrest... an AI-powered platform that uses computer vision to match the vibe of your photos or outfits to the perfect songs. Users can upload images (PNG) or paste image URLs, and our AI analyzes the aesthetic, mood, and emotional tone to instantly recommend a song that fits. When Spotify makes one available, users can play a 30-second preview of the song, and they can add it to their playlist by pressing "Open in Spotify", making the whole decision-making process easier. Instead of spending minutes (or hours) scrolling through tracks, Tunetrest helps people discover music that visually and emotionally resonates with their content. It reduces decision fatigue, enhances creativity, and makes everyday moments, such as outfit planning and party playlists, feel more personal and inspiring.
How do all the pieces fit together? Does your frontend make requests to your backend? Where does your database fit in?
Our app connects the frontend and backend via HTTP requests. When a user uploads an image or pastes an image URL into the frontend, that data is sent as a POST request to our backend endpoint (/song) using Axios. Before sending, the frontend sanitizes all the inputs... making sure URLs are valid and uploaded images are converted into safe Base64-encoded strings.
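Below is a minimal sketch of what that frontend request flow might look like. It assumes the backend accepts a JSON body with an images array containing Base64 data URLs and public image URLs; the field names, helper names, and local port are illustrative, not the exact implementation.

```js
// Minimal sketch of the frontend request flow (field names and port are illustrative).
import axios from "axios";

// Convert an uploaded File into a Base64 data URL the backend can forward.
function fileToBase64(file) {
  return new Promise((resolve, reject) => {
    const reader = new FileReader();
    reader.onload = () => resolve(reader.result);
    reader.onerror = reject;
    reader.readAsDataURL(file);
  });
}

// Basic sanitization: keep only strings that parse as http(s) URLs.
function isValidImageUrl(value) {
  try {
    const url = new URL(value);
    return url.protocol === "http:" || url.protocol === "https:";
  } catch {
    return false;
  }
}

async function requestSong(files, pastedUrls) {
  const encodedFiles = await Promise.all(files.map(fileToBase64));
  const safeUrls = pastedUrls.filter(isValidImageUrl);

  // POST everything to the backend's /song endpoint.
  const { data } = await axios.post("http://localhost:3001/song", {
    images: [...encodedFiles, ...safeUrls],
  });
  return data; // vibe + matched song details, as described below
}
```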
Once the backend receives the request, it processes it through Express and routes it to the /song endpoint. This endpoint acts as the bridge between our app and OpenAI’s Vision model (GPT-4o-mini). The backend sends the images to OpenAI along with a custom prompt that tells the model to analyze the photo’s aesthetic, mood, and emotional tone and then return a structured JSON response describing the vibe and suggesting a matching song.
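A sketch of what that backend step could look like using Express and the official OpenAI Node client is shown here. The prompt wording, JSON field names, and port are assumptions based on the description above, not the exact code.

```js
// Sketch of the /song endpoint (Express + OpenAI Node SDK); details are illustrative.
import express from "express";
import cors from "cors";
import OpenAI from "openai";

const app = express();
app.use(cors());
app.use(express.json({ limit: "10mb" })); // Base64-encoded images can be large

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

app.post("/song", async (req, res) => {
  const { images = [] } = req.body;

  // Ask GPT-4o-mini to describe the vibe and suggest a matching song as JSON.
  const completion = await openai.chat.completions.create({
    model: "gpt-4o-mini",
    response_format: { type: "json_object" },
    messages: [
      {
        role: "user",
        content: [
          {
            type: "text",
            text:
              "Analyze these images' aesthetic, mood, and emotional tone. " +
              'Reply as JSON: {"aesthetic": "...", "mood": "...", "song": {"title": "...", "artist": "..."}}',
          },
          ...images.map((url) => ({ type: "image_url", image_url: { url } })),
        ],
      },
    ],
  });

  const vibe = JSON.parse(completion.choices[0].message.content);
  // Next step: look the suggested song up on Spotify (see the sketch below) and respond.
  res.json(vibe);
});

app.listen(3001); // illustrative port
```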
After OpenAI generates its output, the backend extracts the suggested song name and artist and calls the Spotify API to search for that track. If a match is found, Spotify returns key information like the song title, artist, album cover, preview URL, and Spotify link. The backend then packages all of this data into a clean JSON response and sends it back to the frontend.
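The Spotify lookup could look roughly like the sketch below, using the Client Credentials flow and the public /v1/search endpoint; the function and variable names are illustrative.

```js
// Sketch of the Spotify lookup step (Client Credentials flow + /v1/search).
// Assumes Node 18+ for the global fetch.
async function getSpotifyToken() {
  const credentials = Buffer.from(
    `${process.env.SPOTIFY_CLIENT_ID}:${process.env.SPOTIFY_CLIENT_SECRET}`
  ).toString("base64");

  const response = await fetch("https://accounts.spotify.com/api/token", {
    method: "POST",
    headers: {
      Authorization: `Basic ${credentials}`,
      "Content-Type": "application/x-www-form-urlencoded",
    },
    body: "grant_type=client_credentials",
  });
  return (await response.json()).access_token;
}

async function findTrack(title, artist) {
  const token = await getSpotifyToken();
  const query = encodeURIComponent(`track:${title} artist:${artist}`);
  const response = await fetch(
    `https://api.spotify.com/v1/search?q=${query}&type=track&limit=1`,
    { headers: { Authorization: `Bearer ${token}` } }
  );
  const track = (await response.json()).tracks.items[0];
  if (!track) return null;

  // Package just the fields the frontend needs.
  return {
    name: track.name,
    artist: track.artists[0].name,
    albumCover: track.album.images[0]?.url,
    previewUrl: track.preview_url, // may be null for many tracks
    spotifyUrl: track.external_urls.spotify,
  };
}
```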
On the frontend, the result is displayed beautifully: users can view their uploaded images, see the matched aesthetic and mood, preview the 30-second Spotify clip, and open the full song directly in Spotify.
Currently, no database is required because all processing happens in real time and no user data or images are stored. However, a database could easily be integrated in the future to save user sessions, song history, or playlists.
What did you struggle with? How did you overcome it?
Throughout this project, we faced several challenges that pushed us to problem-solve creatively and learn new technologies quickly. Our original idea was to create a Pinterest board-to-Spotify song generator, where users could pull aesthetic images directly from their Pinterest boards and get music recommendations to match. However, after spending hours studying the Pinterest API documentation, we realized that access required a verified business account, something we couldn't get in time; even after attempting registration under our student credentials, approval would have taken days. That was a tough pivot moment, but we adapted by allowing users to upload their own images or paste public image URLs instead, which still accomplished the core goal of vibe-based song matching.
Another major challenge was integrating the backend with the frontend. For many of us, this was the first time building a full-stack application from scratch, so understanding how to properly handle POST requests, CORS configuration, and server-client communication took a lot of trial and error. We ran into issues with merge conflicts, inconsistent endpoints, and servers shutting down unexpectedly. To overcome that, we used a very methodical debugging process: logging each step, testing endpoints line by line, and rebuilding our requests until data flowed smoothly between the two sides.
Working with APIs was another obstacle. The Spotify API had multiple limitations and deprecated features, meaning we couldn't access the full range of endpoints we initially hoped for. We learned to work within those constraints by leveraging Spotify's search and preview endpoints to still return meaningful song results and embed 30-second previews. Even then, we discovered that preview clips are only available for roughly 60-70% of songs on Spotify, so we pivoted again and added an "Open in Spotify" button that lets users listen to the full track directly. Adding to our Spotify API issues, we had originally planned to use Spotify's audio features (danceability, energy, valence, etc.) as the way to find a song within Spotify, but without a premium account we couldn't access those song attributes, which meant we couldn't build an OpenAI prompt around them. Again, we pivoted: we asked OpenAI to suggest the song itself, which we then search for through the Spotify API.
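On the frontend, that preview fallback might look something like this small React sketch; the component and prop names are illustrative.

```jsx
// Sketch of the preview fallback in React (component/prop names are illustrative).
function SongResult({ song }) {
  return (
    <div>
      <h3>
        {song.name} by {song.artist}
      </h3>
      {song.previewUrl ? (
        // Roughly 60-70% of tracks had a 30-second preview clip available.
        <audio controls src={song.previewUrl} />
      ) : (
        <p>No preview available for this track.</p>
      )}
      {/* Always offer the full track as a fallback. */}
      <a href={song.spotifyUrl} target="_blank" rel="noreferrer">
        Open in Spotify
      </a>
    </div>
  );
}
```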
We also struggled with team skill differences; some of us had experience with frontend React development, while others were learning backend Express or API integration for the first time. Instead of seeing that as a limitation, we used it as an opportunity to teach and learn from one another. Team members paired up to cross-train on areas they were less comfortable with, turning the project into both a technical build and a learning experience.
In the end, overcoming these challenges made our final product, an AI-powered vibe-to-song recommender, much more rewarding. We left with stronger technical skills, a better understanding of API ecosystems, and a deeper appreciation for how every piece of a full-stack project fits together.
What did you learn? What did you accomplish?
Throughout this project, we learned a ton! On the technical side, one of our biggest wins was building comfort with real API integrations. We went from not fully understanding how frontend and backend communicate to actually wiring them together using POST/GET requests, creating our own Express endpoints, and testing them with tools like Postman to make sure data was flowing the way we expected. We also learned how to read and interpret API documentation practically. By the end, we could look at the Spotify API docs, understand what each endpoint expected, and use that to pull song data, album art, and preview clips. That alone was a huge milestone for us because it was one of our main goals going into the project.
We also learned how to work with the OpenAI API and prompt it to behave the way we needed. Specifically, we figured out how to send images to an AI model and get back structured JSON with “aesthetic,” “mood,” and a recommended song. Along the way, we had to debug real-world issues (i.e. handling 500 errors, CORS problems, sanitizing user input, and figuring out why something worked in theory but not in code). That forced us to slow down, log everything, and reason about what was actually happening in our backend instead of just guessing.
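One way that defensive handling might look on the backend is sketched below, wrapping the OpenAI call from earlier; callOpenAI is a hypothetical helper, and the error shapes and status codes are assumptions.

```js
// Sketch of defensive handling around the model response (helper and error shape are illustrative).
app.post("/song", async (req, res) => {
  try {
    const completion = await callOpenAI(req.body.images); // as sketched earlier
    const raw = completion.choices[0].message.content;

    let vibe;
    try {
      vibe = JSON.parse(raw); // the model is asked for JSON, but verify anyway
    } catch {
      console.error("Model returned non-JSON:", raw);
      return res.status(502).json({ error: "Could not parse vibe analysis" });
    }

    res.json(vibe);
  } catch (err) {
    console.error("Song endpoint failed:", err);
    res.status(500).json({ error: "Something went wrong" });
  }
});
```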
Another big thing we took away was seeing how similar patterns show up across languages. Most of our stack was JavaScript/React/Node, but we noticed that concepts like mapping over arrays, transforming data, and returning objects felt a lot like what some of us were used to doing in Python. That made the codebase feel more approachable across different experience levels and helped people jump in and contribute even if they weren’t “full-stack engineers” on day one.
In terms of what we actually built, we’re proud of the final product. We have a working app that lets a user upload up to five images (or paste in image URLs), and we intentionally limited it to five to control cost on the vision model. The backend takes those images, sends them to OpenAI’s vision model, and gets back the vibe of the images with aesthetic, mood, and an AI-suggested song. We then take that suggested song (title + artist), look it up on Spotify, and return details like the track name, artist, album cover, and a 30-second preview clip when available. On the frontend, the user then sees: (1) the images they submitted, (2) the detected aesthetic/mood, and (3) a personalized song recommendation with a button to open the track directly in Spotify.
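The five-image cap could be enforced client-side with a small check like the following sketch before anything is sent to the backend; the function name and messages are illustrative.

```js
// Sketch of the client-side cap on images (limit chosen to control vision-model cost).
const MAX_IMAGES = 5;

function validateSelection(files, pastedUrls) {
  const total = files.length + pastedUrls.length;
  if (total === 0) {
    return { ok: false, message: "Add at least one image or image URL." };
  }
  if (total > MAX_IMAGES) {
    return { ok: false, message: `Please choose at most ${MAX_IMAGES} images.` };
  }
  return { ok: true };
}
```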
The bigger accomplishment here is not just that it works, but that we understand how it works. By the end, we had a much clearer mental model for how different components in a full-stack app talk to each other, and we also learned a lesson about product planning: before committing to a feature idea, read the API docs and understand the access limits. That lesson came directly from our early attempt to use the Pinterest API (which we couldn’t access the way we wanted) and our later discovery that some Spotify features were deprecated. Now we know how to evaluate feasibility up front instead of finding out the hard way mid-build.
What are the next steps for your project? How can you improve it?
Looking ahead, there are several exciting directions to take this project. One of the biggest improvements would be allowing users to save their uploaded images and revisit them later, essentially creating personalized “vibe boards” that function like Pinterest playlists for music discovery. Each board could capture a theme, mood, or event and automatically generate new song matches over time.
We also want to expand integration with Spotify’s API, giving users the ability to not only listen to a 30-second preview but also create and save full playlists directly from the app. This would make the experience feel more connected and social, bridging the gap between aesthetic inspiration and real-world music discovery.
On the frontend, we could enhance the user experience by adding more interactive and immersive features, like dynamic color themes or animations that change based on the vibe of the uploaded images... giving the user a more “aura-based” experience. Another idea would be to build a quiz or mood-based exploration tool where users can upload multiple images and receive a personality-like breakdown of their aesthetic, matched with a curated set of songs.
Finally, to make the platform scalable, we’d consider introducing a lightweight database layer to save user sessions, images, and playlists. This would let users return to previous sessions and grow their personalized music–image ecosystem over time.