Project Overview:
This was a project I worked on with a colleague, in which we automated a process that picks up the visuals (images, videos) a client ran in their ads, downloads them, and attaches labels to them so the data can be sorted and filtered. There was an existing process in place that caused a lot of issues, and the major issue lay in not being able to call individual steps in the process. We used AWS (Lambda, EventBridge, S3, RDS), MongoDB, the Meta API, and Google's Gemini API to build an event bus system to smoothly process visuals.
Downloading + Storing Visuals:
To determine which visuals needed to be downloaded, we used Meta's API to find ads we had not stored yet and sent them through an AWS EventBridge process. Note: when building this project, we originally focused only on Meta data. However, once we had built and tested the events, we incorporated data from other platform APIs such as TikTok and Twitter. This setup made it easy to add new platforms, since only the event Lambdas that needed platform-specific logic had to change.
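To give a feel for that "what's missing?" check, here is a minimal sketch. It assumes the Marketing API's /act_{account_id}/ads endpoint and a MongoDB collection keyed by ad_id; the API version, environment variables, and database/collection names are illustrative, not the project's real ones.

```python
import os
import requests
from pymongo import MongoClient

GRAPH = "https://graph.facebook.com/v19.0"  # API version is illustrative
TOKEN = os.environ["META_ACCESS_TOKEN"]

def find_missing_ad_ids(account_id: str) -> list[str]:
    """Return ad IDs that exist in Meta but have no stored visual yet."""
    # Pull the account's ads from the Marketing API (paginated).
    url = f"{GRAPH}/act_{account_id}/ads"
    params = {"fields": "id", "access_token": TOKEN, "limit": 200}
    remote_ids = []
    while url:
        resp = requests.get(url, params=params).json()
        remote_ids += [ad["id"] for ad in resp.get("data", [])]
        url = resp.get("paging", {}).get("next")
        params = {}  # the "next" URL already carries the query string

    # Compare against what we have already downloaded and stored.
    visuals = MongoClient(os.environ["MONGO_URI"])["ads"]["visuals"]
    stored = {doc["ad_id"] for doc in visuals.find({}, {"ad_id": 1})}
    return [ad_id for ad_id in remote_ids if ad_id not in stored]
```

The list this returns is what gets handed to the event bus described below.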
The first step: get the image/video data from Meta via their API and store it in a database (MongoDB), in addition to storing the actual visual in an AWS S3 bucket.
Event Bus Steps:
-
Technically, at step 0, we have a handler that takes in the missing-visuals request. The handler receives the list of missing ads and breaks the array up, sending each ad individually to the first event.
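A rough sketch of that fan-out handler, assuming the list of ad IDs arrives in the handler's input event; the event source, detail-type, and bus name below are illustrative placeholders.

```python
import json
import boto3

events = boto3.client("events")

def handler(event, context):
    """Step 0: receive the list of missing ads and fan out one event per ad."""
    ad_ids = event["ad_ids"]  # hypothetical input shape
    # EventBridge accepts at most 10 entries per PutEvents call.
    for i in range(0, len(ad_ids), 10):
        entries = [
            {
                "Source": "ads.pipeline",          # illustrative source name
                "DetailType": "VisualRequested",   # matched by the first event's rule
                "Detail": json.dumps({"ad_id": ad_id}),
                "EventBusName": "ad-visuals-bus",  # hypothetical bus name
            }
            for ad_id in ad_ids[i : i + 10]
        ]
        events.put_events(Entries=entries)
```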
The first event runs once per ad/visual request and uses the Meta API to fetch the creative data. We then store that data in MongoDB and upload the visual to the S3 bucket.
Once that process is finished, the event it emits gets picked up by the next event in the bus.
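A minimal sketch of that first event's Lambda, assuming the creative exposes a usable image URL through the Graph API; the field list, bucket/key layout, event names, and bus name are illustrative.

```python
import json
import os
import boto3
import requests
from pymongo import MongoClient

GRAPH = "https://graph.facebook.com/v19.0"  # version is illustrative
s3 = boto3.client("s3")
events = boto3.client("events")
visuals = MongoClient(os.environ["MONGO_URI"])["ads"]["visuals"]

def handler(event, context):
    """First event: fetch the creative, store it, and announce completion."""
    ad_id = event["detail"]["ad_id"]

    # Ask the Marketing API for the ad's creative, including a media URL.
    creative = requests.get(
        f"{GRAPH}/{ad_id}",
        params={
            "fields": "creative{image_url,thumbnail_url,video_id}",
            "access_token": os.environ["META_ACCESS_TOKEN"],
        },
    ).json()["creative"]

    # Download the visual itself and keep a copy in S3.
    media_url = creative.get("image_url") or creative.get("thumbnail_url")
    body = requests.get(media_url).content
    key = f"meta/{ad_id}.jpg"
    s3.put_object(Bucket=os.environ["VISUALS_BUCKET"], Key=key, Body=body)

    # Store the creative metadata alongside the S3 location.
    visuals.update_one(
        {"ad_id": ad_id},
        {"$set": {"creative": creative, "s3_key": key, "platform": "meta"}},
        upsert=True,
    )

    # Tell the bus this visual is stored so the labelling events can pick it up.
    events.put_events(Entries=[{
        "Source": "ads.pipeline",
        "DetailType": "VisualStored",
        "Detail": json.dumps({"ad_id": ad_id, "s3_key": key}),
        "EventBusName": "ad-visuals-bus",
    }])
```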
-
We use a series of visual recognition APIs to get as much information as possible.
In this first labelling step, we focused on AWS Rekognition, which returns an array of labels for each visual.
The responses are stored in our database. Then we move on to the next labelling process, which gets picked up in the Event Bus.
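A minimal sketch of the Rekognition step for images (videos would go through Rekognition's asynchronous StartLabelDetection flow instead); the confidence threshold, label cap, and collection name are illustrative.

```python
import os
import boto3
from pymongo import MongoClient

rekognition = boto3.client("rekognition")
visuals = MongoClient(os.environ["MONGO_URI"])["ads"]["visuals"]

def handler(event, context):
    """Label a stored visual with Rekognition and save the results."""
    detail = event["detail"]

    # Rekognition can read the image straight out of the S3 bucket.
    response = rekognition.detect_labels(
        Image={"S3Object": {
            "Bucket": os.environ["VISUALS_BUCKET"],
            "Name": detail["s3_key"],
        }},
        MaxLabels=25,
        MinConfidence=80,
    )
    labels = [
        {"name": l["Name"], "confidence": l["Confidence"]}
        for l in response["Labels"]
    ]

    visuals.update_one(
        {"ad_id": detail["ad_id"]},
        {"$set": {"rekognition_labels": labels}},
    )
```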
-
The next two events picked up by EventBridge use the following API services:
Google Vision
Google Gemini
Both of these are separate events since we are looking for specific kinds of descriptive labels.
With the popularity of AI services such as Gemini, we are able to fine-tune the types of labels we are looking for. Instead of only being able to grab a generic label like “Blue” for the image background, we can give granular prompts such as “What is the hair color of the model in this image?” The open-endedness of Gemini allows users to customize what they are looking for in their visuals.
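A sketch of how those two calls might look side by side: Google Vision returning broad predefined labels, and Gemini answering a granular prompt. The model name, prompt, and environment variables are illustrative assumptions, not the project's actual configuration.

```python
import os
import boto3
import google.generativeai as genai
from google.cloud import vision

s3 = boto3.client("s3")

def label_visual(s3_key: str) -> dict:
    image_bytes = s3.get_object(
        Bucket=os.environ["VISUALS_BUCKET"], Key=s3_key
    )["Body"].read()

    # Google Vision: broad, predefined labels, similar in spirit to Rekognition.
    vision_client = vision.ImageAnnotatorClient()
    vision_resp = vision_client.label_detection(image=vision.Image(content=image_bytes))
    vision_labels = [l.description for l in vision_resp.label_annotations]

    # Gemini: open-ended, prompt-driven labels tailored to what we care about.
    genai.configure(api_key=os.environ["GEMINI_API_KEY"])
    model = genai.GenerativeModel("gemini-1.5-flash")  # model choice is illustrative
    answer = model.generate_content([
        {"mime_type": "image/jpeg", "data": image_bytes},
        "What is the hair color of the model in this image? Reply in one or two words.",
    ])

    return {"vision_labels": vision_labels, "hair_color": answer.text.strip()}
```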
Putting it Together:
Using AWS EventBridge allowed us to easily rerun specific steps if they failed and to store data along the way. Instead of having to wait until the end of the process to save everything to our database and AWS, we could store data and make updates at each step.
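Because each step is triggered by its own event, a failed step can be replayed by re-emitting just that event rather than rerunning the whole pipeline. A rough sketch, reusing the illustrative event names from the earlier examples:

```python
import json
import boto3

events = boto3.client("events")

def rerun_labelling(ad_id: str, s3_key: str) -> None:
    """Re-emit the 'VisualStored' event so only the labelling steps run again."""
    events.put_events(Entries=[{
        "Source": "ads.pipeline",
        "DetailType": "VisualStored",   # the labelling rules match on this detail-type
        "Detail": json.dumps({"ad_id": ad_id, "s3_key": s3_key}),
        "EventBusName": "ad-visuals-bus",
    }])
```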
This was a really cool project to work on, since I actually really enjoy rewriting and refactoring code to improve it. A lot of it ended up having to be written from scratch in order to fit into this new system, but it was a great challenge, and I learned a lot. This was my introduction to using EventBridge and RDS in AWS, which were really interesting to build with.