Create Your Custom AI Avatar Generation Website with Stable Diffusion
By the end of this case study, you will have the tools you need to build your own custom AI profile picture generator.
As you might have seen recently in your feed, AI-generated profile pictures with Stable Diffusion and Dreambooth are becoming common. How can you capitalize on this AI gold rush?
Imagine the possibilities of creating your own custom AI avatar generation website - one that you can endlessly customize with new prompts and that helps acquire customers in a growing and lucrative market. That's precisely what we did for Cheveux.AI with the help of Stable Diffusion, and the results were nothing short of impressive.
What is Stable Diffusion
Stable Diffusion is a machine-learning model that has taken the world by storm. Developed by the CompVis group at LMU Munich with support from Stability AI and Runway, this openly released model allows anyone to generate detailed, often photorealistic images from simple text descriptions. This lets users unleash their creativity and visualize any setting, character, or object they can imagine.
You can try it on Hugging Face: create an account and visualize whatever comes to mind in seconds. Need inspiration? Find ideas in prompt databases such as Lexica.
Image generation is straightforward: write a prompt and watch as the model creates a series of images before your eyes. Stable Diffusion can also enhance and manipulate existing images with text prompts, and it can be applied to tasks such as inpainting, outpainting, and image-to-image translation guided by a text prompt. This flexibility makes it a valuable tool for various industries and applications, from art and advertising to film production.
How does Stable Diffusion work?
So, how does it work? Stable Diffusion generates an image through a diffusion process: it starts from random noise and removes that noise step by step, guided by the text prompt, until the result closely matches the description. Because the denoising runs in a compressed latent space rather than on full-resolution pixels, it can produce high-quality images in a relatively short amount of time.
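To make the step-by-step idea concrete, here is a deliberately simplified toy sketch in plain Python. It is not Stable Diffusion's actual algorithm: the real model uses a trained neural network to predict noise from the prompt, whereas the hypothetical `predict_noise` below cheats by looking at the target directly. All names are illustrative assumptions.

```python
import random


def predict_noise(current, target):
    """Stand-in for the trained network (hypothetical).

    A real model predicts the noise from the prompt alone;
    this toy version simply measures the gap to a known target.
    """
    return [c - t for c, t in zip(current, target)]


def denoise(target, steps=50, seed=0):
    """Start from pure noise and remove a share of it at every step."""
    rng = random.Random(seed)
    image = [rng.gauss(0.0, 1.0) for _ in target]  # pure Gaussian noise
    for step in range(steps):
        noise = predict_noise(image, target)
        remaining = steps - step
        # Remove an equal share of the remaining noise at each step.
        image = [px - n / remaining for px, n in zip(image, noise)]
    return image


def mean_abs_error(a, b):
    """Average absolute difference between two equally long lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    target = [0.1, 0.5, 0.9, 0.3]  # a four-"pixel" image
    result = denoise(target, steps=20)
    print(mean_abs_error(result, target))
```

The point of the sketch is only the shape of the loop: many small denoising steps rather than one big jump from noise to image.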
Not only is Stable Diffusion a powerful tool, but it's also accessible to the masses. It can run on most consumer hardware, as long as you have a mid-range graphics card, making it widely available to the public.
Why Is Stable Diffusion a Big Deal?
Stable Diffusion is a game-changer in the world of image generation.
Its ability to generate detailed images based on text prompts, combined with its versatility and accessibility, makes it an invaluable tool for a wide range of applications. It has allowed creative people who can't draw to create remarkable visual pieces.
If you are a designer, a stylist, or an interior architect, you can use Stable Diffusion to generate sketches, digital renderings, or images from as little as one word.
Dreambooth
Enough about Stable Diffusion; what is Dreambooth? Dreambooth is a way to make text-to-image models more personalized.
It teaches the Stable Diffusion AI model to link new words with specific subjects and create realistic images of those subjects in different settings. The process starts with just a few pictures of a subject and then fine-tunes the model to create unique images of that subject. This approach allows for tasks such as changing the subject's context, guiding the image's view with text, changing the subject's appearance, and creating artistic images while keeping the subject's main features.
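The "link new words with specific subjects" idea usually comes down to prompt construction: the training pictures are paired with a prompt containing a rare identifier token plus a class noun, and the same token is reused at generation time. The sketch below illustrates this. The identifier `sks`, the class noun, and the style templates are illustrative assumptions, not the prompts used by Cheveux.AI.

```python
IDENTIFIER = "sks"      # rare token that will come to mean "this subject" (assumed)
CLASS_NOUN = "person"   # broad class the subject belongs to (assumed)

# Hypothetical style templates; a real site would maintain many more.
STYLE_TEMPLATES = [
    "portrait of {subject} as an astronaut, digital art",
    "oil painting of {subject}, renaissance style",
]


def instance_prompt(identifier=IDENTIFIER, class_noun=CLASS_NOUN):
    """Prompt paired with the user's own photos during fine-tuning."""
    return f"a photo of {identifier} {class_noun}"


def class_prompt(class_noun=CLASS_NOUN):
    """Generic prompt for the class, used to keep the model from forgetting it."""
    return f"a photo of {class_noun}"


def inference_prompts(templates, identifier=IDENTIFIER, class_noun=CLASS_NOUN):
    """Fill each style template with the fine-tuned subject."""
    subject = f"{identifier} {class_noun}"
    return [template.format(subject=subject) for template in templates]


if __name__ == "__main__":
    print(instance_prompt())
    for prompt in inference_prompts(STYLE_TEMPLATES):
        print(prompt)
```

Adding a new avatar style then only means adding a template; the fine-tuned model stays the same.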
For our website to work, we need training images on which we will fine-tune a Stable Diffusion model to synthesize new images.
The Tech Stack Used
I tell you, this tech stack is the ultimate dream team for any project that wants to launch fast and stay lean. It's like having a team of Avengers but with less spandex and more savings.
The project's main goal was to minimize time-to-market and reduce the fluff to the bare minimum. This is exactly why we chose our battle-tested and highly efficient serverless stack:
- Next.js hosted on Netlify. The dynamic duo provides the frontend and backend of the application, respectively.
- Stripe to generate secure billing pages.
- Supabase as the database, mainly for storing the status of the sessions. This fully-featured, open-source alternative to Firebase provides real-time databases and authentication.
- AWS to store the freshly generated images.
- Banana.dev, a new player in the serverless GPU market, on which we rely for all the compute-intensive tasks: Dreambooth model fine-tuning and inference.
A few words about Serverless Architecture
As the name suggests, a serverless stack eliminates the need for provisioning and maintaining servers. Instead, it runs code and executes functions in response to events, such as HTTP requests or file uploads. This allows for a more cost-efficient and scalable approach, as resources are only used when needed, and charges are based on usage rather than pre-allocated capacity. Additionally, serverless enables faster development and deployment cycles as the focus shifts away from managing infrastructure towards writing and deploying code.
Next.js and Netlify
Next.js is a widely used framework for building server-rendered React applications. It's designed to make it easy for developers to create high-performance and scalable web apps. One of the main reasons we chose to use Next.js for our project was its efficient routing and rendering capabilities.
To make the development process even more efficient, we decided to host our app on Netlify to reduce our dev-ops costs and allow us to focus more on the actual development.
However, instead of using Next.js as a full-stack framework, we chose to use Netlify Functions for the backend, as it was something we were already familiar with. This allowed us to have all our APIs as serverless endpoints served by Netlify. We also set up recurring cron jobs with Netlify Scheduled Functions, which allowed us to run specific scripts regularly, such as:
- checking new user sessions
- generating pictures after checkout
- sending email notifications once the images are generated
This added a significant level of automation to our project and allowed us to separate the frontend and backend, using Next.js for the frontend and Netlify for the backend.
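The scheduled jobs listed above can be thought of as one small state machine that each cron tick advances. The real functions run on Netlify, so the Python sketch below only illustrates the logic. The status names, the `Session` fields, and the injected callables are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Session:
    id: str
    email: str
    paid: bool = False
    status: str = "new"  # assumed lifecycle: new -> generating -> ready -> notified


def run_scheduled_pass(sessions, start_generation, is_generation_done, send_email):
    """One cron tick: move every session forward by at most one step."""
    for session in sessions:
        if session.status == "new" and session.paid:
            # Checkout completed: kick off fine-tuning and image generation.
            start_generation(session)
            session.status = "generating"
        elif session.status == "generating" and is_generation_done(session):
            session.status = "ready"
        elif session.status == "ready":
            # Pictures are available: tell the customer exactly once.
            send_email(session)
            session.status = "notified"
    return sessions


if __name__ == "__main__":
    queue = [Session("a", "a@example.com", paid=True)]
    for _ in range(3):
        run_scheduled_pass(queue, lambda s: None, lambda s: True, lambda s: None)
    print(queue[0].status)
```

Because each pass only reads and updates the stored status, a tick that fails partway can simply be retried on the next run.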
Overall, using Next.js and Netlify allowed us to create a high-performance, scalable and easy-to-maintain web application. It's an excellent choice for developers looking to build server-rendered React applications, and the integration with Netlify provides added capabilities that can help improve the overall development experience.
Supabase
Supabase is an amazing tool that simplifies setting up and managing a database and authentication, which can otherwise be complicated and time-consuming.
The database consists of one table, which helps us store the necessary information to verify payments, generate images and notify the customer. We are using it to link a user to a Stripe session and, most importantly, to a set of freshly generated user profile pictures that are stored separately in AWS.
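As a rough illustration of what a row in such a single table might hold, here is a sketch of the record as a Python dataclass. Every field name, the key layout, and the `gallery_keys` helper are assumptions made for illustration; the actual Cheveux.AI schema is not described in this article.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AvatarSessionRow:
    """One row per purchase (hypothetical schema)."""
    id: str
    email: str                     # where to send the "pictures are ready" email
    stripe_session_id: str         # links the row to the Stripe checkout
    payment_verified: bool         # set once the payment has been confirmed
    images_prefix: Optional[str]   # storage prefix of generated pictures, None until ready
    notified: bool = False


def gallery_keys(row: AvatarSessionRow, count: int) -> List[str]:
    """Derive the storage keys of a session's pictures from its prefix."""
    if row.images_prefix is None:
        return []
    return [f"{row.images_prefix}/{index:03d}.png" for index in range(count)]


if __name__ == "__main__":
    row = AvatarSessionRow("abc", "user@example.com", "cs_test_123", True, "generated/abc")
    print(gallery_keys(row, 3))
```

Storing only a prefix in the database keeps the row small while the pictures themselves live in object storage.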
AWS
Even though Supabase does allow for large file storage, we found that using AWS was the best option for most of our projects. AWS provides a robust and flexible solution for managing large amounts of data, which is essential for our application that generates and stores user profile pictures.
While integrating AWS does add a slight level of complexity to the project, it is worth it in the long term. It allows us to keep costs low by only storing the necessary data and avoiding extra storage fees.
Banana.dev
I can already hear you ask: what's 🍌 Banana? Banana is one of the most cost-effective serverless GPU providers on the market. You write code, they host it, and give you an endpoint that you can then call to fine-tune your model and infer new profile pictures. This means you don't have to worry about setting up and maintaining your servers, which can be time-consuming and costly. Instead, you can focus on developing your model, uploading it, and calling its endpoint.
We wrote custom code that fetches the source images from AWS, trains Dreambooth on these images, infers 100 different pictures, stores the new pictures in AWS, and then shuts off after about 30 minutes.
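The sequence described above (fetch, train, infer, store) can be sketched as a small orchestration function. This is a minimal sketch, not the code deployed on Banana: the function name, its parameters, and the injected `fetch_images`, `train`, `infer`, and `store` callables are hypothetical placeholders for the real AWS and Dreambooth calls.

```python
def run_job(session_id, fetch_images, train, infer, store, prompts, num_pictures=100):
    """Run one end-to-end avatar job and return how many pictures were stored."""
    images = fetch_images(session_id)        # 1. pull the user's uploads from storage
    if not images:
        raise ValueError("no source images found for this session")
    if not prompts:
        raise ValueError("at least one prompt is required")

    model = train(images)                    # 2. fine-tune on those images

    pictures = []
    for index in range(num_pictures):        # 3. generate the batch of avatars
        prompt = prompts[index % len(prompts)]  # cycle through the prompt list
        pictures.append(infer(model, prompt))

    store(session_id, pictures)              # 4. write the results back to storage
    return len(pictures)


if __name__ == "__main__":
    saved = {}
    total = run_job(
        "demo",
        fetch_images=lambda sid: ["1.jpg", "2.jpg"],
        train=lambda imgs: "fine-tuned-model",
        infer=lambda model, prompt: f"{model}:{prompt}",
        store=lambda sid, pics: saved.update({sid: pics}),
        prompts=["astronaut", "oil painting"],
        num_pictures=4,
    )
    print(total, saved["demo"])
```

Passing the heavy steps in as callables keeps the flow testable on a laptop without a GPU.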
There are also generic repos that you can fork to work with Stable Diffusion on Banana.
Project Features & Benefits
Cheveux.AI offers a great experience for users looking to create personalized avatars.
Overall features
- Option to choose a plan (basic or premium)
- Secure checkout with Stripe
- Upload pictures to AWS servers
- Dedicated gallery to view profile pictures generated with Stable Diffusion
- Download and share personalized avatars on social media
- Secure infrastructure and data management with AWS and Stripe
- No sharing of user pictures with third parties
The flow
It starts with a simple user flow: you choose your desired plan (basic or premium) depending on the features you wish to have access to. Then, you securely check out with Stripe. This allows you to proceed to the upload section, where you can upload your pictures to our AWS servers. Once the upload and generation are complete, you will be able to see your personalized avatars in your dedicated gallery.
From there, you can download the photos, share them on social media, or use them as your profile pictures on other platforms. The website's infrastructure and data management are secure, with AWS and Stripe providing the necessary protection for users' data. This means that your pictures are safe with us, and we take the security of our users seriously. The user's pictures are not shared with any third party.
Fine-Tune with Dreambooth
We suggest you check Hugging Face's Training page to read more about the complete process and find code samples ready to use.
The Main Benefit: Cost-Efficiency
The main benefit of our architecture over the alternatives is that we run a very lean and cost-efficient flow. Most AI profile picture websites rely on the Astria.ai API to fine-tune their models and run their prompts, but that ease of use costs 3 to 4 times as much as our setup with Banana.
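To see how that multiplier plays out per customer, the arithmetic can be sketched as follows. The GPU price and sale price used in the example run are made-up placeholders, not real Banana or Astria rates; only the 30-minute run time and the 3 to 4 times multiplier come from this article.

```python
def gpu_cost_per_session(price_per_gpu_hour, minutes_per_session=30):
    """Compute cost of one fine-tune-and-generate run billed by the hour."""
    return price_per_gpu_hour * minutes_per_session / 60


def managed_api_cost_range(self_hosted_cost, low_multiplier=3, high_multiplier=4):
    """Equivalent cost range if a managed API is 3 to 4 times more expensive."""
    return (self_hosted_cost * low_multiplier, self_hosted_cost * high_multiplier)


def margin_per_sale(sale_price, compute_cost):
    """What is left of one sale after paying for compute."""
    return sale_price - compute_cost


if __name__ == "__main__":
    hourly_rate = 2.0   # placeholder GPU price per hour, NOT a real quote
    sale_price = 10.0   # placeholder plan price, NOT the actual pricing

    own_cost = gpu_cost_per_session(hourly_rate)
    low, high = managed_api_cost_range(own_cost)
    print("self-hosted compute per session:", own_cost)
    print("managed API per session:", low, "to", high)
    print("margin self-hosted:", margin_per_sale(sale_price, own_cost))
    print("margin managed (worst case):", margin_per_sale(sale_price, high))
```

Swapping in your own provider's rate and plan price shows quickly how much of each sale the compute bill consumes.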
Results & Wrap-up
The three most prominent profile picture applications currently generate $70k-130k per month, with Lensa.ai being the outlier at over $300k per month.
Not sure how to acquire customers?
Check out this great guide on how you can leverage AI to generate leads at a low cost: Create Your Lead Generation Tool with AI (GPT-3)