It’s a strange world that we live in. Now that generative AI is so commonplace, it’s reasonable to start distrusting everything that you see and read. The ability to make images of whatever you want is… well, it’s pretty cool honestly.
When I do photoshoots, which I try and do at least once a year, it’s typical to spend a thousand or so dollars on a flight, hotel room, food, new clothes, camera stuff, etc etc. And what do I get out of it? A bunch of pictures of me (and a mini vacation!). But in this new terrifying age, could I achieve a similar or better result by using generative AI for less money?
The short answer is kind-of, but with a shit-ton of caveats.
It turns out that there is a tool called “Dreambooth” which, given some input photographs of yourself, allows you to generate a model that you can use to make a huge slew of pictures of almost whatever you can imagine.
Here’s a few I made.
Not one of these pictures is real – these were all generated with generative AI, and I find this deeply terrifying on a number of levels. Anyway, here’s how I did it!
The Basics
If you’re anything like me, you are a crossdresser/genderfluid/whatever person who has spent a lot of time taking photos of yourself. Cool, you’ve already got a training set! You’ll need a diverse set of about 90 of these images, each cropped to a square: mostly of your face from different angles, a few of your torso, and some of your full body.
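If you’d rather script the square crop than do 90 of them by hand, a minimal sketch might look like the one below. This is an illustration only, not how the images for this post were prepared; `center_square_box` is a made-up helper name. It just works out the largest centered square for a given image size and returns it as a `(left, top, right, bottom)` tuple, which is the box shape that image libraries such as Pillow expect for cropping.

```python
def center_square_box(width, height):
    """Return (left, top, right, bottom) for the largest centered square.

    Hypothetical helper for illustration: it only does the arithmetic,
    it does not open or save any image files.
    """
    side = min(width, height)       # the square can't be bigger than the short edge
    left = (width - side) // 2      # trim equally from both sides...
    top = (height - side) // 2      # ...or equally from top and bottom
    return (left, top, left + side, top + side)


# A 1200x800 landscape photo keeps the middle 800x800 square.
print(center_square_box(1200, 800))  # (200, 0, 1000, 800)
```

A centered crop is a blunt instrument: if your face is off to one side it can get cut off, so anything the script gets wrong is still worth fixing by hand.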
I used getimg.ai to do all of this (warning, it does cost money – I threw about $100 at it). From there you can use your training set (read the instructions!) to make some models. I mostly used their Stable Diffusion 2.1 model, but also Dreamshaper and ICBINP Afterburn. Their instructions say to use around 20 images, but I found that models trained on that few didn’t fully capture my face, while models trained with 90ish images did really well. I also bumped the ‘train steps’ to 10000 (the max), and the number of class images to 2000. I’m no expert, but that seemed to help.
Made with a small training set. I don’t look like this.
After your models have been trained (this may take a few hours) you’re ready to drop into their AI generator tool and start creating.
Creating images
To create an image, you need a prompt – which is essentially how you tell the model what to create. As an example, using a fairly simple prompt like “woman sitting on a bench” gives me stuff like this:
Which… look fine if you don’t look closely. But if you look closely… yikes. I look like a pug! Occasionally with extra legs!
Expanding the prompt to “woman sitting on a bench, detailed face, smooth skin, sharp focus”, and adding a negative prompt of “yellow face, stubble, ugly, poorly drawn hands, poorly drawn feet, poorly drawn face, out of frame, extra limbs, disfigured, deformed, body out of frame, blurry, bad anatomy, blurred, watermark, signature, cut off, old woman, elderly woman, wrinkles” yields much better results:
These are much better, but not perfect.
And that’s kind of par for the course – much like a photoshoot where you take hundreds of photos and only end up with a small handful of acceptable photos, the same applies here. I’ve been sick with bronchitis over the last few weeks (and it was the holidays!) so I ended up sitting motionless on the couch generating thousands and thousands of images, and deleting most of them. Even with the negative prompts to push the models to better images, you still end up with a lot of things that are unacceptable – extra legs, extra arms, malformed face, odd poses.
Quadboobs. If I had 4 hands, Quaid…
It’s a frustrating process. A lot of trial and error, figuring out what kind of picture you want, trying new prompts, fine-tuning the prompts, trying new models, playing with the number of steps, etc etc for hours and hours and hours. It’s exhausting, and you run out of creative juice real fast. I relied on a number of different resources to get a sense of what kinds of things I could even think to try and generate. This website is a great resource for being inspired by what things the various base models were trained on. Along with that, there was a lot of trawling the web for Dreambooth prompts and art styles and thinking of periods in history that might be cool to emulate.
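Since so much of this comes down to tweaking prompts over and over, it can help to keep the pieces as lists and join them up only at the end, so adding or dropping one term doesn’t mean retyping the whole string. Here is a rough sketch of that idea. `build_prompts` is a hypothetical helper, not part of getimg.ai or any Stable Diffusion tool; it only produces the two comma-separated strings you would paste into the prompt and negative prompt boxes.

```python
def build_prompts(subject, details, avoid):
    """Join prompt pieces into (prompt, negative_prompt) strings.

    Hypothetical helper for illustration. Repeated terms in `avoid`
    are dropped while keeping the original order.
    """
    prompt = ", ".join([subject, *details])
    negative = ", ".join(dict.fromkeys(avoid))  # dict keys keep first-seen order
    return prompt, negative


prompt, negative = build_prompts(
    "woman sitting on a bench",
    ["detailed face", "smooth skin", "sharp focus"],
    ["extra limbs", "blurry", "watermark", "blurry"],
)
print(prompt)    # woman sitting on a bench, detailed face, smooth skin, sharp focus
print(negative)  # extra limbs, blurry, watermark
```

Swapping the subject or adding one more thing to avoid is then a one-line change, which matters when you are on your thousandth variation.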
Emotional Impact
I mean, let’s be real – there’s nothing healthy about sitting for hours and hours, days on end, trying to get a computer to generate pixel-perfect images of yourself. It’s deeply narcissistic, and allows for an opportunity to build up an image of yourself that isn’t real, and is impossible to achieve. I don’t know if it made me happier. Doing a real photoshoot with real pictures of what I actually look like? There’s a certain level of accomplishment associated with that. This felt addictive.
I’d only really intended to spend a tiny amount of money on this, partly out of curiosity, partly as fodder for a blog post (ayyyy thanks for reading!), and partly to start getting a better handle on the available AI tools. I slotted in $20, and then ran out of credits, topped up some more credits, and more, and then more. This was basically all I did for days. I think I finally, eventually, got bored of it. But I’m also curious how long that boredom is going to last for before I think “ooooh, I haven’t seen Liz as a woolly mammoth yet!” (I have, actually – but that was an accident – I was trying to generate an ice-age woman). And I can imagine someone else pouring more and more time, money, and effort into this trying to tackle their own self-image issues.
Gender and gender identity are complex things already. Adding in a tool to generate pictures of yourself to ease your gender-identity-related-feelings doesn’t seem helpful, but maybe I’m wrong! Maybe it’ll work for you. As for me, am I going to remember the work it took to put together a real photoshoot, get real pictures of me and know that’s how I actually looked? Definitely. Am I going to look at these AI-generated images and feel good about myself? Probably not. It’s the creation of a fantasy, not real life. Maybe that’s ok. Either way, I’ll still look at them.
Photograph vs artwork
So these models can generate reasonably good looking photographs – I’m conflicted on that front, because those things aren’t real – but what about strictly artwork? I’m much less conflicted about generating cool art of Liz in oil paintings, or charcoal, or watercolor. There’s no way to trick yourself or someone else into thinking that’s what you look like – it’s not meant to. I ended up having a lot of fun trying to think of artists and art styles and what Liz might be doing in those pictures. That felt, for lack of a better word, less fraudulent.
Ethics
Being able to generate reasonably cool art is pretty fucking awesome, right? Kind of. If I had paid a real artist to do an oil painting of Liz, I would likely cherish it forever. Getting a robot to do thousands of them over the course of a day feels cheap in lots of ways. It’s essentially free which is great for me, but not for artists. And these generative tools are all running on the labor of thousands of individuals who’ve made real art, but not made any money from their work being copied by machines. These tools are in a very early stage still and are set only to improve as time goes on. We’re in the very early stages of trying to understand their impact and their place in society.
I guess in a year or so, I’ll check back in on how AI art generation has improved! Yay, THE FIRST BLOG POST IN A SERIES.
The possibilities are endless.
Full set of images and prompts here: Dreambox / Stable Diffusion prompts