Alright, folks, gather ’round! Today, I wanna share my little adventure in trying to get a computer to spot people in videos. It’s called “recognizing partners in video,” and let me tell you, it was a bit of a rollercoaster.

It all started with this idea: Could I make something that automatically figures out who’s who in a video? Maybe for, like, automatically tagging people in home videos or something. I had zero experience with this, so, naturally, I jumped right in.
The First Attempt (and Fail)
First, I needed some video to play with. Grabbed a random clip of my friends, easy enough. Then came the “brain” of the operation. Found this thing called OpenCV, an open-source computer vision library that handles image and video work. Installed it, which was a whole other can of worms, with dependencies and whatnot.
- Step 1: Get video. Check.
- Step 2: Install OpenCV. Ugh, finally.
- Step 3: …Profit? Nope.
I found some pre-built face detection examples, slapped one onto my video, and… well, it kinda worked. It drew boxes around faces, which was neat! But it didn’t know who those faces were. Just “face.” Not very helpful.
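For reference, a detection script along those lines might look something like the sketch below. It assumes the `opencv-python` package and the frontal-face Haar cascade file that ships with it. The names `detect_faces`, `boxes_to_corners`, and `max_frames` are made up for illustration and are not taken from the example mentioned above.

```python
def boxes_to_corners(boxes):
    """Convert (x, y, w, h) boxes into (left, top, right, bottom) corners."""
    return [(int(x), int(y), int(x + w), int(y + h)) for (x, y, w, h) in boxes]


def detect_faces(video_path, max_frames=100):
    """Yield (frame, corners) for up to max_frames frames of the video."""
    import cv2  # imported here so the helper above works without OpenCV

    # Pre-trained frontal-face Haar cascade bundled with opencv-python.
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    )
    capture = cv2.VideoCapture(video_path)
    try:
        for _ in range(max_frames):
            ok, frame = capture.read()
            if not ok:  # end of the video, or a frame that could not be read
                break
            # The cascade works on grayscale images.
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            boxes = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
            corners = boxes_to_corners(boxes)
            for (left, top, right, bottom) in corners:
                # Draw a green box; note there is no name attached, just "a face".
                cv2.rectangle(frame, (left, top), (right, bottom), (0, 255, 0), 2)
            yield frame, corners
    finally:
        capture.release()
```

Looping over `detect_faces("friends.mp4")` (a hypothetical filename) gives back each frame with boxes drawn on it, plus the box coordinates. That is all detection provides: where the faces are, not whose they are.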
Getting Smarter (Slightly)
So, detecting faces is one thing, but recognizing them? That’s a whole different ballgame. Found out I needed something called “face recognition,” not just detection. Back to Google I went.

This led me down a rabbit hole of “models” and “training.” Apparently, you gotta “teach” the computer what each person looks like. Found another library, this one called “face_recognition” (creative name). Installed that, and it handles the whole job, from finding faces to recognizing them.
The “Aha!” Moment
I grabbed a bunch of pictures of my friends, one by one. Fed those pictures into the “face_recognition” thing, telling it, “This is Bob,” “This is Alice,” and so on. It chugged away for a bit, creating these… “encodings,” I think they’re called. Basically, it turned each face into a list of 128 numbers.
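A rough sketch of that "this is Bob, this is Alice" step, assuming the `face_recognition` package is installed. `load_image_file` and `face_encodings` are real functions from that library; `encode_photo`, `build_known_faces`, and the `photos_by_name` dict are hypothetical names chosen for this example.

```python
def encode_photo(path):
    """Return one encoding per face found in the photo at `path`."""
    import face_recognition  # heavy dependency, so only imported when needed

    image = face_recognition.load_image_file(path)
    return face_recognition.face_encodings(image)


def build_known_faces(photos_by_name, encode=encode_photo):
    """Map each name to a list of encodings, one per usable photo.

    photos_by_name is a dict like {"Bob": ["bob1.jpg", "bob2.jpg"]}.
    """
    known = {}
    for name, paths in photos_by_name.items():
        for path in paths:
            encodings = encode(path)
            if len(encodings) != 1:
                # No face, or several faces: the label would be ambiguous, so skip.
                continue
            known.setdefault(name, []).append(encodings[0])
    return known
```

Keeping more than one photo per person is a design choice in this sketch, on the theory that a few different angles give the matcher more to work with. Photos containing zero or several faces are skipped rather than guessed at.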
Then, I pointed it at my original video. And… BAM! It started putting names on the boxes! “Bob,” “Alice,” it was actually working! Not perfectly, mind you. Sometimes it got confused, especially if someone turned their head too much. But it was recognizing people!
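How does a name end up on a box? Roughly: the face in the frame gets its own encoding, and that gets compared against the stored ones by distance. The sketch below is a plain-Python illustration of the idea, not the library's actual code. It assumes `known` is a dict mapping each name to a list of encodings, and it borrows 0.6 as the cutoff because that is the default tolerance of the library's `compare_faces` function. `best_match` is a made-up name.

```python
import math


def best_match(known, encoding, tolerance=0.6):
    """Return the name with the closest stored encoding, or "Unknown"."""
    best_name, best_distance = "Unknown", tolerance
    for name, encodings in known.items():
        for candidate in encodings:
            # Euclidean distance between the two lists of numbers.
            distance = math.dist(candidate, encoding)
            if distance <= best_distance:
                best_name, best_distance = name, distance
    return best_name


# Toy 2-number "encodings" just to show the behavior; real ones have 128 numbers.
known = {"Bob": [[0.0, 0.0]], "Alice": [[1.0, 0.0]]}
print(best_match(known, [0.1, 0.0]))
print(best_match(known, [5.0, 5.0]))
```

A turned head produces an encoding that sits farther from the stored ones, which is one plausible reason for the confusion. A face that is not close enough to anyone falls through to "Unknown".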
The Leftovers
So, where am I now? Well, it’s still pretty basic. It only works with the people I “trained” it on. New person? It’s clueless. And it’s slow: it takes a few seconds to process each frame of video. Definitely not “real-time.”
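Two common tricks for the speed problem are to look for faces on a shrunken copy of each frame and to only look at every Nth frame. Here is a minimal sketch of both, assuming `opencv-python` and `face_recognition` are installed. The helper names, the 0.25 shrink factor, and the every-5th-frame default are illustrative guesses, not tuned values.

```python
def should_process(frame_index, every_n=5):
    """True for frames 0, every_n, 2 * every_n, and so on."""
    return frame_index % every_n == 0


def scale_box(box, factor):
    """Scale a (top, right, bottom, left) box from a shrunken frame back up."""
    return tuple(int(round(value * factor)) for value in box)


def locate_faces_fast(frame, shrink=0.25):
    """Find faces on a smaller copy of the frame, in full-size coordinates."""
    import cv2
    import face_recognition

    small = cv2.resize(frame, (0, 0), fx=shrink, fy=shrink)
    # OpenCV frames are BGR; face_recognition expects RGB.
    rgb_small = cv2.cvtColor(small, cv2.COLOR_BGR2RGB)
    boxes = face_recognition.face_locations(rgb_small)
    return [scale_box(box, 1 / shrink) for box in boxes]
```

The trade-off is that small or distant faces can get lost at a quarter of the resolution, and names will lag a little on the skipped frames.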

But hey, it’s a start! I learned a ton about how this whole “computer vision” thing works. It’s messy, it’s complicated, but it’s also kinda magical. I can see how this stuff could be used for all sorts of things, from security cameras to, well, tagging your friends in your vacation videos.
Next steps? I dunno, maybe try to make it faster. Or maybe teach it to recognize my cat. The possibilities are endless (and probably involve a lot more Googling).