Wednesday Wonders: Facing the music

For some reason, face morphing in music videos really took off, and the whole thing was launched with Michael Jackson’s video for Black or White in 1991. If you’re a 90s kid, you remember a good solid decade of music videos using face-morphing left and right.

Hell, I remember at the time picking up a face-morphing app in the five dollar bin at Fry’s, and although it ran slow as shit on my PC back then, it did the job and morphed faces. Luckily, it never got killed by the “Oops, Windows isn’t backward compatible with this” problem, so it runs fast as hell now. Or at least it did the last time I used it, and it’s been a hot minute.

If you’ve never worked with the software, it basically goes like this. You load two photos, the before and after. Then, you mark out reference points on the first photo.

These are generally single dots marking common facial landmarks: inside and outside of each eye, likewise the eyebrows and mouth, bridge of the nose, outside and inside of the nostrils, top and bottom of where the ear hits the face, major landmarks along the hairline, and otherwise places where there are major changes of angle.

Next, you play connect-the-dots, loosely at first, but then it becomes a game of triangles. If you’re patient enough and do it right, you wind up with a first image that’s pretty closely mapped out by a bunch of little triangles.

Meanwhile, this entire time, your software has been plopping that same mapping onto the second image. But, at least with the software I was working with then (and this may have changed), it only plops those points relative to the boundaries of the image, not to the features in it.

Oh yeah — first essential step in the process: Start with two images of identical dimensions, and faces placed about the same way in each.

The next step in the morph is to painstakingly drag each of the points overlaid on the second image to its corresponding face part. Depending upon how detailed you were in the first image, this can take a long, long time. At least the resizing of all those triangles happens automatically.

When you think you’ve got it, click the magic button, and the first image should morph into the second, based on the other parameters you gave it, which are mostly the frame rate and how long the transition runs.
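
If you’re curious what all that dot-clicking and triangle-making boils down to under the hood, here’s a rough sketch of the classic approach in Python: hand-marked landmark points, Delaunay triangles, an affine warp per triangle, and a cross-dissolve. To be clear, this is my own toy version leaning on OpenCV and SciPy, not whatever that five-dollar-bin app was actually running.

```python
# A toy sketch of point-and-triangle morphing, assuming two same-sized,
# 3-channel images and hand-marked landmark lists that correspond 1:1.
# Uses OpenCV and SciPy for illustration; none of this is from a real app.
import cv2
import numpy as np
from scipy.spatial import Delaunay

def morph(img1, img2, pts1, pts2, t):
    """Blend img1 toward img2 at fraction t (0.0 = all img1, 1.0 = all img2)."""
    pts1, pts2 = np.float32(pts1), np.float32(pts2)
    # Landmarks for the in-between frame: a weighted average of both sets.
    pts_t = np.float32((1 - t) * pts1 + t * pts2)
    out = np.zeros_like(img1, dtype=np.float32)

    # Triangulate once on the intermediate points; the same triangle indices
    # work for both source images because the landmarks correspond 1:1.
    for tri in Delaunay(pts_t).simplices:
        src1, src2, dst = pts1[tri], pts2[tri], pts_t[tri]
        size = (img1.shape[1], img1.shape[0])
        # Affine-warp each source triangle onto the intermediate triangle.
        warp1 = cv2.warpAffine(img1, cv2.getAffineTransform(src1, dst), size)
        warp2 = cv2.warpAffine(img2, cv2.getAffineTransform(src2, dst), size)
        # Keep only the pixels inside this triangle, then cross-dissolve.
        mask = np.zeros(img1.shape[:2], dtype=np.float32)
        cv2.fillConvexPoly(mask, np.int32(dst), 1.0)
        mask = mask[..., None]
        out = out * (1 - mask) + ((1 - t) * warp1 + t * warp2) * mask
    return np.uint8(out)
```

Here, pts1 and pts2 are your hand-clicked (x, y) landmarks, in the same order for both photos, and stepping t from 0 to 1 spits out the in-between frames.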

And that’s just for a still image. For a music video, repeat that for however many seconds any particular transition takes, times 24 frames per second. Ouch!

I think this will give you a greater appreciation of what Jackson’s producers did.

However… this was only the first computerized attempt at the effect in a music video. Six years earlier, in 1985, the English duo Godley & Creme (one half of 10cc, so… 5cc?) released their video for Cry, and their face-morphing effect is full-on analog. They didn’t have the advantage of powerful (or even wimpy) computers back then. Oh, sure, Hollywood had pulled off some early CGI effects for TRON in 1982, but those simple graphics were nowhere near good enough to swap faces.

So Godley & Creme did it the old-fashioned way, and anyone who has ever worked in old-school video production (or has nerded out over the Death Star firing-up moments in Episode IV) will know the term “Grass Valley Switcher.”

Basically, it was a hardware device that could take the input from two or more video sources, as well as provide its own video in the form of color fields and masks, and then swap them back and forth or transition from one to the other.

And this is what they did in their music video for Cry.

Although, to be fair, they did it brilliantly because they were careful in their choices. Some of their transitions are fades from image A to B, while others are wipes, top down or bottom up. It all depended upon how well the images matched.
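
For anyone who’s never touched a switcher, those two moves are dead simple once you translate them into pixel math. Here’s a little NumPy sketch of a crossfade and a top-down wipe, with made-up frames and function names, purely to show the idea:

```python
# A digital stand-in for what an analog switcher does with two video sources.
# frame_a and frame_b are fake frames; nothing here comes from real hardware.
import numpy as np

def crossfade(frame_a, frame_b, t):
    """Mix two same-sized frames: t=0 is all A, t=1 is all B."""
    return ((1 - t) * frame_a + t * frame_b).astype(frame_a.dtype)

def wipe_top_down(frame_a, frame_b, t):
    """Replace frame_a with frame_b from the top edge downward."""
    out = frame_a.copy()
    cut = int(t * frame_a.shape[0])  # how many rows have been wiped so far
    out[:cut] = frame_b[:cut]
    return out

# Two fake 480x640 RGB frames, caught halfway through each transition.
a = np.zeros((480, 640, 3), dtype=np.uint8)        # all black
b = np.full((480, 640, 3), 255, dtype=np.uint8)    # all white
halfway_fade = crossfade(a, b, 0.5)
halfway_wipe = wipe_top_down(a, b, 0.5)
```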

In 2017, the group Elbow did an intentional homage to this video with their song Gentle Storm, using the same technique well into the digital age — and with a nod from Benedict Cumberbatch.

And now we come to 2020. See, all of those face morphing videos from 1991 through the early 2000s still required humans to sit down and mark out the face parts and those triangles and whatnot, so it was a painstaking process.

And then, this happens…

These face morphs were created by a neural network that basically looked at the mouth parts and listened to the syllables of the song, and then kind of sort of found other faces and phonemes that matched, and then yanked them all together.

The most disturbing part of it, I think, is how damn good it is compared to all of the other versions. Turn off the sound or don’t understand the language, and it takes Jackson’s message from Black or White into the stratosphere.

Note, though, that this song is from a band named for its lead singer, Lil’ Coin (translated from Russian), and the song itself, titled Everytime, is about crime and corruption in Russia in the 1990s. So… without cultural context, the reason for the morphing is ambiguous.

But it’s still an interesting note that 35 years after Godley & Creme first did the music video face morph, it’s still a popular technique with artists. And, honestly, if we don’t limit it to faces or moving media, it’s a hell of a lot older than that. As soon as humans figured out that they could exploit a difference in point of view, they began making images change before our eyes.

Sometimes, that’s a good thing artistically. Other times, when the changes are less benevolent, it’s a bad thing. It’s especially disturbing that AI is getting into the game, and Lil’ Coin’s video is not necessarily a good sign.

Oh, sure, a good music video, but I can’t help but think that it was just a test launch in what is going to become a long, nasty, and ultimately unwinnable cyber war.

After all… how can any of you prove that this article wasn’t created by AI? Without asking me the right questions, you can’t. So there you go.

Image: (CC BY-SA 2.0) Edward Webb

The Information Age? Part one

This was originally going to be a single piece but, as often happens, I got so into the subject at hand that I had to split it. The second installment will appear tomorrow.

“When did Camilla become the queen?” a voice behind me in line at Ralphs suddenly blurts out. The clerk and I look over to see a thirtyish woman who otherwise seems well-off and intelligent holding up a magazine with a cover story title along the lines of “Elizabeth and Camilla face off over the crown!”

The course of that conversation was rather illuminating for me — and I hope for the woman — but before I can get to that part, let’s take a little detour, shall we?

The so-called Information Age, also called the Computer Age, among other names, began around the 1970s, and its major hallmark was that the gathering, processing, and dissemination of information was rapidly moving away from analog media and devices — snail mail, paper files, and typewritten documents among them.

Telephone landlines were also moving away from analog encoding and decoding of signals, as well as rotary dials that literally told the switches what digits you’d dialed by the number of clicks.

I could swear I’ve written about it here before, but I can’t find the link. This “number of clicks” thing is why the regions that were most populous when area codes were instituted got the codes with the fewest clicks: 212, 312, and 213 going respectively to New York City, Chicago, and Los Angeles.
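
The click math is easy enough to sanity-check, if you’re so inclined: a rotary dial sends as many pulses as the digit’s value, except zero, which takes ten. A quick throwaway Python snippet (mine, obviously, not Ma Bell’s):

```python
# Pulse dialing: each digit is sent as that many clicks, except 0 = ten clicks.
def pulses(number: str) -> int:
    return sum(10 if digit == "0" else int(digit) for digit in number)

for area_code in ("212", "312", "213", "907"):
    print(area_code, pulses(area_code), "clicks")
# 212 -> 5 clicks, 312 -> 6, 213 -> 6, while poor Alaska's 907 takes 26
```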

Digital technology had existed by this point, ever since the invention of the transistor, first demonstrated in 1947. However, the earliest computers were huge — the size of entire rooms — very expensive, and ran ridiculously hot because they used vacuum tubes instead of integrated circuits, which didn’t exist yet anyway.

In addition, they were programmed either by physically flipping switches to set parameters or, in the more “advanced” models, by feeding data into them — and reading it out of them — via punch-encoded paper tape or the infamous punch card: yards of the former and tons of the latter. To store just one gigabyte of data would have taken over 16 million typical punch cards, at 64 bytes per card.

That’s bytes, using the 8-bit standard. The highest eight-digit number in binary is 1111 1111, which equals 255. Count the extra value zero, 0000 0000, as well, and you get 256 possibilities, which is why the number 256 is so important in digital technology.

In contrast, a 16-bit system gives you 65,536, which old-timers may remember as the “high color” setting on early graphics cards eons ago. Go much beyond that and you need scientific notation: a 32-bit system gives you about 4.3 billion values, and a 64-bit system about 1.8×10^19.

These are the physical limits on how big a number a system of a given size can generate and work with at one time, as well as how many binary digits it can push through its data bus at once.
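
If you want to double-check any of that, it’s a few lines of Python. This is just my own scratchpad math, using the 64-bytes-per-card figure from above:

```python
# Scratchpad checks on the numbers above; nothing assumed beyond the
# 8-bit byte and the 64-bytes-per-card figure mentioned earlier.
for bits in (8, 16, 32, 64):
    values = 2 ** bits
    print(f"{bits}-bit: highest value {values - 1:,} ({values:,} values counting zero)")
# 8-bit  -> 255 (256 values)
# 16-bit -> 65,535 (65,536 values)
# 64-bit -> 18,446,744,073,709,551,615 (about 1.8 x 10^19)

gigabyte = 2 ** 30          # one gigabyte, in bytes
cards = gigabyte / 64       # punch cards at 64 bytes apiece
print(f"Punch cards per gigabyte: {cards:,.0f}")   # about 16.8 million
```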

The likely reason the Information Age finally took off is tied directly to the birth of what would become the Internet in October 1969, and its continued development on into the 1980s. At the same time, the first small-ish computers that were affordable — at first to businesses, and then eventually to more affluent consumers — started to come onto the market.

Those early machines didn’t do a whole lot, but they created the first generation to go digital: Gen X. To this day, that seems to be the dividing line, and I know very few Boomers who seem comfortable with computers if they were never exposed to them during their professional lives.

I’m not generalizing about Boomers, though — I know Millennials who couldn’t even manage to turn on a laptop, and I’ve met octogenarians who can work their way around a computer like an expert.

And then, about a decade after the first really affordable personal computers, the Internet happened. Well, technically it was the World Wide Web (WWW) at the time, although people rarely bothered to distinguish the two.

The short version is that the WWW is the stuff you see online — the web pages that actually come to your machine. The Internet, meanwhile, is the vast network of computers that hosts the WWW, along with lots of other stuff, and which makes the magic happen that gets a document on somebody’s private server in Tierra del Fuego through an elaborate route that can sometimes cross the entire planet before it shows up on your laptop while you’re on vacation in Iceland, usually in under a second.

Another way to think of it is that the WWW is the letters and packages, and the Internet is the postal service. Of course, anyone who refers to the World Wide Web nowadays will just get looked at funny. There’s a reason that the “www.” part of a URL hasn’t been required for a long, long time.
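
If you like seeing the split made concrete, here’s a toy example using nothing but Python’s standard library. The hostname is just a stand-in for any public site, and boiling it down to two steps is obviously a simplification:

```python
# A toy illustration of the Internet-versus-Web split, standard library only.
# The hostname is a placeholder; this needs a live network connection to run.
import socket
import urllib.request

host = "example.com"

# The Internet part: DNS and routing figure out *where* that machine is.
print("Resolved address:", socket.gethostbyname(host))

# The Web part: HTTP fetches the actual document living at that address.
with urllib.request.urlopen(f"http://{host}/") as response:
    page = response.read()
print("Fetched", len(page), "bytes of HTML")
```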

End result? The Information Age, which has been a wonderful thing.

If I’d tried to write this article, or something like it, in 1970, with all of the above information, it would have involved trips to one or more physical libraries, lots of manual searches, and pulling out books to look for information in a very analog way — by reading it.

Notes by hand, or by using the library’s copiers, paying per page. And if said information weren’t in the library, I would have somehow had to find someone with expertise in the field and either (gasp!) call them on the phone, or send them a neatly typed letter nicely asking for info and waiting for however long it took them to respond, if they ever did.

I could never have adulted in that age. So that’s the upside of The Age of Information.

But there are a couple of downsides, one having to do with the quantity (and quality) of information, and the other with a frequent lack of engagement, which is not unrelated to the first.

I have a terabyte hard drive in my computer, and another terabyte external drive connected to it. That’s a ridiculously huge amount of storage, something we couldn’t even have conceived of needing back in the 1990s.

Now, multiply those two drives of mine by five to bring it to ten terabytes, and that’s enough storage space to hold the entire collection of the U.S. Library of Congress, digitized.

The amount of information available via the Internet dwarfs that number by a lot. It’s been estimated that just the “Big Four” — Google, Amazon, Microsoft, and Facebook — store at least 1,200 petabytes of data.

Note that it’s not clear from my source whether this data includes YouTube and Instagram under the umbrellas of Google and Facebook respectively.

A petabyte is a thousand terabytes, so 1,200 petabytes works out to 1.2 million terabytes — and that figure doesn’t even include all the other storage spaces out there for all the other people. Granted, a lot of smaller sites contract with Google, Amazon, or Microsoft for cloud storage, but I’m sure that many of them don’t.
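
To put that in perspective with some quick and dirty math (decimal units, and the 1,200-petabyte figure is just the estimate above, not anything I’ve measured):

```python
# Back-of-the-envelope scale comparison, decimal units throughout.
TB = 10 ** 12                 # one terabyte, in bytes
PB = 10 ** 15                 # one petabyte = 1,000 terabytes

my_storage = 2 * TB           # my two 1 TB drives
big_four = 1_200 * PB         # the Big Four estimate quoted above

print(f"{big_four / my_storage:,.0f}")   # 600,000, i.e. six hundred thousand of me
```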

Have an email account that isn’t Google or Microsoft? Multiply the storage they allow by their users and add it in. Toss in all the university and library servers, as well as private industry servers — banking, real estate, finance, healthcare, retail, manufacturing, media.

And then don’t forget publicly accessible government servers which, in every country, run from the national level down through regional and administrative levels to the local level. In the U.S., that’s federal, state, county, city. The U.S. has 50 states and more than 3,000 counties, and a county can contain any number of cities, townships, unincorporated areas, boroughs, parishes, or whatever.

It all adds up.

Again, on the one hand, it can be the greatest thing ever. I certainly love it for writing and researching, because when I want to create a link, I just need to tap a couple of keys and do a quick search.

And since I’m a stickler for getting it right, if I happen to be writing a period piece and want to know what the weather was in a certain place on a certain day, or what was on TV on a certain day and time, boom, done. The information is out there.

Nerd stuff. But that’s what I’m into, that’s what I love to do, and I know how to filter — as in which news and websites to ignore, which to use with caution, and which to trust.

Speaking of which, the main thing that Wikipedia is good for is a broad overview, but if you want the real story, always follow their external links to sources. Yes, I will link to them if I’m only going to give a superficial dash of info, especially if it involves pop culture, but they’re the card catalog, not the book, to dredge up an old analog metaphor.

But we live in a paradox in which information very much fits the old line from The Rime of the Ancient Mariner: “Water, water, everywhere, nor any drop to drink.”

The line refers to being stranded in the ocean, surrounded by water, but since it’s seawater, it’s too salty to drink.

Modern version: “Info info everywhere, oh fuck, I’ve got to think?”

A lot of people, for various reasons, don’t or can’t take the time to filter, and each of us is getting bombarded with more in an hour than, say, our grandparents would have been in a week when they were our age.

The interesting part is that analog media — like checkout lane magazines and TV and radio ads — are still very much a part of the mix. So back to the story I started with.

(To be continued…)

Image source: Old computer that used punch cards, found in the London Science Museum, (CC) BY-SA 4.0, via Wikimedia Commons