Rob Enderle
Contributor

Nvidia’s StyleGAN could up-end a lot of creative industries

Opinion
Apr 22, 2022 • 5 mins
Artificial Intelligence, Augmented Reality

If you take a lot of images, add in a dose of powerful artificial intelligence, and blend them all together, what do you get?


Nvidia (a client of the author) has lately been doing a lot of fascinating things, from creating workstations designed to conceive the metaverse to digital assistants that are evolving into human digital twins to tools that could let anyone create compelling art. One of the more interesting tools is StyleGAN, a generative adversarial network that creates realistic human faces by, in effect, blending features learned from large collections of pictures.

The training set for this artificial-intelligence-based offering, Nvidia's Flickr-Faces-HQ (FFHQ) dataset, contains 70,000 high-quality PNG images (each at a resolution of 1024×1024 pixels), giving a user almost unlimited flexibility of source material.

StyleGAN has been around since 2018, became more widely available in 2019 when Nvidia open-sourced the code, and is now in its third iteration; StyleGAN3 was released in October 2021.

The advantages for those of us who work with images include the eventual ability to craft new images from large pools of protected source material without worrying about copyright infringement. And as the process evolves to include other kinds of images (it’s basically an image-blending engine), it could let you blend professional photographs from a variety of sources to create uniquely beautiful images, or paintings built from memory or imagination with little or no connection to anything real.
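That blending has a concrete mechanism: StyleGAN renders a face from a latent vector, and moving between two latent vectors produces a smooth morph between two faces. (StyleGAN’s actual “style mixing” swaps styles layer by layer in its W space; the snippet below is only a simplified sketch of latent interpolation, with the vector size and variable names chosen for illustration, not taken from Nvidia’s code.)

```python
import numpy as np

def slerp(z1, z2, t):
    """Spherical interpolation between latent vectors z1 and z2, for t in [0, 1]."""
    omega = np.arccos(np.clip(
        np.dot(z1 / np.linalg.norm(z1), z2 / np.linalg.norm(z2)), -1.0, 1.0))
    if np.isclose(omega, 0.0):            # vectors nearly parallel: plain lerp
        return (1 - t) * z1 + t * z2
    return (np.sin((1 - t) * omega) * z1 + np.sin(t * omega) * z2) / np.sin(omega)

rng = np.random.default_rng(0)
z_a = rng.standard_normal(512)   # latent code that would render "face A" (illustrative)
z_b = rng.standard_normal(512)   # latent code that would render "face B"
half = slerp(z_a, z_b, 0.5)      # a 50/50 blend; fed to the generator, it would
                                 # render a face sharing traits of both
```

Sweeping t from 0 to 1 yields the smooth face-to-face morphs that make the tool feel like an image blender.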

An AI-driven image-blending tool like StyleGAN could dramatically change and improve a number of industries and practices (or be used for more nefarious deepfakes). Let’s explore.

Automated crime-sketch artists?

I watch a lot of crime procedurals on TV; there’s usually a segment where a witness sits in front of a sketch artist to create an image of a suspect they observed. That entire process could be automated by a conversational AI. The witness could be shown an evolving picture with examples of features that are blended on command until the picture matches the witness’s memory. The end result would be a photorealistic image that facial recognition programs could use to locate the criminal quickly. (The collateral damage would be that there would no longer be a need for law enforcement sketch artists.)
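The refinement loop described above amounts to a simple guided search: show the witness a few candidate tweaks to the current composite and keep whichever one they say looks closer. In this toy sketch the “witness” is simulated by distance to a hidden target vector; every name, dimension, and step size below is illustrative, not part of any real forensic tool.

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 64
target = rng.standard_normal(dim)   # latent code of the remembered face (simulated)
z = rng.standard_normal(dim)        # starting composite shown to the witness
start_err = np.linalg.norm(z - target)

for _ in range(30):
    # Propose a handful of random feature tweaks; the witness keeps the
    # candidate that looks closest to their memory (here, smallest L2 distance).
    candidates = [z + 0.3 * rng.standard_normal(dim) for _ in range(4)] + [z]
    z = min(candidates, key=lambda c: np.linalg.norm(c - target))

final_err = np.linalg.norm(z - target)
```

Because the current composite is always among the candidates, each round can only hold steady or improve, which is exactly the “getting warmer” dynamic of a sketch-artist session.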

One area where this technology might have a big impact is in locating kidnapped children. The AI could rapidly age the image of the child so they might be better identified later in life. 

Marketing, TV, and movies

A lot of marketing material uses stock images or live models in production. The problem with the former is that the same images can be used in other campaigns — inadvertently connecting disparate ones. For instance, if the same image is used in a medication ad and for a restaurant, customers might associate the two and avoid the restaurant. The same problem could result from using a live model who later ends up on another campaign, since some actors and models move between competitors. And live models and actors can have personal problems that damage a brand or ad campaign.

But using blended images and videos from something like StyleGAN means you can create an image that can be copyrighted by your firm, is distinct from any stock image, and is unconnected to any actor or model, living or dead. The result is lower cost and, more importantly, lower risk. You get results faster, and the need for models and actors would be reduced. You might only use actors in 3D-imaging suits that obscure their identities — and with advances in metaverse tools and 3D imagers, you might not even need them. It also takes us a big step closer to not needing actors for movies.

Human digital twins?

Another area Nvidia is exploring involves the creation of digital twins for the metaverse. As the AI behind these twins improves, they’d become increasingly indistinguishable from the source material. When that happens, who owns the result? You can make an argument that an employee should own their digital twin. But if a tool like StyleGAN is used to blend both images and an employee’s skills, that position becomes more tenuous; a company might be able to defend its ownership of the result. (I expect future employees and unions could have significant problems with something like this being used to displace employees without compensation.)

A blended future

The ability to blend source material that may (or may not) be protected, at scale, is compelling — especially if it eliminates potential legal issues. Nvidia’s process uses a vetted source of images that eliminates legal exposure, but tools like this don’t have to rely only on stock photo databases; they could be used on images of public figures taken from social media posts, movies, or other advertising material.

At some point, I expect this technology will force a rewrite of copyright laws dealing with composite images. At the same time, these tools would reduce the effort and cost of creating photorealistic movies and images for business and entertainment. It’s an early example of major changes coming to current business practices — and to the related income of models, actors, directors, and artists tasked with creating images that depict remembered events.

Tools like StyleGAN will redefine the future of virtual media for business, government, and entertainment. 


Rob Enderle is president and principal analyst of the Enderle Group, a forward-looking emerging technology advisory firm. With more than 25 years’ experience in emerging technologies, he provides regional and global companies with guidance on how to better target customer needs with new and existing products; create new business opportunities; anticipate technology changes; select vendors and products; and identify best marketing strategies and tactics.

In addition to IDG, Rob currently writes for USA Herald, TechNewsWorld, IT Business Edge, TechSpective, TMCnet and TGdaily. Rob trained as a TV anchor and appears regularly on Compass Radio Networks, WOC, CNBC, NPR, and Fox Business.

Before founding the Enderle Group, Rob was the Senior Research Fellow for Forrester Research and the Giga Information Group. While there he worked for and with companies like Microsoft, HP, IBM, Dell, Toshiba, Gateway, Sony, USAA, Texas Instruments, AMD, Intel, Credit Suisse First Boston, GM, Ford, and Siemens.

Before Giga, Rob was with Dataquest covering client/server software, where he became one of the most widely publicized technology analysts in the world and was an anchor for CNET. Before Dataquest, Rob worked in IBM’s executive resource program, where he managed or reviewed projects and people in Finance, Internal Audit, Competitive Analysis, Marketing, Security, and Planning.

Rob holds an AA in Merchandising, a BS in Business, and an MBA, and he sits on the advisory councils for a variety of technology companies.

Rob’s hobbies include sporting clays, PC modding, science fiction, home automation, and computer gaming.

The opinions expressed in this blog are those of Rob Enderle and do not necessarily represent those of IDG Communications, Inc., its parent, subsidiary or affiliated companies.