Generative AI is here, and it will change our lives

[Image: DALL·E output for the prompt “castle on the edge of a cliff at sunset”]

The image above was created using generative AI. More specifically, it was created with DALL-E, a text-to-image tool from OpenAI. I simply entered the text “castle on the edge of a cliff at sunset,” and this is one of the four images the tool spit out. The implications for creative work going forward are obvious.

Text-to-image tools like DALL-E and Midjourney were already creating plenty of buzz in late 2022, but the release of ChatGPT on November 30, 2022 created a sensation. It attracted more than a million users within days and has sparked a frenzy around generative AI.

ChatGPT is a chatbot that can write essays, answer questions, write poetry and write code. It’s flawed, since its answers are not always accurate, but the results can often be stunning. After you play around with the tool for a few minutes, it becomes clear how this will change the creative process forever.

What is Generative AI?

We’ve been hearing about artificial intelligence, or AI, for years. So why all the fuss now? This paper from Sequoia on generative AI summarizes it nicely in the first few paragraphs:

Humans are good at analyzing things. Machines are even better. Machines can analyze a set of data and find patterns in it for a multitude of use cases, whether it’s fraud or spam detection, forecasting the ETA of your delivery or predicting which TikTok video to show you next. They are getting smarter at these tasks. This is called “Analytical AI,” or traditional AI.

But humans are not only good at analyzing things—we are also good at creating. We write poetry, design products, make games and crank out code. Up until recently, machines had no chance of competing with humans at creative work—they were relegated to analysis and rote cognitive labor. But machines are just starting to get good at creating sensical and beautiful things. This new category is called “Generative AI,” meaning the machine is generating something new rather than analyzing something that already exists.

Generative AI tools are based on large language models and large image models. An image system learns what a cat looks like by processing millions of images tagged as cats, and is then able to generate new images of a cat based on prompts. It learns artistic styles the same way, by analyzing millions of images, so it can generate an image of a cat in the style of a particular artist or of a movement such as impressionism. The same applies to text. You can ask ChatGPT to create a poem in the style of a particular poet, or a short story in the style of a particular writer. The possibilities are endless.
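For readers who like to see an idea in code, here is a deliberately tiny sketch of the "learn patterns from examples, then generate something new" loop. It is not how ChatGPT or DALL-E work internally. It is a toy word-pair (bigram) model written for illustration, and the function names and the three-sentence training text are made up for this example.

```python
import random
from collections import defaultdict


def train_bigram_model(corpus):
    """For each word, record every word that followed it in the training text."""
    followers = defaultdict(list)
    for sentence in corpus:
        words = sentence.split()
        for current, nxt in zip(words, words[1:]):
            followers[current].append(nxt)
    return followers


def generate(followers, start, max_words=8, seed=0):
    """Build a word sequence by repeatedly sampling a word seen after the current one."""
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    words = [start]
    # Stop at the length limit or when the last word was never followed by anything.
    while len(words) < max_words and words[-1] in followers:
        words.append(rng.choice(followers[words[-1]]))
    return " ".join(words)


# Hypothetical training text, standing in for the millions of examples a real model sees.
corpus = [
    "the cat sat on the mat",
    "the cat chased the mouse",
    "the dog sat on the rug",
]

model = train_bigram_model(corpus)
print(generate(model, "the"))
```

The toy model can produce a sentence that appears nowhere in its training text, yet every pair of neighboring words in that sentence was seen during training. Real generative models are vastly more sophisticated, but the basic dependence on patterns in the training data is the same.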

How does Generative AI work?

This Q&A from MIT provides a good overview of how image generators use diffusion models to create new images. Today, the creativity of generative AI leverages human creativity from the past, which can be very powerful. But it still has its limits:

In a sense, it seems like these models have captured a large aspect of common sense. But the issue that makes us, still, very far away from truly understanding the natural and physical world is that when you try to generate infrequent combinations of words that you or I in our working our minds can very easily imagine, these models cannot.

For example, if you say, “put a fork on top of a plate,” that happens all the time. If you ask the model to generate this, it easily can. If you say, “put a plate on top of a fork,” again, it’s very easy for us to imagine what this would look like. But if you put this into any of these large models, you’ll never get a plate on top of a fork. You instead get a fork on top of a plate, since the models are learning to recapitulate all the images it’s been trained on. It can’t really generalize that well to combinations of words it hasn’t seen.

Many of these issues can be worked around with creative prompts and an iterative process, but the example cited above helps explain both the power and some of the limitations of these tools. These problems will likely be solved as the models improve.

Early Stages

We’re just at the beginning of this generative AI revolution. The systems are clunky and imperfect. The generated text often contains factual errors, and the image generation tools are raising a host of legal questions around copyright issues.

But the tools are incredibly powerful and they’re here to stay. Yes, creative work will never be the same. Many jobs will become less necessary. But many creative workers and knowledge workers will become better and more productive by learning to leverage these tools.

So check them out and start playing around with them. You’ll have fun, and you’ll begin to appreciate the amazing potential around this transformative technology.

  
