• 2 hours
  • Easy

Last updated on 8/19/24

Beyond ChatGPT: Text, Image, and Video Generators

In this course, we’ve been focusing on ChatGPT, because this is the most famous AI tool at the moment. It was ChatGPT that brought AI into the public eye after surprising everyone (including AI researchers!) with its skills.

The aim of this chapter is to give you an overview of the ongoing AI revolution, particularly in terms of generative AI, like ChatGPT, which generates text. We’re going to talk about:

  1. other AI text generators.

  2. AI image generators.

  3. AI audio, video, slide and website generators.

Yep, there are still more surprises in store!

Explore Other AI Text Generators

ChatGPT is the famous one, but you should know that it has AI competitors with the same purpose. Some of them are reaching a level of quality that is close or comparable to ChatGPT’s:

  • Bing AI: Microsoft’s search engine, Bing, lets you chat with an AI that is in fact the same one that powers ChatGPT. The reason is simple: Microsoft is by far the largest investor in OpenAI, so it is free to reuse OpenAI’s technology in its own products. At the time of writing, its investment gives it a 49% stake in the business.

Screenshot showing the Bing main page

Screenshot showing a conversation in Bing

Microsoft Bing AI can do web searches and reuse them in its responses

Bing even provided me with an image generated by DALL-E and gave me some suggestions of where to search next!

  • Google Bard (renamed Gemini in February 2024): this is an AI model similar to ChatGPT, but powered by Google. It’s generally considered to be of lower quality than ChatGPT, but it’s likely to evolve quickly.
    The paradox is that Google has been at the cutting edge of AI research for many years. Its researchers invented the Transformer architecture (the “T” in GPT), which was made public and has since been used by competitors and the open-source community.

  • Meta LLaMa: this is an AI model developed by Meta, the company behind Facebook. Only part of it was open source initially, but after the model leaked online, the entire system became publicly available. This fueled a whole raft of open-source and grassroots AI projects. One of the strengths of LLaMa is that you can use it without having to pay for a load of servers. Some people have managed to run the AI on a home computer, which was unthinkable just a few months ago, when a whole server farm was needed!

  • Mistral: a very powerful open-source AI developed by a French start-up.

As you can see, with the notable exception of Mistral, these AI programs all come from the tech giants sometimes known collectively as GAFAM (the acronym for the Web giants Google, Apple, Facebook, Amazon, and Microsoft). We might worry about AI being controlled by such vast companies. However, you need to be aware that there’s also an AI revolution happening in the open-source world.

The open source world evolves extremely quickly and some of these AIs can run on laptop computers or even smartphones, without needing to access server farms within data centers.

Discover AI Image Generators

Beyond AI text generation, the same principle can be applied to generating images. The results, highly visual by their very nature, have even managed to impress the naysayers. 😳

Images generated by AI can be of professional quality. These might be artistic pictures that look like paintings, 3D images, or realistic “photos.” It has become extremely difficult (if not impossible) to tell whether an image is real or not. Take the example of this viral image of the Pope wearing a puffer jacket.

A (fake) image of the Pope wearing a luxury puffer jacket

It was generated by AI and it never happened. But it looks so realistic!

Using only a simple text command, such as “an astronaut on a horse is riding through space” or “the Pope wearing a puffer jacket,” these AI engines can generate high-quality images.

Again, in the space of just a few months, these images have gone from being pretty uninspiring to highly impressive. Take a look at the difference between two images generated one year apart by Midjourney, both using the description “country roads take me home”:

The same image generated one year apart by Midjourney AI

The progress within one year is just staggering, so imagine what it’ll be like in another year or five! 🧐

Here are some well-known AIs that generate images:

  • Midjourney: AI that generates high-quality images based on a simple text command. A free version is available. However, you do need to sign up to a Discord chat server to use it, which is not particularly intuitive if you’re not used to it (but it’s not super complicated either). Fortunately, a more intuitive web version is starting to become available.

  • DALL-E: developed by OpenAI, the creators of ChatGPT. It is now integrated into ChatGPT via ChatGPT Plus (as well as the Team and Enterprise plans).

  • Stability.ai: the company behind Stable Diffusion, an open-source AI (which sets it apart from the previous ones). This means that you can get hold of the source code and improve it. Some people have really gone to town with this, sharing their templates with configurations for generating images of a certain type.
    If you want to try the online version, you can use the DreamStudio service.

Discover AI Audio, Video, Slide, and Website Generators

AIs can actually generate all kinds of documents today. These are not particularly impressive just yet, but you might find in a few months that the quality will improve and they’ll have their very own “ChatGPT moment” too.

In theory, you can generate:

  • audio: a good-quality, human-like voice can be generated using tools such as ElevenLabs (the AI can generate a credible voice based on a simple text command). Music can be generated too, using OpenAI Jukebox. Some people have managed to produce credible songs that have gone viral, such as this collaboration featuring Drake and The Weeknd (which never happened, but this didn’t stop people from enjoying the music). Clearly, this raises questions about copyright.

  • video: while still in its infancy, video generation is improving all the time. If you want to find out more, take a look at Runway, Synthesia, D-ID, etc. It’s not beyond the realms of possibility that in the future, you’ll be able to create photorealistic videos featuring your favorite artists on demand (setting aside the key issue of copyright). It would be like Netflix, but all series and films would be created for you on demand based on how you’re feeling at the time.

  • slides: how much time do people spend creating slides in their professional lives? Formatting them can be time-consuming and laborious. AI can generate slides based on a simple command, including carrying out research to work out what to say (going way beyond just formatting). These AI engines could soon be integrated into PowerPoint and Google Slides, but in the meantime, online services such as Gamma, Tome, and beautiful.ai will give you an idea of how much time you could save using AI. Microsoft itself is starting to integrate a slide generator directly into PowerPoint!

  • websites: although the results are still rather crude, it is possible to generate websites using AI (mostly just the home page). You could try Mixo, for example, to get an idea of what’s possible.

Is your head spinning? Mine certainly is!

Take a deep breath, go outside and come back once you’ve had a break. Then why not try out one of these AI options? You’ll feel like you’re living in the future. Except it’s actually already here! 🚀

Let’s Recap

  • ChatGPT is just the tip of the iceberg in the AI revolution, with many other generative AI programs being developed.

  • There are other generative AI systems that produce text, such as Bing AI, Google Bard, and Meta LLaMa, each of which has pros and cons. LLaMa in particular has spawned numerous open-source AIs, such as Alpaca and Vicuna.

  • Image-generating AI can create professional-quality imaginary or realistic images, with Midjourney, DALL-E, and Stability.ai being notable examples.

  • AI can also generate audio, video, slides, and websites. While this raises serious questions around copyright and content creation, it also opens up some astonishing prospects in terms of productivity.

Congratulations! You have completed this part. I invite you to test your knowledge in the following quiz, then join me in the next part, where we will see how to use ChatGPT in your profession.
