AI – GPT, Midjourney, and DALL.E: The Next Frontier

 

The next technological evolution is upon us. The long-dormant chrysalis of AI is just starting to crack open and reveal its algorithmically generated contents to the world; technology which, at least according to one of its founders, has the power to “break capitalism”.

Truly disruptive technology comes around very rarely. Think in terms of the steam-powered engine, the computer, or the internet. Innovations such as these not only produce efficiency gains which go on to make the technologies essential, but also irrevocably permeate the social fabric of the societies they enter.

Although this comparison may seem grandiose at first, it is worth remembering that this technology is still very much in its infancy, and yet its effects are already being felt in the real world, for better or for worse.

 

What is OpenAI?

OpenAI is a research and technology company based in San Francisco. Founded in 2015 by Sam Altman, Greg Brockman, Ilya Sutskever, and Wojciech Zaremba, with financing from Elon Musk, the company’s goal is to create the world’s first artificial general intelligence (AGI) and ensure that its inception “benefits all of humanity”.

In simple terms, OpenAI develops and studies advanced computer systems that can perform tasks that typically require human intelligence, such as understanding language, recognising images, and making decisions.

The Company’s two most famous projects – DALL.E 2 and ChatGPT – both went viral nearly as soon as they were launched. DALL.E, a text-to-image generator, burst onto the scene last year, stunning the internet with the fidelity of its output from simple text prompts (examples given later). More recently, ChatGPT – a conversational chatbot – has been making waves worldwide with the authenticity of its writing. In some cases, its output is already indistinguishable from human writing.

And, very recently, OpenAI has launched GPT-4 – the newest iteration of its generative language model. Although currently restricted to premium licence holders, the update increases the amount of text the model can handle from roughly 3,000 words to 25,000, produces much more complex answers, and can now analyse images and provide output based on them.

With these two industry-leading technologies, OpenAI has secured a swathe of lucrative investments, most notably from Microsoft (to the tune of nearly $14bn according to some sources), which is now poised to build both ChatGPT and DALL.E 2 technology into its Bing search engine. Could this spell the end for Google and its problematic Bard alternative? Or will this quest for AGI turn out to be a flop not worth pursuing?

 

How it all works

Here’s how ChatGPT works at a very high level:

  • Pre-training: the model is pre-trained on a large body of text data, such as books, articles, and websites, to learn the patterns and relationships in language.
  • Fine-tuning: the pre-trained model is then fine-tuned for specific tasks, such as answering questions or generating text. In the case of ChatGPT, the model is fine-tuned for conversational language, so it can respond to questions and have conversations with users.
  • Input: when a user inputs a question or statement, the model processes the text and generates a response.
  • Output: the model generates a response based on the patterns and relationships it learned during pre-training and fine-tuning. It does so autoregressively: the text is created one token (roughly a word or word fragment) at a time, with everything written so far serving as context for the next step.
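
The steps above can be sketched with a deliberately tiny stand-in for a language model. The Python example below is not how ChatGPT is implemented: it swaps the neural network for a simple table of word-pair counts (a bigram model), and the names `pretrain` and `generate`, as well as the sample corpus, are invented purely for illustration. It only shows the shape of the idea: learn patterns from a body of text, then produce a response one word at a time, using what came before as context.

```python
import random
from collections import Counter, defaultdict
from typing import Dict


def pretrain(corpus: str) -> Dict[str, Counter]:
    """Toy 'pre-training': count which word follows which in the text."""
    words = corpus.lower().split()
    follows: Dict[str, Counter] = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def generate(model: Dict[str, Counter], prompt: str,
             max_words: int = 10, seed: int = 0) -> str:
    """Extend the prompt one word at a time.

    Assumption: only the single previous word is used as context.
    A real model conditions on far more of the preceding text.
    """
    rng = random.Random(seed)
    output = prompt.lower().split()
    if not output:
        return ""
    for _ in range(max_words):
        candidates = model.get(output[-1])
        if not candidates:  # no known continuation, so stop early
            break
        words, counts = zip(*candidates.items())
        # Sample the next word in proportion to how often it was seen.
        output.append(rng.choices(words, weights=counts, k=1)[0])
    return " ".join(output)


if __name__ == "__main__":
    # Hypothetical miniature corpus standing in for books and websites.
    corpus = "the cat sat on the mat . the dog sat on the rug ."
    model = pretrain(corpus)
    print(generate(model, "the cat"))
```

Because the next word is sampled rather than looked up, the same prompt can produce different continuations with a different `seed`, which loosely mirrors why a chatbot rarely gives exactly the same answer twice.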

ChatGPT’s pre-training relies on a process called unsupervised learning, which means that the model learns patterns in the data without being explicitly told what the patterns mean. Because of this, the Company also employs human reviewers who essentially act as teachers to the software, guiding it towards more human-like, coherent output.

 

An omniscient AI?

OpenAI’s stated mission is to ensure that this new technology – this distant AGI – “benefits all of humanity”. This sounds nebulous, but the Company is taking actionable steps to reduce the amount of harm its platform can cause. Early on, filters were placed on both its DALL.E and ChatGPT models to cut bigoted rhetoric and insensitive imagery. For instance, you can’t ask for ‘Santa stabbing a child’ on DALL.E 2, and I think most people would agree that that is a good thing.

A somewhat more pernicious risk is that these AI systems might start to show certain biases, whether political, social, or racial in nature. The world is already – and rightly so – concerned about the influence of biased media and social media. Human beings are innately biased creatures – we have tribalistic tendencies passed down to us from our distant ancestors, ill-adapted to the multicultural world we now all inhabit. If a true AGI emerges, and it is treated as omniscient, then will its biases be treated as sacrosanct?

Taking steps to address this, OpenAI is transparent about the guidelines it feeds its system. The Company understands that both the source material the programme is trained on and the human feedback it receives during this process are biased. To counteract this, OpenAI says it uses a range of sources, as well as a range of reviewers, to ensure the least biased output possible.

Ultimately, how much trust we put in the system is up to us. While it might present as omniscient and benevolent, it is important to remember that this is still a product manufactured by humans with their own innate biases. That is, until the rise of the AGI.

 

The quest for AGI

Artificial General Intelligence (AGI) and Artificial Intelligence (AI) are two related but distinct concepts.

AI refers to the development of computer systems that can perform tasks that would normally require human intelligence, such as speech recognition, image recognition, and decision making. AI systems can be trained to perform specific tasks, but they lack the general intelligence and ability to learn and adapt to new situations that humans possess.

AGI, on the other hand, refers to the development of AI systems that possess a human-like general intelligence, can perform a wide range of tasks, and adapt to new situations. AGI systems are considered to be truly intelligent, capable of understanding the world and solving problems in a similar way to humans.

In short, AI refers to the development of narrow, task-specific intelligence, while AGI refers to the development of human-like general intelligence. The goal of many researchers in the field of AI is to create AGI systems that can truly understand and interact with the world in a human-like manner.

 

AI and art

AI-generated imagery has taken the world by storm and, for many, it was their first foray into generative AI models. DALL.E mini went viral in mid-2022, much to the delight of the internet. Although its output was blurry and unrefined, the ability to create whatever zany image your mind can muster with a simple prompt was game-changing.

Since the tentative first steps taken by DALL.E mini, many competitors have arisen. OpenAI’s DALL.E 2 leads the pack, but many other solutions have found their niche in the fields of art and design. Below are a few examples generated in Midjourney, my personal favourite.

Prompt: Boris Johnson jumping into volcano

 

Prompt: Ancient Rome

 

Prompt: Photo of medium-rare steak and chips at British pub

 

So, if artists – members of a profession so subjective and creative that many thought it untouchable by AI and automation – are having their careers threatened, what does that mean for those jobs that have traditionally been associated with a risk of automation?

 

How will it impact the workplace?

The opportunities and threats AI presents come in almost equal measure; for each job AI makes easier, another is threatened. Since the industrial revolution, the threat of automation has loomed over the livelihoods of many workers, especially those carrying out repetitive or highly regimented tasks.

While the threat of automation used to be felt most keenly by those working blue-collar jobs, such as those in factories or admin, AI now seems to have its sights set on the white-collar world, with solutions now cropping up that are able to handle tasks such as content writing, accounting, and even coding.

For example, Google has stated – in theory, at least – that ChatGPT could fill the role of an entry-level coder, while those working at Amazon have found the system to be excellent at writing corporate strategy and replying to customer queries.

In truth, AI’s real impact on the workplace remains to be seen. Much like the rise of the internet, it can be hard to conceptualise the impact of a technology we don’t fully understand yet. As Christina Melas-Kyriazi, partner at Bain Capital Ventures, put it: “If ChatGPT is the iPhone, we’re seeing a lot of calculator apps. We’re looking for Uber”.

Overall, the impact of AI on jobs will depend on a variety of factors, including the specific technologies being used, the industries and job roles in question, and how quickly AI is adopted. While some jobs may be at risk of automation, others may be enhanced by AI or lead to new opportunities. It is important for individuals and organisations to understand these changes and adapt accordingly.

 

In conclusion

The promise of AI is self-evident. If you’ve ever used one of these systems then you know the kind of magic they elicit. Conjuring up recipes, landscape paintings, and business plans with a simple prompt is a gimmick that’s hard to see tarnishing any time soon.

And the technology continues to evolve. Midjourney – the software used to generate the images you see in this blog – just launched its newest version, and the results already look leagues above what could be generated just last week. On top of this, OpenAI launched its latest GPT model, GPT-4, which we will cover in more depth in future blogs.

We’re already at the stage where the technology is nearly indistinguishable from (or, in some cases, superior to) human output. Don’t believe us? We’ve peppered paragraphs entirely written by AI throughout this blog – if you didn’t spot them, then perhaps we’ve already crossed the AI Rubicon.

 

 

By Rebecca Garland on 17/03/2023