What’s an LLM and why should we care?

ChatGPT is an LLM, and it sure has been in the news a lot recently! Here are a few recent headlines that got my attention, and that I’ll be using to develop a seminar on this topic for the fall semester:

For starters, let’s define the acronym LLM (source):

Large Language Models (LLMs) are designed to process and understand ‘natural language’. These models are trained on very large amounts of text data, allowing them to accurately analyze and generate human-like text. LLMs such as PaLM, ChatGPT, LaMDA, and GPT-3 have been shown to achieve state-of-the-art performance on a variety of natural language processing (NLP) tasks. They are typically trained using unsupervised learning, which means that they are not explicitly provided with the correct output for a given input, but instead must learn to generate reasonable outputs based on the input data.

OK, so how does an LLM learn? (source):

The core of an A.I. program like ChatGPT is an LLM, an algorithm that mimics the form of written language. Training happens by going through mountains of internet text, repeatedly guessing the next few letters and then grading itself against the real thing. Simple!
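To make the "guess the next letters, then grade yourself" idea concrete, here is a toy sketch of my own. It is not how ChatGPT is actually built: real LLMs use neural networks trained on tokens, while this example simply counts which character follows which (a "bigram" model). The corpus and the function names are made up for illustration. The grading step is the average negative log-probability the model gave to the real next character, so lower is better.

```python
import math
from collections import Counter, defaultdict

def train_bigram(text):
    """Count how often each character follows each other character."""
    counts = defaultdict(Counter)
    for current, following in zip(text, text[1:]):
        counts[current][following] += 1
    return counts

def next_char_probs(counts, current, vocab):
    """Turn counts into probabilities (add-one smoothing keeps every guess above zero)."""
    total = sum(counts[current].values()) + len(vocab)
    return {ch: (counts[current][ch] + 1) / total for ch in vocab}

def average_loss(counts, text, vocab):
    """Grade the guesses: mean negative log-probability of the real next character."""
    losses = []
    for current, actual in zip(text, text[1:]):
        probs = next_char_probs(counts, current, vocab)
        losses.append(-math.log(probs[actual]))
    return sum(losses) / len(losses)

# Hypothetical stand-in for "mountains of internet text".
corpus = "the cat sat on the mat. the cat ate the rat."
vocab = sorted(set(corpus))

untrained = defaultdict(Counter)   # has seen nothing, so every guess is equally likely
trained = train_bigram(corpus)

loss_before = average_loss(untrained, corpus, vocab)
loss_after = average_loss(trained, corpus, vocab)

# The most likely character to follow "t" according to the trained counts.
probs_after_t = next_char_probs(trained, "t", vocab)
best_guess = max(probs_after_t, key=probs_after_t.get)

print(f"loss before training: {loss_before:.3f}")
print(f"loss after training:  {loss_after:.3f}")
print(f"most likely character after 't': {best_guess!r}")
```

The trained model scores better than the untrained one because it has seen which letters tend to follow which. An LLM does the same kind of guessing and grading, only with vastly more text and a far more capable model.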

And perhaps you’re wondering what a ‘GPT’ is (source):

GPT is an AI term that stands for generative pre-trained transformer:

Generative because it generates words.

Pre-trained because it’s trained on a bunch of text. This step is called pre-training because many language models (like the one behind ChatGPT) go through important additional stages of training known as fine-tuning to make them less toxic and easier to interact with.

Transformers, introduced in a 2017 paper by Google researchers, are used in many of the latest A.I. advancements, from text generation to image creation. Transformers improved upon the previous generation of neural networks — known as recurrent neural networks — by including steps that process the words of a sentence in parallel, rather than one at a time, making them much faster.
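The "in parallel, rather than one at a time" point can be sketched in a few lines. This is a simplified illustration under my own assumptions, not the architecture from the 2017 paper: the word vectors are invented, the recurrent update rule is arbitrary, and the attention step leaves out the learned query, key, and value weights that a real transformer uses. What it shows is the dependency structure. The recurrent pass needs the previous step's result before it can move on, while each attention output depends only on the inputs, so the positions can be computed in any order (or all at once on parallel hardware).

```python
import math

def softmax(scores):
    """Convert raw scores into weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def recurrent_pass(vectors):
    """Sequential: each step reuses the hidden state produced by the step before it."""
    hidden = [0.0] * len(vectors[0])
    states = []
    for vec in vectors:
        hidden = [math.tanh(0.5 * h + x) for h, x in zip(hidden, vec)]
        states.append(hidden)
    return states

def attend(position, vectors):
    """Attention output for one word: a weighted average of all the word vectors."""
    query = vectors[position]
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, key) / scale for key in vectors])
    return [sum(w * vec[i] for w, vec in zip(weights, vectors))
            for i in range(len(query))]

def attention_pass(vectors):
    """No position waits on another, so this loop could run in any order."""
    return [attend(i, vectors) for i in range(len(vectors))]

# Hypothetical 2-number "embeddings" for the words: the, cat, sat.
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

sequential = recurrent_pass(vectors)
in_order = attention_pass(vectors)
# Compute the same attention outputs back to front to show order does not matter.
out_of_order = {i: attend(i, vectors) for i in reversed(range(len(vectors)))}

print(all(out_of_order[i] == in_order[i] for i in range(len(vectors))))
```

That independence between positions is what lets transformers spread the work across many processors at once, which is where the speedup over recurrent networks comes from.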

What are some recent examples of popular LLMs? (source):

“The current LLM landscape is quickly and constantly evolving, with multiple players all racing past each other to release a bigger, better, faster version of their model. Investors are pouring billions of dollars into NLP companies, with OpenAI alone having raised $11B.”

GPT-4, announced on March 14, 2023, is OpenAI’s latest model. GPT-4 is not strictly a language-only model, as it can take images as input as well as text. Oh, and wouldn’t you know it, Microsoft’s new AI-powered Bing chatbot has been (secretly) relying on the GPT-4 model all along.

ChatGPT (which seems to get most of the press) is a text-only model and was released by OpenAI in November 2022. It can perform a lot of the text-based functions that GPT-4 can, though GPT-4 usually performs better.

LaMDA (Language Model for Dialogue Applications), announced in May 2021, is a model designed to have more natural and engaging conversations with users. LaMDA is trained on dialogue, and the model is able to discern various subtleties that set open-ended discussions apart from other types of language. LaMDA is built on an earlier Google chatbot called Meena. The conversational service powered by LaMDA is called Bard.

Well, that covers the big three duking it out in the LLM wars, but there are many others wanting to play in this space as well: Meta, AI21 Labs, Tencent, Yandex, DeepMind, Naver, Amazon, Baidu, Anthropic, Alibaba, Huawei, and more.

You have to admit, this is fascinating technology, but there are all sorts of unknowns and concerns: threats to our democracy, kids cheating on their school work, ‘facts’ manipulated by bad actors, privacy issues, and so on. From my perspective, and from what I have experienced so far using ChatGPT, Bard and Bing (GPT-4), I find these LLMs to be very useful when doing research and preparing material for a new course (or blog post). As with traditional search, you need to know how to phrase your request to an LLM, and you need to fact-check the results and verify your sources.

It appears LLMs are here to stay, and they will get much better in the accuracy department. Now it’s our job to understand them better and learn how to live with them. There will certainly be many examples of misuse and abuse with LLMs, but when has a new technology entered our lives without raising this concern?

The class I’m preparing for the fall will discuss the social and ethical implications of LLMs, so I have a lot more to learn, and I’ll be sure to share that information here…

Update: If you’re still curious about this topic, here’s a great video you might want to watch:

“Playing With Dynamite:” How AI Chatbots Will Change Life as We Know It
