In case you're interested, Bob Carpenter has a short technical explanation of how modern Large Language Models (like GPT-4) work:
In a nutshell, language modeling is the simple task of predicting the next subword (called a “token”) based on the previous sequence of subwords. The state of the art had stalled for years on n-gram models that use the previous n subwords (usually with n < 5). In 2017, a team of Google researchers released a paper titled "Attention is all you need," which introduced the current state-of-the-art neural network architecture for language modeling. The breakthrough was in extending the context length into the thousands (GPT-3.5 uses 4K; GPT-4 has 8K and 32K models) with an attention model that figured out which parts of the context to concentrate on. The fundamental bottleneck is that computation is quadratic in context length (though it's all on GPU, so that's a massive number of flops for relatively low power).
There's more at the link. It's an interesting short read.
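For what it's worth, here's a minimal sketch of where that quadratic cost comes from, written in plain NumPy (this is an illustration, not anything from an actual GPT implementation, and the sizes are made up): the n × n score matrix is the quadratic part.

```python
# A minimal sketch (plain NumPy, not a real GPT implementation) of scaled
# dot-product attention, just to show where the quadratic cost in context
# length comes from: the scores matrix has one entry per pair of tokens.
import numpy as np

def attention(Q, K, V):
    """Q, K, V: (n_tokens, d) arrays for a single attention head."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                   # (n, n) -- the quadratic part
    scores -= scores.max(axis=-1, keepdims=True)    # for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over each row
    return weights @ V                              # (n, d) weighted mix of values

for n in (1_000, 2_000, 4_000):
    Q = K = V = np.random.randn(n, 64)
    attention(Q, K, V)
    print(f"context {n:>5}: score matrix has {n * n:,} entries")
# Doubling the context quadruples the score-matrix work: 1M -> 4M -> 16M entries.
```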
Quadratic scaling is likely to be the binding constraint. And "relatively low power" is indeed relative: upwards of 500-700 watts, from the look of things, for a high-end GPU.
Quadratic meaning 2 times the context requires four times the work, 4 times requires 16 times the work, etc.
Meaning it scales poorly. Then again Moore’s Law is exponential so maybe who cares right now?
I always thought hardware was the bottleneck, not electricity, but maybe that's true for Joe Shmo Rando and not for Google; they might be bothered more by operating costs.
While a GPU pulling 700 watts may seem minimal at an individual level, there’s still the matter of dissipating all that heat.
And indeed, at scale, it all matters. A megawatt here, a megawatt there, pretty soon you’re talking real power. As it were…
There’s a world where hardware costs are the limiting factor, and a world where electricity costs are the limiting factor. Unless Google or OpenAI tells us, we’re just guessing which it is
Sort of different from the Bitcoin scenario, where there’s no hefty model and it’s just simple cryptographic computations. In that case electricity cost matters above all else. Not sure about these LLMs though
Had thought it was a hardware limitation (since you need to load a gigantic model into memory, and memory is expensive for GPUs) but then again I don’t really know where I got that from, or if it’s true
No, it means that we're still missing an important part.
The main thing LLMs give us right now is what appear to be good embeddings (i.e. the way any single word is connected to other words, similar to how if I say the word "fish" you have connections to water, scales, the sea, Nemo, shark, etc).
But people think (apparently...) with just 7+/-2 elements of short term storage. They do this by chunking, so that not just words but phrases, then sentences, then paragraphs and even chapters all likewise have embeddings.
To my eyes (whatever that's worth!) the next important research step in LLMs is to move embedding technology from just words to recursively covering phrases, sentences, paragraphs, ... This should allow both better LLMs (not forgetting the plot and veering off into random space after a few thousand words) and much cheaper (log rather than quadratic) scaling of context/attention.
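To make the chunking idea concrete, here's a toy sketch. Everything in it is a placeholder (random word vectors, mean-pooling as the composition); a real system would learn both the embeddings and how to compose them, but the recursive structure is the point.

```python
# Toy illustration of "chunking": build embeddings for bigger and bigger units
# by pooling the embeddings of their parts. The vectors and the mean-pooling
# are placeholders -- a real system would learn both -- but the recursion is
# the point: words -> phrases -> sentences -> paragraphs.
import numpy as np

rng = np.random.default_rng(0)
words = "the fish swam past the shark in cold water".split()
word_vecs = {w: rng.standard_normal(8) for w in words}

def pool(children):
    # Naive composition: mean of the children. A learned model could use
    # attention or an RNN here instead of a plain average.
    return np.stack(children).mean(axis=0)

def embed_phrase(phrase_words):
    return pool([word_vecs[w] for w in phrase_words])

sentence_embs = [embed_phrase("the fish swam past the shark".split()),
                 embed_phrase("in cold water".split())]
paragraph_emb = pool(sentence_embs)   # one vector standing in for the whole "chunk"

print(paragraph_emb.shape)            # (8,) -- a paragraph compressed to one embedding
```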
Speaking of LLMs, I & who knows how many other people received a message from "Google Assistant Chatter" on April 27th encouraging me to "collaborate with Bard."
Unlike KD, I am more inclined to pay attention to the BBC's lead item today: AI "godfather" Geoffrey Hinton quits Google.
I'm not opposed to innovation & progress, but to potential misuse & the lack of adequate control in a world increasingly ruled by propagandists, supported by gullible people & those who seem incapable of discerning reality from the output of every tool already in the rabble-rousers' hands.
Hinton: "There's an enormous upside from this technology, but it's essential that the world invests heavily and urgently in AI safety and control."
I'm sure he's absolutely right about the need for AI safety and control, but I doubt anyone will listen (in time) when he says, "We need to take a step back."
All of the incentives are pushing the AI tech community to move faster, not slow down. Slowing down requires cooperation within the industry, along with government leadership and cooperation across nations. That's just not going to happen. It's an arms race.
It's not that humans can't manage new technology more responsibly. Human cloning offers some fascinating and scary possibilities, but there is a field of bioethics that tries to provide some direction and guardrails, and for what it's worth, the field of medicine prides itself on the idea, "first, do no harm." There are at least some people there thinking through the potential downsides of where capabilities are moving.
The guiding ethos of Silicon Valley, otoh, is "move fast and break things." I expect AI tech boys will do whatever they want, then make sincere apologies every time they're called to appear before the grandstanding septuagenarians at a Senate committee.
ETA:
Here's an article warning about the lack of transparency in the prevailing "black box" model of AI.
The Computer Scientist Peering Inside AI’s Black Boxes
Cynthia Rudin wants machine learning models, responsible for increasingly important decisions, to show their work.
https://www.quantamagazine.org/cynthia-rudin-builds-ai-that-humans-can-understand-20230427/
Tough because whatever method you use to make a model “show its work” is probably going to be intrusive, and make the model dumber
If we can only make really stupid models show us how they came up with something, it's of limited value. I've heard of some methods to peek under the hood of really big and high-quality AIs without affecting their accuracy, but they're not that good yet
Though I think there’s an opportunity cost in slowing down too. If AI tools could help us solve other big problems (climate change for example) then taking a break isn’t without cost. Not as simple as “slow down you greedy bastards!” I think
The issue is trust. How can you trust a machine to provide an accurate diagnosis and solution? You show the work. If that means there's an opportunity cost, that's not a cost that can be ignored. In the "solving climate change" scenario, let's say the black box declares the best solution is beginning aerosol injection in the atmosphere starting next week. How do we trust that's the right answer and not a mistake?
Or if the doctors say, "Well, Mr. xi-willikers, the black box says we need to amputate." "Why?" you ask. "Because the black box says so" is the response. You know that's not good enough.
AI does not create a world in which there won't be mistakes.
On top of that, there's the whole problem of "hallucination," which from what I've read the best AI scientists still can't explain.
FFS, this problem is (in essence) ALREADY solved.
Look at how ChatGPT hooked up to a deterministic engine (Wolfram) operates:
https://writings.stephenwolfram.com/2023/03/chatgpt-gets-its-wolfram-superpowers/
The same sort of plugins will come for other specialized domains (probably medicine and law soon, since in both cases expert systems already exist). The main function of the ChatGPT front-end will be to convert what the human says into what the expert system requires (ie ChatGPT will be essentially a new input device).
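A rough sketch of that "new input device" idea is below. Every name in it is hypothetical (this is not any real plugin API), but it shows the shape of the hand-off: the LLM translates, a deterministic engine answers.

```python
# Sketch of the "LLM as input device" idea: the language model only translates
# the user's request into a structured query, and a deterministic expert system
# does the real work. All names here are hypothetical, not a real plugin API.
from dataclasses import dataclass

@dataclass
class StructuredQuery:
    domain: str        # e.g. "math", "drug_interactions", "case_law"
    expression: str    # what the back-end engine actually evaluates

def llm_front_end(user_text: str) -> StructuredQuery:
    # Stand-in for the LLM: in reality a model like ChatGPT would produce
    # this structured form from free text.
    if "integral" in user_text.lower():
        return StructuredQuery(domain="math", expression="Integrate[x^2, x]")
    return StructuredQuery(domain="unknown", expression=user_text)

def expert_backend(query: StructuredQuery) -> str:
    # Stand-in for a deterministic engine (Wolfram, a drug-interaction
    # database, a legal rules engine, ...). Its answers are checkable, which
    # is the point: the LLM isn't the source of truth, just the translator.
    if query.domain == "math":
        return "x^3/3 + C"
    return "no expert system available for this request"

print(expert_backend(llm_front_end("What is the integral of x squared?")))
```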
But people have been unwilling to accept the results from something like MYCIN even when it shows better performance than human doctors. As always, the problem is less artificial intelligence than natural stupidity.
I don’t think there are any serious people considering handing the keys to the car to AI. Not saying “fix climate change, here’s your factory, do what you want”
More like "find molecules which have X property" or "design an experimental procedure which is most efficient to approach this problem"
Less of a resource like “Jim in accounting down the hall” and more like “Jim’s computer”
There are end-to-end ways to validate answers without knowing the guts of the reasoning. In fact, most science starts that way: we get a heuristic backed by experiment, and find the explanation later. The fact that we can't explain an experimental result doesn't make it less valuable. We still don't understand the guts of quantum physics, but we use it every day. So I think AI still has a lot of valid uses, even if it's spitting out heuristics and predictions we can't explain
So, how’s your weather out there..?
I can, in at least a vague conceptual way, understand how these models can respond to natural language questions with high-quality natural language responses, but what's kinda blowing my mind is how they can generate functional code from some very basic prompts. I just used ChatGPT to write 100-odd lines of code in Python with a basic front-end GUI, and with a couple of feedback iterations it worked, and was done way faster than I could have figured it out.
Here's a hint: According to the Zen of Python, there should be one best way to write a code snippet. I wonder how effective ChatGPT would be in oh, say, JavaScript 😉
Charming, the games humans play. Our new toy may lead to "game over." This is explained in an article titled "Never give AI the nuclear codes":
https://www.theatlantic.com/magazine/archive/2023/06/ai-warfare-nuclear-weapons-strike/673780/
"The same seductive logic that accelerated the nuclear arms race could, over a period of years, propel AI up the chain of command."
The danger is that if one side in a conflict fears that their command and control structure is in danger, they may automate decisions with AI.
Furthermore, I think that since conflict always involves two sides, we cannot unilaterally eliminate this danger by our own safeguards, since the other side might not subscribe to the same safeguards.
Don't worry. Matthew Broderick will teach the AI the error of its ways.
David Mitchell’s Ghostwritten comes to mind. The defense AI runs amok.
I get the "previous n subwords" part and how expanding it can improve output. But how far back are subwords relevant? 32K back of what? Wikipedia articles? Message board threads? So many language "packages" are topic-centric for paragraphs, but not much more. Going further back what are you tapping into? Random articles/posts/discussion about the weather, current events, yesterday's geometry test. Are they included? If not, what is the criteria for exclusion?
Cal Newport had a nice explanation. It was far less technical, but still correct per the AI practitioners he consulted. I found it helpful as a mental model for reasoning about what kinds of things LLMs can and cannot do (most likely, of course).
He discussed it on his podcast, but it may have also been in his most recent New Yorker article.
AI: what is it good for?
(Absolutely nothing, uhh.)
Well, it must be good for something. In my working days, I'd have been happy to have a chatbot help write proposals for me. That was some of the most soul-sucking work I've ever done. I could cut-and-paste with the best of them, but it still cost me untold lost nights and lost weekends rushing to finish another massive document under deadline. AI? Bring it on. Once AI chatbots become everyday services in the office and other areas of business, there will be no going back. It's a no-brainer.
But I'll be interested to see what happens with the culture at large. In particular, with music, movies, TV, etc. For a glimpse of what's to come, here's Kanye West's cover of "99 Luftballons," a song he has never recorded (and in a language he probably doesn't know):
https://www.youtube.com/watch?v=ZtrousbQmMk
We're at the novelty stage now. People like novelty, so there will be interest. Elvis will duet with Taylor Swift. Sinatra will rap with Jay-Z. Why not. The next Marvel movie will star Humphrey Bogart and Cary Grant. If Disney wants to, they'll be able to do it.
Hollywood is hardly the cradle of creativity as it likes to brand itself. It's the land of remakes and remixes. It's always been interested in new technology, but a few decades ago it switched from tech supporting stories and songs about humans to humans playing a supporting role to technology. I expect AI will only accelerate the trend.
It's a dismal prospect. But then I remind myself that I abhor most of what gets released as new music and new movies. I am an institutionalist when it comes to politics but a revolutionary when it comes to culture. Culture has become a force for control with the oligarchs in charge. We need artists to have freedom again. Maybe the walls of Hollywood will come tumbling down and in the rubble artists, with or without AI, will have the chance to connect with human beings once again. Hey, a guy can dream, can't he?
Roger that on abhorring most of what gets released as new music and new movies. For that matter, I find almost all the popular music from the late 50's to the present barely listenable. But that's just because 90% of everything is crap, innit? Per Sturgeon's Law.