
AI video isn’t perfect yet. So what?

OpenAI recently released an AI video creator, and its output, though amazing, isn't perfect. Atrios comments:

Not an expert on video/image editing, but I suspect editing out the 6th finger is a lot more work than it's worth. And, sure, these things will always get better, but some things need to be better than pretty good to be useful.

I'm not picking on Atrios here. Lots of people feel exactly this way. But I just don't get it. OpenAI and others have been working on LLMs for a while, but ChatGPT was the first really good one to be released publicly, and that was in November 2022. That's only 15 months ago!

I sort of get the skepticism over driverless cars. It really does seem as if they've been "almost there" for the better part of a decade. But LLMs? Their capabilities have skyrocketed beyond belief in just a year or two. Their rate of improvement could be cut in half and they'd still be on track to take over the world by the end of the decade.

And even that presumes that LLMs are the ultimate in AI tech. But they surely aren't. We've had neural nets, which eventually topped out. Then we had deep learning models based on neural nets, but they topped out too. Now we have LLMs based on deep learning, and they'll probably top out too someday. But by then compute power will have increased another thousand-fold, making possible an even better model based on LLMs.

Even if you count from mid-2017, when Google's seminal paper on attention-based transformer models was first published, it's taken less than seven years to get where we are today. That's beyond miraculous. This train is barreling down the track at breakneck speed and there's nothing likely to stop it.

65 thoughts on “AI video isn’t perfect yet. So what?”

  1. Crissa

    It does seem like Atrios doesn't care about incremental steps towards anything.

    Also, he's talking about a problem that has already been solved multiple ways, and isn't even in this version.

    1. lower-case

      his hatred of meelon musk colors his view of ai

      i view it as an atrios eccentricity more than a moral or logical failure

      1. name99

        True, things like that don't help.

        Also, people consistently underestimate the effects of small incremental improvements. Tune out the "self-driving" hype and consider a Tesla or, say, its recent Mercedes equivalent.
        - distance-based cruise control is REALLY nice, especially in stop-go traffic
        - I wouldn't rely on lane-following as in "remove my hands from the wheel", but I like having it do the work while my hands just rest on the steering wheel
        - I REALLY like having the car detect vehicles on the left or right when I indicate, so that I am warned about my blind spot and the possibility of turning into a vehicle
        - simply having world intelligence (a nice map, easy ability to add stops of a certain type, knowledge of where I can find a gas station or how busy an upcoming charging station is) makes driving a lot easier

        Now you can argue that most of these are not "self-driving" and I won't fight you. They are, however, MORE PLEASANT (and safer) driving, driven by sensors, electronics, and communications.
        And it's these little changes, a small amount each year, that change things, not huge leaps.

        It's easy to say that this year's iPhone is "no different" from the 2007 model. And there's a certain type of idiot who says this routinely. And all I can say is that this is a person who clearly hasn't ever used both the 2007 iPhone and this year's model at the same time...
        Same with cars. You can say that the Tesla (or modern Mercedes) is not "really" different from my Mercedes from 2007 (or my Mercedes before that, from 1992) but 1992 vs 2007 vs 2022 are all very different from each other.

        And if we eventually get to "full self driving" in ten years by incremental improvements, not by one big leap, well, sure that has implications for marketers and hypesters, but it's irrelevant to anyone else.

        Same with AI/ML/whatever it will be called in three years.
        A few years ago it was all hype. Now I use search by "image contents" on my phone all the time.
        Complaints about translation are now basically about aspects of the UI; the functionality is good enough for constant use.
        Synthetic voices used to be harsh (Tesla's still is, IMHO) but Apple has smooth voices down pat.
        OCR in images works well enough that, again, it can be threaded through the OS, and one's main complaints are now not about the functionality but about "god damnit, why doesn't THIS app search within images" (usual answer, because someone refused to follow the OS guidelines...).

        Leading edge AI of three years ago is now stuff we complain about in terms of the UI, and small developers who implement it incorrectly/not at all. That's how it should be!
        And that's where LLMs and generation will be in three years. The end-game is not ChatGPT and Dall-E; it's this stuff so deep in my phone that I just expect it to work (in various appropriate ways) in every app...

    1. name99

      Of course not.
      But this is a vacuous statement since no one knows what "take over the world" means...

      Want to make an actual prediction, like "fewer than half of all Google queries will route through an LLM" by the end of the decade?

  2. iamr4man

    In other news regarding driverless cars:

    A crowd swarmed a Waymo self-driving car in San Francisco on Saturday night, pummeling the vehicle with cans, a skateboard, and eventually setting it on fire.
    Footage of the incident was captured and posted on social media by Michael Vandi, Chief Shipping Officer of cloud AI firm Addy AI. It’s a wild scene, emblematic of the ever-growing hostility between San Francisco residents and the big tech companies that use the city’s public streets as a testing ground for self-driving cars.
    https://www.vice.com/en/article/3akeev/a-crowd-attacked-a-waymo-self-driving-car-and-set-it-on-fire

    I haven’t tried using a Waymo car yet, but my daughter has, and she said she liked the experience and would do it again. She lives in a part of San Francisco that I consider difficult to navigate and her vehicle had no problem with it.

    1. kahner

      i think a couple of the big problems with self-driving cars vs LLMs, regarding pace of progress, are that LLMs had a pre-built massive training data set consisting of basically the entire corpus of human writing. That just isn't available for any AI learning about the real world. You have to build individual cars with a bunch of sensors and have them drive around in real, slow meat time. No amount of processing power can speed up how fast the training data can be acquired. second, you have to be really really careful in how AI gets trained to drive cars, because every mistake has the possibility of killing people, whereas chatgpt making some nonsense up in response to a question mostly doesn't carry any risk. if we didn't care how many people died in the training process AND we didn't expect self-driving cars to be far safer than human drivers, i think we'd already have fully self-driving cars.

      1. KenSchulz

        This is a good example of the category mistake that KD constantly makes in regard to AI. There is a large set of problems that are not computation-limited, they are data-limited, and the data must be acquired in ‘meat time’, in your wonderful phrase. In one post that I recall, KD looked forward to AI instruction tailored to the learning style of each individual student. As though scientific research on instructional methods and learning processes can be bypassed if an entity is just smart enough.

        1. shioklah

          Why would AI instruction tailored to the learning style of individual students be 'bypassing' research on instructional methods and learning processes? That research pretty clearly shows that individuals learn differently. Wouldn't it be great if a teacher were able to quickly figure out the strengths and weaknesses of an individual learner and deliver tailor-made lessons and reinforcements? Doesn't it seem more likely that an AI will be able to do this for 25 students simultaneously than a human?

          1. KenSchulz

            No. We’ve been doing research in human learning for over a century, and my guess* is that we have several centuries more to go to get reasonably good at tailoring. You can look back at Kevin Drum’s occasional posts with charts of educational achievement to see the rate of progress. People are vastly more complex than you imagine; modeling complex systems takes a lot of data.

            *Based on a career of research with human subjects

            1. painedumonde

              Totally. The (ever-changing) set of parameters just relating cognition/learning and nutrition is staggering. Meanwhile, silicon boys just need a steady stream of unflavored electrons.

      2. shioklah

        I wonder, though, how much of the training requires real-world successes and failures? Is it not possible to bootstrap your way up to better performance through training on simulated situations? I assume these companies are actually doing that - maybe not sufficiently well, yet, but I don't see why that shouldn't improve faster than the availability of real-world data.

        1. kahner

          Yeah, I'm no expert, but I'd concur with Ken. Creating a simulation of sufficient fidelity to accurately simulate reality is as hard a task as training the AI driving model, or perhaps harder. How would you create such a simulation of the real world without the exact same real-world data collection you need to train the AI?

    2. D_Ohrk_E1

      That time when a tow truck got hit twice by Waymo vehicles within minutes.

      "[T]he robotaxi “incorrectly predicted the future motion of the towed vehicle” because of this mismatch between the orientation of the tow truck and the pickup, and made contact. The company told TechCrunch this caused minor damage to the front left bumper.

      The tow truck did not stop, though, according to Peña, and just a few minutes later another Waymo robotaxi made contact with the same pickup truck being towed." -- TechCrunch

      It has trouble in high traffic and situations it hasn't experienced before.

    3. SwamiRedux

      What does the "Chief Shipping Officer" of an AI firm do?

      If you look on their website, he's the CEO. Perhaps Vice used AI to transcribe an oral report, and "Executive" came out as "Shipping"?

      TBH, I prefer Chief Shipping Officer to CEO.

      1. hugolive

        In the tech world, "shipping code" means releasing a new version of your software. He calls himself chief shipping officer on his LinkedIn because he thinks it's cute. It's actually annoying as Hell and par for the course for a founder trying to build a personal brand but whatever.

  3. PaulDavisThe1st

    Sometimes incremental improvements get you all the way there, wherever there is.

    Sometimes incremental improvements mean that you're much, much better at climbing a tree, but absolutely no closer to getting to the moon.

    Let's be honest and admit that nobody knows which of these two applies to the journey of "AI" from LLMs towards AGI.

  4. D_Ohrk_E1

    In the field of animation over the next 5 years, in sequence, AI will:

    (a) add voiceover, taken from samples of voice actors and dead people.
    (b) automate the tweens of keyframes and apply physics of motion.
    (c) apply styles from well known iconic animators.
    (d) animate scenes from a given script and character sketches.
    (e) make it a turnkey operation from start to finish.

    1. SwamiRedux

      I agree.

      In the field of blogging, AI will write blog posts in the style of your favorite blogger.

      Then, AI will add comments in the styles of your favorite cranky commentators.

      We're all destined for obsolescence.

    2. Crissa

      My friend's video game uses AI voices for some of the tracks. You can't tell it's AI, and you'd only be able to guess which ones if you know what AI's current constraints are.

    3. iamr4man

      I don’t doubt that AI generated animation will replace much of our current Pixar style animation. But I seriously doubt that it will reach the quality of the full animation of masters like Bill Tytla, Ward Kimball, Norm Ferguson, Preston Blair, Rod Scribner, and many others any time soon. In any event, here is an article that supports your point of view at least with regard to modern animation. I think you will find it interesting:
      https://www.cartoonbrew.com/tools/openais-sora-creates-minute-long-photorealistic-animation-from-text-prompts-238060.html

  5. Ogemaniac

    Have they improved? I keep trying the latest versions, asking them to do simple work tasks like “write a two paragraph introduction to X” or “revise this email for clarity” and what I get is garbage that is redundant, technically inaccurate, and otherwise less than worthless.

    1. pjcamp1905

      But look at how long it takes people to produce inaccurate, redundant garbage. The speed with which AI can do it is a major advance!

    2. Crissa

      It can't be technically accurate, because you're asking a bot, which can't know things, to be accurate.

      All it's going to be good at is grammar and rephrasing.

      If you think it's less than worthless, then you probably think most employees are worthless. Asking someone or something to do something it's not good at will only turn out dreck.

      1. pipecock

        Most employees are worthless. Most everything is worthless. Computers have already been there as well.

        The only things with value are things that require a good human to take time to do.

        No computer will ever change that equation.

    3. Chondrite23

      Harry Shearer on Le Show recently interviewed Gary Marcus about LLMs. He had a couple of good points.

      He said the issue with the fingers is that the models don’t do well “globally”. If you look at the individual fingers they are probably well done. This is an issue they are working on.

      He said that they have trouble with concepts not connected to an image. So if you ask for an elephant camouflaged on a beach it gets the beach and elephant but has trouble with camouflage because it is an abstract concept not a physical thing. Good insight.

      Finally, he pointed out that it is really expensive and time consuming to train a model with a ton of data so they tend to be always backward looking.

      I haven’t tried one lately but I had the same experience as you. It sort of pasted together a few sentences from Wiki and filled in the gaps with bullshit.

  6. pjcamp1905

    "Their capabilities have skyrocketed beyond belief in just a year or two. "

    You said the same thing in the early days of driverless cars and, as I recall, predicted they would be everywhere by now.

    Every complex technology moves rapidly while the low-hanging fruit is being picked. When all that is left are hard problems, the pace of development slows dramatically. Surely you've noticed this.

    The hard problem with LLMs is obvious -- they are stochastic parrots. They produce what they produce not by understanding what they are saying but by remembering which words have gone with which other words in previous documents. As long as that is true, I'm skeptical that their tendency to lie and hallucinate can really be dealt with.

      1. Austin

        Driverless cars are not “everywhere” yet, if “everywhere” means “ubiquitous” and not just “there’s 1 Tesla in every county.” A good 99.999% of the cars I see around me in metro DC still require drivers. And the few teslas I see also still seem to have drivers (although I can’t prove they’re actually actively driving the car, I guess).

      2. pipecock

        They took driverless cars off the street in Pittsburgh sometime around the era where it took two drivers.

        Boring clown nonsense.

  7. Pittsburgh Mike

    The problem I see is that LLMs don't actually do anything useful yet. They're fun to play with, admittedly. And they can generate reams of text that summarize other people's thoughts, albeit while frequently making stuff up, so they're simply not trustworthy.

    But they really can't generate anything new. I could ask one to write a user manual for some application I'm working on, and it would have no chance of producing anything useful. Its image generation capabilities are mostly useful for generating horror movies.

    And as @PaulDavis said, we really have no idea if they're anywhere close to being useful. My vote is:

    No, they're not close to being useful because they don't have any model of the real world, just a model of how language is generated. And we have no idea how many different models would have to be added to LLMs before they'd have the abilities of a first grader.

    That doesn't mean that other machine learning tools won't be useful. But they're going to be useful in very narrow domains -- recognizing cancer cells in an X-ray, turning speech into text. AlphaGo is supposedly amazing at playing Go, which is pretty cool. AlphaFold is good at predicting how proteins fold into 3D structures. But these are closer to pattern recognizers than anything else.

    But I still think LLMs themselves are closer to a fun parlor trick than anything useful.

    1. Crissa

      They can generate code as well as any coder giving it a first pass. They can provide alternate phrasings and find grammar errors, and they can sound congenial and phrase things less confusingly for non-experts.

      It doesn't replace a person, but it can augment people at this point.

      1. Boronx

        "They can generate code, as well as any coder giving it a first pass."

        This is only true for some tiny domains. There's so much programming where LLMs aren't even in the game.

        1. geordie

          True, but the ability to write a quick function and then ask it to refactor it, adding error checking and documentation, frees me up to move on to the next thing (a rough sketch of that kind of pass is below). That means I usually end up with code that I can come back to a few weeks later and still understand.

          In my usage it seems to be chipping away at both the start of implementing a feature and the end. I write a comment or function name that causes it to generate scaffolding for what I want to do, I flesh that out myself, and then I ask CoPilot to tidy everything up. It also acts as a decent rubber duck. Several times I have caught a logic error by it creating incorrect documentation... that turns out to have been correct for what I actually wrote.

          When the context window is increased to where it can hold the 10K+ lines instead of <1K I expect things will get even more interesting.
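          A minimal sketch of that before/after pass, with a hypothetical function (illustrative only, not output from an actual CoPilot session):

              # Quick first pass, written by hand:
              def parse_price(s):
                  return float(s.replace("$", "").replace(",", ""))

              # After asking the assistant to refactor with error checking and documentation:
              def parse_price(text: str) -> float:
                  """Parse a price string like "$1,234.56" into a float.

                  Raises ValueError if the input is empty or not a valid number.
                  """
                  if not text or not text.strip():
                      raise ValueError("empty price string")
                  cleaned = text.strip().replace("$", "").replace(",", "")
                  try:
                      return float(cleaned)
                  except ValueError as exc:
                      raise ValueError(f"not a valid price: {text!r}") from exc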

          1. pipecock

            I always thought coding was the most banal, boring nonsense. The fact that a computer can do it better than you, apparently a professional at it, shows just how correct I was.

        2. Scott_F

          For starters, producing code as good as the average programmer ain't sayin' much. I know coders with 10 years experience who have a lot to learn about writing good, enduring code.

          As to productivity, programmers spend only a fraction of their time writing computer instructions. Much of the effort is squeezing the requirements out of the meat puppets that we call users and stakeholders. I can only imagine what would leap off AI's keyboard if I provided it with the requirements I am often given. I tell my team that programming is 50% people.

          I will be impressed when CoPilot asks a clarifying question or points out that we used different address parsing schemas in the ordering and billing modules.

          Otherwise it is a tool to increase productivity by a few points. Meh

  8. golack

    AI will get better, and we'll get better at finding uses for it as well as using it....
    But don't expect a Moore's law rate of improvement. The next major steps will require order(s) of magnitude more computing power and energy to power everything. And since the entire internet has already been scraped for training, new training sets will have to be developed.
    We got the first 80% done--but that next 20% will really be a pain.

  9. rick_jones

    This train is barreling down the track at breakneck speed and there's nothing likely to stop it.

    That isn’t exactly a ringing endorsement.

  10. jdubs

    LLMs are still in the all-hype phase of development. It's a bad sign that they haven't moved past this stage yet.

    Pulling the same stunt with video that they used on text isn't a sign that they are barreling anywhere useful or productive. But it does keep people on board the hype train for a bit longer.

  11. Fortheloveofdog

    AI was held as a bargaining chip to break the actors' strike, and it's depressing work for designers. It's also creating breakthroughs in cancer detection. There are good and bad sides to it, but we have to keep in mind that both are starting to be exploited. Mindlessly cheering technology is a bad habit. Nikola Tesla warned about it: the "enterprising stock jobbers" getting into a tizzy whenever something new happened in technology. It creates problems in journalism and media today too, which routinely get basic facts wrong.

  12. Anandakos

    It's interesting, isn't it, that Google had essentially the same insight back when it upended Search: Look at what's moving. The rest is furniture.

  13. Chris

    My fear with AI-generated "art" is that it will be deemed good enough to replace actual artists, and then the money that would have been spent on humans producing good film/TV/books/music will instead go toward this six-fingered uncanny valley stuff.

    1. pipecock

      Have you seen what is “good enough” art to most ppl anyway? The monoculture already killed the rewards system for actually good art, and it’s dead.

      Taste is vanishingly rare. Taste and skills together… basically non existent.

  14. Jim Carey

    "This train is barreling down the track at breakneck speed and there's nothing likely to stop it."

    Agreed, but don't forget about the railroad switch up ahead. The train is barreling down the "AI will serve my interest at the social system's expense" track by default, and the exception is if we get together and throw the "AI will serve the social system at its subjects' short-term expense" switch.

    If we don't, we'll have nobody to blame but ourselves.

  15. frankwilhoit

    The people who complain about vote fraud (stick with me here, this is not off-topic) have never voted. You can tell because they know nothing about the process. They think things are unnecessary that are actually required, that things are easy that are actually difficult, and that things are possible that are actually impossible.

    The people who advocate AI (here it is) have never worked with software -- written it, built it, deployed it, specified it, diagnosed it, corrected it, or assessed its performance. You can tell because they know nothing about the process. They think "computers" can "do anything". They haven't talked to businesspeople who fail, at the deepest intuitive level, to grasp the notion of either-or. They don't know what kinds of things machines are better at versus what kinds of things humans are better at. They haven't learned, the hard way, that in order to automate mature human processes, they must first be very drastically simplified and made deterministic -- see above, re either-or. They don't realize that it doesn't matter how fast you get the wrong answer. They haven't learned, by direct experience, that businesspeople don't understand why right answers are better than wrong ones. They don't realize that every output from every computer program is altered, slightly or completely, by human eyes and hands, before any use is made of it. They do not know the story of THERAC-25. I could go on, but my weariness is as great as yours. It does not matter how fashionably you get the wrong answer.

    1. kahner

      The people who advocate AI have never worked with software? Huh? What about the thousands of researchers and software engineers working on AI who've spent decades advocating for it? You think none of them have written, built, deployed, etc., software?

      1. Scott_F

        He is talking about the talking-heads and business people who believe that GPT-4 can simply be dropped into your laptop and instantly replace you.

        The COO of OpenAI has stated that there are a limited number of processes where LLMs can be useful: "I think the overhyped aspect is that, in one fell swoop, it can deliver substantive business change."
        The engineers have at least SOME idea.

        Edit: Added quote 🙂
        "There is never a single thing you can do with AI that completely solves a problem."

        1. kahner

          if that's what he's talking about, fine, i guess (although it's clearly not what he actually said), but it makes the rest of his post pretty irrelevant because there are tons of technically knowledgeable, experienced and smart people advocating for a broad AI revolution in the short to medium term. there are always ignorant business types and pundits opining on every tech trend as the next revolution.

          1. frankwilhoit

            He's right, I was excluding the people who have a structural incentive to lie. Your "technically knowledgeable, experienced, and smart people" are liars, full stop. They know that their own minds are not up to the task of creating the things that they say are going to be created, and if theirs are not, then certainly no one else's are.

            1. kahner

              i guess "everyone who disagrees with me is a liar or an idiot" is one way of looking at things. doesn't seem like a very smart way though.

          2. pipecock

            Yes there are complete morons everywhere, at every level of “success” in every type of information based business.

            The big secret of the world is that basically everybody is totally incompetent, even at the things they are supposedly “good” at.

            Even sadder is that people are still better than computers at these things.

            Ask me why I’m not super hyped for AI again? Lol

      1. frankwilhoit

        The future consists of repetitions of it, over and over again. If you doubt that, think of the things that would be required to prevent it.

  16. kaleberg

    Have you tried to use an LLM lately? They don't make as many of the stupid mistakes that they used to, so the mistakes tend to be less obvious. They aren't good at summarizing legal documents since they ignore inferences that a first year law student would catch. They aren't good at understanding literature since they don't understand irony or subject-verb-object grammar very well. They aren't good for finding out things on the web since they produce paragraphs of glib filler rather than providing useful information. My take is that they are primarily good at generating glib filler that contains zero information but looks better than simple lorem ipsum.

    P.S. We're ignoring the whole agency problem. When a self-driving car drives recklessly and kills someone, who does the jail time? When an LLM makes a representation to a counterparty, who gets fired if it leads to corporate disaster?
