cross-posted from: https://lemmy.ca/post/37011397

!opensource@programming.dev

The popular open-source VLC video player was demonstrated on the floor of CES 2025 with automatic AI subtitling and translation, generated locally and offline in real time. Parent organization VideoLAN shared a video on Tuesday in which president Jean-Baptiste Kempf shows off the new feature, which uses open-source AI models to generate subtitles for videos in several languages.

    • shyguyblue@lemmy.world
      10 days ago

      I was just thinking, this is exactly what AI should be used for. Pattern recognition, full stop.

      • snooggums@lemmy.world
        10 days ago

        Yup, and if it isn’t perfect that is ok as long as it is close enough.

        Like getting name spellings wrong or mixing homophones is fine because it isn’t trying to be factually accurate.

        • TJA!@sh.itjust.works
          10 days ago

          The problem is that now people will say they don’t need to create accurate subtitles because VLC is doing the job for them.

          Accessibility might suffer from that, because all subtitles will then be just “good enough”.

  • m8052@lemmy.world
    10 days ago

    What’s important is that this is running on your machine locally, offline, without any cloud services. It runs directly inside the executable

    YES, thank you JB

  • billwashere@lemmy.world
    9 days ago

    This might be one of the few times I’ve seen AI being useful and not just slapped on something for marketing purposes.

  • TheRealKuni@lemmy.world
    10 days ago

    And yet they turned down having thumbnails for seeking because it would be too resource intensive. 😐

    • cley_faye@lemmy.world
      10 days ago

      Video decoding is resource intensive. We’re used to it, and we have hardware acceleration for some of it, but spewing something around 52 million pixels every second from a highly compressed data source is not cheap. I’m not sure how the two compare, but small LLMs are not that costly to run if you don’t factor in their creation.
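The "around 52 million pixels every second" figure matches 1080p video at 25 frames per second. The comment does not state a resolution or frame rate, so both values below are assumptions chosen because they reproduce that number. A minimal sketch of the arithmetic:

```python
def pixels_per_second(width: int, height: int, fps: float) -> float:
    """Raw pixels a decoder has to produce each second of playback."""
    return width * height * fps


# Assumed format: 1920x1080 at 25 fps (not stated in the comment).
rate_1080p25 = pixels_per_second(1920, 1080, 25)
print(f"1080p @ 25 fps: {rate_1080p25 / 1e6:.1f} million pixels/s")

# For comparison, an assumed 4K stream at 60 fps.
rate_4k60 = pixels_per_second(3840, 2160, 60)
print(f"4K @ 60 fps: {rate_4k60 / 1e6:.1f} million pixels/s")
```

This counts output pixels only; it says nothing about the cost of entropy decoding or motion compensation, which depends on the codec.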

      • TheRealKuni@lemmy.world
        9 days ago

        All they’d need to do is generate thumbnails for every period on video load. Make that period adjustable. Might take a few extra seconds to load a video. Make it off by default if they’re worried about the performance hit.

        There are other desktop video players that make this work.
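The "thumbnail every period" idea above can be sketched as a timestamp schedule plus a lookup for the seek bar. This is a hypothetical illustration, not VLC's code or API: the function names and the 10-second default period are assumptions, and actually decoding a frame at each timestamp is left out.

```python
import math


def thumbnail_timestamps(duration_s: float, period_s: float = 10.0) -> list[float]:
    """Timestamps (in seconds) at which a thumbnail would be captured."""
    if period_s <= 0:
        raise ValueError("period_s must be positive")
    count = math.ceil(duration_s / period_s)
    # Multiply instead of accumulating to avoid floating-point drift.
    return [i * period_s for i in range(count)]


def thumbnail_index(position_s: float, period_s: float = 10.0) -> int:
    """Index of the thumbnail covering a given seek position."""
    return int(position_s // period_s)


stamps = thumbnail_timestamps(95, period_s=10)
print(len(stamps), "thumbnails, last at", stamps[-1], "s")
print("seek to 47.3 s -> thumbnail", thumbnail_index(47.3, period_s=10))
```

A longer period means fewer frames to decode at load time, which is the trade-off the adjustable setting would expose.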

  • Nalivai@lemmy.world
    10 days ago

    The technology is nowhere near being good though. On synthetic tests, on the data it was trained and tweaked on, maybe, I don’t know.
    I co-run an event where we invite speakers from all over the world, and we tried every way to generate subtitles; all of them perform at the level of YouTube’s autogenerated ones. It’s better than nothing, but you can’t really rely on it.

    • TriflingToad@sh.itjust.works
      9 days ago

      Is your goal to rely on it, or to have it as a backup?
      For my purpose of having a backup, nearly anything will be better than nothing.

      • Nalivai@lemmy.world
        8 days ago

        When you do live streaming there is no time for a backup; it either works or it doesn’t. It’s better than nothing, that’s for sure, but also maybe only marginally better than whatever we had 10 years ago.

    • lukewarm_ozone@lemmy.today
      9 days ago

      Really? This is the opposite of my experience with (distil-)whisper - I use it to generate subtitles for stuff like podcasts and was stunned at first by how high-quality the results are. I typically use distil-whisper/distil-large-v3, locally. Was it among the models you tried?

      • Nalivai@lemmy.world
        8 days ago

        Unfortunately I don’t know the specific names of the models. I’ll follow up if I remember to ask the people who spun up the models themselves.
        The difference might be live vs. recorded content, I don’t know.
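For the recorded-content workflow discussed above, the last step after transcription is turning timed segments into a subtitle file. A minimal sketch, assuming the speech model's output has already been reduced to `(start_seconds, end_seconds, text)` tuples; that tuple shape and the sample segments are assumptions for illustration, not the output format of distil-whisper or any particular library.

```python
def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments: list[tuple[float, float, str]]) -> str:
    """Render (start, end, text) segments as SubRip (.srt) text."""
    blocks = []
    for number, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{number}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    return "\n".join(blocks)


# Made-up segments standing in for a transcription result.
example = [(0.0, 2.4, "Welcome to the show."), (2.4, 5.0, " Today we talk about VLC.")]
print(segments_to_srt(example))
```

For live streams the same formatting applies, but segments arrive incrementally and cannot be corrected after the fact, which may be part of the quality gap described in this thread.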

  • renzev@lemmy.world
    10 days ago

    This sounds like a great thing for deaf people and just in general, but I don’t think AI will ever replace anime fansub makers who have no problem throwing a wall of text on screen for a split second just to explain an obscure untranslatable pun.

    • cley_faye@lemmy.world
      10 days ago

      It’s unlikely to even replace good subtitles, fan or not. It’s just a nice thing to have for a lot of content though.

      • boonhet@lemm.ee
        9 days ago

        I have family members who can’t really understand spoken English because it’s a bit fast, and who can’t read English subtitles either because, again, they’re too fast for them.

        Sometimes you download a movie and all the Estonian subtitles are for an older release and they desynchronize. Sometimes you can barely even find synchronized English subtitles, so even that doesn’t work.

        This seems like a godsend, honestly.

        Funnily enough, of all the streaming services, I’m again going to have to commend Apple TV+ here. Their shit has Estonian subtitles. Netflix, Prime, etc, do not. Meaning if I’m watching with a family member who doesn’t understand English well, I’ll watch Apple TV+ with a subscription, and everything else is going to be pirated for subtitles. So I don’t bother subscribing anymore. We’re a tiny country, but for some reason Apple of all companies has chosen to acknowledge us. Meanwhile, I was setting up an Xbox for someone a few years ago, and Estonia just… straight up doesn’t exist. I’m not talking about language support - you literally couldn’t pick it as your LOCATION.