• 2 Posts
  • 978 Comments
Joined 4 years ago
Cake day: January 17th, 2022

  • I’m new to Linux as of about 3 months ago, so it’s been a bit of a learning curve on top of learning VE haha. I didn’t realize CUDA had versions

    Yeah… it’s not you. I’m a professional developer and have been using Linux for decades. It’s still hard for me to install specific environments. Sometimes it just works… but often I give up. Sometimes it’s my mistake, but sometimes it’s also because the packaging is not actually reproducible. It works on the setup the developer used, great for them, but slight variations throw you right into dependency hell.




  • suddenly it hit me. I’m on Linux, I can do a lot of this more easily with the command line.

    Nice, you get it! You have so much to learn, so don’t be afraid of taking notes. The CLI and the UNIX philosophy are very powerful. They remain powerful decades later (from the desktop, to mobile with e.g. adb on Android, to the “cloud” with a shell over e.g. ssh), so IMHO it’s still a good investment. Still, discovery can be tricky, so be gentle with yourself.

    Also, a few tricks that can help you go further, faster:

    • take notes (really! it can be a .txt or .md file or a wiki page, entirely up to you)
    • consider aliases in your .bashrc to keep your shortcuts and compose them
    • stop typing the same commands over and over; instead, reverse-i-search your history with e.g. Ctrl-r
    • use TAB autocompletion (as suggested below)
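
    For the aliases/.bashrc tip, a minimal sketch of what such additions can look like — all the names here are just examples, pick your own:

```shell
# Example ~/.bashrc additions (the aliases and the function are illustrative).
alias ll='ls -lah'      # long listing with human-readable sizes
alias gs='git status'   # shorten a command you type all day

# Small functions let you compose steps; here: make a directory and cd into it.
mkcd() { mkdir -p "$1" && cd "$1"; }
```

    Reload with `source ~/.bashrc` and the shortcuts are available in every new shell.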

    Anyway, enjoy, it’s an adventure!




  • There’s no getting around using AI for some of this, like subtitle generation

    Eh… yes there is: you can pay actual humans to do that. In fact, if you do “subtitle generation” (whatever that might mean) without any editing, you are taking a huge risk. Sure, it might get 99% of the words right, but if it fucks up on the main topic… well, good luck.

    Anyway, if you do still want to go down that road, you could try:

    • ffmpeg with whisper.cpp (though honestly I’m not convinced hardcoding subtitles is good practice; why not package them as e.g. .mkv? Depends on context, obviously)
    • Kdenlive with vosk
    • Kdenlive with whatever else, via the *.srt, *.ass, *.vtt or *.sbv formats
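
    On the “package as .mkv” point, a sketch of soft-muxing an .srt into an .mkv with ffmpeg instead of burning the text into the video (file names are placeholders):

```shell
# Sketch: store subtitles as a toggleable track instead of hardcoding them.
# Wrapped in a function so the file names stay placeholders until you call it.
mux_subs() {
  # -map 0 -map 1 keeps all streams from both inputs,
  # -c copy avoids re-encoding the audio/video,
  # -c:s srt stores the text as a SubRip subtitle track in the .mkv.
  ffmpeg -i "$1" -i "$2" -map 0 -map 1 -c copy -c:s srt "$3"
}
```

    Usage would be e.g. `mux_subs input.mp4 subs.srt output.mkv`; players can then toggle the track on and off.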




  • Sad but unsurprising.

    I have read quite a lot on the topic, including “Ghost Work: How to Stop Silicon Valley from Building a New Global Underclass” (2019), and have seen numerous documentaries, e.g. “Invisibles - Les travailleurs du clic” (2020).

    What I find interesting here is that the tasks seem to go beyond dataset annotation. In a way it is still annotation (as in, you take some data, e.g. a photo, and you circle part of it to attach a label, e.g. “cat”), but here it seems to be 2nd order, i.e. identifying the blind spots in how that dataset is handled. It still doesn’t mean anything produced is more valuable, or that the expected outcome is feasible with solely larger datasets and more compute, yet maybe it does show a change in the quality of the tasks to be done.