cross-posted from: https://discuss.tchncs.de/post/22423685

EDIT: For those who are too lazy to click the link, this is what it says

Hello,

Sad news for everyone. YouTube/Google has patched the latest workaround that we had in order to restore the video playback functionality.

Right now we have no other solutions/fixes. You may be able to get Invidious working on residential IP addresses (like at home) but on datacenter IP addresses Invidious won’t work anymore.

If you are interested in installing Invidious at home, we remind you that we have a guide for that here: https://docs.invidious.io/installation/.

This is not the death of this project. We will still try to find new solutions, but this might take time, months probably.

I have updated the public instance list to reflect the working public instances: https://instances.invidious.io. Please don’t abuse them since the number is really low.

Feel free to discuss this politely on Matrix or IRC.

  • Gutless2615@ttrpg.network
    4 months ago

    “Give me a Python script using yt-dlp that I can run on a cronjob that will download the videos from any of my subscribed channels since the last time the script was run”

    You can use the following Python script to download videos from your subscribed channels since the last run. This script uses yt-dlp and stores the last download timestamp to track new videos.

    First, ensure you have yt-dlp installed:

    pip install yt-dlp
    

    Then, create a script called download_videos.py:

    import json
    import os
    import subprocess
    from datetime import datetime
    
    # Configuration (relative paths are resolved from the working directory,
    # which under cron is usually your home directory)
    last_run_file = 'last_run.json'
    download_directory = 'downloads'
    
    # yt-dlp expects dates as YYYYMMDD, not ISO 8601 timestamps
    date_format = '%Y%m%d'
    
    # Ensure the download directory exists
    os.makedirs(download_directory, exist_ok=True)
    
    # Load the date of the last run; on the first run, fall back to today
    if os.path.exists(last_run_file):
        with open(last_run_file, 'r') as f:
            last_run = json.load(f)['last_run']
    else:
        last_run = datetime.now().strftime(date_format)
    
    # Record the date of this run before downloading starts
    current_run = datetime.now().strftime(date_format)
    
    # Command to get videos uploaded on or after the last run date
    command = [
        'yt-dlp',
        '--download-archive', 'archive.txt',
        '--output', f'{download_directory}/%(title)s.%(ext)s',
        '--dateafter', last_run,
        '--no-post-overwrites',
        '--merge-output-format', 'mp4',
        'https://www.youtube.com/channel/CHANNEL_ID',  # Replace with your channel URL
    ]
    
    # Run the command
    result = subprocess.run(command)
    
    # Save the current run date only if yt-dlp succeeded, so a failed run is retried
    if result.returncode == 0:
        with open(last_run_file, 'w') as f:
            json.dump({'last_run': current_run}, f)
        print("Download complete. Next run will check for videos since:", current_run)
    else:
        print("yt-dlp exited with code", result.returncode)
    

    Setting Up the Cron Job

    1. Make the script executable:

      chmod +x download_videos.py
      
    2. Open your crontab:

      crontab -e
      
    3. Add a line to run the script at your desired interval (e.g., daily at 2 AM):

      0 2 * * * /path/to/python /path/to/download_videos.py
      

    Notes

    • Replace CHANNEL_ID in the script with your actual channel ID, add one URL per subscribed channel to the command, or use a playlist URL if preferred.
    • The archive.txt file keeps track of already downloaded videos to avoid duplicates.
    • Adjust the paths to Python and your script as needed.
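
The notes above mention several channel IDs, while the script takes a single URL. A minimal sketch of one way to handle more than one channel follows. The helper names `to_ytdlp_date` and `build_command` and the placeholder URLs are assumptions for illustration, not part of the original script. It also formats the date as YYYYMMDD, which is what yt-dlp's `--dateafter` option accepts.

```python
from __future__ import annotations

from datetime import datetime


def to_ytdlp_date(moment: datetime) -> str:
    """Format a datetime as YYYYMMDD, the form yt-dlp's --dateafter accepts."""
    return moment.strftime("%Y%m%d")


def build_command(channel_url: str, since: datetime,
                  download_dir: str = "downloads") -> list[str]:
    """Build the yt-dlp argument list for one channel (hypothetical helper)."""
    return [
        "yt-dlp",
        "--download-archive", "archive.txt",
        "--output", f"{download_dir}/%(title)s.%(ext)s",
        "--dateafter", to_ytdlp_date(since),
        "--merge-output-format", "mp4",
        channel_url,
    ]


# Placeholder URLs: replace CHANNEL_ID_1 and CHANNEL_ID_2 with real channel IDs.
channels = [
    "https://www.youtube.com/channel/CHANNEL_ID_1",
    "https://www.youtube.com/channel/CHANNEL_ID_2",
]

# One command per channel; this sketch only prints them instead of running them.
commands = [build_command(url, datetime(2024, 9, 1)) for url in channels]
for command in commands:
    print(" ".join(command))
```

To actually download, each list could be passed to `subprocess.run(command)` in place of the `print` call. Sharing one archive.txt across channels should be fine, since the archive records individual video IDs.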
    • webghost0101@sopuli.xyz
      4 months ago

      Another example, which I can personally verify has been working fine for months. It works a bit differently from the above: it downloads the latest 2 vids (configurable per channel) that are not already downloaded and runs once every hour with cron. I also attempted to filter out live vids and shorts.

      Channels I am “subscribed” to are stored in a single text file. It also uses the avc1 codec because I found p9 and p10 had issues with the Jellyfin client on my TV.

      The file looks like this. I added categories, but I don’t actually use them in the script besides putting them in a variable, lol. VidLimit is how many of the latest vids it should look at to download. The original reason I implemented that is so I could selectively download a bulk of the latest vids if I wanted to.

      Cat=Science
      Name=Veritasium
      VidLimit=2
      URL=https://www.youtube.com/channel/UCHnyfMqiRRG1u-2MsSQLbXA
      
      Cat=Minecraft
      Name=EthosLab
      VidLimit=2
      URL=https://www.youtube.com/channel/UCFKDEp9si4RmHFWJW1vYsMA
      
      #!/bin/bash
      
      # Define the directory to store channel lists and scripts
      script_dir="/.../YTDL"
      
      # Define the base directory to store downloaded videos
      base_download_dir="/.../youtubevids"
      
      # Change to the script directory; stop if it does not exist
      cd "$script_dir" || exit 1
      
      # Parse the Channels.txt file and process each channel.
      # Fields are tab-separated so that names containing spaces survive,
      # and the URL is taken after the first "=" so query strings stay intact.
      awk -F'=' -v OFS='\t' '
        /^Cat/ {Cat=$2}
        /^Name/ {Name=$2}
        /^VidLimit/ {VidLimit=$2}
        /^URL/ {print Cat, Name, VidLimit, substr($0, index($0, "=") + 1)}
      ' "$script_dir/Channels.txt" | while IFS=$'\t' read -r Cat Name VidLimit URL; do
          # Define the download directory for this channel
          download_dir="$base_download_dir"
          
          # Define the download archive file for this channel
          archive_file="$script_dir/DLarchive$Name.txt"
          
          # Create the download directory if it does not exist
          mkdir -p "$download_dir"
          
          # If VidLimit is "ALL", pass no limit, otherwise pass --playlist-end VidLimit
          playlist_end_option=()
          if [[ $VidLimit != "ALL" ]]; then
              playlist_end_option=(--playlist-end "$VidLimit")
          fi
          
          # Download new videos, skipping live streams and shorts
          yt-dlp \
              --download-archive "$archive_file" \
              "${playlist_end_option[@]}" \
              --write-description \
              --write-thumbnail \
              --convert-thumbnails jpg \
              --add-metadata \
              --embed-thumbnail \
              --match-filter "!is_live & !was_live & original_url!*=/shorts/" \
              --merge-output-format mp4 \
              --format "bestvideo[vcodec^=avc1]+bestaudio[ext=m4a]/best[ext=mp4]/best" \
              --output "$download_dir/${Name} - %(title)s.%(ext)s" \
              "$URL"
      done
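
For anyone who would rather keep everything in Python, here is a rough sketch of how the same channel file could be read. It assumes entries are blocks of Key=Value lines separated by blank lines, as in the sample above. The function names `parse_channels` and `playlist_end_args`, and the second sample entry, are made up for illustration.

```python
from __future__ import annotations


def parse_channels(text: str) -> list[dict[str, str]]:
    """Parse blank-line-separated blocks of Key=Value lines into dicts."""
    channels: list[dict[str, str]] = []
    current: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line closes the current channel block, if any.
            if current:
                channels.append(current)
                current = {}
            continue
        if "=" not in line:
            continue  # ignore anything that is not a Key=Value line
        # Split on the first "=" only, so URLs with query strings stay intact.
        key, _, value = line.partition("=")
        current[key] = value
    if current:
        channels.append(current)
    return channels


def playlist_end_args(vid_limit: str) -> list[str]:
    """Same rule as the shell script: "ALL" means no --playlist-end limit."""
    return [] if vid_limit == "ALL" else ["--playlist-end", vid_limit]


# The second entry is a made-up example to show the "ALL" case.
sample = """\
Cat=Minecraft
Name=EthosLab
VidLimit=2
URL=https://www.youtube.com/channel/UCFKDEp9si4RmHFWJW1vYsMA

Cat=Example
Name=Some Channel
VidLimit=ALL
URL=https://www.youtube.com/playlist?list=PLACEHOLDER
"""

for channel in parse_channels(sample):
    print(channel["Name"], playlist_end_args(channel["VidLimit"]), channel["URL"])
```

The resulting arguments could then be appended to a yt-dlp argument list and run with `subprocess.run`, one call per channel.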
      
      • Gutless2615@ttrpg.network
        4 months ago

        Yeah this is more elegant and closer to what I’d actually want to implement. I was more just showing what could be done in literally thirty seconds on the can with ChatGPT.

        • webghost0101@sopuli.xyz
          4 months ago

          I knew I recognized that output.

          Mine is actually also made with the help of ChatGPT but manually refined and tested.