Judge, Moderator, Player (200)
Joined: 7/15/2021
Posts: 112
Location: United States
I use Linux as my daily driver and have done so for the past 6 years. I believe it's the best operating system, so I figured out the best way to encode on it. FFmpeg is my tool of choice. I used to run a bunch of separate commands for one encode, which also generated a bunch of intermediate temporary files. But when it came time to encode [5036] Linux Cyber Shadow by keylie in 42:24.12, I figured that I could use a single script and thus have only one file generated, which is the most convenient. So here's the encode script I used: Download encode.sh
Language: shell

#!/usr/bin/env bash
ffmpeg \
\
-t 2 -f lavfi -i anullsrc \
\
-t 2 -loop 1 -r 60 -i /B/P/tasencodelogos/logo_16to9_3840x2160.png \
\
-t 49:01.500 -i 'cybershadow_dump.avi' \
\
-qp 0 -pix_fmt yuv420p \
-filter_complex '[2]ass=cybershadow_dump.ass,scale=3840:2160:flags=neighbor[v];[1]setsar=1[im];[im][0][v]concat=n=2:v=1:a=1' \
-c:a flac \
\
cybershadow_encode.mkv
Make sure to chmod +x ./encode.sh before running it with ./encode.sh (you'll want to change the path to the logo .png too). For the subtitles (cybershadow_dump.ass), I used Aegisub (which on Fedora can be installed with sudo dnf install aegisub). For the dump (cybershadow_dump.avi), I dumped with UTVideo for the video codec and FLAC for the audio codec. Now this might look a bit complicated, but this is actually as simple as I could make it. In shell scripting, ending a line with a backslash (\) escapes the newline so the current line actually continues to the next line, and I use that to make the ffmpeg invocation be spread out across multiple lines. If it was one line, it would be
ffmpeg -t 2 -f lavfi -i anullsrc -t 2 -loop 1 -r 60 -i /B/P/tasencodelogos/logo_16to9_3840x2160.png -t 49:01.500 -i 'cybershadow_dump.avi' -qp 0 -pix_fmt yuv420p -filter_complex '[2]ass=cybershadow_dump.ass,scale=3840:2160:flags=neighbor[v];[1]setsar=1[im];[im][0][v]concat=n=2:v=1:a=1' -c:a flac cybershadow_encode.mkv
which is harder to read. So here's what each line does:
  • -t 2 -f lavfi -i anullsrc: FFmpeg takes a series of inputs. Each input in FFmpeg is specified with -i, followed by the name (in most cases this will be a filename on your disk), and the options preceding it are specifying information about that input. In this case, -t 2 means to only take 2 seconds of the input, and -f lavfi specifies that the format is lavfi. With -i anullsrc, this means to create an input that is 2 seconds of blank audio. As you may have guessed, this is needed to place 2 seconds of blank audio underneath the 2 seconds of logo, which will be the next input.
  • -t 2 -loop 1 -r 60 -i /B/P/tasencodelogos/logo_16to9_3840x2160.png: Again, -t 2 specifies a length of only 2 seconds. -loop 1 means that the resulting video should loop (1 meaning 'true', as opposed to 0). -r 60 specifies that the framerate should be 60 frames per second (matching the dump of Cyber Shadow), and of course -i is followed by the path to the image. As you may have guessed, the image is my logo file.
  • -t 49:01.500 -i 'cybershadow_dump.avi': This is the dump itself. -t 49:01.500 means to cut it off at timecode 49:01.500 (which is right when the ending music loops in the dump).
  • -qp 0 -pix_fmt yuv420p: -qp 0 means that the quality should be lossless (if it was higher, it would be more lossy). This is needed because we haven't specified a video codec for the output file, so FFmpeg will default to x264. -pix_fmt yuv420p specifies that the pixel format should be YUV420p.
  • -filter_complex '[2]ass=cybershadow_dump.ass,scale=3840:2160:flags=neighbor[v];[1]setsar=1[im];[im][0][v]concat=n=2:v=1:a=1': Okay, this is the big line. -filter_complex lets you describe a whole graph of filters, with multiple inputs and outputs, in a single option. It's what lets me put this all in one script instead of having multiple scripts and a bunch of intermediate temporary files lying around. The string that follows is a list of entries separated by semicolons (;), and each entry follows the format [input]filter[output]. An entry can have multiple inputs, and multiple filters within an entry are separated by commas (,). So let's look at each entry:
    • [2]ass=cybershadow_dump.ass,scale=3840:2160:flags=neighbor[v]: This takes the stream named 2, hardcodes the subtitles (from cybershadow_dump.ass), then scales it up to 4K resolution (I dumped the game at 1x native resolution to save disk space), and puts the result in a stream named v. In this case, 2 refers to the third -i input from earlier, and FFmpeg automatically numbers each -i input for you (starting from 0).
    • [1]setsar=1[im]: This takes the 1 stream (the logo .png), sets its sample aspect ratio to 1, and then puts it in a stream named im. I'm not actually sure why this is here, but it's probably needed for concatenation.
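    • [im][0][v]concat=n=2:v=1:a=1: This takes the im, 0, and v streams (im and v are the streams created in the previous entries, and 0 is the silent audio from the first input) and concatenates them. n=2 because there are two segments to concatenate (the logo and the dump), and v=1:a=1 because each segment has 1 video and 1 audio stream, which is also what comes out in the end. With those settings concat expects four inputs (video and audio for each segment), but only three are labeled here. As far as I can tell from the FFmpeg documentation, an unlabeled input is connected to the first unused input stream of the matching type, which here is the dump's own audio. There is no stream specified at the end because this will end up being the output of the entire FFmpeg command itself.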
  • -c:a flac: Sets the audio codec to FLAC.
  • cybershadow_encode.mkv: If there's an argument that isn't preceded by a flag, FFmpeg will interpret it as the name of the output file. So this is the output file.
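To make the [input]filter[output] structure described above easier to see, here is a minimal sketch that assembles the same -filter_complex string from one entry per line. The variable names (subs, chains, filter_complex) are made up for illustration, and the script only prints the string instead of running ffmpeg.

```shell
#!/usr/bin/env bash
# Sketch: assemble the -filter_complex string from one entry per array element.
# The file name comes from the script above; the variable names are made up.

subs='cybershadow_dump.ass'

chains=(
  # input 2 (the dump): hardcode the subtitles, then upscale with nearest neighbor
  "[2]ass=${subs},scale=3840:2160:flags=neighbor[v]"
  # input 1 (the logo): set the sample aspect ratio to 1
  "[1]setsar=1[im]"
  # logo video + silent audio, then the dump video
  "[im][0][v]concat=n=2:v=1:a=1"
)

# Join the entries with ';', the separator used between entries.
filter_complex=$(IFS=';'; printf '%s' "${chains[*]}")

printf '%s\n' "$filter_complex"
```

The printed string is the same one passed to -filter_complex in the script, so it could be handed to ffmpeg as -filter_complex "$filter_complex". A bash array also allows a comment above each entry, which backslash continuations do not.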
With this, I could just let my computer encode the file and then upload the file straight to YouTube when it was done. Of course, I tested it with -t 30 first to make sure everything encoded fine, before I did the full encode (which took ~2 hours). Here's something slightly more complex, the script I used to make the downloadable that I uploaded to archive.org: Download downloadable.sh
Language: shell

#!/usr/bin/env bash
ffmpeg \
\
-t 2 -f lavfi -i anullsrc \
\
-t 2 -loop 1 -r 60 -i /B/P/tasencodelogos/logo_16to9_800x450.png \
\
-t 49:01.500 -i 'cybershadow_dump.avi' \
\
-f srt -i CyberShadow.srt \
-crf 20 -pix_fmt yuv420p -x264opts keyint=600:merange=64:colormatrix=smpte170m \
-filter_complex '[0]aresample=48000,atrim=start_sample=5060[a];[2]ass=cybershadow_dump.ass,scale=800:450:flags=neighbor[v];[1]setsar=1[im];[im][a][v]concat=n=2:v=1:a=1' \
-c:a aac -vbr 2 \
-c:s mov_text \
\
cybershadow-tas-keylie.mp4
It's similar to the previous script, except there's now a fourth input (the commentary subtitles, named CyberShadow.srt), which is embedded using -c:s mov_text. There are also extra filters on the audio and extra options for the video to cut down on file size. For my future encodes, I'm going to just make copies of these scripts and then tweak things as needed, which is the simplest thing that works for me.
Site Admin, Skilled player (1255)
Joined: 4/17/2010
Posts: 11495
Location: Lake Char­gogg­a­gogg­man­chaugg­a­gogg­chau­bun­a­gung­a­maugg
Thanks! What would be the -filter_complex for cases when you need to upscale 2x with neighbor and then lanczos width to be 4/3 of height?
Warning: When making decisions, I try to collect as much data as possible before actually deciding. I try to abstract away and see the principles behind real world events and people's opinions. I try to generalize them and turn into something clear and reusable. I hate depending on unpredictable and having to make lottery guesses. Any problem can be solved by systems thinking and acting.
Judge, Moderator, Player (200)
Joined: 7/15/2021
Posts: 112
Location: United States
feos wrote:
What would be the -filter_complex for cases when you need to upscale 2x with neighbor and then lanczos width to be 4/3 of height?
Probably something like [2]scale=iw*2:ih*2:flags=neighbor,scale=iw*4/3:ih:flags=lanczos[v], but I haven't tried.
Site Admin, Skilled player (1255)
Joined: 4/17/2010
Posts: 11495
Location: Lake Char­gogg­a­gogg­man­chaugg­a­gogg­chau­bun­a­gung­a­maugg
At first glance it looks like scaling width to be 4/3 of itself.
Judge, Moderator, Player (200)
Joined: 7/15/2021
Posts: 112
Location: United States
My bad, it should be ih*4/3:ih then.
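Putting the correction together, a small sketch of a helper that prints the whole chain might look like this. aspect_chain and its arguments are hypothetical names, not FFmpeg options, and like the original suggestion this has not been tried on a real encode.

```shell
#!/usr/bin/env bash
# Sketch: print a filter chain that upscales with nearest neighbor by an
# integer factor, then sets the width to 4/3 of the height with lanczos.
# aspect_chain and its arguments are hypothetical names, not FFmpeg options.
aspect_chain() {
  local input=$1 output=$2 factor=$3
  # In the second scale, ih is the height after the first scale has run.
  printf '[%s]scale=iw*%s:ih*%s:flags=neighbor,scale=ih*4/3:ih:flags=lanczos[%s]\n' \
    "$input" "$factor" "$factor" "$output"
}

aspect_chain 2 v 2
```

The output can replace the [2]...[v] entry of a -filter_complex string.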
Site Admin, Skilled player (1255)
Joined: 4/17/2010
Posts: 11495
Location: Lake Char­gogg­a­gogg­man­chaugg­a­gogg­chau­bun­a­gung­a­maugg
Just remembered another thing: x264 wants dimensions to be multiples of 4.
Judge, Moderator, Player (200)
Joined: 7/15/2021
Posts: 112
Location: United States
If it isn't then FFmpeg will error and then you can adjust the dimensions accordingly (by adding *4, etc.).
Site Admin, Skilled player (1255)
Joined: 4/17/2010
Posts: 11495
Location: Lake Char­gogg­a­gogg­man­chaugg­a­gogg­chau­bun­a­gung­a­maugg
Wouldn't that result in upscaling by the factor of 4, as opposed to adding 0-3 pixels so it becomes the nearest multiple of 4? Here's how I do it in avisynth:
Language: avs

# rounds an integer up or down to the nearest multiple of mod
function ForceModulo( \
    int number, \
    int mod, \
    bool up \
){
    return (up \
        ? (int(number + mod - 1) / mod) * mod \
        : int(number / mod) * mod)
}
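For comparison, the same rounding can be sketched in bash with integer arithmetic. force_modulo is a hypothetical helper name, the up flag is passed as 1 or 0, and the sketch assumes non-negative integers.

```shell
#!/usr/bin/env bash
# Sketch: shell port of the ForceModulo idea using integer arithmetic.
# force_modulo is a hypothetical helper name; "up" is passed as 1 or 0.
# Assumes non-negative integer inputs.
force_modulo() {
  local number=$1 mod=$2 up=$3
  if (( up )); then
    # round up to the next multiple of mod
    echo $(( (number + mod - 1) / mod * mod ))
  else
    # round down to the previous multiple of mod
    echo $(( number / mod * mod ))
  fi
}

force_modulo 597 4 1   # prints 600
force_modulo 597 4 0   # prints 596
```

A value computed this way could be substituted into a scale=WIDTH:HEIGHT filter before ffmpeg is called, provided the source dimensions are known in advance.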
Editor, Publisher, Player (47)
Joined: 10/15/2021
Posts: 377
Is it possible to get something close to feos's encoding package for windows by shell scripting with variables and conditionals?
Judge, Moderator, Player (200)
Joined: 7/15/2021
Posts: 112
Location: United States
feos wrote:
Wouldn't that result in upscaling by the factor of 4, as opposed to adding 0-3 pixels so it becomes the nearest multiple of 4? Here's how I do it in avisynth:
Language: avs

# rounds an integer up or down to the nearest multiple of mod
function ForceModulo( \
    int number, \
    int mod, \
    bool up \
){
    return (up \
        ? (int(number + mod - 1) / mod) * mod \
        : int(number / mod) * mod)
}
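Hm, looks like there's a force_divisible_by option for scale. So it would be scale=ih*4/3:ih:force_divisible_by=4 then. Though I'm not sure if it will always add or subtract pixels. (The FFmpeg documentation describes force_divisible_by as taking effect together with force_original_aspect_ratio, so it may not round anything when used on its own like this.)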
despoa wrote:
Is it possible to get something close to feos's encoding package for windows by shell scripting with variables and conditionals?
It should be. feos's encoding package is already just batch shell scripting anyway.
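As a rough illustration of what variables and conditionals could look like, here is a sketch of the first encode script rebuilt around a bash array. The names build_cmd, PREVIEW, LOGO, DUMP, SUBS, LENGTH, and OUT are all hypothetical, and placing -t 30 as an output option for a test run is an assumption. The script only prints the command instead of running it.

```shell
#!/usr/bin/env bash
# Sketch: the encode script parameterized with variables and one conditional.
# build_cmd and the upper-case variable names are hypothetical; the defaults
# are the values from the original script.
LOGO=${LOGO:-/B/P/tasencodelogos/logo_16to9_3840x2160.png}
DUMP=${DUMP:-cybershadow_dump.avi}
SUBS=${SUBS:-cybershadow_dump.ass}
LENGTH=${LENGTH:-49:01.500}
OUT=${OUT:-cybershadow_encode.mkv}

build_cmd() {
  local preview=$1
  cmd=(ffmpeg
    -t 2 -f lavfi -i anullsrc
    -t 2 -loop 1 -r 60 -i "$LOGO"
    -t "$LENGTH" -i "$DUMP"
    -qp 0 -pix_fmt yuv420p
    -filter_complex "[2]ass=${SUBS},scale=3840:2160:flags=neighbor[v];[1]setsar=1[im];[im][0][v]concat=n=2:v=1:a=1"
    -c:a flac
  )
  # Conditional: for a test run, stop the output after 30 seconds
  # (assumption: -t 30 is given as an output option).
  if [[ $preview == 1 ]]; then
    cmd+=(-t 30)
  fi
  cmd+=("$OUT")
}

build_cmd "${PREVIEW:-0}"

# Print the command; replace this line with "${cmd[@]}" to actually run it.
printf '%q ' "${cmd[@]}"; echo
```

Running it as PREVIEW=1 ./encode.sh would add the 30-second limit, and the other variables can be overridden the same way.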
Masterjun
He/Him
Site Developer, Expert player (2047)
Joined: 10/12/2010
Posts: 1185
Location: Germany
feos wrote:
Just remembered another thing: x264 wants dimensions to be multiples of 4.
The scale option in ffmpeg allows floor(), so you can do floor(value/4)*4 to round to a smaller multiple of 4, or even floor((value+2)/4)*4 to round to the closest multiple of 4. (My BizHawk ffmpeg guide uses scale=floor(((ih*4)*(4/3)+1)/2)*2:(ih*4) to round to the closest multiple of 2.)
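The floor() approach can be wrapped in a small helper that prints the scale filter for a given multiple. This is only a sketch: rounded_scale and nearest_multiple are hypothetical names, and the second function repeats the rounding in shell integer arithmetic so values can be checked by hand.

```shell
#!/usr/bin/env bash
# Sketch: build a scale filter whose width is 4/3 of the height, rounded to
# the closest multiple of "mod" with floor(), as described above.
# rounded_scale and nearest_multiple are hypothetical helper names.
rounded_scale() {
  local mod=$1 half=$(( $1 / 2 ))
  printf 'scale=floor((ih*4/3+%d)/%d)*%d:ih:flags=lanczos\n' "$half" "$mod" "$mod"
}

# The same rounding in shell integer arithmetic (non-negative integers only).
nearest_multiple() {
  local value=$1 mod=$2
  echo $(( (value + mod / 2) / mod * mod ))
}

rounded_scale 4          # prints scale=floor((ih*4/3+2)/4)*4:ih:flags=lanczos
nearest_multiple 597 4   # prints 596
nearest_multiple 598 4   # prints 600
```

The shell version works on integers while FFmpeg evaluates ih*4/3 as a fraction, so the two can disagree when the fractional part pushes the sum across a boundary.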
Warning: Might glitch to credits I will finish this ACE soon as possible (or will I?)