This page outlines the essentials of creating high definition encodes for upload to streaming media sites.
Basic concepts
TODO: Speak about 8x8 resizing.
Rationale
H.264 video, as commonly encoded, uses the YV12 colorspace; conversion to this colorspace on a pixel-to-pixel basis is invariably lossy due to chroma subsampling (i.e. the colour information is stored at half the width and half the height of the full image). To avoid losing data to chroma subsampling, one can ensure that every 2x2 block in the image is a single colour, so that its colour information can be scaled down without loss. Thus, the scaling ratio has to be an even integer[1] and the scaling algorithm has to be nearest neighbour[2] in order to guarantee that each resulting 2x2 block is one colour.
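The reasoning above can be checked with a small Python sketch. It is a simplification and not part of the encoding workflow: it treats a single colour plane as a list of rows, and it assumes the subsampler averages each 2x2 block, which real converters may not do exactly. The function names and sample values are illustrative only.

```python
def point_upscale(plane, factor):
    """Nearest-neighbour upscale of a 2D list by an integer factor."""
    out = []
    for row in plane:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def subsample_420(plane):
    """Halve both dimensions by averaging each aligned 2x2 block."""
    h, w = len(plane), len(plane[0])
    return [
        [(plane[y][x] + plane[y][x + 1]
          + plane[y + 1][x] + plane[y + 1][x + 1]) // 4
         for x in range(0, w, 2)]
        for y in range(0, h, 2)
    ]


# Illustrative 4x2 colour plane (assumed values).
source = [[10, 200, 35, 60],
          [90, 15, 255, 0]]
flat = [v for row in source for v in row]

# Subsampling the plane directly mixes neighbouring colours.
direct = subsample_420(source)

# Doubling with point scaling first makes every 2x2 block one colour,
# so subsampling returns the original plane.
even_ok = subsample_420(point_upscale(source, 2)) == source

# An odd factor lets 2x2 blocks straddle source pixels, so new
# (mixed) values appear after subsampling.
odd = subsample_420(point_upscale(source, 3))
odd_mixes = any(v not in flat for row in odd for v in row)
```

With these sample values, `direct` no longer contains the original colours, `even_ok` is true, and `odd_mixes` is true, which is why the ratio must be even.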
YouTube's maximum high definition size is 1920x1080, though support has recently been added for greater resolutions, which show up as "Original". If YouTube is sent an encode with a vertical resolution greater than 1080, it will generate a 1080p encode without image borders.
YouTube also only supports frame rates of up to 30 FPS, though it will automatically downconvert videos uploaded at a higher frame rate. Several alternatives exist for addressing this, such as simple decimation or interframe blending (partial or total); the most competent at present is "ng_deblink".
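To illustrate the difference between the two simple approaches, here is a minimal Python sketch. It models each frame as a short list of pixel values and is not how "ng_deblink" works, which relies on motion masks to blend only the flickering areas. The function names and frame data are assumptions made for the example.

```python
def decimate(frames, offset=0):
    """Simple decimation: keep every second frame."""
    return frames[offset::2]


def blend_pairs(frames):
    """Total interframe blending: average each consecutive pair of frames."""
    out = []
    for i in range(0, len(frames) - 1, 2):
        a, b = frames[i], frames[i + 1]
        # Integer average, rounding halves up.
        out.append([(p + q + 1) // 2 for p, q in zip(a, b)])
    return out


# Two pixels per frame: a sprite that flickers every frame (255 / 0)
# next to a static background pixel (10).
frames = [[255, 10], [0, 10], [255, 10], [0, 10]]

kept_even = decimate(frames)            # sprite always visible
kept_odd = decimate(frames, offset=1)   # sprite never visible
blended = blend_pairs(frames)           # sprite at half brightness
```

Decimation either keeps the flickering sprite permanently or drops it entirely, depending on which frames survive. Blending keeps it visible at reduced intensity but softens everything else that moves.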
Steps
- Dump the movie using an RGB32 lossless[3] codec. Lagarith and CamStudio Codec are popular choices.
- Convert the image to the correct colorspace using point scaling (as opposed to the normally-used Lanczos scaling); this ensures that there is no "bleeding" between chroma samples.
- Set your logo's resolution to prevent its resizing by x264.
- Encode the video using x264 (fairly high quality settings are desirable as YouTube will end up re-encoding the video anyway). Resize it by a factor of 8; this reduces the file size when compressing with x264 and activates YouTube's "original mode" processing (above 1080p).
- Use OGG or WAV as your audio track.
- Correct the aspect ratio on YouTube.
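As a quick check of the resize step, the following Python sketch computes the frame size after scaling by 8 and tests whether it exceeds 1080 lines. The 256x224 source size is an assumed example and the helper names are hypothetical.

```python
def scaled_size(width, height, factor=8):
    """Frame size after point-resizing both dimensions by an integer factor."""
    return width * factor, height * factor


def exceeds_1080p(height):
    """True if the vertical resolution is greater than 1080 lines."""
    return height > 1080


# Assumed example: a 256x224 emulator dump.
out_w, out_h = scaled_size(256, 224)
above_1080p = exceeds_1080p(out_h)
```

For this example the result is 2048x1792, which is above 1080 lines, while the unscaled 224-line source is not.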
Note about 3D games
Games that make use of 3D (i.e. those on 3D consoles such as the N64, PSX, Saturn, or DS) are not HD encoded in this fashion; they are instead dumped at higher resolution and then encoded.
Example script
To get the script working you need two plugins for AviSynth: MaskTools 2 (provides the mt_* functions) and MVTools 2 (provides MSuper, MAnalyse and MMask).
Both can be obtained from this link.
a = AVISource("movie.avi").Trim(0,-0).ConvertToRGB32().deblink4.selecteven
# Replace -0 with the last frame of the movie to be displayed,
# or leave it as is if no trimming is required.
# Adjust the below items to adjust the subtitles.
# All the required information will be on the movies submission page.
# Be sure to keep the quotes everywhere, even when branch is blank ("").
# When it's not blank, put it into single quotes too ("'any%'")
game = "A Nightmare On Elm Street"
branch = ""
author = "goofydylan8"
time = "13:25.57"
rerecords = "16726"
# This sets the frame number for the subtitles to start displaying
# Preferably, the beginning of the first level
subff = 800
# Adjust height so the subtitles do not cover the action
height=0
# Set this to true if this is a handheld console, to use logo with aspect ratio other than 4:3
handheld = false
# We use YouTube's resize tag to get 4:3, and upscaling by 8 reduces the filesize greatly
g = a.PointResize(a.width*8, a.height*8)
# Logo stuffs
# Note that if this is a handheld console, you will need to change
# the .png filename to the corresponding console's logo image
# This is so that you don't end up with a distorted logo
# Use one 4:3 logo for TV consoles and logos with custom resolutions for handhelds
handheld ? Eval("""
d = ImageSource(file="handheld.png", start=0, end=119, fps=a.FrameRate).ConvertToRGB32()
""") : Eval("""
d = ImageSource(file="console.png", start=0, end=119, fps=a.FrameRate).ConvertToRGB32()
""")
e = BlankClip(d, audio_rate=a.AudioRate, channels=a.AudioChannels)
f = AudioDub(d, e).Lanczos4Resize(g.width, g.height).AssumeFPS(a.FrameRateNumerator, a.FrameRateDenominator)
last = f + g
# Subtitles
ng_bighalo(\
game + "\n" + ((branch == "") ? "" : branch + "\n") + \
"Played by " + author + "\nPlaying time: " + time + \
"\nRerecord count: " + rerecords, \
y=height, align=8, first_frame=subff, last_frame=(subff + 300), \
size=80, text_color=$00FFFFFF, halo_color=$00000000, lsp=1)
ng_bighalo(\
"This is a tool-assisted recording.\nFor details, visit https://TASVideos.org/", \
y=height, align=8, first_frame=(subff + 301), last_frame=(subff + 601), \
size=80, text_color=$00FFFFFF, halo_color=$00000000, lsp=1)
ConvertToYV24(matrix="Rec709", chromaresample="point")
ConvertToYV12(matrix="Rec709", chromaresample="point")
# Custom functions go here, never change the stuff below if you are not sure what you are doing
# nanogyth's deblink function makes TASBlend obsolete
function deblink4(clip clp, float "ratio", int "level") {
ratio = default(ratio, 2.0 /3)
assert(ratio >= 0.0 && 1.0 >= ratio,
\ "[deblink4] 1.0 >= ratio >= 0.0, it was " + string(ratio))
level = default(level, round(ratio * 257))
assert(level >= 0 && 257 >= level,
\ "[deblink4] 257 >= level >= 0, it was " + string(level))
blink=clp.ng_blinkmask_new
m01=mt_logic(blink.selectevery(4,0),blink.selectevery(4,1),mode="or").converttorgb32
m23=mt_logic(blink.selectevery(4,2),blink.selectevery(4,3),mode="or").converttorgb32
f0=layer(clp.selectevery(4,0),clp.selectevery(4,1).mask(m01),level=level)
f1=layer(clp.selectevery(4,1),clp.selectevery(4,0).mask(m01),level=level)
f2=layer(clp.selectevery(4,2),clp.selectevery(4,3).mask(m23),level=(257-level) )
f3=layer(clp.selectevery(4,3),clp.selectevery(4,2).mask(m23),level=(257-level) )
interleave(f0,f1,f2,f3)
}
function deblink3(clip clp){
blink=clp.ng_blinkmask_new
m01=mt_logic(blink.selectevery(4,0),blink.selectevery(4,1),mode="or").converttorgb32
f0=layer(clp.selectevery(4,0),clp.selectevery(4,1).mask(m01))
f1=layer(clp.selectevery(4,1),clp.selectevery(4,0).mask(m01))
interleave(f0,f1,clp.selectevery(4,2),clp.selectevery(4,3))
}
function ng_blinkmask_new(clip c,int "ml"){
ml=default(ml,128)
src=c.ConvertToYv12
super=MSuper(src, pel=1)
fvec =MAnalyse(super, isb=false, blksize=4)
bvec =MAnalyse(super, isb=true , blksize=4)
fmask=Mmask(src,fvec,kind=1,ml=ml).mt_binarize(u=-128,v=-128)
bmask=Mmask(src,bvec,kind=1,ml=ml).mt_binarize(u=-128,v=-128)
eo0_to =fmask.selectevery(2,1)
oe_from=bmask.selectevery(2,1)
front =mt_logic(eo0_to,oe_from,mode="and")
oe_to =fmask.selectevery(2,2)
eo_from=bmask.selectevery(2,2)
back =mt_logic(oe_to,eo_from,mode="and")
ee_src=src.selecteven
ee_super=MSuper(ee_src, pel=1)
ee_fvec =MAnalyse(ee_super, isb=false, blksize=4)
ee_bvec =MAnalyse(ee_super, isb=true , blksize=4)
ee_fmask=Mmask(ee_src,ee_fvec,kind=1,ml=ml).mt_binarize(u=-128,v=-128)
ee_bmask=Mmask(ee_src,ee_bvec,kind=1,ml=ml).mt_binarize(u=-128,v=-128)
ee_to =ee_fmask.trim(1,0)
ee_from=ee_bmask
ee =mt_logic(ee_to,ee_from,mode="or")
oo_src=src.selectodd
oo_super=MSuper(oo_src, pel=1)
oo_fvec =MAnalyse(oo_super, isb=false, blksize=4)
oo_bvec =MAnalyse(oo_super, isb=true , blksize=4)
oo_fmask=Mmask(oo_src,oo_fvec,kind=1,ml=ml).mt_binarize(u=-128,v=-128)
oo_bmask=Mmask(oo_src,oo_bvec,kind=1,ml=ml).mt_binarize(u=-128,v=-128)
oo_to =oo_fmask.trim(1,0)
oo_from=oo_bmask
oo =mt_logic(oo_to,oo_from,mode="or")
#to e0-o1, from o1-e2, nothing e0-e2
even_blink=mt_logic(front,ee.mt_invert,mode="and")
#to o1-e2, from e2-o3, nothing o1-o3
odd_blink =mt_logic(back,oo.mt_invert,mode="and")
interleave(even_blink, odd_blink).selectevery(1,-1)
}
# HD encodes need big subtitle fonts, but AviSynth can't enlarge the halo.
# Here's the custom subtitle function by nanogyth
# It's slow, but we have subtitles for a limited time only
function ng_bighalo(
\ clip clp,
\ string text,
\ float "x",
\ float "y",
\ int "first_frame",
\ int "last_frame",
\ string "font",
\ float "size",
\ int "text_color",
\ int "halo_color",
\ int "align",
\ int "spc",
\ int "lsp",
\ float "font_width",
\ float "font_angle",
\ int "halo_radius"
\){
x = default( x, -1)
first_frame = default(first_frame, 0)
last_frame = default( last_frame, first_frame + 299)
font = default( font, "Arial")
size = default( size, 18)
y = default( y, size)
text_color = default( text_color, $20FFFFFF)
halo_color = default( halo_color, $20000000)
align = default( align, 5)
spc = default( spc, 0)
lsp = default( lsp, 1)
font_width = default( font_width, 0)
font_angle = default( font_angle, 0)
halo_radius = default(halo_radius, 8)
invis=blankclip(clp, length=1, pixel_type="YV12")
text_mask=subtitle(invis, text, x, y, 0, 0, font, size, $00FFFFFF,
\ $80808080, align, spc, lsp, font_width, font_angle)
halo_mask=mt_logic(text_mask,
\ text_mask.mt_expand(mode=mt_circle(halo_radius)),
\ mode="xor")
h_alpha=(halo_color >= 0) ? 255 - halo_color/$01000000
\ : -(halo_color+1)/$01000000
hc=blankclip(clp, length=1, color=halo_color)
mm=hc.mask(mt_lut(halo_mask, string(h_alpha)+" x * 255 /").converttorgb32)
clp2=clp.applyrange(first_frame, last_frame, "Layer", mm)
t_alpha=(text_color >= 0) ? 255 - text_color/$01000000
\ : -(text_color+1)/$01000000
tc=blankclip(clp, length=1, color=text_color)
mm2=tc.mask(mt_lut(text_mask, string(t_alpha)+" x * 255 /").converttorgb32)
clp2.applyrange(first_frame, last_frame, "Layer", mm2)
}
Note: To import an animated logo instead of a static logo, replace the ImageSource line for your logo, for example...
d = ImageSource(file="console.png", start=0, end=119, fps=a.FrameRate).ConvertToRGB32()
with this line:
d = AVISource("logo.avi").AssumeFPS(a.FrameRate).ConvertToRGB32()
If your logo was made for 60 FPS movies, use the line below instead.
d = AVISource("logo.avi").AssumeFPS(60).ChangeFPS(a.FrameRate).ConvertToRGB32()
Command lines
The following command is used for x264 encoding and assumes you named your script hd.avs:
x264 --qp 0 --keyint 600 --range tv --colorprim bt709 --transfer bt709 --colormatrix bt709 --output video.mp4 hd.avs
YouTube doesn't seem to support FLAC anymore, so either use Raw PCM (.WAV) directly, or high bitrate Vorbis (.OGG).
oggenc2 -q 10 audio.wav
Muxing is also quite simple:
mkvmerge --engage no_simpleblocks --compression -1:none video.mp4 audio.ogg -o TAS.mkv
After uploading to YouTube, add this tag to the resulting video to correct the aspect ratio for TV consoles (not for handhelds):
yt:resize=4:3
[1] Many DOS games, and output from many games played in openMSX when using the -doublesize option, have double-scanned output, i.e. the pixels are already in 2x2 blocks; the 'even' requirement is not needed in that case.
[2] Use of a weighted average algorithm is also acceptable, as it results in the exact same image as nearest neighbour when scaling up by an integer factor.
[3] This does NOT include x264 lossless, which performs an RGB32-to-YV12 colorspace conversion (unless specifically set to i444 or RGB mode); this is both irreversible and lossy.
[4] For games meeting the criteria set out in footnote 1, the doubling can be omitted as the source material is already scaled by 2x.
[5] For some consoles (Sega Genesis in particular), this can yield resolutions above and beyond what YouTube can accept. One possible alternative to the formula presented here is to calculate the even integer mentioned for handhelds, scale by half that amount, scale the result to a 4:3 aspect ratio, and scale once more by a factor of two; this yields 2x2 blocks, but not all pixels will be the same size.