Post by feos

Posted: 12/8/2012 5:44 PM

Post subject: archive upload from the command line

nanogyth

Player (66)

Joined: 4/21/2011
Posts: 232

This is a work in progress. I can create an archive page with this, but it is a "data" entry in community_texts.

curl --header "authorization: LOW access:secret" ^
     -i -X PUT http://s3.us.archive.org/lethalenforcers-tas-phil/

I uploaded the files with this, but it didn't update the metadata.

curl --location ^
     --header "authorization: LOW access:secret" ^
     --header "x-archive-meta01-collection:speed_runs" ^
     --header "x-archive-meta-mediatype:movies" ^
     --header "x-archive-queue-derive:0" ^
     --upload-file lethalenforcers-tas-phil.mkv ^
     http://s3.us.archive.org/lethalenforcers-tas-phil/lethalenforcers-tas-phil.mkv

This is the flag that should create the page and upload at the same time. I tried it a few different ways before resorting to PUT, but I couldn't get it to work.

--header "x-amz-auto-make-bucket:1"

This is the flag that should enable changes of metadata, but I couldn't get it working either.

--header "x-archive-ignore-preexisting-bucket:1"

No progress bar, which is frustrating.
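For anyone who would rather script this than keep it in a batch file, here is a rough Python equivalent of the upload request. It is only a sketch and has not been tested against archive.org: the endpoint, the `LOW access:secret` form and the header names are the ones from the curl commands above, while `build_upload` and `upload` are names made up for illustration. Whether `x-amz-auto-make-bucket` behaves any better when sent this way is an open question.

```python
import os
import urllib.request

S3_BASE = "http://s3.us.archive.org"  # endpoint used in the curl commands above


def build_upload(item, filename, access, secret, collection, mediatype,
                 derive=False, make_bucket=True):
    """Return (url, headers) for uploading one file to an archive.org item."""
    url = "{}/{}/{}".format(S3_BASE, item, filename)
    headers = {
        "authorization": "LOW {}:{}".format(access, secret),
        "x-archive-meta01-collection": collection,
        "x-archive-meta-mediatype": mediatype,
        "x-archive-queue-derive": "1" if derive else "0",
    }
    if make_bucket:
        # The flag that should create the item and upload in one request.
        headers["x-amz-auto-make-bucket"] = "1"
    return url, headers


def upload(path, url, headers):
    """PUT the file at `path` to `url` and return the HTTP status code."""
    headers = dict(headers)
    # An explicit length lets urllib stream the file instead of chunking it.
    headers["Content-Length"] = str(os.path.getsize(path))
    with open(path, "rb") as f:
        req = urllib.request.Request(url, data=f, headers=headers, method="PUT")
        with urllib.request.urlopen(req) as resp:
            return resp.status
```

Calling `build_upload("lethalenforcers-tas-phil", "lethalenforcers-tas-phil.mkv", access, secret, "speed_runs", "movies")` gives the same URL and headers as the second curl command, plus the auto-make-bucket flag.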

Posted: 12/8/2012 5:46 PM

natt

Editor, Emulator Coder, Site Developer

Joined: 5/11/2011
Posts: 1108
Location: Murka

I think this is a very useful thing to pursue. I don't have any other useful input, though ><

Posted: 12/8/2012 5:52 PM

feos

Site Admin, Skilled player (1255)

Joined: 4/17/2010
Posts: 11486
Location: Lake Chargoggagoggmanchauggagoggchaubunagungamaugg

In advance, will it be faster than manual uploading? Do you still need to enter filenames (which are different each time)? How to pick the Item name from the submission page automatically?

Warning: When making decisions, I try to collect as much data as possible before actually deciding. I try to abstract away and see the principles behind real world events and people's opinions. I try to generalize them and turn into something clear and reusable. I hate depending on unpredictable and having to make lottery guesses. Any problem can be solved by systems thinking and acting.

Posted: 12/8/2012 6:20 PM

nanogyth

Player (66)

Joined: 4/21/2011
Posts: 232

feos wrote:

In advance, will it be faster than manual uploading?

Probably no change in kbps, but with a programmatic interface you could start this upload as soon as the first encode is done, and it will likely finish before the next encode does.
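The overlap could be sketched like this in Python. It is a minimal illustration, not a finished tool: `encode_and_upload` is a made-up name, and `encode` and `upload` stand for whatever functions actually run the encoder and send the file.

```python
from concurrent.futures import ThreadPoolExecutor


def encode_and_upload(jobs, encode, upload):
    """Run encodes one after another, uploading each result in the background.

    `encode(job)` must return the output filename and `upload(filename)`
    must send it; both are supplied by the caller.
    """
    pending = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        for job in jobs:
            filename = encode(job)  # blocks until this encode is done
            # Hand the finished file to the uploader and move on to the next encode.
            pending.append(pool.submit(upload, filename))
        # Leaving the with-block waits for the last upload to finish.
    return [future.result() for future in pending]
```

With a single worker the uploads still go out one at a time, so they do not compete with each other for bandwidth.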

How to pick the Item name from the submission page automatically?

It is part of the url. By PUT-ing to http://s3.us.archive.org/lethalenforcers-tas-phil/ I created an item called lethalenforcers-tas-phil. The title can be set separately.

Do you still need to enter filenames (which are different each time)?

Hopefully you'd only enter it once (or maybe scrape it from the #S page).

P.S. This is a Python script I made a while ago that can go from a movie number to the author's nickname.

Language: python
import requests
from bs4 import BeautifulSoup
import re

movie_number = 22
movie_url = "http://tasvideos.org/" + str(movie_number) + "M.html"
# Name the parser explicitly so bs4 does not have to guess one.
movie_page_src = BeautifulSoup(requests.get(movie_url).text, "html.parser")

# The movie page links to its submission as "Submission #NNNN".
submission_tag = movie_page_src.find("a", text=re.compile("^Submission #[0-9]+$"))
submission_url = "http://tasvideos.org/" + submission_tag.get('href')
submission_page_src = BeautifulSoup(requests.get(submission_url).text, "html.parser")

# The nickname sits in the node right after the "Author's nickname: " header cell.
nickname_tag = submission_page_src.find("th", text=re.compile("^Author's nickname: $"))
nickname = nickname_tag.next_sibling.text
print(nickname)

Posted: 12/8/2012 6:30 PM

nanogyth

Player (66)

Joined: 4/21/2011
Posts: 232

The encode name is unique enough for archive. Letting archive convert the title to the item name makes for some long and ugly urls.

Posted: 12/8/2012 6:33 PM

feos

Site Admin, Skilled player (1255)

Joined: 4/17/2010
Posts: 11486
Location: Lake Chargoggagoggmanchauggagoggchaubunagungamaugg

On a side note, what to do with cross-platform TASes of the same title by the same author? They're allowed now, but how to name encodes for, say, SNES Ghouls'n'Ghosts by Nach and Arcade Ghouls'n'Ghosts by Nach? natt?

Posted: 12/8/2012 11:40 PM

Ilari

Emulator Coder, Skilled player (1113)

Joined: 5/1/2010
Posts: 1217

nanogyth wrote:

P.S. This is a Python script I made a while ago that can go from a movie number to the author's nickname.

I presume Python has a JSON decoder. If so, maybe use things like http://tasvideos.org/subinfo/2000M.json and http://tasvideos.org/subinfo/3500S.json. Those are designed to be machine-parseable. The submission id is the second element of the id field in xxxxM.json files. The submission author nickname is the first element of the player field in xxxxS.json files.

Also, until the problem of it uploading to community_texts is resolved, please don't use it. opensource_movies, even if it is the wrong category, is considerably less problematic (of course, speed_runs is preferred).
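The JSON lookup described above might look something like this in Python. It is a sketch that assumes the field layout is exactly as described (submission id as the second element of `id`, nickname as the first element of `player`); the function names are made up, and nothing here has been checked against the live files.

```python
import json
import urllib.request

SUBINFO = "http://tasvideos.org/subinfo/"  # base URL from the post above


def fetch_json(name):
    """Download and decode one subinfo file, e.g. "2000M.json"."""
    with urllib.request.urlopen(SUBINFO + name) as resp:
        return json.loads(resp.read().decode("utf-8"))


def submission_id(movie_info):
    """Submission id: second element of the "id" field of an xxxxM.json file."""
    return movie_info["id"][1]


def author_nickname(submission_info):
    """Nickname: first element of the "player" field of an xxxxS.json file."""
    return submission_info["player"][0]


def nickname_for_movie(movie_number, fetch=fetch_json):
    """Go from a movie number to the author's nickname via two JSON files."""
    movie_info = fetch("{}M.json".format(movie_number))
    sub_info = fetch("{}S.json".format(submission_id(movie_info)))
    return author_nickname(sub_info)
```

The `fetch` parameter is only there so the lookup can be tried against saved copies of the files without hitting the site.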

Posted: 12/18/2012 4:38 AM

nanogyth

Player (66)

Joined: 4/21/2011
Posts: 232

The ftp version seems to be better behaved. (docs) The first curl is used to upload each of the files. The second curl gets archive.org to process them.

set "curl=C:\mf\NGcode\lib\curl.exe"
set "user=jstout.physics@gmail.com"
set "pass=12345"
set "item=deathduel-tas-trask"
set "file1=%item%_files.xml"
set "file2=%item%_meta.xml"
set "file3=%item%.mkv"
set "file4=%item%_10bit444.mkv"
set "file5=%item%_512kb.mp4"

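rem Upload every file listed above, one curl call per file.
for %%F in (%file1% %file2% %file3% %file4% %file5%) do (
    %curl% -v -T %%F --user %user%:%pass% ^
           --ftp-create-dirs ^
           ftp://items-uploads.archive.org/%item%/
)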

pause

%curl% -v "http://archive.org/services/contrib-submit.php?user_email=%user%&server=items-uploads.archive.org&dir=%item%"

pause
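The same two steps could be written in Python with only the standard library. This is an untested sketch: the host name and the contrib-submit.php parameters are copied from the batch file above, the function names are made up, and treating a failed `mkd` as "directory already exists" is an assumption.

```python
import ftplib
import urllib.parse
import urllib.request

FTP_HOST = "items-uploads.archive.org"
SUBMIT_URL = "http://archive.org/services/contrib-submit.php"


def submit_url(user_email, item, server=FTP_HOST):
    """Build the contrib-submit URL that asks archive.org to process `item`."""
    query = urllib.parse.urlencode(
        {"user_email": user_email, "server": server, "dir": item})
    return SUBMIT_URL + "?" + query


def ftp_upload(user, password, item, filenames):
    """Upload each file into a directory named after the item."""
    with ftplib.FTP(FTP_HOST, user, password) as ftp:
        try:
            ftp.mkd(item)  # plays the role of curl's --ftp-create-dirs
        except ftplib.error_perm:
            pass  # assumed to mean the directory already exists
        ftp.cwd(item)
        for name in filenames:
            with open(name, "rb") as f:
                ftp.storbinary("STOR " + name, f)


def process(user_email, item):
    """Request the URL that gets archive.org to process the uploaded files."""
    with urllib.request.urlopen(submit_url(user_email, item)) as resp:
        return resp.status
```

`urlencode` percent-encodes the `@` in the e-mail address, which the curl version sends as-is; the server should see the same value either way.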

deathduel-tas-trask_meta.xml

Language: xml
<metadata>
  <mediatype>movies</mediatype>
  <collection>opensource_movies</collection>
  <title>Genesis Death Duel (USA) in 08:24.73 by Trask</title>
  <identifier>deathduel-tas-trask</identifier>
  <creator>Trask</creator>
  <rerecord_count>1048</rerecord_count>
  <subject>Tool-Assisted Speedrun; Genesis; Death Duel; Trask</subject>
  <runtime>10 minutes</runtime>
  <description>
''Death Duel'' is a side-scrolling first person shooter where the player controls a mech versus a set of nine other mechs. There is a bonus stage after each battle where the player must qualify for the next battle. The player has to purchase repairs, weapons and ammunition with the credits that are earned for destroying the enemy body parts and completing the rounds quickly.

In this run, Trask takes out the enemies mostly through well placed missiles and the more expensive homing missiles. At one point a missile is used to punch through a wall to open up a path for more missiles.
  </description>
  <contact>
This is a tool-assisted speedrun.
For more information visit http://tasvideos.org/1291S.html
  </contact>
</metadata>

deathduel-tas-trask_files.xml

Language: xml
<files>
  <file name="deathduel-tas-trask.mkv">
    <format>Matroska</format>
  </file>
  <file name="deathduel-tas-trask_10bit444.mkv">
    <format>Matroska</format>
  </file>
  <file name="deathduel-tas-trask_512kb.mp4">
    <format>512Kb MPEG4</format>
  </file>
</files>

If this doesn't prevent the auto derived files, then simply "<files/>" would work as well.
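Since the file list changes with every encode, the _files.xml could be generated instead of typed by hand. A minimal sketch, assuming the format strings are the ones shown above ("Matroska", "512Kb MPEG4"); `build_files_xml` is a made-up name.

```python
import xml.etree.ElementTree as ET


def build_files_xml(formats):
    """Return _files.xml text for a {filename: format} mapping."""
    root = ET.Element("files")
    for name, fmt in formats.items():
        entry = ET.SubElement(root, "file", name=name)
        ET.SubElement(entry, "format").text = fmt
    return ET.tostring(root, encoding="unicode")


item = "deathduel-tas-trask"
print(build_files_xml({
    item + ".mkv": "Matroska",
    item + "_10bit444.mkv": "Matroska",
    item + "_512kb.mp4": "512Kb MPEG4",
}))
```

Passing an empty mapping produces an empty `files` element, which matches the fallback mentioned above.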

Posted: 12/18/2012 6:13 PM

Ilari

Emulator Coder, Skilled player (1113)

Joined: 5/1/2010
Posts: 1217

nanogyth wrote:

deathduel-tas-trask_files.xml

Language: xml
<files>
  <file name="deathduel-tas-trask.mkv">
    <format>Matroska</format>
  </file>
  <file name="deathduel-tas-trask_10bit444.mkv">
    <format>Matroska</format>
  </file>
  <file name="deathduel-tas-trask_512kb.mp4">
    <format>512Kb MPEG4</format>
  </file>
</files>

If this doesn't prevent the auto derived files, then simply "<files/>" would work as well.

As far as I can tell from the docs, derivations are disabled using a file called "_rules.conf" or something like that (documented on the deriver page).

Posted: 12/18/2012 9:12 PM

nanogyth

Player (66)

Joined: 4/21/2011
Posts: 232

Then the other question is whether disabling derivatives is beneficial. Deriving from both the primary and the 10bit is a waste, but it might be worthwhile to let archive make one set of derivatives. Next thing I'll try is making the archive with the ftp upload and the 512. Then use the s3 upload for the primary and 10bit with the no derive flag set on the primary.

Posted: 12/23/2012 7:40 PM

nanogyth

Player (66)

Joined: 4/21/2011
Posts: 232

feos wrote:

On a side note, what to do with cross-platform TASes of the same title by the same author? They're allowed now, but how to name encodes for, say, SNES Ghouls'n'Ghosts by Nach and Arcade Ghouls'n'Ghosts by Nach?

We could refer to the archive item by its submission number:
archive.org/download/TASVideos-181/jackiechankungfu-tasv2-jeffc.mkv

The old way is ugly:
archive.org/download/NesJackieChansActionKungFuusaIn1738.25ByArc/jackiechankungfu-tasv2-jeffc.mkv

What I've been doing recently is redundant and risks name clashes:
archive.org/download/jackiechankungfu-tasv2-jeffc/jackiechankungfu-tasv2-jeffc.mkv

Here is the Python I've been working on recently. It turns the submission number into a _meta.xml file.

Language: python
import requests
import json
from lxml import objectify, etree
import re

sub_number = 2136 #1582 #1319 #3772
sub_url = "http://tasvideos.org/subinfo/" + str(sub_number) + "S.json"
data = json.loads(requests.get(sub_url).text)

console = data['system'][0]
console_long = data['system'][1]
game = data['game'][0]
branch = data['game'][1]
version = data['game'][2]
author = data['player'][0]
rerecord = data['movie'][0][1]
secs = data['movie'][0][2]

hours, minutes, seconds = int(secs//3600), int(secs%3600//60), secs%60
if hours > 1:
    time_form = '{0}:{1:02d}:{2:05.2f}'
    runt_form = '{0} hours {1} minutes'
elif hours > 0:
    time_form = '{0}:{1:02d}:{2:05.2f}'
    runt_form = '1 hour {1} minutes'
elif minutes == 1:
    time_form = '{1:02d}:{2:05.2f}'
    runt_form = '1 minute {2} seconds'
else:
    time_form = '{1:02d}:{2:05.2f}'
    runt_form = '{1} minutes {2} seconds'
time = time_form.format(hours, minutes, seconds)

if branch:
    title_form = '{0} {1} ({2}) "{3}" in {4} by {5}'
    ident_form = '{0}-tas-{1}-{2}'
else:
    title_form = '{0} {1} ({2}) in {4} by {5}'
    ident_form = '{0}-tas-{2}'
multi_author = re.sub(', and |, | & ', '_', author)
raw_ident = ident_form.format(game, branch, multi_author)

E = objectify.ElementMaker(annotate=False)
meta = E.metadata(
  E.mediatype('movies'),
  E.collection('opensource_movies'),
  (E.title(title_form.
           format(console, game, version, branch, time, author))),
  E.identifier(re.sub('[^a-z0-9._-]', '', raw_ident.lower())),
  E.creator(author),
  E.rerecord_count(str(rerecord)),
  (E.subject('Tool-Assisted Speedrun; {}; {}; {}'.
             format(console_long, game, author))),
  E.runtime(runt_form.format(hours, minutes, int(seconds))),
  E.description('This is a tool-assisted speedrun.'),
  (E.contact('For more information visit http://tasvideos.org/{}S.html'.
             format(sub_number)))
)
print(etree.tostring(meta, pretty_print=True).decode("utf-8"))

#identifier = meta.identifier.text
#with open(identifier + '_meta.xml', 'w') as f:
#    f.write(etree.tostring(meta, pretty_print=True).decode("utf-8"))
