I think there's also enough difference between SMS and Gen/MD controllers to justify a different file format.
Unless we're trying to develop a unified Snes/Genesis/Nes/N64/GameBoy/GameBoyAdvance movie format.
How fleeting are all human passions compared with the massive continuity of ducks.
@Nitsuja:
I like the idea of a unique identifier to link movies and states. By date do you mean epoch seconds?
@Truncated:
Just for implementation's sake, I'd like to see no compression at all in the format, other than the requirement that Gens speak zip. It's true that there's not much space lost in compressing an already-compressed string, but there's no reason to add the extra complexity to the implementation.
I like the changes, except that the SMS stuff seems premature. I prefer that we make the format extensible enough to allow SMS compatibility, but don't actually do anything until a usable emulator is on the horizon.
State names are a good idea, but sorting by frame can easily be done internally by Gens at runtime.
I see no point in using a multitap when Gens already supports 8 controllers as is. I like less complexity.
I agree that some chunks need to be merged. There are definitely too many small chunks with similar or related data. I'm going to go on a little merging spree.
@upthorn:
I can't tell if the suggestion of a unified format was sarcastic or not, but I knew someone was going to say it eventually. It sounds like a good topic for debate. There's no reason not to make this extensible enough that new emulators can use it, but the formats used by most other rerecording emulators are adequate for their purpose.
Edit: I've got some chunks merged, although the new game requirements chunk looks intimidating. Other than the psychological effect, I don't think this is a bad thing.
I'd also like some more discussion about the text-based format, especially the advantages of the binary format over it.
Not entirely sarcastic, but I was trying to point out, in a round-about way, that adding SMS stuff is premature, and most likely beyond the scope of GMV.
I do like the idea of a unified format, but this discussion isn't about that.
@upthorn and IdeaMagnate, about SMS:
We can very well leave it out of the specs, but even if it's in it doesn't matter for what needs to be programmed for Gens. Anyway, I think it demonstrates that SMS compatibility could be fit into the format easily.
IdeaMagnate> I see no point in using a multitap when Gens already supports 8 controllers as is. I like less complexity.
I looked through it, and it appears that the way Gens handles 8 controllers is actually by using two multitaps - this is the setting of "pad" or "teamplay" in the control menu. This variable is not saved in the current GMV format, which is a problem. (Looking back, I realize this must have been why Lost Vikings 3 players had such problems syncing.)
Gens does not allow a port to not be connected to anything, but we still want to skip saving data for all the controllers we don't use. I'll change that part in a minute, let me know what you think.
>I'd also like some more discussion about the text-based format, especially the advantages of the binary format over it.
What were you planning? Having the text-based format in the actual GM2 file, instead of a binary format?
I was thinking that the text format would be a direct uuencoding (binary -> non-whitespace ASCII) of everything except frame data and the movie crc. The frame data would be translated to something like upthorn's suggestion or gmv2txt's output, and the crc would be left out completely until the movie was saved as a binary GM2.
This would make it relatively easy to write the code to load the movie and would allow for easy manual and automated hexediting.
Before I forget:
It seems that Gens internally stores button presses as 0 bits. Converting them when reading the movie would be trivial, but it would unnecessarily complicate reading and writing movies for very little gain.
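For illustration, a minimal sketch in C of the conversion being discussed, assuming one byte per controller with one bit per button. The function name is hypothetical and not taken from Gens.

```c
#include <stdint.h>

/* Flip between "pressed = 0" (how Gens reportedly stores input)
   and "pressed = 1". The same operation converts in both directions. */
static uint8_t invert_buttons(uint8_t raw)
{
    return (uint8_t)~raw;
}
```

Applying it twice gives back the original byte, so one helper would cover both reading and writing.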
[edit]
Woops, double-posted. Deleted the second.
Nice changes IdeaMagnate. It's taking shape now.
The intro says that the standard should be no null terminator, but at the moment almost all the strings have it specified anyway. I say we pick either or, less coding and less to think about for reading a string.
I think button pressed = 1 is better, but never mind. It's not important.
About the text format: Do we need to write a complete text format specification too, how every byte is represented in text and which errors can be tolerated from the user when reading it? That sounds like a lot of work.
The text format should be well-defined, but it may be easier to come up with something after implementing the current proposal.
As is, there's a minor question about internal BIOS names and some ambiguity about how frames are stored, but it looks like it's almost ready for implementation.
After reading up some on byte alignment, it seems to me like the spec should be much more careful about alignment or shouldn't take it into consideration at all.
I also don't understand the purpose of the byte alignment flag in the frame chunk. There are no guarantees about how chunks will be aligned, so it seems pointless to align the data within a chunk. I understand the spec as saying that each frame will have a variable length (but constant within a given movie) and will need to be read bytewise anyway.
The specification seems to have settled down to stable form now. Is everyone satisfied with how it turned out or do you think there is still something that needs to be addressed?
Because if not, perhaps we should start thinking about actually implementing it.
Since I don't have wiki editing permissions, I'll put my suggestions for proposals for the GM2 format here.
I have a suggestion.
Rather than explicitly stating that a controller is inactive, we could record all 8 controllers dynamically, by adding a byte at the beginning of each frame stating whether each controller has input for that frame:
Frame format
1 Byte: Controllers recorded bitwise booleans
0x01 = Controller 1
0x02 = Controller 2
0x04 = Controller 1B
0x08 = Controller 1C
0x10 = Controller 1D
0x20 = Controller 2B
0x40 = Controller 2C
0x80 = Controller 2D
1 = recorded, 0 = not recorded
then the actual controller information.
If the controller has no input during that frame, its recorded flag will be set to 0.
Example (3 button controllers for simplicity)
03 F7 FD 02 FD 13 FD FD 7F
would produce the following input
Frame 1: p1 Right, p2 Left
Frame 2: p2 Left
Frame 3: p1 Left, p2 Left, p5 Start
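The example above could be decoded with something like the following sketch in C. The `Frame` struct and `decode_frame` are hypothetical names, and the sketch assumes 3-button controllers (one byte each) with pressed buttons stored as 0 bits, so a controller that is not recorded defaults to 0xFF.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PADS 8

/* Hypothetical in-memory frame: one byte per controller,
   0xFF = nothing pressed (assumes pressed buttons are 0 bits). */
typedef struct {
    uint8_t pad[MAX_PADS];
} Frame;

/* Decode one frame starting at data[pos]. Returns the number of bytes
   consumed, or 0 if the buffer ends in the middle of a frame. */
static size_t decode_frame(const uint8_t *data, size_t len, size_t pos, Frame *out)
{
    if (pos >= len)
        return 0;
    uint8_t mask = data[pos];               /* "controllers recorded" byte */
    size_t used = 1;
    for (int i = 0; i < MAX_PADS; i++) {
        out->pad[i] = 0xFF;                 /* default: no input this frame */
        if (mask & (1u << i)) {
            if (pos + used >= len)
                return 0;                   /* truncated frame */
            out->pad[i] = data[pos + used];
            used++;
        }
    }
    return used;
}
```

Calling it repeatedly on `03 F7 FD 02 FD 13 FD FD 7F`, advancing `pos` by the return value each time, consumes 3, 2 and 4 bytes for the three frames.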
The advantage of this is that it wouldn't necessitate some sort of configuration dialog on the creation of a new movie, asking which controllers to record.
The other advantage is that a player could use the high numbered controllers for occasional input without multiplying the total filesize.
The disadvantages, then, are that framesize is increased from 1-16 bytes to 2-17 bytes (doubling size of single player 3 button controller input), and necessitates at least a single-byte read between each frame, if not simply byte-by-byte input reading.
Of course, the current plan of a variable number of bytes per controller already suggests a slight preference for byte-by-byte read-ins...
I suggest instead the following:
Button presses are stored as comma separated strings, 1 character for player, 1-12 characters for buttons pressed
(Both the following are valid frames)
42:1DR,2UA,5UDLRABCSXYZM
Hexed this in:1CMLB,2
This is slightly more work from a programming perspective (although it's pretty easy to implement with some StrChr and StrRChr), but it would simplify editing an order of magnitude further than just having an ASCII based format simplifies it over the current system.
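A minimal sketch in C of how such a frame line might be parsed, using the standard `strchr` rather than the Windows `StrChr`. The function name, the bit order of the button letters, and the choices to accept lowercase letters and to OR together repeated segments for the same controller are all assumptions made for illustration, not part of the proposal.

```c
#include <ctype.h>
#include <stdint.h>
#include <string.h>

#define MAX_PADS 8

/* Letters taken from the example frame; the bit order is an assumption. */
static const char BUTTONS[] = "UDLRABCSXYZM";

/* Parse "label:1DR,2UA" into pressed[0..7], one bit per button (1 = pressed).
   Returns 0 on success, -1 if the line has no colon (treated as invalid). */
static int parse_text_frame(const char *line, uint16_t pressed[MAX_PADS])
{
    memset(pressed, 0, MAX_PADS * sizeof pressed[0]);
    const char *p = strchr(line, ':');
    if (p == NULL)
        return -1;
    p++;
    while (*p != '\0') {
        int pad = *p - '1';                 /* '1'..'8' -> 0..7 */
        const char *end = strchr(p, ',');
        if (end == NULL)
            end = p + strlen(p);
        if (pad >= 0 && pad < MAX_PADS) {
            for (const char *c = p + 1; c < end; c++) {
                const char *hit = strchr(BUTTONS, toupper((unsigned char)*c));
                if (hit != NULL)            /* unknown characters are ignored */
                    pressed[pad] |= (uint16_t)(1u << (hit - BUTTONS));
            }
        }
        p = (*end == ',') ? end + 1 : end;  /* move to the next segment */
    }
    return 0;
}
```

Everything before the first colon is treated as a free-form label, which is what lets a line like `Hexed this in:1CMLB,2` work as a frame.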
A button configuration dialog is necessary. Some games act differently depending on which controllers are detected. It's not a good idea to depend on the user never accidentally pressing x, y, z or mode to determine a potentially important factor in gameplay. Additionally, Gens needs explicit controller configuration information in Controller_x_Type, so this would generally mean looking through the whole frame chunk to find that information for each controller.
This would also result in more complex frame chunk reading code. This is a bad thing because the frame count can realistically top 200K frames. The code for this chunk should be as highly optimized as possible.
Using odd controllers occasionally (e.g for Tails in Sonic 2) wouldn't hurt the compressed filesize much. I don't see uncompressed file size as a significant issue since it's intended only for local (i.e. user's hdd) storage.
I like your text frame chunk proposal. It will usually result in fewer characters, which will make chunk processing faster. I think it also has better potential for optimized C code, which is important. I'll try implementing it and seeing how it fares against my implementation of the current proposal.
BTW, please be sure to bring up any proposals for significant changes here for discussion before putting them on the wiki. The purpose of the wiki page is to record what's already been agreed on.
No, no, I wasn't advocating removal of the byte for controller types, I was just advocating that we don't use the code for explicitly inactive controllers unless that control port is set to "empty" (not currently possible in Gens).
I think this may be based on your misunderstanding of my proposal.
The way I have the code for reading the input data set up is a fairly simple 8-iteration for loop. I can show you the code later if you want, but I find it unlikely to cause any sort of slowdown regardless of the number of frames. Unless we're doing a file-seek to frame offset for every frame, which seems rather silly to me -- it should only be done when loading states, as input data is sequential.
I was planning on using footnotes like bisqwit, truncated, and nitsuja have been doing.
I think I understand you better now, but I still have an issue with random seeks. If the frame size can only be dynamically determined, either a linear search from the beginning or something more complex will be necessary to find a given frame when playing/recording a movie from a saved state, which is a very common use case. The nice thing about having a constant-length frame is that finding a random frame takes one simple line of code with constant time.
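The "one simple line" for constant-length frames could look like this sketch in C. The parameter names `frames_start` and `frame_size` are hypothetical; a real reader would presumably take them from the movie header.

```c
#include <stddef.h>

/* Byte offset of frame `index` when every frame in the movie has the
   same size. `frames_start` is the offset of the first frame's data. */
static size_t frame_offset(size_t frames_start, size_t frame_size, size_t index)
{
    return frames_start + index * frame_size;
}
```

The cost is the same for frame 5 and frame 200,000, which is the constant-time property being argued for.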
Dynamically-determined frames would require more code with greater complexity, and I don't see any benefits to justify the change.
Take the following (unlikely but hopefully valid) example with several 3-button controllers. Each bold value starts a frame.
03 05 01 03 04 02 02 02 07 06 02 01
When searching for the beginning of an earlier frame, the code would need to either maintain a list of pointers to the start of every nth frame or be smart enough to search backwards, which would mean dealing with a significant amount of ambiguity. This is certainly possible, but is also a fertile source of bugs.
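For comparison, here is a sketch in C of the "list of pointers to every nth frame" approach for the variable-length layout, assuming 3-button controllers (one byte each after the mask byte). The function names are hypothetical. Read from the start, the example bytes above would split into frames beginning at offsets 0, 3, 6 and 8.

```c
#include <stddef.h>
#include <stdint.h>

/* Size in bytes of a frame whose mask byte is `mask`:
   the mask itself plus one byte per set bit. */
static size_t frame_size_from_mask(uint8_t mask)
{
    size_t n = 1;
    while (mask) {
        n += mask & 1u;
        mask >>= 1;
    }
    return n;
}

/* Walk the stream once and record the offset of every `step`-th frame
   (step must be at least 1). Writes at most `cap` offsets and returns
   the total number of frames found. */
static size_t build_index(const uint8_t *data, size_t len, size_t step,
                          size_t *offsets, size_t cap)
{
    size_t pos = 0, frame = 0;
    while (pos < len) {
        if (frame % step == 0 && frame / step < cap)
            offsets[frame / step] = pos;
        pos += frame_size_from_mask(data[pos]);
        frame++;
    }
    return frame;
}
```

Seeking to a frame then means jumping to the nearest indexed offset and walking forward from there, which is the extra bookkeeping that constant-length frames avoid.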
Having variable-length frames in the file for text mode is useful, but in binary mode, isn't that being compressed anyway, eliminating any benefit it might have had? You'll end up needing (or at least wanting for convenience) it to be constant-length per frame in the emulator's memory regardless, otherwise it will take extra conversions or processing to deal with.
Also, I assume it won't work "natively" in text mode, right? As in, it should do a conversion on the file input/output that happens occasionally, instead of parsing text every frame of the movie. That should usually be faster and should definitely be easier to implement.
Yeah, I guess you're right, the variable frame length isn't really that useful. I had forgotten (and somehow skimmed past the place on the GM2 page that says it) that we were going to be compressing the file anyway.
nitsuja, that's how I've been thinking about it. A binary<->text conversion would only happen when loading a new movie or explicitly saving one as text.
Most re-recording emulators also save in these or similar situations to prevent lost work: Loading a savestate, or recording more than 500 consecutive frames. I don't think that will be too slow, but if it is (perhaps for longer movies), then an alternative is to save a binary backup movie in those cases, and check that file upon loading the text movie to restore lost data.
Other chunks can become large, but manually scrolling to a certain frame is inefficient, especially when even Notepad has a usable search feature. Users will only need to search for "x:" to find frame x, since ':' isn't part of the base64 character set. Saving the file may take a second or two, but that's not a major inconvenience.
I don't know how useful this would be. The way I'd implement it, your example frame would be equivalent to ":1rlacb". If you want to comment out a frame, you could just change the colon to a semicolon and ensure that the implementation silently ignores invalid frames.
I don't like case-insensitivity because it's an extra check on every button character in the frame chunk. If it becomes a problem then it's easy enough to implement, but I see it as unnecessary cruft until then.
I don't see how encountering the same button more than once in a frame could cause a problem either. ;)
Notepad can't load files larger than a few hundred k. And when dealing with filesizes in MB or larger, most text editors (on Windows at least) become sluggish with loading, finding, or saving.
Also, I strongly suspect that base64 decoding a 4 megabyte string will be somewhat sluggish.
ideamagnate wrote:
> in the wiki upthorn also wrote:
> > What happens if it encounters the same controller number in multiple segments? I think that it should only pay attention to the last segment for each controller. EG: "testframe:1r,1lr,1lra,1ra,1rabc,1a" should produce the input "a" on controller 1. This would enable the user to test multiple sequences of input, and easily revert to prior versions if one version turns out slower.
> I don't know how useful this would be. The way I'd implement it, your example frame would be equivalent to ":1rlacb". If you want to comment out a frame, you could just change the colon to a semicolon and ensure that the implementation silently ignores invalid frames.
The other reason I prefer it this way is because it's faster to ignore redundant data.
In the implementation that only parses one segment per controller per frame, you can set a boolean flag for each controller, which causes it to skip to the next loop iteration if true.
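One way to read this is to scan the segments from last to first and keep a "seen" flag per controller, so the last segment in the line wins and earlier ones are skipped without being parsed. The following C sketch works under that reading; the function name, the bit order of the button letters, and the acceptance of lowercase letters are assumptions for illustration.

```c
#include <ctype.h>
#include <stdint.h>
#include <string.h>

#define MAX_PADS 8

static const char BUTTONS[] = "UDLRABCSXYZM";   /* bit order is an assumption */

/* Parse the segments after the colon from last to first. The first segment
   seen for a controller (the last one in the line) wins; earlier segments
   for that controller are skipped. Returns -1 if there is no colon. */
static int parse_last_wins(const char *line, uint16_t pressed[MAX_PADS])
{
    int seen[MAX_PADS] = {0};
    memset(pressed, 0, MAX_PADS * sizeof pressed[0]);
    const char *start = strchr(line, ':');
    if (start == NULL)
        return -1;
    start++;
    const char *end = start + strlen(start);     /* one past current segment */
    while (end > start) {
        const char *seg = end;
        while (seg > start && seg[-1] != ',')
            seg--;                               /* first char of the segment */
        int pad = (seg < end) ? *seg - '1' : -1; /* empty segment -> invalid */
        if (pad >= 0 && pad < MAX_PADS && !seen[pad]) {
            seen[pad] = 1;
            for (const char *c = seg + 1; c < end; c++) {
                const char *hit = strchr(BUTTONS, toupper((unsigned char)*c));
                if (hit != NULL)
                    pressed[pad] |= (uint16_t)(1u << (hit - BUTTONS));
            }
        }
        end = (seg > start) ? seg - 1 : start;   /* step back over the comma */
    }
    return 0;
}
```

With the frame `testframe:1r,1lr,1lra,1ra,1rabc,1a`, only the final `1a` segment is parsed for controller 1 and the five earlier segments are skipped by the flag check.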
ideamagnate wrote:
> upthorn the prolific wrote:
> > These [buttons] should probably be case insensitive. Also, the routine should be able to encounter the same button multiple times without barfing. (although I don't see how the routine would barf in that scenario unless it were specifically designed to)
> I don't like case-insensitivity because it's an extra check on every button character in the frame chunk. If it becomes a problem then it's easy enough to implement, but I see it as unnecessary cruft until then.
> I don't see how encountering the same button more than once in a frame could cause a problem either. ;)
If they aren't case insensitive they ought to be capitals.
And case insensitivity is easily handled with
switch (ChrToParse)
{
    case 'U':
    case 'u':
        SetButton(Up);
        break;
}
> Notepad can't load files larger than a few hundred k. And when dealing with filesizes in MB or larger, most text editors (on Windows at least) become sluggish with loading, finding, or saving.
> Also, I strongly suspect that base64 decoding a 4 megabyte string will be somewhat sluggish.
In my XP virtual machine, both Notepad and Wordpad are quite usable for a 14M file. I wouldn't call them optimal, but they get the job done.
Also, b64 encoding/decoding is pretty fast. It took my machine (1.6 GHz Core Duo) about .39s to decode 14M of base64-encoded random data. Either way, I strongly prefer to keep everything self-contained in a single file.
upth wrote:
> The other reason I prefer it this way is because it's faster to ignore redundant data.
> In the implementation that only parses one segment per controller per frame, you can set a boolean flag for each controller, which causes it to skip to the next loop iteration if true.
This is probably true. Feel free to nuke your comment and add it to the proposed spec.
> If they aren't case insensitive they ought to be capitals.
> And case insensitivity is easily handled with
> switch (ChrToParse)
> {
>     case 'U':
>     case 'u':
>         SetButton(Up);
>         break;
> }
I was thinking of another way of implementing it, but it could also easily be adapted. Feel free to add this to the GM2 page too.
> > Notepad can't load files larger than a few hundred k. And when dealing with filesizes in MB or larger, most text editors (on Windows at least) become sluggish with loading, finding, or saving.
> > Also, I strongly suspect that base64 decoding a 4 megabyte string will be somewhat sluggish.
> In my XP virtual machine, both Notepad and Wordpad are quite usable for a 14M file. I wouldn't call them optimal, but they get the job done.
> Also, b64 encoding/decoding is pretty fast. It took my machine (1.6 GHz Core Duo) about .39s to decode 14M of base64-encoded random data. Either way, I strongly prefer to keep everything self-contained in a single file.
Though horrified by your nonchalant acceptance of multiple megabyte text files, I relent. Keeping the text file size down isn't really that important, and I agree there is value to having the file completely self contained.
Wiki page updated.