Right, as you're playing, QuickNES saves state every second to a cache of the most recent two minutes (120 save states), and also saves state to the ongoing movie every 30 seconds. These are both adjustable in the code.
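To make the two snapshot schedules concrete, here is a rough sketch (not QuickNES's actual code). The `Recorder` class, the `save_state` callable, and the 60 fps frame rate are all assumptions made for illustration; only the intervals and the cache size come from the description above.

```python
from collections import deque

FPS = 60                       # assumed frame rate, for illustration only
CACHE_INTERVAL = 1 * FPS       # snapshot to the rewind cache every second
CACHE_SIZE = 120               # 120 snapshots = the most recent two minutes
KEYFRAME_INTERVAL = 30 * FPS   # snapshot to the movie every 30 seconds


class Recorder:
    """Hypothetical helper that decides when snapshots are taken."""

    def __init__(self, save_state):
        self.save_state = save_state            # callable returning a snapshot
        self.frame = 0
        self.recent = deque(maxlen=CACHE_SIZE)  # oldest entries drop off automatically
        self.keyframes = {}                     # frame number -> snapshot

    def after_frame(self):
        """Call once after each emulated frame."""
        self.frame += 1
        if self.frame % CACHE_INTERVAL == 0:
            self.recent.append((self.frame, self.save_state()))
        if self.frame % KEYFRAME_INTERVAL == 0:
            self.keyframes[self.frame] = self.save_state()
```

The bounded deque gives the "most recent two minutes" behaviour for free: once it holds 120 entries, each new snapshot pushes the oldest one out.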
As for movie size on disk and in memory, here are figures for complete movies recorded with keyframes every 30 seconds (on disk they are gzipped; in memory they are compressed using a custom algorithm). Times are in hours:minutes:
Time    Disk   Disk avg    Mem    Mem avg  Game
--------------------------------------------------------------
0:30     46K   1.5K/min   126K   4.2K/min  A Boy and His Blob
0:20     54K   2.7K/min   145K   7.3K/min  Batman
0:52    370K   7.1K/min   662K  12.7K/min  Bionic Commando
1:00    150K   2.5K/min   423K   7.1K/min  Blaster Master
0:23     84K   3.7K/min   232K  10.1K/min  Castlevania
0:43    182K   4.2K/min   420K   9.8K/min  Deadly Towers
0:57    222K   3.9K/min   621K  10.9K/min  Duck Tales 2
2:21    430K   3.0K/min  1165K   8.3K/min  Esper Dream 2
0:36     60K   1.7K/min   172K   4.8K/min  Fester's Quest
0:20     46K   2.3K/min   113K   5.7K/min  Gimmick!
0:51    138K   2.7K/min   459K   9.0K/min  Metroid
0:20     34K   1.7K/min   113K   5.7K/min  Mighty Bomb Jack
0:49     94K   1.9K/min   239K   4.9K/min  Ninja Gaiden
0:34    178K   5.2K/min   405K  11.9K/min  Rygar
0:43    118K   2.7K/min   355K   8.3K/min  Section-Z
0:52     98K   1.9K/min   282K   5.4K/min  Solomon's Key
0:59    180K   3.1K/min   518K   8.8K/min  Wizards & Warriors
0:19     42K   2.2K/min   116K   6.1K/min  Yume Penguin Monogatari
As mentioned on the linked Wiki page, the keyframe rate is entirely adjustable, which allows file size to be reduced (for example, you could save a movie with keyframes every 5 minutes even if you recorded it with keyframes every 30 seconds). The in-memory compression is a custom algorithm I wrote to be extremely fast; using minilzo instead results in about 35% less memory usage (I imagine even zlib would perform decently, reducing memory usage further). The most recently accessed snapshots are kept uncompressed (and unmodified ones aren't recompressed, similar to virtual memory), so compression doesn't affect performance much.
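The "keep recent snapshots uncompressed, only recompress modified ones" idea can be sketched roughly as below. This is only an illustration under stated assumptions: `SnapshotStore` and its method names are made up, and zlib stands in for the custom algorithm, which isn't described here.

```python
import zlib
from collections import OrderedDict


class SnapshotStore:
    """Hypothetical store: a few hot snapshots stay raw, the rest are compressed."""

    def __init__(self, max_uncompressed=4):
        self.max_uncompressed = max_uncompressed
        self.compressed = {}        # key -> compressed bytes
        self.hot = OrderedDict()    # key -> [raw bytes, dirty flag], in LRU order
        self.compress_calls = 0     # counts how often compression actually runs

    def put(self, key, raw):
        # New or modified snapshot: mark dirty so it is compressed on eviction.
        self.hot[key] = [raw, True]
        self.hot.move_to_end(key)
        self._evict()

    def get(self, key):
        if key not in self.hot:
            # Decompress on demand; the copy is clean, so evicting it later is free.
            self.hot[key] = [zlib.decompress(self.compressed[key]), False]
        self.hot.move_to_end(key)
        raw = self.hot[key][0]
        self._evict()
        return raw

    def _evict(self):
        while len(self.hot) > self.max_uncompressed:
            key, (raw, dirty) = self.hot.popitem(last=False)
            if dirty:  # clean entries already have an up-to-date compressed copy
                self.compressed[key] = zlib.compress(raw, 1)
                self.compress_calls += 1
```

Reading an old snapshot costs one decompression, and dropping it again costs nothing, which is the same trick as not writing clean pages back to disk in a virtual memory system.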
Implementation is very straightforward: the emulator core exposes functions to emulate one frame and to save/load a memory-based state snapshot, and the movie functions are built on top of these. The only reason this scheme might not work for a particular emulator is if that emulator is relatively slow. On a 1.8 GHz Pentium 4 M laptop, my NES emulator runs at about 3180 frames per second (5700 with sound and image disabled, as is done when skipping frames during seeking), so speed is not at all an issue in my case.
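Here is a minimal sketch of how seeking can be built from just those three core functions. The names (`ToyCore`, `Movie`, `emulate_frame`, `save_state`, `load_state`) are hypothetical and not QuickNES's real interface, and the toy core's whole state is a single integer so the example stays runnable.

```python
import bisect


class ToyCore:
    """Stand-in for an emulator core; its entire state is one integer."""

    def __init__(self):
        self.value = 0

    def emulate_frame(self, joypad):
        self.value = (self.value * 31 + joypad) % 65521

    def save_state(self):
        return self.value

    def load_state(self, snapshot):
        self.value = snapshot


class Movie:
    def __init__(self, core, keyframe_interval):
        self.core = core
        self.keyframe_interval = keyframe_interval
        self.inputs = []                             # joypad input for each frame
        self.keyframe_frames = [0]                   # sorted frame numbers
        self.keyframe_states = [core.save_state()]   # snapshot taken at each one
        self.position = 0

    def record_frame(self, joypad):
        """Append one frame; assumes the core is currently at the movie's end."""
        self.core.emulate_frame(joypad)
        self.inputs.append(joypad)
        self.position = len(self.inputs)
        if self.position % self.keyframe_interval == 0:
            self.keyframe_frames.append(self.position)
            self.keyframe_states.append(self.core.save_state())

    def seek(self, target):
        """Load the last keyframe at or before target, then replay inputs up to it."""
        i = bisect.bisect_right(self.keyframe_frames, target) - 1
        self.core.load_state(self.keyframe_states[i])
        for frame in range(self.keyframe_frames[i], target):
            self.core.emulate_frame(self.inputs[frame])
        self.position = target
```

A seek never replays more than one keyframe interval's worth of frames, which is why emulation speed with sound and image disabled is the number that matters.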
EDIT: I see you were asking about memory usage, so I added that to the above table.