About the April Fools prank
As I've explained earlier, for me as a webmaster, April Fools' Day is a day to
try out new, controversial ideas and gauge people's reactions. If an idea is
liked, it can be put to use later. If it's not worth keeping, it can be called
a prank. Either way, it's a win-win situation. Of course, the prank must be
good-spirited.
I've been wanting to create a comprehensive April Fools prank
for the site for several years, but never really had the time to
create one.
This year, however, right before Easter, I happened to play with a
speech synthesis system I hadn't tried before (espeak), and I got an
idea -- I'll create an audio-only web site. The next weekend, I started
working on it, and I created most of the new site in just about one day.
The idea was born
My original idea was to create an audio-only site. However, I realized that
I needed some HTML on it (to receive the links entered by the user). As I added
it, I started thinking that a bare-bones text version of the material being
spoken would be good to have.
As I pondered the style for that text version and for the overall page, it
gradually started taking the shape of the redesign I had actually been planning
for this site for a year.
The site redesign idea
I'm not particularly a fan of Digg, but I've paid attention to how simple
its layout is and how much meaningful content it manages to present, all while
being presented in a fashionable web-2.0 style. A long time ago, I started
wanting to redesign the TASVideos site in a similar style.
In my opinion, the current site has a major design problem that makes its
presentation suffer: it is primarily a wiki, and only then a movie site.
I wanted to change its focus: Content first, details after that. And of
the content, movies first, articles later. So I want to make the front page
into a list of movies.
Other problems I want to fix with this include the long load times of movie
listing pages that contain everything about each movie.
I'll post more about the ideas once I dig up some discussion I had with Adelikat a few months back.
How was the speech created?
Each page is first generated into HTML. Then a DOM parser is created, which
walks through the document, picks up the content that is interesting from
a speech synthesis viewpoint, and constructs an SSML message for Espeak to speak.
SSML contains tags that can control the type of the voice. The whisper was
done this way, for example.
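As a rough illustration of that step (not the code actually used on the site), here is a minimal Python sketch. It swaps the DOM walk for Python's event-based `html.parser`, and the class name `SpeechExtractor`, the function `html_to_ssml`, the set of skipped tags, and the exact wording of the link prompt are all assumptions made for the example. It also numbers each link as it goes, so that a number typed by the user can be mapped back to a URL.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape


class SpeechExtractor(HTMLParser):
    """Collects speakable text from HTML and numbers each link (hypothetical helper)."""

    SKIPPED = {"script", "style"}        # assumed: nothing worth speaking in here
    BLOCKS = {"p", "li", "h1", "h2", "h3"}  # assumed: pause after these elements

    def __init__(self):
        super().__init__()
        self.parts = []       # SSML fragments in document order
        self.links = {}       # spoken link number -> href
        self._skip_depth = 0  # >0 while inside a skipped element
        self._href = None     # href of the link currently open, if any

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag == "a":
            self._href = dict(attrs).get("href")

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1
        elif tag == "a" and self._href is not None:
            number = len(self.links) + 1
            self.links[number] = self._href
            self.parts.append(f"Enter number {number} to follow this link.")
            self._href = None
        elif tag in self.BLOCKS:
            self.parts.append('<break strength="medium"/>')

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse whitespace
        if text and not self._skip_depth:
            self.parts.append(escape(text))  # SSML is XML, so escape & < >


def html_to_ssml(html):
    """Return (ssml_document, link_table) for one HTML page."""
    extractor = SpeechExtractor()
    extractor.feed(html)
    extractor.close()
    return "<speak>" + " ".join(extractor.parts) + "</speak>", extractor.links
```

Calling `html_to_ssml(page_html)` gives back the SSML text together with a dictionary such as `{1: "/some/page"}`, which the search box handler could consult when the user types a number.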
Then Espeak is run, and the output (WAV) is passed to Lame, which
produces an MP3 file. Then the MP3 file is passed to MEncoder, which
produces an FLV file. Since MEncoder also requires a video stream, a dummy
video stream is generated, consisting of the mushroom icon being repeated
as a PNG file. After MEncoder completes, Yamdi is run to add an index to
the FLV file. After the indexing is completed, the audio is ready for streaming.
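The chain of tools can be sketched roughly as below. This is not the script that ran on the server: the function names, file names and frame rate are made up for the example, and the command-line flags are written from memory of each tool's usual options, so check them against the manuals before relying on them. How the icon was repeated to cover the full length of the audio is not described above, and this sketch does not handle it.

```python
import subprocess


def build_pipeline(ssml_path, icon_png, stem):
    """Return the four command lines for one page, in execution order.

    All paths are hypothetical; `stem` is the output name without extension.
    """
    mp3 = f"{stem}.mp3"
    raw_flv = f"{stem}.raw.flv"
    flv = f"{stem}.flv"
    return [
        # espeak: -m interprets SSML markup, -f reads the text from a file,
        # --stdout writes the WAV data to standard output.
        ["espeak", "-m", "-f", ssml_path, "--stdout"],
        # lame: "-" as the input name reads the WAV from standard input.
        ["lame", "-", mp3],
        # mencoder: PNG input as the dummy video, MP3 copied in as the audio.
        ["mencoder", f"mf://{icon_png}", "-mf", "type=png:fps=1",
         "-audiofile", mp3, "-oac", "copy",
         "-ovc", "lavc", "-lavcopts", "vcodec=flv",
         "-of", "lavf", "-o", raw_flv],
        # yamdi: writes a copy of the FLV with seek metadata added.
        ["yamdi", "-i", raw_flv, "-o", flv],
    ]


def run_pipeline(commands):
    """Run the commands, piping espeak's output straight into lame."""
    espeak_cmd, lame_cmd, *rest = commands
    espeak = subprocess.Popen(espeak_cmd, stdout=subprocess.PIPE)
    subprocess.run(lame_cmd, stdin=espeak.stdout, check=True)
    espeak.stdout.close()
    if espeak.wait() != 0:
        raise RuntimeError("espeak failed")
    for cmd in rest:
        subprocess.run(cmd, check=True)
```

Keeping the command construction separate from the execution makes it possible to print or inspect the commands without having any of the tools installed.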
On the server, a background process is run, to send finished FLV files to
another server with higher bandwidth, to avoid bandwidth exhaustion.
This part of the process didn't work as well as I hoped.
On the client side, a Flash player ripped from YouTube picks up the FLV
stream and plays it back to the user.
Miscellaneous
One interesting part for me was to ensure that it properly pronounces
every non-English word found on the site (or at least on the commonly
accessed pages). This meant that it had to pronounce Russian, Finnish,
French, and even Japanese names correctly according to the rules of those
particular languages. It also had to pronounce words like "TAS",
"TASVideos", "Bisqwit", "rerecord" etc. according to the official or
commonly accepted pronunciation.
Fortunately, that was rather easy to do in espeak. Over the next few days,
I created a dictionary mapping some 100 to 200 names and words to IPA.
Examples of words in the list: "Torfi", "Tryczak", "Kyrsimys", "Densetsu",
"Mario", "yoshi", "Lamoureux", "taser", "tasing", "tases", "anime". Each of
them was entered either as an IPA string or a reference to another language's
parser.
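To make the two kinds of entries concrete, here is a small Python sketch of such a lexicon. It is only an illustration: the actual dictionary format used with espeak is not shown here, the two entries and the IPA string are placeholders invented for the example (not the real pronunciations from the list), and the output uses the standard SSML `phoneme` and `voice` elements without any claim that espeak honours them.

```python
import re
from dataclasses import dataclass
from xml.sax.saxutils import escape, quoteattr


@dataclass(frozen=True)
class Ipa:
    """Entry given directly as an IPA string."""
    phonemes: str


@dataclass(frozen=True)
class Lang:
    """Entry delegated to another language's pronunciation rules."""
    code: str


# Placeholder entries, keyed by lowercase word. Both values are assumptions.
LEXICON = {
    "tas": Ipa("tæs"),
    "kyrsimys": Lang("fi"),
}

WORD = re.compile(r"\w+")


def render(word, lexicon):
    """Wrap one word in SSML if the lexicon has an entry for it."""
    entry = lexicon.get(word.lower())
    if isinstance(entry, Ipa):
        return f'<phoneme alphabet="ipa" ph={quoteattr(entry.phonemes)}>{word}</phoneme>'
    if isinstance(entry, Lang):
        return f'<voice xml:lang={quoteattr(entry.code)}>{word}</voice>'
    return word


def annotate(text, lexicon=LEXICON):
    """Return `text` as an SSML fragment with known words annotated."""
    out, pos = [], 0
    for match in WORD.finditer(text):
        out.append(escape(text[pos:match.start()]))  # punctuation and spaces
        out.append(render(match.group(0), lexicon))
        pos = match.end()
    out.append(escape(text[pos:]))
    return "".join(out)
```

The lookup is case-insensitive so that "TAS" and "tas" share one entry, while inflected forms such as "tasing" would still need entries of their own, which matches the list above containing several forms of the same word.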
Some words written in hiragana I added using the Finnish engine,
since it was the closest to Japanese pronunciation.
I guess it all worked pretty well because nobody complained to me
about something non-English being pronounced wrong.
I was quite giddy about this version. I had a hard time keeping my mouth shut about it for a week.
Things I wonder if anyone noticed
* That the links spoken out by the voice could be entered in the Search bar ("enter number 15 to follow this link" means "type 15 and press enter to follow this link")
* The cool 404 page that included a message in English, Finnish, French and Japanese
* A Tour page
* The "happy april fools day" whisper
* The names from different languages that one wonders how to pronounce
* The "this page is way too long for me to read to you" message
So what's going to stay and what not?
Naturally, the aural version (speech synthesis) was a prank.
Nothing more. It will never be heard again, at least not until
another April Fools spirited day.
The movie lengths were expressed in "nn minutes and mm seconds" because that's
the only way the speech synthesis system would speak them out in a natural manner.
Since the original intention of my prank was to provide audio only and minimal
text content, this fact stood out in the public version.
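For illustration, formatting a length in that spoken style could look like the following Python sketch. The function name is made up, and the singular forms ("1 minute", "1 second") are an addition of this example, since only the plural pattern is described above.

```python
def spoken_length(total_seconds):
    """Format a movie length as 'nn minutes and mm seconds'."""
    minutes, seconds = divmod(int(total_seconds), 60)

    def unit(count, name):
        # Singular handling is an assumption of this sketch.
        return f"{count} {name}" if count == 1 else f"{count} {name}s"

    return f"{unit(minutes, 'minute')} and {unit(seconds, 'second')}"
```

For example, `spoken_length(754)` gives "12 minutes and 34 seconds".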
Similarly, the console names (NES, N64) were spelled out (Nintendo Entertainment System,
Nintendo 64) for the same reason: They sound better when spoken out in full rather
than as abbreviations.
Since I was short on time (I had only one weekend to work on this, and I spent most
of it tuning the speech synthesis), there are many important features missing in this
demonstration. The lack of wiki functionality (page editing), the missing
rerecord counts on the submission pages, and the bleak movie pages (I would actually
want help in redesigning them) can all be attributed to the lack of time. That, and
I also wanted to minimize the amount of changing page content, so that the speech
streams would not need to be regenerated every ten minutes.
I added a high-pitched prosody for the word "games" on a whim. I think it sounded
funny. Similarly for the phrase "The cake is a lie!" on the submission pages.
Obviously, neither is going to stay.
I appreciate all the feedback you have given and continue to give me about this redesign.
I hope you had fun and saw that behind the joke there was actually
a plan for a new, more usable site.
I cannot make promises about when it will be finished -- it may be next summer, or
next year, or something, depending on various things. But I now have experience,
and you, the users (those who were alert on this day), have at least some
perception of what I have had in mind.