First Generation Encoding
Our first set of encodes is based on WMV3 video and WMA audio in ASF with WMDRM10 (Janus). We chose these standards because the Janus components have been widely adopted by our CE partners such as Roku, LG Electronics, Samsung, TiVo, and of course Microsoft Xbox.
We encode most content at 500, 1000, 1600, and 2200kbps VBR, but some titles whose source quality merits it have also been encoded at 3400kbps. The highest bitrate encodes are fit into 720x480 non-square pixels (the usual 1.2 PAR for widescreen content, 0.9 PAR for 4:3), but optimum encoding at lower bitrates is achieved with fewer pixels. Encoded films are normally at 24fps to match the source, while shot-to-video and mixed material is de-interlaced to 30fps (or 25fps for PAL content).
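As a rough illustration of the non-square pixel arithmetic above, the sketch below computes the square-pixel display size implied by a stored frame size and a pixel aspect ratio. The function name `display_size` is hypothetical, and 1.2 and 0.9 are the approximate PARs quoted above rather than exact broadcast figures.

```python
def display_size(stored_width, stored_height, par):
    """Return the (width, height) in square pixels implied by a pixel aspect ratio.

    `par` is the width of one stored pixel relative to its height, so only
    the width is scaled.
    """
    return round(stored_width * par), stored_height


print(display_size(720, 480, 1.2))  # widescreen: (864, 480), roughly 16:9
print(display_size(720, 480, 0.9))  # 4:3 material: (648, 480), roughly 4:3
```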
Second Generation Encoding
The new Silverlight player (that some users are helping us test as I write) uses VC1 Advanced Profile encoding with PlayReady DRM. A key property is that each GOP header includes frame size and resolution, which allows us to assemble a stream on the fly from different bitrate encodes as your broadband bandwidth fluctuates. (Another key feature is more coverage, including Intel Macs and Firefox users.) We expect to switch completely to the new player later this year.
The VC1 encoders are more efficient than the WMV3 encoders, so we are currently encoding VC1AP at slightly lower bitrates: 375, 500, 1000, and 1500kbps, all square pixel. At some point we are likely to add a couple more resolutions of non-square pixel encodes capturing the original pixel-aspect-ratio of the source.
We are also re-wrapping the VC1AP encodes in WMDRM10 for CE devices, which will gradually switch to the more efficient encodes in future firmware upgrades.
High Definition Encodes
Today we have rights to deliver about 400 streams in HD (720p). More titles will be added over time. We experimented with first-generation WMV3 encodes at 4000kbps and 5500kbps, but settled on second-generation HD encodes with VC1AP at 2600kbps and 3800kbps, which extends their accessibility down to slower home broadband connections. As with SD, encodes of film material are at 24fps, and encodes of shot-to-video material are at 30fps (or 25fps for PAL), rather than the 60fps that would come from a Blu-ray disc - we judged the 60fps content to be too expensive in bandwidth for now. In general, these encodes are definitely better than SD, but won't challenge well-executed Blu-ray encodes - that would require a bitrate out of reach for most domestic broadband today. We believe Moore's law will drive home broadband speeds higher and higher, enabling full 1080p60 encodes in a few years.
Stereo Audio
Today, we cannot use WMDRM to deliver AC3 or DD+ audio, which means that only stereo (delivered via WMA) is available. PCs and Macs decode the WMA, and CE players also transcode to PCM for digital connections to receivers. We could technically include multichannel audio using WMAPro, but essentially no receivers are actually capable of decoding that. We are working on solutions to deliver multichannel audio for all the streams where we have suitable source, but this certainly won't happen in 2008.
Subtitles, Closed Captions, and Alternate Soundtracks
All these features are desired for future releases. Delivering closed captions via the Silverlight player is probably closest, but it won't be in 2008 either.
Sources
Our best sources are electronically delivered mezzanine files, or high quality D5 tapes, and the highest bitrate encodes of these sources really look as good as or better than DVDs. Digibeta tape sources can also generate good encodes, but some sources just are not as good, regardless of the bitrate used for encoding. We also encode from DV tape and even on occasion from DVDs. We get HD sources for many titles, even if we only have the rights to stream SD. The HD sources permit a better SD encode than working from SD sources.
One class of sources has been derived from 24fps film, interlaced to i60 for TV broadcast, and then decimated to p30, and comes with restrictions on reprocessing. This results in frames that are even-odd interlaces of adjacent film frames, and a 4/5 cadence motion jerkiness. We are actively working to re-acquire these sources in better form.
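To see where the mixed frames come from, here is a minimal sketch that assumes the common 2:3 pulldown pattern (film frames alternately contribute two and three fields). The exact processing applied to these sources is not known to us in detail, so treat the pattern and the function name `pulldown_2_3` as illustrative assumptions.

```python
def pulldown_2_3(film_frames):
    """Spread film frames over interlaced fields, then pair the fields into frames.

    Assumes 2:3 pulldown: even-indexed film frames contribute 2 fields and
    odd-indexed ones contribute 3. Each returned tuple names the film frame
    supplying the first and second field of one video frame.
    """
    fields = []
    for i, frame in enumerate(film_frames):
        repeat = 2 if i % 2 == 0 else 3
        fields.extend([frame] * repeat)
    # Pair consecutive fields; a leftover unpaired field is dropped.
    return [(fields[j], fields[j + 1]) for j in range(0, len(fields) - 1, 2)]


video = pulldown_2_3(["A", "B", "C", "D"])
print(video)  # [('A', 'A'), ('B', 'B'), ('B', 'C'), ('C', 'D'), ('D', 'D')]

# Frames whose two fields come from different film frames show combing
# when each interlaced frame is treated as a single progressive frame.
mixed = [pair for pair in video if pair[0] != pair[1]]
print(len(mixed), "of", len(video), "frames are mixed")  # 2 of 5 frames are mixed
```

Under this assumption four film frames become five video frames, two of which blend adjacent film frames, which is consistent with the interlacing artifacts and uneven motion described above.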
Delivered Quality
Our first-gen PC streaming player uses 1-4 bars to represent the delivered quality, representing 500, 1000, 1600, and 2200 kbps. The 3400kbps encodes are represented as 4 bars. The player measures bandwidth once at the start of the title, and chooses a bitrate for delivery that has at least 40% headroom from the measured speed.
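The selection rule can be sketched as follows. This is not the player's actual code: the names `choose_bitrate` and `quality_bars` are hypothetical, and I am assuming "40% headroom" means the measured speed must be at least 1.4 times the encode bitrate, with a fallback to the lowest encode when nothing fits.

```python
SD_LADDER_KBPS = [500, 1000, 1600, 2200]


def choose_bitrate(measured_kbps, ladder=SD_LADDER_KBPS, headroom=0.40):
    """Pick the highest encode whose bitrate plus headroom fits the measured speed.

    Assumption: "40% headroom" is read as bitrate * 1.4 <= measured speed.
    Falls back to the lowest encode when none fits.
    """
    affordable = [b for b in sorted(ladder) if b * (1 + headroom) <= measured_kbps]
    return affordable[-1] if affordable else min(ladder)


def quality_bars(bitrate_kbps, ladder=(500, 1000, 1600, 2200, 3400)):
    """Map an encode bitrate to 1-4 bars; 3400kbps is also shown as 4 bars."""
    return min(ladder.index(bitrate_kbps) + 1, 4)


picked = choose_bitrate(3000)
print(picked, quality_bars(picked))  # 1600 3
```

With a measured 3000kbps, the 2200kbps encode would need 3080kbps under this reading, so the sketch settles on 1600kbps and three bars.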
The Roku, LG, and Samsung players use four dots during buffering in the same way, and Xbox has 4 bars just like the PC player. The TiVo player has a similar display, but with 10 thin bars.
The Silverlight player is currently more opaque, since it picks the stream to deliver dynamically. If your connection slows, as the buffer empties, the player starts buffering a lower bitrate stream and switches seamlessly across. Conversely, if the buffer fills rapidly again, the player can pick a higher bitrate stream. (Note that if Outlook (or some other large application) decides to wake up and refresh your email in the middle of a movie, Silverlight might be starved of CPU and drop some frames; this may cause the player to conclude that it should switch to a lower bitrate stream that won't overload the CPU. Today, we haven't figured out a reliable way to determine that the CPU is again underutilized and permit switching back up again, so my advice is to close Outlook and similar periodically expensive applications prior to playing the movie!)
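The buffer-driven switching can be pictured with a small sketch like the one below. It is only an illustration: the function name `next_bitrate` and the low and high buffer thresholds are invented for the example, and the real player's heuristics (including the dropped-frame logic) are more involved.

```python
VC1AP_LADDER_KBPS = [375, 500, 1000, 1500]


def next_bitrate(current_kbps, buffer_seconds, ladder=VC1AP_LADDER_KBPS,
                 low_water=5.0, high_water=20.0):
    """Step one rung down when the buffer runs low, one rung up when it is full.

    The thresholds (in seconds of buffered video) are hypothetical values
    chosen for illustration, not the player's real settings.
    """
    i = ladder.index(current_kbps)
    if buffer_seconds < low_water and i > 0:
        return ladder[i - 1]  # connection slowed: drop to a cheaper stream
    if buffer_seconds > high_water and i < len(ladder) - 1:
        return ladder[i + 1]  # buffer refilled: try a better stream
    return current_kbps


print(next_bitrate(1000, buffer_seconds=3.0))   # 500
print(next_bitrate(1000, buffer_seconds=25.0))  # 1500
print(next_bitrate(1000, buffer_seconds=12.0))  # 1000
```

Stepping one rung at a time is a design choice for the sketch; it keeps quality changes gradual at the cost of reacting more slowly to a sudden drop in bandwidth.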
Other Notes
We strive to deliver great encodes, and the encoding recipe evolves as we learn new tricks and engage new encoding technology. Since we have a variety of software and CE players in the field, whenever we add new encodes that are incompatible with existing players we have to continue to support the existing encodes for a transition period, until we can work with partners to upgrade firmware to support the new ones.
One new feature that I want to add is a post-play screen that appears when playback is stopped, where users can easily flag encodes that fall below par, so that we can prioritize identifying and fixing issues.