You won't find a better videotape format in terms of price/performance for standard-definition television than DV or its related formats DVCAM and DVCPRO. Also, DV is the first broadcast-quality format small enough for a camera master to fall into a cup of tea (trust me on this; no need to try it yourself).
I first experienced DV in October of 1995, when I saw a Sony DCR-VX1000 hooked up to a 32" Sony XBR monitor at Fry's Electronics in Sunnyvale. I was impressed by the live pix, but blown away by the off-tape playback, which looked as good as live. I lay awake for three nights, thinking "the world has changed: Digital For The Rest Of Us..." before buying a VX1000, and selling my pro/industrial EVW-300 3-chip, interchangeable-lens Hi8 camcorder to pay for it...
Most people start with the FAQ, and cruise around
from there. The new stuff is listed at the top of the FAQ, so it's a good
place to start.
DV: Technical Details
 | DV | DVCAM | DVCPRO | Digital8 |
suppliers | consortium of 60 manufacturers including Sony, Panasonic, JVC, Canon, Sharp | Sony | Panasonic; also Philips, Ikegami, Hitachi | Sony |
intended market segment(s) | consumer (although JVC makes a dockable DV VTR for the pro/industrial market) | professional / industrial | professional / industrial / ENG / EFP / broadcast | consumer (Video8 & Hi8 replacement) |
who's actually buying the stuff | consumer / professional / industrial / ENG / EFP | professional / industrial / ENG / EFP | professional / industrial / ENG / EFP / broadcast | consumers |
tape type | ME (Metal Evaporated) | ME (Metal Evaporated) | MP (Metal Particle) | ME, MP (uses Video8, Hi8 tapes) |
track pitch | 10 microns (SP); 6.7 microns (LP) | 15 microns | 18 microns | ??? |
track width | 10 microns (SP); 6.7 microns (LP) | 15 microns (10 microns on some early gear) | 18 microns | ??? |
tape speed | 18.81 mm/sec | 28.215 mm/sec | 33.82 mm/sec | 28.6 mm/sec (estimated) |
cassettes & max. loads | miniDV: 80/120 min (SP/LP); std: 3.0/4.6 hrs (SP/LP) (4.6/6.9 hrs possible using DVCAM 184 min tape) | miniDV: 40 min.; std: 184 min. | small: 63 min. (note: small is larger than miniDV cassette); std: 123 min./184 min.** | Video8, Hi8 standard 120 minute tape: 60 min. |
max. camera load | 80/120 min. (SP/LP) | 184 minutes | 63 minutes (AJ-D700/810); 123 min. (AJ-D200/210); 184 min. (AJ-D215)** | 60 min. |
compression | 5:1 DVC-format DCT, intra-frame; 25 Mbps video data rate | 5:1 DVC-format DCT, intra-frame; 25 Mbps video data rate | 5:1 DVC-format DCT, intra-frame; 25 Mbps video data rate | 5:1 DVC-format DCT, intra-frame; 25 Mbps video data rate |
resolution & sampling | 720x480, 4:1:1 (NTSC); 720x576, 4:2:0 (PAL) | 720x480, 4:1:1 (NTSC); 720x576, 4:2:0 (PAL) | 720x480, 4:1:1 (NTSC); 720x576, 4:1:1 (PAL) | 720x480, 4:1:1 (NTSC); 720x576, 4:2:0 (PAL) |
audio recording (see "locked vs unlocked" below) | 2 ch @ 48 kHz, 16 bits; 4 ch @ 32 kHz, 12 bits; will accept 2 ch @ 44.1 kHz, 16 bits via 1394 I/O; unlocked (but can record locked audio via 1394) | 2 ch @ 48 kHz, 16 bits; 4 ch @ 32 kHz, 12 bits; will accept 2 ch @ 44.1 kHz, 16 bits via 1394 I/O; locked (but some VTRs can be made to record unlocked via 1394) | 2 ch @ 48 kHz, 16 bits; locked, plus one analog audio cue track; plays back 32 kHz, 12 bits and presumably 44.1 kHz, 16 bits | 2 ch @ 48 kHz, 16 bits; 4 ch @ 32 kHz, 12 bits; will accept 2 ch @ 44.1 kHz, 16 bits via 1394 I/O; unlocked (but can record locked audio via 1394) |
These tapes can play back in... | DV, DVCAM, & DVCPRO VTRs | DV*, DVCAM, & DVCPRO* VTRs | DVCPRO VTRs; DSR-2000 DVCAM VTR | Digital8 camcorders |
These VTRs can play back... | DV & DVCAM* tapes | DV & DVCAM tapes (DVCPRO in the DSR-2000; Oct '99) | DV, DVCAM*, & DVCPRO tapes | Video8, Hi8, Digital8 tapes |
IEEE-1394 I/O (a.k.a. "FireWire" or "i.link") | Sony & Canon camcorders and VTRs; newer JVC camcorders (output only) | DSR-V10, DSR-20, DSR-30, DSR-40, DSR-200/200a, DSR-500, DSR-2000, DRV-1000 | AJ-D210/215 camcorders and AJ-D230 VTRs with optional adapter | yes |
SMPTE 259M SDI (serial digital interface) | no | DSR-60/80/85/2000 VTRs with adapter | AJ-D750/650/640 VTRs with adapter | no |
4X digital I/O | no | DSR-85 VTR | AG-D780 VTR; NewsByte NLE with onboard VTR | no |
Analog component I/O | no | DSR-40/60/80/85/2000 VTRs only | AJ-D750/650/640 VTRs | no |
Y/C & composite I/O | yes (DRV-100 & many camcorders: output only) | yes (DRV-1000: output only) | yes (no Y/C on AJ-D750) | yes |
Edit control | LANC & IEEE-1394 (Sony, Canon); Panasonic 5-pin (Panasonic); J-LIP (JVC) | LANC & IEEE-1394 (DSR-V10, DSR-20/30, DSR-200/200a); RS-232 (DSR-20); RS-422 (DSR-40/60/80/85/2000) | RS-232 (AJ-D230/640/650/750); RS-422 (AJ-D640/650/750) | LANC & IEEE-1394 |
DV: Comparisons / Reviews
Where a link for [pix] exists, a separate window will be launched, so that you can continue to read the text in this page while the images are loading. The pix pages' menu banners have links to the other available (on-site) pix pages, so that you can browse pix completely separately from the main text pages. (Of course, if you're using an ancient browser that doesn't understand target="new_window", the separate browsing won't occur... and if you're on a slow link, and/or using lynx or NetHopper, skip the graphics; you don't really need them anyway!)
This work is my own, but has been generated from many sources. I especially wish to thank Jan Crittenden at Panasonic, Earl Jamgochian at Sony, and Jim Miller at JVC for their help in answering a variety of tricky questions and in correcting assorted technical details.
2 May '99: There's a lot of interesting new stuff that was shown at NAB (some of it is even shipping) and a lot of new information: Final Cut Pro, IMC's Incite, Matrox DigiSuite DTV, Canopus RexRT, the DSR-500WS camcorder and DSR-2000 VTR, the updated story on unlocked audio, 100 MBit/sec DV-based HDTV formats, the DV chipsets from C-Cube, divio, and Zoran... but (a) I'm too busy to add all of this right now, and (b) this flippin' page is getting too big: a redesign is needed. I hope to rework all the DV stuff sometime; whenever the work lets up and I have some free time.
I plan to give each major topic its own page to improve load times and I'll have to change the navigation structure to accommodate this. If you have some helpful commentary or suggestions on this, please let me know: the point is to make this information as easy to peruse as possible. Email me at "adam at adamwilt dot com" (but no, I won't wire in a clickable mailto: link since that makes things too easy for spammers and their web-bots. Don't even ask). Thanks!
4 Feb | Updated Ultra DMA info in NLE; a few new links... |
18 Mar | Digital8 added; links rearranged and consolidated. |
2 May | Unlocked audio updated; new links |
23 June | Tech Details updated for DVCPRO runtimes; links tweaked. |
2 July | Mac G3 SCSI update |
DV is an international standard created by a consortium of 10 companies for a consumer digital video format. The companies involved were Matsushita Electric Industrial Corp. (Panasonic), Sony Corp., Victor Company of Japan (JVC), Philips Electronics, N.V., Sanyo Electric Co. Ltd., Hitachi, Ltd., Sharp Corporation, Thomson Multimedia, Mitsubishi Electric Corporation, and Toshiba Corporation. Since then others have joined up; there are now over 60 companies in the DVC consortium.
DV, originally known as DVC (Digital Video Cassette), uses a 1/4 inch (6.35mm) metal evaporated tape to record very high quality digital video. The video is sampled at the same rate as D-1, D-5, or Digital Betacam video -- 720 pixels per scanline -- although the color information is sampled at half the D-1 rate: 4:1:1 in 525-line (NTSC), and 4:2:0 in 625-line (PAL) formats. (See below for a discussion of color sampling.)
The sampled video is compressed using a Discrete Cosine Transform (DCT), the same sort of compression used in motion-JPEG. However, DV's DCT allows for more local optimization (of quantizing tables) within the frame than do JPEG compressors, allowing for higher quality at the nominal 5:1 compression factor than a JPEG frame would show. See Guy Bonneau's discussion of DV vs MJPEG compression for more details.
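If you've never met a DCT, a toy example may help. This is a sketch only (in Python, using numpy and scipy): real DV quantization uses per-block tables and adaptive weighting, and the threshold below is made up for illustration.

```python
import numpy as np
from scipy.fftpack import dct, idct

# A smooth 8x8 "image" block: a diagonal luminance gradient.
x = np.arange(8, dtype=float)
block = 16.0 * (x[None, :] + x[:, None])

# Forward 2-D DCT: a smooth block's energy piles up in a handful of
# low-frequency coefficients.
coeffs = dct(dct(block, axis=0, norm='ortho'), axis=1, norm='ortho')

# Crude stand-in for quantization: throw away the small coefficients.
coeffs[np.abs(coeffs) < 10.0] = 0.0

# Inverse DCT: the block comes back nearly unchanged despite the
# discarded data -- which is the whole point of DCT coding.
restored = idct(idct(coeffs, axis=0, norm='ortho'), axis=1, norm='ortho')
print(f"max pixel error after round trip: {np.abs(block - restored).max():.2f}")
```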
DV uses intraframe compression: Each compressed frame depends entirely on itself, and not on any data from preceding or following frames. However, it also uses adaptive interfield compression; if the compressor detects little difference between the two interlaced fields of a frame, it will compress them together, freeing up some of the "bit budget" to allow for higher overall quality. In theory, this means that static areas of images will be more accurately represented than areas with a lot of motion; in practice, this can sometimes be observed as a slight degree of "blockiness" in the immediate vicinity of moving objects, as discussed below.
DV video information is carried in a nominal 25 megabit per second (Mbps) data stream. Once you add in audio, subcode (including timecode), Insert and Track Information (ITI), and error correction, the total data stream comes to about 36 Mbps. Roger Jennings' paper on the Adaptec website runs through the detailed numbers.
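Here's that arithmetic in quick sketch form (Python; the 25 and 36 Mbps figures are the ones quoted above, and the ~3.6 MB/s figure is the 1394 transfer rate that comes up later in this FAQ):

```python
NTSC_FPS = 30000 / 1001    # 29.97 frames/sec

video_bps = 25e6           # nominal DV video data rate
tape_bps = 36e6            # approximate total on tape, once audio,
                           # subcode, ITI, and error correction are added

print(f"video data per NTSC frame: {video_bps / NTSC_FPS / 8 / 1000:.0f} KB")
print(f"total on-tape data rate:   {tape_bps / 8 / 1e6:.1f} MB/s")
# What crosses a 1394 link is just the video + audio + subcode payload,
# about 3.6 MB/s; the error-correction overhead stays on tape.
```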
What's the difference between DV, DVCAM, and DVCPRO?
Not a lot! The basic video encoding algorithm is the same in all three formats. The VTR sections of the US$20,000 DVCAM DXC-D130 or DVCPRO AJ-D700 cameras will record no better an image than the lowly DV-format DCR-VX1000 at US$4,000 (please note: I am not saying that the camera section and lens of the VX1000 are the equals of the high-end pro and broadcast cameras: there are significant quality differences! But the video data recorded in all three formats is essentially identical, though there may be minor differences in the actual codec implementations). A summary of differences (and similarities) is tabulated in Technical Details.
The consumer-oriented DV uses 10 micron tracks in SP recording mode. Newer camcorders offer an LP mode to increase recording times, but the 6.7 micron tracks make tape interchange problematic on DV machines, and prevent LP tapes from being played in DVCAM or DVCPRO VTRs. Sony's DVCAM professional format increases the track pitch to 15 microns (at the cost of recording time) to improve tape interchange and increase the robustness and reliability of insert editing. Panasonic's DVCPRO increases track pitch and width to 18 microns, and uses a metal particle tape for better durability. DVCPRO also adds a longitudinal analog audio cue track and a control track to improve editing performance and user-friendliness in linear editing operations.
Digital8?
Sony's Digital8 uses DV compression atop the existing Video8/Hi8 technological base. Digital8 records on Video8 or Hi8 tapes, but these run at twice their normal speed and thus hold half the time listed on the label. Digital8 will also play back existing Video8 and Hi8 tapes, even over 1394/i.link, allowing such tapes to be read into NLEs (at least, those for which the lack of timecode is not an issue -- batch capture utilities are unlikely to work, since Video8/Hi8 timecodes are not sent across the 1394 connection).
Digital8 is a camcorder-only format as of Spring 1999; no VTRs are expected. It appears to be the 8mm division's way of keeping its customer base from defecting to DV. By leveraging 15 years of massive investment in 8mm analog camcorders and transports, Sony keeps the unit cost of Digital8 gear very low, roughly half of what a comparable DV camcorder would cost; and its ability to play back legacy analog tapes is worthwhile for those with large libraries of 8mm.
All Digital8 camcorders can record from the analog inputs (at least outside the EU), and all are equipped with i.link ports for digital dubbing and NLE connections.
How good are the DV formats compared to other formats?
DV formats are typically reckoned to be equal to or slightly better than Betacam SP and MII in terms of picture quality (however, DV holds up better over repeated play cycles, where BetaSP shows noticeable dropout). They are a notch below Digital-S and DVCPRO50, which are themselves a (largely imperceptible) notch below Digital Betacam, D-1, and D-5. They are quite a bit better than 3/4" U-matic, Hi8, and SVHS.
On a scale of 1 to 10, where 1 is just barely video and 10 is as good
as it gets, I would arrogantly rate assorted formats as follows:
D-5 (10-bit uncompressed digital) | 10 |
D-1 (8-bit uncompressed digital) | 9.9 |
Digital Betacam, Ampex DCT | 9.7 |
Digital-S, DVCPRO50 | 9.6 |
DV, DVCAM, DVCPRO | 9.2 |
MII, Betacam SP | 9.1 |
D-3, D-2 (composite digital) | 9 |
1" Type C | 8.9 |
3/4" SP | 6.5 |
3/4", Hi8, SVHS | 5 |
Video 8, Betamax | 4 |
VHS | 3 |
EIAJ Type 1, Fisher-Price Pixelvision | 1 |
[And, as we move into the 4:2:0 component DTV era, video will no longer be subjected to the delivery-point lowest common denominator of a single-wire composite feed: color-subcarrier composite was an excellent analog compression technology in the 1950s, but DTV obsoletes it and renders even the DTV consumer receiver essentially a component display.]
For a less biased discussion of DV quality, see the September 1998 SMPTE/EBU Task Force for Harmonized Standards for the Exchange of Program Material as Bitstreams Final Report, Annex C.
Also, Jim Feely of DV Magazine used the Tektronix Picture Quality Analyzer, a "black box" that calculates before/after picture differences and evaluates them based on Sarnoff Labs' JND analysis (a whole different topic -- in short, analysis based on modeling of the psychophysical characteristics of human vision), to evaluate a variety of formats for the May 1999 issue. The results are posted online as a PDF file; discussion of the testing process is also available.
What are the DV artifacts I keep hearing about?
DV artifacts [Pix: Artifacts] come in three flavors: mosquito noise, quilting, and motion blocking. Other picture defects [Pix: Defects] encountered are dropouts and banding (a sign of tape damage or head clogging).
The most noticeable spatial artifacts are feathering or mosquito noise around (typically) diagonal fine detail. These are compression-induced errors usually seen around sharp-edged fine text, dense clusters of leaves, and the like; they show up as pixel noise within 8 pixels of the fine detail or edge causing them. The best place to look for them is in fine text superimposed on a non-black background. White on blue seems to show it off best. The magnitude of these errors and their location tends to be such that if you monitor the tape using a composite video connection, the artifacts will be masked by dot-crawl and other composite artifacts.
A spatial quilting artifact can also be seen on certain diagonals -- typically long, straight edges about 20 degrees off of the horizontal. These are minor discontinuities in the rendering of the diagonal as it passes from one DCT block to the next; so minor that they're usually invisible. Watching such diagonals during slow pans is often the only way to see the artifact.
Motion blocking occurs when the two fields in a frame (or portions of the two fields) are too different for the DVC codec to compress them together. "Bit budget" must be expended on compressing them separately, and as a result some fine detail is lost, showing up as a slight blockiness or coarseness of the image when compared to the same scene with no motion. Motion blocking is best observed in a lockdown shot of a static scene through which objects are moving: in the immediate vicinity of the moving object (say, a car driving through the scene), some loss of detail is seen. This loss of detail travels with the object, always bounded by DCT block boundaries.
Finally, banding or striping of the image occurs when
one head of the two on the scanner is clogged or otherwise unable to recover
data. The image will show 10 horizontal bands (12 in PAL countries), with
every other band showing a "live" picture and the alternate bands showing
a freeze frame of a previous image or of no image at all (or, at least
in the case of the JVC GR-DV1u, a black-and-white checkerboard, which the
frame buffers appear to be initialized with). Most often this is
due to a head clog, and cleaning the heads using a standard manufacturer's
head cleaning tape is all that's required. It can also be caused by tape
damage, or by a defective tape. If head cleaning and changing the tape
used don't solve it, you may have a dead head or head preamp; service will
be required.
What about Digital-S and DVCPRO50?
JVC's Digital-S uses the 1/2" SVHS form factor for tapes and VTRs, although the cassette itself is more robust, and the transport adds sapphire guide-roller flanges, tape-cleaner blades, and a new scanner design. One of the Digital-S players will also play back analog SVHS tapes, allowing its use for editing existing libraries of SVHS tapes as well as newer Digital-S footage. Head life (so far, in on-air broadcast usage) is well in excess of 4000 hours; equipment cost is very low (comparable to 25 Mbps DVCAM or DVCPRO); and maintenance expenses are well below those of the Betacam decks that Digital-S is typically displacing. So far only JVC is supporting this format, which has resulted in a less-than-headlong rush by the video community to embrace it. Watch it, though; it's hot. If you're doing high-end EFP on a budget, this is the format to use.
Panasonic's DVCPRO50 uses the same DVCPRO tapes and transports as its 25 Mbps DVCPRO products (there is also a 93-minute DVCPRO50 tape due out specifically for the AJ-D950 VTR, which Panasonic says should only be used in DVCPRO50 mode. When using standard DVCPRO tapes, the maximum recording time is about 61 minutes since the P123L cassette is being run twice as fast). DVCPRO50 VTRs will also play back DVCPRO tapes.
The 900-series DVCPRO50 kit is real jack-of-all-trades stuff. The AJ-D900W camcorder (US$39,900) will record either DVCPRO or DVCPRO50, in either 4:3 or true 16:9 modes. The AJ-D950 VTR (US$26,500) records and plays back either DVCPRO or DVCPRO50, and additionally is switchable between 525/59.94 (NTSC) and 625/50 (PAL) formats. The only thing you give up is miniDV cassette playback; even with the adapter the 950 won't read the tiny tapes. Fortunately the AJ-D940 DVCPRO50 player, due out in early 1999 for US$20,000 or so, will play back those miniDV tapes, and is supposed to offer a wider range of slo-mo speeds in the bargain. There's also a more affordable DVCPRO50 camera due early in '99, around US$29,000 or so.
Unlike Digital-S, DVCPRO50 is second-sourced, with gear available from Philips, Hitachi, and Ikegami.
The DVCPRO50 kit is also a lot more portable and lightweight than Digital-S, so it's the format of choice if you're doing high-end EFP with a somewhat bigger budget and you want to keep your cameramen (and women) from wearing out as quickly!
Panasonic showed a mockup of a switchable DVCPRO/DVCPRO50 portapack
(field VTR that doesn't dock directly to a camera head) at NAB '98, as
well as prototype DVCPRO-P (480-line 60 Hz progressive scan) equipment
using the 50 Mbps payload to handle this interim SDTV format chosen by
Fox and NBC for the start of the DTV transitional era.
Four codecs for HD?
Both JVC and Panasonic showed mockups or prototypes of 100 Mbps DV-derived products at NAB '98 for handling HDTV. Both firms plan to gang four DV codecs together to get the 100 Mbps datastream, while preserving the same equipment form factor and operational methodologies used in the current 50 Mbps products. Panasonic calls their stuff DVCPROHD100, while JVC hasn't yet come up with all the necessary buzzwords.
It should be noted that both of these companies are well-placed to serve the growing DTV market whatever image format a broadcaster selects. Panasonic is selling a switchable 720p/1080i HD-D5 VTR (not based on DV technology), the AJ-HD2700, which has already become the studio standard VTR for the dawn of DTV. JVC's NAB '98 display featured Digital-S variants of most popular ATSC DTV formats -- 480i, 480p/30, 480p/60, 720p, and 1080i -- either in prototype or in simulation. These two companies will be pushing the edge of the DV envelope for quite some time to come...
Sony's HDCAM format uses compression technology "derived from DV and
with certain similarities", but it is not on the main branch of the DV
family tree. Its data rate of 135 Mbps yields beautiful images; it's extremely
rare to see a noticeable artifact in an HDCAM picture.
What do 4:1:1, 4:2:2, and 4:2:0 mean?
These are all shorthand notations for different sampling structures for digital video. They are also used for CIF and QSIF and suchlike MPEG frame sizes, but in the discussion that follows, I focus on the numbers for SDTV (standard-definition TV) digitized to the ITU-R BT.601 standards: 13.5 MHz sample frequency and 720 pixels per line.
The first number refers to the 13.5 MHz sampling rate of the luminance: "4" because (a) it's nominally almost approximately sort of four times the NTSC and/or PAL color subcarrier frequencies, and (b) because if it's "4" the other numbers can be integers whereas if it were "1" the formats would be "1:0.5:0.5", "1:0.25:0.25", and "1:0.5:0" respectively, and which would you rather try to read off in a hurry? The 13.5 MHz sampling yields 720 pixels per scanline in both 525/59.94 and 625/50 systems (NTSC and PAL/SECAM). This number applies to D-1, D-5, Digital Betacam, BetaSX, Digital-S, and all the DV formats just the same.
The other two numbers refer to the sampling rates of the color difference signals R-Y and B-Y (or Cr and Cb in the digital domain).
In 4:2:2 systems (D-1, D-5, DigiBeta, BetaSX, Digital-S, DVCPRO50) the color is sampled at half the rate of the luminance, with both color-difference samples co-sited (located at the same place) with alternate luminance samples. Thus you have 360 color samples (in each of R-Y and B-Y) per scanline.
In 4:1:1 systems (NTSC DV & DVCAM, DVCPRO) the color data are sampled half as frequently as in 4:2:2, resulting in 180 color samples per scanline. The U and V samples are considered to be co-sited with every fourth luminance sample. Yes, this sounds horrible -- but it's still enough for a color bandwidth extending to around 1.5 MHz, about the same color bandwidth as Betacam SP (which, were it a digital format, would be characterized as a 3:1:1 format).
So where does 4:2:0 (PAL DV, DVD, main-profile MPEG-2) fit in? 4 x Y, 2 x R-Y, and 0 x B-Y? Fortunately not! 4:2:0 is the non-intuitive notation for half-luminance-rate sampling of color in both the horizontal and vertical dimensions. Chroma is sampled 360 times per line, but only on every other line. The theory here is that by evenly subsampling chroma in both H and V dimensions, you get a better image than the seemingly unbalanced 4:1:1, where the vertical color resolution appears to be four times the horizontal color resolution. Alas, it ain't so: while 4:2:0 works well with PAL and SECAM color encoding and broadcasting, interlace already diminishes vertical resolution, and the heavy filtering needed to properly process 4:2:0 images causes noticeable losses; as a result, multigeneration work in 4:2:0 is much more subject to visible degradation than multigeneration work in 4:1:1.
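Counting the samples makes the tradeoff concrete. A quick sketch (per color-difference channel, using the 720x480 525-line frame; PAL's 4:2:0 does the same thing on a 720x576 frame):

```python
W, H = 720, 480                  # 525-line (NTSC) frame

schemes = {
    "4:2:2": (W // 2, H),        # 360 chroma samples on every line
    "4:1:1": (W // 4, H),        # 180 chroma samples on every line
    "4:2:0": (W // 2, H // 2),   # 360 chroma samples on every other line
}
for name, (cw, ch) in schemes.items():
    print(f"{name}: {cw} x {ch} = {cw * ch:6d} chroma samples per frame")

# 4:1:1 and 4:2:0 carry the same total (86,400 samples per channel);
# they differ only in how that total is split between horizontal and
# vertical resolution, which is exactly the argument above.
```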
"Now how much would you pay? But wait, there's more!" In US implementations
of 4:2:0, the color samples are supposed to be vertically interleaved with
luminance, whereas in European 4:2:0 they're supposed to be co-sited. Practically
speaking, this is a headache for developers of codecs, encoders, and DVEs,
but for DV purposes it's not especially exciting, since only European DV
is 4:2:0.
Why does PAL DV use 4:2:0?
The best explanation I can come up with for why PAL DV went with 4:2:0 is that both PAL and SECAM show reduced vertical color resolution and better horizontal color resolution compared to NTSC, so 4:2:0 seemed a closer match to the native display systems in PAL/SECAM countries. As PAL DV was intended as a consumer format for off-air recording or camcorder acquisition, multigeneration losses in 4:2:0 were considered less important than optimizing first-generation performance. PAL DVCAM also uses 4:2:0.
When Panasonic developed DVCPRO, they opted for 4:1:1 even in PAL versions,
specifically for the multigeneration advantage. Thus PAL DVCPRO decks have
the pleasure and responsibility of handling both 4:1:1 DVCPRO playback
and 4:2:0 DV playback; they have extra hardware to digitally resample the
4:2:0 signal and come up with a decently synthesized 4:1:1. Sometimes there
is a reason for the higher prices that the poor Europeans are saddled
with when it comes time to purchase gear...
Can I chroma-key with 4:1:1?
Yes indeed. Many early DVEs were 4:1:1 internally; plenty of digital boxes out there still are (such as the Panasonic WJ-MX50 and Sony FXE-series vision mixers, both of which chroma-key). As previously mentioned, BetaSP could be considered a 3:1:1 format in terms of component bandwidth, and BetaSP is used for chroma-key applications all the time.
True, the chroma performance of 4:2:2 formats is superior to 4:1:1 formats, especially in multigeneration analog dubbing. Part of the standard JVC sales pitch for Digital-S is the superiority of 4:2:2 (which is true), and the utter doom and degradation that awaits you should you try to do anything -- including chroma-key -- with a 4:1:1 format (which is, shall we say, a wee bit exaggerated). But that doesn't mean that you can't do very satisfactory work in 4:1:1. A Bentley may not be as fancy as a Rolls Royce, but it'll still get you there in style. If you're used to the VW Beetle world of color-under analog formats, DV's Bentley should present few problems.
JVC has an excellent Digital-S demo tape showing multigeneration performance
comparisons of DV, Digital-S, and Digital Betacam; watch it if you can.
Just be sure you take the hype with a grain of salt...
Can I use 4:1:1 DV sources for upconversion to HDTV?
All SDTV source material will suffer when upconverted to HDTV, compared with material originated in HD to begin with. 4:1:1 material is reported by some to be problematic in this aspect; certainly a 4:2:2 original will be more forgiving and if upconversion is your primary goal, you may want to look closely at Digital-S or DVCPRO50.
Snell & Wilcox have run DV through upconversion and report that it looks OK, especially if the excessive aperture correction (edge enhancement) in most DV cameras is turned down.
Of more concern is that DV artifacts, especially mosquito noise, may become annoyingly prominent when upconverted. However, the jury is still out on this.
Also, all HD material (at least in the USA) is likely to be 16:9.
The way many DV cameras produce 16:9 by throwing away vertical resolution
is enough to send shudders up my spine for SDTV work; for HD, it'll be
a complete disaster. Perhaps I should add a section on shooting for HD
upconversion; there are lots of issues...
What is IEEE-1394?
IEEE-1394 is a standard communications protocol for high-speed, short-distance data transfer. It was developed from Apple Computer's original "FireWire" proposal (FireWire is a trademark of Apple Computer). Check out the white papers on Adaptec's website and check DVCentral's links for pointers to additional 1394 sites for detailed information.
Sony calls their implementation of 1394 "i.Link".
Why are DV and 1394 always discussed together?
They appear to have been developed together. The data stored on DV tape appear to reflect the packet structure sent across a 1394 link to a frightening degree of exactness. Certainly the DV format and the 1394 High Performance Serial Bus co-evolved, such that the first consumer DV camcorder in the USA (the Sony DCR-VX1000 and its single-chip brother the VX700) was also the first 1394-equipped consumer product available.
What does a 1394 connection do for me?
Plenty of good things:
Is 1394 that much better than Y/C or component analog?
Yes. A 1394 dub is a digital copy. It's identical to the original.
That's really nice.
Yes, you can do almost the same thing with a SMPTE 259M SDI (serial digital interface) transfer. But VTRs with SDI cost big money. 1394 is built into many low-end cameras and VTRs, and the connecting cable -- even at Sony prices -- is only $50. And transferring DV around as baseband video, even digitally, subjects it to the small but definite degradation of repeated decompression/recompression.
If a digitally-perfect copy is a 10, and a point-the-camera-at-the-screen-and-pray
transfer is a 1, here's how DV picture quality holds up over different
transfer methods:
IEEE-1394 | 10 |
SDI | 9.8 |
Analog Component (Y, R-Y, B-Y) | 9 |
Y/C ("S-video") | 8 |
Analog Composite | 5 |
Point camera at screen and pray | 1 |
What is locked vs. unlocked audio?
Locked audio is "audio done right": the audio sample clock (the digital time reference used in the sampling process) is precisely locked to the video sample clock such that there is exactly the same number of audio samples recorded per "audio frame" of video (not all TV formats and sound sample rates have a neat integer relationship between audio samples and frames, so an "audio frame" is my term [similar to a "color frame"] for the number of video frames it takes for audio and video to match up in the same phase relationship).
For PAL, 625/50 video, locked audio provides exactly the same number of samples per video frame with either 32 or 48kHz audio, but for NTSC, 525/59.94 video, the 48kHz "audio frame" is 5 video frames: locked audio will provide exactly the same number of audio samples for every five video frames, though not every frame within that 5-frame sequence has an equal number of audio samples. 32kHz locked "audio frames" cover a whopping 15 video frames!
[There is such a thing as an AES/EBU audio frame, but I'm not sure if that's the same thing I'm referring to. Comments/clarifications welcomed!]
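If you'd like to check those "audio frame" figures yourself, the arithmetic is simple (a sketch using Python's fractions module; the frame and sample rates are the ones discussed above):

```python
from fractions import Fraction

NTSC = Fraction(30000, 1001)   # 29.97 frames/sec
PAL = Fraction(25)             # 25 frames/sec

def audio_frame(sample_rate, fps):
    """How many video frames until the audio sample count comes out whole."""
    samples_per_frame = Fraction(sample_rate) / fps
    frames = samples_per_frame.denominator
    return frames, int(samples_per_frame * frames)

for rate in (48000, 32000):
    print(f"NTSC {rate} Hz: {audio_frame(rate, NTSC)}")
    print(f"PAL  {rate} Hz: {audio_frame(rate, PAL)}")
# NTSC 48 kHz -> (5, 8008): 8008 samples every 5 frames.
# NTSC 32 kHz -> (15, 16016): the whopping 15-frame sequence above.
# PAL locks on every single frame: (1, 1920) and (1, 1280).
```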
Unlocked audio: theory:
Unfortunately, such precisely-locked audio clocks are expensive. Since DV was designed as a consumer format, unlocked audio was allowed as a cost-saving measure. In unlocked audio, the audio clock is allowed some imprecision, such that there can be a variation from the locked spec of up to +/- 25 audio samples written to tape for every frame, instead of a precise and exact number.
This economy measure is simply one of allowing the audio clock to "hunt" a bit around the desired frequency; the phase-locked loop (or other slaving method) used to keep the audio sampling in sync with the video sampling can have a bit more slop in its lock-up, with the audio sampling sometimes running a bit slower, sometimes a bit faster, but always staying in sync over the long run. The total amount of sync slippage allowed in unlocked audio is +/- 1/3 frame -- not enough to really worry about.
It's the difference between walking a dog on a short leather leash, always forcing the dog to stay right by your side (locked audio), and using a long, elastic leash or one of those "retractable clothesline" leashes that allows the dog to run ahead a bit or lag behind (unlocked audio). In either case both you and the dog will get where you're going at the same time, but along the way the "unlocked" dog has a bit more freedom to deviate from your exact walking pace.
Unlocked audio should not cause audio sync to drift away from video over a long period of time. The audio clock is still linked to the video clock; it's just allowed a bit more oscillation about the desired frequency (more wow & flutter if you will) as it's trying to track the video clock. Like the dog on the springy leash, it can run a bit ahead or a bit behind the video clock momentarily (up to 1/3 frame ahead or behind), but in the long run it'll still be pacing the video clock and on average will be right there in sync with it. I have shot one-hour continuous takes of talking heads with a consumer DV camcorder (DCR-VX1000) and experienced no drift at all between audio and video.
DV cameras and VTRs generate unlocked audio, both in 32 kHz 12 bit and in 48 kHz 16 bit recordings. DVCAM and DVCPRO cameras and VTRs generate locked audio in the 48/16 audio format, and DVCAM can also generate locked 32/12 audio. 44.1kHz, discussed below, is never locked; it has no neat integer relationship with either 625/50 or 525/59.94 frame rates.
Some non-linear DV/1394 editors generate locked audio, some output unlocked, and some allow the choice. DV gear is happy to record locked audio via 1394, as is the DVCAM DSR-20 VTR. The DVCAM DSR-30 VTR can also be made to record unlocked audio with a bit of coaxing (see Tidbits).
Also, many non-linear editors output 16 bit 44.1 kHz audio (at least on PC platforms), which both DV and DVCAM 1394-equipped decks record without any problems. 44.1 kHz is part of the Blue Book spec, so this is not too surprising.
(Many thanks to Earl Jamgochian at Sony for filling in and clarifying many of the details in this section!)
Unlocked audio: real life:
In real life, unlocked audio clocks aren't just loose; they can run off-frequency. This was revealed at NAB '99 by Randy Ubillos, lead engineer on Final Cut Pro, who has found that while some cameras are pretty good, Canon cameras grab 48kHz sound at around 48.009 kHz, which can result in almost a second of video/audio slippage over the course of an hour (or around one frame every two minutes). Sonys, by contrast, seem to average 48.001 or 48.0005 kHz, resulting in perhaps a couple of frames of slippage over the same time period. Clocking rates for other cameras were not discussed.
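The arithmetic behind those figures, for the record (a sketch; the clock rates are the ballpark numbers reported above):

```python
NTSC_FPS = 30000 / 1001

def hourly_drift(actual_hz, nominal_hz=48000.0):
    """Audio/video slip per hour for an off-frequency sample clock."""
    slip_seconds = 3600.0 * (actual_hz - nominal_hz) / nominal_hz
    return slip_seconds, slip_seconds * NTSC_FPS

for label, hz in [("Canon", 48009.0), ("Sony", 48001.0), ("Sony", 48000.5)]:
    seconds, frames = hourly_drift(hz)
    print(f"{label} @ {hz} Hz: {seconds:.2f} s (~{frames:.0f} frames)/hour")
# 48.009 kHz works out to roughly 0.7 s (~20 frames) per hour; the Sony
# rates to a frame or two per hour -- the same order as quoted above.
```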
In normal playback of the DV tape this isn't seen: the audio is clocked out based on its embedded timing data, in sync with the image. Both the audio and video slave to the data samples in each packet; as these are commingled in the DV datastream, the sound and picture will always play back in sync.
In most DV NLE systems to date (May '99), it was also not a problem, since captures were limited to under ten minutes due to the 2 Gigabyte file size limit and the slippage seen in this short time period was minimal.
Final Cut Pro, however, uses file referencing to span the 2 Gig limit, allowing captures limited only by available disk space, and the QuickTime media format used treats audio and video as separate tracks, each with its own time reference. When capturing long clips, the drift can become apparent; Final Cut can measure this drift and recalculate the audio sample frequency so that QuickTime playback will stay in sync.
As far as I can tell, the AVI file format used in some Windows-based NLEs does not allow this sort of long-term slippage to occur, but I may simply lack sufficient data. I do know that various QuickTime-based DV NLEs have shown certain oddball audio/video sync problems that I have not seen or heard of in AVI-based NLEs; this is not a QuickTime problem per se, merely a side effect of QuickTime's flexible and elegant approach to multiple-track media streams, which makes such problems possible.
Will unlocked audio hurt me? How do I deal with it?
When using analog audio I/O, the whole question of locked vs unlocked is moot: it's analog and there are no clocks to worry about. Analog is always safe to use for dubbing or editing. As discussed above, DV audio data are converted to analog in real time as the data come off the tape, and audio slippage simply doesn't occur regardless of the accuracy of the sampling clock.
It should also be of no concern when taking the audio in via 1394 to a DV-based nonlinear editing system. When all the audio samples are stored in a neat memory array, the software doesn't care if there was some timebase instability on the original recording; when non-real-time rendering is occurring, a sample is a sample is a sample.
However, some long-term slippage between audio and video can occur in long clips, at least in QuickTime format, if the capture application doesn't compensate for any audio clock inaccuracy. Fortunately, the problem is understood by those in the business (at least at Apple and Digital Origin), and corrective measures are taken at capture time: Final Cut Pro measures the actual number of samples captured over time vs. the theoretical number, calculates the actual effective sampling rate, and uses that in QuickTime file processing.
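In sketch form, the correction is straightforward (my own illustration of the idea, not Apple's actual code; the 48.009 kHz figure is the ballpark Canon rate mentioned earlier):

```python
NTSC_FPS = 30000 / 1001

def effective_sample_rate(samples_captured, video_frames, fps=NTSC_FPS):
    """Derive the audio clock's true rate from what was actually captured."""
    clip_seconds = video_frames / fps
    return samples_captured / clip_seconds

# One hour of NTSC video from a camera whose "48 kHz" clock really runs
# at 48.009 kHz:
frames = round(3600 * NTSC_FPS)
samples = round(3600 * 48009)
print(f"effective rate: {effective_sample_rate(samples, frames):.1f} Hz")
# Tag the captured audio with ~48009 Hz instead of 48000 Hz and
# QuickTime plays it back in sync with the picture.
```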
Unlocked is only a potential problem when doing real-time audio
and video editing with digital transfer of the audio between source
and recorder. "Digital" means conveyance of the audio using the IEEE-1394
bus, AES/EBU digital audio outputs (on pro DVCAM/DVCPRO VTRs), or SDI embedded
audio (ditto).
As far as DV-based editing is concerned, when you make an edit in the digital domain between two different DV datastreams using unlocked audio, you might wind up with a few too many audio samples or not quite enough, in which case you can get a click or pop on the soundtrack during playback: the audio subsystem either has to discard some extra data and resynchronize (an audio buffer overrun), or it winds up with too few bits of sound to cover the time available (a buffer underrun) and you get a momentary dead spot or mute effect (depending on the audio circuitry used, the system may also mute when it's resynchronizing after discarding samples). In either case the audio glitch will occur in a fraction of a second; it won't result in several seconds of dead audio or any prolonged audio noise. Reportedly, it's also only a problem at the out-points of insert edits, not at edit in-points (unverified).
Interestingly enough, the same problem may occur when cutting between two locked audio streams without regard to synchronization of the "audio frames", though here the problem is much smaller in scope since the variation in sample counts will only be +/- 2 samples per video frame. Such errors are typically inaudible, though they may still complicate things if the audio track is then used in real-time digital audio mixing (see below), and they'll only occur in 525/59.94 video, never in 625/50, thanks to 625's 1:1 relationship between video frames and "audio frames".
[It's also worth noting that any hard cut between clips can result in a pop or click if the instantaneous level of the audio at the cut point is mismatched, causing impulse noise. This is true in locked or unlocked audio; it can even occur when working in analog. This is one reason that linear analog audiotape and film fullcoat mag tracks are often spliced at an angle instead of with a straight cut; this mechanically performs a quick crossfade between the two tracks instead of an abrupt transition.]
When all you are doing is editing one generation down from camera originals to an edit master, and then making release copies on an analog format such as BetaSP, SVHS, Hi8, VHS, or the like, all you need to be concerned about is audible popping or muting. The release copies will contain an analog track that records what you hear; there are no hidden gremlins due to asynchronous clocking, jitter, or other nasties that so complicate digital audio.
However, when you take the digital audio datastream from a DV tape and try to integrate it into a larger digital audio system, such as AES/EBU routers, digital audio workstations (DAWs), and/or multitrack digital audio recorders including the Alesis ADAT and Tascam DA-88/98, the sloppy synchronization of unlocked audio can cause glitches, artifacts, and distortion. If the receiving gear is trying to derive its audio clock from the unlocked audio datastream, the entire downstream audio chain can be rendered unstable and dysfunctional.
Furthermore, playback of unlocked audio including edit-point glitches as discussed above into a DAW or other digital audio system can cause a major commotion when the edit-point glitch is played back. Ever had a really bad splice go through the gate on a film projector, or past the heads on an analog audiotape recorder? A glitched unlocked audio edit is the digital equivalent of that crummy splice, only worse!
Fortunately it's fairly simple to avoid this. Either convert unlocked audio to locked, or use analog audio connections between your unlocked source and the digital audio chain you're feeding (and if your source tape has 44.1kHz/16 bit or 32kHz/12-bit sound, going analog into the digital system means that you get a rate conversion into 48kHz sound at however many bits are being used courtesy of the A/D converter on the professional digital system; it may actually sound better -- and be easier -- than hooking up digital sample rate converters in the chain).
There are four known ways to convert unlocked audio to locked audio:
1) The DSR-60/80/85 DVCAM VTRs will convert unlocked audio to locked audio on playback. DVCPRO VTRs are also supposed to relock DV audio on playback. This solves your problem at the point of playback. If you need to make a tape with locked audio, then...
2) Dub your DV tape to a DVCAM or DVCPRO tape using analog audio connections between the source and the recorder. Hey presto, locked audio! The video can be dubbed via SDI for minimal if any losses. This is also the recommended route if your source audio is not 48kHz, since you want the dub to have 48kHz audio for best compatibility.
3) Play back the DV tape in a high-end DVCAM or DVCPRO VTR, and dub it to a DVCAM DSR-80 or DSR-85 using either the AES/EBU digital audio or the SDI embedded audio options. The recorder will reclock the data and write locked audio to tape (this may also work with high-end DVCPRO machines, but I haven't confirmed this).
4) Transfer your footage into a non-linear editor that allows outputting
locked audio, and use the NLE to write out locked audio, even to a DV-format
tape. Slow and cranky, but it works.
The best thing when doing a linear edit is to use analog audio, or (if the only changes you have are between locked and unlocked audio) use the digital outputs from a high-end VTR as described above. For non-linear editing, capture clips each containing only a single format of audio; when you render the finished project, all the audio will be converted to a common format.
Does unlocked audio explain why my audio loses sync in Adobe Premiere?
Sorry, no! Adobe Premiere 4.2 and earlier versions have a historical
problem with synchronous audio playback from the timeline. As discussed
above, unlocked audio doesn't drift over the long term. Premiere audio
can drift regardless of whether the source was locked or unlocked. This
particular problem is variously attributed to the difference between 30
Hz and the 29.97 Hz that NTSC runs at; the inability of an AVI or QuickTime
file to maintain synchronous audio; the weakness of the Windows VFW subsystem
at really keeping things in sync, and the phases of the moon (if anyone
knows what's really going on, this author would appreciate being
appropriately enlightened).
Reportedly Premiere 5.1 fixes audio sync problems. Certainly I've had
no problems with Premiere 5.1 on Windows editing clips up to 9:30 in length
(the 2 Gig limit of my AVI-based system), nor have I heard of any such
problems in discussions with other people.
Can I use DV for linear editing?
Certainly! Much of the fuss that's made over DV formats is in regard to non-linear editing, but DV works fine for linear editing as well. DV gear interoperates with Hi8, SVHS, Betacam, MII, D-5, and other formats using composite, Y/C, component analog, and serial digital I/O (see Technical Details for which VTRs offer what I/Os). It works fine with the editors and SEGs and DVEs and terminal gear you're used to using.
What sort of linear editing gear can I get in DV? What sort of machine control is there? How accurate is it?
Low end: The Sony and Canon camcorders as well as the DHR-1000 and DSR-30 VTRs are all remote-controllable using the Sony Control-L (LANC) protocol. The Panasonic camcorders (some of them at least) have 5-pin Panasonic ("Control-M") ports. All work fine as edit sources.
The JVC DV camcorders offer "J-LIP" ports for remote control and editing. I haven't seen any editors that support J-LIP protocol directly (but see "mid-range" below).
The DHR-1000 and DSR-30 VTRs have built-in 10-event cuts-only editors as well as separate audio and video insert-edit capabilities, allowing them to be used as the controller in bare-bones cuts-only LANC editing. These decks, while rated at +/- 5 frames accuracy, appear to be frame accurate better than 90% of the time. In-points on the DHR-1000 appear to be frame accurate all the time and there's no reason to expect that the DSR-30 is any different. Out-points may occasionally be off by a frame or two.
If you don't want to use the built-in controllers on these decks, there are a variety of standalone edit controllers that talk LANC and/or control-M. Among these are Videonics' AB-1 Edit Suite and Video Toolkit, and TAO's Editizer, all notable as being control-agnostic systems: depending on the cables used and the setups performed, these will control any mixture of RS-232, RS-422, LANC, and control-M VTRs (great for interformat editing). In my experience TAO Editizer's accuracy is typically +/- 1 frame, with the actual in-point on the DHR-1000 being frame accurate but with the feeder decks being off by perhaps a frame about 20% of the time -- not bad, given that these decks don't capstan-bump and Editizer doesn't varispeed 'em in preroll. (Note that these editors typically only support assemble editing on LANC or control-M recorders; historically, that's all that LANC/control-M machines have been capable of in their Video8, Hi8, and SVHS incarnations.)
Mid-range: you can integrate low-end gear with high-end editing systems by using protocol converters, so that the lowly camcorder or VTR appears to be a standard, RS-422 protocol edit source. Note however that for the most part these protocol converters allow the low-end decks to serve as edit feeders only, not recorders.
LANC: Sony provides the IF-FXE2 LANC Interface Box, while TAO offers the L-Port 422 LANC to RS-422 converter.
Control-M: TAO is coming out with an improved L-Port 422 that also talks control-M (Panasonic 5-pin).
J-LIP: JVC offers the SA-K38U Control Interface, designed to allow the BR-DV10u dockable DV recorder to be controlled by an editor using either the RS-422 or JVC 12-pin interfaces. It probably works with the consumer DV camcorders as well, although I haven't verified this.
With all of these, the accuracy is likely to be in the +/- 1 to +/- 5 frame range depending on the edit controller used and the ballistics of the other decks involved.
High-end: The DSR-60/80/85 DVCAM and AJ-D6XX/7XX series DVCPRO
VTRs use industry-standard RS-422 serial protocols for assemble and insert
editing. They are frame-accurate, no-nonsense machines you'd use in editing
just like BetaSP, MII, DigiBeta, or D-5 VTRs.
Is DV timecode the same as SMPTE timecode?
No, technically speaking; yes, for most practical purposes (!).
There's a great deal of confusion about timecode. There are two different aspects of timecode that people mix up: how is it recorded on tape, and how is it used in editing. The first aspect is where "SMPTE" vs "RCTC" (Hi8) vs "DV TC" vs "Frame Code" (series 7 U-Matics) vs "CTL Time Code" (JVC SVHS) matters; it's largely a concern for historical reasons. The second aspect is what really matters: how does your editor see timecode. Nowadays, for the most part, how it's recorded on tape is irrelevant for this discussion.
Back in the dark, early days of linear editing with analog formats (the 1970s!), frame accuracy was not possible. Some clever folks came up with the idea of recording a unique code on every frame, so that edit controllers could repeatably reference an exact frame on tape. That developed into two timecode recording formats -- LTC and VITC -- that were formally standardized by the SMPTE and EBU, and adopted by manufacturers worldwide. The SMPTE/EBU timecode standards define where the timecode is recorded on tape, what amplitude the signal is, the encoding of the digital data, and so on. The standard also describes the time format of the timecode (HH:MM:SS:FF, two digits each of hours, minutes, seconds, and frames), and the format of "user bits", a separate set of hexadecimal digits the actual usage of which was left up to the individual.
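For what it's worth, HH:MM:SS:FF is just a human-friendly frame count. A sketch of the conversion (non-drop-frame, 30 fps for simplicity; NTSC editing also uses drop-frame timecode, which skips certain frame numbers to stay true to the 29.97 Hz clock):

```python
def tc_to_frames(tc: str, fps: int = 30) -> int:
    """HH:MM:SS:FF -> absolute frame number (non-drop-frame)."""
    hh, mm, ss, ff = (int(x) for x in tc.split(":"))
    return ((hh * 60 + mm) * 60 + ss) * fps + ff

def frames_to_tc(n: int, fps: int = 30) -> str:
    """Absolute frame number -> HH:MM:SS:FF (non-drop-frame)."""
    ff = n % fps; n //= fps
    ss = n % 60; n //= 60
    return f"{n // 60:02d}:{n % 60:02d}:{ss:02d}:{ff:02d}"

print(tc_to_frames("01:00:00:02"))   # 108002
print(frames_to_tc(108002))          # 01:00:00:02
```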
LTC ("litsee") is Linear Time Code, a 1 volt square wave laid down either on a linear audio channel or on a dedicated timecode track. It is comparatively simple to build LTC into a VTR or to retrofit it to a VTR without timecode, as it's technically simple and requires no mucking about with the video signal itself. However, it's difficult to read during some off-speed tape motions (as when shuttling or scanning the tape) and impossible to read when the tape is paused.
VITC ("vitsee") is Vertical Interval Time Code, is a series of black and white pulses encoded into one line of the vertical interval of the video signal itself. VITC can be read even during pause mode (as long as the VITC line in the the video signal is readable) but it requires the rotating video heads to scan the tape, which isn't always possible during high-speed searches or shuttles. It's also more complex to implement, since you need to switch it into the video signal.
Back when proprietary multipin control cables were used to control VTRs, it was important to know that "SMPTE timecode" was used, since you had to use an external box to extract the timecode from the LTC track or the VITC line, and adherence to the standard way of recording the timecode on tape was necessary to guarantee recovery of the signal.
In the past decade or so, however, most editing systems and most VTRs have been moving to standardized serial control protocols, such as RS-422 or LANC (actually, RS-422 is a wiring and signal spec, not a protocol per se, but most of the "RS-422" gear out there speaks the same language derived roughly from the original Sony BVU-800 control protocol, with minor variations between different machines). In such systems, timecode data flow across the same wires as the control data; it's up to the VTR to read timecode however it's written on the tape and turn it into a simple serial communications byte stream.
Furthermore, the SMPTE-spec timecodes aren't ideally suited to newer generation tape formats with slow tape motions, such as Video8, Hi8, and DV. LTC needs a fast-moving tape for proper data recording and recovery, and these formats just don't move the tape fast enough. Also, these formats already have digital data sectors on tape; why convert digital timecode to analog waveforms when you can record it as digital data to begin with?
Thus we have professional Hi8 with "Hi8 Timecode (but not really SMPTE timecode)" and consumer Hi8 with RCTC: "Rewriteable Consumer TimeCode". These are recorded as digital data in the subcode section of a Hi8 track. But an edit controller doesn't care; when it asks for timecode, it gets back something of the form "HH:MM:SS:FF", never you mind how it was recorded on tape! Likewise, the DV formats do digital magic to store timecode, but when an edit controller asks for it, it gets the same data over the wire that it would from a Hi8 VTR -- or a 1" Type C VTR, or Digital Betacam, or 3/4", or whatever.
Adding to the confusion is the "SMPTE TC" option for the EVO-9800 and 9850 Hi8 decks: This board takes the digital Hi8 TC or RCTC data and formats it into a 1 volt square wave signal as if it were coming off of an analog LTC timecode track. This allows the 9800/9850 to be used with edit controllers that don't understand serial timecode, but do their own recovery of it from the SMPTE LTC signal.
Are we having fun yet? The modern-day DVCAM DSR-60/80/85 and DVCPRO AJ-D6XX/7XX decks have this option built-in, allowing these VTRs to be used with editors that are expecting a noisy, distorted analog timecode and don't want nice, clean, serialized timecode data handed to them on a silver platter... really, though, I shouldn't be so snippy: while it sounds goofy from a technical standpoint, it provides backwards compatibility with a large installed base of very expensive editors, as well as a whole host of ancillary equipment that generates or takes in the SMPTE LTC signal. There are still occasions where having that LTC signal available on a BNC connector can be helpful, or downright necessary.
The bottom line is this: don't worry about whether or not the timecode recorded on tape is SMPTE or not. What matters is whether or not you have timecode, period (and DV does have timecode). Any modern-day edit controller should be able to use the timecode available over a serial protocol connection. For those that don't, or if you need SMPTE LTC I/O for other equipment (i.e., for a chase-lock audio synchronizer, an under-monitor display, or for jam-syncing of timecode from a common reference), there are DVCAM and DVCPRO decks that offer "SMPTE timecode" I/O ports.
What is non-linear editing?
Non-linear editing (NLE) is editing using random-access video storage, so that you don't have to wait for tape to shuttle to see a scene at the other end of the reel. Nowadays, this almost always means computer-based editing where you've transferred the video from tape to hard disk, and you assemble a show by arranging the clips along a timeline on the computer screen. When you're done, you output to tape, which happens either immediately (if you've spent a lot of money on gear) or after a rendering operation (if you've spent less money).
The "big names" in NLE are Avid (Media Composers of various flavors, models, qualities, and capabilities), Accom (formerly Scitex, formerly Imix) with its "sphere" products (descended from the VideoCube and TurboCube), Quantel (Harry, Henry, Harriet, EditBox, etc.), Media 100, D-Vision (turned into Discreet Logic Edit, and now is called Discreet edit*), and half a dozen more up-and-coming, hanging-in-there, and/or where-are-they-now companies. These typically supply turn-key systems in the $15,000 to $150,000 range, even though some are built using open platforms such as MacOS, Windows NT, Truevision Targa cards, and the like. Sony and Panasonic each have two DV-native NLEs.
On the PC and Mac, at Prices For The Rest Of Us, the familiar names are Adobe Premiere, ULead Media Studio, Speed Razor, MotoDV, Video Action, and the like. These are software packages that work with (and are often bundled with) a variety of plug-in cards, including DPS Spark Plus, Pinnacle DV300 and DV200, Fast DVMaster, Canopus DVRex and DVRaptor, ProMax FireMAX, and so on.
What's special about DV non-linear editing?
DV is compressed just enough to be able to stream into and out of current-day PCs and Macs, and the availability of inexpensive 1394 I/O cards and fast SCSI-2 hard disks means that high quality video storage and manipulation on desktop computers is now possible for the first time without having to spend a king's ransom on specialized RAID arrays and proprietary codecs.
DV can be stored and manipulated in native form, without transcoding to JPEG, MPEG, Wavelets, or the like. The same high quality seen on DV tape is maintained in the computer.
You can put together a DV editing system with 90 minutes of online storage for under $4000, and have a workable system that produces broadcast-quality output. If you already have an appropriate PC or Mac, you can get into DV editing for under $1200 (a 1394 card with editing software and a 9 Gig A/V hard disk). Of course, you can spend a lot more, adding onscreen, full-resolution scrubbing; more storage; better machine control and the like. But the high video quality is there from the start, even in the sub-$5000 system. This is a watershed moment in the evolution of affordable desktop editing.
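The storage arithmetic behind those numbers (a sketch, using DV's ~3.6 MB/s transfer rate and drive makers' 10^9-byte gigabytes):

```python
DV_BYTES_PER_SEC = 3.6e6     # DV over 1394: video + audio + subcode

def minutes_of_dv(gigabytes):
    """Minutes of DV footage that fit on a drive of the given size."""
    return gigabytes * 1e9 / DV_BYTES_PER_SEC / 60

for gb in (9, 18, 20):
    print(f"{gb:>2} GB drive: ~{minutes_of_dv(gb):.0f} minutes of DV")
# A 9 GB A/V drive holds a bit over 40 minutes; around 20 GB covers the
# 90 minutes of online storage mentioned above.
```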
Who makes non-linear editing stuff for DV? What gear is available?
The answer to these is changing almost on a daily basis. These are exciting times.
Low end: A variety of "soft codec" systems are available for PCs and Macs, starting around US$500 for the board and editing software. Among these are the Canopus DVRaptor, DPS Spark and Spark Plus (PC), the Pinnacle/Miro DV300 and DV200 (PC or Mac), the ProMax FireMAX (Mac), and Digital Origin's EditDV (Mac or PC). These systems only accept and output DV using an IEEE-1394 connection, although if you have other formats and a DV VTR, you can first re-record the video on the DV VTR and then bring it into the system (the DVCAM DSR-20 allows real-time composite or Y/C transcoding to DV without first recording the image on tape). By the same token, you can output to analog video using the DV VTR as a digital-to-analog converter.
In mid-November 1998, Sony introduced a standalone DV/analog transcoder box, the DVMC-DA1, which goes for under $400 at ProMax. You can also order them through Akiba Exports in Japan (they accept American Express and wire transfers only, as of November 1998 [Jim Akiba has started working for Canopus as of April '99; he's kept up the export business, but DVMC-DA1s have vanished from sight]). With this box, real-time transcoding of DV to/from analog (composite or Y/C) can be added to a "soft codec" system, making such systems more viable for use with analog sources, and bringing them closer to "hard codec" systems (below) in convenience.
In May/June of 1999, the DA1 was discontinued; a new model (DVMC-MS1?) is expected to be released at around US$600. In the latter half of 1999, ProMax is expected to have a 1394 to Y/C / YUV / SDI(601) transcoder box of their own design for around US$1500, as well.
The editing software supplied is Adobe Premiere, ULead Media Studio Pro, Speed Razor, Final Cut Pro, or something similar. Often, a separate DV capture/output utility is also provided.
Mid range: "Hard codec" board sets with editing software run around US$3000. The Canopus DVRex and FAST DVMaster are two PC-based examples. These typically allow the use of other formats with real-time transcoding to and from DV; DV is the native format used on-disk. "Hard codec" systems also typically allow better performance during "scrubbing" and other manual editing tasks but are not necessarily any faster at rendering the finished show.
The software supplied may be Premiere, Media Studio Pro, or -- for the DVMaster Pro -- InSync's Speed Razor DV. A separate capture/playback application can also be used.
(I discuss "soft" and "hard" codecs later in this FAQ.)
High end: FAST's "blue." system provides "any format in, any format out" editing for US$60,000 or so. The captured video stays in its native form on disk (DV, M-JPEG, BetaSX, DigiBeta, ITU-R-601, etc.; analog formats are transcoded to a digital format when captured) and is only transcoded when necessary to do effects between streams in different formats or when outputting to a different format. blue. is supposed to ship sometime in 1999.
"blue." has its own capture and editing application, developed with the experience gained from FAST's Video Machine line of products and incorporating user feedback. It's quite impressive, but perhaps a bit more expensive than readers of this FAQ are willing to put up with. :-)
Both Sony and Panasonic have Windows-based, turnkey systems for DVCAM (ES-3, using the blue. software; ES-7) and DVCPRO (DV Edit, NewsByte) respectively, that exploit the added features of these higher-end DV formats such as 4x transfer and editing metadata (in DVCAM, "ClipLink" good/no-good shot markers and clip picons provided by high-end DVCAM camcorders; in DVCPRO similar data are stored as "Picture Link" information). Prices start around US$25,000 (no real-time 3D effects, no 4x transfer) and go up from there.
There are also realtime transcoders coming on the market (such as Truevision's Madras and Como's DVbox, or the Sony DVMC discussed above, or the expected ProMax box) that transcode DV into other digital and analog formats for use with Avid, D-Vision, and similar high-end editors, typically using M-JPEG (motion JPEG) formats on-disk.
Avid showed a DV-native editor under Windows NT at NAB '98; it's shipping in 1999.
Can I build a PC- or Mac-based NLE system myself?
Yes. If you don't mind opening the computer case and fiddling with the innards, you can buy one of the low-end or mid-range board sets and do it yourself. But be warned, it's not a trivial task. All of these systems are very new, and most still have some bugs and incompatibilities. Also, DV systems pushed the limits of what you could do with early-to-mid-1997 PCs and Macs. Now, in the spring of 1999, most of the machines being shipped have the horsepower to handle DV (new blue Mac G3s, some Compaqs, and some Sony VAIOs even have 1394 built in), but it's still asking a lot from the computer to move DV data around at 3.6 MBytes/second without a glitch or hiccup. Careful attention to detail and optimization of system configurations and drivers are often required. Also be prepared to download the latest drivers from the Internet; often you'll need new video card drivers as well as newer drivers for the brand-new 1394 board you have just purchased.
Part of the joy of an "open systems" approach to building an editing system is that the list of possible conflicts and incompatibilities between different components of the system is huge and mutable. Scan the vendors' websites for lists of known good and/or known incompatible combinations of chipsets, hard disks, SCSI controllers, and the like. If you're still in doubt, ask your local VAR (Value Added Reseller, the fellow you're going to buy the stuff from) whether the stuff you're considering will all work together, or call the vendors directly and ask 'em if their board will work with your computer. One good tactic, if you're starting from scratch, is to settle on the DV card and software first, then buy a computer and the other components known to work with it.
Better yet, if you're a video producer and not especially interested in fiddling with the innards of PCs and Macs, have your VAR build a system to your specifications. Let them fight the IRQ limitations and driver-incompatibility hassles -- and be willing to pay for it. If time is money for you, think about how much time it would take to resolve these hassles yourself (it took me the better part of three days to get my DPS Spark installed, working, and stable enough for my satisfaction, since Windows decided to reshuffle interrupts every time I rebooted, and I had an old Matrox Millennium driver that hogged the PCI bus. During that time I was only half as productive as normal: what's 1.5 days of your time worth?).
If you're Mac-based or at least platform-agnostic, I recommend checking out the turnkey FireMAX systems from ProMax in Southern California. Their systems work with a minimum of hassles and their prices are very aggressive. Good customer support, too. Alone in the DV NLE world, FireMAX offers a low-res "offline" capture mode that fits over 1/2 hour of video in a gig, with automated batch recapture of full-resolution clips for your final "online" assembly; and full support for four-channel audio. The FireMAX "C" board is one of only two DV systems certified by Adobe as Premiere 5.1 compatible as of December 1998 (the other is Radius MotoDV). You can even order a custom-configured JLCooper control panel with jog-shuttle wheel and 20 dedicated Premiere function keys, for the traditionalist button-masher in all of us. Bloody amazing! [Disclaimer: I don't work for ProMax, nor do I get any sort of profit from recommending their stuff. I don't even own a FireMAX system... yet! Nor should I dissuade you from using other VARs or vendors, and/or getting a PC-based system.]
On the other hand, if you're a certifiable lunatic like me, just have at it! Just realize that it's still a "Plug and Pray" world inside that PC's case, and no, it's not an evil conspiracy against you when it doesn't work the first time. That's just the state of the art on the bleeding edge of desktop video technology...
How much DV fits in a Gigabyte?
1 Gigabyte of storage is about 4 minutes 45 seconds of DV video. 2 Gigs is about 9 minutes 30 seconds.
The rule of thumb I use when estimating the storage I'll need for a project is 4.5 minutes per gig. Thus a 9 Gig drive works out to about 40 minutes of storage. An array of four such drives yields 2.7 hours. Not bad for about US$1200 (USA retail prices, March 1999). And nowadays 9 Gig drives are small; you can get UDMA 16.8 Gig disks for almost the same price...
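If you'd rather compute than memorize, the rule of thumb falls straight out of the stream rate (a sketch, assuming the ~3.6 MByte/sec figure and the drive makers' decimal "Gigs"):

```python
DV_RATE = 3.6e6   # bytes/sec for the full DV stream, approximately

def minutes_per_gig(gig_bytes=1e9):
    """Minutes of DV per 'Gig' of disk (decimal Gigs, as drive makers count)."""
    return gig_bytes / DV_RATE / 60

print(f"{minutes_per_gig():.1f} minutes per Gig")           # ~4.6
print(f"9 Gig drive: ~{minutes_per_gig(9e9):.0f} minutes")  # ~42
```

The couple of minutes' difference from my 4.5 min/gig figure is file-system overhead and general prudence; round down when planning.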
Why is there a 2 Gig limit? How can I avoid it?
Two things lead up to the dreaded 2 Gig limit: the operating system and the file format.
Operating systems such as Windows have maximum sizes they allow for a "logical drive". For example, Windows 95 running the FAT16 file system (or Windows 3.1, or MS-DOS) can't access more than 2 Gigabytes on a drive. That's why you wind up partitioning that nice 9 Gig drive into five "logical drives": four 2 Gig partitions and one stubby little 468 Meg drive (drive makers specify "9 Gigs" using 1 billion [1000 x 1000 x 1000] bytes per Gig, whereas the logical drive sizes seen under Windows use 1024 x 1024 x 1024 bytes per Gig -- about a 7% difference in the resultant numbers).
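The arithmetic behind that 7% figure, if you want to verify it:

```python
decimal_gig = 1000**3   # how drive makers count a Gig: 1,000,000,000 bytes
binary_gig = 1024**3    # how Windows counts one: 1,073,741,824 bytes

print(f"Shrinkage: {(1 - decimal_gig / binary_gig) * 100:.1f}%")   # ~6.9%
print(f'A "9 Gig" drive is {9 * decimal_gig / binary_gig:.2f} Windows Gigs')
# -> about 8.4 Gigs: four 2 Gig partitions plus a stubby leftover
```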
If you have Windows95 OSR 2 or Windows98, you can format the drive with a FAT32 file system and avoid this limit (FAT32 allows logical drives up to 2 Terabytes, though individual files are still capped at 4 Gigs). With WindowsNT, you can use NTFS.
MacOS also allows partitions larger than 2 Gigs: 4 Gigs starting with System 7.5, and 2 Terabytes (!) starting with System 7.5.2.
The file format used for the stored video can also have the 2 Gig limit. On PC systems, the most common format is AVI (Audio/Video Interleave), which is limited to 2 Gigs. QuickTime files (Mac or PC) are also limited to a 2 Gig maximum file size at present, even if the disks the files are stored on can be bigger.
There are tricks to get around these limits. Some involve using specialized codecs that use indirection (the AVI or QuickTime file stores pointers to other files instead of raw data, similar to a Premiere "reference movie"); the Canopus DVRex and DVRaptor manage to address 4 Gigs of data in an AVI file through some such sleight-of-hand, while the FAST DVMaster uses a proprietary, non-AVI format for storage with no 2 Gig limit anywhere in sight. Final Cut Pro uses QuickTime reference files for seamless capture and playback without concern for the 2 Gig limit.
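The bookkeeping behind the indirection trick is simple enough; here's a toy sketch (mine, not any vendor's actual file format) of splitting a capture into sub-limit segments:

```python
DV_RATE = 3.6e6            # bytes/sec, approximately
FILE_LIMIT = 2 * 1024**3   # the 2 Gig AVI/QuickTime file-size ceiling

def plan_segments(duration_sec):
    """Split a capture into file segments that each stay under the limit;
    a 'reference movie' would then point at the segments in order."""
    remaining = int(duration_sec * DV_RATE)
    segments = []
    while remaining > 0:
        segments.append(min(remaining, FILE_LIMIT))
        remaining -= segments[-1]
    return segments

print(len(plan_segments(30 * 60)), "files for a 30-minute capture")  # 4
```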
Right now, Panasonic's DV Edit and NewsByte editors don't have any 2 GByte limitation. I don't know about Sony's ES-3 and ES-7.
What are SCSI-1, SCSI-2, Ultra-SCSI, etc.? What do I really need?
These are all peripheral buses for connecting hard drives (among other things) to computers.
SCSI-1 is the "original" SCSI. It's an 8-bit bus with a maximum 5 MB/sec transfer rate. As DV requires 3.6 MB/sec sustained, SCSI-1 is generally too close to the edge for reliable DV transfers. Remember, that 5 MB/sec rate assumes no hiccups, and your computer has more to do than just wait around to dump DV data to/from the SCSI bus.
SCSI-2, also known as "fast SCSI" or "fast narrow SCSI", doubles the data rate to 10 MB/sec. This is usually acceptable performance for DV capture and playback.
Fast-Wide SCSI uses a 16-bit data path for 20 MB/sec peak transfer rates (for this, you need to use the 68-pin cable, not the 50-pin Centronics or DB25 cables used for the slower flavors of SCSI). Likewise, Ultra SCSI or SCSI-3 yields 20 MB/sec through faster data clocking. Fast-Wide or Ultra SCSI drives are fine for DV editing.
Wide Ultra SCSI (Fast Wide 20) combines the 20 MHz transfer rates with a 16-bit bus for 40 MB/sec, really quite a bit faster than needed for DV. There are even faster variants of SCSI, but these are exotic and expensive and are definitely overkill.
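To put numbers on the headroom (peak bus rates only; sustained throughput is always lower, which is exactly why SCSI-1 is marginal):

```python
DV_RATE = 3.6   # MBytes/sec sustained, approximately

for bus, peak in [("SCSI-1", 5), ("Fast SCSI-2", 10),
                  ("Fast-Wide / Ultra SCSI", 20), ("Wide Ultra SCSI", 40)]:
    print(f"{bus}: {peak} MB/sec peak = {peak / DV_RATE:.1f}x DV's rate")
# SCSI-1's 1.4x margin evaporates once seeks, retries, and everything
# else the system is doing are taken into account.
```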
Want the big picture? Check out http://www.adaptec.com/products/guide/ioposter.html for a comprehensive matrix of I/O technologies. This big table lists all the SCSI flavors and everything else from parallel ports to USB to SSA to Fibre Channel to -- yes -- 1394.
Oh, yes: make sure that your hard drives are capable of the performance you need; just because a drive plugs into an Ultra-SCSI cable doesn't mean it can provide the sustained throughput needed for DV capture and playback. "A/V-rated" drives are a good bet; in general, check for 7200 rpm or faster rotation rates, plenty (512kB or more) of on-board read/write cache, and an advertised A/V capability. Faster never hurts: remember that time the computer sits around waiting to push data onto or read data off of the drive is time it isn't spending feeding data to/from the 1394 I/O card, updating the computer screen, reading the VTR's current position, or controlling the VTR.
What about Ultra-DMA?
Ultra-DMA, also known as UDMA, Ultra ATA, or Fast ATA-2, is a further enhancement of the EIDE disk-drive interface, available on the newer G3 Macs and on some PCs (Intel 440TX, LX, BX, and later chipsets; VIA/AMD VPX, VP2/97, and AMD-640 chipsets; the Promise Ultra33 (FastTrack) controller. Win95 and WinNT require an upgrade to exploit UDMA; Win98 supports it fully). UDMA drives tend to be a lot cheaper than SCSI-3 drives, and are often capable of stutter-free capture and playback of DV data. UDMA allows best-case transfer rates of 33.3 MB/sec, compared with the 16.6 MB/sec best-case rate of EIDE without UDMA (of course, this is only one of the bottlenecks in real-time DV work, which is why a fast raw transfer rate alone is not a sufficient indicator of DV suitability).
The early "blue" G3 Macs captured and played back DV without problems on UDMA drives, but not SCSI -- the PCI chipsets used (as of February 1999) appear to cause problems even with fast SCSI-3 drives. Apple released an update for the SCSI controller code in June 1999 that is supposed to solve these problems.
UDMA drives are backwards-compatible with IDE/EIDE controllers; you can drop a UDMA drive into an older computer and it will work. However, to get the level of performance needed for real-time DV work, you may need to have a UDMA-compatible controller with BIOS and OS support -- though I routinely play 9 minute clips from a Maxtor UDMA drive on a plain old EIDE controller with no dropped frames.
[More info on UDMA, from Maxtor]
Why doesn't my non-linear editor see timecode if it's already on the tape?
Unfortunately, some current DV NLEs do not capture timecode into the clips stored on disk. This is not a hardware problem, it's a problem with the capture programs used by these editors. As the market matures, expect the software to gain this capability (and if it doesn't, ask your NLE vendor why not).
The 2.00 and later software releases for DPS Spark (PC) do capture timecode, as does Pinnacle/Miro version 1.6 software. The Canopus plug-ins for Premiere 5.1 (PC) capture timecode, though the stand-alone Canopus tools do not. Timecode capture is present in ProMax's FireMAX (Mac) and Apple's Final Cut Pro (Mac, of course!).
Codecs exist for all kinds of compressed video, including DV, motion-JPEG, MPEG, Indeo, Cinepak, Sorensen, wavelet, fractal, RealVideo, vXtreme, and many others. (Indeo, Cinepak, Sorensen, RealVideo, and vXtreme are trademarks of their respective trademark holders.)
What are "hard" and "soft" codecs?
Hard codecs are hardware codecs, such as the Sony DVBK-1 or "DVGear" chip. You supply power and raw video at one end, and get compressed video out the other end in real time. Flip a switch and pump in compressed video, and raw, uncompressed video comes out.
Soft codecs are software modules that do the same thing, such as the
"DVSoft" codec that comes with the DPS Spark card. Unless your computer
is very powerful, though, and/or the codec is extremely simple (and the
DV codecs aren't that simple), it will take longer than real time
to compress or decompress the video stream, at least if you want the CPU
to do anything else at the same time.
Which codec is better?
That depends on what you're looking for, and what you want to spend.
In the world of nonlinear DV editing as of early 1998, here's how things break down:
One thing to keep in mind is that "hard" vs "soft" doesn't matter when it comes to picture quality: both give excellent, if not identical, results. Be aware, though, that minor codec differences can cause accumulated errors over multiple compression/decompression cycles [Pix: multigen with different codecs]. For example, the Sony soft codec used with the version 1.0 release of the DV300 causes a considerable Y/C delay over ten generations, whereas the Adaptec DVSoft codec shows either no such problem or a slight leftward chroma drift, depending on the testing done; the Radius codec seems to cause no drift either way. Not all DV codecs are designed the same way, as discussed by codec expert Guy Bonneau.
When capturing from or outputting to DV VTRs using a 1394 connection, it doesn't matter what kind of codec you have. A DV-based editor stores the same data on disk that travels across the 1394 wire; no compression or decompression occurs. Thus when you're doing capture or playback across a 1394 connection, all you're doing is a real-time data transfer; the codec isn't even in the loop.
The codec comes into play in the following situations:
Displaying DV video on the computer screen: A hard codec frees up the computer's CPU to do things like shuffle video-overlay data around, whereas the soft codec takes CPU resources to decompress the DV video for computer display. Thus, all else being equal, the hard codec systems will offer larger real-time video windows on the computer display, and will allow better real-time jogging, shuttling, and scrubbing.
Hard codec systems such as the FAST DVMaster and Canopus DVRex display decently-sized (up to 360x240 or more) onscreen windows with real-time, 30fps scrubbing; the actual window size is only limited by the speed of the graphics card and its overlay capability. By contrast, soft codec systems such as DPS Spark and Pinnacle/Miro DV300 offer near-real-time scrubbing only in tiny, 120x80 windows; if the windows get much larger the frame rate drops off dramatically because the soft codec is taking too much of the CPU's time to allow for timely updates to the screen (though with the proper video card under Windows NT, DV300's software allows much improved performance in this operation). Final Cut Pro running on a 300 MHz G3 Mac can scrub almost full-screen DV to the Mac monitor and out the FireWire port using the soft codec alone, with little stuttering or frame dropping. As processors speed up, so do the soft codecs.
Hard codecs also allow scrubbing to the video (not computer) monitor with no extra equipment: DV data are pumped into the codec as the timeline is traversed, and the codec outputs the raw video and audio to a TV monitor.
Soft codec systems require an "offboard" codec/transcoder to see video on a TV monitor: a 1394-equipped VTR such as a DHR-1000, DSR-20, or DSR-30, a 1394 camcorder, or the Sony DVMC-DA1 standalone DV/analog transcoder box. The soft codec can also decompress the DV for display on the computer monitor. While this works (as in ProMax's FireMAX editor, or using Adaptec's DVSoft codec on the PC in Premiere), remember that when the soft codec is stealing CPU cycles to render things on the computer screen, the CPU's ability to dump data across the 1394 wire can be compromised; frame rates can suffer, leading to juddery, stuttering video output, though as processors get faster this is less of an issue.
Rendering transitions, titles, and effects: here the difference between hard and soft codec systems is less pronounced. To add an effect (say, a dissolve or wipe between two clips), the system has to take the two source frames, decompress them, perform the mix, and recompress the resulting frame. The soft codec takes CPU power to run, but the CPU has nothing else to do while waiting for the frames, so it might as well be involved. The hard codec runs in real time, but the CPU, once it has set up the data transfers, has to sit and wait for the output anyway. In early 1998, various vendors claim a 25% speed advantage of hard codecs over soft codecs, or a 30% advantage of soft codecs over hard codecs, or whatever... Too much depends on other factors, like the speed of the computer's CPU, bus and bus interface chipset, to decisively say that one codec will be faster than the other in effects rendering. However, as CPUs and buses speed up over time, the soft codecs (which, unlike their hard counterparts, aren't limited to running at real-time rates) are likely to take the lead in speed for rendering operations.
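For the code-minded: the mix at the heart of a dissolve is trivial; the codec calls wrapped around it are where "hard vs. soft" matters. A conceptual sketch (decompress/compress below are stand-ins, not any product's API):

```python
def decompress(frame):   # stand-in for the DV codec, hard or soft
    return frame         # pretend frames arrive as lists of pixel values

def compress(pixels):    # stand-in for recompression back to DV
    return pixels

def render_dissolve_frame(frame_a, frame_b, t):
    """One output frame of a dissolve: decompress both sources, mix per
    pixel with weight t (0.0 at the start, 1.0 at the end), recompress."""
    a, b = decompress(frame_a), decompress(frame_b)
    return compress([(1 - t) * pa + t * pb for pa, pb in zip(a, b)])

print(render_dissolve_frame([0, 0, 0], [200, 200, 200], 0.5))  # [100.0, ...]
```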
[Side note: we're discussing "single-stream" operations here: there is one video stream in the system and one codec; so to do a dual-stream effect requires that the available bandwidth and codec be shared between the two source streams of video and the output. Rendering an effect is inherently a non-real-time operation in such systems, no matter whether the codec is hard or soft. To date (unless FAST blue. offers it) the only native DV "dual-stream" systems which allow real-time effects rendering are the rather pricey Sony ES-3 and ES-7 and Panasonic NewsByte and DVEdit systems. This requires three codecs: two to decompress the two source streams used as inputs to the effect, and one to recompress the output back to disk. Currently this is cost-prohibitive and pushes beyond the limits of what can be done on affordable desktop computers without adding expensive dedicated disk controllers and the like, which defeats DV NLE's sweet spot: affordable high-quality video on affordable computers. Dual-stream capability is seen on some higher-end editing systems, mostly using M-JPEG codecs.]
Capturing from or outputting to non-DV VTRs: hard codec systems come with breakout boxes that include analog (composite, Y/C, and sometimes component YUV) connections as well as 1394 connections. You can connect up any VTR format with analog I/O to the box and capture it in real-time or output to it in real-time. This makes it easy, for example, to bring legacy Hi8 or Betacam footage into the editor to intercut with newer DV material. You don't even need to have a DV VTR or camcorder around to use the system, as it has its own hard codec onboard.
Soft codec systems supply a 1394 board for connection to a VTR, but offer no other inputs or outputs. For outputs, any 1394-equipped camcorder or VTR can be used to transcode to analog (composite or Y/C), so you can record the output of your NLE to Hi8, SVHS, BetaSP, or the like in real-time by using a DV VTR or camcorder as a transcoder (of course, you must have your DV machine present to act as the transcoder as there is no non-1394 output available).
However, to bring non-DV material into your soft codec based system, you may first have to dub the material to a DV tape: aside from the DSR-20, none of the 1394-equipped VTRs will transcode from analog inputs to DV "live" without first recording the material to tape. So it can still be done, but it's a two-step process: dub to DV, then capture (or buy a DSR-20 for live transcoding).
As mentioned above, Sony's DVMC-DA1 standalone DV/analog transcoder box (introduced in mid-November 1998) adds "live" real-time transcoding of DV to/from analog (composite or Y/C) to a "soft codec" system, making such systems more viable for use with analog sources, and bringing them closer to "hard codec" systems in convenience. These $400 boxes are in short supply in the USA, but ProMax has them or can get them, as can Akiba Exports in Japan. [Note: the DA1 appears to be discontinued as of June 1999. A more expensive Sony model is in the offing, and ProMax is expected to release a US$1500(?) 1394 to SDI/YUV/Y/C transcoder of their own design in the latter half of the year.]
Buying a system, and paying for it: hard codec systems are not cheap; they run around US$3000 at the present time. You can't just buy them on a whim, and even if you know you're going to use it, it might be difficult to conjure $3000 out of thin air to pay for it (I haven't met anyone getting rich by making video!).
Soft codec systems cost around US$700, which is considerably more affordable for most folks in this market. They're a much better choice if you are cash-poor or aren't sure that DV NLE is for you.
My recommendation? If your time is valuable (you edit video for a living), and looking at tiny onscreen windows is more than just a minor annoyance, you'll be happier in the long run with a hard codec system (you'll be much happier with a realtime, dual-stream system, but these get rather expensive). It's just a bit less tiresome to work with, and faster when you want to import non-DV material. It's also more convenient when sitting there with a client looking over your shoulder, since the onscreen previews are bigger and faster. You don't need to have your camera or DV VTR present to play back to the TV monitor, and if you're a small shop with limited resources and a busy schedule, this can justify the cost of the hard codec: your camera can be out shooting the next show while you edit the current one, and the money you save by not having to buy a VTR as an offboard codec will pay for the hard codec system.
On the other hand, the part-time videomaker, the short-of-cash, and the casual "prosumer" might well be better off with the soft codec systems. $700 is certainly a lot more affordable than $3000, and if you decide that DV editing isn't for you, you're out less money. If you spend most of your time doing something other than editing, then the interactive speed advantages of the hard codec may not matter much compared to the higher cost.
What am I using? A soft codec based DPS Spark. But then, my main job is software engineering, not video editing; the payback period for the more expensive product was too far out there (and in the meantime, hard codec prices are likely to drop). Besides, I'm used to looking at tiny pix: I used to cut double-system sound, A/B roll shows on Super8 film...
It seems to be generally accepted that JPEG compression at 3:1 is roughly equivalent in quality to DV's 5:1 compression. It's also worth remembering that DV and JPEG are both DCT (Discrete Cosine Transform) codecs; they tend to have similar artifacts and effects on pictures. (DV gets its additional compressive efficiency through block-level optimization of quantizing tables, whereas JPEG uses a fixed quantizing table for an entire image).
Thus, one might venture to guess that whether one is compressing via 5:1 DV or 3:1 JPEG, similar amounts of damage are done to the image, and that transcoding between these two compression schemes might cause less degradation than the initial compression caused.
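One way to look at the comparison is bits per pixel after compression (a rough sketch with 8-bit samples; note that the raw numbers favor JPEG, so the rule of thumb is really about how the pictures look -- DV's 4:1:1 prefiltering and per-block quantizer tuning let it get away with fewer bits):

```python
def bits_per_pixel(chroma_fraction):
    """8 bits of luma per pixel, plus two chroma channels scaled by how
    much of full 4:4:4 chroma survives the sampling structure."""
    return 8 + 2 * 8 * chroma_fraction

dv = bits_per_pixel(0.25) / 5    # 4:1:1 keeps 1/4 chroma; 5:1 compression
jpeg = bits_per_pixel(0.5) / 3   # 4:2:2 keeps 1/2 chroma; 3:1 compression
print(f"DV: {dv:.1f} bpp; 3:1 JPEG on 4:2:2: {jpeg:.1f} bpp")  # 2.4 vs 5.3
```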
Indeed, at NAB '96 Panasonic had hidden away in a corner a most interesting demonstration. A D-5 (uncompressed ITU-R-601) signal was fed to a component digital switcher on input #1. It was also taken, compressed via the DVCPRO codec, decompressed, and fed to input #2. The processed signal was further fed through a Tektronix ProFile DDR using JPEG at around 2.5-3:1 compression, and played back to input #3. That signal was again fed through a DVCPRO compression/decompression chain, and brought up on input #4.
A wipe pattern was set up, and by pressing buttons one could see a split-screen of any two signals on the switcher. Remember, this was a digital component switcher, and the monitor was one of those gorgeous Panasonic digital monitors where the image data stay digital all the way to the modulating grid (really, these are amazing monitors; if you haven't seen one, you don't know how good video can look).
The original D-5 image was deep, quiescent, lucent: as good as 525/59.94 images get. The first DVCPRO-processed image showed the usual sorts of DV artifacts we've all come to know and love, but it was still pretty darn good; you had to look closely to see any degradation.
But that was it: the further stages of processing showed no noticeable difference. The initial DV compression had already thrown away the troublesome transients and difficult details. What survived the initial DV codec was a DCT-friendly image that suffered very little from further compression in the ProFile, and the ProFile-processed image ran through the final DVCPRO codec with ease.
I'm not saying the images were identical; there were probably minor truncations and losses occurring in the ProFile's JPEG codec and in the final DVCPRO codec. However, these were very minor and visually imperceptible. Because the entire signal path was digital, the image stayed in registration throughout; there was no shifting of 8x8 DCT block boundaries nor were there level shifts and noise introductions as could occur in analog connections, both of which could degrade further compression. Moreover, the compression on the ProFile was very mild; it was at least as good, visually speaking, as the DVCPRO compression.
So, it can be done. Bear in mind that the level of JPEG compression used is a big determinant of whether you can transcode successfully. If you're using low JPEG compressions of 3:1, 2:1, or less, and transcode in the digital domain (through a serial digital connection or software conversion, rather than via an analog connection to a JPEG codec), you will see very, very little degradation of the image. If you dump your DV data into the JPEG world via an analog connection, or if you use higher compression rates, you will see a progressively higher amount of degradation.
Even so, there's always the risk of some loss. As a fellow said at SIGGRAPH '86, "Dealing with floating-point numbers is like shoveling sand: when you pick up a handful, you get a little dirt, and some sand trickles out..." and the same can be said about moving between different codecs.
Which is better for editing: DV or M-JPEG?
Ahh, now that's the question! And with systems like DraCo's Casablanca, Matrox's DigiSuiteLE, or Pinnacle's ReelTime that work in M-JPEG but offer 1394 I/O, what's going on?
DV is good because if you've shot in DV and stay in DV on disk, there's no transcoding required. DV is ideally suited to desktop editing because the data rates are viable on not-too-exotic SCSI-2 or ultra-SCSI disks and controllers; you can assemble a perfectly usable DV editing system for under US$4000 and produce excellent, broadcast-quality work (well, technically, at least; despite what the manufacturers would have you believe, no format or software guarantees to make you a creative genius).
If you're shooting DV, why not stay in DV all the way? The sweet spot for this format (to borrow Panasonic's DVCPRO slogan) is "faster, better, cheaper", and you can't get comparable M-JPEG quality for DV prices, DV data rates, and DV storage requirements.
On the other hand, DV's fixed data rate means that 25 Mbits/second is what you get: you can't use a DV codec to grab hours of low-res "offline" quality to disk for a rough edit.
M-JPEG is a mature technology used in most high-end (Avid, Accom, Discreet, etc.) editing systems. It offers the ability to capture at different rates, so you can save on disk space for the offline work and redigitize the rough cut for the online clean-up. At the lesser compression levels it offers potentially higher quality; if you're doing a lot of multi-pass or multi-layer effects work, you'll wind up with fewer cascaded compression artifacts with a high-end M-JPEG system.
Whether the difference is a visible one by the time your program hits a VHS cassette or an over-the-air analog transmission is arguable, but it is an issue to be aware of, especially if you need to protect as much quality as possible for DVD or future DTV usage.
M-JPEG will cost you more for the same level of quality, requiring faster disks or RAID arrays, and more of 'em.
Systems like Casablanca, DigiSuite, or ReelTime are M-JPEG at the core, but offer a 1394 connection so that you can pipe your DV data in and digitally transcode it to M-JPEG. As I discuss above, this need not visually degrade the image, assuming the underlying data rate is high enough that low compression levels can be used. It's definitely going to be better than an analog connection between your DV source and the M-JPEG data on-disk; these systems may seem odd, but they make sense from a technical standpoint.
So which one is better? It depends on your needs, your target distribution methods, and your budget. If you can't make up your mind, get the "blue." system from FAST, and mix 'n' match M-JPEG data and DV data as suits your mood!
What about MPEG-2 editing?
Editing systems using "Studio Profile" MPEG-2, otherwise known as 4:2:2P@ML MPEG-2, are starting to appear. This is basically the same compression scheme used in BetacamSX, a flavor of MPEG-2 using 1- or 2-frame GoPs (groups of pictures). I have no experience (yet) with such systems -- nor does just about anyone else in August of 1998 -- so it's hard to really review them in any detail. (FAST has announced the 601 (six-oh-one), and Pinnacle has shown the DC1000 at NAB '99; both are MPEG-2 editors, so we should soon start to see some real-world reports on their performance.)
For what it's worth, some in the industry as of August 1998 are predicting that before too long there will be only two flavors of compression used in editing: DV and MPEG-2. Both formats are "native" capture formats (DV, DVCAM, DVCPRO for DV, and BetacamSX for MPEG-2) and MPEG-2 is the distribution format for American DTV, whereas M-JPEG introduces a compression step that's neither native to an acquisition format nor used for distribution.
The European Broadcasting Union, in Annex C of the SMPTE/EBU Task Force for Harmonized Standards for the Exchange of Program Material as Bitstreams Final Report, backs this up by recommending that DV family and MPEG-2 4:2:2P@ML family compression schemes be used in future networked television production. We'll see...
16:9 is the widescreen format that the USA has standardized on for future DTV services. It has also been used in the NHK 1125-line analog HDTV standard and the Eureka 1250-line HDTV standard, as well as in a variety of enhanced SDTV (standard-definition TV) services in Europe and Japan. The screen is 16 units wide by 9 units high; the "aspect ratio" is called 16:9 because that's easier to remember than the "normalized" number, approximately 1.78:1.
Currently, most SDTV in the world is 4:3, or 12:9, or 1.33:1.
Why should I care about 16:9?
As the world slowly and painfully switches over to digital broadcasting, it looks to be a 16:9 world we're all moving into. Although it's likely to take ten years or more before 16:9 receivers outnumber 4:3 receivers worldwide, and there will always be a huge legacy of 4:3 SDTV programs in the vaults, "premium" programming in the future will almost certainly be 16:9 material, in both "standard definition" and "high definition" forms.
4:3 program material won't be obsoleted by any means, but many forward-looking producers are composing and shooting for 16:9 to maintain as high a value as possible for all future distribution possibilities. Some are actually shooting 16:9, while others are practicing "shoot and protect" in 4:3, just by making sure that the material can be cropped to 16:9 without losing any important content from the top or bottom of the image.
How do you get 16:9 pictures?
You can use the 16:9 switch on your camera (if it has one, and if it does 16:9 "the right way"). Or, you can shoot and protect a 16:9 picture on 4:3. Or, you can use an anamorphic lens.
Many cameras have a 16:9 switch, which when activated results in either a "letterboxed" image and/or an anamorphically-stretched image. But beware; there's a right way and a wrong way to do this.
The "right way" is to use a 16:9 CCD. When in 4:3 mode, the camera ignores the "side panels" of the CCD, and reads a 4:3 image from the center portion of the chip. When in 16:9 mode, the entire chip is used. In either case, the same number of scanlines is used: 480 (525/59.94 DV) or 576 (625/50 DV). You can tell when a camera is capturing 16:9 the "right way" because when you throw the switch, whether the resultant image is letterboxed in the finder or squashed, a wider angle of view horizontally is shown, whereas the same vertical angle of view is present.
The "wrong way" is for the camera to simply chop off the top and bottom scanlines of the image to get the widescreen picture. When you throw the switch on these cameras, the horizontal angle of view doesn't change, but the image is cropped at the top and bottom compared to the 4:3 image (it may then be digitally stretched to fill the screen, but only 75% of the actual original scanlines are being used).
[There are some Philips switchable cameras that do clever tricks with subdivided pixels on the CCDs; when you flip into 16:9 mode, the image's angle of view will get wider horizontally and tighter vertically. So to really be sure, use the change -- or lack thereof -- in the horizontal angle of view to see if your camera is doing 16:9 "the right way".]
The "wrong way" is wrong because the resultant image only uses 360 lines (525/59.94) or 432 lines (625/50) of the CCD instead of the entire 480 or 576. When this is displayed anamorphically on your monitor, the camera has digitally rescaled the lines to fit the entire raster, but 1/4 of the vertical resolution has been irretrievably lost. This is not too terrible for SDTV playback (still, it isn't great), but it's asking for disaster if the image is upconverted to HDTV.
The bad news is that most inexpensive DV cameras (including the VX1000 and XL-1) do 16:9 the wrong way. 16:9 chips are still very costly and the yields are low; in late '98 Sony's DXC-D30WS 16:9-capable DSP camera (which, docked with the DSR-1 DVCAM deck, becomes the DXC-D130WS camcorder) was only available in short supply, and the Sony sales force was encouraged to steer folks to the non-widescreen D30 model unless they really needed widescreen, because the supplies were so limited. Even then, the WS model commands a US$3000 premium over its 4:3-only sibling. The Panasonic AJ-D900W, which is switchable between 4:3 and true 16:9 as well as between DVCPRO and DVCPRO50, is a good choice -- but at US$39,000 it's not readily affordable the way a US$3,900 XL-1 is... At NAB '99 the US$15,000 DSR-500WS single-piece DVCAM camcorder was released; it's excellent, and I'll post a review of it sometime soon. It's the current entry-level true 16:9 camera in the DV formats.
An anamorphic lens is the way film folks have done widescreen for years, though the video systems to use this (aside from the late, lamented Panacam) are few and far between. A cylindrical element squashes the image laterally, so that you get the tall, skinny pictures like images in a fun-house mirror. This squashing allows the 16:9 image to fit in the 4:3 frame. Century Precision Optics has an anamorphic adapter to fit the VX1000 and DSR-200 camcorders, as does Optex (distributed in the USA by ZGC). Both allow you to use the wider half of the zoom range, and both run about US$800.
In the film theatre, or in the print lab, another anamorphic lens unsquashes the image to yield the original widescreen image. In video, you'd have to use a DVE or an NLE plug-in filter to unsquash the image, or you'd embed the appropriate codes into the data stream or video image (the codes differ in specification between different broadcast standards) to tell the receiver that the image should be displayed as widescreen.
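If you want to do the letterboxing yourself in software, it's a single resampling operation. A minimal sketch using the Pillow imaging library (my choice for illustration; the filenames are hypothetical, and a real NLE filter does the same thing with better filtering):

```python
from PIL import Image   # Pillow imaging library

# An anamorphic 16:9 image squeezed into DV's 4:3 720x480 raster can be
# letterboxed for normal 4:3 display: squash it to 3/4 height (360 lines)
# and pad top and bottom with black bands.
frame = Image.open("anamorphic_frame.png")    # hypothetical frame grab
squashed = frame.resize((720, 360))
letterboxed = Image.new("RGB", (720, 480))    # black 4:3 canvas
letterboxed.paste(squashed, (0, 60))          # centered: 60 lines top/bottom
letterboxed.save("letterboxed_frame.png")
```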
So what's a poor DV shooter to do, if he or she can't afford a true 16:9 camera like the AJ-D900W or DSR-500WS, and can't find an anamorphic lens? Shoot and protect 16:9 on 4:3. Use the entire, non-widescreen 4:3 image, but protect your future revenue streams by ensuring that all important visual information is contained vertically in the center or upper 3/4 of the screen. That way you have the full resolution 4:3 image for use today, and you can always upconvert to HDTV later in the 4:3 aspect ratio or the 16:9 aspect ratio if you can accept the reduced vertical resolution. Should you need to repurpose the material into a 16:9 SDTV format later, you can letterbox it in post by setting up a vertical shutter wipe, putting black bands at the top and bottom of the screen just like on MTV. You're no worse off than with 16:9 material shot "the wrong way", but you have the freedom and flexibility of a full-resolution 4:3 image that's compatible with today's broadcast and non-broadcast standards.
Several cameras, including the Panasonic AJ-EZ1 and AJ-D210 and the Canon XL-1, have a "frame mode" or "movie mode" switch that appears to change the way the CCD is read out into buffer memory from interlaced to progressive scanning. This gives a 30 fps "film look" frame-based image instead of the 60 fps field-based image we normally see on TV.
A still frame taken from a fast pan of a scene shot in frame mode with the XL-1 shows no interlace artifacts when viewed in Premiere; each 720x480 frame shows up as an intact frame-based image in which both the even and odd fields appear to have been captured at exactly the same time (of course, the data stream written to tape still interleaves the even and odd fields for proper interlaced TV display; it's just that both fields appear to have been clocked into the transport-and-hold registers of the CCD simultaneously instead of in even-odd alternation). When shown on TV, frame mode images have had their temporal resolution reduced by half to 30 fps, fairly close to film's 24 fps. For the 625/50 XL-1s sold in PAL countries, the 25 fps video frame rate will make for an even closer match.
This is very exciting, especially for anyone wanting to originate on DV and transfer to film for release. The noninterlaced frame-based images should yield a much better film transfer. And for those wanting the "film look" on tape, this is a good start. Better yet, use a PAL camera in frame mode: the resulting 25 fps images transfer to film very, very nicely...
How do I get "film look" shooting with DV cameras?
Buy a used Arriflex 16BL or CP GSMO, stencil "Canon XL-1 DV camcorder" on the side, and shoot film!
Seriously, though, the most important way to get a filmlike look is to shoot film style. Light scenes, don't just go with whatever light is there. Use lockdowns or dolly shots, not zooms. Pan and tilt sparingly to avoid motion judder (i.e., if you're using the XL-1's frame mode, you shouldn't compose any shot to call attention to the 30 fps mode). If you're using a camera that allows it (VX1000, most pro cameras), back down the "detail" or "sharpness" control. Reduce chroma slightly. Lock the exposure; don't let it drift. Use wide apertures, selective focus, and "layered" lighting to separate subjects from the background. Pay attention to sound quality. In post, stick mostly to fades, cuts, and dissolves; avoid gimmicky wipes and DVE moves.
Beyond that, you can use "frame mode" on the XL-1, Panasonic AJ-EZ1, or AJ-D210; try 15 or 30 fps on the VX1000. On the Sony it's not the same as frame mode and has other problems, but it may pass as film for some purposes.
On higher-end cameras (DSR-300, DSR-130, AJ-D700, and the like), you may have setup files to adjust gamma, clipping, sharpness, color rendition, and white compression; these can be exploited to give the camera a more filmlike transfer characteristic.
Take the aperture correction (edge enhancement or sharpness setting), if available, and turn it down or off. This also makes a huge difference both in film transfer and in HDTV upconversion.
Try out the Tiffen Pro-Mist filters. I like the Black Pro-Mist #1 or lower (fractional numbers). Jan Crittenden at Panasonic prefers the Warm Pro-Mist 1/2. These knock off a bit of high-frequency detail and add a bit of halation around highlights. Bonus: by fuzzing the light around bright, sharp transitions, these filters have the added effect of reducing hard-to-compress high-contrast edges, resulting in fewer "mosquito noise" artifacts.
In post, there are a variety of filters or processes available to adjust the gamma; or to simulate 3-2 pulldown, gate weave, dust and scratches, film fogging, and so on.
There are also proprietary processes such as "Filmlook" that, for a price, make the video look so film-like that real film looks like video by comparison.
Of course, if you really wanted film, why didn't you shoot film?
:-)
What do the slow shutter speeds do for me?
The slow shutter speeds (those slower than the normal 1/60 second) found on many DV cameras use the digital frame buffer of the camera in conjunction with a variable clock on the CCDs to accumulate more than a field's worth of light on the face of the chip before transferring the image to the buffer and thence to tape. This can do two things for you: more light integration, and slower frame update rates.
More light integration means that you can get usable images in lower light than you might expect. I've shot sea turtles by moonlight at midnight at a 1/4 sec shutter speed; the images update slowly but are certainly recognizable, whereas the same scene at the normal 1/60 shutter looked like I had left the lens cap on.
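The light gain is easy to quantify (a sketch; real CCDs lose a bit to readout and smear, so treat these as upper bounds):

```python
import math

def stops_gained(shutter_sec, base_sec=1/60):
    """Extra exposure vs. a normal 1/60-sec field, in photographic stops."""
    return math.log2(shutter_sec / base_sec)

for s in (1/30, 1/15, 1/8, 1/4):
    print(f"1/{round(1 / s)} sec: +{stops_gained(s):.1f} stops")
# 1/4 sec integrates 15 fields' worth of light: nearly 4 stops.
```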
You can also use the long shutter times as a poor man's "clear scan" for recording computer monitors without flicker. As you increase the integration time on the CCD, the computer monitor goes through more complete cycles before the image is transferred, reducing recorded flicker; many computer images have little motion, so the slow update rate may not even be noticed. Be aware, however, that at least some cameras (the Sony VX1000 among them) appear to go into a strange field-doubling mode at shutter speeds below 60: vertical resolution is cut in half (while two clearly-interlaced fields are recorded on tape, as can be seen in an NLE, the field-mode flag is set in the DV datastream so that field-doubling is performed by the DV codec during playback to eliminate interfield flicker), so fine detail will be impaired. You'll need to judge this tradeoff on a case-by-case basis.
Slower frame update rates are good for two things: a poor man's "film look" at 30 fps or 15 fps, and special effects at slower rates. You can capture a strobing, strangely disturbing image at the lower rates... use it sparingly, of course; no sense in annoying the viewers.
The smooth friction of a traditional, mechanically coupled focus ring, alas, plays havoc with the autofocus systems that all consumer cameras must have (or so goes the conventional wisdom): strong, battery-draining motors are needed to spin such barrels, and they can't deliver the fast focus response that's so useful in optimizing autofocus algorithms.
Thus the autofocus lenses nowadays use lighter, more easily positioned internal focusing elements (which are also advantageous from an optical standpoint) with lighter, faster, more thrifty focus servos.
The "focus ring" you manhandle isn't actually connected to the focusing mechanism. It's a free-spinning ring with an optical or electromagnetic sensor attached: when you spin the ring, a series of pulses is sent to the focus controller. The faster the pulse train, the faster the controller changes focus.
However, it's not perfectly linear. If you turn the ring too slowly, nothing at all will happen, since the controller discards all pulses below a certain rate as random noise. If you spin it 1/4 turn very quickly, you'll get more of a focus shift than if you turn it 1/4 turn at a more moderate rate.
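As a toy model of what the focus controller seems to be doing (every constant here is invented for illustration; the real firmware and thresholds are the manufacturers' secrets):

```python
def focus_speed(pulses_per_sec, dead_zone=10, gain=0.5, exponent=1.5):
    """Pulse rates below the dead zone are discarded as noise; above it,
    response grows faster than linearly, so a fast quarter-turn moves
    focus further than the same quarter-turn at a moderate rate."""
    if pulses_per_sec < dead_zone:
        return 0.0
    return gain * pulses_per_sec ** exponent

for rate in (5, 20, 80):   # slow, moderate, and fast ring spins
    print(f"{rate} pulses/sec -> focus speed {focus_speed(rate):.0f}")
```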
As a result of all of this, there's no way for the focus ring to have focus marks -- nor is it possible for you to measure such marks yourself and be able to repeat them.
The same argument applies to the zoom controls on some lenses, such as the 16:1 on the Canon XL-1.
How do I work with these lenses?
Carefully, with patience and understanding. You can't set marks, or focus by scale. Slow, fine adjustments may do nothing. But with practice, and perhaps some adjustment of operating style, most people can use, if not necessarily love, these lenses.
On the XL-1, you'll get better zoom control and smoother operations by sticking to the zoom rocker on the handgrip instead of using the zoom ring on the lens. Some folks are taping over the zoom ring entirely and only using the rocker.
Don't like it? Buy a real camera with a real lens, like the DSR-300 (US$10,000 and up, with lens) or the AJ-D210 (US$7,000 or so). Hey, it's only money...
The EIS/DIS controllers look for motion vectors in the image (typically a widespread displacement of the entire image) and then decide how to "reposition" the active image area on the chip to catch the image in the same place. The actual repositioning is done in one of two ways. One is to enlarge (zoom) the image digitally so that the full raster of the chip isn't used; the controller can then "pan and scan" within the full chip raster to catch the image as it moves about. The other is to use an oversize CCD, so that there are unused borders the active area can be moved around in without first zooming the image.
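Conceptually, the pan-and-scan flavor boils down to a few lines (a sketch, not any camera's firmware; the margin figure is invented):

```python
def reposition(crop_x, crop_y, motion_dx, motion_dy, margin=32):
    """Shift the active window against the detected global motion vector,
    clamped to the spare border ('margin' pixels each way) left by the
    digital zoom or the oversize CCD."""
    crop_x = max(-margin, min(margin, crop_x - motion_dx))
    crop_y = max(-margin, min(margin, crop_y - motion_dy))
    return crop_x, crop_y   # at +/- margin the stabilizer has "run out
                            # of chip" and must recenter

print(reposition(0, 0, 5, -3))   # shake right/up -> window shifts left/down
```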
The zoom-style pan 'n' scanner can be detected quite simply: if the image zooms in a bit when EIS/DIS is turned on, then a zoom-style pan 'n' scanner is being used. Unfortunately, such methods reduce resolution, often unacceptably.
All EIS/DIS systems suffer from several problems. One is that, because the actual image is moving across the face of the chip, image shakes induce motion blur. Even though the position of an image may be perfectly stabilized, you can often notice a transient blurring of the image along the direction of the shake. Sometimes it's quite noticeable. To get around this, many EIS/DIS systems close down the shutter a bit to reduce blur. This reduces light gathering capability. You can't have everything, you know.
Another problem is that the motion-vector approach to stabilization can be easily fooled. If the area of the image being scanned doesn't have any contrasty detail that the processor can lock onto, the stabilization can hunt, oscillate, or bounce. This looks like a mini-earthquake on the tape, and it can occur at the most annoying times.
Also, the stabilization can work too well. Often when one starts a slow pan or tilt with EIS/DIS engaged, the system will see the start of the move as a shake, and compensate for it! Eventually, of course, the stabilizer "runs out of chip" and resets, and the image abruptly recenters itself.
The big advantage of EIS/DIS is that it's cheap.
What's optical stabilization?
Optical stabilization such as "SteadyShot" is descended from Juan de la Cierva's 1962 Dynalens design, a servo-controlled fluid prism used to steer the image before it hits the CCDs (in the '60s, of course, it steered images onto film or onto tubes!). In the late '80s and early '90s, Canon and Sony updated this technology for use in consumer gear, and it worked so well that Canon now offers a SteadyShot attachment for some of their pro/broadcast lenses.
The fluid prism is constructed of a pair of glass plates surrounded by a bellows and filled with fluid so that the entire assembly has a refractive index comparable to a glass prism. The angle of the prism is changed by tilting the plates; one plate can be rotated vertically, moving the image up or down, and the other rotates horizontally, steering the picture right or left.
Rotation rate sensors detect shake frequencies and tilt the front and back plates appropriately. Position sensors are also used so that in the absence of motion the prism naturally centers. The position sensors also detect when the prism is about to hit its limit stops, and reduce the corrections applied so that shake gradually enters the image instead of banging in as the prism hits its limits.
Optical stabilization of this sort is expensive, tricky to manufacture and calibrate, and must be tuned to the lens. Adding a wide-angle or telephoto adapter to a SteadyShot lens screws up SteadyShot; the processor doesn't know about the changed angle of view (all it knows is the current zoom setting) and thus over- or under-compensates for shake.
But for all that, it works brilliantly: because the image is stabilized on the face of the CCDs, there is no motion blur; because rate sensors are used, the system isn't fooled by motion in the scene or by lack of detail; because a physical system has to move to reposition the image, there are no instantaneous image bounces or resets as can happen with EIS/DIS.
[It's interesting to note that on the XL-1, Canon added image motion-vector detection to the rate gyros on their optical stabilizer. As a result, the system seems to "stick" on slow pans and tilts just like an EIS/DIS system, although the recovery is more fluid and less jarring. On the other hand, it really does a superb job on handheld lockdowns.]
What about Steadicam/GlideCam?
These mechanical stabilizers work by setting up the camera so that it has large rotational moments of inertia, but little reason to want to rotate: the camera is mounted on an arm or pole that's gimballed at its center of gravity or just above it. The gimbal mount is either handheld, or attached to an arm, often articulated and countersprung, mounted on a body bracket or vest. One steers the camera by light touches near the gimbal; otherwise it just tends to float along in whatever attitude it's already at. The trick is in getting it into an attitude that makes nice pictures, stopping it there, and then not disturbing it.
These systems work very well, but require a lot of practice for best results. It's very easy to oversteer the camera, and off-level horizons are a trademark of suboptimal Steadicam skills. The handheld systems can also be surprisingly fatiguing to use for extended periods.
I find that the Steadicam JR is also a bit wobbly; its plastic arms aren't especially rigid and the whole thing tends to vibrate a bit. Fortunately, the wiggles that get through the JR are neatly compensated for by SteadyShot in the VX1000, resulting in buttery-smooth moving camera shots (complete with off-level horizons!).
When do I use what kind of image stabilization?
Try it; see if it works; if it helps, then use it.
I tend to leave SteadyShot on the VX1000 on most of the time. I'll turn it off when using the wide-angle adapter, or when using the camera on a tripod and needing to conserve power.
If I'm planning to do any significant camera motion during a shot, and I don't have a wheelchair, dolly, car, airplane, or helicopter available (there's never a helicopter around when you need one...), I'll use the Steadicam. Depending on the roughness of the ride in the aforementioned conveyances, and space allowing, I'll use Steadicam there, too.
And don't forget that other, less glamorous form of stabilization: the tripod. Tripods work really, really well. Try one sometime, you'll like what it does for your image!
DV: Tips, Tricks, and Links
Cameras:
Sony VX-1000: Video University's VX-1000 Page and busy VX-1000 User Forum
Canon XL1: Video University's XL1 page
John Beale's stunningly excellent Sony DCR-TRV900 site: Using the Sony DCR-TRV900 Camcorder