4 Frame sizes for newbies

This section covers resizing: which frame size you should capture at, and whether and how you should resize. It is meant for people who are new to capturing; people who already know how to capture, or who want to know more, can follow section 5 instead.

This section explains how to resize to the correct target size. How this should be done depends on a number of things: whether you live in a PAL or NTSC country (subsection 4.1 explains a few things about the PAL and NTSC standards), your capture size, your active capture window (discussed in subsection 4.2) and the end format. Subsection 4.3 gives the minimal recommended capture size. Subsection 4.4 discusses how to determine the correct resizing settings.

4.1 Sample rates of some standard formats

It is important to understand that the size of the frame relates directly to the sampling rate used to create the pixels. If a 53.333 µs ⁽¹⁾ analogue signal is sampled at 14.32 MHz, you get about 764 pixels. If this same line is sampled at 13.5 MHz, you get 720 pixels, although both cover the same area. This concept of sampling can be reversed to understand how digital is turned back into analogue: if a device knows the standard, it knows how to create the pixels or what to do with them.

Here are some 'standard' sampling rates: (Note: sample rate (in MHz) * width (in µs) = pixels.)

Standard	Sample rate in MHz	Width of standard in µs	Pixels
DV/DVD	13.5	53.333	720
DVD	13.5	52.148	704
SVCD	9	53.333	480
VCD¹/CVD	6.75	52.148	352
PAL TV on a PC	14.769 ²	52	768
1/2 (or 1/4 ¹) PAL TV on a PC	7.3845 ²	52	384
NTSC TV on a PC	12.3064 ²	52.6555	648
1/2 (or 1/4 ¹) NTSC TV on a PC	6.153 ²	52.6555	324

(1) VCD, 1/4 PAL and 1/4 NTSC have half the height of the other "standards"
(2) There are no devices that sample to or from PC pixels. The rates are derived from the fact that PC pixels are the same height as width, and screens have an aspect of 4:3. Given this, a full frame of PAL TV of 576 lines plays on a PC in a 768x576 frame. NTSC TV plays on a PC in a 648x486 frame.
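
The relation in the note above the table (sample rate * width = pixels) can be checked with a few lines of Python. This is only a sketch: the function name pixels_per_line is made up for this illustration, and the numbers are simply copied from the table.

```python
def pixels_per_line(sample_rate_mhz, width_us):
    """Pixels per line = sample rate (MHz) * sampled width (µs), rounded."""
    return round(sample_rate_mhz * width_us)

# Rows copied from the table above: (standard, sample rate in MHz, width in µs)
rows = [
    ("DV/DVD", 13.5, 53.333),
    ("DVD", 13.5, 52.148),
    ("SVCD", 9, 53.333),
    ("VCD/CVD", 6.75, 52.148),
    ("PAL TV on a PC", 14.769, 52),
    ("NTSC TV on a PC", 12.3064, 52.6555),
]

for name, rate, width in rows:
    print(name, pixels_per_line(rate, width))
```

Running it should reproduce the "Pixels" column of the table.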

References:
Digital video resolutions: A Quick Guide to Digital Video Resolution and Aspect Ratio Conversions.

4.2 Approximate active capture window

Capture cards only capture/sample a portion of this signal (this portion is called the active capture window). If you want to know how to determine this portion, you have to look at section 5. In this section, it will only be explained how to get an approximate value for the active capture window.

As stated in the introduction (subsection 3.1), for PAL the active picture is contained in about 52 µs of 576 active scan lines. Looking at various capture chips/drivers (listed in section 5), it turns out that the portion which is actually captured lies between 51.56 µs and 53.333 µs. The former capture window misses a part of the image (for 720x576 this results in 6 missing pixels) and the latter contains black borders (for 720x576 this results in a border of 18 black pixels).
Similarly, for NTSC the active picture is contained in about 52.666 µs of 486 active scan lines. Looking at various capture chips/drivers (listed in section 5), it turns out that the portion which is actually captured lies between 50.96 µs and 53.333 µs. The former capture window misses a part of the image (for 720x480 this results in 23 missing pixels) and the latter contains black borders (for 720x480 this results in a border of 9 black pixels).
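
The border and missing-pixel counts quoted above can be estimated with a small calculation. The sketch below assumes that the driver spreads the captured window evenly over the requested width; the function name border_pixels and the sign convention (positive = black border, negative = missing picture) are made up for this illustration.

```python
def border_pixels(capture_window_us, active_us, capture_width):
    """Estimate black border (positive) or missing picture (negative) in pixels.

    One µs of the captured window corresponds to
    capture_width / capture_window_us pixels.
    """
    pixels_per_us = capture_width / capture_window_us
    return round((capture_window_us - active_us) * pixels_per_us)

print(border_pixels(53.333, 52.0, 720))    # PAL, wide window: black border
print(border_pixels(51.56, 52.0, 720))     # PAL, narrow window: missing picture
print(border_pixels(53.333, 52.666, 720))  # NTSC, wide window: black border
```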

Before giving a method to approximate the active capture window, you should check whether your capture chip/driver falls in one of the following groups:

DV devices: 53.333 µs (PAL and NTSC)
NVIDIA GeForce based cards mostly use Philips capture chips
Philips chips (PAL): 52.15 µs (hardly any black borders) or 53.333 µs (720x576: much black border)
Philips chips (NTSC): 52.15 µs (no black borders) or 53.333 µs (720x480: black border)
BT878 with BTWincap drivers: 52.15 µs (PAL, hardly any black border) and 52.74 µs (NTSC, hardly any black border)
non-BTWincap Brooktree (BT8x8) or Conexant (CX238xx): 51.56 µs (PAL, no black border), 50.96 µs (NTSC, no black border)

If your chip is not listed above, or you want to check it, you should make a test capture. Make a capture from a suitable ⁽²⁾ source, preferably more than once. Try to select a frame with something white at the edges (like a paper on the announcer's desk).

suitable sources:
- local station news broadcasts (from the studio itself)
- local station test picture

unsuitable sources:
- local sports broadcast
- news with imported footage (example: Al Jazeera footage on Dutch news)
- film broadcast
- foreign series
- cheap video productions
- anything foreign that was converted to local standards

Count the black pixels in your capture. The following two tables give two ranges for several capture sizes. Anything in the first range is assumed to be 52 µs for PAL (or 52.666 µs for NTSC); anything above it is assumed to be 53.333 µs. Note that these ranges are arbitrary (they were defined by the authors of this guide); they are chosen in such a way that a horizontal offset of 5-6 pixels is allowed ⁽³⁾. You will get an error, but hopefully it will be acceptable.

black pixel table - PAL:

PAL	52 µs (no black borders)	53.333 µs (black borders)
768x576	0-10	>10
720x576	0-9	>9
704x576	0-9	>9

black pixel table - NTSC:

NTSC	52.666 µs (no black borders)	53.333 µs (black borders)
640x480	0-8	>8
720x480	0-7	>7
704x480	0-7	>7

Some examples:

PAL (720x576): SAA7134, Terratec Cinergy TV 400 (driver WDM v1.2.0.5): 2 pixels => active capture window ~ 52 µs
NTSC (720x480): Conexant Fusion 878A (driver: BTWincap 5.3.5.0): 4 pixels => active capture window ~ 52.666 µs
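
The lookup in the two black pixel tables can also be written down as a small function. This is a sketch only: the names NO_BORDER_LIMIT and approximate_window_us are invented here, and the limits are simply the upper ends of the "no black borders" ranges from the tables.

```python
# Upper limits (in black pixels) of the "no black borders" column,
# copied from the PAL and NTSC tables above.
NO_BORDER_LIMIT = {
    "PAL": {768: 10, 720: 9, 704: 9},
    "NTSC": {640: 8, 720: 7, 704: 7},
}
NARROW_WINDOW_US = {"PAL": 52.0, "NTSC": 52.666}
WIDE_WINDOW_US = 53.333

def approximate_window_us(standard, capture_width, black_pixels):
    """Return the approximate active capture window in µs."""
    if black_pixels <= NO_BORDER_LIMIT[standard][capture_width]:
        return NARROW_WINDOW_US[standard]
    return WIDE_WINDOW_US

print(approximate_window_us("PAL", 720, 2))   # few black pixels: narrow window
print(approximate_window_us("PAL", 720, 18))  # clear black border: wide window
```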

4.3 Recommended capture size

What size should you capture at, and what codec should be used at that size? It is very important to realize that the card samples at a fixed sample rate and the driver downsizes the picture afterwards, regardless of which capture size you request. The basic rules are:

CVD/VCD: you should capture at DVD size and resize to the target size. If you capture directly at the target size, too much quality gets lost.

DVD/SVCD: you should capture at the target size. If, for example, it isn't possible to capture at DVD size, you should capture at a lower size and go for a different format like SVCD or CVD. The reason is that in this case the difference between the capture size and the target size is small, and an extra resize would degrade the quality too much. For example, instead of capturing and denoising at 768x576 (for PAL) and resizing to 720x576, it is better quality-wise to capture and denoise at the target size 720x576.

XviD/DivX: Two cases will be considered here: good quality high-size capping and good quality low-size capping. The latter can be used when you have a "slow" pc, or you don't have much hard disk space.

good quality high-size capping:

You should capture using a horizontal size of 768, 720 or 704 (for PAL, with a vertical size of 576) or 720, 704 or 640 (for NTSC, with vertical size of 480). The reason is that in general the bigger the frame size, the better the quality (this is actually not the complete truth, more can be found in section 5). There are some important issues:

For NTSC it is quality-wise better to capture at or close to 640x480, so that no additional resizing is needed. Have a look at section 4.4 for some examples.
If you capture at these high sizes, it might be easier to denoise your capture. But then again, it will also contain more noise.

good quality low-size capping:

If you have a "slow" pc, or you don't have much hard disk space, you have to choose a lower capture size. The recommended sizes are:

device	PAL (with vertical downsize)	NTSC (with vertical downsize)
bt8x8	400x576	368x480
cx2388x	384x576	320x480
saa71xx	384x576	320x480
ati cards with theater chip	480x576	400x480

The reasons are explained in section 5.3. In all cases, you have to calculate the resize settings to be done at the post-processing level (as done in section 4.4 with final size 384x288 (PAL) and 320x240 (NTSC)).

Huffyuv/MJPEG (at 18 or 19): If you are using an MJPEG codec for capturing (at quality 18 or 19) and you want good quality, you should do that at high sizes. At lower sizes (SVCD or CVD) you should use Huffyuv instead.

4.4 Resizing

If you care to get the aspect ratio correct, and you capture at a standard size, resizing is almost always required. This is because capture card/driver combinations almost never capture exactly the complete picture, but mostly either a bit too little or a bit too much. In addition, they scale (instead of pad ⁽⁴⁾) to the size you requested. This means that asking for a size such as 768, 720, 704, 640, etc. almost always gets you a slightly distorted picture. If your application/driver allows for a custom capture size, you can simply capture at the desired size for your card and skip this step. Regardless, you will have to calculate the 'correct' size. Your destination size should be divisible by 16, since motion estimation uses 16x16 pixel sized macroblocks. It is assumed that your device captures 576 lines for PAL and 480 for NTSC. That's right, all devices crop six NTSC lines to get 480 lines ⁽⁵⁾.
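
As a small illustration of the "divisible by 16" requirement, the sketch below rounds a width to the nearest multiple of 16. The function name nearest_mod16 is made up for this example, and rounding ties upwards is an arbitrary choice, not something this guide prescribes.

```python
def nearest_mod16(width):
    """Round an integer width to the nearest multiple of 16 (ties go up)."""
    return 16 * ((width + 8) // 16)

for width in (702, 712, 788):
    print(width, "->", nearest_mod16(width))
```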

There are two methods for determining the correct size. The first is short and works well for standard destination sizes. It may also require you to add a small border to hit your desired standard. In general, the first method will only work if you are able to cap at arbitrary sizes. The second will give you the tools to figure out any custom options you may want. If you cannot cap at arbitrary sizes, then most of the time you are stuck with the second method.

first method:

Find your approximate active capture width in µs in the table in section 5.4.
Find the sample rate for your destination standard in the table in section 5.1.2.
Multiply the two numbers together.
Round to an even whole number (the reason has to do with the colorformat YUV; more can be found in the footnotes of section 5).
Now you have the pixel size your device captures expressed in the pixels of your destination standard. Resize to this size regardless of your capture frame size. If this would cause you to 'up-size', you may want to consider a smaller frame standard. If you can do a custom capture size, you can use this.
Add borders or crop to exactly hit your standard frame size and round to the nearest 16 mod size.
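
The steps of the first method can be sketched in Python as follows. The function names round_to_even and first_method are invented for this illustration, and rounding halves upwards is an assumption. The final rounding to a 16 mod size is left out.

```python
import math

def round_to_even(x):
    """Round to the nearest even whole number (halves round up)."""
    return 2 * math.floor(x / 2 + 0.5)

def first_method(capture_window_us, dest_rate_mhz, dest_width):
    """Return (resize_width, border).

    resize_width: width to resize (or capture) to, in destination pixels.
    border: pixels to add (positive) or crop (negative) to hit dest_width.
    """
    resize_width = round_to_even(capture_window_us * dest_rate_mhz)
    return resize_width, dest_width - resize_width

# 53.33 µs capture window, destination PAL TV on a PC (14.7692 MHz, 768 wide)
print(first_method(53.33, 14.7692, 768))
# 51 µs capture window, destination NTSC SVCD (9 MHz, 480 wide)
print(first_method(51, 9, 480))
```

A negative border means the resized picture is wider than the standard frame and has to be cropped; a positive one means black pixels have to be added.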

second method:

Select a suitable capture size (guidelines are given at the start of this section).
Find the capture width of your destination standard (in µs) in the table in section 5.1.2.
Find your approximate active capture width (in µs) in the table in section 5.4.
Divide the first number by the second, and multiply the result by the horizontal capture size.
Now you have the capture width of your destination standard (in µs) expressed in the pixels of your capture. Add borders or crop to get to this size.
Resize to your standard frame size.
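
The second method can be sketched in the same way. Again, the names round_to_even and second_method are made up for this illustration, and rounding to the nearest even number is an assumption taken from the first method.

```python
import math

def round_to_even(x):
    """Round to the nearest even whole number (halves round up)."""
    return 2 * math.floor(x / 2 + 0.5)

def second_method(dest_window_us, capture_window_us, capture_width):
    """Return (wanted_width, border), both in pixels of the capture.

    wanted_width: width the capture needs so that it spans dest_window_us.
    border: pixels to add (positive) or crop (negative) before resizing.
    """
    wanted_width = round_to_even(dest_window_us / capture_window_us * capture_width)
    return wanted_width, wanted_width - capture_width

# Destination PAL TV on a PC (52 µs), card captures 53.333 µs at 720 wide
print(second_method(52, 53.333, 720))
# Destination NTSC SVCD (53.333 µs), card captures 51 µs at 720 wide
print(second_method(53.333, 51, 720))
```

After cropping or padding to wanted_width, the result is resized to the standard frame size of the destination.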

Some examples:

1) Say you want to make a DivX/XviD (PAL). Full PAL TV on a PC gives 52 µs * 14.7692 MHz = 768 pixels, ending with 768x576. Your "SAA7108, Nvidia with a WDM driver" has a capture window of 53.33 µs. So, how many PC sized pixels fit in the capture window? 53.33 µs * 14.7692 MHz = 788 pixels.

So you have two options here (however, if you can't capture at arbitrary sizes you only have one option):

a) Cap at or close to the target size. That's not always possible.

b) Cap at another resolution and resize to the correct pixel size afterwards. If you want that, cap high, say 720x576.
However, your card caps 53.33 µs, so you need fewer pixels to get down to the 52 µs a PC needs for correct AR. How many?
(52 / 53.333) * 720 = 702 in total.

So, remove 18 black pixels from your 720x576 cap. You now have 52 µs of info in 702 pixels. Resize the resulting 702x576 to 768x576 or a scaling of it.

2) Say you want to make a DivX/XviD (NTSC). Full NTSC TV on a PC gives 52.6555 µs * 12.3064 MHz = 648 pixels, ending with 648x480.
Your "SAA7108, Nvidia with a WDM driver" has a capture window of 53.33 µs. So, how many PC sized pixels fit in there? 53.33 µs * 12.3064 MHz = 656 pixels.

So you have two options (if you can't capture at arbitrary sizes you only have one option) here:

a) Cap at or close to the target size. So, try capping at 656x480 and remove 8 pixels to get 648x480. Add or remove 8 pixels to obtain a width which is divisible by 16.

b) Cap at another resolution and resize to the correct pixel size afterwards. If you want that, cap high, say 720x480.
However, your card caps 53.33 µs, so you need fewer pixels to get down to the 52.6555 µs a PC needs for correct AR. How many?
(52.6555 / 53.333) * 720 = 711 in total (712 rounding it to an even number).

So, remove 8 black pixels from your 720x480 cap. You now have 52.6555 µs of info in 712 pixels. Resize the resulting 712x480 to 648x480 or a scaling of it.

3) Say you want to make a SVCD (NTSC). NTSC SVCD gives 53.333 µs * 9 MHz = 480 pixels, ending with 480x480.
Your card has a capture window of 51 µs. So, how many SVCD sized pixels fit in there? 51 µs * 9 MHz = 459 pixels. That is not a nice number to cap. Depending on the codec, you should always cap mod 2, 4 or 8.

So you have two options (if you can't capture at arbitrary size you only have one option) here:

a) Cap at or close to the target size. So, try capping at 460x480 and pad to 480x480.

b) Cap at another size and resize to the correct pixel size afterwards. If you want that, cap high, say 720x480.
However, your card caps only 51 µs, so you need extra pixels to make up the difference with the 53.333 µs a SVCD needs for correct AR. How many?
(53.333 / 51) * 720 = 752.9, so use 752 in total (the nearest even number).

So, add 16 extra black pixels to each side of your 720x480 cap. You now have 53.333 µs of info in 752 pixels. Resize the resulting 752x480 horizontally to 480x480 and again, you have SVCD with correct AR.

4) Say you want to make a CVD (NTSC). NTSC CVD gives 52.148 µs * 6.75 MHz = 352 pixels, ending with 352x480.
Your card has a capture window of 53.33 µs. So, how many CVD sized pixels fit in there? 53.33 µs * 6.75 MHz = 360 pixels.

So you have two options (if you can't capture at arbitrary size you only have one option) here:

a) Cap at or close to the target size. So, try capping at 360x480 and remove 8 pixels to get 352x480. However, since 360 is pretty low, it is recommended to capture at a higher size. Have a look at option (b).

b) Cap at another size and resize to the correct pixel size afterwards. If you want that, cap high, say 720x480.
However, your card caps 53.33 µs, so you need fewer pixels to get down to the 52.148 µs a CVD needs for correct AR. How many?
(52.148 / 53.33) * 720 = 704 in total.

So, remove 16 black pixels from your 720x480 cap. You now have 52.148 µs of info in 704 pixels. Resize the resulting 704x480 horizontally to 352x480 and again, you have CVD with correct AR.
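
In option (b) of the examples above, pixels are added to or removed from the capture before the final resize. The sketch below splits that amount evenly over the left and right side. This is an assumption: as footnote (3) notes, a capture can have a horizontal offset, in which case the borders are not symmetric and you should crop where the black pixels actually are. The function name split_border is made up for this illustration.

```python
def split_border(capture_width, wanted_width):
    """Split the pad (positive) or crop (negative) over left and right.

    Returns (left, right); both are positive for padding, negative for cropping.
    """
    total = wanted_width - capture_width
    left = total // 2 if total >= 0 else -((-total) // 2)
    right = total - left
    return left, right

print(split_border(720, 752))  # example 3: pad 16 pixels on each side
print(split_border(720, 702))  # example 1: crop 9 pixels on each side
```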

References:
Der Karl's Capture Karten aspect ratio fuer Dummies: Der Karl's Capture Card Aspect Ratio for Dummies (in German).
Capture-Cards and aspect-ratio for Dummies: Der Karl's Capture Card Aspect Ratio for Dummies (translated by Arachnotron).

Footnotes:
(1) µs = microseconds
(2) For PAL, a source is suitable if the full 4:3 image is contained in 52 µs (52.666 µs for NTSC). In general this will only be the case in local broadcasts.
(3) It can be that a capture window of 52 µs for PAL (for NTSC similar) still contains black borders due to a horizontal offset.
(4) Padding to a certain size means "adding black pixels until the size is reached".
(5) Evidence can be found in the footnotes of section 5.

English version last edited on: 06/10/2004 | First release: n/a | Authors: Version4Team | Content by Doom9.org
