XviD FAQ: Explanation of many advanced options of XviD codec
1. What can you tell me about options in general?
When you open the XviD codec configuration tab in a program like Vidmex
for the first time, you will see all kinds of buttons, sliders,
subsections and subpanels featuring options, options and.... guess
what? More options.
Basically, all the options are what makes XviD so much different from
most other MPEG-4 codecs. They can make (or break) your encode and can
also make a huge encoding speed difference. You can choose certain
options that will make your result
playable with codecs/decoders other than XviD (like for instance the
DivX or 3ivx codec or ffdshow) , or you can choose all those extra
options
that make XviD so unique and versatile. You can choose options that
will allow for realtime encoding of a running TV- or Camera-capture, or
you can choose options that will make the codec slow down to a crawl
but will give really great results.
(The following description is for the Vfw(Video For
Windows)-interface for the RC4-build. In other builds the interface
could be different, but you will find many if not all options present
in one way or another)
The Main configuration panel
The Main configuration panel has the subsections 'Main Settings',
'Zones' and 'More' (Note: 'More' is '...' in the betas). In 'Main
Settings' you see 'Profile @ Level', 'Encoding type', a button called
'Target Quantizer' and a slider ranging from 1 to 31 for target
quantizer or 16 to 8000 kbps for target bitrate. Profile@level and the
number of passes have to be set.
In the subsection 'Zones' you can assign different parts of the clip
you want to encode unique zones, and set specific options for them with
the 'Zone Options'-button.
The 'More' subsection holds the 'Advanced options'-button which will
pop up another configuration tab containing.... more options of course.
:)
Beneath these subsections you find three buttons called 'Load Defaults', 'Decoder Options' and 'OK'.
The first one will reset all options to their original default
settings, the second button will allow you to tweak the decoder's
post-processing settings. 'OK' is for when you're all done configuring
and you want to switch back to the videotool.
2. What are all these profiles about?
A Profile basically corresponds to certain MPEG-4 standards meant specifically for certain use-scenarios.
In each Profile you have different levels, which limit that particular
profile to a specific bitrate-scenario. A bitrate scenario can set
maximum limitations to bitrate, framerate, framesize and other
settings. For example, Simple@Level 2 limits you to an upper bitrate of
128 kbps and a maximum framerate of 15 fps. Both profiles and levels
make hardware support easier because they define certain maxima that
have to be supported.
Note: The DivX hardware profiles are roughly equivalent to this. In
fact some betas of XviD had DivX-profile-compliant-profiles. ;)
The 'Simple'
profile is exactly what it says, 'simple'. It's fully equivalent and
compliant to the MPEG-4 standard 'Simple' profile. It has many major
limitations, not only on resolution and framerate but also on the kind
of encoding options and optimizations you can use. It's primarily meant
for low quality, low bitrate solutions (like mobile applications). Use
it only if you have a specific need for it, like you need playback of
your clips on a hardware device that only supports MPEG-4 Simple
profile.
ARTS stands for 'Advanced RealTime Simple' and is basically the
'Simple' profile optimized for streaming applications. One extra option
only allowed with ARTS is the 'Reduced Resolution' option, which is
next to useless for most people anyway. It basically does nothing with
the resolution but reduces detail to 1/4 of the original in certain
cases. Use ARTS only if you have a very specific need for it.
AS stands for 'Advanced Simple' and is the most often used
profile. It allows for nearly all available options to be used and
allows for resolutions up to and including DVD-resolution of 720x576 at
30 fps, which is good enough for most people.
It also allows for all sorts of 'advanced' options (not to be mistaken
for the options in XviD's 'advanced options' tab, they are not
the same). These include among others B-frames, Qpel, GMC, and also the
use of other Quantization Matrices than H.263. Simple and ARTS only
allow for H.263 to be used, whereas AS gives you the MPEG QM and
MPEG-Custom, with which you can define your own Custom Matrix or load
someone else's.
Finally you have the 'Unrestricted'
profile that will give you unrestricted access to all options,
resolutions and framerates. So if you want to experiment with a
grayscale interlaced Cartoon with RRV at 2048x1024 @120 Fps, there's
your chance. (Just don't expect it to work on anything else)
3. Profile@Level "More..." panel - Profile tab
Here you can set options ranging from top to bottom:
-Profile@Level setting
-Quantization Type
-Adaptive Quantization
-Interlaced Encoding
-Quarter Pixel
-Global Motion Compensation
-Reduced Resolution
-B-VOP's
-B-VOP fine tune settings
-Packed Bitstream
-Closed GOV
3a. Quantization Type: What is a Quantization Matrix?
"The quantization matrix is the 8 by 8 matrix of step sizes (sometimes called quantums)
- one element for each DCT coefficient. It is usually symmetric. Step sizes will be small
in the upper left (low frequencies), and large in the lower right (high frequencies);
a step size of 1 is the most precise. The quantizer divides the DCT coefficient by its
corresponding quantum, then rounds to the nearest integer. Large quantums drive small
coefficients down to zero. The result:
many high frequency coefficients become zero, and therefore easier to code."
To put it short: It's the lossy filter by which you achieve higher compressibility by losing detail. The amount of detail
lost is determined by the values of each individual quantum.
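To make the divide-and-round step a bit more tangible, here is a minimal Python sketch. The 8x8 matrix and the DCT block are made up for illustration (they are not the H.263 or MPEG matrices), and the helper names are my own:

```python
def round_half_away(x):
    """Round to the nearest integer, with halves rounded away from zero."""
    return int(x + 0.5) if x >= 0 else -int(-x + 0.5)

def quantize(coeffs, matrix):
    """Divide each DCT coefficient by its quantum, then round."""
    return [[round_half_away(c / q) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coeffs, matrix)]

def count_zeros(block):
    """Count how many coefficients in the block are exactly zero."""
    return sum(value == 0 for row in block for value in row)

# Made-up matrix: step size 1 in the upper left, growing towards the lower right.
matrix = [[1 + 2 * (r + c) for c in range(8)] for r in range(8)]
# Made-up DCT block: big low-frequency values, small high-frequency values.
coeffs = [[240 // (1 + r + c) ** 2 for c in range(8)] for r in range(8)]

quantized = quantize(coeffs, matrix)
zeros_before = count_zeros(coeffs)
zeros_after = count_zeros(quantized)
print("zero coefficients before:", zeros_before, "after:", zeros_after)
```

The low-frequency corner survives almost untouched while the high-frequency corner collapses to zero, which is exactly the detail-for-compressibility trade described above.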
More theory about the technical process behind it:
http://www.ece.purdue.edu/~ace/jpeg-tut/jpegtut1.html
http://www.mpeg.org/MPEG/MSSG/tm5/Ch7/Ch7.html
http://www.cmlab.csie.ntu.edu.tw/cml/dsp/training/coding/jpeg/jpeg/encoder.htm
http://www.acm.org/crossroads/xrds6-3/sahaimgcoding.html
Note: Most of these links are about Jpeg but the process is nearly the same for MPEG-4
3b. Which quantization matrix should I use?
It is best to think of the MPEG quantizer as "sharpening" the image, and the H.263
quantizer as "softening" it. For high bitrates, you will find that MPEG
quantization preserves more detail - for low bitrates, the smoothing of
H.263 will give you less block noise.
You can also choose 'MPEG-Custom' which will allow you to choose other
Quantization types of your own choice. When you do, the 'edit
matrix...' option will become active which will allow you to modify,
load or save a custom Quantization Matrix.
Custom matrices are considered 'for advanced users only' so as a newbie
it is best to stick with the two default choices, which are pretty good
anyway.
3c. What is Adaptive Quantization/Lumi masking?
Adaptive quantization, called 'Lumi masking' in older builds, is
a first 'psychovisual' innovation in XviD; it is supposed to make use
of the fact that the human eye tends to notice encoding errors less if
they happen in very dark or very bright parts of the picture. XviD is
capable of using different quantizers for each macroblock. Lumi-masking
compresses very dark or bright areas more strongly than medium ones. So it
will use fewer bits on some frames in the second pass than in the first
pass. The saved bits are of course spent again and that way we gain a
bit of quality in the medium-brightness part of the picture. As it is
experimental, you may sometimes notice more blocks than when it is
disabled.
Note that Lumi masking is broken in Koepi's 26062003-1 build, and that
it is a bit buggy in many older builds.
AQ works in the newer beta and RC builds but is still not very
extensively tested.
3d. Interlaced Encoding
This allows you to encode the entire clip in interlaced mode. Note
however that your source has to be genuinely interlaced for this to
give the proper results. Most interlaced video is not really interlaced
but mutilated progressive content. Especially most interlaced DVDs and
VHS videos are (sometimes poorly) interlaced versions of (progressive)
film.
Typical examples of true interlaced content are live TV-shows and Camcorder footage.
3e. How does Q-Pel work and when should I use it?
Q-pel (or Qpel) is the short name for Quarter Pixel motion search precision and this option activates the use of quarter pixel precision.
Motion search tries to capture all the motion between one frame and the
next, so that the macroblocks (MB's) can get the right motion vectors
assigned to them. If the motion is properly captured then there will be
no need for extra alterations to the MB other than a motion vector,
sparing quite some bits. The more precisely the motion is captured, the
less bits have to be assigned to the content of the MB's, and the more
MB's can just consist of a motion vector.
So, theoretically,
a more precise motion capture would save in altered texture
information, thereby saving bits, and increase precision of the overall
compression, thereby increasing quality. (We will soon see why this is
just theoretically)
Normally XviD uses half-pixel motion search precision. This means that
it can 'see' movement in a sub-pixel precision; if a MB moves from a
width,height-position of 200,300 to 201, 300 in the next two frames, it
can detect that movement correctly and can give the MB a motion vector
that says "move me half a pixel to the right this frame please" in
those next two frames. Motion will be captured correctly and no texture
bits get altered.
Now with Qpel you can capture motion that is only a quarter of a pixel per frame, effectively doubling precision.
Example:
A MB that moves (fluently) from position 200,300 to 201,300 in the next four
frames only moves one quarter of a pixel per frame. With normal
half-pixel precision this motion would appear 'jumpy' and the codec
might have to compensate for this by altering the texture bits of the MB.
This of course takes space, and the MB would no longer consist of only
a motion vector; it would have to be assigned additional bits for the
altered texture information, thereby decreasing compressibility.
With Qpel that motion still gets captured correctly and no extra
texture bits would have to be assigned, reducing the number of bits
used for that frame.
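The example above can be sketched in a few lines of Python. This is only a toy model of vector precision (the function name is my own and a real motion search does far more than snap a number to a grid), but it shows what half-pixel precision cannot express:

```python
import math

def snap_to_grid(displacement, step):
    """Return the closest displacement that can be expressed with this step size."""
    return math.floor(displacement / step + 0.5) * step

per_frame_motion = 0.25  # pixels per frame, as in the example above

for name, step in (("half-pixel", 0.5), ("quarter-pixel", 0.25)):
    vector = snap_to_grid(per_frame_motion, step)
    leftover = abs(per_frame_motion - vector)
    print(f"{name}: vector {vector}, uncaptured motion {leftover} pixel")
```

Whatever motion is left uncaptured has to be paid for with texture bits instead.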
Easy eh? But wait, there's a catch....
So, what's the catch?
The catch is, that just using the Qpel precision alone already uses additional bits, whether it helps saving bits or not.
This is caused by the additional precision that requires more bits to
be allocated to the motion vectors. Where a motion vector could
previously be something like 0.5,0 (half a pixel width movement, no height
movement), it would now have to be 0.25,0 (quarter of a pixel width
movement, no height movement). So instead of one decimal after the
point it now requires two decimals after the point, requiring the
codec to throw more bits at it.
(Please note that this is an oversimplification of the actual process, but it is accurate enough to get the point)
Instead of another decimal Qpel actually uses another extra bit (set
either to 0 or 1) for every axis, which is enough to achieve double
precision. There are two axes, one for width and one for height, so
each motion vector requires two extra bits for Qpel.
If we assume that there is one vector for every macroblock (there might
be 4 or 0), at the resolution of 640x272 and 24fps and P-frames only,
two bits for every macroblock take 40 x 17 x 2 x 24 = 32640 bits per second, or roughly 32.6 kbps.
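If you want to redo this estimate for your own resolution and framerate, the arithmetic above can be sketched as follows. It uses the same simplifying assumptions (one vector per 16x16 macroblock, two extra bits per vector, P-frames only); the function and parameter names are my own:

```python
def qpel_overhead_bits_per_second(width, height, fps,
                                  vectors_per_mb=1,
                                  extra_bits_per_vector=2,
                                  mb_size=16):
    """Rough extra bitrate spent on Qpel vector precision, in bits per second."""
    mbs_wide = -(-width // mb_size)   # ceiling division: partial macroblocks count
    mbs_high = -(-height // mb_size)
    return mbs_wide * mbs_high * vectors_per_mb * extra_bits_per_vector * fps

bits = qpel_overhead_bits_per_second(640, 272, 24)
print(bits, "bits per second, about", round(bits / 1000, 1), "kbps")
```

With four vectors per macroblock you would pass vectors_per_mb=4, which shows how quickly this overhead can grow.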
So, basically, no matter the outcome, Qpel always takes a sizeable chunk of the bitrate just for itself even if it doesn't help compression one damn bit.
Now usually it does help, but the texture bits saved by the better precision have to outnumber the added bits to the motion vectors before Qpel increases compressibility at the same size.
If the saved texture bits outnumber the extra motion vector bits then
you will have increased compressibility (and quality) at the same size.
If the saved texture bits do not outnumber the extra motion vector bits you will have wasted space and the end result might look worse.
So how can I tell if Qpel will increase or decrease compressibility?
That's the other catch: You can't.
Not in advance. There's no way to tell from looking at the source if
Qpel will help or not. It doesn't matter if it's a fast motion scene or
a low motion scene, a panning scene or a zooming scene...there's just
no way to tell beforehand. A fast motion scene could be 90% Qpel
movement or 90% Half-pixel movement, or any percentage in
between...making any prior assumptions about the benefits of Qpel
ridiculous.
The only real way to find out is to try encoding both with and without Qpel and see which result looks better.
(Now you can see why there is a difference between theory and practice...)
Some additional Notes:
-Because of the increased precision, Qpel significantly increases
encoding time, and requires more processing power to decode. Encoding
time can be almost doubled and decoding can require as much as 30-60%
more processing power.
-Current 3ivX implementation normally uses half-pixel precision. As an 'advanced' option, you can tell it to use full-pixel resolution.
-In some older (alpha) builds Qpel could introduce artifacts, but the
current implementation has no known bugs. It's safe to use.
3f. What's GMC and what is it good for?
GMC stands for Global Motion Compensation
and that pretty much tells the story of what it does. If used, it will
look at the whole frame and see if there is an amount of motion that
all the parts of the frame
have in common. It will then take this amount of motion and put it in a
single value.
The parts of the frame are the macroblocks, and the amount of motion is
called a 'motion vector' which has both a direction and a value (as a
sort of two-dimensional X,Y value) .
All the macroblocks normally have their own motion vectors, but with
GMC the one motion vector that they all have in common (that's why it's
called 'Global') will be compensated and put into a single motion
vector. Some macroblocks' movement will be completely compensated for
by the GMC vector, getting completely nullified by the compensation
process. These macroblocks' motion vector will then be removed, as it
is the same and is only extra information. The possible benefit is that
you can remove many or all the motion vectors of the macroblocks (or
even the blocks themselves if there is no altered texture information)
in a frame by a single value, thereby making it much smaller.
Note however that this is for one-warppoint GMC. With multiple
warppoints the process is much more complex, but the principle is the
same.
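As a rough illustration of the one-warppoint idea, here is a toy Python sketch. It simply takes the most common macroblock vector as the 'global' one and subtracts it from all the others. This is an assumption made for illustration only, not how XviD actually estimates global motion:

```python
from collections import Counter

def global_vector(vectors):
    """Pick the motion vector that most macroblocks have in common."""
    return Counter(vectors).most_common(1)[0][0]

def compensate(vectors):
    """Subtract the global vector from every macroblock vector."""
    gx, gy = global_vector(vectors)
    residuals = [(x - gx, y - gy) for x, y in vectors]
    return (gx, gy), residuals

# Made-up frame: six macroblocks follow a pan to the right, two do their own thing.
vectors = [(2, 0)] * 6 + [(3, 1), (0, 0)]
gmc, residuals = compensate(vectors)
nullified = residuals.count((0, 0))
print("global vector:", gmc, "- macroblocks needing no vector of their own:", nullified)
```

Every macroblock whose residual is (0, 0) no longer needs its own motion vector, which is where the saving comes from.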
3g. Warppoints, hmm...what's a warppoint?
A warppoint is a motion vector that defines a displacement of one *edge* of the video.
Take a piece of paper and move it by its edges and you'll see what I mean.
-The first warppoint defines displacement of top-left edge. If it's the
only warppoint, the rest of the picture just has the same vector and
the whole picture moves. Think panning.
-A second warppoint defines displacement of top-right edge (not
*precisely* true but close enough without getting too technical).
Together with the first warppoint, this is enough to define panning
*and* zoom. Note that it could be used to define panning and rotation
instead, but isn't.*
-A third warppoint means displacement of bottom-left edge and three warppoints are enough to define panning, zoom and rotation.
-A fourth warppoint would create perspective-like movement.
Note that XviD's GMC uses 3 warppoints, whereas DivX's GMC uses only
one. Warppoints are stored in the frame's header, and only if they are
used.
3h. What is this Reduced Resolution ?
Reduced Resolution (also called RRV for Reduced Resolution VOP)
is in fact reduced resolution. Macroblocks become 32x32 pixels big, and
all DCT data - once decoded - is scaled up to be 16x16 (DCT itself is
still 8x8). Motion is also severely restricted, with vector components
restricted to either zero or odd values (odd values point to halfpixel
positions). Also, an in-loop deblocker kicks in.
Everything described in the standard.
RRV is part of the realtime profile and only works with the ARTS and
unrestricted profiles in XviD. It was designed to keep the picture
"visible" if bandwidth can suddenly become very limited. Reduced
resolution VOPs are dynamic and can be turned on and off at any time,
but XviD does not offer this - at least not by VfW.
I don't see any good reason to use RRV, because if the decision is not
dynamic (like now in XviD), you'd be better off if you just encoded
with halved resolution. Also, XviD is the only decoder that can play
such files (90% of MPEG-4 decoders can't).
The only realistic use one could possibly have for this feature is for
encoding for low-bandwidth mobile devices using GPRS/UMTS.
It's still not fully mature and it is to be used only by adventurous people.
3i. BVOPs: B-frames
B-frames (or BVOPs in techie talk) are so-called bi-directionally
encoded frames and are part of the Advanced Simple Profile definition.
Without B-frames you just have the Keyframe making a clear definition
of the content of the frame every XXX frames, and all other frames are
P-frames referring to the previous frame for description. A B-frame
also takes the next frame into account, so it refers to other frames in
two directions (ergo the bi-part).
The advantage of B-frames is that they are usually encoded with a
higher quantizer and take less space per frame, while the loss in
quality is proportionally smaller than the saving in bits. Basically you use
the inherently smaller and lower quality b-frames to save space that
will be used for improving quality all over the clip. The net effect
is usually a quality gain, depending on the b-frame settings and the
type of source.
3j. B-frame settings
-Max Consecutive BVOPs:
Here you can limit the number of B-frames in a row. Recommended
settings are 0 for off, 1 for DivX 5 compatibility, 2 for best effect
and 3 for intensive use.
-Quantizer ratio: Multiplying
the (average) quantizer of the surrounding non-B-frames with this value
will give you the Quantizer of the B-frame. So if the two adjacent
frames have quantizers of 2 and 4, the average quantizer will be 3.
Multiplying this with a quantizer ratio of 1.50 will give you a B-frame
with a quantizer of 4.5.
-Quantizer offset: Take the result
of the calculation above and then add this value. With a quantizer
offset of 2.00 you will end up with a quantizer of 6.5.
As a rule of thumb, upping the latter two values will give you lower quality B-frames.
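The ratio and offset calculation described above can be written out as a small Python sketch. The function name is my own, and the values are the ones from the example in this FAQ, not necessarily the codec's defaults:

```python
def bframe_quantizer(q_prev, q_next, ratio, offset):
    """Average the surrounding quantizers, multiply by the ratio, add the offset."""
    average = (q_prev + q_next) / 2
    return average * ratio + offset

q = bframe_quantizer(2, 4, ratio=1.50, offset=2.00)
print("B-frame quantizer:", q)
```

Raising either the ratio or the offset pushes the result up, which is why both lead to lower quality B-frames.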
3k. Packed Bitstream and Closed GOV
-'Packed Bitstream' is an option that can deliver mixed results during playback, depending on what you use for playback.
It's meant to solve frame-order issues when encoding to container formats like avi that can't cope with out-of-order frames.
And while it's meant to solve playback issues that occur without it, lots of people have reported playback issues with it. That goes for playing back with ffdshow, DivX 5 decoder, and several standalone (hardware) players.
Unless you know precisely
what you're doing, it's best to keep it turned off until further
notice. If you encounter problems with choppy playback, try turning
this feature on and see if it helps. With the latest XviD version some
of the problems involving Packed bitstream are said to have been solved
but more feedback is needed on this. So if you have any particular info
please send me a Personal Message at the Doom9.org forum.
Note: If you only play your files with the XviD codec, you never ever have to use packed bitstream.
-'Closed GOV' (Quote from Doom9):
Closed GOV ensures that a p-frame is used before every new I-frame.
This option should always be checked (otherwise you might end up with a
frame sequence like PBIP where the B frame has a forward reference to
an I-frame which makes no sense).
4. 'More' Profile options - Level tab
This tab is informative only and shows you the different bitrate limitations of each profile@level setting.
You can clearly see that maximum rates go up with each higher level. It
should be no surprise then that AS@Level 5 is currently the recommended
setting.
'Unlimited' is exactly what it says, and is there for more adventurous
people who want to experiment with all the available features.
Please also realize that right now the actual limiter in the codec
itself is not implemented yet,
as it's hard to do properly. You have to check for yourself if your
file adheres to the limits defined by the profile you chose.
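Since you have to do this check yourself, a tiny helper can save some squinting. The sketch below only knows the Simple@Level 2 figures quoted earlier in this FAQ; the table layout and function name are my own, and for any other profile@level you would have to fill in the numbers shown on the Level tab:

```python
# Limits as quoted in this FAQ for Simple@Level 2. Other entries would have
# to be copied by hand from the Level tab.
LIMITS = {"Simple@L2": {"max_kbps": 128, "max_fps": 15}}

def violations(profile_level, kbps, fps, limits=LIMITS):
    """Return a list of human-readable problems; an empty list means 'within limits'."""
    limit = limits[profile_level]
    problems = []
    if kbps > limit["max_kbps"]:
        problems.append(f"bitrate {kbps} kbps exceeds {limit['max_kbps']} kbps")
    if fps > limit["max_fps"]:
        problems.append(f"framerate {fps} fps exceeds {limit['max_fps']} fps")
    return problems

for problem in violations("Simple@L2", kbps=700, fps=25):
    print(problem)
```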
5. 'More' Profile options - Aspect Ratio tab
-'Aspect Ratio' (AR) (sometimes called DAR, for Display Aspect Ratio)
is the ratio between width and height of the picture. In general you
can use any AR you'd like, but for most encodes that have something to
do with encoding from a standard format (like TV, DVD or VCD) or
decoding to one, you can use this tab to select an AR for the entire
encode or an AR for the pixels (PAR).
Why you would want to do this? To properly capture the original resolution of the video you want to encode.
Consider a 720x576 DVD. With a square pixel that DVD would have quite a
different look when played than if the pixel was of a standing or lying
oblong size.
And yet that is what happens when you convert from NTSC to PAL, from
DVD to VCD or from TV to computer. NTSC, PAL, VCD and computer monitors
all have different pixel sizes, and some formats that were created
before the digital age (like all TV standards) have different ways in
which they treat pixels.
In reality however, the difference between them generally is small and
few people have noticed the difference. Also,
support for proper handling of the AR information in clips is still
very rare. So while this option is here to enable the purists among us
to completely recreate the original AR, you don't have to use it if you
don't want to. In fact, its full and correct implementation requires a
rather thorough understanding of the underlying problems of different
video AR standards. A discussion of these problems is far beyond the scope of this FAQ, and you should consider the AR tab for advanced users only.
If you want to know more about proper AR implementation, follow these links:
Doom9 forum discussion about this feature: New Feature: Display Aspect Ratio
A document outlining the differences between pixel sizes:
Square and Non-Square Pixels
A document outlining the differences between video AR standards:
A Quick Guide to Digital Video Resolution and Aspect Ratio Conversions
6. Encoding type: Single pass or two pass, what should I do?
-Single pass
will take your clip and encode it at once. It takes each frame of the
clip, checks that frame's compressibility, and then encodes it.
-Two-pass
uses the first pass to make an estimation of how well your clip
compresses and then uses the compressibility data gathered during the
first pass to really encode the clip during the second pass.
Which one to choose depends on what you desire from the result.
Two-pass does a much better job at evenly distributing bits where they
are needed and therefore gives you a much better looking end result.
Single pass is really for those types of use that can only be done with
single-pass, like for instance real-time encoding a live feed, like a
TV-capture or a security camera.
Unless you absolutely have to go for single pass for a specific reason
there really is no other way but two-pass.
Note that DivX 5 nowadays has a 'multi-pass' option, allowing for more
than two-passes. This is meant to tweak the bit-distribution even more
(sort of by averaging between a large number of passes), but many users
report nearly zero benefit after the third pass. XviD really doesn't
need a technique like this because the bit-distribution-decision it
makes is more intelligent and produces better results.
7. Single pass 'more' settings explained
These options are used only when encoding with a fixed bitrate. When
using a fixed quantizer/quality mode the codec goes into
'imbecile-mode' and just plainly encodes everything using a very simple
set of rules. Needless to say fixed quantizer/quality has very limited
use for most of us.
-Reaction Delay Factor: When we talk about fixed bitrate, it isn't really
fixed. Instead, the bitrate is averaged between a number of frames.
Every next frame gets altered by a value that is the average of the
last X number of frames. X is determined by the RDF. So, the bigger
this value, the slower the codec will react to quick alterations in
complexity and vice versa. Quicker alterations do more accurately
reflect the complexity of the source, but slower alterations give a
less extreme bitrate distribution.
-Averaging period: This
setting does something similar to above, but instead it does it with
the quantizers. If this value is 100, it takes the average quantizer
computed from the last 100 frames, and then assumes that average
quantizer for the next frame.
-Smoothing: The codec keeps a
record of how much each frame differs in size from the 'average' size.
This is called deviation and it is recomputed after each frame. The Smoothing value
is used in conjunction with that deviation. It has a small averaging
effect on the whole clip, and the bigger this value is, the smaller the
influence.
As a whole, the quality gains from these options are minimal. You really don't want to use single pass. Trust me.
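For the curious, the 'Averaging period' idea described above (take the average quantizer of the last N frames and assume it for the next one) can be sketched as a simple moving average in Python. The class name and the fallback value for an empty history are my own assumptions, and the codec's real rate control does more than this:

```python
from collections import deque

class QuantizerAverager:
    """Keep the quantizers of the last `period` frames and predict the next one."""

    def __init__(self, period=100):
        # A deque with maxlen silently drops the oldest entry when it is full.
        self.history = deque(maxlen=period)

    def record(self, quantizer):
        self.history.append(quantizer)

    def predict(self, fallback=4):
        # With no history yet there is nothing to average, so use a fallback.
        if not self.history:
            return fallback
        return sum(self.history) / len(self.history)

averager = QuantizerAverager(period=3)
for q in (2, 4, 6):
    averager.record(q)
first_guess = averager.predict()
averager.record(8)          # the oldest value (2) now falls out of the window
second_guess = averager.predict()
print(first_guess, second_guess)
```

A longer period makes the prediction calmer but slower to follow changes in the source, just like the Reaction Delay Factor does for bitrate.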
8. Two-pass first pass 'more' settings explained
Here you can alter three settings: One is the name and path of the stats file (this is the data file with the compression data gathered by the first pass).
-'Full quality first pass':By
default XviD switches certain options off in the first pass to speed it
up a bit. The options that are switched off are not really necessary
for a 'normal' first pass (normal as in: you don't keep it afterwards)
and turning them off can increase encoding speed considerably.
However this is a relatively new option and some people are concerned that the second pass might
have a little bit less quality because of the options turned off in the
first pass. Since the jury is still out on this one you can set this
option to make the codec use all the options for both passes, if it
makes you feel comfortable. Recommended setting is off so the codec can speed things up a bit.
-'Discard first pass'
determines whether or not the first pass should be kept or not. It is a
proper video file, and you can play it but it is not recommended that
you keep it. You should think of it as no more than a rough
approximation of the end result and even though you can play it, it
might not be MPEG-4 compliant. Always discard it unless you turned on
'Full quality first pass'. (Then you may choose to keep it, as the file
created by the first pass will then be completely normal)
9. Two-pass second pass 'more' settings explained
Most of these settings are meant for tweaking of the allocation of bits
during the 'real' encoding pass. They are for advanced users only and
are best left alone.
-The first setting here is the name and path of the stats file to be
used. Normally you should keep this the same between first and second
pass. Of course you can experiment with this by using the stats file
from a pass with other options, but for normal encoding it's best left
alone.
Intra-frames Tuning
-'I-frame boost(%)' can be
used to give some extra bits to Keyframes. The amount is an extra
percentage, so a value of 10 will give your keyframes 10 % more bits
than normal. (Note: I know of no circumstance that would justify setting this to any other value than default)
-'I-frame closer than ... frames' and '... are reduced by %'
can be used to adjust the size of keyframes that you consider too close
to the first (in a row). The first setting sets the range in which
Keyframes are reduced, and the second setting determines the bitrate
reduction they get. The last i-frame will get treated normally.
Next you have the overflow treatment and the Curve compression
settings.
They differ from each other in that overflow treatment is about the
treatment of abrupt differences in bitrate from one frame to the next,
and Curve compression is about adjusting the bitrate distribution over
the clip as a whole.
Overflow treatment
'Overflow treatment' is the technique used to obtain a properly sized end result. Usually you specify a target filesize and the codec can either overshoot its target, creating a file too big, or it can undershoot
its target, creating a file too small. To counter this, overflow
treatment can either allocate more bits than absolutely necessary,
increasing filesize, or allocate fewer bits than really necessary,
decreasing filesize. Obviously, the second process involves
compromising quality.
-'Overflow control strength %'
This is best described as the 'aggressiveness' of the overflow control.
Mental picture: Imagine the 'Overflow control mechanism' as a little guy in your computer running around with glue and a chainsaw redistributing bits all over your clip. Now imagine him getting more agitated. :^)
Higher settings will make changes in bit redistribution more abrupt,
possibly too abrupt if you set it too high, creating artifacts.
Settings that are too low have a bigger risk of breaking filesize
prediction.
0=Default from core (let XviD decide). I know of no obvious reason to experiment with this setting.
-'max overflow improvement %' sets in % how much each frame may grow if the file would become undersized. Think of it as 'adding fat to your clip' if it's not as fat as you asked it to be.
-'max overflow degradation %' sets in % how much each frame may shrink if the file would become oversized. Think of it as 'trimming fat off your clip' if it's a bit too far on the porky side. :^)
Obviously, experimenting with these settings may break filesize prediction completely as you're altering the basic settings of the underlying process.
(Personally, I couldn't care less about the first setting because I consider undersized files to be a good thing)
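One way to picture the two 'max overflow' percentages is as a fence around the size each frame was planned to get. The Python sketch below is only that mental model (the function name and the numbers are made up, and the codec's real overflow handling is more involved):

```python
def clamp_frame_size(planned, wanted, max_improvement_pct, max_degradation_pct):
    """Limit how far a frame may grow or shrink relative to its planned size."""
    upper = planned * (100 + max_improvement_pct) / 100   # most it may grow
    lower = planned * (100 - max_degradation_pct) / 100   # most it may shrink
    return max(lower, min(upper, wanted))

# Hypothetical values: a frame planned at 1000 bytes, 10% growth and 5% shrink allowed.
grown = clamp_frame_size(1000, 1500, max_improvement_pct=10, max_degradation_pct=5)
shrunk = clamp_frame_size(1000, 500, max_improvement_pct=10, max_degradation_pct=5)
print(grown, shrunk)
```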
Curve compression:
Normally the internal curve adjustment values (determined by the XviD
developers after much feedback from users) are capable of delivering
very nice results (I should say 'excellent' really), but if for one
reason or another you want to, you can use these values to adjust the
lows and highs of the bit allocation.
If you make a mental image of the curve allocation, you see a graph with 'highs' and 'lows', sorta like hilltops and valleys.
The hilltops are scenes with high bitrates and the valleys are scenes with low bitrates.
-The 'High bitrate scenes %' setting will take bits away from high bitrate scenes and give them back to the bit reservoir. (Think of a bit reservoir
as a bucket full of bits wherefrom the codec hands out to each frame)
Ergo, it will lower the hilltops, and the bits gained by this will be
divided equally across the entire landscape. This is useful if you
really need to keep your encode within certain maximum parameters, like
the maxima for a specific profile@level setting. You could also use
this if you have a clip with so many bits allocated to high-bitrate
scenes that the low(er)-bitrate scenes start to look bad.
-The 'Low bitrate scenes %' setting will give extra bits to the low bitrate scenes,
sort-of-like filling the valleys with sediment. The bits have to come
from somewhere, so the codec takes all the frames in the entire encode
and scrapes a few bits off them all. This might come in handy if you
have a few low-bitrate scenes that are still blocky.
So, basically, each setting favors compression of one of two possible extremes towards the average of the entire encode. The first takes down the hilltops, the second fills the valleys. The bits lost or gained are allocated accordingly to average out the entire clip.
Another way of seeing curve compression is seeing it as a way of redistributing quality
from both bitrate extremes to the average. If for instance you're not
happy about low bitrate scenes you can adjust the Low bitrate scene
setting to give them more quality, at the cost of a tiny amount of
quality of all frames of the clip.
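To make the hilltops-and-valleys picture a bit more concrete, here is a minimal Python sketch. It is a toy model and not XviD's actual rate-control code: the function name compress_curve, the per-scene bit list and the exact formula are assumptions made up for illustration. Scenes above the average are pulled towards it by the 'high' percentage, scenes below it by the 'low' percentage, and the result is rescaled so the whole clip still uses the same number of bits.

```python
def compress_curve(scene_bits, high_pct=0.0, low_pct=0.0):
    """Toy model of curve compression (hypothetical, not XviD's real formula)."""
    average = sum(scene_bits) / len(scene_bits)
    adjusted = []
    for bits in scene_bits:
        if bits > average:
            # Lower the hilltop: move it high_pct % of the way to the average.
            bits -= (bits - average) * high_pct / 100.0
        elif bits < average:
            # Fill the valley: move it low_pct % of the way to the average.
            bits += (average - bits) * low_pct / 100.0
        adjusted.append(bits)
    # Rescale so the whole clip still uses the same total number of bits.
    scale = sum(scene_bits) / sum(adjusted)
    return [bits * scale for bits in adjusted]


scenes = [100, 400, 900, 200]  # bits per scene, made-up numbers
flattened = compress_curve(scenes, high_pct=20, low_pct=10)
```

With both percentages at 0 the list comes back unchanged, which matches the idea that the default curve is left alone unless you ask for compression.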
10. What's this 'target quantizer' and what should I do with it?
The 'Target Quantizer' button allows for different things depending on the number of passes you selected.
In 'Single pass' it allows you to switch between 'Target Quantizer' and 'Target Bitrate'. The input field next to it and
the slider below alter accordingly showing the possible settings.
In 'Twopass - 1st pass'
mode the button, input field and slider are inactive. This is logical
since the first pass is basically an expedition to gather data about
your clip, without making any real decisions on the resulting file.
In 'Twopass - 2nd pass' mode you can choose between 'Target Bitrate' and 'Target Filesize' instead of 'Target Quantizer'.
This reflects the fact that most twopass encoding scenarios are aiming for a specific filesize to be reached.
Please note that filesize is in Kilobytes not -bits or Megabytes. So if
you're aiming for a file of 35 MB you have to enter 35840 (35x1024)
instead of 35.000.
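The Megabyte-to-Kilobyte arithmetic above can be written down as a tiny helper. This is only a sketch; the function name megabytes_to_kilobytes is made up for this example and is not part of XviD or any encoding tool.

```python
def megabytes_to_kilobytes(megabytes):
    """Convert a size in MB to the KB value the 'Target Filesize' field expects."""
    return int(megabytes * 1024)


target_kb = megabytes_to_kilobytes(35)  # the 35 MB example from the text
```

For the 35 MB example this gives the 35840 mentioned above.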
Also, when you set the button to 'Target Bitrate' you also have to keep
in mind that the maximum bitrate is to be limited by the Profile and
level settings. Currently XviD does not do that automatically (mainly
because it is difficult to implement) but you should be aware of the
limitations of the profile and level you use.
11. Zones?
Zones allow you to assign different settings to different parts
of your clip. You can add as many zones as you want, and each zone can
have its own unique settings.
Its most useful function is to allow for certain parts of your clip to
be considered 'less important' than the rest by defining zones and how
'important' they are to you. You can define this importance in the tab
accessible through the 'Zone Options' button.
To assign specific settings to a zone, select it in the 'main' panel and then click on 'Zone Options...'.
There are two different ways to define the importance of a zone:
-Weight defines your 'importance' as a value relative to the quality
of the whole clip. If you define a zone with a weight of 0.30 then that
zone's bitrate will use 30% of the quality of the whole clip.
-Quantizer
defines the 'importance' as an absolute average. What that means is
that a Zone with a Quantizer of 5 will absolutely have an average
quantizer of 5. How this saves bits depends of course on what settings
the other zones have.
Other zone options include:
-'Begin with Keyframe' will make each zone start with a Keyframe. Recommended, especially if you plan on using chapters.
-'Greyscale Encoding' tells the codec to encode without color information. Great for saving bits on boring end credits.
-'Chroma Optimizer enabled'
will do some extra magic on color information to minimize the
stepped-stairs effect on edges. It will improve quality at the cost of
encoding speed. It reduces PSNR by nature. The mathematical deviation from the original picture will get bigger - but the subjective
image quality will rise (as mentioned, the "stair step artifacts" become
less visible). Since it works with color information, you might want to turn it
off when encoding in greyscale.
-BVOP Sensitivity will
allow you to tweak the amount of B-frames in each zone. A positive
value increases the amount of B-frames; a negative value decreases it.
Currently the minimum is -35, anything below will not make any
difference. There is no maximum, but the value is pretty sensitive; 5 will
give you a reasonable boost. I'd say that the practical maximum is about 25.
It's advised to be conservative with zones, as it's still a relatively new development, and pushing its limits might give unexpected results.
Current bugs (Xvid-1.0-Final): Weight zones with weight lower than 0.2 break filesize prediction.
The 'Advanced options' window
12. 'Motion' tab
A) - Motion Precision
Here you will find the options defining the motion search precision.
Motion search is the process in which the codec is trying to figure out
how every part of the original clip was moving. The more it searches,
the more precise its estimation of the original motion will be, and
the better the resulting clip will capture the original motion.
Why capture motion you might ask?
Let's take a look at a simple example, a clip of a white block moving
to the right. Each frame a part of the picture where
the block no longer is becomes the background color while another part
has to become the block color. The change that would have to be encoded
each frame would constitute a significant amount of bits.
Instead the codec just takes the block and checks to see if it is
moving. If it moves, the codec captures that movement with its motion
search, and then uses the found value as a motion vector for that
specific block. In reality the process is more complex but the basic
idea is that most altering of color and texture information is caused
by motion and therefore a great amount of color and texture bits can be
saved if that motion is captured by other means.
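The white-block example can be played through in a few lines of Python. This is a minimal sketch of textbook full-search block matching using the sum of absolute differences (SAD); it is not XviD's actual search, which is a lot smarter and faster. All names (make_frame, sad, find_motion_vector) and the frame sizes are assumptions invented for this illustration.

```python
def make_frame(width, height, block_x, block_y, size=4):
    """Black frame (0) with one white square (255) at (block_x, block_y)."""
    frame = [[0] * width for _ in range(height)]
    for y in range(block_y, block_y + size):
        for x in range(block_x, block_x + size):
            frame[y][x] = 255
    return frame


def sad(prev, cur, px, py, cx, cy, size):
    """Sum of absolute differences between two size x size blocks."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(cur[cy + dy][cx + dx] - prev[py + dy][px + dx])
    return total


def find_motion_vector(prev, cur, cx, cy, size=4, search_range=3):
    """Full search: try every offset in range, keep the one with the lowest SAD.

    The returned vector points from the block in the current frame to the
    place in the previous frame where the best match was found.
    """
    height, width = len(prev), len(prev[0])
    best = (0, 0)
    best_cost = sad(prev, cur, cx, cy, cx, cy, size)
    for my in range(-search_range, search_range + 1):
        for mx in range(-search_range, search_range + 1):
            px, py = cx + mx, cy + my
            if px < 0 or py < 0 or px + size > width or py + size > height:
                continue  # candidate block would fall outside the frame
            cost = sad(prev, cur, px, py, cx, cy, size)
            if cost < best_cost:
                best, best_cost = (mx, my), cost
    return best, best_cost


previous = make_frame(16, 8, block_x=4, block_y=2)
current = make_frame(16, 8, block_x=6, block_y=2)  # block moved 2 pixels right
vector, cost = find_motion_vector(previous, current, cx=6, cy=2)
```

Here the search finds a perfect match two pixels to the left in the previous frame, so only the vector needs to be stored for that block instead of all its changed pixels.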
-Motion search precision:
This is the most basic search precision and it hardly takes any
processing time, so it's recommended to leave it at 6. Only go as low
as 5, and only if you're in a hurry. BTW: It only works in the
luminance plane, i.e. it only looks at changes in brightness values,
not color.
-VHQ mode: VHQ is a more intensive search and
takes a wider approach. Higher settings will slow down encoding
significantly. Setting 1 has a relatively small impact and it is
recommended for all encodes. Using higher values will give you better
quality at the cost of encoding speed.
-Use Chroma motion:
Chroma motion is basically a sort of 'Motion search precision setting
7' whereby it also takes color information into account (from the
Chroma plane, ergo the name). Recommended.
-Turbo ;-): This
setting skips some search techniques when using Qpel or B-frames to
speed it up a bit. Without those options on it has no effect at all.
The impact on quality is negligible.
B) - Other
-Frame drop ratio: Experimental/For experts only. Don't set it to anything other than 0 unless you really really know what you're doing.
-Maximum I-frame interval:
This setting tells the codec to insert a Keyframe (I-frame) every
{value} frames. If a Keyframe is needed before that number is reached,
the codec starts counting again. So while you can have Keyframes with
lower intervals than the number you defined, you can't have higher
intervals.
Standard recommended settings are 10x the framerate, i.e. 250 for 25
fps PAL clips, 300 for 29.97 fps NTSC clips etc. However, there is a
visible effect called Keyframe-pumping. This resembles a slow
degradation in quality in consecutive P- and B-frames with a sudden
'jump' in quality when a new Keyframe is inserted. Lowering the maximum
I-frame interval in these cases can help. Setting it too high can cause
bad seeking in files, because the seeking process uses only Keyframes,
and fewer Keyframes = less accurate seeking.
-'Cartoon mode' activates two different techniques, both designed to help with cartoons:
- detect_static_motion is a motion estimation flag and it works as a
limit. If the motion found by the motion search process is beneath this
limit the macroblock is considered static, and no motion
information is encoded. When Cartoon mode is enabled this limit below
which a macroblock is considered static is increased so that (more)
small movements are lost. Since A LOT of these 'small movements' are
actually noise (especially with cartoons) it really helps saving many
bits which would otherwise be used to code noise on a static picture.
- vop_cartoon is about quantization - when a block is
motion-compensated well enough (with total error below the limit) it's
just not coded at all. XviD doesn't drop any data in normal mode (limit
= 1), but drops quite a lot in cartoon mode.
Again, this usually means that noise is ignored. It might also remove
some small details, but small details shouldn't really happen in
"proper" cartoons.
So, while the first technique helps with removing movements that are so
tiny that they can be considered not-to-be-part-of-the-source, the
second helps compressibility of the cartoon by removing texture detail
that's considered too-small-to-be-part-of-the-source.
This is exactly what you need for cartoons like Futurama or Simpsons, but it might not be optimal though for high quality animes with lots of fine detail; try and see if it helps.
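Coming back to the 'Maximum I-frame interval' setting for a moment: the counting behaviour described there can be sketched in Python as below. This is only an illustration of the rule of thumb and the restart-the-count behaviour. The function names are made up, and the list of scene changes is an assumption handed in by hand; the real codec decides by itself where an early Keyframe is needed.

```python
def max_keyframe_interval(framerate):
    """Rule of thumb from the text: 10 times the framerate, rounded."""
    return round(framerate * 10)


def place_keyframes(total_frames, max_interval, scene_changes=()):
    """Return the frame numbers that become Keyframes.

    A Keyframe is forced once max_interval frames have passed since the
    last one; a scene change inserts one earlier and restarts the count.
    """
    scene_changes = set(scene_changes)
    keyframes = []
    frames_since_key = max_interval  # forces a Keyframe on frame 0
    for frame in range(total_frames):
        if frame in scene_changes or frames_since_key >= max_interval:
            keyframes.append(frame)
            frames_since_key = 0
        frames_since_key += 1
    return keyframes


pal_interval = max_keyframe_interval(25)
keys = place_keyframes(1000, pal_interval, scene_changes=[100])
```

With a scene change at frame 100 the next forced Keyframe lands 250 frames after that one, not 250 frames after frame 0, so intervals can be shorter than the maximum but never longer.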
13. 'Quantization' tab
Here you will find the settings for minimum and maximum Quantizer for
I-, P- and B-frames. With all alpha builds they were set in the range
2-31 and setting the minimum to 1 would just reset it to 2. The reason
for this is that Quantizer 1 really has little use and just inflates
the filesize, with hardly any visible gain in quality.
With the Release Candidate builds the range has changed to 1-31 so that undersized files will become a thing of the past.
Note: With RC1 quantizer 1 would break filesize prediction again (this time creating oversized files) but this has been fixed since RC2.
(Also, with older builds it was sometimes advised to lower the
maximum settings as well, to prevent frames getting higher quantizers
than they really need and thereby creating visible artifacts. Currently
I do not have enough information to say if this still holds up, but my
guess is it's been fixed because I haven't heard much about it lately.)
Look here for a recent discussion about Quantizer use.
-Trellis
Trellis based R-D quantisation (Rate-Distortion quantisation) is more
advanced and it works only with the quantization type H.263 in alpha
builds. In beta and Release Candidate builds it also works with MPEG
and MPEG-custom. It is a sort of intelligent second quantization pass,
in which a more thorough DCT quantization examination is done by the
Trellis process. In this process it can drop some coefficients
(removing detail) and bring back other coefficients that were removed
by the simpler standard quantization routine. Dropping coefficients
hurts detail, but if bitrate savings are higher, it means that the
codec will be able to use lower quants and retain higher quality.
14. 'Debug' tab
The Debug tab holds some miscellaneous options that are meant for...hang on...debugging!
The 'Performance optimizations'
part allows you to force optimizations for your CPU if it isn't
correctly recognized by the codec. Of course, if you enable the wrong
optimizations you will get, let's say, interesting results...
Typically MMX should always be on unless you use an old Pentium 1, SSE
and SSE2 are meant for newer Intel chips like the Pentium III and 4, and
3DNow!(2) is for AMD chips.
CPUs and their instruction sets
| CPU                           | MMX | MMX ext. | SSE | SSE2 | 3DNOW! | 3DNOW! Pro | AMD64 |
| Intel Pentium                 | -   | -        | -   | -    | -      | -          | -     |
| Intel Pentium MMX             | Yes | -        | -   | -    | -      | -          | -     |
| Intel Pentium II              | Yes | -        | -   | -    | -      | -          | -     |
| Intel Celeron                 | Yes | -        | -   | -    | -      | -          | -     |
| Intel Pentium III             | Yes | Yes      | Yes | -    | -      | -          | -     |
| Intel Celeron II              | Yes | Yes      | Yes | -    | -      | -          | -     |
| Intel Pentium IV, Celeron PIV | Yes | Yes      | Yes | Yes  | -      | -          | -     |
| AMD K6                        | Yes | -        | -   | -    | -      | -          | -     |
| AMD K6-2/K6-3                 | Yes | -        | -   | -    | Yes    | -          | -     |
| Athlon, Duron                 | Yes | Yes      | -   | -    | Yes    | -          | -     |
| Athlon XP                     | Yes | Yes      | Yes | -    | Yes    | Yes        | -     |
| Opteron, Athlon 64, Athlon FX | Yes | Yes      | Yes | Yes  | Yes    | Yes        | Yes   |
-'FourCC used:' - Here you can alter the
FourCC used by the resulting file. FourCC is basically a content
identifier code that is contained in the resulting video file. It tells
the media application (like WMP, MPC or DivX Operational Player) what type of codec
should be used to open the video correctly. You can set this for
instance to DivX or DX50 to play with the DivX 5 codec. If you do, you
have to take into account the limitations of that codec, so you can't
use certain XviD features (like more than 1 B-frame or GMC). Not
recommended really unless you want your files to play on a hardware
player capable of DivX but not XviD.
The options 'OutputDebugString debug level:' and 'Print debug info on each frame'
are really only useful for developers, so best to leave them alone.
These will print all sorts of debug information on the resulting video.
-'Display encoding status'
will pop up an interesting window once you start encoding, showing you
all sorts of statistical info like Quantizers used, types of frames
used and amount of data encoded. Consider it a very fancy progress bar.
If you don't like it you can turn it off here.
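As a side note on the 'FourCC used:' option: a FourCC really is nothing more than four ASCII characters, which programs on Windows commonly handle as one 32-bit little-endian number. The Python sketch below shows that packing. The helper names fourcc_to_int and int_to_fourcc are made up for this example; it does not read or modify any video file.

```python
import struct


def fourcc_to_int(code):
    """Pack a four-character code into its 32-bit little-endian integer form."""
    if len(code) != 4:
        raise ValueError("a FourCC is exactly four characters")
    return struct.unpack("<I", code.encode("ascii"))[0]


def int_to_fourcc(value):
    """Turn the 32-bit integer form back into the four-character code."""
    return struct.pack("<I", value).decode("ascii")


xvid_code = fourcc_to_int("XVID")
dx50_code = fourcc_to_int("DX50")
```

Changing the FourCC only changes this label; the video data itself stays the same, which is why the limitations of the other codec still apply.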
15. Things that work but are not MPEG-4 compliant.
There are some things that you can quite easily do with XviD that are not MPEG-4 compliant.
Not being MPEG-4 compliant means that it can quite possibly work with
any build of XviD but is 'not meant to be' according to official MPEG-4
specifications.
Now, since one major goal of the XviD project is to be fully MPEG-4
compliant this means that support for these non-compliant possibilities
could be stopped/lost/broken at any time never to return. So it's highly recommended
that you refrain from using any of these possibilities if you want
proper support for all your encoded material in the near and far future.
Currently known non-compliant situations are:
-Using non-mod16 resolutions: encoding clips to resolutions of which
width or height is not a multiple of 16 is considered a no-no and is
not compliant. It is quite possible to do mod-4 resolutions with many
builds (don't ask me how) but it is not very good for compressibility. The
smallest MPEG-4 building block is not the 8x8 blocks but the 16x16
macroblocks, and using lower-than-mod16 resolutions will make the codec
use macroblocks 'outside' the frame border....which is 'A Bad
Thing'(tm).
-Using Modulated Quant or Modulated HQ. These were present in older
builds and unfortunately many people did use them now and then, as they
were new features back then and (as usual with new features) a request
was made to test them. Basically the technique makes the codec switch
Quantization Matrices from H.263 to MPEG and back whenever it suited
its compressibility goals (defined by the user).
Although the option was removed a long time ago, clips encoded this way somehow still work.
-The same goes for clips joined/pasted together in video editing tools like Vdub that use different QM's. A clip encoded with
one QM joined with a clip encoded with another QM may work, but is not considered MPEG-4 compliant.
-Clips encoded with certain options that are joined with clips encoded
with those options off can sometimes be non-compliant.
A typical example of this is Qpel. I discovered a while ago that while
Qpel usually doesn't help a lot in the main part of a movie, it can
help a LOT in your average-day scrolling end credits. This
unfortunately turned out to be non-compliant.
-The 'unrestricted' profile makes you build clips that are completely
beyond any official MPEG-4 profile and gives you the full power of the
XviD codec. It's quite obvious that you may break many official rules
using this profile.
-Encoding beyond AS@Level5, especially in higher resolutions (like HDTV & SVGA) is not compliant to any MPEG-4 standard.
(As a sidenote: While much of this clip-joining stuff is not compliant, I
consider it really weird that they should work at all in the first
place. Perhaps someone more in the know could explain this. I wouldn't
be surprised if some of these non-compliancies turned out to be
unforeseen possibilities within the MPEG-4 specifications (especially
the use of multiple QM's))
16. Where do I find more documentation?
VHQ Manual by Syskin
location
"The" Lumi Masking Thread
location
Please note that many of these documents/topics are either for a
specific build or rather outdated. They are still very informative
though and it is recommended that you read them.
17. I'm a newbie. What settings do you recommend?
Since this is beyond a doubt one of the most often asked questions, it certainly deserves an answer.
The default settings will very likely get you good to excellent results on 'normal' source material like DVDs.
If you like a more thorough answer than 'just stick with the defaults' you can follow some of the guidelines mentioned below.
However, they are not the 'ultimate' settings for everything as
there's simply no such thing. Feel free to deviate (yes, become a
deviant!) and experiment with the settings once you feel more confident
using XviD.
Using XviD always requires you to use your own judgement up to a
certain point. The settings below are very safe and are here just to
get you on your way.
-If something's not mentioned, don't mess with it.
-Always use two-pass encoding. Don't mess with settings between passes, keep them the same for both passes.
-Use Doom9's guides. They're especially targeted at newbies.
-Only use target size, not target bitrate or quantizers.
-Profile: Use AS@Level5, nothing else.
-Motion Search Precision: use 6 - Ultra High
-VHQ: Use VHQ 1 on everything, possibly higher settings if you want 1 CD rips. VHQ 4 is perfectly safe, just slow.
-Quantization Type: use H.263 for a softer image, or MPEG for a sharper
image (at the expense of bitrate, not recommended for a 1CD rip).
Either way, use the same one for both passes.
-B-frames are good, but don't go above 2 Maximum B-frames (so put the
number 2 in the Maximum B-frames box (putting in 1 will result in the
same as DivX's Max number of B-frames)). Keep all other B-frame
settings at default.
-Don't use Packed Bitstream.
-Don't mess with Interlaced encoding, greyscale, Debug settings, Curve compression, overflow treatment, GMC,
Reduced Resolution, and Aspect Ratios. These settings are simply not for newbies; you need to understand them
to use them properly.
-Don't use the Zones, they'll just complicate things for you.
-Set 'Maximum I-frame interval' to 10 times the clip's framerate, nothing else.
-Qpel is safe but slow. Always use it if you get undersized files, it will increase quality.
-Chroma Optimizer/Motion is also good and recommended. Always use it.
-When Encoding with XviD make sure the encoding program uses Resize
values that are divisible by 16. That means that the end result has to
have both a width and a height that are divisible by 16. For instance
640x480 = (40 x 16) x (30 x 16).
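The divisible-by-16 rule from the last item is easy to check before you start an encode. The Python sketch below is only an illustration; the helper names (is_mod16, nearest_mod16, macroblock_count) are invented here and are not part of XviD or any resizing tool.

```python
def is_mod16(width, height):
    """True if both width and height are multiples of 16."""
    return width % 16 == 0 and height % 16 == 0


def nearest_mod16(value):
    """Round a width or height to the nearest multiple of 16 (at least 16)."""
    return max(16, ((value + 8) // 16) * 16)


def macroblock_count(width, height):
    """Number of 16x16 macroblocks in a mod16 frame."""
    if not is_mod16(width, height):
        raise ValueError("resolution is not mod16")
    return (width // 16) * (height // 16)


ok = is_mod16(640, 480)                 # the example from the text
blocks = macroblock_count(640, 480)     # 40 x 30 macroblocks
fixed_width = nearest_mod16(650)
```

If your resize values fail the check, nudge them to the nearest multiple of 16 and crop or adjust the aspect ratio accordingly in your encoding program.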