'\" t | |
.\" | |
.\" Author: Lasse Collin | |
.\" | |
.\" This file has been put into the public domain. | |
.\" You can do whatever you want with this file. | |
.\" | |
.TH XZ 1 "2022-10-25" "Tukaani" "XZ Utils"
. | |
.SH NAME
xz, unxz, xzcat, lzma, unlzma, lzcat \- Compress or decompress .xz and .lzma files | |
. | |
.SH SYNOPSIS
.B xz | |
.RI [ option... ] | |
.RI [ file... ] | |
. | |
.SH COMMAND ALIASES
.B unxz | |
is equivalent to | |
.BR "xz \-\-decompress" . | |
.br | |
.B xzcat | |
is equivalent to | |
.BR "xz \-\-decompress \-\-stdout" . | |
.br | |
.B lzma | |
is equivalent to | |
.BR "xz \-\-format=lzma" . | |
.br | |
.B unlzma | |
is equivalent to | |
.BR "xz \-\-format=lzma \-\-decompress" . | |
.br | |
.B lzcat | |
is equivalent to | |
.BR "xz \-\-format=lzma \-\-decompress \-\-stdout" . | |
.PP | |
When writing scripts that need to decompress files, | |
it is recommended to always use the name | |
.B xz | |
with appropriate arguments | |
.RB ( "xz \-d" | |
or | |
.BR "xz \-dc" ) | |
instead of the names | |
.B unxz | |
and | |
.BR xzcat . | |
. | |
.SH DESCRIPTION
.B xz | |
is a general-purpose data compression tool with | |
command line syntax similar to | |
.BR gzip (1) | |
and | |
.BR bzip2 (1). | |
The native file format is the | |
.B .xz | |
format, but the legacy | |
.B .lzma | |
format used by LZMA Utils and | |
raw compressed streams with no container format headers | |
are also supported. | |
.PP | |
.B xz | |
compresses or decompresses each | |
.I file | |
according to the selected operation mode. | |
If no | |
.I files | |
are given or | |
.I file | |
is | |
.BR \- , | |
.B xz | |
reads from standard input and writes the processed data | |
to standard output. | |
.B xz | |
will refuse (display an error and skip the | |
.IR file ) | |
to write compressed data to standard output if it is a terminal. | |
Similarly, | |
.B xz | |
will refuse to read compressed data | |
from standard input if it is a terminal. | |
.PP | |
Unless | |
.B \-\-stdout | |
is specified, | |
.I files | |
other than | |
.B \- | |
are written to a new file whose name is derived from the source | |
.I file | |
name: | |
.IP \(bu 3 | |
When compressing, the suffix of the target file format | |
.RB ( .xz | |
or | |
.BR .lzma ) | |
is appended to the source filename to get the target filename. | |
.IP \(bu 3 | |
When decompressing, the | |
.B .xz | |
or | |
.B .lzma | |
suffix is removed from the filename to get the target filename. | |
.B xz | |
also recognizes the suffixes | |
.B .txz | |
and | |
.BR .tlz , | |
and replaces them with the | |
.B .tar | |
suffix. | |
.PP | |
If the target file already exists, an error is displayed and the | |
.I file | |
is skipped. | |
.PP | |
Unless writing to standard output, | |
.B xz | |
will display a warning and skip the | |
.I file | |
if any of the following applies: | |
.IP \(bu 3 | |
.I File | |
is not a regular file. | |
Symbolic links are not followed, | |
and thus they are not considered to be regular files. | |
.IP \(bu 3 | |
.I File | |
has more than one hard link. | |
.IP \(bu 3 | |
.I File | |
has setuid, setgid, or sticky bit set. | |
.IP \(bu 3 | |
The operation mode is set to compress and the | |
.I file | |
already has a suffix of the target file format | |
.RB ( .xz | |
or | |
.B .txz | |
when compressing to the | |
.B .xz | |
format, and | |
.B .lzma | |
or | |
.B .tlz | |
when compressing to the | |
.B .lzma | |
format). | |
.IP \(bu 3 | |
The operation mode is set to decompress and the | |
.I file | |
doesn't have a suffix of any of the supported file formats | |
.RB ( .xz , | |
.BR .txz , | |
.BR .lzma , | |
or | |
.BR .tlz ). | |
.PP | |
After successfully compressing or decompressing the | |
.IR file , | |
.B xz | |
copies the owner, group, permissions, access time, | |
and modification time from the source | |
.I file | |
to the target file. | |
If copying the group fails, the permissions are modified | |
so that the target file doesn't become accessible to users | |
who didn't have permission to access the source | |
.IR file . | |
.B xz | |
doesn't support copying other metadata like access control lists | |
or extended attributes yet. | |
.PP | |
Once the target file has been successfully closed, the source | |
.I file | |
is removed unless | |
.B \-\-keep | |
was specified. | |
The source | |
.I file | |
is never removed if the output is written to standard output | |
or if an error occurs. | |
.PP | |
Sending | |
.B SIGINFO | |
or | |
.B SIGUSR1 | |
to the | |
.B xz | |
process makes it print progress information to standard error. | |
This has only limited use since when standard error | |
is a terminal, using | |
.B \-\-verbose | |
will display an automatically updating progress indicator. | |
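.PP
As a brief illustration of the default behavior described above,
assuming hypothetical files named
.I foo
and
.I bar.xz
in the current directory:
.RS
.PP
.nf
.ft CW
# Compress foo into foo.xz; foo is removed on success
xz foo
# Decompress bar.xz into bar; \-k keeps bar.xz
xz \-dk bar.xz
# Decompress to standard output; bar.xz is left in place
xz \-dc bar.xz > bar.copy
.ft R
.fi
.RE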
. | |
.SS "Memory usage"
The memory usage of | |
.B xz | |
varies from a few hundred kilobytes to several gigabytes | |
depending on the compression settings. | |
The settings used when compressing a file determine | |
the memory requirements of the decompressor. | |
Typically the decompressor needs 5\ % to 20\ % of | |
the amount of memory that the compressor needed when | |
creating the file. | |
For example, decompressing a file created with | |
.B xz \-9 | |
currently requires 65\ MiB of memory. | |
Still, it is possible to have | |
.B .xz | |
files that require several gigabytes of memory to decompress. | |
.PP | |
Users of older systems in particular may find
the possibility of very large memory usage annoying.
To prevent uncomfortable surprises, | |
.B xz | |
has a built-in memory usage limiter, which is disabled by default. | |
While some operating systems provide ways to limit | |
the memory usage of processes, relying on them
wasn't deemed to be flexible enough (for example, using | |
.BR ulimit (1) | |
to limit virtual memory tends to cripple | |
.BR mmap (2)). | |
.PP | |
The memory usage limiter can be enabled with | |
the command line option \fB\-\-memlimit=\fIlimit\fR. | |
Often it is more convenient to enable the limiter | |
by default by setting the environment variable | |
.BR XZ_DEFAULTS , | |
for example, | |
.BR XZ_DEFAULTS=\-\-memlimit=150MiB . | |
It is possible to set the limits separately | |
for compression and decompression by using | |
.BI \-\-memlimit\-compress= limit | |
and \fB\-\-memlimit\-decompress=\fIlimit\fR. | |
Using these two options outside | |
.B XZ_DEFAULTS | |
is rarely useful because a single run of | |
.B xz | |
cannot do both compression and decompression and | |
.BI \-\-memlimit= limit | |
(or | |
.B \-M | |
.IR limit ) | |
is shorter to type on the command line. | |
.PP | |
If the specified memory usage limit is exceeded when decompressing, | |
.B xz | |
will display an error and decompressing the file will fail. | |
If the limit is exceeded when compressing, | |
.B xz | |
will try to scale the settings down so that the limit | |
is no longer exceeded (except when using | |
.B \-\-format=raw | |
or | |
.BR \-\-no\-adjust ). | |
This way the operation won't fail unless the limit is very small. | |
The scaling of the settings is done in steps that don't | |
match the compression level presets, for example, if the limit is | |
only slightly less than the amount required for | |
.BR "xz \-9" , | |
the settings will be scaled down only a little, | |
not all the way down to | |
.BR "xz \-8" . | |
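.PP
For instance, a shell startup file might enable the limiter for
every run, and a single command might then override it.
The values and the file name
.I foo.tar
below are illustrative assumptions, not recommendations:
.RS
.PP
.nf
.ft CW
# Enable a 150 MiB limit for all xz runs (POSIX shell syntax)
XZ_DEFAULTS=\-\-memlimit=150MiB
export XZ_DEFAULTS
# Use a different limit for one run; \-9 may be scaled down to fit
xz \-M 100MiB \-9 foo.tar
.ft R
.fi
.RE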
. | |
.SS "Concatenation and padding with .xz files"
It is possible to concatenate | |
.B .xz | |
files as is. | |
.B xz | |
will decompress such files as if they were a single | |
.B .xz | |
file. | |
.PP | |
It is possible to insert padding between the concatenated parts | |
or after the last part. | |
The padding must consist of null bytes and the size | |
of the padding must be a multiple of four bytes. | |
This can be useful, for example, if the | |
.B .xz | |
file is stored on a medium that measures file sizes | |
in 512-byte blocks. | |
.PP | |
Concatenation and padding are not allowed with | |
.B .lzma | |
files or raw streams. | |
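.PP
A minimal sketch of concatenation, assuming two hypothetical files
.I a.xz
and
.I b.xz
that were created from
.I a
and
.IR b :
.RS
.PP
.nf
.ft CW
# Join two .xz files as is
cat a.xz b.xz > ab.xz
# Writes the contents of a followed by the contents of b
xz \-dc ab.xz
.ft R
.fi
.RE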
. | |
.SH OPTIONS
. | |
.SS "Integer suffixes and special values"
In most places where an integer argument is expected, | |
an optional suffix is supported to easily indicate large integers. | |
There must be no space between the integer and the suffix. | |
.TP | |
.B KiB | |
Multiply the integer by 1,024 (2^10). | |
.BR Ki , | |
.BR k , | |
.BR kB , | |
.BR K , | |
and | |
.B KB | |
are accepted as synonyms for | |
.BR KiB . | |
.TP | |
.B MiB | |
Multiply the integer by 1,048,576 (2^20). | |
.BR Mi , | |
.BR m , | |
.BR M , | |
and | |
.B MB | |
are accepted as synonyms for | |
.BR MiB . | |
.TP | |
.B GiB | |
Multiply the integer by 1,073,741,824 (2^30). | |
.BR Gi , | |
.BR g , | |
.BR G , | |
and | |
.B GB | |
are accepted as synonyms for | |
.BR GiB . | |
.PP | |
The special value | |
.B max | |
can be used to indicate the maximum integer value | |
supported by the option. | |
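.PP
As an illustration, each of the following alternative command lines
requests the same limit, since 80 times 1,048,576 is 83,886,080.
The file name
.I foo
is hypothetical:
.RS
.PP
.nf
.ft CW
xz \-\-memlimit=80MiB foo
xz \-\-memlimit=80Mi foo
xz \-\-memlimit=83886080 foo
.ft R
.fi
.RE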
. | |
.SS "Operation mode"
If multiple operation mode options are given, | |
the last one takes effect. | |
.TP | |
.BR \-z ", " \-\-compress | |
Compress. | |
This is the default operation mode when no operation mode option | |
is specified and no other operation mode is implied from | |
the command name (for example, | |
.B unxz | |
implies | |
.BR \-\-decompress ). | |
.TP | |
.BR \-d ", " \-\-decompress ", " \-\-uncompress | |
Decompress. | |
.TP | |
.BR \-t ", " \-\-test | |
Test the integrity of compressed | |
.IR files . | |
This option is equivalent to | |
.B "\-\-decompress \-\-stdout" | |
except that the decompressed data is discarded instead of being | |
written to standard output. | |
No files are created or removed. | |
.TP | |
.BR \-l ", " \-\-list | |
Print information about compressed | |
.IR files . | |
No uncompressed output is produced, | |
and no files are created or removed. | |
In list mode, the program cannot read | |
the compressed data from standard | |
input or from other unseekable sources. | |
.IP "" | |
The default listing shows basic information about | |
.IR files , | |
one file per line. | |
To get more detailed information, also use the
.B \-\-verbose | |
option. | |
For even more information, use | |
.B \-\-verbose | |
twice, but note that this may be slow, because getting all the extra | |
information requires many seeks. | |
The width of verbose output exceeds | |
80 characters, so piping the output to, for example, | |
.B "less\ \-S" | |
may be convenient if the terminal isn't wide enough. | |
.IP "" | |
The exact output may vary between | |
.B xz | |
versions and different locales. | |
For machine-readable output, | |
.B \-\-robot \-\-list | |
should be used. | |
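.IP ""
A few illustrative invocations, assuming a hypothetical file
.IR foo.xz :
.RS
.RS
.PP
.nf
.ft CW
# Basic information, one line per file
xz \-l foo.xz
# Most detailed listing, viewed without line wrapping
xz \-lvv foo.xz | less \-S
# Machine-readable output for scripts
xz \-\-robot \-\-list foo.xz
.ft R
.fi
.RE
.RE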
. | |
.SS "Operation modifiers"
.TP | |
.BR \-k ", " \-\-keep | |
Don't delete the input files. | |
.IP "" | |
Since | |
.B xz | |
5.2.6, | |
this option also makes | |
.B xz | |
compress or decompress even if the input is | |
a symbolic link to a regular file, | |
has more than one hard link, | |
or has the setuid, setgid, or sticky bit set. | |
The setuid, setgid, and sticky bits are not copied | |
to the target file. | |
In earlier versions this was only done with | |
.BR \-\-force . | |
.TP | |
.BR \-f ", " \-\-force | |
This option has several effects: | |
.RS | |
.IP \(bu 3 | |
If the target file already exists, | |
delete it before compressing or decompressing. | |
.IP \(bu 3 | |
Compress or decompress even if the input is | |
a symbolic link to a regular file, | |
has more than one hard link, | |
or has the setuid, setgid, or sticky bit set. | |
The setuid, setgid, and sticky bits are not copied | |
to the target file. | |
.IP \(bu 3 | |
When used with | |
.B \-\-decompress | |
.B \-\-stdout | |
and | |
.B xz | |
cannot recognize the type of the source file, | |
copy the source file as is to standard output. | |
This allows | |
.B xzcat | |
.B \-\-force | |
to be used like | |
.BR cat (1) | |
for files that have not been compressed with | |
.BR xz . | |
Note that in the future,
.B xz | |
might support new compressed file formats, which may make | |
.B xz | |
decompress more types of files instead of copying them as is to | |
standard output. | |
.BI \-\-format= format | |
can be used to restrict | |
.B xz | |
to decompress only a single file format. | |
.RE | |
.TP | |
.BR \-c ", " \-\-stdout ", " \-\-to\-stdout | |
Write the compressed or decompressed data to | |
standard output instead of a file. | |
This implies | |
.BR \-\-keep . | |
.TP | |
.B \-\-single\-stream | |
Decompress only the first | |
.B .xz | |
stream, and | |
silently ignore possible remaining input data following the stream. | |
Normally such trailing garbage makes | |
.B xz | |
display an error. | |
.IP "" | |
.B xz | |
never decompresses more than one stream from | |
.B .lzma | |
files or raw streams, but this option still makes | |
.B xz | |
ignore the possible trailing data after the | |
.B .lzma | |
file or raw stream. | |
.IP "" | |
This option has no effect if the operation mode is not | |
.B \-\-decompress | |
or | |
.BR \-\-test . | |
.TP | |
.B \-\-no\-sparse | |
Disable creation of sparse files. | |
By default, if decompressing into a regular file, | |
.B xz | |
tries to make the file sparse if the decompressed data contains | |
long sequences of binary zeros. | |
It also works when writing to standard output | |
as long as standard output is connected to a regular file | |
and certain additional conditions are met to make it safe. | |
Creating sparse files may save disk space and speed up | |
the decompression by reducing the amount of disk I/O. | |
.TP | |
\fB\-S\fR \fI.suf\fR, \fB\-\-suffix=\fI.suf | |
When compressing, use | |
.I .suf | |
as the suffix for the target file instead of | |
.B .xz | |
or | |
.BR .lzma . | |
If not writing to standard output and | |
the source file already has the suffix | |
.IR .suf , | |
a warning is displayed and the file is skipped. | |
.IP "" | |
When decompressing, recognize files with the suffix | |
.I .suf | |
in addition to files with the | |
.BR .xz , | |
.BR .txz , | |
.BR .lzma , | |
or | |
.B .tlz | |
suffix. | |
If the source file has the suffix | |
.IR .suf , | |
the suffix is removed to get the target filename. | |
.IP "" | |
When compressing or decompressing raw streams | |
.RB ( \-\-format=raw ), | |
the suffix must always be specified unless | |
writing to standard output, | |
because there is no default suffix for raw streams. | |
.TP | |
\fB\-\-files\fR[\fB=\fIfile\fR] | |
Read the filenames to process from | |
.IR file ; | |
if | |
.I file | |
is omitted, filenames are read from standard input. | |
Filenames must be terminated with the newline character. | |
A dash | |
.RB ( \- ) | |
is taken as a regular filename; it doesn't mean standard input. | |
If filenames are also given as command line arguments, they are
processed before the filenames read from | |
.IR file . | |
.TP | |
\fB\-\-files0\fR[\fB=\fIfile\fR] | |
This is identical to \fB\-\-files\fR[\fB=\fIfile\fR] except | |
that each filename must be terminated with the null character. | |
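.IP ""
As a sketch, null-terminated filenames can be piped in from another
program, which is safe even for names that contain newlines.
The example assumes a
.BR find (1)
implementation that supports
.BR \-print0 :
.RS
.RS
.PP
.nf
.ft CW
# Compress every regular file below the current directory
find . \-type f \-print0 | xz \-\-files0
.ft R
.fi
.RE
.RE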
. | |
.SS "Basic file format and compression options"
.TP | |
\fB\-F\fR \fIformat\fR, \fB\-\-format=\fIformat | |
Specify the file | |
.I format | |
to compress or decompress: | |
.RS | |
.TP | |
.B auto | |
This is the default. | |
When compressing, | |
.B auto | |
is equivalent to | |
.BR xz . | |
When decompressing, | |
the format of the input file is automatically detected. | |
Note that raw streams (created with | |
.BR \-\-format=raw ) | |
cannot be auto-detected. | |
.TP | |
.B xz | |
Compress to the | |
.B .xz | |
file format, or accept only | |
.B .xz | |
files when decompressing. | |
.TP | |
.BR lzma ", " alone | |
Compress to the legacy | |
.B .lzma | |
file format, or accept only | |
.B .lzma | |
files when decompressing. | |
The alternative name | |
.B alone | |
is provided for backwards compatibility with LZMA Utils. | |
.TP | |
.B raw | |
Compress or uncompress a raw stream (no headers). | |
This is meant for advanced users only. | |
To decode raw streams, you need to use
.B \-\-format=raw | |
and explicitly specify the filter chain, | |
which normally would have been stored in the container headers. | |
.RE | |
.TP | |
\fB\-C\fR \fIcheck\fR, \fB\-\-check=\fIcheck | |
Specify the type of the integrity check. | |
The check is calculated from the uncompressed data and | |
stored in the | |
.B .xz | |
file. | |
This option has an effect only when compressing into the | |
.B .xz | |
format; the | |
.B .lzma | |
format doesn't support integrity checks. | |
The integrity check (if any) is verified when the | |
.B .xz | |
file is decompressed. | |
.IP "" | |
Supported | |
.I check | |
types: | |
.RS | |
.TP | |
.B none | |
Don't calculate an integrity check at all. | |
This is usually a bad idea. | |
This can be useful when integrity of the data is verified | |
by other means anyway. | |
.TP | |
.B crc32 | |
Calculate CRC32 using the polynomial from IEEE-802.3 (Ethernet). | |
.TP | |
.B crc64 | |
Calculate CRC64 using the polynomial from ECMA-182. | |
This is the default, since it is slightly better than CRC32 | |
at detecting damaged files and the speed difference is negligible. | |
.TP | |
.B sha256 | |
Calculate SHA-256. | |
This is somewhat slower than CRC32 and CRC64. | |
.RE | |
.IP "" | |
Integrity of the | |
.B .xz | |
headers is always verified with CRC32. | |
It is not possible to change or disable it. | |
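.IP ""
For example, assuming a hypothetical file named
.IR foo ,
a non-default check type could be selected and then inspected
in list mode:
.RS
.RS
.PP
.nf
.ft CW
# Store a SHA-256 check instead of the default CRC64
xz \-\-check=sha256 foo
# The listing reports the check type used in foo.xz
xz \-l foo.xz
.ft R
.fi
.RE
.RE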
.TP | |
.B \-\-ignore\-check | |
Don't verify the integrity check of the compressed data when decompressing. | |
The CRC32 values in the | |
.B .xz | |
headers will still be verified normally. | |
.IP "" | |
.B "Do not use this option unless you know what you are doing." | |
Possible reasons to use this option: | |
.RS | |
.IP \(bu 3 | |
Trying to recover data from a corrupt .xz file. | |
.IP \(bu 3 | |
Speeding up decompression. | |
This matters mostly with SHA-256 or | |
with files that have compressed extremely well. | |
It's recommended to not use this option for this purpose | |
unless the file integrity is verified externally in some other way. | |
.RE | |
.TP | |
.BR \-0 " ... " \-9 | |
Select a compression preset level. | |
The default is | |
.BR \-6 . | |
If multiple preset levels are specified, | |
the last one takes effect. | |
If a custom filter chain was already specified, setting | |
a compression preset level clears the custom filter chain. | |
.IP "" | |
The differences between the presets are more significant than with | |
.BR gzip (1) | |
and | |
.BR bzip2 (1). | |
The selected compression settings determine | |
the memory requirements of the decompressor, | |
thus using too high a preset level might make it painful
to decompress the file on an old system with little RAM. | |
Specifically, | |
.B "it's not a good idea to blindly use \-9 for everything" | |
like it often is with | |
.BR gzip (1) | |
and | |
.BR bzip2 (1). | |
.RS | |
.TP | |
.BR "\-0" " ... " "\-3" | |
These are somewhat fast presets. | |
.B \-0 | |
is sometimes faster than | |
.B "gzip \-9" | |
while compressing much better. | |
The higher ones often have speed comparable to | |
.BR bzip2 (1) | |
with comparable or better compression ratio, | |
although the results | |
depend a lot on the type of data being compressed. | |
.TP | |
.BR "\-4" " ... " "\-6" | |
Good to very good compression while keeping | |
decompressor memory usage reasonable even for old systems. | |
.B \-6 | |
is the default, which is usually a good choice | |
for distributing files that need to be decompressible | |
even on systems with only 16\ MiB RAM. | |
.RB ( \-5e | |
or | |
.B \-6e | |
may be worth considering too. | |
See | |
.BR \-\-extreme .) | |
.TP | |
.B "\-7 ... \-9" | |
These are like | |
.B \-6 | |
but with higher compressor and decompressor memory requirements. | |
These are useful only when compressing files bigger than | |
8\ MiB, 16\ MiB, and 32\ MiB, respectively. | |
.RE | |
.IP "" | |
On the same hardware, the decompression speed is approximately | |
a constant number of bytes of compressed data per second. | |
In other words, the better the compression, | |
the faster the decompression will usually be. | |
This also means that the amount of uncompressed output | |
produced per second can vary a lot. | |
.IP "" | |
The following table summarises the features of the presets: | |
.RS | |
.RS | |
.PP | |
.TS
tab(;);
c c c c c
n n n n n.
Preset;DictSize;CompCPU;CompMem;DecMem
\-0;256 KiB;0;3 MiB;1 MiB
\-1;1 MiB;1;9 MiB;2 MiB
\-2;2 MiB;2;17 MiB;3 MiB
\-3;4 MiB;3;32 MiB;5 MiB
\-4;4 MiB;4;48 MiB;5 MiB
\-5;8 MiB;5;94 MiB;9 MiB
\-6;8 MiB;6;94 MiB;9 MiB
\-7;16 MiB;6;186 MiB;17 MiB
\-8;32 MiB;6;370 MiB;33 MiB
\-9;64 MiB;6;674 MiB;65 MiB
.TE
.RE | |
.RE | |
.IP "" | |
Column descriptions: | |
.RS | |
.IP \(bu 3 | |
DictSize is the LZMA2 dictionary size. | |
It is a waste of memory to use a dictionary bigger than
the size of the uncompressed file. | |
This is why it is good to avoid using the presets | |
.BR \-7 " ... " \-9 | |
when there's no real need for them. | |
At | |
.B \-6 | |
and lower, the amount of memory wasted is | |
usually low enough to not matter. | |
.IP \(bu 3 | |
CompCPU is a simplified representation of the LZMA2 settings | |
that affect compression speed. | |
The dictionary size affects speed too, | |
so while CompCPU is the same for levels | |
.BR \-6 " ... " \-9 , | |
higher levels still tend to be a little slower. | |
To get even slower and thus possibly better compression, see | |
.BR \-\-extreme . | |
.IP \(bu 3 | |
CompMem contains the compressor memory requirements | |
in the single-threaded mode. | |
It may vary slightly between | |
.B xz | |
versions. | |
Memory requirements of some of the future multithreaded modes may | |
be dramatically higher than that of the single-threaded mode. | |
.IP \(bu 3 | |
DecMem contains the decompressor memory requirements. | |
That is, the compression settings determine | |
the memory requirements of the decompressor. | |
The exact decompressor memory usage is slightly more than | |
the LZMA2 dictionary size, but the values in the table | |
have been rounded up to the next full MiB. | |
.RE | |
.TP | |
.BR \-e ", " \-\-extreme | |
Use a slower variant of the selected compression preset level | |
.RB ( \-0 " ... " \-9 ) | |
to hopefully get a little bit better compression ratio, | |
but with bad luck this can also make it worse. | |
Decompressor memory usage is not affected, | |
but compressor memory usage increases a little at preset levels | |
.BR \-0 " ... " \-3 . | |
.IP "" | |
Since there are two presets with dictionary sizes | |
4\ MiB and 8\ MiB, the presets | |
.B \-3e | |
and | |
.B \-5e | |
use slightly faster settings (lower CompCPU) than | |
.B \-4e | |
and | |
.BR \-6e , | |
respectively. | |
That way no two presets are identical. | |
.RS | |
.RS | |
.PP | |
.TS
tab(;);
c c c c c
n n n n n.
Preset;DictSize;CompCPU;CompMem;DecMem
\-0e;256 KiB;8;4 MiB;1 MiB
\-1e;1 MiB;8;13 MiB;2 MiB
\-2e;2 MiB;8;25 MiB;3 MiB
\-3e;4 MiB;7;48 MiB;5 MiB
\-4e;4 MiB;8;48 MiB;5 MiB
\-5e;8 MiB;7;94 MiB;9 MiB
\-6e;8 MiB;8;94 MiB;9 MiB
\-7e;16 MiB;8;186 MiB;17 MiB
\-8e;32 MiB;8;370 MiB;33 MiB
\-9e;64 MiB;8;674 MiB;65 MiB
.TE
.RE | |
.RE | |
.IP "" | |
For example, there are a total of four presets that use | |
8\ MiB dictionary, whose order from the fastest to the slowest is | |
.BR \-5 , | |
.BR \-6 , | |
.BR \-5e , | |
and | |
.BR \-6e . | |
.TP | |
.B \-\-fast | |
.PD 0 | |
.TP | |
.B \-\-best | |
.PD | |
These are somewhat misleading aliases for | |
.B \-0 | |
and | |
.BR \-9 , | |
respectively. | |
These are provided only for backwards compatibility | |
with LZMA Utils. | |
Avoid using these options. | |
.TP | |
.BI \-\-block\-size= size | |
When compressing to the | |
.B .xz | |
format, split the input data into blocks of | |
.I size | |
bytes. | |
The blocks are compressed independently from each other, | |
which helps with multi-threading and | |
makes limited random-access decompression possible. | |
This option is typically used to override the default | |
block size in multi-threaded mode, | |
but this option can be used in single-threaded mode too. | |
.IP "" | |
In multi-threaded mode about three times | |
.I size | |
bytes will be allocated in each thread for buffering input and output. | |
The default | |
.I size | |
is three times the LZMA2 dictionary size or 1 MiB, | |
whichever is more. | |
Typically a good value is 2\(en4 times | |
the size of the LZMA2 dictionary or at least 1 MiB. | |
Using | |
.I size | |
less than the LZMA2 dictionary size is a waste of RAM
because then the LZMA2 dictionary buffer will never get fully used. | |
The sizes of the blocks are stored in the block headers, | |
which a future version of | |
.B xz | |
will use for multi-threaded decompression. | |
.IP "" | |
In single-threaded mode no block splitting is done by default. | |
Setting this option doesn't affect memory usage. | |
No size information is stored in block headers, | |
thus files created in single-threaded mode | |
won't be identical to files created in multi-threaded mode. | |
The lack of size information also means that a future version of | |
.B xz | |
won't be able to decompress the files in multi-threaded mode.
.TP | |
.BI \-\-block\-list= sizes | |
When compressing to the | |
.B .xz | |
format, start a new block after | |
the given intervals of uncompressed data. | |
.IP "" | |
The uncompressed | |
.I sizes | |
of the blocks are specified as a comma-separated list. | |
Omitting a size (two or more consecutive commas) is a shorthand | |
to use the size of the previous block. | |
.IP "" | |
If the input file is bigger than the sum of | |
.IR sizes , | |
the last value in | |
.I sizes | |
is repeated until the end of the file. | |
A special value of | |
.B 0 | |
may be used as the last value to indicate that | |
the rest of the file should be encoded as a single block. | |
.IP "" | |
If one specifies | |
.I sizes | |
that exceed the encoder's block size | |
(either the default value in threaded mode or | |
the value specified with \fB\-\-block\-size=\fIsize\fR), | |
the encoder will create additional blocks while | |
keeping the boundaries specified in | |
.IR sizes . | |
For example, if one specifies | |
.B \-\-block\-size=10MiB | |
.B \-\-block\-list=5MiB,10MiB,8MiB,12MiB,24MiB | |
and the input file is 80 MiB, | |
one will get 11 blocks: | |
5, 10, 8, 10, 2, 10, 10, 4, 10, 10, and 1 MiB. | |
.IP "" | |
In multi-threaded mode the sizes of the blocks | |
are stored in the block headers. | |
This isn't done in single-threaded mode, | |
so the encoded output won't be | |
identical to that of the multi-threaded mode. | |
.TP | |
.BI \-\-flush\-timeout= timeout | |
When compressing, if more than | |
.I timeout | |
milliseconds (a positive integer) has passed since the previous flush and | |
reading more input would block, | |
all the pending input data is flushed from the encoder and | |
made available in the output stream. | |
This can be useful if | |
.B xz | |
is used to compress data that is streamed over a network. | |
Small | |
.I timeout | |
values make the data available at the receiving end | |
with a small delay, but large | |
.I timeout | |
values give better compression ratio. | |
.IP "" | |
This feature is disabled by default. | |
If this option is specified more than once, the last one takes effect. | |
The special | |
.I timeout | |
value of | |
.B 0 | |
can be used to explicitly disable this feature. | |
.IP "" | |
This feature is not available on non-POSIX systems. | |
.IP "" | |
.\" FIXME | |
.B "This feature is still experimental." | |
Currently | |
.B xz | |
is unsuitable for decompressing the stream in real time due to how | |
.B xz | |
does buffering. | |
.TP | |
.BI \-\-memlimit\-compress= limit | |
Set a memory usage limit for compression. | |
If this option is specified multiple times, | |
the last one takes effect. | |
.IP "" | |
If the compression settings exceed the | |
.IR limit , | |
.B xz | |
will adjust the settings downwards so that | |
the limit is no longer exceeded and display a notice that | |
automatic adjustment was done. | |
Such adjustments are not made when compressing with | |
.B \-\-format=raw | |
or if | |
.B \-\-no\-adjust | |
has been specified. | |
In those cases, an error is displayed and | |
.B xz | |
will exit with exit status 1. | |
.IP "" | |
The | |
.I limit | |
can be specified in multiple ways: | |
.RS | |
.IP \(bu 3 | |
The | |
.I limit | |
can be an absolute value in bytes. | |
Using an integer suffix like | |
.B MiB | |
can be useful. | |
Example: | |
.B "\-\-memlimit\-compress=80MiB" | |
.IP \(bu 3 | |
The | |
.I limit | |
can be specified as a percentage of total physical memory (RAM). | |
This can be useful especially when setting the | |
.B XZ_DEFAULTS | |
environment variable in a shell initialization script | |
that is shared between different computers. | |
That way the limit is automatically bigger | |
on systems with more memory. | |
Example: | |
.B "\-\-memlimit\-compress=70%" | |
.IP \(bu 3 | |
The | |
.I limit | |
can be reset back to its default value by setting it to | |
.BR 0 . | |
This is currently equivalent to setting the | |
.I limit | |
to | |
.B max | |
(no memory usage limit). | |
Once multithreading support has been implemented, | |
there may be a difference between | |
.B 0 | |
and | |
.B max | |
for the multithreaded case, so it is recommended to use | |
.B 0 | |
instead of | |
.B max | |
until the details have been decided. | |
.RE | |
.IP "" | |
For 32-bit | |
.B xz | |
there is a special case: if the | |
.I limit | |
would be over | |
.BR "4020\ MiB" , | |
the | |
.I limit | |
is set to | |
.BR "4020\ MiB" . | |
On MIPS32 | |
.B "2000\ MiB" | |
is used instead. | |
(The values | |
.B 0 | |
and | |
.B max | |
aren't affected by this. | |
A similar feature doesn't exist for decompression.) | |
This can be helpful when a 32-bit executable has access | |
to 4\ GiB address space (2 GiB on MIPS32) | |
while hopefully doing no harm in other situations. | |
.IP "" | |
See also the section | |
.BR "Memory usage" . | |
.TP | |
.BI \-\-memlimit\-decompress= limit | |
Set a memory usage limit for decompression. | |
This also affects the | |
.B \-\-list | |
mode. | |
If the operation is not possible without exceeding the | |
.IR limit , | |
.B xz | |
will display an error and decompressing the file will fail. | |
See | |
.BI \-\-memlimit\-compress= limit | |
for possible ways to specify the | |
.IR limit . | |
.TP | |
\fB\-M\fR \fIlimit\fR, \fB\-\-memlimit=\fIlimit\fR, \fB\-\-memory=\fIlimit | |
This is equivalent to specifying | |
.BI \-\-memlimit\-compress= limit | |
\fB\-\-memlimit\-decompress=\fIlimit\fR. | |
.TP | |
.B \-\-no\-adjust | |
Display an error and exit if the compression settings exceed | |
the memory usage limit. | |
The default is to adjust the settings downwards so | |
that the memory usage limit is not exceeded. | |
Automatic adjusting is always disabled when creating raw streams | |
.RB ( \-\-format=raw ). | |
.TP | |
\fB\-T\fR \fIthreads\fR, \fB\-\-threads=\fIthreads | |
Specify the number of worker threads to use. | |
Setting | |
.I threads | |
to a special value | |
.B 0 | |
makes | |
.B xz | |
use as many threads as there are CPU cores on the system. | |
The actual number of threads can be less than | |
.I threads | |
if the input file is not big enough | |
for threading with the given settings or | |
if using more threads would exceed the memory usage limit. | |
.IP "" | |
Currently the only threading method is to split the input into | |
blocks and compress them independently from each other. | |
The default block size depends on the compression level and | |
can be overridden with the | |
.BI \-\-block\-size= size | |
option. | |
.IP "" | |
Threaded decompression hasn't been implemented yet.
Once it is, it will only work on files that contain multiple blocks
with size information in block headers.
All files compressed in multi-threaded mode meet this condition, | |
but files compressed in single-threaded mode don't even if | |
.BI \-\-block\-size= size | |
is used. | |
. | |
.SS "Custom compressor filter chains" | |
A custom filter chain allows specifying | |
the compression settings in detail instead of relying on | |
the settings associated with the presets.
When a custom filter chain is specified, | |
preset options | |
.RB ( \-0 | |
\&...\& | |
.B \-9 | |
and | |
.BR \-\-extreme ) | |
earlier on the command line are forgotten. | |
If a preset option is specified | |
after one or more custom filter chain options, | |
the new preset takes effect and | |
the custom filter chain options specified earlier are forgotten. | |
.PP | |
A filter chain is comparable to piping on the command line. | |
When compressing, the uncompressed input goes to the first filter, | |
whose output goes to the next filter (if any). | |
The output of the last filter gets written to the compressed file. | |
The maximum number of filters in the chain is four, | |
but typically a filter chain has only one or two filters. | |
.PP | |
Many filters have limitations on where they can be | |
in the filter chain: | |
some filters can work only as the last filter in the chain, | |
some only as a non-last filter, and some work in any position | |
in the chain. | |
Depending on the filter, this limitation is either inherent to | |
the filter design or exists to prevent security issues. | |
.PP | |
A custom filter chain is specified by using one or more | |
filter options in the order they are wanted in the filter chain. | |
That is, the order of filter options is significant! | |
When decoding raw streams | |
.RB ( \-\-format=raw ), | |
the filter chain is specified in the same order as | |
it was specified when compressing. | |
.PP | |
Filters take filter-specific | |
.I options | |
as a comma-separated list. | |
Extra commas in | |
.I options | |
are ignored. | |
Every option has a default value, so you need to | |
specify only those you want to change. | |
.PP | |
To see the whole filter chain and | |
.IR options , | |
use | |
.B "xz \-vv" | |
(that is, use | |
.B \-\-verbose | |
twice). | |
This also works for viewing the filter chain options used by presets.
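.PP
As a minimal sketch of why the order of filter options matters,
the shell function below round-trips a string through a raw stream,
giving the same chain (Delta first, LZMA2 last)
to both the compressor and the decompressor.
It assumes that an
.B xz
binary is available in the search path; the function name
.B raw_roundtrip
and the option values are illustrative only.

```shell
# Hypothetical helper; assumes an xz binary is installed.
# Compress "$1" to a raw stream and decompress it again with the
# same filter chain, given in the same order on both sides.
raw_roundtrip() {
    printf '%s' "$1" |
        xz --format=raw --delta=dist=2 --lzma2=dict=1MiB --stdout |
        xz --format=raw --delta=dist=2 --lzma2=dict=1MiB --decompress --stdout
}
```

.PP
Calling the function with any string should print that string back unchanged.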
.TP | |
\fB\-\-lzma1\fR[\fB=\fIoptions\fR] | |
.PD 0 | |
.TP | |
\fB\-\-lzma2\fR[\fB=\fIoptions\fR] | |
.PD | |
Add LZMA1 or LZMA2 filter to the filter chain. | |
These filters can be used only as the last filter in the chain. | |
.IP "" | |
LZMA1 is a legacy filter, | |
which is supported almost solely due to the legacy | |
.B .lzma | |
file format, which supports only LZMA1. | |
LZMA2 is an updated | |
version of LZMA1 to fix some practical issues of LZMA1. | |
The | |
.B .xz | |
format uses LZMA2 and doesn't support LZMA1 at all. | |
Compression speed and ratios of LZMA1 and LZMA2 | |
are practically the same. | |
.IP "" | |
LZMA1 and LZMA2 share the same set of | |
.IR options : | |
.RS | |
.TP | |
.BI preset= preset | |
Reset all LZMA1 or LZMA2 | |
.I options | |
to | |
.IR preset . | |
.I Preset | |
consists of an integer, which may be followed by single-letter
preset modifiers. | |
The integer can be from | |
.B 0 | |
to | |
.BR 9 , | |
matching the command line options | |
.B \-0 | |
\&...\& | |
.BR \-9 . | |
The only supported modifier is currently | |
.BR e , | |
which matches | |
.BR \-\-extreme . | |
If no | |
.B preset | |
is specified, the default values of LZMA1 or LZMA2 | |
.I options | |
are taken from the preset | |
.BR 6 . | |
.TP | |
.BI dict= size | |
Dictionary (history buffer) | |
.I size | |
indicates how many bytes of the recently processed
uncompressed data are kept in memory.
The algorithm tries to find repeating byte sequences (matches) in | |
the uncompressed data, and replace them with references | |
to the data currently in the dictionary. | |
The bigger the dictionary, the higher is the chance | |
to find a match. | |
Thus, increasing dictionary | |
.I size | |
usually improves compression ratio, but | |
a dictionary bigger than the uncompressed file is a waste of memory.
.IP "" | |
Typical dictionary | |
.I size | |
is from 64\ KiB to 64\ MiB. | |
The minimum is 4\ KiB. | |
The maximum for compression is currently 1.5\ GiB (1536\ MiB). | |
The decompressor already supports dictionaries up to | |
one byte less than 4\ GiB, which is the maximum for | |
the LZMA1 and LZMA2 stream formats. | |
.IP "" | |
Dictionary | |
.I size | |
and match finder | |
.RI ( mf ) | |
together determine the memory usage of the LZMA1 or LZMA2 encoder. | |
Decompression requires a dictionary
.I size
at least as big as the one used when compressing,
thus the memory usage of the decoder is determined
by the dictionary size used when compressing.
The | |
.B .xz | |
headers store the dictionary | |
.I size | |
either as | |
.RI "2^" n | |
or | |
.RI "2^" n " + 2^(" n "\-1)," | |
so these | |
.I sizes | |
are somewhat preferred for compression. | |
Other | |
.I sizes | |
will get rounded up when stored in the | |
.B .xz | |
headers. | |
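.IP ""
The rounding described above can be sketched in shell arithmetic.
This is an illustration of the rule stated in the text,
not the liblzma implementation; the function name
.B round_dict_size
is hypothetical, and sizes are given in bytes.

```shell
# Hypothetical helper: round a dictionary size (in bytes) up to the
# next value of the form 2^n or 2^n + 2^(n-1), starting from the
# 4 KiB minimum and stopping at the 1.5 GiB compression maximum.
round_dict_size() {
    want=$1
    n=12                                  # 2^12 = 4 KiB
    while [ "$n" -le 30 ]; do
        pow=$(( 1 << n ))                 # 2^n
        if [ "$want" -le "$pow" ]; then
            printf '%s\n' "$pow"
            return 0
        fi
        mid=$(( pow + (pow >> 1) ))       # 2^n + 2^(n-1)
        if [ "$want" -le "$mid" ]; then
            printf '%s\n' "$mid"
            return 0
        fi
        n=$(( n + 1 ))
    done
    echo "dictionary size too large" >&2
    return 1
}
```

.IP ""
For example, a request of 5000000 bytes falls between 4\ MiB and 6\ MiB,
so the sketch prints 6291456.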
.TP | |
.BI lc= lc | |
Specify the number of literal context bits. | |
The minimum is 0 and the maximum is 4; the default is 3. | |
In addition, the sum of | |
.I lc | |
and | |
.I lp | |
must not exceed 4. | |
.IP "" | |
All bytes that cannot be encoded as matches | |
are encoded as literals. | |
That is, literals are simply 8-bit bytes | |
that are encoded one at a time. | |
.IP "" | |
The literal coding makes an assumption that the highest | |
.I lc | |
bits of the previous uncompressed byte correlate | |
with the next byte. | |
For example, in typical English text, an upper-case letter is | |
often followed by a lower-case letter, and a lower-case | |
letter is usually followed by another lower-case letter. | |
In the US-ASCII character set, the highest three bits are 010 | |
for upper-case letters and 011 for lower-case letters. | |
When | |
.I lc | |
is at least 3, the literal coding can take advantage of | |
this property in the uncompressed data. | |
.IP "" | |
The default value (3) is usually good. | |
If you want maximum compression, test | |
.BR lc=4 . | |
Sometimes it helps a little, and | |
sometimes it makes compression worse. | |
If it makes it worse, test | |
.B lc=2 | |
too. | |
.TP | |
.BI lp= lp | |
Specify the number of literal position bits. | |
The minimum is 0 and the maximum is 4; the default is 0. | |
.IP "" | |
.I Lp | |
affects what kind of alignment in the uncompressed data is | |
assumed when encoding literals. | |
See | |
.I pb | |
below for more information about alignment. | |
.TP | |
.BI pb= pb | |
Specify the number of position bits. | |
The minimum is 0 and the maximum is 4; the default is 2. | |
.IP "" | |
.I Pb | |
affects what kind of alignment in the uncompressed data is | |
assumed in general. | |
The default means four-byte alignment | |
.RI (2^ pb =2^2=4), | |
which is often a good choice when there's no better guess. | |
.IP "" | |
When the alignment is known, setting | |
.I pb | |
accordingly may reduce the file size a little. | |
For example, with text files having one-byte | |
alignment (US-ASCII, ISO-8859-*, UTF-8), setting | |
.B pb=0 | |
can improve compression slightly. | |
For UTF-16 text, | |
.B pb=1 | |
is a good choice. | |
If the alignment is an odd number like 3 bytes, | |
.B pb=0 | |
might be the best choice. | |
.IP "" | |
Even though the assumed alignment can be adjusted with | |
.I pb | |
and | |
.IR lp , | |
LZMA1 and LZMA2 still slightly favor 16-byte alignment. | |
It might be worth taking into account when designing file formats | |
that are likely to be often compressed with LZMA1 or LZMA2. | |
.TP | |
.BI mf= mf | |
Match finder has a major effect on encoder speed, | |
memory usage, and compression ratio. | |
Usually Hash Chain match finders are faster than Binary Tree | |
match finders. | |
The default depends on the | |
.IR preset : | |
0 uses | |
.BR hc3 , | |
1\(en3 | |
use | |
.BR hc4 , | |
and the rest use | |
.BR bt4 . | |
.IP "" | |
The following match finders are supported. | |
The memory usage formulas below are rough approximations, | |
which are closest to the reality when | |
.I dict | |
is a power of two. | |
.RS | |
.TP | |
.B hc3 | |
Hash Chain with 2- and 3-byte hashing | |
.br | |
Minimum value for | |
.IR nice : | |
3 | |
.br | |
Memory usage: | |
.br | |
.I dict | |
* 7.5 (if | |
.I dict | |
<= 16 MiB); | |
.br | |
.I dict | |
* 5.5 + 64 MiB (if | |
.I dict | |
> 16 MiB) | |
.TP | |
.B hc4 | |
Hash Chain with 2-, 3-, and 4-byte hashing | |
.br | |
Minimum value for | |
.IR nice : | |
4 | |
.br | |
Memory usage: | |
.br | |
.I dict | |
* 7.5 (if | |
.I dict | |
<= 32 MiB); | |
.br | |
.I dict | |
* 6.5 (if | |
.I dict | |
> 32 MiB) | |
.TP | |
.B bt2 | |
Binary Tree with 2-byte hashing | |
.br | |
Minimum value for | |
.IR nice : | |
2 | |
.br | |
Memory usage: | |
.I dict | |
* 9.5 | |
.TP | |
.B bt3 | |
Binary Tree with 2- and 3-byte hashing | |
.br | |
Minimum value for | |
.IR nice : | |
3 | |
.br | |
Memory usage: | |
.br | |
.I dict | |
* 11.5 (if | |
.I dict | |
<= 16 MiB); | |
.br | |
.I dict | |
* 9.5 + 64 MiB (if | |
.I dict | |
> 16 MiB) | |
.TP | |
.B bt4 | |
Binary Tree with 2-, 3-, and 4-byte hashing | |
.br | |
Minimum value for | |
.IR nice : | |
4 | |
.br | |
Memory usage: | |
.br | |
.I dict | |
* 11.5 (if | |
.I dict | |
<= 32 MiB); | |
.br | |
.I dict | |
* 10.5 (if | |
.I dict | |
> 32 MiB) | |
.RE | |
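.IP ""
The formulas above can be turned into a rough estimator.
The following is only a sketch: the function name
.B mf_memory_mib
is hypothetical, the dictionary size is taken in whole MiB,
and the result is truncated to whole MiB.

```shell
# Hypothetical helper: estimate encoder memory usage in MiB.
# Usage: mf_memory_mib MATCH_FINDER DICT_SIZE_IN_MIB
# Factors are scaled by 10 so that integer arithmetic can be used.
mf_memory_mib() {
    mf=$1
    dict=$2
    case $mf in
        hc3)
            if [ "$dict" -le 16 ]; then mem=$(( dict * 75 / 10 ))
            else mem=$(( dict * 55 / 10 + 64 )); fi ;;
        hc4)
            if [ "$dict" -le 32 ]; then mem=$(( dict * 75 / 10 ))
            else mem=$(( dict * 65 / 10 )); fi ;;
        bt2)
            mem=$(( dict * 95 / 10 )) ;;
        bt3)
            if [ "$dict" -le 16 ]; then mem=$(( dict * 115 / 10 ))
            else mem=$(( dict * 95 / 10 + 64 )); fi ;;
        bt4)
            if [ "$dict" -le 32 ]; then mem=$(( dict * 115 / 10 ))
            else mem=$(( dict * 105 / 10 )); fi ;;
        *)
            echo "unknown match finder: $mf" >&2
            return 1 ;;
    esac
    printf '%s\n' "$mem"
}
```

.IP ""
For instance,
.B "mf_memory_mib bt4 8"
prints 92, which is in the same range as the compressor memory usage of the
presets that use
.B bt4
with an 8\ MiB dictionary.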
.TP | |
.BI mode= mode | |
Compression | |
.I mode | |
specifies the method to analyze | |
the data produced by the match finder. | |
Supported | |
.I modes | |
are | |
.B fast | |
and | |
.BR normal . | |
The default is | |
.B fast | |
for | |
.I presets | |
0\(en3 and | |
.B normal | |
for | |
.I presets | |
4\(en9. | |
.IP "" | |
Usually | |
.B fast | |
is used with Hash Chain match finders and | |
.B normal | |
with Binary Tree match finders. | |
This is also what the | |
.I presets | |
do. | |
.TP | |
.BI nice= nice | |
Specify what is considered to be a nice length for a match. | |
Once a match of at least | |
.I nice | |
bytes is found, the algorithm stops | |
looking for possibly better matches. | |
.IP "" | |
.I Nice | |
can be 2\(en273 bytes. | |
Higher values tend to give better compression ratio | |
at the expense of speed. | |
The default depends on the | |
.IR preset . | |
.TP | |
.BI depth= depth | |
Specify the maximum search depth in the match finder. | |
The default is the special value of 0, | |
which makes the compressor determine a reasonable | |
.I depth | |
from | |
.I mf | |
and | |
.IR nice . | |
.IP "" | |
Reasonable | |
.I depth | |
for Hash Chains is 4\(en100 and 16\(en1000 for Binary Trees. | |
Using very high values for | |
.I depth | |
can make the encoder extremely slow with some files. | |
Avoid setting the | |
.I depth | |
over 1000 unless you are prepared to interrupt | |
the compression in case it is taking far too long. | |
.RE | |
.IP "" | |
When decoding raw streams | |
.RB ( \-\-format=raw ), | |
LZMA2 needs only the dictionary | |
.IR size . | |
LZMA1 also needs
.IR lc , | |
.IR lp , | |
and | |
.IR pb . | |
.TP | |
\fB\-\-x86\fR[\fB=\fIoptions\fR] | |
.PD 0 | |
.TP | |
\fB\-\-powerpc\fR[\fB=\fIoptions\fR] | |
.TP | |
\fB\-\-ia64\fR[\fB=\fIoptions\fR] | |
.TP | |
\fB\-\-arm\fR[\fB=\fIoptions\fR] | |
.TP | |
\fB\-\-armthumb\fR[\fB=\fIoptions\fR] | |
.TP | |
\fB\-\-sparc\fR[\fB=\fIoptions\fR] | |
.PD | |
Add a branch/call/jump (BCJ) filter to the filter chain. | |
These filters can be used only as a non-last filter | |
in the filter chain. | |
.IP "" | |
A BCJ filter converts relative addresses in | |
the machine code to their absolute counterparts. | |
This doesn't change the size of the data, | |
but it increases redundancy, | |
which can help LZMA2 to produce a 0\(en15\ % smaller
.B .xz
file.
The BCJ filters are always reversible,
so using a BCJ filter for the wrong type of data
doesn't cause any data loss, although it may make | |
the compression ratio slightly worse. | |
.IP "" | |
It is fine to apply a BCJ filter on a whole executable; | |
there's no need to apply it only on the executable section. | |
Applying a BCJ filter on an archive that contains both executable | |
and non-executable files may or may not give good results, | |
so it generally isn't good to blindly apply a BCJ filter when | |
compressing binary packages for distribution. | |
.IP "" | |
These BCJ filters are very fast and | |
use an insignificant amount of memory.
If a BCJ filter improves compression ratio of a file, | |
it can improve decompression speed at the same time. | |
This is because, on the same hardware, | |
the decompression speed of LZMA2 is roughly | |
a fixed number of bytes of compressed data per second. | |
.IP "" | |
These BCJ filters have known problems related to | |
the compression ratio: | |
.RS | |
.IP \(bu 3 | |
Some types of files containing executable code | |
(for example, object files, static libraries, and Linux kernel modules) | |
have the addresses in the instructions filled with filler values. | |
These BCJ filters will still do the address conversion, | |
which will make the compression worse with these files. | |
.IP \(bu 3 | |
Applying a BCJ filter on an archive containing multiple similar | |
executables can make the compression ratio worse than not using | |
a BCJ filter. | |
This is because the BCJ filter doesn't detect the boundaries | |
of the executable files, and doesn't reset | |
the address conversion counter for each executable. | |
.RE | |
.IP "" | |
Both of the above problems will be fixed | |
in the future in a new filter. | |
The old BCJ filters will still be useful in embedded systems, | |
because the decoder of the new filter will be bigger | |
and use more memory. | |
.IP "" | |
Different instruction sets have different alignment: | |
.RS | |
.RS | |
.PP | |
.TS | |
tab(;); | |
l n l | |
l n l. | |
Filter;Alignment;Notes | |
x86;1;32-bit or 64-bit x86 | |
PowerPC;4;Big endian only | |
ARM;4;Little endian only | |
ARM-Thumb;2;Little endian only | |
IA-64;16;Big or little endian | |
SPARC;4;Big or little endian | |
.TE | |
.RE | |
.RE | |
.IP "" | |
Since the BCJ-filtered data is usually compressed with LZMA2, | |
the compression ratio may be improved slightly if | |
the LZMA2 options are set to match the | |
alignment of the selected BCJ filter. | |
For example, with the IA-64 filter, it's good to set | |
.B pb=4 | |
with LZMA2 (2^4=16). | |
The x86 filter is an exception; | |
it's usually good to stick to LZMA2's default | |
four-byte alignment when compressing x86 executables. | |
.IP "" | |
All BCJ filters support the same | |
.IR options : | |
.RS | |
.TP | |
.BI start= offset | |
Specify the start | |
.I offset | |
that is used when converting between relative | |
and absolute addresses. | |
The | |
.I offset | |
must be a multiple of the alignment of the filter | |
(see the table above). | |
The default is zero. | |
In practice, the default is good; specifying a custom | |
.I offset | |
is almost never useful. | |
.RE | |
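.IP ""
As a sketch of the alignment advice above,
the hypothetical helper
.B bcj_lzma2_args
prints a BCJ filter option followed by an LZMA2 option whose
.I pb
is the base-2 logarithm of the alignment listed in the table.
Only
.I pb
is set here; whether adjusting other LZMA2 options helps
is left as an assumption to test on real data.

```shell
# Hypothetical helper: print xz filter options for a BCJ filter name,
# with LZMA2's pb chosen to match the filter's alignment.
bcj_lzma2_args() {
    case $1 in
        x86)               printf '%s\n' "--x86 --lzma2" ;;            # exception: keep the default pb=2
        powerpc|arm|sparc) printf '%s\n' "--$1 --lzma2=pb=2" ;;        # 4-byte alignment, 2^2 = 4
        armthumb)          printf '%s\n' "--armthumb --lzma2=pb=1" ;;  # 2-byte alignment, 2^1 = 2
        ia64)              printf '%s\n' "--ia64 --lzma2=pb=4" ;;      # 16-byte alignment, 2^4 = 16
        *)                 echo "unknown BCJ filter: $1" >&2; return 1 ;;
    esac
}
```

.IP ""
The output is meant to be word-split by the shell, for example
.BR "xz $(bcj_lzma2_args ia64) program.bin" ,
where the file name is a placeholder.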
.TP | |
\fB\-\-delta\fR[\fB=\fIoptions\fR] | |
Add the Delta filter to the filter chain. | |
The Delta filter can be used only as a non-last filter
in the filter chain. | |
.IP "" | |
Currently only simple byte-wise delta calculation is supported. | |
It can be useful when compressing, for example, uncompressed bitmap images | |
or uncompressed PCM audio. | |
However, special purpose algorithms may give significantly better | |
results than Delta + LZMA2. | |
This is true especially with audio, | |
which compresses faster and better, for example, with | |
.BR flac (1). | |
.IP "" | |
Supported | |
.IR options : | |
.RS | |
.TP | |
.BI dist= distance | |
Specify the | |
.I distance | |
of the delta calculation in bytes. | |
.I distance | |
must be 1\(en256. | |
The default is 1. | |
.IP "" | |
For example, with | |
.B dist=2 | |
and eight-byte input A1 B1 A2 B3 A3 B5 A4 B7, the output will be | |
A1 B1 01 02 01 02 01 02. | |
.RE | |
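.IP ""
The byte-wise delta calculation in the example above
can be reproduced with a short shell sketch.
It is an illustration only, not the liblzma code: the function name
.B delta_encode
is hypothetical, and the input bytes are passed as two-digit hexadecimal words.

```shell
# Hypothetical helper: byte-wise delta encoding of hex bytes.
# Usage: delta_encode DISTANCE BYTE...
# Each output byte is the input byte minus the byte DISTANCE positions
# earlier (modulo 256); the first DISTANCE bytes are compared against 00.
delta_encode() {
    dist=$1
    shift
    window=""    # last $dist input bytes, oldest first, each followed by a space
    count=0
    out=""
    for byte in "$@"; do
        if [ "$count" -lt "$dist" ]; then
            prev=00
            count=$(( count + 1 ))
        else
            prev=${window%% *}     # oldest byte in the window
            window=${window#* }    # drop it from the window
        fi
        window="$window$byte "
        diff=$(( (0x$byte - 0x$prev) & 0xFF ))
        out="${out:+$out }$(printf '%02X' "$diff")"
    done
    printf '%s\n' "$out"
}
```

.IP ""
Running
.B "delta_encode 2 A1 B1 A2 B3 A3 B5 A4 B7"
prints A1 B1 01 02 01 02 01 02, matching the example in the text.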
. | |
.SS "Other options" | |
.TP | |
.BR \-q ", " \-\-quiet | |
Suppress warnings and notices. | |
Specify this twice to suppress errors too. | |
This option has no effect on the exit status. | |
That is, even if a warning was suppressed, | |
the exit status to indicate a warning is still used. | |
.TP | |
.BR \-v ", " \-\-verbose | |
Be verbose. | |
If standard error is connected to a terminal, | |
.B xz | |
will display a progress indicator. | |
Specifying | |
.B \-\-verbose | |
twice will give even more verbose output. | |
.IP "" | |
The progress indicator shows the following information: | |
.RS | |
.IP \(bu 3 | |
Completion percentage is shown | |
if the size of the input file is known. | |
That is, the percentage cannot be shown in pipes. | |
.IP \(bu 3 | |
Amount of compressed data produced (compressing) | |
or consumed (decompressing). | |
.IP \(bu 3 | |
Amount of uncompressed data consumed (compressing) | |
or produced (decompressing). | |
.IP \(bu 3 | |
Compression ratio, which is calculated by dividing | |
the amount of compressed data processed so far by | |
the amount of uncompressed data processed so far. | |
.IP \(bu 3 | |
Compression or decompression speed. | |
This is measured as the amount of uncompressed data consumed | |
(compression) or produced (decompression) per second. | |
It is shown after a few seconds have passed since | |
.B xz | |
started processing the file. | |
.IP \(bu 3 | |
Elapsed time in the format M:SS or H:MM:SS. | |
.IP \(bu 3 | |
Estimated remaining time is shown | |
only when the size of the input file is | |
known and a couple of seconds have already passed since | |
.B xz | |
started processing the file. | |
The time is shown in a less precise format which | |
never has any colons, for example, 2 min 30 s. | |
.RE | |
.IP "" | |
When standard error is not a terminal, | |
.B \-\-verbose | |
will make | |
.B xz | |
print the filename, compressed size, uncompressed size, | |
compression ratio, and possibly also the speed and elapsed time | |
on a single line to standard error after compressing or | |
decompressing the file. | |
The speed and elapsed time are included only when | |
the operation took at least a few seconds. | |
If the operation didn't finish, for example, due to user interruption, | |
the completion percentage is also printed
if the size of the input file is known. | |
.TP | |
.BR \-Q ", " \-\-no\-warn | |
Don't set the exit status to 2 | |
even if a condition worth a warning was detected. | |
This option doesn't affect the verbosity level, thus both | |
.B \-\-quiet | |
and | |
.B \-\-no\-warn | |
have to be used to not display warnings and | |
to not alter the exit status. | |
.TP | |
.B \-\-robot | |
Print messages in a machine-parsable format. | |
This is intended to ease writing frontends that want to use | |
.B xz | |
instead of liblzma, which may be the case with various scripts. | |
The output with this option enabled is meant to be stable across | |
.B xz | |
releases. | |
See the section | |
.B "ROBOT MODE" | |
for details. | |
.TP | |
.B \-\-info\-memory | |
Display, in human-readable format, how much physical memory (RAM) | |
.B xz | |
thinks the system has and the memory usage limits for compression | |
and decompression, and exit successfully. | |
.TP | |
.BR \-h ", " \-\-help | |
Display a help message describing the most commonly used options, | |
and exit successfully. | |
.TP | |
.BR \-H ", " \-\-long\-help | |
Display a help message describing all features of | |
.BR xz , | |
and exit successfully.
.TP | |
.BR \-V ", " \-\-version | |
Display the version number of | |
.B xz | |
and liblzma in human-readable format.
To get machine-parsable output, specify | |
.B \-\-robot | |
before | |
.BR \-\-version . | |
. | |
.SH "ROBOT MODE" | |
The robot mode is activated with the | |
.B \-\-robot | |
option. | |
It makes the output of | |
.B xz | |
easier to parse by other programs. | |
Currently | |
.B \-\-robot | |
is supported only together with | |
.BR \-\-version , | |
.BR \-\-info\-memory , | |
and | |
.BR \-\-list . | |
It will be supported for compression and | |
decompression in the future. | |
. | |
.SS Version | |
.B "xz \-\-robot \-\-version" | |
will print the version number of | |
.B xz | |
and liblzma in the following format: | |
.PP | |
.BI XZ_VERSION= XYYYZZZS | |
.br | |
.BI LIBLZMA_VERSION= XYYYZZZS | |
.TP | |
.I X | |
Major version. | |
.TP | |
.I YYY | |
Minor version. | |
Even numbers are stable. | |
Odd numbers are alpha or beta versions. | |
.TP | |
.I ZZZ | |
Patch level for stable releases or | |
just a counter for development releases. | |
.TP | |
.I S | |
Stability. | |
0 is alpha, 1 is beta, and 2 is stable. | |
.I S | |
should always be 2 when
.I YYY | |
is even. | |
.PP | |
.I XYYYZZZS | |
are the same on both lines if | |
.B xz | |
and liblzma are from the same XZ Utils release. | |
.PP | |
Examples: 4.999.9beta is | |
.B 49990091 | |
and | |
5.0.0 is | |
.BR 50000002 . | |
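.PP
A script could split the
.I XYYYZZZS
value into its fields with integer arithmetic, roughly as follows.
This is a sketch; the function name
.B decode_xz_version
and its output format are made up for illustration.

```shell
# Hypothetical helper: decode an XYYYZZZS version number.
# Usage: decode_xz_version 50000002
decode_xz_version() {
    v=$1
    major=$(( v / 10000000 ))        # X
    minor=$(( v / 10000 % 1000 ))    # YYY
    patch=$(( v / 10 % 1000 ))       # ZZZ
    case $(( v % 10 )) in            # S
        0) stability=alpha ;;
        1) stability=beta ;;
        2) stability=stable ;;
        *) stability=unknown ;;
    esac
    printf '%s.%s.%s %s\n' "$major" "$minor" "$patch" "$stability"
}
```

.PP
With the two examples above, the sketch prints
.B "4.999.9 beta"
for 49990091 and
.B "5.0.0 stable"
for 50000002.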
. | |
.SS "Memory limit information" | |
.B "xz \-\-robot \-\-info\-memory" | |
prints a single line with three tab-separated columns: | |
.IP 1. 4 | |
Total amount of physical memory (RAM) in bytes.
.IP 2. 4 | |
Memory usage limit for compression in bytes. | |
A special value of zero indicates the default setting, | |
which for single-threaded mode is the same as no limit. | |
.IP 3. 4 | |
Memory usage limit for decompression in bytes. | |
A special value of zero indicates the default setting, | |
which for single-threaded mode is the same as no limit. | |
.PP | |
In the future, the output of | |
.B "xz \-\-robot \-\-info\-memory" | |
may have more columns, but never more than a single line. | |
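.PP
The single output line can be read in a shell script roughly like this.
The helper name
.B parse_info_memory
and the labels it prints are assumptions made for this sketch;
only the column order comes from the list above.

```shell
# Hypothetical helper: read one line of "xz --robot --info-memory"
# output from standard input and print the three documented columns.
parse_info_memory() {
    tab=$(printf '\t')
    # "extra" absorbs any columns that future versions may append.
    IFS=$tab read -r ram comp_limit decomp_limit extra || return 1
    # Zero means the default setting rather than a real limit.
    if [ "$comp_limit" = 0 ]; then comp_limit=default; fi
    if [ "$decomp_limit" = 0 ]; then decomp_limit=default; fi
    printf 'ram=%s compress_limit=%s decompress_limit=%s\n' \
        "$ram" "$comp_limit" "$decomp_limit"
}
```

.PP
A typical use would be
.BR "xz \-\-robot \-\-info\-memory | parse_info_memory" .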
. | |
.SS "List mode" | |
.B "xz \-\-robot \-\-list" | |
uses tab-separated output. | |
The first column of every line has a string | |
that indicates the type of the information found on that line: | |
.TP | |
.B name | |
This is always the first line when starting to list a file. | |
The second column on the line is the filename. | |
.TP | |
.B file | |
This line contains overall information about the | |
.B .xz | |
file. | |
This line is always printed after the | |
.B name | |
line. | |
.TP | |
.B stream | |
This line type is used only when | |
.B \-\-verbose | |
was specified. | |
There are as many | |
.B stream | |
lines as there are streams in the | |
.B .xz | |
file. | |
.TP | |
.B block | |
This line type is used only when | |
.B \-\-verbose | |
was specified. | |
There are as many | |
.B block | |
lines as there are blocks in the | |
.B .xz | |
file. | |
The | |
.B block | |
lines are shown after all the | |
.B stream | |
lines; different line types are not interleaved. | |
.TP | |
.B summary | |
This line type is used only when | |
.B \-\-verbose | |
was specified twice. | |
This line is printed after all | |
.B block | |
lines. | |
Like the | |
.B file | |
line, the | |
.B summary | |
line contains overall information about the | |
.B .xz | |
file. | |
.TP | |
.B totals | |
This line is always the very last line of the list output. | |
It shows the total counts and sizes. | |
.PP | |
The columns of the | |
.B file | |
lines: | |
.PD 0 | |
.RS | |
.IP 2. 4 | |
Number of streams in the file | |
.IP 3. 4 | |
Total number of blocks in the stream(s) | |
.IP 4. 4 | |
Compressed size of the file | |
.IP 5. 4 | |
Uncompressed size of the file | |
.IP 6. 4 | |
Compression ratio, for example, | |
.BR 0.123 . | |
If ratio is over 9.999, three dashes | |
.RB ( \-\-\- ) | |
are displayed instead of the ratio. | |
.IP 7. 4 | |
Comma-separated list of integrity check names. | |
The following strings are used for the known check types: | |
.BR None , | |
.BR CRC32 , | |
.BR CRC64 , | |
and | |
.BR SHA\-256 . | |
For unknown check types, | |
.BI Unknown\- N | |
is used, where | |
.I N | |
is the Check ID as a decimal number (one or two digits). | |
.IP 8. 4 | |
Total size of stream padding in the file | |
.RE | |
.PD | |
.PP | |
The columns of the | |
.B stream | |
lines: | |
.PD 0 | |
.RS | |
.IP 2. 4 | |
Stream number (the first stream is 1) | |
.IP 3. 4 | |
Number of blocks in the stream | |
.IP 4. 4 | |
Compressed start offset | |
.IP 5. 4 | |
Uncompressed start offset | |
.IP 6. 4 | |
Compressed size (does not include stream padding) | |
.IP 7. 4 | |
Uncompressed size | |
.IP 8. 4 | |
Compression ratio | |
.IP 9. 4 | |
Name of the integrity check | |
.IP 10. 4 | |
Size of stream padding | |
.RE | |
.PD | |
.PP | |
The columns of the | |
.B block | |
lines: | |
.PD 0 | |
.RS | |
.IP 2. 4 | |
Number of the stream containing this block | |
.IP 3. 4 | |
Block number relative to the beginning of the stream | |
(the first block is 1) | |
.IP 4. 4 | |
Block number relative to the beginning of the file | |
.IP 5. 4 | |
Compressed start offset relative to the beginning of the file | |
.IP 6. 4 | |
Uncompressed start offset relative to the beginning of the file | |
.IP 7. 4 | |
Total compressed size of the block (includes headers) | |
.IP 8. 4 | |
Uncompressed size | |
.IP 9. 4 | |
Compression ratio | |
.IP 10. 4 | |
Name of the integrity check | |
.RE | |
.PD | |
.PP | |
If | |
.B \-\-verbose | |
was specified twice, additional columns are included on the | |
.B block | |
lines. | |
These are not displayed with a single | |
.BR \-\-verbose , | |
because getting this information requires many seeks | |
and can thus be slow: | |
.PD 0 | |
.RS | |
.IP 11. 4 | |
Value of the integrity check in hexadecimal | |
.IP 12. 4 | |
Block header size | |
.IP 13. 4 | |
Block flags: | |
.B c | |
indicates that compressed size is present, and | |
.B u | |
indicates that uncompressed size is present. | |
If the flag is not set, a dash | |
.RB ( \- ) | |
is shown instead to keep the string length fixed. | |
New flags may be added to the end of the string in the future. | |
.IP 14. 4 | |
Size of the actual compressed data in the block (this excludes | |
the block header, block padding, and check fields) | |
.IP 15. 4 | |
Amount of memory (in bytes) required to decompress | |
this block with this | |
.B xz | |
version | |
.IP 16. 4 | |
Filter chain. | |
Note that most of the options used at compression time | |
cannot be known, because only the options | |
that are needed for decompression are stored in the | |
.B .xz | |
headers. | |
.RE | |
.PD | |
.PP | |
The columns of the | |
.B summary | |
lines: | |
.PD 0 | |
.RS | |
.IP 2. 4 | |
Amount of memory (in bytes) required to decompress | |
this file with this | |
.B xz | |
version | |
.IP 3. 4 | |
.B yes | |
or | |
.B no | |
indicating if all block headers have both compressed size and | |
uncompressed size stored in them | |
.PP | |
.I Since | |
.B xz | |
.I 5.1.2alpha: | |
.IP 4. 4 | |
Minimum | |
.B xz | |
version required to decompress the file | |
.RE | |
.PD | |
.PP | |
The columns of the | |
.B totals | |
line: | |
.PD 0 | |
.RS | |
.IP 2. 4 | |
Number of streams | |
.IP 3. 4 | |
Number of blocks | |
.IP 4. 4 | |
Compressed size | |
.IP 5. 4 | |
Uncompressed size | |
.IP 6. 4 | |
Average compression ratio | |
.IP 7. 4 | |
Comma-separated list of integrity check names | |
that were present in the files | |
.IP 8. 4 | |
Stream padding size | |
.IP 9. 4 | |
Number of files. | |
This is here to | |
keep the order of the earlier columns the same as on | |
.B file | |
lines. | |
.PD | |
.RE | |
.PP | |
If | |
.B \-\-verbose | |
was specified twice, additional columns are included on the | |
.B totals | |
line: | |
.PD 0 | |
.RS | |
.IP 10. 4 | |
Maximum amount of memory (in bytes) required to decompress | |
the files with this | |
.B xz | |
version | |
.IP 11. 4 | |
.B yes | |
or | |
.B no | |
indicating if all block headers have both compressed size and | |
uncompressed size stored in them | |
.PP | |
.I Since | |
.B xz | |
.I 5.1.2alpha: | |
.IP 12. 4 | |
Minimum | |
.B xz | |
version required to decompress the file | |
.RE | |
.PD | |
.PP | |
Future versions may add new line types and
new columns to the existing line types,
but the existing columns won't be changed.
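.PP
As a sketch of how a frontend might consume this output,
the helper below pairs each
.B name
line with the following
.B file
line and prints a short summary.
The helper name
.B summarize_robot_list
and the summary format are hypothetical;
the column numbers are the ones documented above.

```shell
# Hypothetical helper: summarize "xz --robot --list" output read from
# standard input. Unknown line types and extra columns are ignored,
# since future versions may add them.
summarize_robot_list() {
    awk -F '\t' '
        $1 == "name" { name = $2 }
        $1 == "file" {
            # Column 5: uncompressed size, 4: compressed size,
            # 6: compression ratio, 7: integrity check names.
            printf "%s: %s -> %s bytes, ratio %s, check %s\n",
                name, $5, $4, $6, $7
        }
    '
}
```

.PP
It would be used as
.BR "xz \-\-robot \-\-list *.xz | summarize_robot_list" .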
. | |
.SH "EXIT STATUS" | |
.TP | |
.B 0 | |
All is good. | |
.TP | |
.B 1 | |
An error occurred. | |
.TP | |
.B 2 | |
Something worth a warning occurred, | |
but no actual errors occurred. | |
.PP | |
Notices (not warnings or errors) printed on standard error | |
don't affect the exit status. | |
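.PP
A script can map the exit status to these three cases as sketched below.
The function name
.B describe_xz_status
and the words it prints are illustrative only.

```shell
# Hypothetical helper: translate an xz exit status into a short word.
# Usage: xz -- "$file"; describe_xz_status $?
describe_xz_status() {
    case $1 in
        0) echo "ok" ;;
        1) echo "error" ;;
        2) echo "warning" ;;       # no actual errors occurred
        *) echo "unexpected" ;;    # not a status documented above
    esac
}
```

.PP
A caller that wants to tolerate warnings can treat both
.B ok
and
.B warning
as success.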
. | |
.SH ENVIRONMENT | |
.B xz | |
parses space-separated lists of options | |
from the environment variables | |
.B XZ_DEFAULTS | |
and | |
.BR XZ_OPT , | |
in this order, before parsing the options from the command line. | |
Note that only options are parsed from the environment variables; | |
all non-options are silently ignored. | |
Parsing is done with | |
.BR getopt_long (3) | |
which is also used for the command line arguments.
.TP | |
.B XZ_DEFAULTS | |
User-specific or system-wide default options. | |
Typically this is set in a shell initialization script to enable | |
.BR xz 's | |
memory usage limiter by default. | |
Excluding shell initialization scripts | |
and similar special cases, scripts must never set or unset | |
.BR XZ_DEFAULTS . | |
.TP | |
.B XZ_OPT | |
This is for passing options to | |
.B xz | |
when it is not possible to set the options directly on the | |
.B xz | |
command line. | |
This is the case when | |
.B xz | |
is run by a script or tool, for example, GNU | |
.BR tar (1): | |
.RS | |
.RS | |
.PP | |
.nf | |
.ft CW | |
XZ_OPT=\-2v tar caf foo.tar.xz foo | |
.ft R | |
.fi | |
.RE | |
.RE | |
.IP "" | |
Scripts may use | |
.BR XZ_OPT , | |
for example, to set script-specific default compression options. | |
It is still recommended to allow users to override | |
.B XZ_OPT | |
if that is reasonable. | |
For example, in | |
.BR sh (1) | |
scripts one may use something like this: | |
.RS | |
.RS | |
.PP | |
.nf | |
.ft CW | |
XZ_OPT=${XZ_OPT\-"\-7e"} | |
export XZ_OPT | |
.ft R | |
.fi | |
.RE | |
.RE | |
. | |
.SH "LZMA UTILS COMPATIBILITY" | |
The command line syntax of | |
.B xz | |
is practically a superset of | |
.BR lzma , | |
.BR unlzma , | |
and | |
.B lzcat | |
as found in LZMA Utils 4.32.x.
In most cases, it is possible to replace | |
LZMA Utils with XZ Utils without breaking existing scripts. | |
There are some incompatibilities though, | |
which may sometimes cause problems. | |
. | |
.SS "Compression preset levels" | |
The numbering of the compression level presets is not identical in | |
.B xz | |
and LZMA Utils. | |
The most important difference is how dictionary sizes | |
are mapped to different presets. | |
Dictionary size is roughly equal to the decompressor memory usage. | |
.RS | |
.PP | |
.TS | |
tab(;); | |
c c c | |
c n n. | |
Level;xz;LZMA Utils | |
\-0;256 KiB;N/A | |
\-1;1 MiB;64 KiB | |
\-2;2 MiB;1 MiB | |
\-3;4 MiB;512 KiB | |
\-4;4 MiB;1 MiB | |
\-5;8 MiB;2 MiB | |
\-6;8 MiB;4 MiB | |
\-7;16 MiB;8 MiB | |
\-8;32 MiB;16 MiB | |
\-9;64 MiB;32 MiB | |
.TE | |
.RE | |
.PP | |
The dictionary size differences affect | |
the compressor memory usage too, | |
but there are some other differences between | |
LZMA Utils and XZ Utils, which | |
make the difference even bigger: | |
.RS | |
.PP | |
.TS | |
tab(;); | |
c c c | |
c n n. | |
Level;xz;LZMA Utils 4.32.x | |
\-0;3 MiB;N/A | |
\-1;9 MiB;2 MiB | |
\-2;17 MiB;12 MiB | |
\-3;32 MiB;12 MiB | |
\-4;48 MiB;16 MiB | |
\-5;94 MiB;26 MiB | |
\-6;94 MiB;45 MiB | |
\-7;186 MiB;83 MiB | |
\-8;370 MiB;159 MiB | |
\-9;674 MiB;311 MiB | |
.TE | |
.RE | |
.PP | |
The default preset level in LZMA Utils is | |
.B \-7 | |
while in XZ Utils it is | |
.BR \-6 , | |
so both use an 8 MiB dictionary by default. | |
. | |
.SS "Streamed vs. non-streamed .lzma files" | |
The uncompressed size of the file can be stored in the | |
.B .lzma | |
header. | |
LZMA Utils does that when compressing regular files. | |
The alternative is to mark the uncompressed size as unknown
and use an end-of-payload marker to indicate
where the decompressor should stop.
LZMA Utils uses this method when uncompressed size isn't known, | |
which is the case, for example, in pipes. | |
.PP | |
.B xz | |
supports decompressing | |
.B .lzma | |
files with or without an end-of-payload marker, but all
.B .lzma | |
files created by | |
.B xz | |
will use an end-of-payload marker and have the uncompressed size
marked as unknown in the
.B .lzma | |
header. | |
This may be a problem in some uncommon situations. | |
For example, a | |
.B .lzma | |
decompressor in an embedded device might work | |
only with files that have a known uncompressed size.
If you hit this problem, you need to use LZMA Utils | |
or LZMA SDK to create | |
.B .lzma | |
files with a known uncompressed size.
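.PP
Which variant a given file uses can be checked from its header.
The following sketch assumes the usual 13-byte
.B .lzma
header layout: one properties byte, a four-byte dictionary size,
and an eight-byte uncompressed size in which all bits set means unknown.
The function name
.B lzma_size_is_unknown
is made up for this example:
.RS
.PP
.nf
.ft CW
```shell
# Hypothetical helper: succeed if the .lzma header of file $1
# marks the uncompressed size as unknown (all eight bytes 0xFF).
lzma_size_is_unknown() {
    # Skip the properties byte and the dictionary size (5 bytes),
    # then dump the 8-byte size field as hex without whitespace.
    size_field=$(od -An -tx1 -j5 -N8 "$1" | tr -d '[:space:]')
    [ "$size_field" = "ffffffffffffffff" ]
}

# Usage: lzma_size_is_unknown foo.lzma && echo "size not stored"
```
.ft R
.fi
.RE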
. | |
.SS "Unsupported .lzma files" | |
The | |
.B .lzma | |
format allows | |
.I lc | |
values up to 8, and | |
.I lp | |
values up to 4. | |
LZMA Utils can decompress files with any | |
.I lc | |
and | |
.IR lp , | |
but always creates files with | |
.B lc=3 | |
and | |
.BR lp=0 . | |
Creating files with other | |
.I lc | |
and | |
.I lp | |
is possible with | |
.B xz | |
and with LZMA SDK. | |
.PP | |
The implementation of the LZMA1 filter in liblzma | |
requires that the sum of | |
.I lc | |
and | |
.I lp | |
must not exceed 4. | |
Thus, | |
.B .lzma | |
files that exceed this limitation cannot be decompressed with
.BR xz . | |
.PP | |
LZMA Utils creates only | |
.B .lzma | |
files which have a dictionary size of | |
.RI "2^" n | |
(a power of 2) but accepts files with any dictionary size. | |
liblzma accepts only | |
.B .lzma | |
files which have a dictionary size of | |
.RI "2^" n | |
or | |
.RI "2^" n " + 2^(" n "\-1)." | |
This is to decrease false positives when detecting | |
.B .lzma | |
files. | |
.PP | |
These limitations shouldn't be a problem in practice, | |
since practically all | |
.B .lzma | |
files have been compressed with settings that liblzma will accept. | |
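.PP
The two limits above can be expressed as simple arithmetic.
This is a simplified sketch, not the check that liblzma performs internally.
It assumes the properties byte is encoded as
.RI "(" pb " * 5 + " lp ") * 9 + " lc ,
and both function names are hypothetical:
.RS
.PP
.nf
.ft CW
```shell
# Hypothetical helper: decode a .lzma properties byte (decimal) and
# succeed only if lc + lp stays within the liblzma limit of 4.
lzma_props_supported() {
    props=$1
    lc=$((props % 9))
    lp=$((props / 9 % 5))
    pb=$((props / 45))
    echo "lc=$lc lp=$lp pb=$pb"
    [ $((lc + lp)) -le 4 ]
}

# Hypothetical helper: succeed if a dictionary size in bytes is
# 2^n or 2^n + 2^(n-1).
lzma_dict_supported() {
    d=$1
    [ "$d" -gt 0 ] || return 1
    # Remove all factors of two; 1 means 2^n, 3 means 2^n + 2^(n-1).
    while [ $((d % 2)) -eq 0 ]; do
        d=$((d / 2))
    done
    [ "$d" -eq 1 ] || [ "$d" -eq 3 ]
}
```
.ft R
.fi
.RE
.PP
For example, the properties byte 93 decodes to
.BR lc=3 ,
.BR lp=0 ,
.BR pb=2 ,
which passes the first check, and a 12 MiB dictionary (12582912 bytes)
passes the second.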
. | |
.SS "Trailing garbage" | |
When decompressing, | |
LZMA Utils silently ignores everything after the first
.B .lzma | |
stream. | |
In most situations, this is a bug. | |
This also means that LZMA Utils
doesn't support decompressing concatenated
.B .lzma | |
files. | |
.PP | |
If there is data left after the first | |
.B .lzma | |
stream, | |
.B xz | |
considers the file to be corrupt unless | |
.B \-\-single\-stream | |
was used. | |
This may break obscure scripts which have | |
assumed that trailing garbage is ignored. | |
. | |
.SH NOTES | |
. | |
.SS "Compressed output may vary" | |
The exact compressed output produced from | |
the same uncompressed input file | |
may vary between XZ Utils versions even if | |
compression options are identical. | |
This is because the encoder can be improved | |
(faster or better compression) | |
without affecting the file format. | |
The output can vary even between different | |
builds of the same XZ Utils version, | |
if different build options are used. | |
.PP | |
The above means that once | |
.B \-\-rsyncable | |
has been implemented, | |
the resulting files won't necessarily be rsyncable | |
unless both old and new files have been compressed | |
with the same xz version. | |
This problem can be fixed if a part of the encoder | |
implementation is frozen to keep rsyncable output | |
stable across xz versions. | |
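.PP
One practical consequence is that a script comparing two
.B .xz
files byte by byte may report a difference even though
the uncompressed data is identical.
A minimal sketch of comparing the uncompressed data instead follows;
the function name
.B xz_same_payload
is hypothetical, and
.BR cksum (1)
is used only as a convenient stand-in for any checksum tool:
.RS
.PP
.nf
.ft CW
```shell
# Hypothetical helper: succeed if two .xz files decompress to the
# same data, regardless of how the compressed bytes differ.
xz_same_payload() {
    # Refuse to compare if either file fails the integrity test.
    xz -t -- "$1" "$2" || return 2
    [ "$(xz -dc -- "$1" | cksum)" = "$(xz -dc -- "$2" | cksum)" ]
}

# Usage: xz_same_payload old/foo.xz new/foo.xz && echo "same data"
```
.ft R
.fi
.RE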
. | |
.SS "Embedded .xz decompressors" | |
Embedded | |
.B .xz | |
decompressor implementations like XZ Embedded don't necessarily | |
support files created with integrity | |
.I check | |
types other than | |
.B none | |
and | |
.BR crc32 . | |
Since the default is | |
.BR \-\-check=crc64 , | |
you must use | |
.B \-\-check=none | |
or | |
.B \-\-check=crc32 | |
when creating files for embedded systems. | |
.PP | |
Outside embedded systems, all | |
.B .xz | |
format decompressors support all the | |
.I check | |
types, or at least are able to decompress | |
the file without verifying the | |
integrity check if the particular | |
.I check | |
is not supported. | |
.PP | |
XZ Embedded supports BCJ filters, | |
but only with the default start offset. | |
. | |
.SH EXAMPLES | |
. | |
.SS Basics | |
Compress the file | |
.I foo | |
into | |
.I foo.xz | |
using the default compression level | |
.RB ( \-6 ), | |
and remove | |
.I foo | |
if compression is successful: | |
.RS | |
.PP | |
.nf | |
.ft CW | |
xz foo | |
.ft R | |
.fi | |
.RE | |
.PP | |
Decompress | |
.I bar.xz | |
into | |
.I bar | |
and don't remove | |
.I bar.xz | |
even if decompression is successful: | |
.RS | |
.PP | |
.nf | |
.ft CW | |
xz \-dk bar.xz | |
.ft R | |
.fi | |
.RE | |
.PP | |
Create | |
.I baz.tar.xz | |
with the preset | |
.B \-4e | |
.RB ( "\-4 \-\-extreme" ), | |
which is slower than the default | |
.BR \-6 , | |
but needs less memory for compression and decompression (48\ MiB | |
and 5\ MiB, respectively): | |
.RS | |
.PP | |
.nf | |
.ft CW | |
tar cf \- baz | xz \-4e > baz.tar.xz | |
.ft R | |
.fi | |
.RE | |
.PP | |
A mix of compressed and uncompressed files can be decompressed | |
to standard output with a single command: | |
.RS | |
.PP | |
.nf | |
.ft CW | |
xz \-dcf a.txt b.txt.xz c.txt d.txt.lzma > abcd.txt | |
.ft R | |
.fi | |
.RE | |
. | |
.SS "Parallel compression of many files" | |
On GNU and *BSD, | |
.BR find (1) | |
and | |
.BR xargs (1) | |
can be used to parallelize compression of many files: | |
.RS | |
.PP | |
.nf | |
.ft CW | |
find . \-type f \e! \-name '*.xz' \-print0 \e
    | xargs \-0r \-P4 \-n16 xz \-T1
.ft R | |
.fi | |
.RE | |
.PP | |
The | |
.B \-P | |
option to | |
.BR xargs (1) | |
sets the number of parallel | |
.B xz | |
processes. | |
The best value for the | |
.B \-n | |
option depends on how many files there are to be compressed. | |
If there are only a couple of files, | |
the value should probably be 1; | |
with tens of thousands of files, | |
100 or even more may be appropriate to reduce the number of | |
.B xz | |
processes that | |
.BR xargs (1) | |
will eventually create. | |
.PP | |
The option | |
.B \-T1 | |
for | |
.B xz | |
is there to force it to single-threaded mode, because | |
.BR xargs (1) | |
is used to control the amount of parallelization. | |
. | |
.SS "Robot mode" | |
Calculate how many bytes have been saved in total | |
after compressing multiple files: | |
.RS | |
.PP | |
.nf | |
.ft CW | |
xz \-\-robot \-\-list *.xz | awk '/^totals/{print $5\-$4}' | |
.ft R | |
.fi | |
.RE | |
.PP | |
A script may want to know that it is using a new enough
.BR xz . | |
The following | |
.BR sh (1) | |
script checks that the version number of the | |
.B xz | |
tool is at least 5.0.0. | |
This method is compatible with old beta versions, | |
which didn't support the | |
.B \-\-robot | |
option: | |
.RS | |
.PP | |
.nf | |
.ft CW | |
if ! eval "$(xz \-\-robot \-\-version 2> /dev/null)" ||
        [ "$XZ_VERSION" \-lt 50000002 ]; then
    echo "Your xz is too old."
fi
unset XZ_VERSION LIBLZMA_VERSION
.ft R | |
.fi | |
.RE | |
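.PP
The constant 50000002 in the script above is easier to read once decoded.
The sketch below assumes that
.B XZ_VERSION
uses the layout XYYYZZZS
(major, minor, patch, and a stability digit where 0 is alpha,
1 is beta, and 2 is stable);
the function name
.B xz_version_string
is made up for this example:
.RS
.PP
.nf
.ft CW
```shell
# Hypothetical helper: turn an XZ_VERSION integer into readable form.
xz_version_string() {
    v=$1
    major=$((v / 10000000))
    minor=$((v / 10000 % 1000))
    patch=$((v / 10 % 1000))
    case $((v % 10)) in
        0) stability=alpha ;;
        1) stability=beta ;;
        2) stability=stable ;;
        *) stability=unknown ;;
    esac
    echo "$major.$minor.$patch $stability"
}

xz_version_string 50000002   # prints "5.0.0 stable"
```
.ft R
.fi
.RE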
.PP | |
Set a memory usage limit for decompression using | |
.BR XZ_OPT , | |
but if a limit has already been set, don't increase it: | |
.RS | |
.PP | |
.nf | |
.ft CW | |
NEWLIM=$((123 << 20))\ \ # 123 MiB
OLDLIM=$(xz \-\-robot \-\-info\-memory | cut \-f3)
if [ "$OLDLIM" \-eq 0 ] || [ "$OLDLIM" \-gt "$NEWLIM" ]; then
    XZ_OPT="$XZ_OPT \-\-memlimit\-decompress=$NEWLIM"
    export XZ_OPT
fi
.ft R | |
.fi | |
.RE | |
. | |
.SS "Custom compressor filter chains" | |
The simplest use for custom filter chains is | |
customizing an LZMA2 preset.
This can be useful, | |
because the presets cover only a subset of the | |
potentially useful combinations of compression settings. | |
.PP | |
The CompCPU columns of the tables | |
from the descriptions of the options | |
.BR "\-0" " ... " "\-9" | |
and | |
.B \-\-extreme | |
are useful when customizing LZMA2 presets. | |
Here are the relevant parts collected from those two tables: | |
.RS | |
.PP | |
.TS | |
tab(;); | |
c c | |
n n. | |
Preset;CompCPU | |
\-0;0 | |
\-1;1 | |
\-2;2 | |
\-3;3 | |
\-4;4 | |
\-5;5 | |
\-6;6 | |
\-5e;7 | |
\-6e;8 | |
.TE | |
.RE | |
.PP | |
If you know that a file requires | |
a somewhat big dictionary (for example, 32\ MiB) to compress well,
but you want to compress it quicker than | |
.B "xz \-8" | |
would do, a preset with a low CompCPU value (for example, 1) | |
can be modified to use a bigger dictionary: | |
.RS | |
.PP | |
.nf | |
.ft CW | |
xz \-\-lzma2=preset=1,dict=32MiB foo.tar | |
.ft R | |
.fi | |
.RE | |
.PP | |
With certain files, the above command may be faster than | |
.B "xz \-6" | |
while compressing significantly better. | |
However, it must be emphasized that only some files benefit from | |
a big dictionary while keeping the CompCPU value low. | |
The most obvious situation, | |
where a big dictionary can help a lot, | |
is an archive containing very similar files | |
of at least a few megabytes each. | |
The dictionary size has to be significantly bigger | |
than any individual file to allow LZMA2 to take | |
full advantage of the similarities between consecutive files. | |
.PP | |
If very high compressor and decompressor memory usage is fine, | |
and the file being compressed is | |
at least several hundred megabytes, it may be useful | |
to use an even bigger dictionary than the 64 MiB that | |
.B "xz \-9" | |
would use: | |
.RS | |
.PP | |
.nf | |
.ft CW | |
xz \-vv \-\-lzma2=dict=192MiB big_foo.tar | |
.ft R | |
.fi | |
.RE | |
.PP | |
Using | |
.B \-vv | |
.RB ( "\-\-verbose \-\-verbose" ) | |
like in the above example can be useful | |
to see the memory requirements | |
of the compressor and decompressor. | |
Remember that using a dictionary bigger than | |
the size of the uncompressed file is a waste of memory,
so the above command isn't useful for small files. | |
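.PP
A script can avoid an oversized dictionary by
capping it at the size of the input.
The following is only a sketch:
the function name
.B pick_dict_bytes
is hypothetical, and it assumes that
.B xz
accepts the dictionary size as a plain byte count
with a minimum of 4\ KiB:
.RS
.PP
.nf
.ft CW
```shell
# Hypothetical helper: print a dict= value in bytes that is no bigger
# than the input file, but never below the assumed 4 KiB minimum.
# $1 = uncompressed size in bytes, $2 = largest dictionary wanted.
pick_dict_bytes() {
    size=$1
    dict=$2
    if [ "$size" -lt "$dict" ]; then
        dict=$size
    fi
    if [ "$dict" -lt 4096 ]; then
        dict=4096
    fi
    echo "$dict"
}

# Usage (192 MiB cap, as in the example above):
#   size=$(($(wc -c < big_foo.tar)))
#   xz --lzma2=dict=$(pick_dict_bytes "$size" $((192 << 20))) big_foo.tar
```
.ft R
.fi
.RE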
.PP | |
Sometimes the compression time doesn't matter, | |
but the decompressor memory usage has to be kept low, for example, | |
to make it possible to decompress the file on an embedded system. | |
The following command uses | |
.B \-6e | |
.RB ( "\-6 \-\-extreme" ) | |
as a base and sets the dictionary to only 64\ KiB. | |
The resulting file can be decompressed with XZ Embedded | |
(that's why there is | |
.BR \-\-check=crc32 ) | |
using about 100\ KiB of memory. | |
.RS | |
.PP | |
.nf | |
.ft CW | |
xz \-\-check=crc32 \-\-lzma2=preset=6e,dict=64KiB foo | |
.ft R | |
.fi | |
.RE | |
.PP | |
If you want to squeeze out as many bytes as possible, | |
adjusting the number of literal context bits | |
.RI ( lc ) | |
and number of position bits | |
.RI ( pb ) | |
can sometimes help. | |
Adjusting the number of literal position bits | |
.RI ( lp ) | |
might help too, but usually | |
.I lc | |
and | |
.I pb | |
are more important. | |
For example, a source code archive contains mostly US-ASCII text, | |
so something like the following might give
a slightly (like 0.1\ %) smaller file than
.B "xz \-6e" | |
(try also without | |
.BR lc=4 ): | |
.RS | |
.PP | |
.nf | |
.ft CW | |
xz \-\-lzma2=preset=6e,pb=0,lc=4 source_code.tar | |
.ft R | |
.fi | |
.RE | |
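.PP
Since the gain depends on the data,
it is simplest to measure a few variants on the actual file.
A minimal sketch follows; the function name
.B xz_try_lzma2
and the particular list of option sets are only illustrative:
.RS
.PP
.nf
.ft CW
```shell
# Hypothetical helper: print the compressed size in bytes of file $1
# for a few LZMA2 option sets, without writing any .xz files.
xz_try_lzma2() {
    for opts in preset=6e preset=6e,pb=0 preset=6e,pb=0,lc=4; do
        bytes=$(xz -c --lzma2="$opts" < "$1" | wc -c)
        # The arithmetic expansion strips padding that some wc
        # implementations add around the number.
        echo "$opts $((bytes))"
    done
}

# Usage: xz_try_lzma2 source_code.tar
```
.ft R
.fi
.RE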
.PP | |
Using another filter together with LZMA2 can improve | |
compression with certain file types. | |
For example, to compress an x86-32 or x86-64 shared library
using the x86 BCJ filter: | |
.RS | |
.PP | |
.nf | |
.ft CW | |
xz \-\-x86 \-\-lzma2 libfoo.so | |
.ft R | |
.fi | |
.RE | |
.PP | |
Note that the order of the filter options is significant. | |
If | |
.B \-\-x86 | |
is specified after | |
.BR \-\-lzma2 , | |
.B xz | |
will give an error, | |
because there cannot be any filter after LZMA2, | |
and also because the x86 BCJ filter cannot be used | |
as the last filter in the chain. | |
.PP | |
The Delta filter together with LZMA2 | |
can give good results with bitmap images. | |
It should usually beat PNG, | |
which has a few more advanced filters than simple | |
delta but uses Deflate for the actual compression. | |
.PP | |
The image has to be saved in uncompressed format, | |
for example, as uncompressed TIFF. | |
The distance parameter of the Delta filter is set | |
to match the number of bytes per pixel in the image. | |
For example, a 24-bit RGB bitmap needs
.BR dist=3 , | |
and it is also good to pass | |
.B pb=0 | |
to LZMA2 to accommodate the three-byte alignment: | |
.RS | |
.PP | |
.nf | |
.ft CW | |
xz \-\-delta=dist=3 \-\-lzma2=pb=0 foo.tiff | |
.ft R | |
.fi | |
.RE | |
.PP | |
If multiple images have been put into a single archive (for example, | |
.BR .tar ), | |
the Delta filter will work on that too as long as all images | |
have the same number of bytes per pixel. | |
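.PP
The distance can be derived from the pixel format.
The sketch below uses a hypothetical function name,
.BR delta_filter_opts ,
and adds
.B pb=0
only for the three-byte case discussed above;
other pixel sizes are left at the LZMA2 defaults,
which is an assumption rather than a tested recommendation:
.RS
.PP
.nf
.ft CW
```shell
# Hypothetical helper: print xz filter options for an uncompressed
# image with $1 bits per pixel (must be a positive multiple of 8).
delta_filter_opts() {
    bpp=$1
    if [ "$bpp" -le 0 ] || [ $((bpp % 8)) -ne 0 ]; then
        echo "unsupported bits per pixel: $bpp" >&2
        return 1
    fi
    dist=$((bpp / 8))
    if [ "$dist" -eq 3 ]; then
        echo "--delta=dist=$dist --lzma2=pb=0"
    else
        echo "--delta=dist=$dist --lzma2"
    fi
}

# Usage: xz $(delta_filter_opts 24) foo.tiff
```
.ft R
.fi
.RE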
. | |
.SH "SEE ALSO" | |
.BR xzdec (1), | |
.BR xzdiff (1), | |
.BR xzgrep (1), | |
.BR xzless (1), | |
.BR xzmore (1), | |
.BR gzip (1), | |
.BR bzip2 (1), | |
.BR 7z (1) | |
.PP | |
XZ Utils: <https://tukaani.org/xz/> | |
.br | |
XZ Embedded: <https://tukaani.org/xz/embedded.html> | |
.br | |
LZMA SDK: <http://7-zip.org/sdk.html> | |