Cromfs: Compressed ROM filesystem for Linux (user-space)
0. Contents
This is the documentation of cromfs-1.1.2.3.
1. Purpose
Cromfs is a compressed read-only filesystem for Linux. Cromfs is intended
for permanently archiving gigabytes of big files that have lots of redundancy.
In terms of compression it is very similar to
7-zip archives, except that fast random
access is provided for the whole archive contents; the user does not need
to launch a program to decompress a single file, nor wait
while the system decompresses 500 files from a 1000-file archive to get
the one file they wanted to open.
Note: The primary design goal of cromfs is compression power.
It is much slower than its peers, and uses more RAM.
If all you care about is "powerful compression" and "normal
(random) read-only file access", then you will be happy with cromfs.
The creation of cromfs was inspired by Squashfs and Cramfs.
2. News
3. Overview
- Data, inodes, directories and block lists are stored compressed
- Duplicate inodes, files and even duplicate file portions are detected and stored only once
- Especially suitable for gigabyte-class archives of
thousands of nearly-identical megabyte-class files.
- Files are stored in solid blocks, meaning that parts of different
files are compressed together for effective compression
- Most inode types recognized by Linux are supported (see comparisons).
- LZMA compression is used.
In the general case, LZMA compresses better than gzip and bzip2.
- As with usual filesystems, the files on a cromfs volume can be accessed
in arbitrary order; the waits to open a specific file are small, despite
the files being semisolidly archived.
- Works on 64-bit and 32-bit systems.
See the documentation of the cromfs format for technical details
(also included in the source package as doc/FORMAT).
4. Limitations
- The filesystem is write-once, read-only. It is not possible to append
to a previously created filesystem, nor to mount it read-write.
- Max filesize: 2^64 bytes (16777216 TB), but 256 TB with default settings.
- Max number of files in a directory: 2^30 (smaller if filenames are longer, but still more than 100000 in almost all cases)
- Max number of inodes (all files, dirs etc combined): 2^60, but depends on file sizes
- Max filesystem size: 2^64 bytes (16777216 TB)
- There are no "." and ".." entries in directories.
- mkcromfs is slow. You must be patient.
- The cromfs-driver has a large memory footprint. It is not
suitable for very size-constrained systems.
- Maximum filename length: 4095 bytes
- Being a user-space filesystem, it might not be suitable for
root filesystems of rescue, tiny-Linux and installation disks.
(Facts needed.)
- For device inodes, a hardlink count of 1 is assumed.
(This has no effect on compression efficiency.)
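The size figures above can be checked with shell integer arithmetic. This is only a sketch: it assumes that TB means 2^40 bytes, the function name is made up for illustration, and the 2^48 exponent for the default-settings limit is inferred from the 256 TB figure rather than taken from the cromfs source.

```shell
# Convert a power-of-two byte count, given as its exponent, to TB (2^40 bytes).
# Working with exponents avoids overflowing 64-bit shell arithmetic at 2^64.
pow2_bytes_to_tb() {
    echo $(( 1 << ($1 - 40) ))
}

pow2_bytes_to_tb 64   # maximum file and filesystem size: prints 16777216
pow2_bytes_to_tb 48   # file size limit with default settings: prints 256
```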
Development status: Pre-beta. The Cromfs project was created
very recently and has not yet been tested extensively. There is no
warranty against data loss or anything else, so use at your own risk.
5. Comparing to other filesystems
This is all very biased, hypothetical, and by no means
a scientific study, but here goes:
Feature | Cromfs | Cramfs | Squashfs (3.0) |
Compression unit | adjustable (4 MB default) | 4 kB | adjustable (64 kB max) |
Files are compressed | together (up to block limit) | individually | individually |
Maximum file size | 16 EB (2^44 MB) | 16 MB (2^4 MB) | 16 EB (2^44 MB) (4 GB before v3.0) |
Duplicate whole file detection | Yes | No (but hardlinks are detected) | Yes |
Hardlinks detected and saved | Yes | Unknown | Yes, since v3.0 |
Near-identical file detection | Yes (identical blocks) | No | No |
Compression method | LZMA | gzip | gzip |
Ownerships | uid,gid (since version 1.1.2) | uid,gid (but gid truncated to 8 bits) | uid,gid |
Timestamps | mtime only | None | mtime only |
Endianness safety | Works on little-endian only | Safe, but not exchangeable | Safe |
Kernelspace/userspace | User (fuse) | Kernel | Kernel |
Appending to a previously created filesystem | No | No | Yes |
Supported inode types | reg,dir,chrdev,blkdev,fifo,link,sock | reg,dir,chrdev,blkdev,fifo,link,sock | reg,dir,chrdev,blkdev,fifo,link,sock |
Note: cromfs now saves the uid and gid in the filesystem. However,
when the uid is 0 (root), the cromfs-driver returns the uid of the
user who mounted the filesystem, instead of root. Similarly for gid.
This is both for backward compatibility and for security.
If you mount as root, this behavior has no effect.
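The mapping rule in this note can be written out as a tiny shell function. It is only an illustration of the behavior as described here, not the driver's actual code; the function name is hypothetical, and the pass-through of non-root ids is my reading of the note.

```shell
# Illustrative only: mirrors the rule described in the note above.
# $1 = uid (or gid) stored in the filesystem
# $2 = uid (or gid) of the user who mounted the filesystem
reported_id() {
    if [ "$1" -eq 0 ]; then
        echo "$2"    # stored root is reported as the mounting user
    else
        echo "$1"    # other ids are assumed to be reported as stored
    fi
}

reported_id 0 1000     # prints 1000
reported_id 500 1000   # prints 500
reported_id 0 0        # prints 0: mounting as root changes nothing
```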
5.1. Compression tests
Note: I use the -e and -r options in all of these mkcromfs tests
to avoid unnecessary decompression+recompression steps, in order
to speed up the filesystem generation. This has no effect on the
compression ratio.
Item | 10783 NES ROMs (2523 MB) | Mozilla source code from CVS (279 MB) | Damn small Linux liveCD (113 MB) (size taken from "du -c" output in the uncompressed filesystem) |
cromfs | mkcromfs -s16384 -a16 -f16777216; with 2k blocks (-b2048): 202,811,971 bytes | mkcromfs -b65536 -f2097152: 29,525,376 bytes | mkcromfs -f1048576; with 64k blocks (-b65536): 39,778,030 bytes; with 16k blocks (-b16384): 39,718,882 bytes; with 1k blocks (-b1024): 40,141,729 bytes |
cramfs | mkcramfs -b65536: dies prematurely, "filesystem too big" | mkcramfs; with 2M blocks (-b2097152): 58,720,256 bytes; with 64k blocks (-b65536): 57,344,000 bytes; with 4k blocks (-b4096): 68,435,968 bytes | mkcramfs -b65536: 51,445,760 bytes |
squashfs | mksquashfs -b65536 (using an optimized sort file): 1,185,546,240 bytes | mksquashfs -b65536: 43,335,680 bytes | mksquashfs -b65536: 50,028,544 bytes |
cloop | untested | create_compressed_fs image.iso (using an iso9660 image created with mkisofs -RJ); using 7zip, 1M blocks (-B1048576 -L-1): 41,201,014 bytes (1 MB is maximum block size in cloop) | create_compressed_fs image.iso (using an iso9660 image); using 7zip, 1M blocks (-B1048576 -L-1): 48,328,580 bytes; using zlib, 64k blocks (-B65536 -L9): 50,641,093 bytes |
6. Getting started
- Install the development requirements: make, gcc-c++ and fuse
- Remember that for fuse to work, the kernel must also contain fuse support.
Run "modprobe fuse", then check that "/dev/fuse" exists and that it works.
- If an attempt to read from "/dev/fuse" (as root) gives "no such device",
it does not work. If it gives "operation not permitted", it might work.
- Build "cromfs-driver" and "util/mkcromfs" by running "make":
$ make
- Create a sample filesystem:
$ util/mkcromfs . sample.cromfs
- Mount the sample filesystem:
$ mkdir sample
$ ./cromfs-driver sample.cromfs sample &
- Observe the sample filesystem:
$ cd sample
$ ls -al
- Unmounting the filesystem:
$ cd ..
$ fusermount -u sample
or, type "fg" and press ctrl-c to terminate the driver.
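The /dev/fuse check from the steps above could be scripted roughly as follows. This is a sketch, not part of cromfs: the function name and its optional path argument are made up for illustration, and it assumes that dd reports the usual C-locale error texts.

```shell
# Report whether a FUSE device node looks usable, following the rule of
# thumb given in the steps above.
fuse_status() {
    dev=${1:-/dev/fuse}
    if [ ! -e "$dev" ]; then
        echo "missing"
        return
    fi
    # Try to read one byte and keep only the error output.
    # LC_ALL=C keeps the error text in English.
    msg=$(LC_ALL=C dd if="$dev" bs=1 count=1 2>&1 >/dev/null)
    case $msg in
        *"No such device"*)          echo "not working" ;;
        *"Operation not permitted"*) echo "might work" ;;
        *)                           echo "unknown" ;;
    esac
}

fuse_status
```

"missing" means the device node does not exist at all, which usually means the fuse module is not loaded.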
7. Tips
To improve the compression, try these tips:
- Adjust the block size (--bsize) in mkcromfs. If your files
have a lot of identical content, aligned at a certain boundary,
use that boundary as the block size value. If you are uncertain,
use a small value (500-5000) rather than a bigger value (20000-400000).
Values that are too small will however make inodes large, so keep it sane.
Note: The value does not need to be a power of two.
- Adjust the fblock size (--fsize) in mkcromfs. Larger values
almost always give better compression.
Note: The value does not need to be a power of two.
- Adjust the --autoindexratio option (-a). A bigger value will
increase the chances of mkcromfs finding an identical block
from something it already processed (if your data has that
opportunity). Finding that two blocks are identical always
means better compression.
- Sort your files. Files which have similar or partially
identical content should be processed right after one another.
- Adjust the --bruteforcelimit option (-c). Larger values will require
mkcromfs to check more fblocks for each block it encodes (making the
encoding much slower), in the hope that it improves compression.
The fewer (and larger) your fblocks are,
the better the chances that this helps.
Note: If you use --bruteforcelimit, you should also adjust
your --minfreespace setting as instructed in mkcromfs --help.
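To get a feeling for which --bsize value suits your data, you can count how many fixed-size blocks of a file are duplicates at a given block size. The helper below is a rough sketch and not how mkcromfs itself finds identical blocks: it only sees duplicates aligned to the block size within one file, compares CRC checksums from cksum (so collisions are possible), and the function name is hypothetical.

```shell
# Count total and distinct fixed-size blocks in a file.
# Usage: count_blocks FILE BLOCKSIZE   -> prints "TOTAL DISTINCT"
count_blocks() {
    tmp=$(mktemp -d) || return 1
    # Cut the file into BLOCKSIZE-byte pieces named blk.aaaaa, blk.aaaab, ...
    split -a 5 -b "$2" "$1" "$tmp/blk."
    # One "checksum size" line per block.
    for f in "$tmp"/blk.*; do
        if [ -f "$f" ]; then
            cksum < "$f"
        fi
    done > "$tmp/sums"
    total=$(( $(wc -l < "$tmp/sums") ))
    distinct=$(( $(sort -u "$tmp/sums" | wc -l) ))
    rm -rf "$tmp"
    echo "$total $distinct"
}

# Demo on a 16-byte sample whose first and third 4-byte blocks are identical.
sample=$(mktemp)
printf 'AAAABBBBAAAACCCC' > "$sample"
count_blocks "$sample" 4    # prints "4 3"
count_blocks "$sample" 8    # prints "2 2"
rm -f "$sample"
```

The bigger the gap between the two numbers, the more that block size exposes identical content. It is slow on large files because it runs one cksum per block.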
To improve the filesystem generation speed, try these tips:
- Use the --decompresslookups option (-e), if you have the
disk space to spare.
- Use the TEMP environment variable to control where the temp
files are written. Example: TEMP=~/cromfs-temp ./mkcromfs ...
- Use a larger block size (--bsize). Smaller blocks mean more blocks,
which means more work. Larger blocks are less work.
- Do not use the --bruteforcelimit option (-c). The default value 0
means that only one fblock will be assumed as a candidate.
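Put together, a speed-oriented invocation might look like the line printed by the sketch below. The function only prints the command instead of running it; its name, the temp directory and the 65536 block size are illustrative choices, not recommendations from the cromfs author. Paths containing spaces would need extra quoting.

```shell
# Print (not run) a speed-oriented mkcromfs command line.
# $1 = source directory, $2 = output image, $3 = directory for temp files
fast_mkcromfs_cmd() {
    # -e trades disk space for speed, -b65536 uses fairly large blocks,
    # and -c is left at its default.
    echo "TEMP=$3 util/mkcromfs -e -b65536 $1 $2"
}

fast_mkcromfs_cmd . sample.cromfs /var/tmp/cromfs-temp
```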
To control the memory usage, use these tips:
- Adjust the fblock size (--fsize). The memory used by cromfs-driver
is directly proportional to the size of your fblocks. It keeps at
most 10 fblocks decompressed in RAM at a time. If your fblocks
are 4 MB in size, it will use at most 40 MB.
- In mkcromfs, adjust the --autoindexratio option (-a). This has
no effect on the memory usage of cromfs-driver, but it does
control the memory usage of mkcromfs. If you have lots of RAM, you
should use a bigger --autoindexratio (because it will improve the chances
of getting better compression results), and a smaller one if you have less RAM.
- Find the CACHE_MAX_SIZE settings in cromfs.cc and edit them. This will
require recompiling the source. (In the future, this should be made a command
line option for cromfs-driver.)
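The fblock cache figure above is simple to work out for any --fsize value. A small sketch follows; the function name is hypothetical, and the factor of 10 is the cache limit stated above, which changes if you edit the CACHE_MAX_SIZE settings.

```shell
# Upper bound on the decompressed fblock cache of cromfs-driver:
# at most 10 fblocks of --fsize bytes each.
max_cache_bytes() {
    echo $(( 10 * $1 ))
}

max_cache_bytes 4194304   # 4 MB fblocks: prints 41943040 (40 MB)
```

This covers only the fblock cache, not the total memory used by the driver.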
To control the filesystem speed, use these tips:
- The speed of the underlying storage affects the filesystem speed.
- The bigger your fblocks (--fsize), the bigger the latencies are.
cromfs-driver caches the decompressed fblocks, but opening a non-cached
fblock requires decompressing it entirely, which will block the user
process for that period of time.
- The smaller your blocks (--bsize), the bigger the latencies are, because
there will be more steps to process for handling the same amount of data.
- Use the most powerful compiler and compiler settings available
for building cromfs-driver. This helps the decompression and cache lookups.
- Use fast hardware…
8. Copying
cromfs was written by Joel Yliluoma, a.k.a. Bisqwit,
and is distributed under the terms of the
General Public License (GPL).
The LZMA code embedded within is licensed under LGPL.
Patches and other related material can be submitted
by e-mail at:
Joel Yliluoma <bisqwit@iki.fi>
9. Requirements
- GNU make and gcc-c++ are required to recompile the source code.
- The filesystem works under the Fuse
user-space filesystem framework. You need to install both the Fuse kernel
module and the userspace programs before mounting Cromfs volumes.
You need fuse version 2.6.0 or newer. (2.5.2 might work.)
10. Downloading
Generated from
progdesc.php (last updated: Tue, 16 May 2006 23:58:08 +0300)
with docmaker.php (last updated: Sun, 12 Jun 2005 06:08:02 +0300)
at Wed, 17 May 2006 00:25:30 +0300