Sparse file

A sparse file (sparse in the sense of "thinly populated", "scanty", or "scattered") is a file that can be stored very compactly in a file system because it consists partly of regions with no defined content. In a sparse file, regions in which data has been explicitly written alternate with regions that are called holes and occupy no space on the storage medium.


Basics

A sparse file is a space-saving form of storage for files that contain many consecutive bytes of undefined content. This form of storage originates from inode-based file systems and is typical today of all modern Unix-like operating systems. In general, the file system specifies that a read access returns these undefined regions as sequences of zero bytes.

With a sparse file, only the parts that have actually been written are stored on the backing storage. For example, a file with a nominal length of 100 GiB may effectively occupy only a single logical block in the file system if data has been written at only one position in the file.
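The effect described above can be reproduced with a few lines of Python. This is a minimal sketch, not a reference implementation: the function name create_sparse_file is hypothetical, and whether a hole is really left unallocated depends on the operating system and file system in use.

```python
import os

def create_sparse_file(path, nominal_size):
    """Create a file of nominal_size bytes (>= 1) in which only the last byte is written."""
    with open(path, "wb") as f:
        f.seek(nominal_size - 1)  # move the write position; nothing is written yet
        f.write(b"\0")            # one explicitly written byte at the very end
    return os.stat(path)
```

The returned stat result carries the nominal length in st_size. On POSIX systems, st_blocks * 512 gives the space that is actually allocated, which on a file system with sparse file support should be far smaller than st_size.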

This form of storage can be very useful for some kinds of binary databases, and also for mapping partitions onto a file.

Not all operating systems and file systems support sparse files that end in a hole.

Problems in use

Sparse files can cause problems when they are copied. One problem case arises when the file system of the target partition cannot create sparse files and there is also not enough free space to hold the complete file, including the zero bytes that then have to be stored explicitly. Such a problem can occur, for example, when restoring from backups.

A similar problem occurs when a copy or backup program cannot recognize this property of the file at all; sparse files are not created automatically, but require a specific access technique.
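One such access technique is to skip over blocks of zero bytes while copying instead of writing them. The following Python sketch illustrates the idea; the function name copy_sparse and the block size of 64 KiB are assumptions chosen for illustration, and the approach turns every all-zero block into a hole, whether or not it was a hole in the source.

```python
import os

def copy_sparse(src_path, dst_path, block_size=64 * 1024):
    """Copy src to dst, seeking over all-zero blocks instead of writing them."""
    zero_block = bytes(block_size)
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(block_size)
            if not chunk:
                break
            if chunk == zero_block[:len(chunk)]:
                # Skipping leaves a hole on file systems that support sparse files
                dst.seek(len(chunk), os.SEEK_CUR)
            else:
                dst.write(chunk)
        # Set the final length explicitly in case the source ends in a hole
        dst.truncate(src.tell())
```

The copy has the same logical content as the source; how much space it occupies depends on the target file system.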

Another problem is inherent fragmentation: sparse files are, so to speak, fragmented by design and therefore do not naturally achieve optimal disk access. Reading a sparse file linearly can thus be quite time-consuming, which can matter a great deal for databases.

NTFS sparse files

Unlike Unix-based file systems, the Windows file system NTFS has had, since version 3, a special file attribute that causes the I/O subsystem of the Windows file system not to allocate any space on the volume for contiguous regions of a file that consist only of zero values.

NTFS sparse files can hold both normal and compressed data. On Windows Server 2003 and Windows XP, an NTFS file that has once been declared sparse can no longer be converted back to a normal file. In later versions of Windows, this is possible only if no holes remain.

The problems described for Unix-based file systems exist in principle in the same way on NTFS, although the file attribute ensures that at least programs written in accordance with the general programming guidelines can copy sparse files transparently without losing the sparse property.

Treatment of sparse files under Unix-like operating systems

Creating sparse files

Sparse files can be generated using the Unix dd command:

    dd if=/dev/zero of=sparsefile bs=1 count=1 seek=9999999

This example command creates a sparse file of 10 megabytes: it uses seek to move the write position to offset 9999999 and then writes a single byte.

With some dd implementations, sparse files that end in a "hole" can only be created indirectly. First, a file must be created that ends in written data, as in the example above. The last data portion of the file can then be removed using the system call truncate() or ftruncate(). This applies to Solaris, for example. On Linux, count=0 must be set to prevent data from being written after the "hole". If count=0 is set, Linux dd performs no write operation and only calls ftruncate(), which creates a sparse file that contains no byte other than zero bytes.
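The same effect, a file that consists of nothing but a hole, can be sketched in Python, whose file method truncate() resizes the file without writing data. The function name create_hole_only_file is hypothetical, and whether the extended range is stored as a hole is up to the file system.

```python
import os

def create_hole_only_file(path, size):
    """Create a file of `size` bytes without writing any data to it."""
    with open(path, "wb") as f:
        # Extends the empty file to `size` bytes; reads return zero bytes
        f.truncate(size)
    return os.path.getsize(path)
```

A call such as create_hole_only_file("sparsefile", 10_000_000) yields a file whose logical length is 10,000,000 bytes.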

With GNU dd, an identical file can be created using the following shorter call:

    dd of=sparsefile bs=1 count=0 seek=10000000

Detection of sparse files

For sparse files, the logical and the physical file size differ. The logical file size includes the zero bytes, whereas the physical file size refers to the space that the file actually occupies in the file system.
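The two sizes can also be compared programmatically. The following Python sketch assumes a POSIX system, where os.stat() reports st_blocks in units of 512 bytes (the field is not available on Windows). The function names are hypothetical, and the comparison is only a heuristic, since features such as file system compression can also make the physical size smaller than the logical one.

```python
import os

def file_sizes(path):
    """Return (logical_size, physical_size) in bytes."""
    st = os.stat(path)
    # st_blocks counts allocated 512-byte units on POSIX systems
    return st.st_size, st.st_blocks * 512

def looks_sparse(path):
    """Heuristic: fewer bytes are allocated than the logical length suggests."""
    logical, physical = file_sizes(path)
    return physical < logical
```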

The -s option of the Unix ls command additionally displays the physical file size, in blocks. With -k, block counts are reported in units of 1024 bytes; with -h, both sizes are displayed in human-readable format:

    ls -lhs sparse-file
    ls -lhk sparse-file

Alternatively, the Unix du command displays the physical file size, by default likewise in blocks. The option --block-size=1 displays the physical size in bytes, whereas --bytes reports the logical size in bytes:

    du --block-size=1 sparse-file
    du --bytes sparse-file

Example of use

In the following, a 10 MB sparse file is created. Compared with a normal file of 3 MB, only a simple call to du reveals that it is a sparse file, which occupies only 10 blocks on the hard disk.

    > dd if=/dev/zero of=sparsefile bs=1 count=0 seek=10M
    0+0 records in
    0+0 records out
    0 bytes (0 B) copied, 2.9615e-05 s, 0.0 kB/s
    > dd if=/dev/urandom of=normalfile bs=1M count=3
    3+0 records in
    3+0 records out
    3145728 bytes (3.1 MB) copied, 1.71034 s, 1.8 MB/s
    > ls -lh
    total 3.1M
    -rw-r--r-- 1 sven users 3.0M May 18 03:08 normalfile
    -rw-r--r-- 1 sven users  10M May 18 03:06 sparsefile
    > du *
    3075    normalfile
    10      sparsefile

Treatment of sparse files under Microsoft Windows

Creating sparse files

A file can be marked as a sparse file with the Windows command fsutil:

    fsutil sparse setflag

As a result, unwritten parts of the file are not allocated on the disk in future write operations. To release regions that are already allocated in a file marked as sparse, the following command can be used:

    fsutil sparse setrange

This deallocates the specified range. Note that only complete blocks can be released, that is, blocks whose length is a multiple of 64 KiB and whose start position lies at a multiple of 64 KiB.
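The alignment rule can be illustrated with a small calculation. The following Python sketch determines which part of an arbitrary byte range consists of complete 64 KiB blocks and could therefore be released; the function name releasable_range is hypothetical, and the code only does the arithmetic without calling any Windows interface.

```python
GRANULARITY = 64 * 1024  # 64 KiB, the block size stated above

def releasable_range(offset, length, granularity=GRANULARITY):
    """Return (start, length) of the aligned part of [offset, offset + length), or None."""
    start = -(-offset // granularity) * granularity         # round the start up
    end = ((offset + length) // granularity) * granularity  # round the end down
    if end <= start:
        return None  # the range contains no complete block
    return start, end - start
```

For a range that starts at byte 100 and is 200,000 bytes long, only the two complete blocks from offset 65536 to 196608 qualify.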

To perform these operations programmatically, the Windows API function DeviceIoControl can be used with the control codes FSCTL_SET_SPARSE and FSCTL_SET_ZERO_DATA. The latter code also works for files that are not sparse files; in that case, however, the ranges are not released but filled with zero bytes.

Detection of sparse files

Whether a file is a sparse file can also be determined using the fsutil command:

    fsutil sparse queryflag

To list the regions that are actually allocated, the command is invoked as follows:

    fsutil sparse queryrange

Creating sparse files with MSSQL

As of version 2005, MSSQL can create sparse files in the form of database snapshots. The following SQL statements create a sparse file with a size of 2 gigabytes under the name C:\Uncompressed\Dummy_Snap.mdf:

    CREATE DATABASE [dummy]
    ON PRIMARY (NAME = N'Dummy', FILENAME = N'C:\Uncompressed\Dummy.mdf', SIZE = 2097152KB)
    LOG ON (NAME = N'Dummy_log', FILENAME = N'C:\Uncompressed\Dummy_log.ldf')
    GO
    CREATE DATABASE [Dummy_Snap]
    ON PRIMARY (NAME = N'Dummy', FILENAME = N'C:\Uncompressed\Dummy_Snap.mdf')
    AS SNAPSHOT OF [dummy]

See also

  • Sparse matrix
  • "Sparse disk image" (sparse image) on Mac OS X