Filename

A file name identifies a file on a disk or in a data transmission. Usually, a file is additionally characterized by a directory name, which together with the file name forms a full path name. As a rule, only this combination, the full path name, is unique.

Properties

Depending on the operating system, a file name can be made up of several parts. The individual parts are separated by specific characters which, as a rule, cannot themselves be part of the file name; see the list of file name extensions for an overview.

Some operating systems make the handling of a file dependent on its file name extension; others work without this convention and recognize the file type from the content (for example, by means of a so-called magic number). Even on these systems, however, files are often given such extensions, since this simplifies data exchange.
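Content-based detection can be illustrated with a minimal sketch. The signature table and the function name guess_type are assumptions chosen for illustration; only a handful of well-known signatures are listed, and real tools know far more formats.

```python
# Illustrative signature table (hypothetical selection); a real detection
# tool recognizes many more formats and looks beyond the first bytes.
MAGIC_NUMBERS = {
    b"\x89PNG\r\n\x1a\n": "PNG image",
    b"%PDF-": "PDF document",
    b"\x1f\x8b": "gzip archive",
    b"PK\x03\x04": "ZIP archive",
}


def guess_type(data: bytes) -> str:
    """Return a type label based on the leading bytes, ignoring any file name."""
    for signature, label in MAGIC_NUMBERS.items():
        if data.startswith(signature):
            return label
    return "unknown"
```

Because the function looks only at the content, a PDF file saved under a misleading name is still reported as a PDF document.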

The maximum length of a file name is limited by the operating system and by the file system of the disk. On a CD-ROM using the Joliet file system, for example, up to 64 characters can be used. An indirect limit can also arise from a maximum length of the path name in the operating system.

One difference between MS Windows and Linux/Unix is that Windows does not distinguish between upper and lower case in file names (it is case-insensitive), while Unix does (there, for example, Haustuer.txt and hausTuer.txt are different files).

File Systems

Operating Systems

Unix

Unix and Unix-like operating systems such as Solaris or Linux treat the file name as a whole. A file can have several names and be located in multiple directories ("hard links" or "bind mounts"). All characters except the slash "/" and the null character are allowed. Early versions had file names of 1 to 14 characters. The BSD variants introduced names of up to 255 characters.

A relative file path consists of one or more segments and begins with a segment. Each segment is subject to the rules for file names, so it can be up to 14 or 255 characters long, depending on the system. The segments of a file path are separated by the character "/". The last segment identifies the actual file. The preceding segments are either directory names or symbolic links to directory names. A relative path starts from the current working directory, which each process can set individually. An absolute file path begins with "/" and is independent of the current working directory; it starts from the root directory. All files on a system are accessible via the root directory.

Access is case-sensitive.

Examples:

/home/user/Documents/letter.txt
/usr/bin/texteditor

The file name '.' (dot) refers to the current working directory. The name '..' refers to the parent directory.
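The distinction between relative and absolute paths, and the meaning of '.' and '..', can be tried out with Python's posixpath module, which applies Unix path rules on any platform. This is only a sketch; the base directory /home/user is an assumed working directory for illustration.

```python
import posixpath
from pathlib import PurePosixPath

absolute = "/home/user/Documents/letter.txt"
relative = "Documents/letter.txt"

# An absolute path begins with "/"; a relative path begins with a segment.
is_abs = posixpath.isabs(absolute)
is_rel_abs = posixpath.isabs(relative)

# Segments are separated by "/"; the last one names the actual file.
segments = PurePosixPath(relative).parts

# A relative path is interpreted from a working directory (assumed here).
resolved = posixpath.join("/home/user", relative)

# "." and ".." are collapsed lexically; symbolic links are not consulted.
normalized = posixpath.normpath("/home/user/./Documents/../Documents/letter.txt")
```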

The space character, the newline character, and the wildcards '*' and '?' may also be part of a path name. Such characters sometimes cause problems later on, however, for example because poorly written scripts cannot cope with them. Problems can also arise with file names containing characters that are not present in the current character set of a program (for example, Japanese characters on a system set up for American use). The non-displayable characters are then often shown as a question mark or a small box, which makes access to the data very difficult. Such files can often be processed only after they have been renamed at a lower file system abstraction layer (for example, by specifying the so-called inode instead of the file name, with ls -i and find . -inum [...] -exec mv {} [...] \;).
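The idea of addressing a file by its inode rather than by its name can also be sketched in Python. The function name rename_by_inode is hypothetical, and the sketch assumes that the directory is readable and that the inode number is already known (for example from ls -i).

```python
import os


def rename_by_inode(directory, inode, new_name):
    """Rename the directory entry with the given inode number.

    Returns the new path, or None if no entry in the directory matches.
    """
    with os.scandir(directory) as entries:
        for entry in entries:
            # lstat reports the inode of the entry itself, not of a link target.
            if os.lstat(entry.path).st_ino == inode:
                target = os.path.join(directory, new_name)
                os.rename(entry.path, target)
                return target
    return None
```

The awkward file name never has to be typed or displayed correctly; it is only passed along internally.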

A Unix system does not use specific extensions such as .EXE or .CMD. However, it has become customary, as in other operating systems, to give files of a specific type a dot and a corresponding extension in order to improve clarity. For example, the extension .c is used for C source programs. Executable files, that is, programs and scripts, usually get no extension. File types can also be determined independently of any extension with the simple program "file".

Files or directories whose names begin with a dot are usually treated as "hidden" and are shown only when the user explicitly requests this (for example, with ls -a).

The same is true for directory paths.

CP/M, DOS, Windows up to version 3.11

Under CP/M and the various PC-compatible DOS versions, including MS-Windows up to version 3.11, file names consist of the actual "name" of at most eight characters and, optionally, a dot and an "extension" of at most three characters, which also indicates the type of the file (see 8.3). Extensions are often assigned by programs or reserved for programs, such as the extension .TXT for text files. The operating systems themselves use specific extensions such as .BAT for script files, .SYS for driver files, or .EXE and .COM for executable files.

The following characters are not allowed in file names and extensions, since they serve syntactic functions in the systems mentioned:

< > ? " : | \ / *

The space is also not allowed. In addition, some words are reserved and may not be used as file names, since they are used as device names:

CON, PRN, AUX, NUL
COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8, COM9
LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, LPT9

Thus, under classic DOS, the following file names, for example, cannot be used even though they may be permitted under other operating systems: aux.c, q"uote"s.txt, NUL.txt.
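The rules listed above can be combined into a small checker. This is a sketch under stated assumptions, not a faithful reimplementation of any DOS version: the function name is_valid_8_3 is hypothetical, and only the rules named in the text (length limits, forbidden characters, reserved device names) are checked.

```python
# Characters with syntactic functions, plus the space.
FORBIDDEN = set('<>?":|\\/* ')

# Reserved device names, compared without regard to case.
RESERVED = (
    {"CON", "PRN", "AUX", "NUL"}
    | {f"COM{i}" for i in range(1, 10)}
    | {f"LPT{i}" for i in range(1, 10)}
)


def is_valid_8_3(filename: str) -> bool:
    """Check a single file name (no drive, no directories) against the 8.3 rules."""
    name, _, extension = filename.partition(".")
    if not 1 <= len(name) <= 8:
        return False
    if len(extension) > 3 or "." in extension:
        return False
    if any(ch in FORBIDDEN for ch in filename):
        return False
    # A device name is reserved regardless of any extension (aux.c, NUL.txt).
    return name.upper() not in RESERVED
```

With this sketch, LETTER.TXT is accepted, while aux.c, q"uote"s.txt and NUL.txt are rejected, matching the examples in the text.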

Under the operating systems mentioned, directory names are handled like ordinary file names. They usually have no extension, but can be given one. In contrast to the names of other files, the extension then generally has no function. Every file and directory resides on a drive, which is identified by a letter and a colon. A full name consists of the drive, optionally one or more directory names, and the actual file name. These components are separated from one another by the separator character '\'.

A:\MSDOS.SYS
C:\DOCUMENT\LETTER.TXT

Since only eight characters are available, names are often heavily abbreviated. The names '.' and '..' are reserved, as on Unix, for the current directory and the parent directory.

Access is not case-sensitive.

Windows from version 95

On Windows (Windows 95, 98, ME, NT, 2000, XP, Vista, 7, 8), a file name consists of the name, a period, and an extension that specifies the file type. A file name may contain several periods; the last period is then used to separate the name and the extension.
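The "last period" rule can be illustrated with Python's ntpath module, which applies Windows path conventions on any platform. The file name used here is made up for the example.

```python
import ntpath

# Only the last period separates the name from the extension.
stem, extension = ntpath.splitext("report.final.v2.docx")

# The same split written out by hand with rpartition.
manual_stem, _, manual_extension = "report.final.v2.docx".rpartition(".")
```

Note that splitext keeps the period as part of the extension, whereas the manual split drops it.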

Length of the file name and path

Normally, the path length in Windows is limited to 260 characters, i.e. three characters for the drive specification, 256 characters for the path within the drive, and an invisible string termination character. Longer paths of up to 32,767 characters, as supported by NTFS, are possible using UNC (Uniform Naming Convention) notation, i.e. the path must be prefixed with \\?\.
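The arithmetic behind the 260-character limit, and the effect of the prefix, can be written out as a short sketch. The constant and function names are hypothetical, and the function only manipulates strings; it does not call any Windows API.

```python
DRIVE_CHARS = 3        # drive letter, colon, backslash
PATH_CHARS = 256       # path within the drive
TERMINATOR_CHARS = 1   # invisible string termination character
MAX_PATH = DRIVE_CHARS + PATH_CHARS + TERMINATOR_CHARS  # 3 + 256 + 1 = 260

LONG_PREFIX = "\\\\?\\"  # the four characters: backslash, backslash, ?, backslash


def to_long_path(path: str) -> str:
    """Add the long-path prefix if the path would exceed the usual limit."""
    if path.startswith(LONG_PREFIX):
        return path
    if len(path) + TERMINATOR_CHARS > MAX_PATH:
        return LONG_PREFIX + path
    return path
```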

For compatibility with old MS-DOS programs, a file name can also be given in 8.3 notation, provided this has not been disabled in Windows. In this case, the file name is represented unambiguously with eight characters for the name, a period, and up to three characters for the extension; these short names are generated anew in each directory. If files have lost their long file names and thus carry only this short name, conflicts can arise with existing files whose long names are shortened to the same short name, even if the files previously coexisted without problems in a different directory. (→ 8.3)

Problematic and illegal characters or names

As under DOS and Windows up to version 3.11, the following characters are not allowed in file names and extensions:

< > ? " : | \ / *

Also prohibited, as before, are the following file names, which are reserved as device names:

CON, PRN, AUX, NUL
COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8, COM9
LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, LPT9

Thus, under the newer versions of Windows as well, the following file names, for example, cannot be used even though they may be permitted under other operating systems: aux.c, q"uote"s.txt, NUL.txt.

File names that contain the '&' character are also problematic. The character is actually allowed, but the DOS environment in Windows uses it as a separator for single-line command chains, so that everything following an ampersand is interpreted as a further DOS command line. As a result, the Windows "Command Prompt" in this case reports an error stating that it could not find or run a command whose name is the remainder of the entered file name after the '&' character, quite apart from the fact that the file in question could of course not be opened or edited.

File names that end with a space are problematic as well. They cannot be created in Windows; if they are created under other operating systems, they cannot be accessed under Windows, because Windows simply truncates trailing spaces. Authors of malicious code have already exploited this, since anti-virus programs can access such files only by special measures.

Otherwise, all characters defined in the Unicode standard can be used. In practice, however, older applications often have trouble with characters that are not included in the Windows-1252 character set.
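Whether a given name stays within the Windows-1252 character set can be checked with Python's cp1252 codec. The function name fits_windows_1252 is hypothetical; the check says nothing about how a particular older application actually behaves.

```python
def fits_windows_1252(filename: str) -> bool:
    """Return True if every character of the name exists in Windows-1252."""
    try:
        filename.encode("cp1252")
    except UnicodeEncodeError:
        return False
    return True
```

A name such as haustür.txt passes the check, while a name containing Japanese characters does not.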

VMS

Under VMS (Virtual Memory System), a file name consists of the name, a dot, an extension, a semicolon, and a version number. The version number is automatically incremented by one each time a new version of the same file (with the same extension) is created. In this way, multiple versions of the same file can be kept simultaneously (the number is adjustable, up to a maximum of 32,767). The following specifications apply to ODS-2 (On-Disk Structure):

File names can be up to 39 characters long, and only certain characters (letters, digits, underscore, dollar sign) are allowed. No distinction is made between uppercase and lowercase. The extension may also be up to 39 bytes long; it is separated by a dot and is not part of the file name. Except for directories, where the extension is always ".DIR", the extension has no bearing on how the file can be used (although there are conventions that are commonly observed for some file types).

The total path length (i.e., disk, directory tree, file name, extension, and version) must not exceed 255 bytes.
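The structure name, dot, extension, semicolon, version number can be captured in a regular expression. This is a sketch based only on the ODS-2 rules stated above; the names VmsFileSpec and parse_ods2 are hypothetical, disk and directory parts are not handled, and normalizing to uppercase is an illustrative choice reflecting that case is not distinguished.

```python
import re
from typing import NamedTuple, Optional


class VmsFileSpec(NamedTuple):
    name: str
    extension: str
    version: int


# Letters, digits, underscore and dollar sign; name 1 to 39 characters,
# extension up to 39 characters, then a semicolon and a version number.
_SPEC = re.compile(r"([A-Za-z0-9_$]{1,39})\.([A-Za-z0-9_$]{0,39});([0-9]+)")


def parse_ods2(spec: str) -> Optional[VmsFileSpec]:
    """Split 'NAME.EXT;VERSION' into its parts, or return None if malformed."""
    match = _SPEC.fullmatch(spec)
    if match is None:
        return None
    name, extension, version = match.groups()
    # Case is not distinguished, so both parts are normalized to uppercase.
    return VmsFileSpec(name.upper(), extension.upper(), int(version))
```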

Internet

World Wide Web

The transfer of files on the World Wide Web is governed by the HTTP standard. If a file name contains characters other than ASCII letters and digits, they are encoded in the URL in percent notation: a percent sign followed by a two-character hexadecimal code, for example "haust%FCr.html" instead of "haustür.html". To determine the code value, the character encoding of the file name (e.g. UTF-8 or ISO 8859-1) must be known.
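The dependence of the percent-encoded form on the character encoding can be shown with Python's urllib.parse module, using the file name haustür.html from the text. The variable names are chosen for illustration only.

```python
from urllib.parse import quote, unquote

name = "haustür.html"

# In ISO 8859-1 the letter ü is the single byte 0xFC.
latin1_form = quote(name, encoding="iso-8859-1")

# In UTF-8 the same letter is the two bytes 0xC3 0xBC.
utf8_form = quote(name, encoding="utf-8")

# Decoding succeeds only if the encoding used for quoting is known.
decoded = unquote(latin1_form, encoding="iso-8859-1")
```

The same file name thus yields two different URLs, which is why the receiver has to know the encoding in order to recover the original name.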

File Download

The FTP standard still requires only ASCII characters to be supported. Often, however, file downloads are also carried out using HTTP.

E-mail

The transfer of file attachments (and thus also the file names permitted there) is governed by the SMTP and MIME standards.
