Clustered file system

The term cluster file system describes a file system that allows the computers of a cluster concurrent access to shared storage.

In a cluster file system, every computer in the cluster accesses the data directly, without the mediation of a server. The file system must therefore reside on a storage medium that is directly accessible from all computers. This is generally achieved by building a SAN based on Fibre Channel or iSCSI. Direct access results in better performance than using a network file system such as NFS or CIFS. The performance gain is particularly significant for databases or applications that manipulate large amounts of data (for example, video).

Motivation

If more than one computer is to work together on a network, it is generally desirable or even necessary that all of them access a shared data set. However, this access must not be uncoordinated, because simultaneous writes to the same data would lead to problems.

This condition is called inconsistency. Inconsistency must be avoided to prevent loss of data. To achieve this, as soon as a participant A has obtained write access to a file in the shared data set, all further writes to the same file must be blocked until participant A has given up the write access again.

This problem also exists in principle within a single computer running a multitasking operating system, in which the individual processes compete for write access to individual files. There it is the task of the operating system kernel to ensure that two processes never receive write access to one and the same file at the same time. Because there is no higher authority among the computers of a network that would naturally take over this function, additional measures are needed to ensure the consistency of the data. A cluster file system assumes this control task.
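The exclusive write access described above can be illustrated with a minimal sketch. It is not taken from any particular cluster file system: the class WriteLockManager and its methods are hypothetical names, and the sketch runs in a single process and ignores network communication, failures, and timeouts.

```python
class WriteLockManager:
    """Hypothetical coordinator: at most one participant may write to a file."""

    def __init__(self):
        # Maps a file name to the participant currently holding write access.
        self._holders = {}

    def acquire(self, file_name, participant):
        """Return True if write access is granted, False if another participant holds it."""
        holder = self._holders.get(file_name)
        if holder is not None and holder != participant:
            return False
        self._holders[file_name] = participant
        return True

    def release(self, file_name, participant):
        """Give up write access; only the current holder can release it."""
        if self._holders.get(file_name) == participant:
            del self._holders[file_name]
            return True
        return False


locks = WriteLockManager()
print(locks.acquire("report.dat", "A"))  # A obtains write access
print(locks.acquire("report.dat", "B"))  # B is refused while A holds the file
locks.release("report.dat", "A")
print(locks.acquire("report.dat", "B"))  # B succeeds once A has released
```

A real cluster file system would additionally have to handle participants that fail while holding a lock, which this sketch does not attempt.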

Examples of Cluster File Systems

  • RMS (OpenVMS)
  • AdvFS (HP Tru64 UNIX)
  • Veritas Cluster File System (various operating systems)
  • OCFS2 (Linux)
  • OCFS, the predecessor of OCFS2 (Linux and Windows, but only for Oracle databases)
  • GPFS (AIX, Linux, Windows)
  • GlusterFS (Linux, FreeBSD, Solaris, Mac OS X)
  • CXFS (IRIX, Linux, Solaris, Mac OS X and Windows)
  • StorNext FS (Linux, Solaris, HP-UX, AIX, IRIX, Windows and, as the Xsan variant, Mac OS X)
  • PolyServe Matrix Server (Linux)
  • Lustre (Linux)
  • Global File System (Linux)
  • MelioFS (Windows)
  • Ceph (Linux)
  • QFS Shared Writer, a hierarchical file system with cluster support

To ensure the consistency of the data, the management data, that is, directories, attributes, and space allocations (metadata), must be stored in a coordinated manner. Usually a metadata server is used for this, to which the various participants in the cluster file system transmit all of this data, usually over Ethernet. This server also takes over the coordination of the caches and the file locks. In some cluster file systems the metadata servers also perform other tasks, while in others (CXFS) one or, for greater reliability, several dedicated metadata servers must be used.

If the servers cannot reach one another for synchronization because of a fault in the network, there is a risk of inconsistencies. Usually, the affected file system is then shut down on all servers that can see (themselves included) at most 50 % of the total number of servers. Since there can be at most one group comprising more than 50 % of the servers, and only that group remains active, no inconsistencies can arise. One also says that the quorum is above 50 %.
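The quorum rule can be sketched in a few lines. This is only an illustration under simplifying assumptions: the function names has_quorum and surviving_partitions are hypothetical, and a network fault is modelled simply as a list of group sizes into which the cluster has split.

```python
def has_quorum(visible_servers, total_servers):
    """True if a server sees (itself included) more than 50 % of all servers."""
    return 2 * visible_servers > total_servers


def surviving_partitions(partition_sizes):
    """Return the sizes of the groups that keep the file system active."""
    total = sum(partition_sizes)
    return [size for size in partition_sizes if has_quorum(size, total)]


print(surviving_partitions([3, 2]))  # [3]: only the larger group stays active
print(surviving_partitions([2, 2]))  # []: neither half exceeds 50 %
```

Because the group sizes add up to the total, at most one group can exceed half of it, so the returned list never contains more than one entry.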

It follows from the quorum that a file system configured in this way needs at least three servers when high availability is desired. Logically, these should then also be integrated into separate infrastructures, that is, three server rooms in different fire compartments, three uninterruptible power supplies, and so on.
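The minimum of three servers follows from simple arithmetic, shown here as a small sketch. The function name tolerated_failures is hypothetical; it assumes that the remaining servers must still make up more than 50 % of the original total.

```python
def tolerated_failures(total_servers):
    """Largest number of servers that may fail while the rest still exceed 50 %."""
    # Remaining servers n - f must satisfy 2 * (n - f) > n, so f < n / 2.
    return (total_servers - 1) // 2


for n in range(1, 6):
    print(n, tolerated_failures(n))
# 1 0
# 2 0
# 3 1
# 4 1
# 5 2
```

With two servers, a single failure leaves exactly 50 %, which is not enough. Three is the smallest cluster that survives the loss of one server.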

A cluster file system that merely serves as a common data store for a large number of servers working in parallel does not necessarily have to be highly available, but such a server farm will in any case be built from far more than two computers. The need for multiple rooms does not arise in such applications in the first place.

In contrast to the largely autonomous position of the servers in a cluster file system stands access to files over a network, e.g. via the Network File System (NFS) on Unix systems, NetWare from Novell, or NetBIOS from Microsoft. Here the disk space "belongs" to a particular server, which provides access to the data. If it fails, the affected file system is not available.

A third possibility for distributed access to data is the use of raw devices. Here file systems are omitted entirely, and it is left to the application to manage the available space on the disk system in question. The application may therefore also have to perform the synchronization between servers and deal with faults itself. Modern operating systems also allow parts of a physical disk system to be used as raw devices while other parts are reserved for file systems.

Note: The names of the products and companies mentioned are protected.
