MFS is a networked, distributed file system. It spreads data over several physical locations (servers), which appear to the user as a single resource. For standard file operations, MFS behaves like other Unix-like file systems. It has a hierarchical structure (a directory tree), stores file attributes (permissions, last access and modification times), and supports special files (block and character devices, pipes, and sockets), symbolic links (file names pointing to other files accessible locally, not necessarily on MFS), and hard links (different names for files that refer to the same data on MFS).
Distinctive features of MFS are:
System Architecture

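An MFS installation consists of three types of machines: a managing server, data servers (chunkservers), and client computers that mount the file system.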
Metadata is kept in the memory of the managing server and is simultaneously saved to disk (as a periodically updated binary file and immediately updated incremental logs).
File data is divided into fragments (chunks) with a maximum size of 64 MB each, which are stored as files on selected disks on the data servers (chunkservers). Each chunk is saved on different computers in a number of copies equal to the "goal" for the given file. Goals can be configured separately for each file.
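The chunking rule above can be illustrated with a small sketch. It only shows the arithmetic of splitting a file into chunks of at most 64 MB and choosing "goal" distinct servers for each chunk. The function names and the placement policy (take the first servers in the list) are assumptions made for illustration, not the way MFS actually selects servers.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # maximum chunk size stated in the text (64 MB)


def split_into_chunks(file_size: int, chunk_size: int = CHUNK_SIZE) -> list[tuple[int, int]]:
    """Return (offset, length) pairs that cover a file of file_size bytes."""
    if file_size < 0:
        raise ValueError("file_size must be non-negative")
    chunks = []
    offset = 0
    while offset < file_size:
        length = min(chunk_size, file_size - offset)
        chunks.append((offset, length))
        offset += length
    return chunks


def place_copies(servers: list[str], goal: int) -> list[str]:
    """Pick `goal` distinct servers for one chunk.

    Hypothetical policy: take the first `goal` servers in the list.
    """
    if goal > len(servers):
        raise ValueError("not enough servers to satisfy the goal")
    return servers[:goal]


MB = 1024 * 1024
layout = split_into_chunks(150 * MB)       # two full chunks plus a 22 MB remainder
copies = place_copies(["cs1", "cs2", "cs3"], goal=2)
```

Under these assumptions, a 150 MB file occupies three chunks, and with a goal of 2 each chunk is stored on two different servers.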
How the system works
On a client computer with MFS mounted, all file operations work exactly as they do on other file systems. The operating system kernel passes all file operations to the FUSE module, which communicates with the mfsmount process. The mfsmount process then communicates over the network with the managing server and the data servers (chunkservers). This process is fully transparent to the user.
mfsmount communicates with the managing server for every metadata operation (creating files, deleting files, reading directories, reading and changing attributes, changing file sizes, and any access to special files on MFSMETA) and whenever it starts reading or writing data. The data itself is sent over a direct connection to one of the data servers (chunkservers) storing the relevant chunk of the file. When a write finishes, the managing server receives the information it needs to update the file's length and last modification time.
Furthermore, data servers (chunkservers) communicate with each other to replicate data so that each chunk exists in an appropriate number of copies on different machines.
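The split between metadata and data traffic described above can be sketched as a toy in-memory model. This is not the MFS wire protocol: the class names, method names, and the tiny chunk size are all assumptions chosen to keep the example short. The point is only that the client asks the managing server where a chunk lives and then fetches the bytes directly from a chunkserver.

```python
CHUNK_SIZE = 4  # tiny chunk size for the demo; the text states 64 MB


class ManagingServer:
    """Holds metadata only: file lengths and which servers hold each chunk."""

    def __init__(self):
        self.length = {}     # path -> file length in bytes
        self.locations = {}  # (path, chunk index) -> list of chunkserver names

    def locate(self, path, index):
        return self.locations[(path, index)]


class ChunkServer:
    """Holds chunk data only."""

    def __init__(self):
        self.chunks = {}     # (path, chunk index) -> bytes

    def read(self, path, index):
        return self.chunks[(path, index)]


def read_file(master, servers, path):
    """Client side: metadata from the managing server, data from chunkservers."""
    length = master.length[path]
    n_chunks = (length + CHUNK_SIZE - 1) // CHUNK_SIZE
    data = b""
    for index in range(n_chunks):
        holder = master.locate(path, index)[0]  # any listed copy would do
        data += servers[holder].read(path, index)
    return data[:length]


# Store an 11-byte file as three chunks spread over two chunkservers.
master = ManagingServer()
servers = {"cs1": ChunkServer(), "cs2": ChunkServer()}
payload = b"hello world"
master.length["/a.txt"] = len(payload)
for index, start in enumerate(range(0, len(payload), CHUNK_SIZE)):
    holder = "cs1" if index % 2 == 0 else "cs2"
    servers[holder].chunks[("/a.txt", index)] = payload[start:start + CHUNK_SIZE]
    master.locations[("/a.txt", index)] = [holder]

result = read_file(master, servers, "/a.txt")
```

In this model the managing server never sees file contents, which mirrors the division of labour the text describes.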
Fault tolerance
Because file data is stored in multiple copies, the system tolerates failures of single data servers (chunkservers) and temporary communication outages between them (this does not, of course, apply to files with the "goal" set to 1). Exceptionally important files can be given a goal higher than two, in which case they survive even the simultaneous breakdown of several servers (there should be at least one more copy than the number of inaccessible or out-of-order servers).
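The rule of thumb in the previous paragraph (at least one more copy than the number of failed servers) reduces to simple arithmetic. The sketch below uses hypothetical helper names and assumes every chunk currently has exactly "goal" copies on distinct servers.

```python
def tolerated_failures(goal: int) -> int:
    """Worst-case number of simultaneous server failures a chunk survives."""
    return max(goal - 1, 0)


def chunk_is_readable(holders: set[str], failed: set[str]) -> bool:
    """A chunk stays readable while at least one holder is not failed."""
    return bool(holders - failed)


one_copy = tolerated_failures(1)     # goal 1: no failure is tolerated
three_copies = tolerated_failures(3)  # goal 3: two failures are tolerated
still_ok = chunk_is_readable({"cs1", "cs2"}, failed={"cs1"})
lost = chunk_is_readable({"cs1"}, failed={"cs1"})
```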
If a server storing data of a file that has at least two copies fails or is disconnected, the data remains accessible from another server, and it is later replicated to another accessible data server (chunkserver) to restore the required number of copies. Note: if the number of available servers is lower than the "goal" set for a given file, or the number of servers equals the required number of copies but there is no free space, the required number of copies cannot be maintained. In this case a new server should be connected as soon as possible. A new server can be connected at any time.
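The recovery behaviour and its two limits (too few servers, or no free space) can be sketched as a planning function. This is a simplified model, not MFS's actual scheduler: the function name, its arguments, and the alphabetical choice of targets are assumptions made for illustration.

```python
def plan_replication(holders: set[str], goal: int, alive: set[str],
                     has_space: dict[str, bool]) -> tuple[list[str], int]:
    """Return (servers to copy the chunk to, copies that still cannot be made)."""
    live = holders & alive                # copies that are still reachable
    missing = max(goal - len(live), 0)    # copies needed to reach the goal again
    if not live:
        return [], missing                # no surviving copy to replicate from
    # Only reachable servers that lack the chunk and have free space qualify.
    candidates = sorted(s for s in alive - holders if has_space.get(s, False))
    targets = candidates[:missing]
    return targets, missing - len(targets)


space = {"cs2": True, "cs3": True}
# cs1 failed; cs3 is available, so the second copy can be restored there.
recovered = plan_replication({"cs1", "cs2"}, 2, {"cs2", "cs3"}, space)
# Only cs2 is left: fewer servers than the goal, so one copy stays missing.
too_few = plan_replication({"cs1", "cs2"}, 2, {"cs2"}, space)
# cs3 is reachable but full: the goal cannot be met either.
no_space = plan_replication({"cs1", "cs2"}, 2, {"cs2", "cs3"}, {"cs2": True, "cs3": False})
```

A non-zero second value corresponds to the situation in the note above, where a new server should be connected.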
Stored chunks (data fragments) are versioned, so reconnecting a server that holds an older copy of the data does not make files inconsistent. All obsolete chunks are removed, and the freed space is used for new data.
A failure of a client machine has no effect on the consistency of the file system or on other clients' operations. In the worst case, data that has not yet been sent from the failed computer may be lost.
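A minimal sketch of how version numbers can separate current copies from obsolete ones follows. It assumes each chunk copy carries an integer version and that the managing server knows the current one. The function name and data layout are hypothetical, and copies reporting a version newer than the current one are outside the scope of this sketch.

```python
def classify_copies(current_version: int,
                    reported: dict[str, int]) -> tuple[list[str], list[str]]:
    """Split servers into those holding the current copy and those holding an older one."""
    valid = sorted(s for s, v in reported.items() if v == current_version)
    obsolete = sorted(s for s, v in reported.items() if v < current_version)
    return valid, obsolete


# cs3 was offline while the chunk advanced from version 4 to version 5.
valid, obsolete = classify_copies(5, {"cs1": 5, "cs2": 5, "cs3": 4})
```

Here the copy on cs3 would be scheduled for removal, and reads would continue to be served from cs1 and cs2.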
Copyright © 2006-2008 Gemius S.A. All rights reserved.