Gluster is an open source, scale-out storage platform that focuses on managing the explosive growth of unstructured data. It differs from other storage technologies in that it keeps a unified global namespace, virtualizing disk and memory resources into a single shared pool that is centrally controlled. Because it runs on commodity hardware, a deployment can scale out in building blocks from a few terabytes to multiple petabytes and meet demanding capacity and performance needs at a fraction of the cost of traditional systems. Gluster also provides Network Attached Storage (NAS) that addresses the needs of traditional data centers and of shared storage for virtual machine environments, in both public and private storage clouds.
The Gluster file system can be accessed through the FUSE module. FUSE (Filesystem in Userspace) is a Linux kernel module that supports interaction between the kernel virtual file system and non-privileged user applications, and it exposes an API that can be called from user space. Using this API, a file system can be written in almost any language, because bindings exist between FUSE and many languages. The trade-off is that a single file system operation needs several context switches between kernel and user space, which can cause performance issues.
The Gluster file system architecture is built around volumes. A volume is a collection of bricks, and most Gluster file system operations happen on the volume. Gluster supports different types of volumes depending on the requirements: some are suited to scaling storage size, some to improving performance, and some to both.
Distributed Gluster FS volume: This is the default volume type in Gluster FS. If no volume type is specified when a volume is created, a distributed volume is created. Files are spread across the bricks in the volume, with each file stored on one brick only, so there is no data redundancy. The purpose of this volume type is to scale storage easily and cheaply. The main drawback is that a brick failure leads to complete loss of the data stored on that brick.
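The placement idea can be illustrated with a small Python sketch. This is a toy model, not Gluster's actual placement algorithm: the hash choice, the helper names (pick_brick, place_files, files_lost) and the brick names are all assumptions made for the example. It only shows that each file lands on one brick, so a failed brick takes with it the files that were placed on it.

```python
import hashlib
from typing import Dict, List


def pick_brick(filename: str, bricks: List[str]) -> str:
    """Map a file name to exactly one brick with a stable hash (toy model)."""
    digest = hashlib.md5(filename.encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % len(bricks)
    return bricks[index]


def place_files(filenames: List[str], bricks: List[str]) -> Dict[str, List[str]]:
    """Return brick -> files stored on it; every file lives on one brick only."""
    layout: Dict[str, List[str]] = {brick: [] for brick in bricks}
    for name in filenames:
        layout[pick_brick(name, bricks)].append(name)
    return layout


def files_lost(layout: Dict[str, List[str]], failed_brick: str) -> List[str]:
    """With no redundancy, the files on the failed brick are unavailable."""
    return list(layout[failed_brick])


if __name__ == "__main__":
    # Hypothetical brick names in the usual server:/path form.
    bricks = ["server1:/exp1", "server2:/exp2", "server3:/exp3"]
    files = ["file%d.dat" % i for i in range(9)]
    layout = place_files(files, bricks)
    for brick, stored in layout.items():
        print(brick, stored)
    print("lost if server2 fails:", files_lost(layout, "server2:/exp2"))
```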
Replicated Gluster FS volume: A replicated volume overcomes the data loss problem of the distributed volume. Here, a copy of the data is kept on every brick, so the data can still be retrieved if a disk fails. The number of replicas is set when the volume is created, and the volume needs at least as many bricks as replicas: two bricks for two replicas, three bricks for three. Because of this redundancy, replicated volumes are used mainly where better reliability is needed.
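A minimal in-memory sketch of the replication idea follows. The class ReplicatedVolume and its methods are hypothetical names invented for illustration; real Gluster replication also handles healing and consistency, which this toy model ignores. It only shows that a write goes to every available brick and that a read can be served by any surviving one.

```python
from typing import Dict, List, Set


class ReplicatedVolume:
    """Toy model: every write is copied to all bricks that are still up."""

    def __init__(self, brick_names: List[str]) -> None:
        if len(brick_names) < 2:
            raise ValueError("a replicated volume needs at least two bricks")
        self.bricks: Dict[str, Dict[str, bytes]] = {n: {} for n in brick_names}
        self.failed: Set[str] = set()

    def write(self, path: str, data: bytes) -> None:
        # Store one copy of the file on each healthy brick.
        for name, store in self.bricks.items():
            if name not in self.failed:
                store[path] = data

    def fail(self, brick_name: str) -> None:
        # Simulate a disk or server failure.
        self.failed.add(brick_name)

    def read(self, path: str) -> bytes:
        # Serve the file from the first healthy brick that holds a copy.
        for name, store in self.bricks.items():
            if name not in self.failed and path in store:
                return store[path]
        raise FileNotFoundError(path)


if __name__ == "__main__":
    volume = ReplicatedVolume(["server1:/exp1", "server2:/exp2"])
    volume.write("/report.txt", b"quarterly numbers")
    volume.fail("server1:/exp1")
    print(volume.read("/report.txt"))  # still readable from server2
```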
Distributed replicated Gluster FS volume: In this volume, files are distributed across sets of replicated bricks. The number of bricks must be a multiple of the replica count. This volume type is used when both high availability through redundancy and storage scaling are required.
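The multiple-of-replica-count rule can be sketched in Python as below. The function replica_sets and the brick names are hypothetical. Grouping consecutive bricks into one replica set is an assumption of this sketch, so check the Gluster documentation for how brick order is treated in a given version.

```python
from typing import List


def replica_sets(bricks: List[str], replica: int) -> List[List[str]]:
    """Group bricks into replica sets of `replica` consecutive bricks each."""
    if replica < 2:
        raise ValueError("replica count must be at least 2")
    if len(bricks) == 0 or len(bricks) % replica != 0:
        raise ValueError(
            "number of bricks (%d) must be a multiple of the replica count (%d)"
            % (len(bricks), replica)
        )
    # Files are distributed across the sets; each set holds `replica` copies.
    return [bricks[i:i + replica] for i in range(0, len(bricks), replica)]


if __name__ == "__main__":
    bricks = ["server1:/exp1", "server2:/exp2", "server3:/exp3", "server4:/exp4"]
    for group in replica_sets(bricks, replica=2):
        print(group)
    # ['server1:/exp1', 'server2:/exp2']
    # ['server3:/exp3', 'server4:/exp4']
```

With four bricks and a replica count of 2, files are distributed over two sets, and each set keeps two copies. Three bricks with the same replica count would be rejected.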
Striped Gluster FS volume: When a large file stored on a single brick is accessed frequently by many clients at the same time, it puts too much load on that one server. A striped volume divides the stored data into stripes: a large file is split into smaller chunks and the chunks are stored across the bricks, so the load is distributed and the file can be fetched faster.
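As a rough illustration of striping, the Python sketch below splits a byte string into fixed-size chunks and deals them out to the bricks in round-robin order, then reads them back in the same order. The function names, the chunk size and the round-robin layout are assumptions made for the example, not a description of Gluster's on-disk format.

```python
from typing import List


def stripe(data: bytes, brick_count: int, chunk_size: int) -> List[List[bytes]]:
    """Split data into chunks and spread them round-robin over the bricks."""
    if brick_count < 1 or chunk_size < 1:
        raise ValueError("brick_count and chunk_size must be positive")
    bricks: List[List[bytes]] = [[] for _ in range(brick_count)]
    for n, offset in enumerate(range(0, len(data), chunk_size)):
        bricks[n % brick_count].append(data[offset:offset + chunk_size])
    return bricks


def reassemble(bricks: List[List[bytes]]) -> bytes:
    """Read the chunks back in the same round-robin order they were written."""
    parts: List[bytes] = []
    longest = max(len(chunks) for chunks in bricks)
    for row in range(longest):
        for chunks in bricks:
            if row < len(chunks):
                parts.append(chunks[row])
    return b"".join(parts)


if __name__ == "__main__":
    layout = stripe(b"abcdefghij", brick_count=3, chunk_size=2)
    print(layout)              # [[b'ab', b'gh'], [b'cd', b'ij'], [b'ef']]
    print(reassemble(layout))  # b'abcdefghij'
```

Because consecutive chunks sit on different bricks, clients reading different parts of the same file hit different servers.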
Distributed striped Gluster FS volume: This is similar to the striped Gluster FS volume, except that the stripes can be distributed across a larger number of bricks. The number of bricks must be a multiple of the stripe count, so to increase the volume size, bricks must be added in multiples of the stripe count.