ReFS (Resilient File System) is Microsoft's latest file system, designed to maximize data availability, scale efficiently to large data sets, and ensure data integrity through so-called “resilience” to file corruption. ReFS was designed to cope with new data growth scenarios and to serve as a basis for future innovations.
ReFS was introduced with Windows Server 2012, then also brought to Windows 8 and the latest versions of Windows 10. Since its first release, other important features have been introduced, especially with Windows Server 2016 and Windows Server 2019.
Compared to NTFS, ReFS introduces key features that improve resilience to data corruption, performance, and scalability. In practical terms, recent Windows operating systems, especially the server editions, let us create ReFS-formatted drives and partitions easily. Below we look at the main advantages of this file system and when to use it.
Here are some of the key benefits of the ReFS file system:
ReFS introduces new features that can precisely detect corruption and even repair it while the volume remains online, helping to provide greater data integrity and availability:
- Integrity streams: ReFS uses checksums for metadata and optionally for file data, allowing it to reliably detect file system corruption.
- Integration with the Storage Spaces feature: when used in conjunction with a mirror or parity space, ReFS can automatically repair detected corruption using the alternate copy of the data provided by Storage Spaces.
- Proactive error correction: in addition to validating data before read and write operations, ReFS introduces a data integrity scanner, known as a scrubber. The scrubber periodically scans the volume, identifies latent corruption, and proactively triggers repair of the corrupt data.
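The interplay between checksums, mirror copies, and the scrubber can be illustrated with a small toy model. This is only a sketch under stated assumptions, not how ReFS is implemented: the `MirroredStore` class and its methods are hypothetical names, the two "drives" are plain dictionaries, and CRC32 stands in for whatever checksum the file system actually uses.

```python
import zlib


class MirroredStore:
    """Toy two-copy store: each block lives on two 'drives' plus a checksum."""

    def __init__(self):
        self.copies = [{}, {}]   # two mirror copies: block id -> bytes
        self.checksums = {}      # block id -> CRC32 kept with the metadata

    def write(self, block_id, data):
        # Record the checksum, then store the data on both copies.
        self.checksums[block_id] = zlib.crc32(data)
        for copy in self.copies:
            copy[block_id] = data

    def read(self, block_id):
        # Return the first copy that matches its checksum and
        # overwrite any copy that does not (online repair).
        expected = self.checksums[block_id]
        for copy in self.copies:
            data = copy[block_id]
            if zlib.crc32(data) == expected:
                for other in self.copies:
                    if zlib.crc32(other[block_id]) != expected:
                        other[block_id] = data
                return data
        raise IOError(f"block {block_id}: all copies are corrupt")

    def scrub(self):
        """Check every block, as a periodic scrubber would; return repaired ids."""
        repaired = []
        for block_id, expected in self.checksums.items():
            bad = [c for c in self.copies if zlib.crc32(c[block_id]) != expected]
            if bad:
                self.read(block_id)   # the read path performs the repair
                repaired.append(block_id)
        return repaired
```

In this model a read succeeds as long as one copy still matches its checksum, and `scrub()` finds damage in blocks that nobody has read recently, which is the idea behind latent corruption detection.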
ReFS introduces new features for virtualized and performance-sensitive workloads. Real-time tier optimization, block cloning and sparse VDL are good examples of the evolving capabilities of ReFS, designed to support dynamic and diverse workloads:
- Mirror-accelerated parity: this feature delivers both high performance and capacity-efficient data storage. To do this, ReFS divides a volume into two logical storage groups, known as tiers. Each tier can have its own drive and resiliency types, allowing it to be optimized for either performance or capacity.
- Performance improvement for Hyper-V VMs: ReFS introduces new features specifically designed to improve the performance of virtualized workloads:
  - Block cloning: block cloning accelerates copy operations, allowing faster and lower-impact virtual machine checkpoint merge operations.
  - Sparse VDL: ReFS allows you to quickly zero files (zero-fill), reducing the time it takes to create fixed VHDs from minutes to seconds.
- Variable cluster size: ReFS supports both 4K and 64K cluster sizes. 4K is the recommended cluster size for most deployments, but 64K clusters are appropriate for large sequential I/O workloads.
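To see why block cloning makes copies cheap, consider the following toy model. It is a sketch of the general reference-counting idea only, not ReFS internals: the `CloneableVolume` class, its method names, and the file names in the usage note are all hypothetical.

```python
class CloneableVolume:
    """Toy volume where a file is a list of references to shared clusters."""

    def __init__(self):
        self.clusters = {}   # cluster id -> bytes
        self.refcount = {}   # cluster id -> number of file references
        self.files = {}      # file name -> list of cluster ids
        self._next_id = 0

    def _alloc(self, data):
        # Allocate a fresh cluster holding `data`, referenced once.
        cid = self._next_id
        self._next_id += 1
        self.clusters[cid] = data
        self.refcount[cid] = 1
        return cid

    def create(self, name, chunks):
        self.files[name] = [self._alloc(c) for c in chunks]

    def clone(self, src, dst):
        """Copy only the cluster references; no data is duplicated."""
        self.files[dst] = list(self.files[src])
        for cid in self.files[dst]:
            self.refcount[cid] += 1

    def write(self, name, index, data):
        """Overwrite one cluster, allocating a new one if it is shared."""
        cid = self.files[name][index]
        if self.refcount[cid] > 1:
            self.refcount[cid] -= 1
            self.files[name][index] = self._alloc(data)
        else:
            self.clusters[cid] = data

    def read(self, name):
        return b"".join(self.clusters[cid] for cid in self.files[name])
```

Calling `vol.clone("base.vhdx", "copy.vhdx")` in this model touches only metadata, so its cost does not depend on the amount of data in the file; a new cluster is allocated only when one of the two files later overwrites a shared cluster.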
ReFS is designed to support extremely large data sets – millions of terabytes – without affecting performance, resulting in greater scalability than previous file systems.
For which configurations is ReFS supported or recommended?
Microsoft lists several scenarios in which ReFS is supported or recommended and offers clear advantages:
Storage Spaces Direct and Storage Spaces
Storage Spaces is a technology in Windows and Windows Server that can help protect data from drive failure. It is conceptually similar to RAID, but implemented at the software level. You can use Storage Spaces to group three or more drives together in a storage pool. If you run out of capacity, simply add more drives to the storage pool (see an example here: https://docs.microsoft.com/en-us/windows-server/storage/storage-spaces/deploy-standalone-storage-spaces).
Storage Spaces Direct extends this model to clusters of servers with locally attached drives, and it includes a built-in server-side cache to optimize storage performance. This cache is configured automatically, based on the types of physical drives present (https://docs.microsoft.com/en-us/windows-server/storage/storage-spaces/understand-the-cache).
ReFS considerably improves performance in these scenarios thanks to mirror-accelerated parity, block cloning, sparse VDL, and related features.
Basic disks and backup destinations
These deployments are best suited to applications that need reliable, resilient handling of their data and that implement their own resiliency and availability mechanisms. A backup destination formatted with ReFS also benefits from the file system's protection against data corruption.
Let’s see specifically the differences between NTFS and ReFS in the following comparison tables:
|Feature|ReFS|NTFS|
|---|---|---|
|Max file name length|255 Unicode characters|255 Unicode characters|
|Max path length|32K Unicode characters|32K Unicode characters|
|Max file size|35 PB (petabytes)|256 TB|
|Max volume size|35 PB|256 TB|
|Cluster Shared Volume (CSV) support|Yes|Yes|
|Failover cluster support|Yes|Yes|
|Offloaded Data Transfer (ODX)|No|Yes|
The following features are available on ReFS only:
|Feature|ReFS|NTFS|
|---|---|---|
|Mirror-accelerated parity|Yes (on Storage Spaces Direct)|No|
The following features are not available on ReFS:
|Feature|ReFS|NTFS|
|---|---|---|
|File system compression|No|Yes|
|File system encryption|No|Yes|
|Page file support|No|Yes|
|Supported on removable media|No|Yes|
Starting from the innovations introduced with Windows Server 2016 and the more recent ones of Server 2019, we can highlight some of the key aspects that can make ReFS the best choice in many scenarios.
Performance and scalability are among the strengths of ReFS, which can manage large amounts of data quickly and efficiently. By design, ReFS allows volumes of up to 1 yottabyte, or 1,000 billion terabytes, although the supported maximum listed in the comparison table above is 35 PB. ReFS uses B+ trees to manage the file structure. A B+ tree is efficient for storage lookups because each node has a very large number of child nodes, which keeps the tree shallow. By following pointers from node to node, a lookup therefore needs only a few I/O operations to retrieve an element in the tree.
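The effect of a high number of child nodes on I/O can be estimated with a short calculation. The helper below is an idealized sketch: the function name `bplus_tree_height` is hypothetical, it assumes every node is completely full and that visiting one level costs one node read, and the fan-out values are illustrative rather than actual ReFS parameters.

```python
def bplus_tree_height(num_keys, fanout):
    """Smallest number of levels needed to index num_keys with this fan-out."""
    if num_keys < 1 or fanout < 2:
        raise ValueError("need num_keys >= 1 and fanout >= 2")
    height = 1
    capacity = fanout            # a single full node holds `fanout` entries
    while capacity < num_keys:
        capacity *= fanout       # each extra level multiplies the capacity
        height += 1
    return height
```

Under these assumptions, indexing one billion entries takes 30 levels with a fan-out of 2 but only 4 levels with a fan-out of 256, so a lookup reads a handful of nodes instead of dozens.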
Security, in the sense of data integrity: there is no longer any need to run a “check disk”, since the file system itself can detect and correct file corruption thanks to its metadata checksums and resilience functions.
Its use with Storage Spaces and in virtualization, where it takes full advantage of the performance improvements.
As for limitations, ReFS cannot be used to format the OS boot disk. Removable media and file system encryption are not supported either. Up to Windows Server 2016, data deduplication was also unsupported; it was introduced with Windows Server, version 1709 and is included in Windows Server 2019: https://docs.microsoft.com/en-us/windows-server/get-started/whats-new-in-windows-server-1709. File system compression remains unavailable, as the comparison table above shows.
Here are two interesting articles on how to create a Storage Space using the Storage Pools feature of Server 2016, and how to use it to create a ReFS volume: