My Experience with Windows Distributed File System Replication (DFS-R)

Hello folks,

In this article, I would like to shed some light on a few things I ran into recently while working on an exciting project.

If you have ever worked with an enterprise that runs Microsoft Windows based products or applications, you have surely heard of, or come across, the need for a central file system: one that stores data with high availability for applications as well as users, and that is reachable under a single, static common name across the organization. Is that enough? No, not quite. Furthermore, access to these common share names should be controlled based on one's role in the organization's Active Directory, the shares should run in HA mode to eliminate single points of failure and provide application resilience, and the data should be replicated across geographies or organization sites.

To achieve all of the above, Microsoft's answer is Distributed File System and Replication, commonly abbreviated as DFS-R.

What is DFS?

Distributed File System is a set of client and server services that allow an organization using Microsoft Windows servers to organize many distributed SMB file shares into a distributed file system.

What is DFS-R?

Distributed File System Replication (DFS-R) is a replication engine that organizations can use to synchronize folders/data across servers in their network.

You can read about both in more detail elsewhere on the internet. So, let's delve into the things that I think matter most for this article.

  • Always use FQDNs for DFS Namespaces.

It is recommended that you use fully qualified domain names (FQDNs) when naming your DFS namespaces. The problem I faced without an FQDN: I was not able to access DFS shares created in AWS from my local company network. Without the FQDN, the domain controller was somehow unable to resolve the target links, which resulted in failure.

So, always name your DFS namespaces in FQDN format. What I mean is, the name should look like '\\domain\name', e.g. '\\abc.com\my-dfs-namespace'.
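For reference, creating a domain-based namespace with an FQDN path might look like the sketch below, using the DFSN PowerShell module. The server name 'fs01.abc.com', the 'dfsroot' share, and the namespace name are all illustrative, not from my actual setup:

```powershell
# Sketch: create a domain-based (DomainV2) namespace addressed by FQDN.
# '\\fs01.abc.com\dfsroot' and '\\abc.com\my-dfs-namespace' are made-up names.
New-DfsnRoot -TargetPath '\\fs01.abc.com\dfsroot' `
             -Type DomainV2 `
             -Path '\\abc.com\my-dfs-namespace'

# Clients then access shares via the stable FQDN-based path:
#   \\abc.com\my-dfs-namespace\<folder>
```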

  • Segregate root folder from actual data folder.

When you set up DFSN, it is always better to keep the DFSN root folder and the actual data folders separate; that is the recommended approach. It keeps things clean, and a single DFSN can then manage multiple shares that are unrelated to each other and have different data needs. For example, under one DFSN entry you can manage one share for finance, one for administrators, one for business, and so on. You do not have to set up DFSN again in each case.
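The layout above can be sketched with the DFSN module as follows. One namespace root holds several unrelated folders, each pointing at its own data share on a separate path; all the server and share names here are hypothetical:

```powershell
# Sketch: one namespace root, multiple unrelated folders mapped to
# separate data shares (all names are illustrative).
New-DfsnFolder -Path '\\abc.com\my-dfs-namespace\finance' `
               -TargetPath '\\fs01.abc.com\finance-data'
New-DfsnFolder -Path '\\abc.com\my-dfs-namespace\business' `
               -TargetPath '\\fs02.abc.com\business-data'
```

Note that the data shares ('finance-data', 'business-data') live outside the namespace root share, so the root stays a thin directory of reparse points.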

  • Choosing a 'writable' DFSN target.

By default, all targets (servers) in DFS are 'writable'. What I mean by that is, clients connected to DFSN can pick any target and write to it. DFS-R then takes care of replicating that data to the other nodes/targets in DFS.

So, how does this target selection happen? Can you modify DFS so that clients always write to one particular target?

By default, all targets in DFS have equal priority. In technical terms, each target has a 'referral priority rank'. When a client asks the domain controller to resolve targets, it is presented with a list of active targets, and based on the referral settings, one from that list is returned. The setting basically states whether to pick the first in the list, the last, or the closest one in the site.

So, indirectly, we can explicitly set the 'referral priority rank' of all the targets when we set up DFS, so that clients will always pick the target with the highest priority.
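A hedged sketch of pinning one target this way, again with made-up names: 'GlobalHigh' is the priority class that, as I understand it, puts a target first among all targets regardless of site cost.

```powershell
# Sketch: pin one root target as the preferred ("writable") one.
# 'GlobalHigh' = "first among all targets", regardless of site cost.
Set-DfsnRootTarget -Path '\\abc.com\my-dfs-namespace' `
                   -TargetPath '\\fs01.abc.com\dfsroot' `
                   -ReferralPriorityClass GlobalHigh

# Verify the referral ordering:
Get-DfsnRootTarget -Path '\\abc.com\my-dfs-namespace' |
    Select-Object TargetPath, ReferralPriorityClass, ReferralPriorityRank
```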

But this approach has a problem too, which I will explain in the next point.

  • AWS 'Availability Zones' matter (site configuration).

Consider a scenario where you are running both DFS and your client application in AWS. You have two DFS targets and some application servers. To achieve high availability, you have placed the DFS targets across zones, say one in 1a and the other in 1b. Refer to the image below.

[Image: DFS-R targets and application servers across AWS Availability Zones]

Even after tampering with the referral ranks of the targets as explained in the point above, what I found was that clients were still choosing DFS targets 'zone wise'. What I mean by that is, application servers in 1a got the DFS target server in zone 1a, and 1b application servers got the 1b server as their DFS target when they made a connection/target-resolution request. Why was the explicitly set referral rank not working here?

One of the main reasons for this is how site configuration is done in your Active Directory and domain controllers. DFS always tries to resolve a target within the same site, hence the behavior above.
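To see why, it helps to check how your AWS subnets map to AD sites, since that mapping is what drives same-site referrals. A sketch, assuming the ActiveDirectory and DFSN modules are available and using my hypothetical namespace name:

```powershell
# Sketch: inspect which AD sites and subnets exist; if each AZ's subnet
# maps to a different AD site, referrals will prefer same-site targets.
Get-ADReplicationSite -Filter * | Select-Object Name
Get-ADReplicationSubnet -Filter * | Select-Object Name, Site

# Site costing on the namespace also influences referral ordering
# (shows up under Flags when enabled):
Get-DfsnRoot -Path '\\abc.com\my-dfs-namespace' |
    Select-Object Path, Flags
```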

  • File locking during replication.

One shocking discovery was that DFS-R locks files during replication. This was not something I expected, but it is true. Our application started giving 'file locked' errors when it wrote to a file too frequently; we hit this type of error when a file was opened/written/closed repeatedly within a short period of time. As a temporary workaround you can remove replication from all such folders, though at the cost of a single point of failure.
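If removing replication is not an option, one mitigation we could have tried is retrying the write with a short backoff when the file is locked. A minimal sketch, assuming a hypothetical file path and illustrative retry counts (not our actual application code):

```powershell
# Sketch: retry a file write when something (e.g. DFS-R) holds a lock on it.
# $path and the retry/backoff values are illustrative.
$path = '\\abc.com\my-dfs-namespace\finance\report.csv'
$maxAttempts = 5

for ($attempt = 1; $attempt -le $maxAttempts; $attempt++) {
    try {
        # Open for exclusive append; throws IOException if the file is locked
        $stream = [System.IO.File]::Open($path, 'Append', 'Write', 'None')
        try {
            $bytes = [System.Text.Encoding]::UTF8.GetBytes("new row`r`n")
            $stream.Write($bytes, 0, $bytes.Length)
            break   # success, stop retrying
        } finally {
            $stream.Dispose()
        }
    } catch [System.IO.IOException] {
        # Likely locked by replication; back off and retry
        if ($attempt -eq $maxAttempts) { throw }
        Start-Sleep -Milliseconds (200 * $attempt)
    }
}
```

This only papers over the contention, of course; it does not change how often DFS-R takes the lock.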

  • Not suitable for high-volume transactional file writes.

Because of the file-locking issue above, DFS-R is not a comprehensive solution if you have an application that does high-volume transactional writes. It is best suited for low-volume or offsite-backup kinds of workloads.

What are alternatives?

Microsoft has come up with 'Storage Replica' to get away from some of DFS-R's problems, such as locking. It is available in the Windows Server 2016 Datacenter edition. Read more here.

But what I am really excited about is that AWS has recently announced a fully managed file share service for Windows servers, called Amazon FSx. No more headaches of setting up, managing, and maintaining DFS/file clusters. Read more here.

I hope you find this article helpful.

Happy learning. 🙂