Configuring EMC DD Boost with Veeam Availability Suite

This article walks through the configuration steps required to integrate an EMC Data Domain system with Veeam Availability Suite 9 and outlines the benefits of using EMC DD Boost with a backup application.

Data Domain Boost (DD Boost) software provides advanced integration with backup and enterprise applications for increased performance and ease of use. DD Boost distributes parts of the deduplication process to the backup server or application clients, enabling client-side deduplication for faster, more efficient backup and recovery. All Data Domain systems can be configured as storage destinations for leading backup and archiving applications using NFS, CIFS, Boost, or VTL protocols.

The following applications work with a Data Domain system using the DD Boost interface: EMC Avamar, EMC NetWorker, Oracle RMAN, Quest vRanger, Symantec Veritas NetBackup (NBU), Veeam and Backup Exec. In this example, we will be using Veeam Availability Suite version 9.

Data Domain Systems for Service Provider

Data Domain Secure Multitenancy (SMT) is the simultaneous hosting, by a service provider, of more than one consumer (Tenant) or workload (Applications, Exchange, Standard VMs, Structured Data, Unstructured Data, Citrix VMs).

SMT provides the ability to securely isolate many users and workloads in a shared infrastructure, so that the activities of one Tenant are not apparent or visible to the other Tenants. A Tenant is a consumer (business unit, department, or customer) who maintains a persistent presence in a hosted environment.

Basic Configuration requirements are:

  • Enable SMT in the DD System
  • Role Based Access Control in DD Systems
  • Tenant Self-Service in the DD Systems
  • A Tenant is created on the DD Management Center and/or DD system.
  • A Tenant Unit is created on a DD system for the Tenant.
  • One or more MTrees are created to meet the storage requirements for the Tenant’s various types of backups.
  • The newly created MTrees are added to the Tenant Unit.
  • Backup applications are configured to send each backup to its configured Tenant Unit MTree.

Prerequisites:

  1. Backup Server
  • Physical server: Fibre Channel or iSCSI, OR
  • Virtual server: Fibre Channel with N_Port ID Virtualization (NPIV), pass-through storage, or iSCSI
  2. Backup Software
  • Backup application
  • DD Boost library
  • DD Boost-over-FC transport
  3. Storage Area Network
  • Fibre Channel or iSCSI
  4. Data Domain System
  • DD Boost service
  • DD Boost-over-FC server
  • SCSI commands over FC
  • SCSI processor devices
  5. Virtual Infrastructure
  • Hyper-V Server cluster and System Center Virtual Machine Manager, OR
  • VMware vCenter with vSphere hosts

Designing DD Boost for resiliency & availability

The Data Domain system advertises itself to the backup server over one or more paths, connected physically or virtually. The design of the entire system depends on the Data Domain sizing: how you connect the Data Domain system to the backup server(s), how many backup jobs will be running, the size of the backups, the de-duplication ratio, the data retention period and the frequency of data restores. A typical backup solution should include the following environment.

  • Backup server with 2 initiator HBA ports (A and B)
  • Data Domain System has 2 FC target endpoints (C and D)
  • Fibre Channel Fabric zoning is configured such that both initiator HBA ports can access both FC target endpoints
  • Data Domain system is configured with a SCSI target access group containing both FC target endpoints on the Data Domain system
  • Dual fabric for failover and availability
  • Multiple physical and logical Ethernet connections for availability and failover

Examples of Sizing

To calculate the number of DFC devices (D) that the Data Domain system must advertise to the initiators of the backup server(s), first work out the maximum number of simultaneous connections (S) to the Data Domain system over DD Boost-over-FC (DFC) from all backup servers. Let's say we have 1 backup server and a single Data Domain system, and the backup server is running 100 backup jobs.

J = 1 backup server x 100 backup jobs = 100

C = 1 (single DD system)

S = J x C = 100 x 1 = 100

DFC device count D = (2 x S) / 128, rounded up

D = (2 x 100) / 128 = 1.56, rounded up to 2

Therefore, all DFC groups on the Data Domain system must be configured with 2 devices.
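
The sizing arithmetic can be checked with a few lines of code. The following is a minimal Python sketch that follows the calculation as it is worked through in this article (D = 2 x S / 128, rounded up). The function name dfc_device_count and its parameters are illustrative assumptions and are not part of any EMC tool; any upper limit that the vendor documentation may place on the device count is not modelled here.

```python
import math


def dfc_device_count(backup_servers: int, jobs_per_server: int,
                     connections_per_job: int = 1) -> int:
    """Return the DFC device count D for the sizing example in this article."""
    jobs = backup_servers * jobs_per_server       # J: simultaneous backup jobs
    connections = jobs * connections_per_job      # S = J x C
    return math.ceil(2 * connections / 128)       # D = 2 x S / 128, rounded up


# One backup server running 100 jobs against a single DD system
print(dfc_device_count(1, 100))  # 2
```

Changing the arguments, for example to several backup servers, shows how quickly the device count grows with the number of simultaneous jobs.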

Step1: Preparing DD System

Step2: Managing system licenses

  1. Select Administration > Licenses, and then click Add Licenses.
  2. In the License window, type or paste the license keys. Type each key on its own line, or separate the keys with a space or comma (DD System Manager automatically places each key on a new line).
  3. Click Add. The added licenses are displayed in the Added license list.

OR

  1. In System Manager, select Protocols > DD Boost > Settings. If the Status indicates that DD Boost is not licensed, click Add License.
  2. Enter a valid license in the Add License Key dialog box.

Step3: Setting up CIFS Protocol

  1. In the DD System Manager navigation panel, select Protocols > CIFS.
  2. In the CIFS Status area, click Enable.

Step4: Remove Anonymous Log on

  1. Select Protocols > CIFS > Configuration.
  2. In the Options area, click Configure Options.
  3. To restrict anonymous connections, select the Enable checkbox in the Restrict Anonymous Connections area.
  4. In the Log Level area, click the drop-down list to select level number 1.
  5. In the Server Signing area, select Enabled to enable server signing.

Step5: Specifying DD Boost user names

The following user will be used to connect to DD boost from backup software.

  1. Select Protocols > DD Boost.
  2. Select Add, above the Users with DD Boost Access list.
  3. The Add User dialog appears. To select an existing user, select the user name in the drop-down list. EMC recommends that you select a user name with management role privileges set to none.
  4. To create and select a new user, select Create a new Local User, enter the user name, and enter the password twice in the appropriate fields. Click Add.

Step6: Enabling DD Boost

  1. Select Protocols > DD Boost > Settings.
  2. Click Enable in the DD Boost Status area.
  3. Select an existing user name from the menu then complete the wizard.

Step7: Creating a storage unit

  1. Select Protocols > DD Boost > Storage Units.
  2. Click Create. The Create Storage Unit dialog box is displayed.
  3. Enter the storage unit name in the Name box e.g. DailyRepository1
  4. Select an existing username that will have access to this storage unit. EMC recommends that you select a username with management role privileges set to none. The user must be configured in the backup application to connect to the Data Domain system.
  5. To set storage space restrictions to prevent a storage unit from consuming excess space: enter either a soft or hard limit quota setting, or both a hard and soft limit.
  6. Click Create.
  7. Repeat the above steps to create MonthlyRepository1, and repeat them on each Data Domain Boost-enabled system.

Step8: Encrypting Communication between Backup Server and Data Domain (Optional)

Generate a certificate from Active Directory Certificate Services and install it on the Data Domain system for DD Boost. You must install the same certificate on the backup servers so that the Data Domain system and its DD Boost client (the backup server) can communicate with each other over a certificate-based encrypted connection.

  1. Start DD System Manager on the system to which you want to add a host certificate.
  2. Select Protocols > DD Boost > More Tasks > Manage Certificates….
  3. In the Host Certificate area, click Add.
  4. To add a host certificate enclosed in a .p12 file, select I want to upload the certificate as a .p12 file, and type the password in the Password box.
  5. Click Browse and select the host certificate file to upload to the system.
  6. Click Add.
  7. Alternatively, to add a host certificate enclosed in a .pem file, select I want to upload the public key as .pem file and use a generated private key, and then click Browse and select the host certificate file to upload to the system.
  8. Click Add.

DD Boost client access and encryption

  1. Select Protocols > DD Boost > Settings.
  2. In the Allowed Clients section, click Create. The Add Allowed Client dialog appears.
  3. Enter the hostname of the client. This can be a fully-qualified domain name (e.g. Backupserver1.domain.com) or a hostname with a wildcard (e.g. *.domain.com).
  4. Select the Encryption Strength. The options are None (no encryption), Medium (AES128-SHA1), or High (AES256-SHA1).
  5. Select the Authentication Mode. The options are One Way, Two Way.
  6. Click OK.

Step9: Configuring DD Boost over Fibre Channel

  1. Select Protocols > DD Boost > Fibre Channel.
  2. Click Enable to enable Fibre Channel transport.
  3. To change the DD Boost Fibre Channel server name from the default (hostname), click Edit, enter a new server name, and click OK.
  4. Select Protocols > DD Boost > Storage Units to create a storage unit (if not already created by the application).
  5. Install the DD Boost API/plug-in (if necessary, based on the application).

Step10: Configuring storage for DD Extended Retention (Optional)

Before you proceed with Extended Retention, you must add the required license on the DD system.

  1. Select Hardware > Storage tab.
  2. In the Overview tab, select Configure Storage. In the Configure Storage tab, select the storage to be added from the Available Storage list.
  3. Select the appropriate Tier Configuration (Active or Retention) from the menu.
  4. Select the checkbox for the Shelf to be added.
  5. Click the Add to Tier button. Click OK to add the storage.

Step11: Configure a Veeam backup repository

  1. To create an EMC Data Domain Boost-enabled backup repository, navigate to the Backup Infrastructure section of the user interface, select Backup Repositories, and then right-click and select Add Backup Repository.
  2. Select the repository type, Deduplicating storage appliance. Type the name of the DD system, choose the Fibre Channel or Ethernet option, and add the credentials and the gateway server used to connect to the DD system.
  • To connect the Veeam backup server to the DD system over Fibre Channel, the DD system and the Veeam backup server must be in the same SAN zone, and FC must be enabled on the DD system.
  • To connect over Ethernet, the Veeam backup server and the DD system must be in the same VLAN; in a multi-VLAN design, you must allow unrestricted communication between the VLANs.
  3. On the next screen, select the storage unit of the DD system that the Veeam server will use as the repository, and leave the concurrent connection setting at its default.
  4. On the next screen, enable vPower NFS, and then complete the wizard.

Step12: Configure Veeam Backup Job & Backup Copy Job

The critical decision for backup jobs is whether to run active full backups or to leverage synthetic full backups. For job creation steps, see the Veeam Backup Job Creation Guide and the Veeam Backup Copy Job Creation Guide.

Here is a short business case for each backup type.

Veeam Backup Options:

  1. Active Full: Organizations in the financial or health sectors often prefer to keep a monthly full backup and retain it for a defined period, both for corporate compliance and to satisfy external auditors' requirements to keep data off-site.
  2. Synthetic Full: Keeping synthetic full backups at all times is a standard practice for any organization that wants to reduce storage cost and recovery time objective.

  • For most environments, Veeam recommends synthetic full backups when leveraging EMC Data Domain Boost. This reduces stress on the primary storage for the vSphere and Hyper-V VMs, and the Boost-enabled synthesizing is very fast.
  • For a Backup Copy job using GFS retention (weekly, monthly, quarterly and/or annual restore points), the gateway server must be the one closest to the Data Domain server, since the Backup Copy job frequently involves an offsite transfer. When the Data Domain server is designated in the repository setup, give consideration to the gateway server if it is being used off site.
  • The backup job timeout value must be higher than 30 minutes so that the job can be retried if it fails for any reason.

DD System Option:

  • A virtual synthetic full backup is the combination of the last full (synthetic or full) backup and all subsequent incremental backups. Virtual synthetics are enabled by default.
  • The synthetic full backups are faster when Data Domain Boost is enabled for a repository
  • DD Boost can reduce backup transformation time by up to 80% compared with the time required without DD Boost.
  • Because the first job places the bulk of the blocks of the vSphere or Hyper-V VM on the DD Boost storage unit, later jobs only need to transfer metadata and any changed blocks. This can be a significant improvement on the active full backup process when there is a fast source storage resource in place.
  • With DD Boost, multi-link provides failover and resiliency. DD Boost also provides parallel processing of concurrent jobs to the DD Boost storage unit.
  1. To display the DD Boost option settings, select Protocols > DD Boost > Settings > Advanced Options.
  2. To change the settings, select More Tasks > Set Options. Select or deselect any option to be enabled.
  3. Click OK.
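
To illustrate the idea of a synthetic full backup described under DD System Option (the last full backup combined with all subsequent incrementals), here is a minimal Python sketch. It models each restore point as a dictionary that maps a block number to the block contents. This data model and the name synthesize_full are assumptions made purely for illustration; they do not reflect how Veeam or DD Boost store or merge data internally.

```python
def synthesize_full(last_full: dict[int, bytes],
                    incrementals: list[dict[int, bytes]]) -> dict[int, bytes]:
    """Combine the last full backup with later incrementals, oldest first."""
    synthetic = dict(last_full)        # start from a copy of the last full backup
    for increment in incrementals:     # apply incrementals in chronological order
        synthetic.update(increment)    # changed or new blocks replace older ones
    return synthetic


# Hypothetical restore points: block number -> block contents
full = {0: b"A", 1: b"B", 2: b"C"}
monday = {1: b"B2"}
tuesday = {2: b"C2", 3: b"D"}

print(synthesize_full(full, [monday, tuesday]))
# {0: b'A', 1: b'B2', 2: b'C2', 3: b'D'}
```

Only the changed blocks travel in each incremental, which is why a synthetic full places far less load on the source storage than an active full.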

Microsoft Software Defined Storage AKA Scale-out File Server (SOFS)

Business Challenges:

  • $/IOPS and $/TB
  • Continuous Availability
  • Fault Tolerance
  • Storage Performance
  • Segregation of production, development and disaster recovery storage
  • De-duplication of unstructured data
  • Segregation of data between production site and disaster recovery site
  • Continuous break fix of Distributed File Systems (DFS) & File Server
  • Continuously extending storage on the DFS servers
  • Single point of failure
  • File systems are not always available
  • Security of file systems is a constant concern
  • Proprietary non-scalable storage
  • Management of physical storage
  • Vendor lock-in contract for physical storage
  • Migration path from single vendor to multi vendor storage provider
  • Management overhead of unstructured data
  • Comprehensive management of storage platform

Solutions:

Microsoft Software Defined Storage AKA Scale-Out File Server is a feature that is designed to provide scale-out file shares that are continuously available for file-based server application storage. Scale-out file shares provide the ability to share the same folder from multiple nodes of the same cluster.

Microsoft Software Defined Storage offerings compared with third-party offerings:

Storage feature | Third-party NAS/SAN | Microsoft Software-Defined Storage
Fabric | Block protocol | File protocol
Network | Low latency network with FC | Low latency with SMB3Direct
Management | Management of LUNs | Management of file shares
Data de-duplication | Data de-duplication | Data de-duplication
Resiliency | RAID resiliency groups | Flexible resiliency options
Pooling | Pooling of disks | Pooling of disks
Availability | High | Continuous (via redundancy)
Copy offload, Snapshots | Copy offload, Snapshots | SMB copy offload, Snapshots
Tiering | Storage tiering | Performance with tiering
Persistent write-back cache | Persistent write-back cache | Persistent write-back cache
Scale up | Scale up | Automatic scale-out rebalancing
Storage Quality of Service (QoS) | Storage QoS | Storage QoS (Windows Server 2016)
Replication | Replication | Storage Replica (Windows Server 2016)
Updates | Firmware updates | Rolling cluster upgrades (Windows Server 2016)
 | | Storage Spaces Direct (Windows Server 2016)
 | | Azure-consistent storage (Windows Server 2016)

 

 Functional use of Microsoft Scale-Out File Servers:

1. Application Workloads

  • Microsoft Hyper-v Cluster
  • Microsoft SQL Server Cluster
  • Microsoft SharePoint
  • Microsoft Exchange Server
  • Microsoft Dynamics
  • Microsoft System Center DPM Storage Target
  • Veeam Backup Repository

2. Disaster Recovery Solution

  • Backup Target
  • Object storage
  • Encrypted storage target
  • Hyper-v Replica
  • System Center DPM

3. Unstructured Data

  • Continuously Available File Shares
  • DFS Namespace folder target server
  • Microsoft Data de-duplication
  • Roaming user Profiles
  • Home Directories
  • Citrix User Profiles
  • Outlook Cached location for Citrix XenApp Session Server

4. Management

  • Single Management Point for all Scale-out File Servers
  • Provide wizard driven tools for storage related tasks
  • Integrated with Microsoft System Center

Business Values:

  • Scalability
  • Load balancing
  • Fault tolerance
  • Ease of installation
  • Ease of management/operations
  • Flexibility
  • Security
  • High performance
  • Compliance & Certification

SOFS Architecture:

Microsoft Scale-out File Server (SOFS) is considered Software Defined Storage (SDS). Microsoft SOFS is independent of the hardware vendor as long as the compute and storage are certified by Microsoft Corporation. The following figure shows a Microsoft Hyper-V cluster, SQL cluster and object storage on the SOFS.

Figure: Microsoft Software Defined Storage (SDS) Architecture

Figure: Microsoft Scale-out File Server (SOFS) Architecture

Figure: Microsoft SDS Components

Figure: Unified Storage Management (See Reference)

Microsoft Software Defined Storage AKA Scale-out File Server Benefits:

SOFS:

  • Continuous availability file stores for Hyper-V and SQL Server
  • Load-balanced IO across all nodes
  • Distributed access across all nodes
  • VSS support
  • Transparent failover and client redirection
  • Continuous availability at a share level versus a server level

De-duplication:

  • Identifies duplicate chunks of data and only stores one copy
  • Provides up to 90% reduction in storage required for OS VHD files
  • Reduces CPU and Memory pressure
  • Offers excellent reliability and integrity
  • Outperforms Single Instance Storage (SIS) or NTFS compression.

SMB Multichannel

  • Automatic detection of SMB Multi-Path networks
  • Resilience against path failures
  • Transparent failover with recovery
  • Improved throughput
  • Automatic configuration with little administrative overhead

SMB Direct:

  • The Microsoft implementation of RDMA.
  • The ability to direct data transfers from a storage location to an application.
  • Higher performance and lower latency through CPU offloading
  • High-speed network utilization (including InfiniBand and iWARP)
  • Remote storage at the speed of local storage
  • A transfer rate of approximately 50Gbps on a single NIC port
  • Compatibility with SMB Multichannel for load balancing and failover

VHDX Virtual Disk:

  • Online VHDX Resize
  • Storage QoS (Quality of Service)

Live Migration

  • Easy migration of virtual machine into a cluster while the virtual machine is running
  • Improved virtual machine mobility
  • Flexible placement of virtual machine storage based on demand
  • Migration of virtual machine storage to shared storage without downtime

Storage Protocol:

  • SAN discovery (FCP, SAS, iSCSI, e.g. EMC VNX, EMC VMAX)
  • NAS discovery (Self-contained NAS, NAS Head, e.g. NetApp OnTap)
  • File Server Discovery (Microsoft Scale-Out File Server, Unified Storage)

Unified Management:

  • A new architecture provides ~10x faster disk/partition enumeration operations
  • Remote and cluster-awareness capabilities
  • SM-API exposes new Windows Server 2012 R2 features (Tiering, Write-back cache, and so on)
  • SM-API features added to System Center VMM
  • End-to-end storage high availability space provisioning in minutes in VMM console
  • More Windows PowerShell

ReFS:

  • More resilience to power failures
  • Highest levels of system availability
  • Larger volumes with better durability
  • Scalable to petabyte size volumes

Storage Replica:

  • Hardware agnostic storage configuration
  • Provide a DR solution for planned and unplanned outages of mission critical workloads.
  • Use SMB3 transport with proven reliability, scalability, and performance.
  • Stretched failover clusters within metropolitan distances.
  • Manage end to end storage and clustering for Hyper-V, Storage Replica, Storage Spaces, Scale-Out File Server, SMB3, Deduplication, and ReFS/NTFS using Microsoft software
  • Reduce downtime, and increase reliability and productivity intrinsic to Windows.

Cloud Integration:

  • Cloud-based storage service for online backups
  • Windows PowerShell instrumented
  • Simple, reliable Disaster Recovery solution for applications and data
  • Supports System Center 2012 R2 DPM

Implementing Scale-out File Server

Scale-out File Server Recommended Configuration:

  1. Gather all virtual servers IOPS requirements*
  2. Gather Applications IOPS requirements
  3. Total IOPS of all applications & Virtual machines must be less than available IOPS of physical storage 
  4. Keep latency below 3 ms at all times for high performance
  5. Gather required capacity + potential growth + best practice
  6. N+1 Compute, Network and Storage Hardware
  7. Use low latency, high throughput networks
  8. Segregate storage network from data network using logical network (VLAN) or fibre channel
  9. Tools to be used

*Not all virtual servers are the same: a DHCP server generates few IOPS, whereas SQL Server and Exchange can generate thousands of IOPS.

*Do not place SQL Server on the same logical volume (LUN) with Exchange Server or Microsoft Dynamics or Backup Server.

*Isolate high IO workloads to separate logical volume or even separate storage pool if possible.
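
The IOPS guidance in items 1 to 3 comes down to simple arithmetic: add up the requirements of every workload and compare the total with what the physical storage can deliver. A minimal Python sketch follows. The function name iops_headroom, the workload names and all IOPS figures are made-up placeholders for illustration, not measurements or recommendations.

```python
def iops_headroom(workload_iops: dict[str, int], storage_iops: int) -> int:
    """Return spare IOPS; a negative value means the storage is oversubscribed."""
    return storage_iops - sum(workload_iops.values())


# Hypothetical figures for illustration only
workloads = {"dhcp": 20, "sql": 4000, "exchange": 2500}
headroom = iops_headroom(workloads, storage_iops=10000)

print(headroom)  # 3480
```

In practice the inputs would come from performance monitoring of the existing servers, with growth added on top.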

Prerequisites for Scale-Out File Server

  1. Install File and Storage Services server role, and the Failover Clustering feature on the cluster nodes
  2. Configure Microsoft failover Clusters using this article Windows Server 2012: Failover Clustering Deep Dive Part II
  3. Add Cluster Share Volume
  • Log on to the server as a member of the local Administrators group.
  • Open Server Manager, click Tools, and then click Failover Cluster Manager.
  • Click Storage, right-click the disk that you want to add to the cluster shared volume, and then click Add to Cluster Shared Volumes> Add Storage Presented to this cluster.

Configure Scale-out File Server

  1. Open Failover Cluster Manager> Right-click the name of the cluster, and then click Configure Role.
  2. On the Before You Begin page, click Next.
  3. On the Select Role page, click File Server, and then click Next.
  4. On the File Server Type page, select the Scale-Out File Server for application data option, and then click Next.
  5. On the Client Access Point page, in the Name box, type a NetBIOS name for the Scale-Out File Server, and then click Next.
  6. On the Confirmation page, confirm your settings, and then click Next.
  7. On the Summary page, click Finish.

Create Continuously Available File Share

  1. Open Failover Cluster Manager>Expand the cluster, and then click Roles.
  2. Right-click the file server role, and then click Add File Share.
  3. On the Select the profile for this share page, click SMB Share – Applications, and then click Next.
  4. On the Select the server and path for this share page, click the name of the cluster shared volume, and then click Next.
  5. On the Specify share name page, in the Share name box, type a name for the file share, and then click Next.
  6. On the Configure share settings page, ensure that the Continuously Available check box is selected, and then click Next.
  7. On the Specify permissions to control access page, click Customize permissions, grant the following permissions, and then click Next:
  • To use Scale-Out File Server file share for Hyper-V: All Hyper-V computer accounts, the SYSTEM account, cluster computer account for any Hyper-V clusters, and all Hyper-V administrators must be granted full control on the share and the file system.
  • To use Scale-Out File Server on Microsoft SQL Server: The SQL Server service account must be granted full control on the share and the file system

  8. On the Confirm selections page, click Create. On the View results page, click Close.

Use SOFS for Hyper-V Server VHDX Store:

  1. Click Start, and then click Hyper-V Manager.
  2. Open Hyper-V Settings > Virtual Hard Disks and specify the store location as \\SOFS\VHDShare\. Then specify the location of the virtual machine configuration files as \\SOFS\VHDCShare.
  3. Click OK.

Use SOFS in System Center VMM: 

  1. Add Windows File Server in VMM
  2. Assign SOFS Share to Fabric & Hosts

Use SOFS for SQL Database Store:

1. Assign SQL Service Account Full permission to SOFS Share

  • Open Windows Explorer and navigate to the scale-out file share.
  • Right-click the folder, and then click Properties.
  • Click the Sharing tab, click Advanced Sharing, and then click Permissions.
  • Ensure that the SQL Server service account has full-control permissions.
  • Click OK twice.
  • Click the Security tab. Ensure that the SQL Server service account has full-control permissions.

2. In SQL Server 2012, you can choose to store all database files in a scale-out file share during installation.  

3. At step 20 of the SQL Server Setup wizard, provide the Scale-out File Server locations, which are \\SOFS\SQLData and \\SOFS\SQLLogs

4. Create a Database on SOFS Share but on the existing SQL Server using SQL Script

-- Create the database with its data file and log file on the SOFS shares
CREATE DATABASE [TestDB]
ON PRIMARY
( NAME = N'TestDB', FILENAME = N'\\SOFS\SQLDB\TestDB.mdf' )
LOG ON
( NAME = N'TestDBLog', FILENAME = N'\\SOFS\SQLDBLog\TestDBLogs.ldf' )
GO

Use Backup & Recovery:

System Center Data Protection Manager 2012 R2

Configure and add a dedupe storage target into DPM 2012 R2. DPM 2012 R2 will not back up SOFS itself, but it will back up VHDX files stored on SOFS. Follow the guide Deduplicate DPM storage and protection for virtual machines with SMB storage to back up virtual machines.

Veeam Availability Suite

  1. Log on to the Veeam Backup & Replication console, click Backup Repositories, and then right-click and select Add Backup Repository.
  2. Select Shared Folder on the Type tab.
  3. Add the SMB backup target \\SOFS\Repository.
  4. Follow the wizard. Make sure the Veeam service account has full access permission to the \\SOFS\Repository share.
  5. Click Scale-out Repositories, right-click and select Add Scale-out Backup Repository, and then type the name.
  6. Select the backup repository you created in the previous steps, and then follow the wizard to complete the task.

References:

Microsoft Storage Architecture

Storage Spaces Physical Disk Validation Script

Validate Hardware

Deploy Clustered Storage Spaces

Storage Spaces Tiering in Windows Server 2012 R2

SMB Transparent Failover

Cluster Shared Volume (CSV) Inside Out

Storage Spaces – Designing for Performance

Data Deduplication in Windows Storage Server 2012 R2

Deduplication in Windows Server: Data deduplication involves finding and removing duplication within data without compromising its fidelity or integrity. The goal is to store more data in less space by segmenting files into small variable-sized chunks (32–128 KB), identifying duplicate chunks, and maintaining a single copy of each chunk. Redundant copies of the chunk are replaced by a reference to the single copy. The chunks are compressed and then organized into special container files in the System Volume Information folder.
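
The chunk-and-reference idea can be shown in a few lines of code. The following Python sketch is a deliberately simplified model, not the Windows Server implementation: it uses small fixed-size chunks instead of variable-sized 32–128 KB chunks, identifies chunks by their SHA-256 hash, and skips compression and container files. The function names deduplicate and rehydrate are illustrative assumptions.

```python
import hashlib


def deduplicate(data: bytes, chunk_size: int = 4):
    """Split data into fixed-size chunks and keep a single copy of each chunk."""
    store: dict[str, bytes] = {}    # chunk hash -> the one stored copy
    references: list[str] = []      # ordered hashes that describe the original data
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)   # store the chunk only if it is new
        references.append(digest)         # duplicates become references
    return store, references


def rehydrate(store: dict[str, bytes], references: list[str]) -> bytes:
    """Rebuild the original data by following the references."""
    return b"".join(store[digest] for digest in references)


original = b"ABCDABCDABCDWXYZ"
store, refs = deduplicate(original)

print(len(refs), len(store))  # 4 2
```

Four chunks are referenced but only two are stored, and rehydrate(store, refs) returns the original bytes unchanged, which is the fidelity guarantee described above.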

Enhanced Dedupe features in Windows Server 2012 R2

  • Data deduplication for remote storage of Virtual Desktop Infrastructure (VDI) workloads
  • Expand an optimized file on its original path.

When using the Data Deduplication feature for the first time or migrating from a previous version of Windows Server, be sure to consider the following related technologies and issues:

  • BranchCache
  • Failover Clusters
  • DFS Replication
  • FSRM quotas
  • Single Instance Storage or NAS Box

Install and Configure Data Deduplication using GUI

1. Open Server Manager. In the Add Roles and Features Wizard, under Server Roles, select File and Storage Services.

2. Select the File Services check box, and then select the Data Deduplication check box.

3. Click Next until the Install button is active, and then click Install.

4. From the Server Manager dashboard, right-click a data volume and choose Configure Data Deduplication. The Deduplication Settings page appears.

5. In the Data deduplication box, select the workload you want to host on the volume. Select General purpose file server for general data files or Virtual Desktop Infrastructure (VDI) server when configuring storage for running virtual machines.

6. Enter the number of days that should elapse from the date of file creation until files are deduplicated, enter the extensions of any file types that should not be deduplicated, and then click Add to browse to any folders with files that should not be deduplicated.

7. Click Apply to apply these settings and return to the Server Manager dashboard, or click the Set Deduplication Schedule button to continue to set up a schedule for deduplication.

Install and Configure Data Deduplication using Windows PowerShell

Start Windows PowerShell. Right-click the Windows PowerShell icon on the taskbar, and then click Run as Administrator.

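# Install the Data Deduplication feature
Import-Module ServerManager
Add-WindowsFeature -Name FS-Data-Deduplication

Import-Module Deduplication

# Enable deduplication on volume E: with one usage type:
# HyperV for VDI workloads, or Default for a general purpose file server
Enable-DedupVolume -Volume E: -UsageType HyperV

# Enable-DedupVolume -Volume E: -UsageType Default

# Deduplicate only files that are at least 20 days old
Set-DedupVolume -Volume E: -MinimumFileAgeDays 20

Get-DedupVolume | Format-List

# Run an optimization job now and wait for it to finish
Start-DedupJob -Volume E: -Type Optimization -Wait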

References:

Windows Server 2012 R2 NAS Box with Deduplication Capacity

Introduction to Windows Deduplication

Windows PowerShell Cmdlet for Deduplication