Deduplication/Compression

The Deduplication/Compression function checks the write data from the server for duplicates in units of 8KB and writes duplicated data only once. After the first write, later writes of the same data are stored as references to the existing data instead of being written again, which reduces the total write size. The Compression function reduces the data size further.

The Deduplication/Compression function can perform deduplication and compression together, or perform only deduplication or only compression.

Overviews of the Deduplication/Compression function, the Deduplication function, and the Compression function are described below.

Deduplication/Compression Function

This function removes duplicate data blocks, compresses the remaining data blocks, and then stores the data.

Figure: Deduplication/Compression Overview

Deduplication Function

This function removes duplicate data blocks and stores the data.

Figure: Deduplication Overview

Compression Function

This function compresses each data block and stores the data.

Figure: Compression Overview
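
The three modes described above can be illustrated with a small in-memory model. This is only a sketch for understanding the concept, not the ETERNUS DX implementation: the class name `ToyReductionStore`, the use of SHA-256 to detect duplicates, and the use of zlib for compression are all assumptions made for the example. Only the 8KB block unit comes from the text.

```python
import hashlib
import zlib

BLOCK_SIZE = 8 * 1024  # 8KB unit named in the text


class ToyReductionStore:
    """Toy in-memory store (hypothetical; not the ETERNUS DX implementation)."""

    def __init__(self, deduplicate=True, compress=True):
        self.deduplicate = deduplicate
        self.compress = compress
        self.blocks = []   # stored payloads, append-only
        self.index = {}    # SHA-256 digest -> position in self.blocks

    def write(self, data):
        """Store data in 8KB blocks and return one reference per block."""
        refs = []
        for start in range(0, len(data), BLOCK_SIZE):
            block = data[start:start + BLOCK_SIZE]
            digest = hashlib.sha256(block).digest()
            if self.deduplicate and digest in self.index:
                # Duplicate block: reference the copy that is already stored.
                refs.append(self.index[digest])
                continue
            payload = zlib.compress(block) if self.compress else block
            self.blocks.append(payload)
            position = len(self.blocks) - 1
            if self.deduplicate:
                self.index[digest] = position
            refs.append(position)
        return refs

    def read(self, refs):
        """Rebuild the original data from block references."""
        parts = [self.blocks[r] for r in refs]
        if self.compress:
            parts = [zlib.decompress(p) for p in parts]
        return b"".join(parts)

    def stored_bytes(self):
        """Total size of the payloads that were actually stored."""
        return sum(len(p) for p in self.blocks)
```

Passing `deduplicate=False` or `compress=False` to the constructor models the Compression-only and Deduplication-only modes, respectively.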

The following tables provide the specifications for the Deduplication/Compression function.

Table: Deduplication/Compression Function Specifications (ETERNUS DX600 S6)

| Model | ETERNUS DX600 S6 |
|---|---|
| Number of TPPs available for Deduplication/Compression settings | 8 |
| Maximum logical capacity that can be a deduplication/compression target | Up to ten times the DATA_CNTNR Volume (*1) |
| Maximum logical capacity of the DATA_CNTNR Volume (*2), if the chunk size is 21MB | 1,536TB |
| Maximum logical capacity of the DATA_CNTNR Volume (*2), if the chunk size is 1,344MB | 48PB |
| Volume type: TPV | ○ (supported) |
| Volume type: Standard / FTV / WSV / SDV / SDPV / VVOL / ODX | × (not supported) |
| System memory capacity (per CE) | 128GB or more |

*1: When a Deduplication/Compression Volume is created or expanded, expand the DATA_CNTNR Volume according to the total capacity of the Deduplication/Compression Volumes. If the efficiency of the Deduplication/Compression function cannot be estimated, keeping the total capacity of the Deduplication/Compression Volumes smaller than the logical capacity of the DATA_CNTNR Volume is recommended.

*2: When the Deduplication/Compression function is used, the total capacity of the Deduplication/Compression Volumes can be equal to or greater than the capacity of the DATA_CNTNR Volume. If deduplication/compression is not effective in this case, a write operation to a Deduplication/Compression Volume may fail due to a capacity shortage of the DATA_CNTNR Volume. The default logical capacity of the DATA_CNTNR Volume is 32TB.

In addition, if the DATA_CNTNR Volume capacity runs out or is close to running out, a notification is sent by SNMP trap, e-mail, or syslog. The notification method is set with the "Setup Event Notification" function. The notification target event is "Thin Provisioning Pool Rate". For more details, refer to "Web GUI User's Guide".
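
The relationship between the total capacity of the Deduplication/Compression Volumes and the DATA_CNTNR Volume can be checked with simple arithmetic. The following is a minimal planning sketch, not a tool provided with the ETERNUS DX: the function name `check_volume_capacity` and its return keys are hypothetical. The factor of ten and the 32TB default are taken from the table and notes above.

```python
def check_volume_capacity(container_tb, volume_capacities_tb, ratio_limit=10):
    """Compare the total logical capacity of the volumes with the DATA_CNTNR Volume.

    container_tb: logical capacity of the DATA_CNTNR Volume in TB
    volume_capacities_tb: logical capacities of the individual volumes in TB
    ratio_limit: maximum ratio from the specification table (ten times)
    """
    total = sum(volume_capacities_tb)
    return {
        "total_tb": total,
        # Specification limit: at most ten times the DATA_CNTNR Volume.
        "within_limit": total <= ratio_limit * container_tb,
        # If True, writes can only succeed while data reduction is effective,
        # so the notes above recommend expanding the DATA_CNTNR Volume when
        # the reduction efficiency cannot be estimated.
        "relies_on_reduction": total > container_tb,
    }


# Example with the default 32TB DATA_CNTNR Volume and two 100TB volumes.
result = check_volume_capacity(32, [100, 100])
```

In this example the total of 200TB is within the tenfold limit, but it exceeds the 32TB of the DATA_CNTNR Volume, so the configuration depends on deduplication/compression being effective.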

Table: Compression Function Specifications (ETERNUS DX900 S6 and ETERNUS DX8900 S6)

| Model | ETERNUS DX900 S6 and ETERNUS DX8900 S6 |
|---|---|
| Number of TPPs available for Compression settings | 8 |
| Maximum logical capacity that can be a compression target | Up to ten times the DATA_CNTNR Volume (*1) |
| Maximum logical capacity of the DATA_CNTNR Volume (*2), if the chunk size is 21MB | 8PB |
| Maximum logical capacity of the DATA_CNTNR Volume (*2), if the chunk size is 1,344MB | 48PB |
| Volume type: TPV | ○ (supported) |
| Volume type: Standard / FTV / WSV / SDV / SDPV / VVOL / ODX | × (not supported) |
| System memory capacity (per CE) | 1,024GB or more |

*1: When a Compression Volume is created or expanded, expand the DATA_CNTNR Volume according to the total capacity of the Compression Volumes. If the efficiency of the Compression function cannot be estimated, keeping the total capacity of the Compression Volumes smaller than the logical capacity of the DATA_CNTNR Volume is recommended.

*2: The Compression function can create Compression Volumes whose capacity is equal to or larger than that of the DATA_CNTNR Volume. In environments where compression is not effective, a write operation to the Compression Volume may fail due to a capacity shortage of the DATA_CNTNR Volume. The default logical capacity of the DATA_CNTNR Volume is 32TB.

In addition, if the DATA_CNTNR Volume capacity runs out or is close to running out, a notification is sent by SNMP trap, e-mail, or syslog. The notification method is set with the "Setup Event Notification" function. The notification target event is "Thin Provisioning Pool Rate". For more details, refer to "Web GUI User's Guide".

Performance When Using the Deduplication/Compression Function

The ETERNUS DX performs data deduplication/compression in synchronization with the I/O from the server.

  • Using this function in environments where random access occurs is recommended. Data is intermittently stored in Deduplication/Compression Volumes because data is appended.

  • I/O response may significantly degrade when compared to systems that do not use the Deduplication/Compression function.

  • Using this function in environments where the I/O size is 32KB or smaller is recommended. The performance is affected in environments where the I/O size is large because data is deduplicated and compressed every 8KB.

  • If the I/O size or the I/O address boundaries are not multiples of 8KB, the performance is affected because the ETERNUS DX must internally read the parts that are smaller than 8KB.

  • If I/Os are issued to the Deduplication/Compression Volumes, the CPU usage rate increases. The performance of non-Deduplication/Compression Volumes may also be affected.

  • The controller that controls the data reduction process of the Deduplication/Compression function is called the data reduction CM. By changing the data reduction CM assigned to a Deduplication/Compression Volume, the memory usage and the load of the data reduction process can be balanced across the CMs.
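
The effect of 8KB boundaries on an I/O can be checked with a short calculation. The sketch below is an illustration only: the function names `is_aligned` and `partial_block_bytes` are hypothetical, and the model simply counts how many bytes of an I/O cover only part of an 8KB block, which are the parts that the text above says must be read inside the ETERNUS DX.

```python
BLOCK = 8 * 1024  # 8KB deduplication/compression unit from the text


def is_aligned(offset, length):
    """True if the I/O starts and ends on 8KB boundaries."""
    return offset % BLOCK == 0 and length % BLOCK == 0


def partial_block_bytes(offset, length):
    """Return (head, tail): bytes of the I/O that cover only part of an 8KB block.

    head is the unaligned part before the first 8KB boundary inside the I/O,
    tail is the unaligned part after the last 8KB boundary inside the I/O.
    """
    if length <= 0:
        return (0, 0)
    end = offset + length
    first_boundary = -(-offset // BLOCK) * BLOCK  # offset rounded up to 8KB
    last_boundary = end // BLOCK * BLOCK          # end rounded down to 8KB
    if first_boundary > last_boundary:
        # The whole I/O lies inside a single 8KB block.
        return (length, 0)
    return (first_boundary - offset, end - last_boundary)
```

For example, an 8KB write that starts at offset 4KB spans two blocks and returns `(4096, 4096)`, while the same write at offset 0 returns `(0, 0)`.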

Note
  • The performance may decline when the Deduplication/Compression function is enabled.

    Using the Deduplication/Compression function is not recommended for the following volumes.

    • Volumes that are used to store performance-sensitive data

    • Volumes that use drives other than SSDs

  • Batch process (or sequential access) performance significantly degrades because data is written to drives intermittently or because a large number of references and updates occur. In environments where sequential access occurs, using the Deduplication/Compression function is not recommended.

  • The Deduplication/Compression function becomes a disadvantage in terms of performance and capacity if a volume stores data such as videos to which deduplication/compression is not effective and the volume is set as a Deduplication/Compression Volume. Only enable either the Deduplication function or the Compression function.

Configuration Method

  • Enabling the Deduplication/Compression function

    From ETERNUS Web GUI or ETERNUS CLI, enable the Deduplication/Compression function for the TPP. Deduplication and compression can be enabled together, or the Deduplication function or the Compression function can be enabled individually.

    The Deduplication/Compression function can be enabled by performing one of the methods below.

    Table: Method for Enabling the Deduplication/Compression Function

    | Condition of the TPP | Chunk size (*1) | Creation method |
    |---|---|---|
    | New TPP (recommended) | 21MB | During the TPP creation, select the Automatic mode and specify the Deduplication/Compression function option to enable it |
    | New TPP (recommended) | 21MB (default) – 1,344MB (*2) | During the TPP creation, select the Manual mode and specify the Deduplication/Compression function option to enable it |
    | Existing TPP | 21MB – 1,344MB (*3) | Select a target TPP to enable the Deduplication/Compression function (*4) |

    *1: The chunk size can be checked in the list display of the TPP.

    *2: The chunk size can be changed only if the maximum pool capacity is selected. Select the chunk size (21MB – 1,344MB) from Advanced Settings. For normal operations, the default value (21MB) does not need to be changed.

    *3: The chunk size (21MB – 1,344MB) that was set when the TPP was created is maintained. The Deduplication/Compression function can also be enabled for TPPs whose chunk sizes are not 21MB.

    *4: If I/O load exists in the ETERNUS DX, enabling or disabling the Deduplication/Compression function in the TPP may take time. If I/O load exists, changing the setting of the Deduplication/Compression function one TPP at a time is recommended.

    Caution

    If the Deduplication/Compression function is enabled for TPPs, the efficiency of the physical capacity decreases as the chunk size increases. Therefore, when the capacity of Thin Provisioning is sufficient, not changing the maximum Thin Provisioning capacity is recommended if the change increases the chunk size.

  • Configuration method for the Deduplication/Compression function

    Select the TPP where the Deduplication/Compression function is enabled, and create Deduplication/Compression Volumes (TPVs) in the selected TPP.

    Specify whether to enable or disable the Deduplication/Compression function for each TPV. TPVs where the Deduplication/Compression function is enabled (Deduplication/Compression Volumes) and TPVs where it is disabled can exist together within the same TPP. However, locating these two types of TPVs in separate TPPs is recommended.

    Deduplication is performed for Deduplication/Compression Volumes within the same TPP. Deduplication is not performed for data in different TPPs. In some cases, deduplication might not be performed even within the same TPP.

    To enable the Deduplication/Compression function for existing volumes, use the RAID Migration function.

The type of volume that is created, and the Deduplication/Compression setting of the TPPs where that volume can be created, vary depending on the selection of "Deduplication" and "Compression".

  • Volumes that are to be created

    Table: Volumes That Are to Be Created depending on the Selection of "Deduplication" and "Compression"

    | Deduplication | Compression | Volumes that are to be created |
    |---|---|---|
    | Enable | Enable | Deduplication/Compression Volumes where both Deduplication and Compression are enabled |
    | Enable | Disable | Deduplication/Compression Volumes where only Deduplication is enabled |
    | Disable | Enable | Deduplication/Compression Volumes where only Compression is enabled |
    | Disable | Disable | TPVs for SAN where both Deduplication and Compression are disabled |

  • Deduplication/Compression setting for TPPs where the volumes can be created

    Table: Deduplication/Compression Setting for TPPs Where the Target Volumes Can Be Created

    The first two columns show the selection for the volume. The remaining columns show the Deduplication/Compression setting for the destination TPP.

    | Deduplication | Compression | Only Deduplication is enabled | Only Compression is enabled | Both Deduplication and Compression are enabled | Both Deduplication and Compression are disabled |
    |---|---|---|---|---|---|
    | Enable | Enable | × | × | ○ | × |
    | Enable | Disable | ○ | × | × | × |
    | Disable | Enable | × | ○ | × | × |
    | Disable | Disable | ○ | ○ | ○ | ○ |

    ○: Volumes can be created, ×: Volumes cannot be created
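
The rule expressed by the table above can be written as a short check: a volume with both functions disabled can be created in any TPP, and any other volume requires a TPP whose setting matches the volume's selection exactly. The function name `can_create_volume` and its parameters are hypothetical and exist only for this illustration.

```python
def can_create_volume(vol_dedup, vol_comp, tpp_dedup, tpp_comp):
    """Return True if the table above allows the volume in the destination TPP.

    vol_dedup / vol_comp: "Deduplication" / "Compression" selection for the volume
    tpp_dedup / tpp_comp: Deduplication / Compression setting of the destination TPP
    """
    if not vol_dedup and not vol_comp:
        # Volumes with both functions disabled can be created in any TPP.
        return True
    # Otherwise the TPP setting must match the volume selection exactly.
    return (vol_dedup, vol_comp) == (tpp_dedup, tpp_comp)


# Example: a Deduplication-only volume in a TPP where both functions are enabled.
allowed = can_create_volume(True, False, True, True)  # not allowed by the table
```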

Note

TPPs with the Deduplication/Compression function enabled have one of the following attributes: deduplication/compression, deduplication only, or compression only. The Deduplication/Compression Volume conforms to the attribute of the TPP where the volume is created. TPVs where the Deduplication/Compression function is enabled and disabled can exist together within each TPP.

Deduplication/Compression System Volumes

One DATA_CNTNR Volume is created for each TPP where the Deduplication/Compression function is enabled.

Before enabling the Deduplication/Compression function for a TPP, check that the remaining capacity is sufficient, because the DATA_CNTNR Volume is created within the maximum pool capacity.

Because the data after deduplication/compression is stored in the DATA_CNTNR Volume, add RAID groups to the TPP or expand the DATA_CNTNR Volume before the usage rate of the TPP or the usage rate of the DATA_CNTNR Volume reaches 100%.

The DATA_CNTNR Volume cannot be expanded to a capacity larger than the maximum pool capacity. If the capacity required for the DATA_CNTNR Volume would exceed the maximum pool capacity, use the RAID Migration function to migrate the Deduplication/Compression Volumes of the TPP to non-Deduplication/Compression Volumes (TPVs) or to another TPP. For details on the maximum logical capacity of the DATA_CNTNR Volume, refer to Table: Deduplication/Compression Function Specifications (ETERNUS DX600 S6) or Table: Compression Function Specifications (ETERNUS DX900 S6 and ETERNUS DX8900 S6).

In addition to the data after deduplication/compression, control information is written to the DATA_CNTNR Volume. The physical capacity used for the control information is the total of a fixed capacity (4GB maximum) and a variable capacity (1% to 15%) that depends on the size written from the server.
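
The range of physical capacity that the control information may use can be estimated from these figures. The following is a rough sketch under stated assumptions: the function name `control_info_range` is hypothetical, GB is treated as 1024³ bytes, the fixed part is assumed to lie anywhere between 0 and 4GB, and the variable part is assumed to be a percentage of the size written from the server.

```python
GB = 1024 ** 3  # assumption: binary units


def control_info_range(written_bytes, fixed_max=4 * GB, var_min_pct=1, var_max_pct=15):
    """Return (low, high) estimates in bytes for the control information.

    low  assumes no fixed part and the minimum variable rate (1%).
    high assumes the full fixed part (4GB) and the maximum variable rate (15%).
    """
    low = written_bytes * var_min_pct // 100
    high = fixed_max + written_bytes * var_max_pct // 100
    return low, high


# Example: 1,000GB written from the server.
low, high = control_info_range(1000 * GB)
```

For 1,000GB written from the server, this estimate gives a range of 10GB to 154GB, which shows that the variable part dominates for large write sizes.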

Deduplication/Compression Volumes

The physical capacity may temporarily be larger than the logical capacity that is written because data is appended in Deduplication/Compression Volumes. If the I/O load is high, the physical capacity may run out. Monitoring the physical capacity on a regular basis and enabling SNMP notifications are recommended.

As with normal TPVs, the size of the used areas that are recognized by the server may differ from the areas that are actually allocated if physical capacity is allocated to areas not used by the server.

In this case, perform one of the following area release operations.

  • Release unused areas by issuing the space reclamation (UNMAP) command from the server.

    For the procedure on issuing the UNMAP command, refer to the manuals of the relevant server or OS.

  • Zero out the unused areas and then execute "TPV/FTV capacity optimization" for the DATA_CNTNR Volume.

    For details, refer to TPV/FTV Capacity Optimization.

Caution
  • Releasing all the unused areas may take some time because the area release process releases the areas one by one after the command is issued or after the optimization is executed.

  • The area release operation may result in an incomplete release of the physically allocated areas depending on the storage system status, on the I/O load, or on whether the data in the released area is duplicated.

  • The deduplication rate may temporarily decrease due to a CM failure, a firmware update, or a power failure.

  • If a failure occurs in a RAID group that configures the TPP, or a bad sector occurs in the DATA_CNTNR Volume, the data of all the Deduplication/Compression Volumes in the TPP may be lost.

Functional Details

Figure: Details of the Deduplication/Compression Function
Caution

Note the following when using the Advanced Copy functions for Deduplication/Compression Volumes.

  • The CPU usage rate may increase depending on the EC/OPC priority setting, which may reduce I/O performance.

  • The copy performance may be significantly reduced when compared to non-Deduplication/Compression Volumes (TPVs).

  • When copying between ETERNUS DX storage systems, the data is sent to the copy destination without deduplication or compression applied. In addition, the bandwidth of the remote lines might not be fully utilized.

Operation of the Deduplication/Compression Volumes

The following table provides management functions for volumes related to Deduplication/Compression.

Table: Management Functions for Deduplication/Compression Volumes

| Function | Deduplication/Compression Volume | DATA_CNTNR Volume |
|---|---|---|
| Creation | ○ | × (*1) |
| Deletion | ○ | × (*2) |
| Rename | ○ | × |
| Format | ○ | ○ |
| TPV capacity expansion | ○ | ○ |
| RAID migration | ○ | × |
| Balancing | × | × |
| TPV/FTV capacity optimization | × | ○ |
| Modify threshold | ○ | × |
| Allocation settings | × | × |
| Encrypt volume (*3) | ○ | × |
| Advanced Copy function (local copy) | ○ | × |
| Advanced Copy function (remote copy) | ○ | × |
| Forbid Advanced Copy | ○ | × |
| Release reservation | ○ | × |
| Performance monitoring | ○ | × |
| ALUA settings | ○ | × |
| Modify cache parameters | × | ○ |
| LUN mapping | ○ | × |
| QoS | ○ | × |
| Create ODX Buffer volume | × | × |
| Storage Migration | ○ | × |
| Non-disruptive Storage Migration | ○ | × |
| Storage Cluster | × | × |
| Extreme Cache Pool | × | ○ |
| Veeam Storage Integration | × | × |

○: Available, ×: Not available

*1: Automatically created when the Deduplication/Compression function is enabled for TPPs.

*2: Automatically deleted when the Deduplication/Compression function is disabled for TPPs.

*3: Encryption can be performed by creating a volume in the encrypted pool, or migrating a volume to an encrypted pool.