SSD will fail at 40k power-on hours (2021)

Notice

THIS FIELD NOTICE IS PROVIDED ON AN “AS IS” BASIS AND DOES NOT IMPLY ANY KIND OF GUARANTEE OR WARRANTY, INCLUDING THE WARRANTY OF MERCHANTABILITY. YOUR USE OF THE INFORMATION ON THE FIELD NOTICE OR MATERIALS LINKED FROM THE FIELD NOTICE IS AT YOUR OWN RISK. CISCO RESERVES THE RIGHT TO CHANGE OR UPDATE THIS FIELD NOTICE AT ANY TIME.

Revision History

Revision  Publish Date  Comments
1.0       23-Apr-20     Initial Release
1.1       28-Apr-20     Updated the Workaround/Solution Section
1.2       04-Aug-20     Update to emphasize software must be updated to protect drives moving forward
1.3       05-Mar-21     Updated the Background, Problem Symptom, and Workaround/Solution Sections
1.4       06-Apr-21     Updated the Workaround/Solution and How to Identify Affected Products Sections

Products Affected

Affected Product ID   Comments
UCS-SD400G12S4-EP
UCS-SD400G12S4-EP=    Part Alternate
UCS-C3X60-12G240=     Part Alternate
UCS-C3X60-12G2160=    Part Alternate
UCS-C3X60-12G2160
UCS-C3X60-12G240
UCS-SD16TB12S4-EP
UCS-SD16TB12S4-EP=    Part Alternate
VDS-SD400G12S4-EP
TA-SD400G12S4-EP
CSP-SD400G12S4-EP=
CSP-SD400G12S4-EP
UCS-SP-SD-1P6TB
UCS-SD16TG1KHY-EP
UCS-SD400G1KHY-EP
UCS-SD16TG1KHY-EP=    Part Alternate
UCS-SD400G1KHY-EP=    Part Alternate
CSP-SD16TB12S4-EP=
CSP-SD16TB12S4-EP
V2P-SD16TB12S4-EP
ULTM-SD16TB12S4EP
TA-SD400G12S4-EP=
TA-SD400G12S4-E-OP

Defect Information

Defect ID Headline
CSCvt55829 SSDs will experience data loss at 40k power on hours

Problem Description

Certain Solid State Drive (SSD) models will experience data loss.

Background

Because of an industry-wide firmware index bug, a drive that operates for 40,000 hours will encounter an invalid index and cease to function. The firmware checks whether the index has passed N, but the value can reach N+1. This does not impact the wear specification of the drive.

Problem Symptom

Under normal operation, after 40,000 power-on hours (approximately 4.5 years of continuous operation), the SSD will report that 0 GB of available storage space remains. The drive will go offline and become unusable.

Caution: If the SSD reaches the 40,000 power-on hours mark, the drive will be unusable and requires replacement. It is critical that customers upgrade the firmware in order to avoid this issue.
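
The 4.5-year figure follows from dividing 40,000 hours by the 8,760 hours in a 365-day year. As a minimal sketch of that arithmetic (the function name and the sample reading are hypothetical and not part of the field notice), the margin left on a drive could be estimated like this:

```python
FAILURE_THRESHOLD_HOURS = 40_000  # power-on hours at which affected drives fail (from this notice)
HOURS_PER_YEAR = 24 * 365         # 8,760; leap days ignored for a rough estimate


def hours_until_failure(power_on_hours: int) -> int:
    """Return the power-on hours left before the 40,000-hour mark (never negative)."""
    return max(FAILURE_THRESHOLD_HOURS - power_on_hours, 0)


if __name__ == "__main__":
    sample_poh = 35_000  # hypothetical reading from one drive
    remaining = hours_until_failure(sample_poh)
    print(f"Threshold in years: {FAILURE_THRESHOLD_HOURS / HOURS_PER_YEAR:.2f}")
    print(f"Hours remaining: {remaining} (about {remaining / 24:.0f} days powered on)")
```

A result of 0 means the drive has already reached the threshold. Because the drive cannot be recovered after that point, the firmware upgrade should be scheduled well before the remaining margin runs out.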

Workaround/Solution

Systems with the impacted PIDs must upgrade to the software below in order to avoid the issue.

Installation of the latest firmware is required to resolve this issue. The fixed firmware version is C405.

The firmware to correct this issue is contained in these releases:

Note: This list will be updated with the initial major releases that contain this firmware as they become available. For directions on how to upgrade your system, see the appropriate Install and Upgrade Guide.

Note: For C-series standalone, please be sure to update the HDD firmware using the Host Upgrade Utility (HUU). For UCSM Managed B-Series and C-Series, please be sure that the Host Firmware Package (HFP) configured on the Service Profile does not exclude Local Disk.

Note: Release 4.1(1c) was previously listed as a fixed release. However, since it has been deferred, Release 4.1(1d) is listed as the first viable release to upgrade to in that train in order to be immune to this issue.

Replacement Drives

Proactive replacement is not available for this issue. A firmware update is required.

Failed drives can be replaced with an alternative Product Identifier (PID).

  • UCS-SD16TSASS3-EP
  • UCS-SD32TSASS3-EP
  • UCS-C3K-3XTSSD16
  • UCS-C3K-3XTSSD32

In order to enable management visibility of these replacement drive PIDs, you must update the capability catalog. For UCS Manager, update the catalog to Release 3.2(3i)T or 4.0(1c)T. For Cisco Integrated Management Controller (IMC), update to Release 4.0(1c) firmware or later.

Without the appropriate catalog update, the drive will be operational, but it will report an Invalid FRU fault and its management properties will not be visible.

How To Identify Affected Products

UCSM Managed Devices

For devices that are managed by UCSM, determine the firmware of the drives from the Equipment > Firmware Management > Installed Firmware section of the UCSM GUI.

In order to query across your entire environment, use the Visore explorer at https://[virtualIP]/visore.html and query for the class or DN storageLocalDisk with the property model. Search once for LT0400MO and again for LT1600MO.
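
If the storageLocalDisk results are exported for offline review, the same two searches can be applied in one pass. The sketch below assumes the export has already been turned into a list of dictionaries with `dn` and `model` keys; that record layout, the function name, and the sample values are assumptions made for illustration and are not a UCSM API.

```python
AFFECTED_MODEL_STRINGS = ("LT0400MO", "LT1600MO")  # search strings from this notice


def find_affected(disks):
    """Return the records whose 'model' contains one of the affected model strings."""
    return [
        disk for disk in disks
        if any(s in disk.get("model", "") for s in AFFECTED_MODEL_STRINGS)
    ]


if __name__ == "__main__":
    # Hypothetical records; the dn values and the non-affected model are made up.
    inventory = [
        {"dn": "disk-1", "model": "LT0400MO"},
        {"dn": "disk-2", "model": "EXAMPLE-MODEL"},
        {"dn": "disk-3", "model": "LT1600MO"},
    ]
    for disk in find_affected(inventory):
        print(disk["dn"], disk["model"])
```

A substring match is used rather than an exact comparison so that a model value with a suffix after the search string is still caught.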

C-Series Standalone Devices

Use the Cisco IMC in order to view installed drives. For RAID controller connected drives, go to Storage > RAID controller > Physical Drive info > Inquiry Data.

Command Line Interface (CLI)

Use the CLI in order to obtain the model and running versions. For example, to identify devices affected use the command below.

Via Intersight

Servers that are claimed in Intersight and with Essentials licensing will benefit from the ability to view server details within Intersight. In order to view the inventory, go to Servers > (select Server) > Inventory > Select applicable storage controller.

With Advantage licensing, Intersight Advisories will show all affected devices in one place. In order to access this, click the Advisories link at the top of the page.

How To Determine Power On Hours for Affected Drives

In order to identify the affected Solid State Drives (SSDs), you will need to download third-party utilities that allow you to see the Power-On Hours (POH) and, in some cases, the model number of the affected SSD. Instructions on where to get the utilities and how to use them are below.

Where to get the utilities

There are four different utilities to choose from, depending on your OS and needs. Please review the table below.

Note: SmartMonTools does not work in RAID for ESXi and Windows. The sg3_utils and Sandisk Tool do not work in RAID for all OSes.

When you use SmartMonTools to check Power On Hours, you must first run a short self-test on the target drive. The short self-test lasts several seconds, and the I/O throughput of the drive decreases while the test runs.

Steps on how to use each utility

General

Each utility requires some knowledge on how to install software on Linux, VMware, and Windows. Be sure to read any readme files before you install.

JBOD Mode

SmartMonTools for Windows – JBOD mode

Note: If you use a RAID controller, you cannot collect this data through Windows.

  1. Use the link that is provided in the table above in order to download the smartmontools utility.
  2. Install the utility:
    1. Get the smartctl Windows setup file through the link above.
    2. Run the setup file ‘smartmontools-7.1-1.win32-setup.exe’.
    3. Open a command prompt window.
    4. Go to the folder ‘C:\Program Files\smartmontools\bin’.
  3. Check the drive firmware version:
    1. Run this command in order to get the device name of the target drive:
      smartctl --scan
    2. Run this command in order to read the drive firmware version:
      smartctl -i /dev/sdc
  4. Use the ‘smartctl’ utility within the smartmontools package in order to check Power On Hours:
    1. Open a command prompt window.
    2. Go to the smartmontools directory.
    3. Run this command in order to show the list of SSDs:
      smartctl.exe --scan
    4. In order to get the output needed, run the first command below (where X is the drive letter of the SSD that you want to check), WAIT 10 SECONDS, and then run the second command:
      smartctl -t short /dev/sdX
      smartctl -l selftest /dev/sdX
      See the screenshot below for examples.
    5. Look for the “Lifetime” hours on the first row. That is the latest record of Power On Hours.

SmartMonTools for Linux – JBOD Mode

  1. Use the link that is provided in the table above in order to download the smartmontools utility.
  2. Install the utility:
    1. Use this command in order to untar the installation file:
      tar -zxvf smartmontools-7.1.tar.gz
    2. Go to the folder ‘smartmontools-7.1’.
    3. Run these three commands in order:
      ./configure
      make
      make install
  3. Run this command in order to check the drive firmware version (where ‘sdb’ is the device name of the target drive):
    smartctl -i /dev/sdb
  4. Use the ‘smartctl’ utility within the smartmontools package in order to check Power On Hours (POH):
    1. Open a command prompt window.
    2. Go to the smartmontools directory.
    3. Run this command in order to show the list of SSDs:
      lsscsi
    4. In order to get the output needed, run the first command below (where X is the drive letter of the SSD you want to check), WAIT 10 SECONDS, and then run the second command:
      smartctl -t short /dev/sdX
      smartctl -l selftest /dev/sdX

      See the screenshot below for examples.
    5. Look for the “Lifetime” hours on the first row. That is the latest record of Power On Hours.
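
The two-command sequence with the ten-second pause can be scripted when several drives need to be checked. This is a sketch only: it assumes `smartctl` is on the PATH and that the script runs with sufficient privileges, and the function names are hypothetical. It uses only the `-t short` and `-l selftest` options shown in the steps above.

```python
import subprocess
import time


def selftest_commands(device):
    """Build the two smartctl invocations described above for one device."""
    start_test = ["smartctl", "-t", "short", device]
    read_log = ["smartctl", "-l", "selftest", device]
    return start_test, read_log


def run_short_selftest(device, wait_seconds=10):
    """Start a short self-test, wait, then return the self-test log text."""
    start_test, read_log = selftest_commands(device)
    # smartctl reports status as a bit mask in its exit code, so a
    # non-zero code is not treated as a fatal error in this sketch.
    subprocess.run(start_test, check=False)
    time.sleep(wait_seconds)
    result = subprocess.run(read_log, capture_output=True, text=True, check=False)
    return result.stdout
```

Calling `print(run_short_selftest("/dev/sdb"))` would print the self-test log for the drive, and the “Lifetime” hours can then be read from the first row as described in step 5. Keep in mind that the self-test reduces the I/O throughput of the drive while it runs.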

SmartMonTools for ESXi – JBOD Mode

  1. Use the link that is provided in the table above in order to download the smartmontools utility.
  2. Install the utility:
    1. Enable shell and SSH on the ESXi host.
    2. Use FTP in order to upload the file ‘smartctl-6.6-4321.x86_64.vib’ to the folder ‘tmp’ on the ESXi host.
    3. Use SSH in order to connect to the ESXi host.
    4. Set the VIB acceptance level to CommunitySupported:
      esxcli software acceptance set --level=CommunitySupported
    5. Install the package:
      esxcli software vib install -v /tmp/smartctl-6.6-4321.x86_64.vib
  3. Check the drive firmware version:
    1. Use SSH in order to connect to the ESXi host.
    2. Run this command in order to get the device name and firmware version of the target drive:
      esxcli storage core device list
  4. Check Power On Hours (POH):
    1. Go to the smartmontools directory.
    2. Run this command in order to show the list of SSDs:
      esxcli storage core device list
    3. In order to get the output needed, run the first command below (where naa.xxx is the device identifier of the SSD you want to check), WAIT 10 SECONDS, and then run the second command:
      /opt/smartmontools/smartctl -d scsi -t short /dev/disks/naa.xxx
      /opt/smartmontools/smartctl -d scsi -l selftest /dev/disks/naa.xxx

      See the screenshot below for examples.
    4. Look for the “Lifetime” hours on the first row. That is the latest record of Power On Hours.

JBOD Mode – Using sg3_utils

Sg3_utils for Windows – JBOD mode

  1. Use the link that is provided in the table above in order to download the sg3_utils utility.
  2. Install the utility:
    1. Copy the package ‘sg3_utils-1.45mgw64.zip’ to the folder ‘C:\’.
    2. Unzip the package.
    3. Open a command prompt window.
    4. Go to the folder ‘C:\sg3_utils-1.45mgw64’.
  3. Check the drive firmware version:
    1. Go to the folder ‘C:\sg3_utils’.
    2. Run this command in order to get the device name and firmware version of the target drive:
      sg_scan
  4. Check Power On Hours (POH):
    1. Go to the sg3_utils directory.
    2. Run this command in order to show the list of SSDs:
      sg_scan
    3. In order to get the output needed, run this command (where X is the drive letter of the SSD you want to check):
      sg_logs --page=0x15 pdX
      See the screenshot below for examples.
    4. Look for the “Accumulated power on minutes”.

Sg3_utils for Linux – JBOD mode

  1. Use the link that is provided in the table above in order to download the sg3_utils utility.
  2. Install the utility:
    1. Use this command in order to untar the installation file:
      tar -zxvf sg3_utils-1.45.tgz
    2. Go to the folder ‘sg3_utils-1.45’.
    3. Run these three commands in order:
      ./configure
      make
      make install
  3. Run this command in order to check the drive firmware version (where ‘sdb’ is the device name of the target drive):
    sg_logs --page=0x33 /dev/sdb
  4. Check Power On Hours (POH):
    1. In order to get the output needed, run this command (where X is the drive letter of the SSD you want to check):
      sg_logs --page=0x15 /dev/sdX
      See the screenshot below for examples.
    2. Look for the “Accumulated power on minutes”.
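
Because sg_logs reports minutes rather than hours, the value must be divided by 60 before it is compared with the 40,000-hour threshold. The sketch below pulls the number that follows the “Accumulated power on minutes” label out of captured output. The exact layout of the line varies, so the pattern only assumes that the first number after the label is the minute count; the function name and sample text are hypothetical.

```python
import re

FAILURE_THRESHOLD_HOURS = 40_000  # from this notice


def power_on_hours_from_sg_logs(output):
    """Extract 'Accumulated power on minutes' from sg_logs text as whole hours.

    Returns None when the field is not found.
    """
    match = re.search(r"Accumulated power on minutes\D*(\d+)", output)
    if match is None:
        return None
    return int(match.group(1)) // 60


if __name__ == "__main__":
    sample = "Accumulated power on minutes: 2100000"  # hypothetical sample line
    hours = power_on_hours_from_sg_logs(sample)
    print(hours, "hours;", FAILURE_THRESHOLD_HOURS - hours, "hours before the threshold")
```

Integer division rounds down, so the reported hours can trail the true value by up to 59 minutes, which is negligible against a 40,000-hour limit.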

JBOD Mode – Using Sandisk Tools

Sandisk Tool for Windows – JBOD mode

  1. Use the link that is provided in the table above in order to download the Sandisk tools.
  2. Install the tools:
    1. Unzip the scli Windows package file.
    2. Go to the folder ‘sandisk_scli-1-8-0-12-windows\Windows_1.8.0.12\64’.
    3. Run the setup file ‘scli-1.8.0.12-64.exe’.
  3. Check the drive firmware version:
    1. Open the ‘scli_64’ shortcut on the Desktop and run the command “scli show all” to get the device name of target drive.
    2. Open the ‘scli_64’ shortcut on the Desktop and run the command “scli show -a” to get the firmware version of target drive.
  4. Check Power On Hours (POH):
    1. Run this command in order to show the list of SSDs:
      scli show all
    2. In order to get the output needed, run this command (where X is the drive letter of the SSD you want to check):
      scli show diskX -S
      See the screenshot below for examples.
    3. Look for the “Total Power on Hours”.

Sandisk Tool for Linux – JBOD mode

  1. Use the link that is provided in the table above in order to download the Sandisk tools.
  2. Install the tools:
    1. Unzip the installation file.
    2. Go to the folder ‘Linux_1.8.0.12/generic/x86_64’.
    3. Run this command in order to allow ‘scli’ to be executable:
      chmod +x scli
    4. Use this command in order to untar the installation file:
      tar -zxvf smartmontools-7.1.tar.gz
    5. Run these three commands in order:
      ./configure
      make
      make install
  3. Run this command in order to check the drive firmware version (where ‘sdb’ is the device name of the target drive):
    ./scli show /dev/sdb -a
  4. Check Power On Hours (POH):
    1. In order to get the output needed, run this command (where X is the drive letter of the SSD you want to check):
      ./scli show /dev/sdX -S
      See the screenshot below for examples.
    2. Look for the “Total Power on Hours”.

RAID Mode

SmartMonTools for Linux – RAID mode

Note: In order to collect the required data, you must install both SmartMonTools and the storcli utility.

  1. Use the link that is provided in the table above in order to download the smartmontools utility.
  2. Install the utility:
    1. Use this command in order to untar the installation file:
      tar -zxvf smartmontools-7.1.tar.gz
    2. Go to the folder ‘smartmontools-7.1’.
    3. Run these three commands in order:
      ./configure
      make
      make install
  3. Use the link that is provided in the table above in order to download and install the storcli utility.
  4. Check the drive firmware version:
    1. From the storcli directory, run this command:
      storcli /c0/eall/sall show
    2. Look for the Device ID (DID) and make a note of it. It will be needed in future steps.
  5. Run this command in order to check the drive firmware version. (In the example below, ‘148’ is the device ID (DID) of the target drive, and ‘sdc’ is the device name.):
    smartctl -d megaraid,148 -i /dev/sdc
  6. Use the ‘smartctl’ utility within the smartmontools package in order to check Power On Hours (POH):
    1. Open a command prompt window.
    2. Go to the smartmontools directory.
    3. After you identify the SSD to check, run this command (where N is the Device ID that you made a note of earlier):
      (Note: In order for this command to work on a RAID set, you must use the ‘megaraid’ switch as shown in the example.)
      smartctl -d megaraid,N -t short /dev/sdX
    4. Wait ten seconds and then run this command (where N is the Device ID that you made a note of earlier):
      smartctl -d megaraid,N -l selftest /dev/sdX
      See the screenshot below for examples.
    5. Look for the “Lifetime” hours on the first row. That is the latest record of Power On Hours.
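
When several drives sit behind the same RAID controller, the command pair has to be repeated once per Device ID. As a sketch under the assumption that the DIDs have already been noted from the storcli output, the helper below (a hypothetical name) only builds the command lines shown in the steps above; it does not run them.

```python
def megaraid_commands(device, device_ids):
    """Return (start_test, read_log) argument lists for each Device ID (DID)."""
    pairs = []
    for did in device_ids:
        selector = f"megaraid,{did}"  # the 'megaraid' switch required on a RAID set
        pairs.append((
            ["smartctl", "-d", selector, "-t", "short", device],
            ["smartctl", "-d", selector, "-l", "selftest", device],
        ))
    return pairs


if __name__ == "__main__":
    # 148 and sdc are the example DID and device name used in this section.
    for start_test, read_log in megaraid_commands("/dev/sdc", [148]):
        print(" ".join(start_test))
        print(" ".join(read_log))
```

The ten-second wait between the first and second command of each pair still applies when the commands are run.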

Note: SmartMonTools does not work in RAID for ESXi and Windows. The sg3_utils and Sandisk Tool do not work in RAID for all OSes.

For More Information

If you require additional assistance, or if you have any questions about this field notice, contact the Cisco Systems Technical Assistance Center (TAC) by one of these methods:

Receive Email Notification For New Field Notices

My Notifications—Set up a profile to receive email updates about reliability, safety, network security, and end-of-sale issues for the Cisco products you specify.
