Wednesday, July 28, 2021

Flash disk replacement in Exadata

Steps to Replace a Flash Disk in Oracle Exadata

Oracle Exadata is a powerful platform combining compute and storage nodes to deliver high-performance database services. One critical component of Exadata storage cells is the flash disk, which plays a key role in caching and accelerating I/O operations. A failure in flash modules (FMODs) can severely impact performance, making timely replacement essential.
This guide outlines the step-by-step process to identify and replace a failed flash disk in an Exadata cell node.

1. Identify the Faulty Cell Node

Log in to a compute node and run the following command to check the status of physical disks across all cell nodes:
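# dcli -g /opt/oracle.SupportTools/onecommand/cell_group -l root 'cellcli -e list physicaldisk'
Look for any disk whose status is not `normal`, for example `warning - poor performance`. dcli prefixes each output line with the name of the cell it came from, which tells you which cell node holds the faulty disk.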
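With many cells the listing gets long, so it can help to filter it. The following is a minimal sketch, not an Oracle-provided tool: `flag_unhealthy_disks` is a hypothetical helper name, and it assumes each dcli output line has the form `cell: disk-name serial status...`.

```shell
# Sketch: print only the physical disks whose status is not "normal".
# Reads the output of
#   dcli -g <cell_group> -l root 'cellcli -e list physicaldisk'
# on stdin. Assumed line layout (an assumption, check it on your system):
#   <cell>: <disk name> <serial> <status, possibly several words>
flag_unhealthy_disks() {
  awk '{
    # Join field 4 and everything after it into one status string.
    status = $4
    for (i = 5; i <= NF; i++) status = status " " $i
    if (NF >= 4 && status != "normal") print $1, $2, status
  }'
}

# Usage on a compute node (not executed here):
#   dcli -g /opt/oracle.SupportTools/onecommand/cell_group -l root \
#     'cellcli -e list physicaldisk' | flag_unhealthy_disks
```

An empty result means every disk on every cell reported `normal`.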

2. Verify Disk Type on the Affected Cell Node

Once the faulty cell node is identified, log in to it and run:

# cellcli -e list physicaldisk

This shows whether the problem is with a hard disk or a flash disk. Flash disks are listed with names that begin with FLASH_. A typical line for a failing flash disk looks like:

FLASH_1_0 15557M04E3N warning - poor performance
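To make the disk type explicit, the listing can be tagged by name. This is only a sketch under one assumption: that flash devices are the entries whose names start with `FLASH_`. `classify_disks` is a hypothetical helper, not a CellCLI feature.

```shell
# Sketch: tag each line of `cellcli -e list physicaldisk` output as
# "flash" or "non-flash", based only on the disk name prefix.
# Assumed line layout: <disk name> <serial> <status...>
classify_disks() {
  while read -r name serial status; do
    # Skip blank lines.
    [ -n "$name" ] || continue
    case "$name" in
      FLASH_*) kind="flash" ;;
      *)       kind="non-flash" ;;
    esac
    printf '%s\t%s\t%s\n' "$kind" "$name" "$status"
  done
}

# Usage on the cell node (not executed here):
#   cellcli -e list physicaldisk | classify_disks
```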

3. Inspect Flash Cache Details

To get more information about the degraded flash disk:
# cellcli -e list flashcache detail
In the output, check the degraded cell disks (degradedCelldisks), the effective cache size (effectiveCacheSize), and the overall status.
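The detail listing prints one `attribute: value` pair per line, so the fields of interest can be pulled out with a small filter. A minimal sketch, assuming the attribute names `degradedCelldisks`, `effectiveCacheSize`, `size`, and `status` appear as shown (verify against your own output); `flashcache_summary` is a hypothetical helper name.

```shell
# Sketch: reduce `cellcli -e list flashcache detail` output to the
# attributes that matter when a flash disk is degraded.
# Assumes "name: value" lines; the attribute names are assumptions.
flashcache_summary() {
  awk -F: '
    { key = $1; gsub(/[ \t]/, "", key) }
    key == "degradedCelldisks" || key == "effectiveCacheSize" ||
    key == "size" || key == "status" {
      val = $0
      sub(/^[^:]*:[ \t]*/, "", val)   # drop "name:" and leading blanks
      sub(/[ \t]+$/, "", val)         # drop trailing blanks
      print key "=" val
    }'
}

# Usage on the cell node (not executed here):
#   cellcli -e list flashcache detail | flashcache_summary
```

A non-empty `degradedCelldisks` value, or an `effectiveCacheSize` smaller than `size`, suggests the cache is running without the faulty flash disk.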

4. Inactivate Grid Disks

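Before shutting down the cell node, make all grid disks inactive. First confirm that asmdeactivationoutcome is Yes for every grid disk (the list griddisk command in step 5 shows it); if it is not, taking the disks offline can force ASM to dismount a disk group:
CellCLI> alter griddisk all inactive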
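Before running the command above, it is worth checking that ASM can tolerate losing this cell's grid disks. The sketch below assumes the listing has three columns (name, asmmodestatus, asmdeactivationoutcome) and that a safe disk reports `Yes` in the third; `safe_to_deactivate` is a hypothetical helper name.

```shell
# Sketch: succeed only if every grid disk reports asmdeactivationoutcome = Yes.
# Reads the output of
#   cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome
# on stdin. Column positions are an assumption.
safe_to_deactivate() {
  awk '
    NF >= 3 {
      n++
      if ($3 != "Yes") { bad = 1; print "not safe to deactivate: " $1 }
    }
    # Fail when any disk is unsafe, or when no rows were read at all.
    END { exit ((bad || n == 0) ? 1 : 0) }'
}

# Usage on the cell node (not executed here):
#   cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome \
#     | safe_to_deactivate && echo "OK to inactivate grid disks"
```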

5. Confirm Grid Disk Status

Verify that all grid disks are offline. asmmodestatus should show OFFLINE (or UNUSED) for every grid disk:
CellCLI> list griddisk attributes name, asmmodestatus, asmdeactivationoutcome
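Reading a long listing by eye is error-prone, so the check can be scripted. A minimal sketch, assuming asmmodestatus is the second column of the listing; `all_griddisks_in_state` is a hypothetical helper name.

```shell
# Sketch: succeed only if every grid disk has the expected asmmodestatus.
# $1 = expected state (for example OFFLINE); listing is read from stdin.
# Assumed columns: <name> <asmmodestatus> <asmdeactivationoutcome>
all_griddisks_in_state() {
  awk -v want="$1" '
    NF >= 2 {
      n++
      if ($2 != want) { bad = 1; print $1 " is " $2 ", expected " want }
    }
    # Fail on any mismatch, and also when the listing was empty.
    END { exit ((bad || n == 0) ? 1 : 0) }'
}

# Usage on the cell node (not executed here):
#   cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome \
#     | all_griddisks_in_state OFFLINE && echo "safe to shut down"
```

The function is strict about the state it is given, so disks reported as UNUSED would need a separate check.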

6. Shut Down the Cell Node

Bring down the cell node safely:
# init 0

7. Replace the Flash Disk

Hand over the cell node to hardware support for flash disk replacement.

8. Verify Disk Status Post-Replacement

Once the cell node is powered back on, log in and check the disk status:
# cellcli -e list physicaldisk
Ensure all disks, including flash disks, show a status of `normal`.

9. Check Flash Cache Health

Inspect the flash cache again:
# cellcli -e list flashcache detail

10. Reactivate Grid Disks

Bring the grid disks back online:
CellCLI> alter griddisk all active

11. Final Verification

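Confirm that all grid disks are online. asmmodestatus can show SYNCING while ASM resynchronizes the disks; repeat the command until every grid disk shows ONLINE:
CellCLI> list griddisk attributes name, asmmodestatus, asmdeactivationoutcome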
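Because the disks may take a while to come back, the final check is often repeated. The sketch below polls a listing command until every grid disk reports ONLINE or a retry limit is reached. `wait_for_online` is a hypothetical helper, and the column layout (asmmodestatus in the second column) is an assumption.

```shell
# Sketch: poll until all grid disks report asmmodestatus = ONLINE.
# $1 = maximum number of checks, $2 = seconds to sleep between checks,
# remaining arguments = the command that prints the grid disk listing.
wait_for_online() {
  wfo_attempts=$1
  wfo_delay=$2
  shift 2
  wfo_i=1
  while [ "$wfo_i" -le "$wfo_attempts" ]; do
    # awk exits 0 only if at least one row was read and all rows are ONLINE.
    if "$@" | awk 'NF >= 2 { n++; if ($2 != "ONLINE") bad = 1 }
                   END { exit ((bad || n == 0) ? 1 : 0) }'; then
      echo "all grid disks ONLINE"
      return 0
    fi
    wfo_i=$((wfo_i + 1))
    sleep "$wfo_delay"
  done
  echo "grid disks still not ONLINE after $wfo_attempts checks" >&2
  return 1
}

# Usage on the cell node (not executed here): check every 30 s, up to 60 times.
#   wait_for_online 60 30 cellcli -e \
#     'list griddisk attributes name,asmmodestatus,asmdeactivationoutcome'
```

Do not begin work on another cell until this reports success, so that ASM redundancy is fully restored first.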

Conclusion

Replacing a failed flash disk in Exadata requires careful coordination and precise execution to avoid data loss and restore optimal performance. Following these steps ensures a smooth and safe replacement process.
Author: Kiran Jadhav
Principal Consultant | Exadata Admin

Sunday, July 18, 2021

How to run sundiag on multiple cell nodes - Exadata or SSC

What is sundiag:

sundiag is the Oracle Exadata Database Machine diagnostics collection tool. It gathers diagnostic information that helps a support analyst diagnose problems such as failed hardware, for example a failed disk.

An Exadata machine or a Solaris SuperCluster (SSC) may have multiple storage cell nodes attached.

With 10 to 12 storage cell nodes, logging in to every cell to collect sundiag output is time-consuming. The single command below runs sundiag on all of the cells at once (passwordless SSH from the compute node to the cell nodes must be in place).

1. On Solaris SuperCluster:

# dcli -g /opt/oracle.supercluster/bin/cell_group -l root /opt/oracle.SupportTools/sundiag.sh

where # cat /opt/oracle.supercluster/bin/cell_group lists the cell nodes attached to the SSC machine

2. On Exadata servers:

# dcli -g /opt/oracle.SupportTools/onecommand/cell_group -l root /opt/oracle.SupportTools/sundiag.sh

where # cat /opt/oracle.SupportTools/onecommand/cell_group lists the cell nodes attached to the Exadata machine
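The two commands differ only in the location of the cell_group file, so one script can serve both platforms by picking whichever file exists. A minimal sketch: `pick_cell_group` is a hypothetical helper name, and the two paths are the ones used above.

```shell
# Sketch: print the first argument that is an existing, non-empty file.
# Returns non-zero if none of the candidates qualifies.
pick_cell_group() {
  for candidate in "$@"; do
    if [ -s "$candidate" ]; then
      echo "$candidate"
      return 0
    fi
  done
  return 1
}

# Usage on a compute node (not executed here):
#   group=$(pick_cell_group /opt/oracle.supercluster/bin/cell_group \
#                           /opt/oracle.SupportTools/onecommand/cell_group) &&
#     dcli -g "$group" -l root /opt/oracle.SupportTools/sundiag.sh
```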

Thank you

- Kiiran B Jaadhav




