HP StorageWorks Scalable File Share
System User Guide
Version 2.2
Product Version: HP StorageWorks Scalable File Share Version 2.2
Published: November 20
Creating and modifying file systems

File system name
The file system name must be at most 32 characters long and must not contain spaces. HP r
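The naming rule above (at most 32 characters, no spaces) can be checked before you run the create filesystem command. The following is a minimal Python sketch of such a check. The function name is hypothetical and is not part of the HP SFS software, and rejecting an empty name is an assumption that this guide does not state.

```python
MAX_NAME_LENGTH = 32  # limit stated in this guide


def is_valid_filesystem_name(name: str) -> bool:
    """Return True if name is non-empty, at most 32 characters, and has no spaces.

    Rejecting the empty string is an assumption; the guide states only the
    length limit and the no-spaces rule.
    """
    return 0 < len(name) <= MAX_NAME_LENGTH and " " not in name


print(is_valid_filesystem_name("data"))     # True
print(is_valid_filesystem_name("my data"))  # False: contains a space
print(is_valid_filesystem_name("x" * 33))   # False: longer than 32 characters
```

The check covers only the two rules stated here; any further naming recommendations are not modelled.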
Creating a file System — SFS20 storage 5–17Number of OST servicesYou will be asked to specify how many OST services the file system is to use. A list
Mount options
Underlying the OST and MDS services on the Lustre file systems, there are ldiskfs file systems. Wh
The system-wide quota attribute must be enabled in order for any file system to use quotas; this attribute
Creating and modifying file systems5–20Here are the available LUNs which you can assign to OSTs:LUN Array Controller Role Size (GB) Visible to Prefer
Creating a file System — SFS20 storage 5–21In the following example, array1 and array 2 are connected to different Smart Array 6404 adapters on the se
Creating and modifying file systems5–22LUN (WWID) Logical Drive---- ----------------------------------- --
Creating a file System — SFS20 storage 5–235.2.3.1 Using the create filesystem command in interactive mode — SFS20 storageWhen you enter the create fi
Creating and modifying file systems5–24You are now asked to specify how many OST services the file system is to use. In this example, the OST services
You are now asked to enter the stripe count for the file system.
The stripe count of a file system refers to
About this guide
This guide describes how to operate the HP SFS system and perform routine system administration tasks. This guide describes only the
Creating and modifying file systems5–265.2.3.2 Using the create filesystem command in scripted mode — SFS20 storageYou can create a file system in scr
Operating a file system 5–27The following example shows that LUNs 3 and 5 are components of a mirrored MDS service (LUN 30) for the data file system,
Creating and modifying file systems5–28If the create filesystem command fails, you need to determine whether the file system was partially created or
5.6 Modifying file system attributes
You can use the modify filesystem command to modify the following attributes
Creating and modifying file systems5–30for the quotas options and that you do not change the values unless you are requested to do so by your HP Custo
Modifying file system attributes 5–314. Select the Add OSTs option, as follows:Enter your choice [c]: 1 You are then prompted to select and configure
Creating and modifying file systems5–325. Select the service and then select the preferred server for the service.Repeat this step for each service wh
Modifying file system attributes 5–333. When the file system has stopped, enter the modify filesystem filesystem_name command, as shown in the followi
Creating and modifying file systems5–34To add or remove an interconnect for a file system, perform the following steps:1. Unmount the file system on a
5.6.5 Changing other file system attributes
NOTE: Systems running in Portals compatibility mode cannot use the quo
• Appendix A contains a description of the HP SFS CLI commands.
• Appendix B contains details of expected performance figures, based on tests carrie
5.6.6 Rewriting LDAP configuration data
To rewrite the LDAP configuration data, perform the following series of
5.7 Managing quotas
Quotas allow you to control how many blocks or inodes a user or group can use in a file system. In this release
Creating and modifying file systems5–38As the user continues to access the OST device, the amount of unused quota reservation continues to drop. When
5.7.2.2 Step 2: Enabling quota functionality on a file system
NOTE: Unless you have enabled quotas on the file system (and the syst
Creating and modifying file systems5–40• If the file system was created on a system running HP SFS Version 2.2, proceed to Section 5.7.2.5 to activate
Managing quotas 5–41You must run the command on a client node that has been configured as described in Section 5.7.2.3, and where the file system is m
5.7.4 Disabling quota functionality on one or all file systems
You can disable quota functionality on a specific
Deleting file systems 5–432. Stop all file systems, by entering the stop filesystem filesystem_name command for each file system.Do not proceed to the
Creating and modifying file systems5–44You can manage space on OST services as follows:• By changing the threshold at which the alert triggers, as des
Managing space on OST services 5–453. Start the file system(s).When new files are created, ost3 and ost7 will not be used to store new files. However,
Naming conventions
This section lists the naming conventions used for an HP SFS system in this guide. You are free to choose your own name for your
6 Verifying, diagnosing, and maintaining the system
This chapter describes how to verify the system configuration and how to diagnose possible probl
6.1 Verifying the system
This section describes how to verify that the system has been installed co
Verifying the system 6–3• Specify the severity levels to be included in the report.There are four severity levels:• Critical conditionsCritical condit
Verifying, diagnosing, and maintaining the system6–4Table 6-2 lists the components, the component tests, and the level of each test.severity Specifies
The following is an example of the syscheck command when it is run with no options specified:
sfs> syscheck
...
Verifying, diagnosing, and maintaining the system6–6----------------------------------- Warnings -----------------------------------server------south[
Verifying the system 6–7c. Look at the server and confirm that the power has been turned off. Only one server in the system should be turned off at th
6.1.3 Verifying EVA4000 storage failover configuration
This section only applies to systems where E
Verifying the system 6–9normally, and that the problem is related to array 3. Examine the controller and switch connections, and correct any faults be
Verifying, diagnosing, and maintaining the system6–10In the output from the hpacucli utility, the SFS20 arrays are shown as MSA20. Take note of the ar
Verifying the system 6–113. Look at the status of each logical disk on the SFS20 array, as follows:=> ld all showMSA20 at 01 array A logicaldriv
Verifying, diagnosing, and maintaining the system6–127. Verify that each of the physical disks is operational by entering the following command: =>
Verifying the system 6–13• Use the following table to determine the drive_number for the array; the drive number is derived from the bay number. For e
Verifying, diagnosing, and maintaining the system6–14This example shows that there are errors in the disk’s log. Such errors do not necessarily mean t
Verifying the system 6–15•The Battery Status field should be set to OK.•The Firmware Version and Hardware Revision fields should be set to the correct
Verifying, diagnosing, and maintaining the system6–16The syntax of the command is as follows:all_ost_raw_lun_check.bash [-h] | [-v] [-r path] [-f file
6.1.5.2 Verifying the performance of LUNs on a single server
CAUTION: Do not use the procedure described here to test the perf
Verifying, diagnosing, and maintaining the system6–18To verify the performance of LUNs on one server, perform the following steps:1. Enter the show lu
Speed for read to /dev/hpls/dev8a: 152 MB/s
Cleaning up test space
Resetting lun features for /dev/hpls/dev7a...
Resetting lun f
Safety considerations
To avoid bodily harm and damage to electronic components, read the following warning before performing any maintenance on a ser
6.1.6 Verifying the management network
To verify that the servers are correctly connected to the m
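When you compare results across many LUNs, it can help to extract the speed figures from the test output rather than read them by eye. The following is a minimal Python sketch that parses lines of the form shown above. The function name is hypothetical, and the pattern assumes that write results use the same wording with "write" in place of "read"; only the read form appears in this excerpt.

```python
import re

# Matches lines such as "Speed for read to /dev/hpls/dev8a: 152 MB/s".
# Assumption: write results use the same wording with "write" instead of "read".
SPEED_LINE = re.compile(r"Speed for (read|write) to (\S+): (\d+) MB/s")


def parse_speeds(output: str) -> list[tuple[str, str, int]]:
    """Return (operation, device, MB/s) for each speed line found in output."""
    return [(op, dev, int(rate)) for op, dev, rate in SPEED_LINE.findall(output)]


sample = "Speed for read to /dev/hpls/dev8a: 152 MB/s\nCleaning up test space\n"
print(parse_speeds(sample))  # [('read', '/dev/hpls/dev8a', 152)]
```

Lines that do not match the pattern, such as the cleanup messages, are ignored.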
Verifying the system 6–216.1.8 Interconnect diagnosticsThis section is organized as follows:• Testing Gigabit Ethernet interconnect performance (Secti
Verifying, diagnosing, and maintaining the system6–22The output from the command displays the speed the link is running at, as shown in the following
Verifying the system 6–23NOTE: To ensure that an accurate test is performed where a dual Gigabit Ethernet interconnect is used, order the client nodes
Verifying, diagnosing, and maintaining the system6–242. Run the gm_board_info program, by entering the following command:# /opt/gm/bin/gm_board_info3.
Verifying the system 6–256. Test the PCI bandwidth to confirm whether the PCI interface is correctly detected as 64-bit/132MHz, by entering the follow
Verifying, diagnosing, and maintaining the system6–26Parallel testTo run the net_test.bash command to test connections between a number of servers and
Verifying the system 6–276.1.8.2.3 Testing Myrinet interconnect performance using the gm_allsize commandYou can use the gm_allsize tool as an alternat
Verifying, diagnosing, and maintaining the system6–28The output from the command consists of two columns of data: the first column lists the message s
Verifying the system 6–29*** program interrupted by user *** Total queued = 0.Total sent = 0.Total received = 93751.Bidirectional (summed) bandw
Verifying, diagnosing, and maintaining the system6–306.1.8.3 Examining the Quadrics adapter and interconnect linkThis section describes how to identif
Verifying the system 6–31CAUTION: Using NFS on an Object Storage Server is not generally supported, as it may seriously interfere with Lustre operatio
Verifying, diagnosing, and maintaining the system6–326.1.8.3.3 Example output from a qselantest testThe following example shows typical output from a
Verifying the system 6–33BARs[5]: 00000000CISPointer: 00000000SubVendorID: 0000SubDeviceID: 0000ROMBAR: 00000000Capabilities: 60I
Verifying, diagnosing, and maintaining the system6–34--------------------------------------------------------------------------------qsnet2_dmatest -f
Verifying the system 6–356.1.8.4 Examining the Voltaire InfiniBand interconnect HCA adapter and interconnect linkThis section describes how to identif
Verifying, diagnosing, and maintaining the system6–362. On the second node in the test, enter the command shown in the following example:# perf_main -
Verifying the system 6–37When you have ensured that no other processes are accessing the file system, enter the ost_perf_check.bash command on the cli
Verifying, diagnosing, and maintaining the system6–38The command tests the speed at which each client node can read from and write to a single OST ser
Managing email alerts 6–396.1.9.3 Overall file system performance testsThe overall file system performance test measures the read and write speeds to
1 Overview
This chapter provides an overview of the HP StorageWorks Scalable File Share product. The chapter is organized as follows:
• Product overvi
Verifying, diagnosing, and maintaining the system6–40• Viewing email alerts (Section 6.2.7)• Disabling and enabling email alerts (Section 6.2.8)• Dele
Managing email alerts 6–416.2.3 Constructing email alert filtersEmail alert filters use the syntax that is used for queries in the show log command (s
Verifying, diagnosing, and maintaining the system6–42Table 6-4 Default email alertsEmail Alert Name Purpose Email Alert Filter Action Requiredarray_fa
Managing email alerts 6–43lustre_bug Alerts you when a fault occurs in the Lustre software.facility=kern && data contains "LustreError&qu
Verifying, diagnosing, and maintaining the system6–446.2.5 Modifying email alertsWhen the HP SFS system software is installed, a number of email alert
Enter filter: facility=server && data contains "Down"
...
Enter email address(s): [email protected]
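The filters in this section share one shape: a facility test joined by && to a data contains test. The following is a minimal Python sketch that composes a filter string of that shape. The function name is hypothetical, and the sketch covers only this one form; the full query syntax is the one used by the show log command and is not reproduced here.

```python
def build_alert_filter(facility: str, text: str) -> str:
    """Compose a filter of the form: facility=<facility> && data contains "<text>"."""
    return f'facility={facility} && data contains "{text}"'


print(build_alert_filter("server", "Down"))
# facility=server && data contains "Down"
```

The resulting string is what you would type at the Enter filter prompt; the sketch does not validate the facility name or escape quotation marks inside the text.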
Verifying, diagnosing, and maintaining the system6–466.2.9 Deleting email alertsTo delete an email alert, enter the delete alert alert_name command as
Backing up and restoring system data 6–47In addition to the backups created by the create database_backup command, the following snapshot backups are
Verifying, diagnosing, and maintaining the system6–486.3.3 Restoring a system database backupWhen the database is backed up automatically at a set int
Backing up and restoring file system data 6–49To delete a system database backup, perform the following steps:1. Enter the show database_backups comma
1.1 Product overview
HP StorageWorks Scalable File Share Version 2.2 (based on Lustre® technology) is a product from HP that uses the Lustre
Verifying, diagnosing, and maintaining the system6–506.5 Removing log filesThere is an automated process to clear up log files. However, examine the u
6.6 Viewing and clearing the Integrated Management Log
6.6.1 Overview
The Integrated Management L
7 Changing the system parameters
This chapter contains instructions for reconfiguring the HP SFS system after initial installation. The chapter is o
Changing the system parameters7–27.1 Changing system parametersAfter the HP SFS system has been installed and configured, it is possible to change som
Changing system parameters 7–3Networks/Gigabit Ethernet Networks ParametersType of network No N/AStart IP Yes This procedure describes how to change t
Changing the system parameters7–4Configure the network device on all serversYes 1. Unmount the file systems on all client nodes. 2. Log in to the admi
Changing system parameters 7–5Start IP Yes This procedure describes how to change the Start IP address on a bonded Gigabit Ethernet network. Note that
Changing the system parameters7–6Networks/InfiniBand Interconnect ParametersConfigure an InfiniBand InterconnectYes 1. Unmount the file systems on all
Changing system parameters 7–7Networks/Quadrics Interconnect ParametersQuadrics Interconnect Machine IDYes 1. Run the configure system command to chan
Product overview 1–3Figure 1-1 Logical overview of the Lustre file systemA typical Lustre file system consists of multiple Object Storage Servers that
Changing the system parameters7–8The remainder of this section is organized as follows:• Using the configure system command (Section 7.2)• Using the c
Using the configure server command 7–92. The Configure Networks menu is displayed. Select the option to configure Gigabit Ethernet networks.Configure
Changing the system parameters7–10• If both the administration and the MDS servers are affected, enter the configure server command for each server, a
Changing the attributes of bonded Gigabit Ethernet networks 7–112. Use the set nic command to change the IP address, as shown in the following example
Changing the system parameters7–125. The Configure Networks menu is displayed. Enter 2 to select the option to configure the InfiniBand interconnect,
Removing InfiniBand partitioning 7–13 5) Partition Key None a) All of the above n) Next Section
Changing the system parameters7–144. Enter the configure system command, and then enter 2 to select the Configure Networks menu, as follows:sfs> co
Changing the iLO user name and password 7–15Deleting interface ipoib1, are you sure (yes/no)? [no]: yesipoib1 deleted.Delete IB Interface ------------
Changing the system parameters7–164. Turn on the power to the administration server, by entering the command shown in the following example:sfs> bo
8 Replacing, adding, and removing hardware, and upgrading firmware
This chapter describes the software configuration issues that arise as a result o
© Copyright 2005, 2006 Hewlett-Packard Development Company, L.P.Lustre® is a registered trademark of Cluster File Systems, Inc.Linux is a U.S. registe
Overview1–4All of the software that is needed to operate the HP SFS system is installed from the HP StorageWorks Scalable File Share System Software C
Replacing, adding, and removing hardware, and upgrading firmware8–28.1 Replacing hardware componentsIn this section, the term replace means to remove
Replacing hardware components 8–3• Replacing a Voltaire InfiniBand switch (Section 8.1.27)• Relocating an InfiniBand cable to a different port on the
Replacing, adding, and removing hardware, and upgrading firmware8–48.1.2 Replacing an Object Storage Server or the motherboard component in an Object
Replacing hardware components 8–5To recover from failure of both internal disks in the administration server or the MDS server, perform the following
Replacing, adding, and removing hardware, and upgrading firmware8–68.1.6 Replacing a Storage Host Bus Adapter on an EVA4000 array (HP Part Number FCA2
Replacing hardware components 8–78.1.10 Replacing a disk in an SFS20 arrayTIP: Section 9.34 provides useful information on how to determine whether yo
Replacing, adding, and removing hardware, and upgrading firmware8–82. To reduce the amount of time that the read operation will take, bind each LUN on
Replacing hardware components 8–9disk drive as soon as possible, because if a second disk drive fails, it will cause the entire logical drive to fail.
Replacing, adding, and removing hardware, and upgrading firmware8–10• When a controller module is replaced in an SFS20 array, the battery in the array
Replacing hardware components 8–11Configuration impactNone.NOTE: Replace failed cache batteries as soon as possible. The batteries are not a redundant
Product overview 1–5• A server pair consisting of the administration server and the MDS server, and pairs of Object Storage ServersIn the event of fai
Replacing, adding, and removing hardware, and upgrading firmware8–128.1.15 Replacing Smart Array 6404 adapterConfiguration impactNone.Process1. Use th
Replacing hardware components 8–137. Start the file system(s).8. When file systems restart, client nodes can remount the file systems.If a client node
Replacing, adding, and removing hardware, and upgrading firmware8–148.1.20 Relocating a Myrinet cable to a different port on a Myrinet networkConfigur
Replacing hardware components 8–158.1.25 Replacing a Voltaire HCA adapterConfiguration impactNone.ProcessBoot the server.8.1.26 Replacing an InfiniBan
Replacing, adding, and removing hardware, and upgrading firmware8–168.1.29 Replacing a Power Distribution Unit (PDU) on a rackConfiguration impactFile
Upgrading firmware 8–178.2.1.1 Upgrading online using the OnlineROM Flash Component executableTo upgrade the HP Integrated Lights-Out Management Contr
Replacing, adding, and removing hardware, and upgrading firmware8–188.2.2 Upgrading firmware on Smart Array 6404 adapters and SFS20 arraysThe HP Stora
Upgrading firmware 8–19NOTE: The firmware files are currently placed in the /usr/mst/fw-23108-rel-3.0.0/ directory. However, if this directory does no
Replacing, adding, and removing hardware, and upgrading firmware8–205. When the command prompt returns, the upgrade is complete. Reboot the server to
Adding and removing components 8–21 e) Exit (save updates) q) Quit (discard updates)
Overview1–61.1.3.2.2 SFS20 storage arraysAn SFS20 storage array is a RAID array that can be attached to two servers. A system configured with SFS20 ar
Replacing, adding, and removing hardware, and upgrading firmware8–227. Boot the new Object Storage Servers, as shown in the following example:sfs>
Adding and removing components 8–23At any stage you can enter ? to get more help on the topic.To update data select one or all of the following sectio
Replacing, adding, and removing hardware, and upgrading firmware8–246. Reconfigure the administration and MDS servers and the remaining Object Storage
Adding and removing components 8–25To upgrade a single Gigabit Ethernet interconnect in an existing HP SFS system to become a dual or a bonded Gigabit
Replacing, adding, and removing hardware, and upgrading firmware8–268.3.6 Removing a dual or a bonded Gigabit Ethernet interconnectIn an HP SFS system
9 Troubleshooting
This chapter provides information for troubleshooting possible problems in the HP SFS system. (For information on troubleshooting p
Troubleshooting9–2• Accessing the iLO component (Section 9.31)• Troubleshooting licenses (Section 9.32)• Troubleshooting failed SFS20 arrays (Section
9.1 Server fails to boot during installation
If a server fails to boot during the installation process, it
Troubleshooting9–4This forces the server to crash and reboot, and the output of the crash dump is captured when the server reboots. The crash dump fil
Server with Quadrics interconnect may fail to boot 9–5When the configure system command attempted to add this route, the command failed and returned t
Product overview 1–71.1.3.3 Network connectionsThe servers in the HP SFS system are connected to networks inside and outside the system as follows:• T
Troubleshooting9–63. Connect to the remote console of the server, as follows:# hpls_console --server server_name --remote4. During the power on/start-
Replacing a service LUN with a spare service LUN 9–7mysql> select * from hpls_object_states where type=’Luns’;+------+------+--------------+-------
Troubleshooting9–8In this example, the service LUN is LUN 11, and LUN 13 is a spare service LUN on an array attached to servers south3 and south4.3. F
The configure array command fails 9–9========================= M e m b e r S t a t u s ========================== Member Status Node
Troubleshooting9–109.15.2 Preferred server for the SFS20 array is downIf the preferred server for the SFS20 is down, the configure array command fails
Emergency clustat events occur during configure server command 9–119.18 Emergency clustat events occur during configure server commandWhen the configu
Troubleshooting9–12If the table has not been repaired, restore an earlier version of the database. To see all of the available backup files on the sys
Troubleshooting the Quadrics interconnect 9–139.23 Troubleshooting the Quadrics interconnectThis section provides some useful tips for investigating a
Troubleshooting9–149.23.2 Nodeset and Node ID informationTo determine what nodes are visible to a node on the Quadrics interconnect, enter the followi
Troubleshooting the Myrinet interconnect 9–159.24 Troubleshooting the Myrinet interconnectThis section provides some useful tips for investigating and
Overview1–81.1.3.3.2 Management networkFigure 1-6 shows how the servers in the HP SFS system are connected to the management network.Figure 1-6 HP SFS
Troubleshooting9–16Board number 0: lanai_cpu_version = 0x0a00 (LANai10.0) lanai_sram_size = 0x001fe000 (2040K bytes)ROM settings: MAC=00:60:dd:48
Troubleshooting the Voltaire InfiniBand interconnect 9–17ats 36416 1 devucm 13944 2 q_mng 18768 0 [sdp] ibat
Troubleshooting9–181) Auto-start 4) Firmware-update 7) MPI2) IPoIB 5) Start 8) Exit3) Fabric 6) Stop=>No
Troubleshooting the Voltaire InfiniBand interconnect 9–19 Subsystem: Mellanox Technology MT23108 InfiniHost Flags: bus master, 66Mhz,
Troubleshooting9–209.25.4 Connection and data transfer problemsIf a server has connection or data transfer problems over a Voltaire InfiniBand interco
Troubleshooting file systems 9–219.26 Troubleshooting file systemsThis section provides information for troubleshooting problems with file systems, an
Troubleshooting9–223. Shut down the Object Storage Servers that served the file system by entering the following commands on the administration server
Troubleshooting file systems 9–239.26.4 Troubleshooting the stop filesystem commandIf the stop filesystem command fails, attempt to correct the proble
Troubleshooting9–24• mdc • obdclass • lvfs • lnet If only one file system is being served, and one or more of the above modules still exists on the se
Troubleshooting file systems 9–25Note the following points:• You must not run more than one file system repair session at any one time.• The version o
Supported hardware 1–91.3 Supported hardwareTable 1-1 lists the supported hardware devices for HP SFS systems.Table 1-1 Supported hardwareHardware Com
Troubleshooting9–26NOTE: When a new file system is created or an existing file system is modified, you can run the repair-lfsck script so that a shell
2. Stop the file system and ensure that the file system devices are not being used. Check the /proc/mounts file and t
2. Unmount the file system that uses the device on all client nodes. The /proc/fs/lustre/mds/*/num_exports counter on the MDS serve
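The check of the /proc/mounts file described above can be sketched in Python as follows. The sketch reports which of a list of devices appear as a mounted source in text that uses the /proc/mounts format. The function name and the sample mount point are hypothetical; the device paths follow the /dev/hpls/ form used elsewhere in this guide.

```python
def mounted_devices(mounts_text: str, devices: list[str]) -> list[str]:
    """Return the entries of devices that appear as a mounted source in mounts_text."""
    # The first whitespace-separated field of each /proc/mounts line is the source device.
    mounted = {line.split()[0] for line in mounts_text.splitlines() if line.strip()}
    return [dev for dev in devices if dev in mounted]


# Illustrative sample in /proc/mounts format; the mount point is hypothetical.
sample = (
    "/dev/hpls/dev7a /mnt/example ldiskfs rw 0 0\n"
    "proc /proc proc rw 0 0\n"
)
print(mounted_devices(sample, ["/dev/hpls/dev7a", "/dev/hpls/dev8a"]))
# ['/dev/hpls/dev7a']
```

On a server, you would pass the contents of /proc/mounts as the first argument. An empty result means that none of the listed devices is currently mounted; it does not rule out other kinds of use.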
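To read the num_exports counter for every MDS service on a server in one step, you can expand the path pattern given above. The following is a minimal Python sketch. The function name is hypothetical, and the sketch assumes that each num_exports file contains a single integer.

```python
import glob


def read_num_exports(pattern: str = "/proc/fs/lustre/mds/*/num_exports") -> dict[str, int]:
    """Map each counter file that matches pattern to the integer it contains.

    Assumption: each file holds a single integer on one line.
    """
    counters = {}
    for path in sorted(glob.glob(pattern)):
        with open(path) as handle:
            counters[path] = int(handle.read().strip())
    return counters


if __name__ == "__main__":
    # Prints nothing on a machine where no file matches the pattern.
    for path, count in read_num_exports().items():
        print(f"{path}: {count}")
```

Run the sketch on the MDS server; how to interpret the counter values is described in the surrounding procedure.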
Troubleshooting file systems 9–29NOTE: An MDS service is considered to be a client of an OST service; as a result, the number of recoverable clients s
Troubleshooting9–309.26.8 Rebalancing file system servicesSometimes, file system services do not run on their preferred server. This happens after a t
Troubleshooting file systems 9–319.26.9 Troubleshooting supplementary groups accessIf a user receives unexpected access denied errors when using suppl
Troubleshooting9–329.27 Troubleshooting file system performanceIn some circumstances, the performance of a Lustre file system may seem to be less than
Troubleshooting file system performance 9–332. Determine whether the performance anomaly can be reproduced.1. Rerun the client application.2. Ensure t
Troubleshooting9–346. Verify that the interconnect is performing correctly.1. Run the network tests for the interconnect type to determine:• the netwo
Troubleshooting file system performance 9–357. Verify that the storage is configured correctly.1. Examine the HP SFS system database to ensure that th
Troubleshooting9–369.27.2 Verifying file stripingIf a file is not being striped across all OST services, the full bandwidth may not be available. This
Troubleshooting file system performance 9–37Then enter the show filesystem filesystem_name command to view details for the file system, as follows:sfs
09:31:28 south2 -- <Unallocated>
09:31:35 south2 -- <Success>
09:31:35 south2 -- Server is already configured to the desi
Troubleshooting file system performance 9–392. Rename the new file to the original name using the mv command, as shown in the following example:# mv d
Troubleshooting9–40Remedial actionSee Section 9.26.8 for information on how to rebalance file system services.9.27.4 Checking for unbalanced controlle
Troubleshooting file system performance 9–41Remedial actionSee Section 9.29 for information on troubleshooting LUN presentation.9.27.5 Examining the s
Troubleshooting9–429.27.6 Examining EVA4000 storage subsystems for errorsTo examine the EVA4000 storage subsystems for errors, perform the following c
Troubleshooting file system performance 9–439.27.8 Examining the interconnect switch for errorsTo find information on errors on a Myrinet interconnect
Troubleshooting9–44pg44lab1 11-Feb-1990 18:15:50==========================- TELNET - OPERATOR MODE
Troubleshooting file system performance 9–45Use the telnet(1) command to access the Fibre Channel switch, as shown in the following example:# telnet 1
2 The sfsmgr utility
This chapter provides an overview of the sfsmgr(8) utility, and is organized as follows:
• Overview (Section 2.1)
• Starting the
Troubleshooting9–46To display the WWIDs of all HBAs in the HP SFS system, enter the following command:# lsdbquery system ’select * from hpls_hbas’2100
Troubleshooting EVA4000 array connectivity 9–47• There may have been a battery failure on the SFS20 array, causing the cache to be disabled.See Sectio
Troubleshooting9–48If your system cannot see the controllers (see Example 9-2), check the Fibre Channel network. If your system can see the controller
Troubleshooting LUN presentation 9–49( 2: 2): Total reqs 24, Pending reqs 0, flags 0x2, 0:0:84 00( 2: 3): Total reqs 24, Pending reqs 0, flags 0x2, 0:
Troubleshooting9–50The following is an example of the output from the command where LUNs from an EVA4000 array are (incorrectly) visible to both contr
Accessing consoles 9–519.30 Accessing consolesYou can connect to the console of a server through the iLO, as shown in the following example (where sou
Troubleshooting9–52NOTE: If you get a message similar to the following, see Section 9.31.2 for information on how to troubleshoot the problem:Requeste
Troubleshooting licenses 9–53Other than another user accessing a console, there is another known cause for a locked up console: if the administration
Troubleshooting9–548. You can directly test whether an increment is licensed, as follows.For the SFSMDSCAP or SFSMDSENT licenses, enter the following
Troubleshooting failed SFS20 arrays 9–55• When you reboot the affected servers, the show array command shows a failed status. Because an array is atta
2.1 Overview
The HP SFS system is not a general purpose system; instead, it is dedicated to running MDS and OST services. Unless i
Troubleshooting9–569.33.2 Recovering from a temporary SFS20 array failureIf an SFS20 array has had a temporary failure such as loss of power or inadve
Troubleshooting failed SFS20 arrays 9–572. Identify the LUNs that the MDS or OST service is based on, by using the show filesystem filesystem_name com
Troubleshooting9–58 Number Major Minor RaidDevice State 0 105 96 0 active sync /dev/cciss/c1d6 1 105
Handling Disk Errors on SFS20 storage 9–59When the resynchronization is complete, the status information will change, as shown in the following exampl
Troubleshooting9–60(For information on configuring email alerts, see Section 6.2.)In the event of disk errors occurring, take action as described in t
Recovering degraded MDS services on systems using EVA4000 storage 9–61For more information on reviewing SFS20 array information, see Section 6.1.4.2.I
Troubleshooting9–623. Identify the component LUNs that are used to mirror the LUN, using the show lun lun_number command as shown in the following e
Recovering degraded MDS services on systems using EVA4000 storage 9–636. In the example in the previous step, the /dev/sdc component is shown to have
Troubleshooting9–64When the resynchronization is complete, the /proc/mdstat command indicates this, as shown in the following example:sfs> show log
9.38 The MDS service fails with an ASSERTION(ino ==inode->i_ino) message
Running sfsmgr commands 2–3When you first enter the sfsmgr command when installing or upgrading a system, the output displayed is one of the following
Troubleshooting9–665. If you know which file system is causing the problem, start that file system; otherwise, start all file systems, but wait until
Rebuilding logical drives after disk failures 9–67When the disks have been replaced, the error messages shown above may continue to be generated; this
Troubleshooting9–689.41 Determining if the Network ID of a server on a Quadrics or Myrinet interconnect has been changedIf you relocate a server to a
Troubleshooting client mount failures 9–699.42 Troubleshooting client mount failuresWhen a node is booting, you can monitor the progress on the consol
Troubleshooting9–70When this happens, the SFS service prints a message similar to the following in the /var/log/messages file: sfsmount: Waiting for I
A CLI commands
This appendix lists the HP SFS CLI commands.
A.1 General information
• To start the CLI, enter the sfsmgr command.
• In commands where it is appropriate, you can use either the singu
accept license command A–3A.2 accept license commandThe accept license command accepts and installs license information. For information about license
CLI commandsA–4A.5 configure commandsA.5.1 configure array The configure array command configures an SFS20 array. This command cannot be used to confi
configure commands A–5raid Optional. Specify the redundancy level that is to be used for the array. The default value is ADG. Valid values are ADG, 6
iii
About this guide ... xi
Safety consideration
The sfsmgr utility 2–4
You must enter enough letters to make the abbreviated command unique; if you do not, the command will not work (the usage for the
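The uniqueness rule for abbreviated commands can be illustrated with a short prefix-matching sketch. This is not the sfsmgr implementation: the function name resolve and the command list (an illustrative subset of the verbs documented in this guide) are assumptions made for the example, and the real utility may report ambiguity differently.

```python
# Illustrative subset of sfsmgr verbs named in this guide (assumption: not the full list).
COMMANDS = [
    "accept", "boot", "configure", "create", "deactivate", "delete",
    "disable", "enable", "help", "modify", "monitor", "quit",
    "restore", "set", "show", "shutdown", "stop",
]


def resolve(abbrev, commands=COMMANDS):
    """Return the one command that abbrev identifies, or raise ValueError.

    Hypothetical helper: an exact match always wins; otherwise the
    abbreviation must be a prefix of exactly one command.
    """
    if abbrev in commands:
        return abbrev
    matches = [name for name in commands if name.startswith(abbrev)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError(f"unknown command: {abbrev!r}")
    raise ValueError(f"ambiguous abbreviation {abbrev!r}: {', '.join(matches)}")
```

In this sketch, resolve("sho") returns "show", whereas resolve("sh") is rejected because it could mean either show or shutdown.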
CLI commands A–6
When high priority is specified, expansion or rebuild occurs at the expense of normal I/O operations. Although system performance is af

create commands A–7
redo Optional. Specify the configuration state to which you want the servers returned. Valid values are Unconfigured, Prepared, Bo

CLI commands A–8
Syntax
create attribute attribute_name=attribute_value
Arguments
attribute_name Required. Specify the name and value of a new or existin

create commands A–9
A.6.5 create filesystem
The create filesystem command creates a new Lustre file system.
Syntax
Interactive mode
create filesystem
Scrip

CLI commands A–10
A.7 deactivate ost command
The deactivate ost command deactivates OST services. When an OST service is deactivated, no new files will b

delete commands A–11
A.8.2 delete array
The delete array command deletes array information from the database.
CAUTION: Deleting the array information fr

CLI commands A–12
A.8.5 delete database_backup
The delete database_backup command deletes the specified system database backup copy.
Syntax
delete databas

disable commands A–13
Example
The following example disables the server_down alert:
sfs> disable alert server_down
A.9.2 disable server
The disable ser

CLI commands A–14
A.10 enable commands
A.10.1 enable alert
The enable alert command enables email alerts so that email messages are sent when events occur

help command A–15
A.11 help command
The help command shows information about commands.
Syntax
help [command_name]
Arguments
command_name Optional. Specify
Troubleshooting the sfsmgr command 2–5
========================= H e a r t b e a t S t a t u s ==================
Name T
CLI commands A–16
A.13 modify commands
A.13.1 modify alert
The modify alert command allows you to modify email alerts. When the HP SFS system software is

modify commands A–17
on OST services. HP recommends that you enable the extents mount option on file systems underlying OST services only—do not enable

CLI commands A–18
A.14 monitor command command
The monitor command command allows you to monitor the progress of the specified command.
When you issue a c

quit command A–19
A.16 quit command
The quit command closes the CLI. (You can also use the exit alias to close the CLI.)
Syntax
quit
A.17 restore database_

CLI commands A–20
A.19 set commands
A.19.1 set array
The set array command turns on/off the blue LEDs on the SATA drives in the SFS20 array, and sets the

set commands A–21
A.19.2 set attribute
The set attribute command creates a system attribute and/or sets the value of the attribute. You can only set th

CLI commands A–22
A.19.3 set lun
The set lun command sets the properties of LUNs.
Syntax
set lun wwid|lun_number [user=lun_user] [role=role] [preferred_se

set commands A–23
The following information applies to EVA4000 arrays only:
The recommended way to specify path information for a LUN is to set the pref

CLI commands A–24
A.19.5 set password
The set password command allows you to change the password for a user, including the root user.
When you enter the

show commands A–25
A.20 show commands
A.20.1 show alert
The show alert command displays details of email alerts.
Syntax
show alert [alert_name]
Arguments
a
CLI commands A–26
Arguments
attribute_name Optional. Specify the name of the attribute whose details you want to display. If you do not specify an attri

show commands A–27
A.20.6 show database_backups
The show database_backups command displays details of all system database backups.
Syntax
show database_b

CLI commands A–28
now Optional. When this argument is used, events are displayed as they occur.
Examples
The following example displays all system event

show commands A–29
A.20.10 show lun
The show lun command displays details of LUNs.
Syntax
show lun [wwid|device|lun_number]
Arguments
wwid|device|lun_numbe

CLI commands A–30
Example
The following example displays details of the ost4 OST service:
sfs> show ost ost4
A.20.13 show server
The show server command

shutdown server command A–31
A.21 shutdown server command
The shutdown server command shuts down and turns off servers. The shutdown server command send

CLI commands A–32
A.23 stop filesystem command
The stop filesystem command stops a file system while preserving user connections.
When you stop a file sys

unconfigure array command A–33
Note that specifying level=2 limits the testing to level 2 tests; it does not run level 1 tests. To run both level 1 and

CLI commands A–34
Arguments
array_numbers Required. Specify the number of the array that you want to restore to an unconfigured state. You can specify a

B–1
B Performance figures
This appendix contains details of approximate expected performance figures for an HP SFS system, and is organized as follows:
•
3–1
3 Operating the system
This chapter contains instructions for operating the HP SFS system. The chapter is organized as follows:
• Booting the system

Performance figures B–2
B.1 I/O performance
This section provides expected I/O performance figures for a single server in the HP SFS system. These figure

I/O performance B–3
SFS20 storage — 250GB disks
Table B-2 provides details of expected performance figures for one Object Storage Server in systems usin

Performance figures B–4
B.2 Network performance
This section provides approximate expected network performance figures for each interconnect type. Figure

SFS20 array configuration B–5
B.3 SFS20 array configuration
The graph in Figure B-1 shows the variation in the performance for write operations on an SF

Performance figures B–6
B.4 SFS20 RAID5 and ADG performance
The graphs in Figure B-2 and Figure B-3 show the streaming write and read performance for HP

SFS20 RAID5 and ADG performance B–7
• Each array populated with twelve 250GB disks, configured as one 2TB LUN with ADG (RAID6) redundancy
• Voltaire Inf

Performance figures B–8
B.5 Default file system stripe count
Figure B-4 Aggregate bandwidth versus file stripe count
The graph in Figure B-4 shows the var

Bandwidth variation—number of OST services and number of client nodes B–9
B.6 Bandwidth variation—number of OST services and number of client nodes
The

Performance figures B–10
B.7 Single client node bandwidth
The graph in Figure B-6 shows the variation of bandwidth available to a single client node as a

Gigabit Ethernet bandwidth B–11
B.8 Gigabit Ethernet bandwidth
The graphs in Figure B-7 through Figure B-10 show how the addition of dual Gigabit Ethern
Operating the system 3–2
3.1 Booting the system
Before booting the system, ensure that all of the system components other than the servers—that is, the s

Performance figures B–12
Figure B-9 9000 MTU average write MB/sec
Figure B-10 9000 MTU average read MB/sec
These results are based on the following config

Meta-data operations from a single client node B–13
B.9 Meta-data operations from a single client node
Figure B-11 shows the numbers of meta-data operat

Performance figures B–14
the HP StorageWorks Scalable File Share Client Installation and User Guide (specifically, the section titled Using Lustre file

C–1
C File system configuration examples
This appendix provides examples of file system configurations, and is organized as follows:
• EVA4000 storage ex

File system configuration examples C–2
C.1 EVA4000 storage examples
Example C-1 shows a file system that is not optimally configured. The example is take

EVA4000 storage examples C–3
Interconnect: elan gm tcp
MDS mount options: acl,user_xattr
OST mount optio

File system configuration examples C–4
C.2 SFS20 storage examples
Example C-3 shows a file system that is not optimally configured. The example is taken

SFS20 storage examples C–5
Interconnect: elan gm tcp
MDS mount options: acl,user_xattr
OST mount options: extents
D–1
D RAID rebuild timing information
This appendix provides a guide to the estimated time that it takes to rebuild a LUN on an SFS20 array following a

Booting the system 3–3
...
Command has finished: south[3,5,7] -- <Success>
*** Server States ***
Success: south[3,5,7]
6. Boot the second se

RAID rebuild timing information D–2
D.1 RAID rebuild information
The time taken for a RAID rebuild operation on an SFS20 array is divided into two parts:

E–1
E HP SFS specifications
This appendix provides information on HP SFS software and system specifications:
• Supported number of Object Storage Servers

HP SFS specifications E–2
E.1 Supported number of Object Storage Servers
HP SFS supports a maximum of 64 Object Storage Servers. This means that the maxi

File system default stripe size and client page size E–3
E.6 File system default stripe size and client page size
The stripe size that you set as the de
Glossary–1
Glossary
administration server The ProLiant DL server that the administration service runs on. Usually the first server in the system.
See als

Glossary–2
internet protocol See IP
IP Internet Protocol. The network layer protocol for the Internet protocol suite that provides the basis for the con

Glossary–3
OST service The Object Storage Target software subsystem that provides object services in a Lustre file system.
See also Object Storage Serve
Index–1
A
AC power strip, replacing on a rack 8-16
accessing
  consoles 9-51
  iLO component 9-51
adding
  a dual Gigabit Ethernet interconnect 8-24
  components
Operating the system 3–4
3.2 Shutting down the system
HP recommends that you stop all file systems (using the stop filesystem command) before shutting do
Index–2
F
F1 key prompt 9-12
Fibre Channel cable, replacing 8-6
Fibre Channel switch, replacing 8-6
file systems
  backing up and restoring data 6-49
  chang

Index–3
testing Myrinet interconnect performance using the gm_allsize command 6-27
testing Myrinet interconnect performance using the net_test.bash com

Index–4
stopping file systems 3-7
storage
  EVA4000 arrays 1-5
  SFS20 arrays 1-6
  types supported 1-5
stripe count 5-8, 5-18
stripe size 5-6, 5-16
support
Booting multiple servers 3–5
...
Command has finished: south3 -- <Success>
*** Server States ***
Success: south3
3.4 Booting multiple server

Operating the system 3–6
To shut down an Object Storage Server or the MDS server, enter the command shown in the following example, where server south3

Stopping a file system 3–7
3.7 Stopping a file system
When you create a file system using the create filesystem command, the file system is started and
iv
4 Viewing system information
4.1 Viewing server information ...
Operating the system 3–8
If a service shows the unload-failed state, reboot the server to force the file system to unload. (For information on file syst

Starting a file system 3–9
You can expect a starting service to go to the recovering or running state after about 1 minute. The recovering state indica
Operating the system 3–10
3.9 Unconfiguring storage arrays
This section describes how to unconfigure storage arrays, and is organized as follows:
• Unconfi
Unconfiguring storage arrays 3–11
b. In the Navigation pane, select the array you want to unconfigure, then click the Uninitialize button on the Initia

Operating the system 3–12
5. Boot the servers, as shown in the following example:
sfs> boot server south[3-4]
6. When the servers have booted, you will

Unconfiguring storage arrays 3–13
7. Log in to the system as follows:
login: root
password: secret
8. Unconfigure the arrays attached to the administratio

Operating the system 3–14
To unconfigure an SFS20 array, perform the following steps:
1. Determine the name of the array you want to unconfigure, as show

Deleting SFS20 array information from the database 3–15
6. When you have finished unconfiguring (or reconfiguring) the array, shut down the servers, en

Operating the system 3–16
SFSMDSCAP (MDS license) Allows you to start a file system that is based on capacity class (SFS20 enclosure) storage.
SFSOSTENT

Managing licenses 3–17
INCREMENT SFSMDSCAP HPQ 1.0 permanent 1 HOSTID="000bcd505cbb \000bcd827644" NOTICE="License Number 7YCYHDHTEYAH&q
v
5.7 Managing quotas ... 5-37
5.7.1 Unde
Operating the system 3–18
3.11.2 Viewing license information
To view the license information on the HP SFS system, enter the show license command as shown

Managing licenses 3–19
When you edit the /var/flexlm/license.master file, note the following rules:
• Do not remove or modify the lines that start with

Operating the system 3–20
3. Save the license file you receive from HP in a convenient directory on a host that can be accessed by the HP SFS system.
4.

Managing remote access 3–21
installation, make sure these files are the same on both the Administration and MDS servers:
/var/flexlm/license.master

Operating the system 3–22
3.12.1 Creating authorizations
Authorizations allow users to access the HP SFS system remotely without a password. To create an

Managing remote access 3–23
3.12.3 Viewing authorizations
To view a list of all authorizations in the HP SFS system, enter the following command:
sfs>

Operating the system 3–24
3.13 Locating servers and SFS20 arrays
To help you to physically locate a server or an SFS20 array, the sfsmgr utility provides
4–1
4 Viewing system information
There are a number of commands that you can use to view information about the status and configuration of components in

Viewing system information 4–2
4.1 Viewing server information
The show server command provides information on the state of servers. If you suspect that o

Viewing server information 4–3
To see more information on a server, enter the show server command for that server, as shown in the following example. T
vi
6.2.5 Modifying email alerts ... 6-44
6.2.6 Creat
Viewing system information 4–4
DIMM                   Status
---------------------- --------
01                     Ok
02                     Ok
03

Viewing server information 4–5
Server Firmware Date: 05/01/2004
iLO Firmware: 1.82
Current Config. State: Configured
Desired Config. State:

Viewing system information 4–6
Integrated Management Log (IML) events
Critical Caution
-------- -------
0        0
4.1.1 Server information after system

Viewing file system information 4–7
The MDS and OST service states are described in Table 4-3.
Table 4-2 File system states
State Description
started All

Viewing system information 4–8
See Section 3.8 for more information about the starting, recovering, and running states.
If an MDS or OST service is a sof

Viewing file system information 4–9
To see more information on a file system, enter the show filesystem command for the file system, as shown in the fo

Viewing system information 4–10
Size The size of the LUN, in gigabytes.
Used For the MDS service: the percentage of the total number of files (shown in t

Viewing LUN information 4–11
To see more information on an OST service, enter the show ost command for the OST service, as shown in the following examp

Viewing system information 4–12
Type: array
Role: service
User: south1

Viewing array information 4–13
Mirrored LUNs
If the LUN being displayed is a mirrored LUN (that is, mirrored across two component LUNs), the show lun lu
vii
8.1.27 Replacing a Voltaire InfiniBand switch ... 8-15
8.1.28 Relocati
Viewing system information 4–14
LUN (WWID) Controller Number of Paths
---- ----------------

Viewing array information 4–15
The status of each disk drive in disk bays 1 to 12 is shown for SFS20 arrays. The disk drive status values are described

Viewing system information 4–16
4.6 Viewing event logs
Events that occur on any server in the system are sent to the administration server (or to the MDS

Viewing event logs 4–17
You can restrict the show log command to showing a small number of events by using the recent argument, as follows:
sfs> show

Viewing system information 4–18
4.6.1.4 Examples of event filters
The following are examples of how you can filter events, using one or more filters:
• Yo

Viewing performance statistics 4–19
4.7 Viewing performance statistics
Performance statistics are automatically gathered from each server in the HP SFS

Viewing system information 4–20
4.7.2 Overview of the information gathered by the collectl utility
The collectl utility gathers the following information

Viewing performance statistics 4–21
• Swap space
• Megabytes in use
• Megabytes free
• Disk I/O sizes
  • Per OST device or cumulative
• RPC sizes
  • Per OST devi

Viewing system information 4–22
Figure 4-1 The ColPlot Web page
The highlighted areas of the Web page shown in Figure 4-1 are as follows:
1. Dates for whi

Viewing performance statistics 4–23
4.7.3.1 Viewing overall throughput to OST devices from a server
To view the overall throughput to OST devices from a
viii
9.25.1.1 Determining whether Voltaire InfiniBand interconnect is loaded ... 9-16
9.25.1.2 Starting, stopping
Viewing system information 4–24
Figure 4-3 A detailed graph showing throughput to each OST device from a server
4.7.3.3 Viewing throughput to OST devices

Viewing performance statistics 4–25
Figure 4-4 A detailed graph showing throughput to OST devices from specified network devices
4.7.3.4 Viewing RPC tra

Viewing system information 4–26
Figure 4-5 A graph showing the information in the ost-blkD.cfg file
4.7.4 Viewing information in the /proc file system
The

Viewing performance statistics 4–27
You can view the disk statistics stored in the /proc/scsi/sd_iostats and /proc/driver/cciss/cciss_iostats/ director

Viewing system information 4–28
                      read         |      write
discont pages   rpcs   %   cum %   |   rpcs   %   cum %
0:              0      0
5–1
5 Creating and modifying file systems
This chapter describes how to create, modify, operate, and delete file systems, and is organized as follows:
•

Creating and modifying file systems 5–2
5.1 Creating a file system — EVA4000 storage
This section describes how to create a file system on an HP SFS syst

Creating a file system — EVA4000 storage 5–3
5.1.2 Step 2: Matching array numbers to physical arrays — EVA4000 storage
To determine what role you will a

Creating and modifying file systems 5–4
5.1.3 Step 3: Setting roles, preferred controllers, and disk group information for LUNs — EVA4000 storage
NOTE: I

Creating a file system — EVA4000 storage 5–5
6 1 mds - 290 - south[1-2]
7 2 service south3 1
ix
9.43 Recovering after using the ifdown command ... 9-70
9.44 Password need
Creating and modifying file systems 5–6
Stripe size
The file system stripe size is the default stripe size for files created in the file system. However, this c
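As a rough illustration of what the stripe size controls, the sketch below maps a byte offset in a file to a stripe index (and so to one of the file's OST objects), assuming the round-robin, RAID 0 style layout that Lustre uses for striped files. The function name locate and the 4 MB stripe size and stripe count of 4 in the usage note are illustrative assumptions, not values taken from this guide.

```python
def locate(offset, stripe_size, stripe_count):
    """Map a file byte offset to (stripe_index, offset_within_object).

    Hypothetical helper assuming round-robin striping: consecutive
    stripe_size chunks of the file go to consecutive stripes in turn.
    """
    if offset < 0 or stripe_size <= 0 or stripe_count <= 0:
        raise ValueError("offset must be >= 0; stripe size and count must be > 0")
    chunk = offset // stripe_size            # which stripe_size chunk of the file
    stripe_index = chunk % stripe_count      # which stripe (object) holds that chunk
    # Full rounds already written to this object, plus the position inside the chunk.
    object_offset = (chunk // stripe_count) * stripe_size + offset % stripe_size
    return stripe_index, object_offset
```

With an assumed 4 MB stripe size and a stripe count of 4, byte offset 17 MB falls in the fifth chunk of the file, so it wraps around to stripe 0, at offset 5 MB within that object.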
Creating a file system — EVA4000 storage 5–7
14 4 B ost 290 south[5-6] -
15 5 A ost 290 so

Creating and modifying file systems 5–8
HP recommends that you enable the ACL functionality on your HP SFS file systems, unless your system is running i

Creating a file system — EVA4000 storage 5–9
When quotas are enabled on a file system, the itune and btune settings (specified as a percentage) are use

Creating and modifying file systems 5–10
5.1.5 Step 5: Using the create filesystem command — EVA4000 storage
CAUTION: HP recommends that you do not creat

Creating a file system — EVA4000 storage 5–11
Depending on the number of LUNs that have an MDS role, you may now be prompted to choose MDS LUNs, as fol

Creating and modifying file systems 5–12
14 4 B ost 290 south[5-6] -
15 5 A ost 290 south[5

Creating a file system — EVA4000 storage 5–13
MDS LUN(s):
LUN Array Controller Role Size(GB) Preferred Server Backup Server
--- ----- ----------

Creating and modifying file systems 5–14
5.1.7 Step 7: Backing up the system database
Back up the system database, as follows:
1. Back up the database,

Creating a file system — SFS20 storage 5–15
The configuration shown in this example would allow you to create an MDS service that uses two LUNs, and fo