as needed on the file system server nodes. The lustre_config command can take hours
to complete depending on the size of the disks.
2. Start the file system manually and test for proper operation before configuring Heartbeat to
start the file system (a brief verification sketch follows this procedure). Mount the file system
components on the servers:
# lustre_start -v -a ./testfs.csv
3. Mount the file system on a client node according to the instructions in Chapter 4 (page 41).
# mount /testfs
4. Verify proper file system behavior as described in “Testing Your Configuration” (page 55).
5. After the behavior is verified, unmount the file system on the client:
# umount /testfs
6. Unmount the file system components from the servers:
# lustre_start -v -k -a ./testfs.csv
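During steps 2 through 4, the exact checks depend on your configuration. As a minimal sketch,
assuming the example /testfs file system used above, you can confirm that the Lustre targets are
mounted on each server and that the client sees all of the file system components:
# mount -t lustre
# lfs df -h /testfs
Run the first command on each server; it lists the locally mounted Lustre targets (for example,
/mnt/ost13). Run the second command on the client after step 3; it reports usage for each MDT
and OST, so a missing or inactive target indicates a problem to resolve before configuring
Heartbeat.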
5.2 Configuring Heartbeat
HP SFS G3.2-0 uses Heartbeat V2.1.3 for failover. Heartbeat is open source software. Heartbeat
RPMs are included in the HP SFS G3.2-0 kit. More information and documentation are available
at:
http://www.linux-ha.org/Heartbeat
IMPORTANT: This section assumes you are familiar with the concepts in the Failover chapter
of the Lustre 1.8 Operations Manual.
HP SFS G3.2-0 uses Heartbeat to group nodes into failover pairs, or clusters. A Heartbeat
failover pair is responsible for a set of resources. Heartbeat resources are Lustre servers: the MDS,
the MGS, and the OSTs. Lustre servers are implemented as locally mounted file systems, for
example, /mnt/ost13; mounting the file system starts the Lustre server. Each node in a failover
pair is responsible for half the servers and the corresponding mount-points. If one node fails,
the other node in the failover pair mounts the file systems that belong to the failed node, causing
the corresponding Lustre servers to run on that node. When a failed node returns, the
mount-points can be transferred back to that node either automatically or manually, depending on
how Heartbeat is configured. Manual failback can prevent system oscillation if, for example, a
bad node reboots continuously.
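Heartbeat V2 expresses these resources in its cluster information base (CIB). The following
fragment is only an illustrative sketch of how one mount-point could be described as an
ocf:heartbeat:Filesystem resource; the resource ID and device path are hypothetical placeholders:
<primitive id="testfs_ost13" class="ocf" provider="heartbeat" type="Filesystem">
  <instance_attributes id="testfs_ost13_attrs">
    <attributes>
      <nvpair id="testfs_ost13_dev" name="device" value="/dev/mpath/ost13"/>
      <nvpair id="testfs_ost13_dir" name="directory" value="/mnt/ost13"/>
      <nvpair id="testfs_ost13_fs" name="fstype" value="lustre"/>
    </attributes>
  </instance_attributes>
</primitive>
Starting such a resource mounts /mnt/ost13, which starts the corresponding OST; on failure,
Heartbeat starts the same resource on the partner node.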
Heartbeat nodes send messages over the network interfaces to exchange status information and
determine whether the other member of the failover pair is alive. The HP SFS G3.2-0
implementation sends these messages using IP multicast. Each failover pair uses a different IP
multicast group.
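For example, the messaging settings in /etc/ha.d/ha.cf for one failover pair could look like the
following sketch; the interface name, multicast group, and timing values are placeholders, not
HP SFS defaults, and each failover pair would use a different group address:
mcast eth1 239.0.0.10 694 1 0
keepalive 1
deadtime 30
The mcast directive specifies the interface, multicast group, UDP port, TTL, and loopback setting
used for Heartbeat messages.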
When a node determines that its partner has failed, it must ensure that the other node in the pair
cannot access the shared disk before it takes over. Heartbeat can usually determine whether the
other node in a pair has been shut down or powered off. When the status is uncertain, you might
need to power cycle a partner node to ensure it cannot access the shared disk. This is referred to
as STONITH (Shoot The Other Node In The Head). HP SFS G3.2-0 uses iLO, rather than remote
power controllers, for STONITH.
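Heartbeat ships a number of STONITH plugins, which you can inspect with the stonith command.
The plugin name below (external/riloe, which drives HP iLO) is given only as an example; use
the plugin and parameters called out in the configuration steps for your system:
# stonith -L
# stonith -t external/riloe -n
The first command lists the available STONITH plugin types; the second prints the configuration
parameters the named plugin expects, typically the iLO address and credentials.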
5.2.1 Preparing Heartbeat
1. Verify that the Heartbeat RPMs are installed (an example query follows this list):
libnet-1.1.2.1-2.2.el5.rf
pils-2.1.3-1.01hp
stonith-2.1.3-1.01hp
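One way to verify is to query the RPM database on each server node, for example:
# rpm -q libnet pils stonith
Each installed package is reported with its version; any package reported as not installed must be
added from the HP SFS G3.2-0 kit before you configure Heartbeat.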