Designing Storage Area Networks

A Practical Reference for Implementing Fibre Channel SANs

TOM CLARK
VMWARE-1011 / Page 1 of 52
`
`
`
`THE ADDISON-WESLEY NETWORKING BASICS SERIES
`
`
`
`
Designing Storage Area Networks
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
The Addison-Wesley Networking Basics Series

The Addison-Wesley Networking Basics Series is a set of concise, hands-on guides to
today’s key technologies and protocols in computer networking. Each book in the
series covers a focused topic and explains the steps required to implement and work
with specific technologies and tools in network programming, administration, and
security. Providing practical, problem-solving information, these books are written by
practicing professionals who have mastered complex network challenges.

Tom Clark, Designing Storage Area Networks: A Practical Reference for
Implementing Fibre Channel SANs, 0-201-61584-3

Gary Scott Malkin, RIP: An Intra-Domain Routing Protocol, 0-201-43320-6

Geoff Mulligan, Removing the Spam: Email Processing and Filtering,
0-201-37957-0

Alvaro Retana, Russ White, and Don Slice, EIGRP for IP: Basic Operation
and Configuration, 0-201-65773-2

Richard Shea, L2TP: Implementation and Operation, 0-201-60448-5

John W. Stewart III, BGP4: Inter-Domain Routing in the Internet,
0-201-37951-1

Brian Tung, Kerberos: A Network Authentication System,
0-201-37924-4

Andrew F. Ward, Connecting to the Internet: A Practical Guide about
LAN-Internet Connectivity, 0-201-37956-2

Visit the Series Web site for new title information:

http://www.awl.com/cseng/networkingbasics/
`
`
`
`
`
`
`
`
Designing Storage Area Networks
`
`
`
`
`
`
`
`
`
`
`A Practical Reference
`
`for Implementing
`
Fibre Channel SANs
`
`
`
`
`Tom Clark
`
`
`Addison-Wesley
An Imprint of Addison Wesley Longman, Inc.
Reading, Massachusetts • Harlow, England • Menlo Park, California
Berkeley, California • Don Mills, Ontario • Sydney • Bonn
Amsterdam • Tokyo • Mexico City
`
`
`
`
`
`
`
Many of the designations used by manufacturers and sellers to distinguish their products
are claimed as trademarks. Where those designations appear in this book, and Addison
Wesley Longman, Inc. was aware of a trademark claim, the designations have been
printed in initial capital letters or in all capitals.
`
`The author and publisher have taken care in the preparation of this book, but make no
`expressed or implied warranty of any kind and assume no responsibility for errors or
`omissions. No liability is assumed for incidental or consequential damages in connection
`with or arising out of the use of the information or programs contained herein.
`
`The publisher offers discounts on this book when ordered in quantity for special sales.
`For more information, please contact:
`
`AWL Direct Sales
`Addison Wesley Longman, Inc.
`One Jacob Way
Reading, Massachusetts 01867
`(781) 944-3700
`
Visit A-W on the Web: www.awl.com/cseng/
`
Library of Congress Cataloging-in-Publication Data
Clark, Tom, 1947–
  Designing storage area networks : a practical reference for
implementing Fibre Channel SANs / Tom Clark.
    p. cm. — (The Addison-Wesley networking basics series)
  Includes bibliographical references.
  1. Computer networks. 2. Information storage and retrieval systems.
3. Internetworking (Telecommunication) I. Title. II. Series.
TK5105.5.C547 1999
004.6—dc21                                                99-33181 CIP
`
`Copyright © 1999 by Addison Wesley Longman, Inc.
`
All rights reserved. No part of this publication may be reproduced, stored in a retrieval
system, or transmitted, in any form, or by any means, electronic, mechanical, photo-
copying, recording, or otherwise, without the prior consent of the publisher. Printed in
the United States of America. Published simultaneously in Canada.
`
`Text printed on recycled and acid-free paper.
ISBN 0-201-61584-3
3 4 5 6 7 8 CR 03 02 01 00
3rd Printing March 2000
`
`
`
`
`
`
`
`
`
`
Contents

Preface  ix

Chapter 1: Introduction  1
  1.1  A Paradigm Shift  1
  1.2  Text Overview  4
  1.3  Summary  6

Chapter 2: Storage and Networking Concepts  7
  2.1  Networking in Front of the Server  9
       2.1.1  Serial Transport  13
       2.1.2  Access Method  13
       2.1.3  Addressing  13
       2.1.4  Packetizing of Data  14
       2.1.5  Routing of Packets  14
       2.1.6  Upper-Layer Protocol Support  14
  2.2  Traditional SCSI Bus Architecture  15
  2.3  Network-Attached Storage  17
  2.4  Networking behind the Server  19
  2.5  Summary  20

Chapter 3: Fibre Channel Internals  23
  3.1  Fibre Channel Layers  23
  3.2  Gigabit Transport  24
  3.3  Physical-Layer Options  26
  3.4  Data Encoding  30
  3.5  Ordered Sets  33
  3.6  Framing Protocol  34
  3.7  Class of Service  36
  3.8  Flow Control  39
  3.9  Name and Addressing Conventions  40
  3.10  Summary  43

Chapter 4: SAN Topologies  47
  4.1  Point-to-Point  47
  4.2  Arbitrated Loop  49
       4.2.1  Loop Physical Topology  50
       4.2.2  Loop Addressing  52
       4.2.3  Loop Initialization  55
       4.2.4  Port Login  62
       4.2.5  Loop Port State Machine  63
       4.2.6  Arbitration  64
       4.2.7  Nonbroadcast Nature of Arbitrated Loop  66
       4.2.8  Design Considerations for Arbitrated Loop  68
  4.3  Fabrics  75
       4.3.1  Fabric Login  79
       4.3.2  Simple Name Server  80
       4.3.3  State Change Notification  81
       4.3.4  Private Loop Support  82
       4.3.5  Fabric Zoning  84
  4.4  Building Extended SANs  85
  4.5  Summary  86

Chapter 5: Fibre Channel Products  89
  5.1  Gigabit Interface Converters (GBICs)  90
  5.2  Host Bus Adapters  92
  5.3  Fibre Channel RAID  95
  5.4  Fibre Channel JBODs  99
  5.5  Arbitrated Loop Hubs  101
       5.5.1  Star Topology for Arbitrated Loop  102
       5.5.2  Hub Architecture  103
       5.5.3  Unmanaged Hubs  107
       5.5.4  Managed Hubs  108
  5.6  Switching Hubs  112
  5.7  Fabric Switches  113
  5.8  Fibre Channel-to-SCSI Bridges  116
  5.9  SAN Software Products  117
  5.10  Summary  121

Chapter 6: Problem Isolation in SANs  125
  6.1  Simple Problem-Isolation Techniques  126
  6.2  Fibre Channel Analyzers  130
  6.3  Summary  133

Chapter 7: Management of SANs  135
  7.1  Storage Network Management  137
       7.1.1  In-Band Management  138
       7.1.2  Out-of-Band Management  139
       7.1.3  SNMP  140
       7.1.4  HTTP  143
       7.1.5  TELNET  144
       7.1.6  Storage Network Management Issues  144
  7.2  Storage Resource Management  146
  7.3  Storage Management  147
  7.4  Storage, Systems, and Enterprise Management Integration  148
  7.5  Summary  149

Chapter 8: Application Studies  151
  8.1  Full-Motion Video  151
  8.2  Prepress Operations  154
  8.3  LAN-Free and Server-Free Tape Backup  156
  8.4  Server Clustering  161
  8.5  Internet Service Providers  163
  8.6  Campus Storage Networks  165
  8.7  Disaster Recovery  167
  8.8  Summary  168

Chapter 9: Fibre Channel Futures  171
  9.1  Bandwidth  171
  9.2  Fibre Channel over Wide Area Networking  172
  9.3  Coexistence within Enterprise Networks  172
  9.4  Interoperability  175
  9.5  Total SAN Solutions  176
  9.6  Summary  176

Appendix A: Fibre Channel Ordered Sets  179
Appendix B: Fibre Channel Vendors  183
Bibliography  187
Glossary  189
Index  195
`
`
`
`
`
`
`
`
`
`
`
`
Chapter 4: SAN Topologies
`
Fibre Channel architecture has evolved three distinct physical topolo-
gies. The first SANs were built on a dedicated, point-to-point connec-
tion between two devices. Participants in a point-to-point topology
establish an initial connection via login and then assume full-bandwidth
availability. Arbitrated Loop allows more than two devices to commu-
nicate over a shared bandwidth. An initiator in a loop environment
must negotiate for access to the media before launching a transaction.
A Fibre Channel fabric topology provides multiple, concurrent, point-
to-point connections via link-level switching and so entails a much
higher level of complexity, both in the physical configuration and in
transport protocol.

Depending on application requirements, all three topologies may
be viable for SAN design. Although point-to-point configurations are
more restrictive, Arbitrated Loop and fabrics offer a wide range of so-
lutions for a variety of application needs. By incorporating application-
specific integrated circuits (ASICs), switch and loop hub prices have
declined while functionality has increased. And by providing attach-
ment of loop topologies to fabric ports, it is now possible to design
and implement complex SANs that efficiently service multiple, some-
times contending, applications. This was not the case just a few
years ago.
`
`
`
`4.1 Point-to-Point
`
A point-to-point topology is a simple, direct connection between two
N_Ports. As shown in Figure 4-1, the transmit lead of one N_Port is
`
`
`
`
`
`
`
`
`
Figure 4-1 A point-to-point link between a server and a disk
`
connected via copper or optical cabling to the receive lead of its part-
ner. The partner’s transmit, in turn, is cabled to the other N_Port’s
receive lead. This cabling scheme creates dedicated bandwidth between
the pair, typically 100MBps in each direction.

Before data transactions can occur, the two N_Ports must perform
an N_Port login to assign N_Port addresses, or N_Port IDs. Thereafter,
a persistent connection is maintained, with utilization of the dedicated
link determined by the application.

Although it is conceivable to have an application requiring simul-
taneous full-duplex transfers, that is, 200MBps total throughput, in
practice only one side of the link sees any real traffic at a given
moment. A server and a disk in a point-to-point configuration would
normally be performing either reads or writes of data but not both
concurrently. From the server’s standpoint, an ongoing read operation
would initiate incoming frames on the server’s receiver, with only ACKs
(for Class 1 or 2 service) leaving the server’s transmitter. Even then, the
100MBps available bandwidth on the server’s receive link is not likely
to saturate. Link utilization in point-to-point configurations is deter-
mined by the performance of the Fibre Channel controllers at either
end and the buffering available to queue up data to be transmitted or
received.
`
The original point-to-point configurations were based on quarter-
speed, 266Mbps, bandwidth, with an effective throughput of 25MBps.
Distributed primarily by Sun Microsystems, quarter-speed imple-
mentations used fiber-optic GLMs as the transceiver interface to the
host bus adapter and disk controller logic. Tens of thousands of these
systems have been shipped to customers over the past several years,
creating a large but unadvertised base of Fibre Channel products in
`
`
`
`
`
`
`
`production environments. This legacy base has provided valuable
`experience for improving both the physical transport and protocol
`support.
`As server and disk performance have increased, Fibre Channel
`throughput has superseded quarter and half speed and now provides
`full gigabit bandwidth. At the same time, advances in fabric and Arbi-
`trated Loop technology have enabled more flexibility and functional-
`ity than point to point provides. A point-to-point configuration is still
`viable for simple configurations; for growth of the SAN, however, it is
`important to select the proper HBA and controller components. If the
vendor includes device drivers or microcode for both point-to-point
`protocol and Arbitrated Loop, accommodating additional devices on
`the SAN can be accomplished with minimal pain.
`
`
`4.2 Arbitrated Loop
`
`Arbitrated Loop is the most commonly deployed topology for Fibre
`Channel SANs. Loops provide more flexibility and support for more
`devices than does point to point and are more economical per port
`than are fabric switches. Loop-capable HBAs, Fibre Channel disks,
`and Fibre Channel—to—SCSI bridges are also more prevalent than
`fabric-capable devices, primarily because most of these devices have
`already passed through a development and interoperability cycle and
`have emerged as more stable products.
Arbitrated Loop is a shared, gigabit transport. Like shared Ether-
net or Token Ring segments, the functional bandwidth available to any
individual loop device is determined by the total population on the seg-
ment and the level of activity of the other participants: more active
talkers, less available bandwidth. An Arbitrated Loop with 50 equally
active nodes, for example, would provide 100MBps/50, or only
2MBps functional bandwidth per node. Arbitrated Loop would there-
fore not be a popular choice for SANs were it not for the fact that a
typical storage network has relatively few active contenders for band-
width. Although a single loop may have more than a hundred disk
drives, there are usually no more than four to six servers initiating
requests to those drives. Large configurations are thus possible on a
single loop without dividing the bandwidth down to the level of ordi-
nary Ethernet.
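The shared-bandwidth arithmetic above can be sketched in a few lines. Python is used purely for illustration, and the function name is ours; the calculation assumes all active nodes contend equally, which is a deliberate simplification.

```python
def per_node_bandwidth(total_mbps: float, active_nodes: int) -> float:
    """Functional bandwidth per device on a shared loop, assuming all
    active nodes contend equally for the transport."""
    return total_mbps / active_nodes

# The 50-node example from the text: only 2MBps per node.
assert per_node_bandwidth(100, 50) == 2.0

# A more typical SAN: many disks, but only five active initiators.
assert per_node_bandwidth(100, 5) == 20.0
```

The second case shows why large loops remain practical: the divisor is the number of *active* talkers, not the total device count.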
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
Since the transport is shared, some means must be provided for
orderly access to the media. In Arbitrated Loop, media access is gained
through an arbitration protocol. Once an NL_Port has arbitrated and
won control of the transport, it has the full 100MBps bandwidth avail-
able for its transaction. When the transaction is complete, the NL_Port
closes the temporary connection, making the transport available to
others.
`
`4.2.1 Loop Physical Topology
`
`
`
Arbitrated Loop is a true physical loop, or ring, created by tying the
transmit lead of one NL_Port to the receive lead of its downstream
neighbor. The neighbor’s transmit is, in turn, connected to the receiver
of yet another NL_Port, and so on, until the circle completes at the
original NL_Port’s receiver. In this way, a continuous data path exists
through all the NL_Ports, allowing any device to access any other
device on the loop, as illustrated in Figure 4-2.

The first Arbitrated Loops were built in this fashion, using copper
or fiber-optic cabling to create the daisy chain of NL_Ports. Several
problems quickly arose. Powering off or disconnecting a single node
would break the chain and thus crash the loop. A break in cabling or a
faulty transceiver anywhere along the loop would also halt loop traffic
and entail tedious troubleshooting to locate the problem. Similar to the
problems encountered in hardwired Token Ring topologies, the over-
head and risks associated with dispersed loop cabling promoted the
development of centralized Arbitrated Loop hubs.

Arbitrated Loop hubs provide a physical star topology for a loop
configuration, bringing each NL_Port’s transmit and receive leads to a
common location. The internal architecture of a hub completes the
connections between transmitters and receivers on a port-by-port basis
via mux (multiplexer) circuitry and finishes the loop by connecting the
transmitter of the last hub port (for example, port 12) to the receiver
of the first (for example, port 1). One of the most useful features of a
hub is bypass circuitry at each port, which allows the loop to circum-
vent a disabled or disconnected node while maintaining operation.
Most unmanaged Arbitrated Loop hubs also validate proper gigabit
signaling before allowing a device to insert into the loop, whereas
managed hubs provide additional functionality. These features will be
described in more detail in Chapter 5.
`
`
`
`
`
`
`
`
Figure 4-2 A daisy chain Arbitrated Loop
`
Since an Arbitrated Loop hub supplies a limited number of ports,
building larger loops may require linking multiple hubs. This is called
hub cascading. As shown in Figure 4-3, a cascade is simply a normal
cable connection between a port on one hub and a port on another. No
special cable is required, although to minimize potential ground loop
and noise problems, fiber-optic cabling is recommended over copper.
Cascading consumes one port on the first and last hubs in a chain and
two ports on intervening hubs. Cascading four 6-port hubs, for exam-
ple, would yield 18 usable ports, with a fourth of the total ports sacri-
ficed to achieve the cascade. Depending on the vendor, hubs can be
cascaded to the Arbitrated Loop maximum of 127 ports, although the
advisability of doing so should be application-driven. Just because
some configurations can be built does not mean that they should be.
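The cascade port arithmetic can be captured in a small helper. This is a sketch, and the function name is ours; it simply encodes the rule that each of the (hubs − 1) cascade links consumes one port on each of the two hubs it joins.

```python
def usable_ports(hubs: int, ports_per_hub: int) -> int:
    """Ports left for devices in a chain of cascaded hubs: each of the
    (hubs - 1) cascade links consumes one port on each hub it joins."""
    if hubs < 1:
        raise ValueError("need at least one hub")
    return hubs * ports_per_hub - 2 * (hubs - 1)

# The text's example: four 6-port hubs yield 18 usable ports,
# a fourth of the total sacrificed to achieve the cascade.
assert usable_ports(4, 6) == 18
```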
Cascading one hub to another extends the loop through the addi-
tional ports on the downstream hub. A similar effect is achieved by
inserting a JBOD (just a bunch of disks) into a hub port. Although the
link between the hub and the JBOD consists of a single cable pair, the
JBOD itself comprises a series of Arbitrated Loop disks daisy-
chained (transmit to receive) together. The transmit and receive leads
of the JBOD interface to the hub represent not a single NL_Port but
an entire cluster, or loop segment, of multiple NL_Ports. The loop is
thus extended through the JBOD enclosure and the loop population
increased by the number of drives in the JBOD chassis. This is an
important consideration when calculating hub port requirements and
optimal loop size.

Figure 4-3 Cascaded Arbitrated Loop hubs
Arbitrated Loop standards provide address space for up to 127
devices (126 NL_Ports and 1 FL_Port) on one loop. Fibre Channel
specifications allow for 10-km runs over single-mode cabling and long-
wave fiber-optic transceivers. The reader is advised not to combine
these two concepts. Even a few 10-km links on a single loop can
severely impede loop performance, since each 10-km link incurs a 50-
microsecond propagation delay in each direction. Every transaction on
the loop would have to traverse the extended links, multiplying the
effect of each transit delay by the number of transactions. Long-haul
requirements for disaster recovery or campus networks are better
served with dedicated switch ports, with attached loops at either end
for local traffic.
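The delay arithmetic is easy to check. Assuming roughly 5 microseconds of propagation delay per kilometer of fiber, which yields the 50 microseconds per 10-km link cited above, a sketch (function name and parameters are ours):

```python
def added_latency_us(long_links: int, km_per_link: float = 10.0,
                     us_per_km: float = 5.0) -> float:
    """Extra propagation delay, in microseconds, added to each one-way
    pass around the loop by the extended links."""
    return long_links * km_per_link * us_per_km

# Two 10-km links add 100 microseconds to every circuit of the loop --
# and every transaction on the loop must traverse them.
assert added_latency_us(2) == 100.0
```

Because the cost is paid on every transaction, a handful of long links can dominate loop latency even when local traffic predominates.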
`
`4.2.2 Loop Addressing
`
An NL_Port, like an N_Port, has a 24-bit port address. If no switch
connection exists, the upper 2 bytes of this port address are zeroed to
x'00 00'. This is referred to as private loop, since devices on the loop
have no connection to the outside world. If the loop is attached to a
fabric and an NL_Port supports fabric login, the upper 2 bytes (and
possibly the last byte) are assigned a positive value by the switch. This
`mode is called public loop, since fabric-capable NL_Ports are members
`of both a local loop and a greater fabric community and need a full
`24-bit address for identity in the network. In the case of public loop
`assignment, the value of the upper 2 bytes represents the loop identi-
`fier and would be common to all NL_Ports on the same loop that per-
`formed login to the fabric.
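The private/public address layout described above can be illustrated as follows. This is a sketch: the function name and example values are ours, and the packing shown is simply the "loop identifier in the upper 2 bytes, AL_PA in the low byte" arrangement from the text.

```python
def nl_port_address(loop_id: int, al_pa: int, public: bool) -> int:
    """Compose a 24-bit NL_Port address: upper 2 bytes zero on a
    private loop, switch-assigned loop identifier on a public loop."""
    if not 0 <= al_pa <= 0xFF:
        raise ValueError("AL_PA is a single byte")
    upper = (loop_id & 0xFFFF) if public else 0x0000
    return (upper << 8) | al_pa

assert nl_port_address(0, 0xE8, public=False) == 0x0000E8   # private loop
assert nl_port_address(0x0123, 0xE8, public=True) == 0x0123E8
```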
In both public and private Arbitrated Loops, the last byte of the
24-bit port address is referred to as the Arbitrated Loop Physical
Address, or AL_PA. The AL_PA is acquired during initialization of the
loop and may, in the case of fabric-capable loop devices, be modified
by the switch during login. The 1-byte AL_PA provides a very compact
addressing scheme and allows a device’s identity to be included as part
of a 4-byte ordered set. In fact, an ordered set may include two
AL_PAs, identifying both source and destination devices. The ordered
set for Open Full Duplex, for example, is “K28.5 D17.4 AL_PD
AL_PS,” with AL_PD representing the destination address and AL_PS
representing the source address.
The total number of AL_PAs available for Arbitrated Loop
addressing is 127. This number was not determined by rigorous per-
formance testing on assorted loop topologies; nor was it calculated on
theoretical throughput given various loop populations. It is based
instead on the requirements of 8b/10b running disparity between
frames. As a frame terminates with an end-of-frame character, the EOF
forces the current running disparity negative. By Fibre Channel stan-
dard, each transmission word between the end of one frame and the
beginning of another should also leave the running disparity negative.
This function is provided by the IDLE ordered set, which has a fixed
format of K28.5 D21.4 D21.5 D21.5. The special K28.5 leaves run-
ning disparity positive. The D21.4 leaves the running disparity nega-
tive. The D21.5 characters used for the last 2 bytes are neutral
disparity. The net result is a negative running disparity at the end of the
IDLE transmission word.
`
Since the loop-specific ordered sets may include AL_PAs in the last
2 byte positions, negative running disparity is facilitated if these values
are neutral. In the Open Full Duplex ordered set cited previously, for
example, the D17.4 character following the special K28.5 would leave
the running disparity negative. If the destination and source AL_PAs
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
are neutral disparity, the Open transmission word will leave the run-
ning disparity negative. This satisfies the requirement for the next start
of frame (SOF).

If all 256 possible 8-bit bytes are dispatched to the 8b/10b encoder,
134 will emerge with neutral disparity characters. Fibre Channel
claims some of these for special purposes. The remaining 127 neutral
disparity characters have been assigned as AL_PAs.

The number “127” is thus not a recommended load for a
100MBps shared transport. It is simply the maximum number (minus
reserved values) of neutral disparity addresses that could be assigned
for loop use. At higher Fibre Channel speeds, such as 400MBps, 127
active loop participants may be quite reasonable or even considered
inadequate for some needs.
Since the AL_PA values are determined on the basis of neutral dis-
parity, a listing of hex values of AL_PAs seems to jump randomly over
some byte values and not others. Listed sequentially, the hex values of
AL_PAs would begin 00, 01, 02, 04, 08, 0F, 10, 17, 18, 1B, 1D, and
so on. The gaps in the list represent byte values that, after 8b/10b
encoding, result in nonneutral disparity characters. This is significant
for some Fibre Channel disk drives, which allow the user to set
jumpers or dip switches on a controller card to assign a fixed AL_PA
manually. Typically, the jumper positions correspond only to an index
of AL_PA values, not to actual hex values, which is the case with most
network equipment.
Arbitrated Loop assigns priority to AL_PAs, based on numeric
value. The lower the numeric value, the higher the priority. AL_PA pri-
ority is used during arbitration to give advantage to initiators, such as
file servers and fabric loop ports. An FL_Port by default has the
address x'00', which gives it the highest priority over all other
NL_Ports. When arbitrating against other devices for access to the
loop, the FL_Port will always win. This helps ensure that a valuable
resource, such as a switch, can quickly service the loop and then return
to fabric duties. During address selection, as shown in Figure 4-4,
servers typically attempt to take the highest-priority, lowest-value
AL_PAs, whereas disk arrays take lower-priority, higher-value AL_PAs.
Figure 4-4 AL_PA assignment on a small loop

A server with an AL_PA of x'01' will have a statistically higher
chance of winning arbitration against lower-priority contenders,
although Arbitrated Loop also provides safeguards against starvation
of any port.
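The priority rule reduces to "lowest AL_PA wins." A minimal sketch, ignoring the fairness safeguards the protocol layers on top (function name and example values are ours):

```python
def arbitration_winner(contending_al_pas: list[int]) -> int:
    """Of the ports arbitrating at the same instant, the one with the
    numerically lowest (highest-priority) AL_PA wins the loop."""
    return min(contending_al_pas)

# A server at x'01' beats disks at high-value AL_PAs;
# a fabric loop port at x'00' beats everyone.
assert arbitration_winner([0x01, 0xE0, 0xE8]) == 0x01
assert arbitration_winner([0x00, 0x01, 0xE0]) == 0x00
```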
`
An NL_Port’s AL_PA may change with every initialization of the
loop or reset of the device. On the surface, this may seem disruptive,
but dynamic address assignment by the topology itself greatly reduces
administrative overhead. As anyone who has had to reconfigure an IP
network can testify, offloading low-level address administration to the
topology is highly desirable. Arbitrated Loop initialization guarantees
that each attached device will have a unique AL_PA. Potential address-
ing conflicts are possible only when two separate loops are joined
together—for example, by cascading two active hubs—without initial-
ization. Some hub vendors have responded to this problem by incor-
porating an initialization sequence whenever a cascade condition is
sensed.
`
`4.2.3 Loop Initialization
`
`Loop initialization is an essential process for allowing new participants
`onto the loop, assigning AL_PAs, providing notification of topology
`changes, and recovering from loop failure. Following loop initializa-
`tion, the loop enters a stable monitoring mode and begins (or resumes)
`
`normal activity. Depending on the number of NL_Ports attached to
`the loop, an entire loop initialization sequence may take only a few
`
`milliseconds. For Sun Solaris servers, a loop initialization may result
`
`
`
`
`
`
`
`
`
`
`
`
`
`
in a message posted to the event log. For NT servers, it is largely
ignored. In either case, a loop initialization on an active loop normally
causes a brief suspension of activity, which resumes once initialization
is complete.

A loop initialization may be triggered by a number of causes, the
most common being the introduction of a new device. The new device
could be a former participant that has been powered on or an active
device that has been moved from one hub port to another.

A number of ordered sets have been defined to cover the various
conditions that an NL_Port may sense as it launches the initialization
process. These ordered sets are Loop Initialization Primitive sequences
and are referred to collectively as LIPs. An NL_Port issues at least 12
LIPs to start loop initialization. In the following examples, we will as-
sume a Fibre Channel host bus adapter installed in a file server.
`
• An HBA that is attached to an active loop and is power cycled
  will, on bootup, start processing the incoming bit stream. The
  presence of valid signal and protocol verifies that the server is
  on an active loop. Because the server was powered down,
  however, the HBA has lost the AL_PA that it was previously
  assigned. That previously assigned AL_PA was stored in a
  temporary register in the HBA, and the register was wiped
  clean by the power cycle. The HBA immediately begins
  transmitting LIP(F7, F7) onto the loop. The xF7 is a reserved,
  neutral disparity character. The first occurrence of xF7 indicates
  that the HBA recognizes that it is on an active loop. The second
  xF7 indicates that the HBA has no AL_PA.

• An HBA that is attached to an active loop is moved from one
  hub port to another. As the cable is unplugged from the hub
  and moved to the other port, the HBA temporarily loses Fibre
  Channel signal. On reinsertion, the HBA sees valid signal
  return and begins processing the bit stream. In this instance, the
  HBA still has its previously assigned AL_PA and so begins
  transmitting LIP(F7, AL_PS) onto the loop. The xF7 indicates
  that the HBA sees the active loop. The AL_PS is the source
  AL_PA of the LIP, that is, the HBA’s previously assigned
  AL_PA. In this example, the HBA is not issuing LIPs in order to
  acquire an address but to notify the loop that a topology change
  has occurred.

• The receiver of the HBA or the receive cable is broken, and the
  server has been power cycled. In this instance, the HBA does
  not see a valid signal on its receiver and assumes that a loop
  failure has occurred. It also does not recall its previously
  assigned AL_PA. The HBA therefore starts streaming LIP(F8,
  F7) onto the loop. The xF8 is another reserved, neutral
  disparity character that is used to indicate a loop-down state.
  The xF7 indicates that the HBA has no AL_PA.

• In the same scenario, if the HBA still has a previously assigned
  AL_PA, it will issue a LIP(F8, AL_PS). The xF8 indicates that
  the HBA senses loop failure. The AL_PS is the source AL_PA of
  the alert.
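The four cases above collapse to two independent questions: does the port see a valid loop, and does it still hold an AL_PA? A condensed sketch (the function and its string encoding are ours, not from the standard):

```python
from typing import Optional, Tuple

def initial_lip(sees_valid_loop: bool, al_pa: Optional[int]) -> Tuple[str, str]:
    """First byte: xF7 if an active loop is seen, xF8 on loop failure.
    Second byte: xF7 if no AL_PA is held, else the port's previously
    assigned address (AL_PS)."""
    first = "F7" if sees_valid_loop else "F8"
    second = "F7" if al_pa is None else f"AL_PS({al_pa:02X})"
    return (first, second)

assert initial_lip(True, None) == ("F7", "F7")          # power cycled, loop up
assert initial_lip(True, 0x01) == ("F7", "AL_PS(01)")   # moved ports, AL_PA kept
assert initial_lip(False, None) == ("F8", "F7")         # loop failure, no AL_PA
assert initial_lip(False, 0x01) == ("F8", "AL_PS(01)")  # loop failure, AL_PA kept
```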
`
Of the conditions listed, the most insidious for Arbitrated Loop
environments is the LIP(F8) stream. A node issuing a normal LIP(F7)
will trigger, at most, a temporary suspension of loop operations until
the initialization process is completed. A node issuing LIP(F8)s, how-
ever, will continue streaming loop-down alarms as long as it cannot
recognize loop activity on its receiver. If the node’s transmitter is con-
nected to an active loop, all NL_Ports will enter a suspended initializa-
tion state and continue to forward the offender’s LIP(F8) stream, as
shown in Figure 4-5. Normal loop initialization cannot complete, and
the loop in fact fails. This has been another challenge for Arbitrated
Loop hub vendors. Some have responded with autorecovery policies
that automatically bypass a port that is streaming LIP(F8)s.

In addition to loss of signal, an NL_Port may LIP(F8) if no valid
ordered sets are present on the loop. This may occur if an upstream
node is corrupting the bit stream, due to excessive jitter or malfunction
of processing logic. Other conditions may trigger LIPs, including a
node’s inability to arbitrate successfully for loop access. Arbitrated
Loop provides a fairness algorithm for media access, but if a partici-
pant is not playing fairly, others on the loop may LIP to reinitialize a
level playing field. Arbitrated Loop also provides a selective reset LIP
that is directed by one NL_Port to another. How the reset is imple-
mented is vendor-specific, but the selective reset LIP(AL_PD, AL_PS)
may cause the target device to reboot. This allows one NL_Port to
force a misbehaving NL_Port into a known good state.

Figure 4-5 An NL_Port streaming LIP(F8)s onto the loop
The loop initialization process begins when an NL_Port streams at
least 12 LIPs onto the loop. As each downstream device receives the
LIP stream, the device enters a state known as Open-Init, which sus-
pends any current operations and prepares the device for the loop ini-
tialization procedure. The LIPs are forwarded along the loop until all
NL_Ports, including the originator, are in an Open-Init condition.

At this point, the NL_Ports need someone to be in charge. Unlike
Token Ring, Arbitrated Loop has no permanent master to monitor the
topology. Loop initialization therefore provides a selection process to
determine which device will be the temporary loop master. Once
selected, the loop master is responsible for conducting the rest of the
initialization procedure and returning the loop to normal operation.
See Figure 4-6.

The loop master is determined by a subroutine known as the Loop
Initialization Select Master procedure, or LISM. Each loop device vies
for the position of temporary master by continuously issuing LISM
`