MeasureIT

A Review of Resilient Storage Networks: Designing Flexible Scalable Data Infrastructures
by Greg Schulz
November, 2006
Reviewed by Michael S. Hines

About the Author
Mike Hines, Purdue University

Mike has 34 years of experience in data processing related jobs, mostly with mainframe computing. For 5 of those years he has been responsible for performance management and capacity planning at Purdue. He helped establish the first capacity planning and performance monitoring role in ADPC (now ITaP). He has served as a reviewer and session coordinator at CMG National.

Publisher: Elsevier, Digital Press, 2004, 441 pp
ISBN: 1565921496

If you already work with Storage Networks, Mr. Schulz gives you many factors to consider in managing them. If you do not, just wait: I predict your time will come, perhaps sooner than you think. Either way, this book is packed with valuable information on storage, on networks, and on the Storage Networks that combine the two. It is not the kind of book you read once and put on a shelf; it is the kind you reach for whenever you evaluate your Storage Network for improvements in performance or management. If (when?) you reach the design stage of Storage Networking, keep this resource close at hand - the checklists and guidance on the design process are invaluable.

This book is organized into four parts. Part 1 makes the case for resilient Storage Networks and describes Information Lifecycle Management (ILM). Part 2 describes the various ways storage is connected to servers. Part 3 gets into the details of Storage Network design, which in turn determines performance and capacity. Part 4, entitled "Putting It All Together," provides examples of Storage Networks for Small Systems, Consolidated Storage Network systems, Metropolitan (distances of 100-150 km) Storage Network systems, Wide Area Storage Network systems, Large Storage Network systems and High-Performance Storage Network systems. Throughout the book there are discussions of hardware, software and data networks, and of how they work together to provide Storage Networks.

Part I. Why Build Resilient Networks?

Chapter 1 is about the importance of information. Information is an organization's resource and, as such, an asset. It needs to be (1) available when needed and (2) protected: (a) accessible to the appropriate parties, (b) secured from inappropriate parties, and (c) copied and stored off site for recovery purposes.

The author states that storage is growing 80 times as fast as it did 10 years ago. Most of the increase has occurred in open systems, while growth in legacy environments has typically been flat.

To build an ILM, one must determine how data is created, stored, accessed, retained, and retired. Equipped with this information, different classes of data can be moved to different tiers of storage. The ILM is also affected by regulatory and contractual requirements, acts of nature, acts of man (accidental or intentional), technology failure, and chains of events and fault containment ('the domino effect'). Availability requirements also determine the structure of data access. After a failure with no backup, recovery could take days or weeks. If recovery must occur in seconds, then live backups in dispersed locations are required. Such rapid recovery comes with high cost and system complexity. Availability should be spelled out in a Service Level Agreement (SLA).
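To make an SLA availability figure concrete, it helps to convert it into a yearly downtime budget. The following is a minimal sketch of that arithmetic only; the function name and the sample targets are illustrative assumptions, not taken from the book.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def allowed_downtime_hours(availability_percent: float) -> float:
    """Return the yearly downtime budget implied by an availability target."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100 percent")
    return HOURS_PER_YEAR * (1.0 - availability_percent / 100.0)


if __name__ == "__main__":
    # Sample targets are assumptions chosen for illustration.
    for target in (99.0, 99.9, 99.99, 99.999):
        hours = allowed_downtime_hours(target)
        print(f"{target}% available allows {hours:.2f} hours "
              f"({hours * 60:.1f} minutes) of downtime per year")
```

Each additional "nine" cuts the downtime budget by a factor of ten, which is one way to see why recovery in seconds demands the cost and complexity described above.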

Chapter 2 covers data storage and access fundamentals - how data is accessed and organized. This includes units of measure from the byte (8 bits) to the Zettabyte (1 Zettabyte = 1000 Exabytes; 1 Exabyte = 1 million Terabytes; 1 Petabyte = 1000 Terabytes; 1 Terabyte = 1000 Gigabytes; 1 Gigabyte = 1000 Megabytes; 1 Megabyte = 1 million bytes). So 1 Exabyte (EB) = 1 x 10^18 bytes and 1 Zettabyte (ZB) = 1 x 10^21 bytes.
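The decimal unit ladder can be checked with a few lines of Python. This is a sketch assuming decimal (power-of-1000) units throughout, as in the review; the names `DECIMAL_UNITS`, `bytes_in` and `convert` are illustrative.

```python
# Decimal storage units, each 1000 times the previous one.
DECIMAL_UNITS = [
    "byte", "kilobyte", "megabyte", "gigabyte",
    "terabyte", "petabyte", "exabyte", "zettabyte",
]


def bytes_in(unit: str) -> int:
    """Return the number of bytes in one of the given decimal unit."""
    return 1000 ** DECIMAL_UNITS.index(unit)


def convert(value: float, from_unit: str, to_unit: str) -> float:
    """Convert a quantity between two decimal storage units."""
    return value * bytes_in(from_unit) / bytes_in(to_unit)


if __name__ == "__main__":
    print(bytes_in("exabyte"))                   # bytes in one exabyte
    print(convert(1, "exabyte", "terabyte"))     # terabytes in one exabyte
    print(convert(1, "zettabyte", "exabyte"))    # exabytes in one zettabyte
```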

Classes of storage fall into the following categories: spinning disk media (fastest access, most expensive), optical media (slower access, less expensive), magnetic tape media (slowest access, least expensive).

Part II. Networking with Your Storage

Chapter 3 discusses the ways storage can be interconnected and connected to servers (networks and I/O channels). It also discusses interconnections between storage devices. Once storage is dispersed and connected with networks, network latency can become an issue. Various network interfaces and performance figures are provided in the chapter.

Three types of data access are discussed: block access (SAN, FAS, DAS), file access (NAS, file/data sharing), and object access (CAS). Both SAN and NAS are part of storage networking.

The typical connection scheme is server to storage device. Devices can also be connected to each other to provide distributed mirroring of data, and servers can be connected to each other to provide high-speed channel-to-channel communication. The chapter ends with the trade-offs between local and remote access.

Next, interfaces and performance are discussed. Interface connectors on the server side include HBA, ESCON/FICON, and SCSI/iSCSI. Network connections to the storage device can be Fibre Channel, Ethernet (10/100/1000/10000 Mb, the last being 10 Gb), InfiniBand, ATA/SATA or FireWire/USB. Each combination of interconnections has different performance characteristics (and costs), which are presented in various tables in this chapter.

Chapter 5 dives into the specifics of fiber-optic communication components, including cabling, connectors, transceivers, and cable plant (fiber). The chapter presents performance factors for fiber optics.

Chapter 6 expands our horizon by discussing various configurations for Metropolitan Storage Networks (MAN, typically 100 - 150 km) and Wide Area Storage Networks (WAN, typically > 150 km). MANs and WANs interconnect data centers to share resources and to centralize management of widely dispersed resources. Beyond approximately 100 km, network latency (point-to-point transfer time) becomes a performance issue.
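A rough calculation shows why distance matters. The sketch below estimates propagation delay only, assuming light travels through fiber at about 200,000 km/s (roughly two thirds of its speed in a vacuum); that figure and the function names are assumptions for illustration, and real links add switch, protocol and queuing delays on top.

```python
# Assumption: signal speed in optical fiber, about two thirds of c.
SPEED_IN_FIBER_KM_PER_S = 200_000.0


def one_way_delay_ms(distance_km: float) -> float:
    """Propagation delay in milliseconds for a single trip over the fiber."""
    return distance_km / SPEED_IN_FIBER_KM_PER_S * 1000.0


def round_trip_delay_ms(distance_km: float) -> float:
    """Propagation delay for a request and its acknowledgement."""
    return 2.0 * one_way_delay_ms(distance_km)


if __name__ == "__main__":
    for km in (10, 100, 150, 1000):
        print(f"{km} km: one way {one_way_delay_ms(km):.2f} ms, "
              f"round trip {round_trip_delay_ms(km):.2f} ms")
```

Under this assumption a synchronous write mirrored over 100 km waits about a millisecond per round trip before any device or protocol time is counted, which becomes significant for workloads that issue many small writes.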

Next, in Chapter 7, Storage Network Devices are discussed. It's no secret that Storage Network Devices consist of Storage and Networks. There are a variety of storage devices that can be used - both disk and tape. There are a variety of disk type storage devices from single drives to "Just a Bunch of Disks" (JBOD) - a large electronic component with numerous disks in it. In addition there are a variety of network hardware and software methods to interconnect the devices - switches and directors, bridges, gateways and routers. The chapter concludes with a discussion of test/diagnostic devices.

Part III. Resilient Storage Networks

Chapter 8 deals with Storage Network Design. Design is influenced by business and regulatory needs (access and data retention considerations). Storage Network designs should be flexible for performance, maintenance, growth, security and availability. A comprehensive checklist for the design process is provided.

Next, in Chapter 9, various Storage Network Topologies are discussed. A resilient Storage Network should have redundant paths and redundant networks, and its design should be determined by business requirements. Storage Networks and Fabric Networks are discussed. The Fabric is the SAN (the networking components which connect storage devices together). The Storage Network consists of the Storage Devices (disks and controllers) along with the Fabric. The chapter concludes with a discussion of interswitch link (ISL) topologies and best practices.

Performance and Capacity Planning for Storage Networks is the topic of Chapter 10. The tools and methods of performance assessment and capacity planning are brought to bear on Storage Networks. Capacity planning includes an assessment of what you have, who is using it, and expected future growth patterns (the forecast). It should also incorporate the Information Lifecycle you designed earlier. Performance assessments consider CPU, switch, network and storage path, and storage devices. One should also include backup performance estimates in planning (large volumes of data moving over networks). Tools exist to assist with baseline and trend analysis as well as with establishing benchmarks and testing performance tuning. The chapter concludes by discussing organizations involved with Storage Network performance assessment and capacity planning, namely: Computer Measurement Group (CMG), Storage Performance Council (SPC), and Transaction Processing Performance Council (TPC).
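The forecasting step can be sketched with a simple trend line. The example below fits a least-squares straight line to monthly usage samples and estimates how many months remain until installed capacity is reached. It is a minimal illustration, not the book's method: the linear-growth model, the function names and the sample figures are all assumptions.

```python
from typing import List, Optional, Tuple


def fit_linear_trend(usage: List[float]) -> Tuple[float, float]:
    """Least-squares line through (0, usage[0]), (1, usage[1]), ...

    Returns (slope, intercept), where slope is growth per sample period.
    """
    n = len(usage)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_x = (n - 1) / 2.0
    mean_y = sum(usage) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(usage))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def months_until_full(usage: List[float], capacity: float) -> Optional[float]:
    """Months after the last sample until the trend line reaches capacity.

    Returns None when usage is flat or shrinking (no exhaustion forecast).
    """
    slope, intercept = fit_linear_trend(usage)
    if slope <= 0:
        return None
    full_at = (capacity - intercept) / slope  # month index where line hits capacity
    return max(0.0, full_at - (len(usage) - 1))


if __name__ == "__main__":
    monthly_usage_tb = [10.0, 12.0, 14.0, 16.0, 18.0]  # hypothetical samples
    print(months_until_full(monthly_usage_tb, capacity=30.0))
```

A straight line is the simplest possible forecast; real storage growth is often seasonal or compounding, so a planner would compare several models against the baseline data before committing to one.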

Chapter 11 is about Storage Management. While the unit cost of storage devices is decreasing, demand is growing. With growing demand comes an increased cost of managing storage, and the value of the information stored grows as well. Storage management entails the following activities: configuration, provisioning (allocation), logical volume management, event monitoring and notification, server clustering and failover tools, diagnostics, resource monitoring, data protection and retention, and data and storage security. The tools for Storage Management range from logical data managers (a high-level abstraction of the SAN) to device and element managers at the device level. The typical interfaces use either SNMP or SMIS. With 'in-band control', data and control information flow on the same data path. With 'out-of-band control', separate paths are used for data and control information. Out-of-band is the preferred method for resilient Storage Networks. There is a section on the SMIS standard, which is based on the Common Information Model (CIM) defined by the Distributed Management Task Force (DMTF). The underlying data transport mechanism in SMIS is XML over HTTP. The chapter ends with a discussion of virtualization and abstraction of storage, data, hosts, fabric and networks. A tool checklist/evaluation guide is provided.

Chapter 12, Protecting Data, points out the ways data can be lost: accidental or intentional deletion, data corruption, theft, component failure, etc. To provide resilient access to data one needs component redundancy (devices and I/O paths), mirroring and replication (RAID), backup-restore-archive-retention plans (the Information Lifecycle), snapshots, journaling file systems, and point-in-time copies of data.

Securing Storage and Storage Networks is the topic of Chapter 13. The risks to consider are physical attacks on cables and switches, eavesdropping (sniffing), unauthorized SNMP traps and alerts, unauthorized telnet commands, attacks on ports by unauthorized servers and devices, among others. Protection is needed along the full path from client to server and between servers. This includes ports, devices, transmission networks, management interfaces, resources, and disk volumes. Various control methods include:

  • LUN mappings (server to device access control)
  • Access Control Lists (ACLs) on fabric, switches, and ports (control access and authorizations)
  • Port Binding (ACLs and WWPN to control port connections)
  • Switch Binding (ACLs and WWNN to control switch interconnections)
  • Fabric Binding (ACLs and Domain Name to control SAN access)
  • Zoning - virtual subnets created within the fabric to isolate and restrict traffic
  • Persistent Binding - server based zoning and isolation (HBA configuration specifies WWPN of storage network device).
  • Fabric WWN
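How two of these controls combine can be sketched in a few lines. The toy model below treats zoning and LUN mapping as two independent checks that must both pass before a server (initiator) reaches a volume. It is an illustrative assumption only, not a vendor API or the book's design; the class name, method names and WWPN values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class Fabric:
    """Toy access-control model (hypothetical, for illustration only)."""
    # zone name -> WWPNs that are members of that zone
    zones: Dict[str, Set[str]] = field(default_factory=dict)
    # (target WWPN, LUN number) -> initiator WWPNs allowed to see that LUN
    lun_map: Dict[Tuple[str, int], Set[str]] = field(default_factory=dict)

    def share_zone(self, a: str, b: str) -> bool:
        """True when both ports are members of at least one common zone."""
        return any(a in m and b in m for m in self.zones.values())

    def can_access(self, initiator: str, target: str, lun: int) -> bool:
        """Access requires a shared zone AND a LUN mapping entry."""
        allowed = self.lun_map.get((target, lun), set())
        return self.share_zone(initiator, target) and initiator in allowed


if __name__ == "__main__":
    # All WWPNs below are made-up placeholders.
    server_a = "10:00:00:00:00:00:00:0a"
    server_b = "10:00:00:00:00:00:00:0b"
    array = "50:00:00:00:00:00:00:01"
    fabric = Fabric(
        zones={"zone_a": {server_a, array}},
        lun_map={(array, 0): {server_a, server_b}},
    )
    print(fabric.can_access(server_a, array, 0))  # zoned and mapped
    print(fabric.can_access(server_b, array, 0))  # mapped but not zoned
```

Layering the checks means a mistake in one control (here, a LUN mapping that is too broad) is still contained by the other.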
To protect data in transit, encryption should be employed. To protect the infrastructure, physical security should be employed (secured server rooms; redundant power, heating and cooling, and alternate network paths). Disposal of Storage Network devices should also be planned - so data is not accidentally released to unauthorized parties.

Part IV. Putting It All Together

These chapters illustrate various applications of Storage Area Networks.

Chapter 14 shows how Storage Area Networks are appropriate for even SOHO (Small Office / Home Office) and SMB (Small and Medium Businesses).

Chapter 15 shows how you can Consolidate and Intermix Storage. You can consolidate servers, storage, or SANs. In addition you can mix storage devices in Storage Networks. Windows and Open Systems use Fixed Block architecture (data is written in fixed block size, typically 512 bytes per block). IBM Mainframes use a Count-Key-Data format which is variable length. Storage Networks can host both types of storage on the same Storage Network system (Intermixing).

Various metropolitan area and wide area Storage Networks are illustrated in Chapter 16. The point is made that distance between storage systems provides for survivability and business continuation.

Chapter 17 addresses the need for Large and High-Performance Storage Networks. Applications such as data mining, data warehouses, imaging, video storage, archiving, and backup require large amounts of storage. Other applications, such as transaction processing systems, may need to process many I/O operations per second or to transport large volumes of data on demand. Live streaming video is an example that needs both large volumes of data and high transfer speeds at the same time.
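The difference between I/O-intensive and bandwidth-intensive workloads follows from a simple relationship: throughput equals I/O operations per second multiplied by the size of each operation. The sketch below applies that arithmetic; the workload figures are hypothetical and the function name is illustrative.

```python
def throughput_mb_per_s(iops: float, io_size_kb: float) -> float:
    """Throughput in decimal MB/s for a given I/O rate and I/O size in KB."""
    return iops * io_size_kb / 1000.0


if __name__ == "__main__":
    # Hypothetical workloads chosen only to contrast the two profiles.
    workloads = {
        "transaction processing (many small I/Os)": (20_000, 4),
        "video streaming (fewer large I/Os)": (500, 1024),
    }
    for name, (iops, size_kb) in workloads.items():
        print(f"{name}: {iops} IOPS x {size_kb} KB = "
              f"{throughput_mb_per_s(iops, size_kb):.0f} MB/s")
```

In this illustration the streaming workload moves far more data per second with a fraction of the operations, so the two profiles stress different parts of the storage path.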

Chapter 18 is a Wrap-Up of the book. It reviews what's already been presented. Business needs should determine design. Flexible design is needed for growth. Management of Storage is necessary. Protection of Storage is vital to business survival. Information Lifecycle Management is vital to the entire process - putting the right data in the right place at the right time.

Three Appendices provide additional resources:

    A - Useful Web Sites - extensive list of approx. 100 web sites
    B - Resilient Storage Networking Checklist - factors to consider
    C - Glossary of Storage Networking Terminology (supplemented below with some acronyms not in the glossary)

Supplemental Glossary of Acronyms:
ATA - Advanced Technology Attachment hard drive interface (aka Integrated Drive Electronics or IDE, and more recently Parallel ATA or P-ATA for the data transfer method used)
CAS - Content Addressable Storage
DAS - Direct Attached Storage
ESCON - Enterprise Systems Connection (17 MB/s bandwidth)
FAS - Fabric Attached Storage
FICON - Fibre Connection (400 MB/s bandwidth)
HBA - Host Bus Adapter
HTTP - Hypertext Transfer Protocol
ISL - Interswitch Link
NAS - Network Attached Storage
SAN - Storage Area Network
SCSI/iSCSI - Small Computer System Interface (iSCSI = SCSI over IP)
SATA - Serial ATA (packet switched network between motherboard and disk drive)
SMIS - Storage Management Initiative Specification
SNMP - Simple Network Management Protocol
USB - Universal Serial Bus
WWPN - Worldwide Port Name
WWNN - Worldwide Node Name
XML - Extensible Markup Language

 

Last Updated 11/21/06



Computer Measurement Group