Sun Microsystems, Inc.
UBRM05-104
500 Eldorado Blvd Broomfield, CO 80021
Maintenance and Administration
IES-421
decompilation. No part of this product or document may be reproduced in any form by any means without prior written authorization of Sun and its licensors, if any.
Third-party software, including font technology, is copyrighted and licensed from Sun suppliers.
Sun, Sun Microsystems, the Sun logo, OpenBoot, Solaris, Solstice DiskSuite, SunATM, Sun Blade, Sun Enterprise, Sun Fire, Sun Java, SunOS, SunSolve, SunSolve Online, SunSpectrum, Sun StorEdge, SunVTS, and Ultra are trademarks or registered trademarks of Sun Microsystems, Inc. in the U.S. and other countries.
All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. in the U.S. and other countries. Products bearing SPARC trademarks are based upon an architecture developed by Sun Microsystems, Inc.
ORACLE is a registered trademark of Oracle Corporation.
Federal Acquisitions: Commercial Software – Government Users Subject to Standard License Terms and Conditions
Export Laws. Products, Services, and technical data delivered by Sun may be subject to U.S. export controls or the trade laws of other countries. You will comply with all such laws and obtain all licenses to export, re-export, or import as may be required after delivery to You. You will not export or re-export to entities on the most current U.S. export exclusions lists or to any country subject to U.S. embargo or terrorist controls as specified in the U.S. export laws. You will not use or provide Products, Services, or technical data for nuclear, missile, or chemical biological weaponry end uses.
DOCUMENTATION IS PROVIDED “AS IS” AND ALL EXPRESS OR IMPLIED CONDITIONS, REPRESENTATIONS, AND WARRANTIES, INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR NON-INFRINGEMENT, ARE DISCLAIMED, EXCEPT TO THE EXTENT THAT SUCH DISCLAIMERS ARE HELD TO BE LEGALLY INVALID.
THIS MANUAL IS DESIGNED TO SUPPORT AN INSTRUCTOR-LED TRAINING (ILT) COURSE AND IS INTENDED TO BE USED FOR REFERENCE PURPOSES IN CONJUNCTION WITH THE ILT COURSE. THE MANUAL IS NOT A STANDALONE TRAINING TOOL. USE OF THE MANUAL FOR SELF-STUDY WITHOUT CLASS ATTENDANCE IS NOT RECOMMENDED.
Export Commodity Classification Number (ECCN) assigned: 12 December 2002
Sun Proprietary: Internal Use Only
About This Course
    Course Goals
    Course Map
    Topics Not Covered
    How to Use the Course Materials
    Conventions
    Icons
    Typographical Conventions
Sun Fire™ High-End Server Product and Architecture Review and Assessment
    Objectives
    Relevance
    Additional Resources
    Reviewing Sun Fire HES
    Self-Assessment Review
    Product Review
    Sun Fire HES Products
    Board Sets
    Domain Configurable Units (DCUs)
    Reviewing the System Controllers
    Reviewing the Fireplane Operations
    Understanding Memory Coherency
    Coherency Protocol
    Snoopy Coherency
    Scalable Shared Memory
    Understanding Bus Interconnect Levels
    Interconnect Levels
    Address Interconnect
    Data Interconnect
    Self-Assessment Review Answers

    Understanding Enterprise Installation Services Methodology
    Understanding the EISdoc Tool
    Goals of the EISdoc Tool
    EISdoc Resources
    Using the EIS Installation and Patch CD-ROM
    Sun Fire HES EIS Installation Checklist
    Other EIS-CD Tools
    Exploring System Components
    Exploring Board Sets
    System Board Set
    Exploring System Boards
    CPU and Memory Configuration
    Memory Configuration Rules
    System Board LEDs
    Exploring Sun Fire HES I/O Support
    hsPCI I/O Assembly Slot Assignments
    hsPCI I/O Assembly Status LEDs
    Exploring the MaxCPU Board
    MaxCPU Board Components and Status LEDs
    Exploring the Sun Fire Link Board
    Sun Fire Link Board Components and Status LEDs
    Exploring the Expander Board
    Exploring the System Controller
    System Controller Components
    System Controller Faceplate
    System Controller LEDs and Controls
    System Controller Peripheral Board LEDs
    System Controller Physical Locations
    Exploring Centerplanes
    Power Centerplane
    Fan Backplane
    Exploring Carrier Plates
    Carrier Plate Label
    Understanding Electrical Specifications
    Processor Cabinet Power System
    Specifications (Sun Fire 15K and 12K Servers Only)
    Understanding Input Power Requirements
    Sun Fire 12K Server Power Requirements
    Power Factor
    Power Distribution
    Exploring AC Power Supplies
    Power Supply LEDs
    Installing Power Cables
    Powering Off the Sun Fire HES
    Exploring the Processor Cabinet Cooling System
    Exploring Fan Trays
    Exploring FrameManager
    FrameManager Front Panel
    Physical Locations
    Exercise 1: Identifying the Contents of the EIS-CD
    Preparation
    Tasks
    Exercise 2: Removing and Installing Sun Fire HES FRUs
    Preparation
    Tasks
    Exercise 3: Powering the Platform On and Off
    Preparation
    Tasks
    Exercise Summary
    Exercise 1 Solutions
    Tasks
    Exercise 2 Solutions
    Tasks
    Exercise 3 Solutions
    Tasks
Managing the Sun Fire HES System Controller Networks and Software
    Objectives
    Relevance
    Additional Resources
    Reviewing System Architecture
    Exploring System Controller Networking
    Management Networks (MANs)
    Console Bus
    Exploring System Controller Software
    SMS Capabilities
    Reviewing SMS Startup
    Message Logging Daemon
    SMS Startup Daemon
    Hardware Access Daemon
    Management Network Daemon
    FRU Access Daemon
    Failover Management Daemon
    Platform Configuration Database Daemon
    Key Management Daemon
    Domain X Server
    Domain Configuration Administration Server
    Exploring SC Environment Variables
    Exploring SMS File System Structure
    Examining SMS Installation
    Installing the System Management Services Software
    Installing the SMS Software Packages Using the smsinstall Command
    Installing the SMS Software Packages Using the smsupgrade Command
    Upgrading SMS Versions
    Configuring the System Controller
    Changing the Configuration During the Initial Startup
    Configuring the Management Network
    The smsconfig Script
    Using the smsconfig Script
    Configuring the Name Services
    Securing the Sun Fire HES System Controller
    Using the smsconfig Security Option
    Exercise: Configuring the System Management Services
    Preparation
    Task 1 – Installing and Configuring SMS Software
    Task 2 – Logging In to the Actual SC
    Exercise Summary
    Exercise Solutions
    Task 1 – Installing and Configuring SMS Software
    Task 2 – Logging In to the Actual SC
Configuring the Sun Fire HES Platform
    Objectives
    Relevance
    Additional Resources
    Architecture Overview
    Sun Fire HES Administrative Privileges
    Managing Administration Groups
    Adding Administrators
    Group Privileges
    Platform Administrator Group
    Platform Operator Group
    Platform Service Group
    Domain Administrator Group
    Command-by-Command Privileges
    Platform Administration Tasks
    Exploring the Available Component List
    The setupplatform Command
    The setcsn Command
    Displaying Platform Configuration
    The smsversion Command
    The showplatform Command
    The showboards Command
    Monitoring Platform Environmentals and Status
    The showenvironment Command
    The showlogs Command
    Setting Up the Platform
    The setdate Command
    The setdefaults Command
    Powering Platform Components On and Off
    The poweron Command
    The poweroff Command
    Updating the Firmware
    The flashupdate Command
    Examples of the flashupdate Command
    Backing Up and Restoring the SMS Environment
    The smsbackup Command
    The smsrestore Command
    Re-Initializing the System Controller
    The resetsc Command
    Exploring System Controller Failover
    How System Controller Failover Works
    Startup
    Main SC Role During Startup
    Spare SC Role During Startup
    File Propagation
    Failover Triggering Faults
    Failover Disabling Faults
    Exploring Detailed Failover Scenarios
    Fault on Main SC Detected by the Main SC
    Fault on Main SC Detected by the Spare SC
    I2 Network Fault
    Fault on Main SC and I2 Network Also Down
    Using System Controller Failover Commands
    The showfailover Command
    The setfailover Command
    Task 1 – Configuring Administration Groups
    Task 2 – Monitoring the Platform
    Task 3 – Configuring Domain ACLs
    Task 4 – Updating Flash PROM Images
    Task 5 – Backing Up and Restoring the SMS Environment
    Exercise Summary
    Exercise Solutions
    Task 1 – Configuring Administration Groups
    Task 2 – Monitoring the Platform
    Task 3 – Configuring Domain ACLs
    Task 4 – Updating Flash PROM Images
    Task 5 – Backing Up and Restoring the SMS Environment
Configuring Sun Fire HES Domains
    Objectives
    Relevance
    Additional Resources
    Architecture Overview
    Exploring Sun Fire HES Domains
    Static and Dynamic Domain Configuration
    Domain Configuration
    Domain Configuration Unit
    Domain Configuration Requirements
    Configuring Static Domains
    The addtag Command
    The deletetag Command
    The addboard Command
    The deleteboard Command
    The moveboard Command
    The setobpparams Command
    Domain Configuration Example
    Virtual Keyswitch
    The setkeyswitch Command
    Summary of Keyswitch Transitions
    Displaying the Virtual Keyswitch Setting in a Domain
    The hpost Utility
    File Locations of the .postrc File
    Controlling Level and Verbosity in the .postrc File
    Other Directives in the .postrc File
    Accessing the Domain Console
    The console Command
    Resetting the Domain
    The reset Command
    Exploring the OpenBoot PROM Device Tree
    OpenBoot PROM Device Tree
    Sun Fire Server Physical Device Mapping
    Device Mapping Algorithm
    Decoding CPU and Memory Locations
    Decoding I/O Card Locations
    Agent ID
    IOC PCI Buses
    Slot Number
    Sun Fire HES Server on the Solaris™ OS
    Configuring the Solaris OS in a New Domain
    Setting Up the OpenBoot PROM for the Default Boot Device
    Configuring the Domain’s MAN Interface
    Setting the Domain’s Time-of-Day Clock
    Configuring NTP in the Domain
    Finishing the Installation
    System Controller Domain Log Files
    Exercise: Creating a Sun Fire HES Domain
    Preparation
    Task 1 – Configuring a Domain
    Task 2 – Analyzing the OpenBoot PROM Device Tree
    Task 3 – Booting the Solaris OS for the First Time
    Exercise Summary
    Exercise Solutions
    Task 1 – Configuring a Domain
    Task 2 – Analyzing the OpenBoot PROM Device Tree
    Task 3 – Booting the Solaris OS for the First Time
Exploring Dynamic Reconfiguration
    Objectives
    Relevance
    Additional Resources
    Introducing Dynamic Reconfiguration
    Benefits of DR
    DR Operational Locations
    Sun Fire HES System Controller
    Sun Management Center 3.x Software
    Domain
    DR detach Operation
    DR detach States
    I/O Board Considerations
    Multipathed I/O Configuration
    Non-Multipath Configuration
    Detach-Safe and Suspend-Safe Devices
    Hot-Pluggable Hardware
    System Memory Considerations
    Detaching Permanent Memory
    Detaching Swap Space
    CPU Considerations
    Bound Threads
    Processor Sets
    Performing DR From the System Controller
    SMS Commands
    The addboard Command
    The deleteboard Command
    The moveboard Command
    Using the rcfgadm Utility
    Viewing DR Status
    Standard View
    Detailed View
    DR Procedures Using the rcfgadm Command
    Replacing a System Board or MaxCPU Board
    Replacing an I/O Board or a Card
    DR Process With VERITAS DMP
    Moving Physical Resources Between Domains
    Using Reconfiguration Coordination Manager (RCM)
    How RCM Works
    Sun Cluster Example
    Exercise: Performing DR
    Preparation
    Task 1 – Replacing a System Board
    Task 2 – Replacing an I/O Board
    Task 3 – Replacing an I/O Card
    Task 4 – Moving Physical Resources Between Domains
    Task 4a – Moving Physical Resources Between Domains
    Exercise Summary
    Exercise Solutions
    Task 1 – Replacing a System Board
    Task 2 – Replacing an I/O Board
    Task 3 – Replacing an I/O Card

    Relevance
    Additional Resources
    Exploring COD Version 2.0
    Identifying COD Systems
    Understanding COD Part Number and License Requirements
    Managing COD RTU Licenses
    Understanding COD RTU Licenses and UltraSPARC IV
    Understanding Instant Access CPUs and Headroom
    Identifying COD SMS Commands
    Managing the codd Daemon on the Sun Fire HES SC
    Installing and Removing a COD RTU License Key to and From the COD License Database
    Enabling COD Resources
    The setupplatform Command
    Verifying the COD Resource Configuration
    Monitoring COD Resources
    Obtaining COD Resource Usage Information
    Identifying COD-Disabled CPUs
    Identifying Other Useful Commands
    Servicing COD V2.0 FRUs
    Converting a Non-COD CPU/Memory Board
    Exercise: Managing COD Resources
    Preparation
    Tasks
    Exercise Summary
    Exercise Solutions
    Tasks
Troubleshooting Sun Fire HES
    Objectives
    Relevance
    Additional Resources
    Diagnostic Programs
    Power-On Self-Test
    SunVTS™ Software
    SunSolve Online℠ Service
    Sun Explorer Software
    The smshelp Command
    Examining the POST Process
    Domain Configuration
    POST Logs
    Using the hpost Test
    File Locations of the .postrc File
    Using the blacklist Files
    The blacklist File
    Using Automatic System Reconfiguration (ASR)
    Identifying an ASR Event
    Isolating Centerplane Problems
    The setbus Command
    Centerplane Fault Isolation Example
    System Failures Requiring Specific Handling
    Reboot Request
    System Panic
    Considerations
    Watchdog, Redmode, and XIR Resets
    Forcing a Domain Reset
    The reset Command
    The setkeyswitch Command
    Recovering From a Hung Domain
    Cannot Log In to the Domain
    Hard-Hung Domain
    Heartbeat Failure (Hung Host)
    Manual Intervention With a Hung Host
    Hung System Controller Console
    Domainstops and Recordstops
    Creating a Hardware State Dump File
    Viewing Error Messages
    Starting the smshelp Utility
    Types of Errors
    Error Categories
    Using the redx Utility
    Modes of Operation
    Configuring the redx Utility
    The redx Utility Run Control Files (.redxrc)
    Configuring the Offline redx Standalone Script
    Verifying the redx Utility Configuration in Offline Mode From the System Controller
    Running the redx Utility in Offline Modes
    Running the redx Utility From a Workstation
    Running the redx Utility From a System Controller
    Loading Domain Stop and Record Stop Logs
    Loading the Dump File
    Viewing a Loaded Dump File
    The wfail Operation
    Auto-Diagnosis Engines
    Scenarios With and Without the Auto-Diagnosis Engines
    Event Framework
    Fault Management Architecture (FMA)
    Summary of Diagnostic Processes
    SMS Diagnostic Engine
    POST of Domain After Component Failure
    POST Test Failure
    Solaris OS Diagnostic Engine
    Displaying and Setting the CHS
    The showchs Command
    The setchs Command
    Automatic Email Event Notification
    The testemail Command
    Using Sun Explorer Data Collector Software
    Installing and Running the Sun Explorer Utility
    Viewing a Sun Explorer Capture
    Reviewing Sun Fire HES Unique Files
    Reviewing Technical Information for Escalation
    General Information Needed
    Problem-Specific Information Needed
    Exercise 1: Troubleshooting the Sun Fire HES
    Preparation
    Task 1 – Running POST Diagnostics
    Task 2 – Managing the Blacklist
    Task 3 – Assessing a Domain Configuration
    Task 4 – Assessing a dstop and rstop Dump File Using the redx Utility
    Task 5 – Setting and Viewing the Component Health Status
    Task 6 – Viewing the Interaction of CHS and POST
    Task 7 – Resetting the Component Health Status
    Exercise 2: Troubleshooting Practice
    Preparation
    Tasks
    Worksheet 1
    Worksheet 2
    Exercise Summary
    Exercise 1 Solutions
    Task 1 – Running POST Diagnostics
    Task 2 – Managing the Blacklist
    Task 5 – Setting and Viewing the Component Health Status
    Task 6 – Viewing the Interaction of CHS and POST
    Task 7 – Resetting the Component Health Status
Sun Fire HES EIS Checklist
    Objectives
    EIS Installation Checklist for Sun Fire 12K and 15K Server Systems
Examining Sun Fire HES Interconnect Architecture
    Objectives
    Introducing System Interconnect Architecture
    Snoopy Coherency
    Sun Fireplane Interconnect Address Transactions
    SSM Coherency
    SSM Operations in Sun Fire HES
    Understanding Interconnect Transaction Sequences
    Read-to-Share From the Same Bus
    Read-to-Share From a Different Bus
    Read-to-Share of an Owned Cache Line
    Read-to-Own From a Different Bus
    Writeback to a Different Bus
    Reviewing Interconnect Timing
    Read From Memory on the Same Bus
    Read From Memory on a Different Bus
Examining Sun Fire HES Group Privileges
    Objectives
    Additional Resources
    Reviewing Sun Fire HES Administrator Privileges
Multipathing I/O Management
    Objectives
    Relevance
    Additional Resources
    How Sun StorEdge™ Traffic Manager Software Works
    Sun StorEdge Traffic Manager Software Device Tree
    Sun StorEdge Traffic Manager Software Device Paths
    Sun StorEdge Traffic Manager Software Path Properties
    Path States
    SCSI Device Attributes
    Configuring Sun StorEdge Traffic Manager Software
    Domain Configuration Verification
    Sun StorEdge Traffic Manager Software Installation
    Post-Installation Setup and Verification
    Future Upgrades to the Operating System
    Automatic Path Discovery
    Sun StorEdge Traffic Manager Software Management
    The luxadm Command
    The format Command
    Automatic Failover
    IPMP
    IPMP Features
    IPMP Components
    Sun Trunking Software, IPMP, and HAnet Comparison
    IPMP Hardware and Software Requirements
    Hardware Requirements
    Software Requirements
    IPMP Group Requirements
    How IPMP Works
    Network Path Failure Detection
    Network Path Failover
    Network Path Failback
    Multipathing Configuration File
    Settings
    Starting the in.mpathd Daemon
    New ifconfig Status Flags and Subcommands
    New ifconfig Status Flags
    New ifconfig Subcommands
    Configuring IPMP
    Creating an IP Multipathing Group
    Adding and Deleting Physical Network Interfaces
    Configuring Test Interfaces
    Configuring Standby Interfaces
    Configuring Logical Interfaces
    IPMP Implementation
    Configuring Single Active Path With Standby
    Multiple Active Paths Without Standby
    Verifying IPMP Load Spreading
    Automating IPMP Configuration in IPv4
    Configuring IPMP in IPv6
    Manually Configuring IPMP in IPv6
    Automating IPMP Configuration in IPv6
    Optional Exercise 1: Installing Sun StorEdge Traffic Manager Software
    Preparation
    Task 1 – Configuring Sun StorEdge Traffic
    Optional Exercise 2: Configuring and Managing IPMP
    Preparation
    Task 1 – Exploring IPMP
    Task 2 – Configuring IPMP
    Exercise Summary
    Optional Exercise 1 Solutions
    Task 1 – Configuring Sun StorEdge Traffic Manager Software
    Task 2 – Confirming Sun StorEdge Traffic Manager Software
    Optional Exercise 2 Solutions
    Task 1 – Exploring IPMP
    Task 2 – Configuring IPMP
Reconfiguration Coordination Manager (RCM)
    Objectives
    Additional Resources
    Resource Consumer Plug-ins
    RCM Scripts
    RCM Script Commands
    RCM Script Example
    Installing a RCM Script
    Removing a RCM Script
    Testing a RCM Script
Sun Enterprise™ 10000 Server-to-Sun Fire HES Dictionary
    Daemons
    Commands
    Configuration Files
    Miscellaneous Names
About This Course
Course Goals
Upon completion of this course, you should be able to:
● Remove and install most Sun Fire™ high-end server (HES) field-replaceable units (FRUs)
● Configure the Sun Fire HES platform and domains
● Perform platform administration activities on the Sun Fire HES
● Perform dynamic reconfiguration operations on the Sun Fire HES domains
● Use available Solaris™ Operating System (Solaris OS) and platform-specific tools to troubleshoot the Sun Fire HES
● Identify and administer Capacity-on-Demand (COD) systems
Course Map
The following course map enables you to see what you have accomplished and where you are going in reference to the course goals.
[Figure: Course map. The modules are grouped under three headings: Sun Fire Server Review and Installation, Sun Fire HES Configuration, and Sun Fire HES Administration and Troubleshooting.]
Topics Not Covered
This course does not cover the following topics. Many of these topics are covered in other courses offered by Sun™ Services:
● Solaris OS administration – Covered in SA-118: Fundamentals of Solaris™ 8 Operating Environment for System Administrators
● Storage administration – Covered in ES-255: Sun™ Hardware RAID and T3 Storage System Administration
● Sun Fire 3800, Sun Fire 48x0, Sun Fire 4810, and Sun Fire 6800 server administration – Covered in ES-420: Sun Fire™ Workgroup/Enterprise Server Administration
This course does not cover advanced architecture theory.
Refer to the education.central/ITT Web site for specific information and registration.
How to Use the Course Materials
To enable you to succeed in this course, these course materials use a learning module that is composed of the following components:
● Goals – You should be able to accomplish the goals after finishing this course and meeting all of its objectives.
● Objectives – You should be able to accomplish the objectives after completing a portion of instructional content. Objectives support goals and can support other higher-level objectives.
● Lecture – The instructor will present information specific to the objective of the module. This information will help you learn the knowledge and skills necessary to succeed with the activities.
● Activities – The activities take on various forms, such as an exercise, self-check, discussion, and demonstration. Activities are used to facilitate mastery of an objective.
● Visual aids – The instructor might use several visual aids to convey a concept, such as a process, in a visual form. Visual aids commonly contain graphics, animation, and video.
Conventions
The following conventions are used in this course to represent various training elements and alternative learning resources.
Icons
Additional resources – Indicates other references that provide additional information on the topics described in the module.
URL Resources – Indicates additional information on the subjects being described can be found at the indicated Universal Resource Locators (URLs).
Discussion – Indicates a small-group or class discussion on the current topic is recommended at this time.
Power user – Indicates additional supportive topics, ideas, or other optional information.
Note – Indicates additional information that can help students but is not crucial to their understanding of the concept being described. Students should be able to understand the concept or complete the task without this information. Examples of notational information include keyword shortcuts and minor system adjustments.
Caution – Indicates that there is a risk of personal injury from a nonelectrical hazard, or risk of irreversible damage to data, software, or the operating system. A caution indicates that the possibility of a hazard
Caution – Indicates that either personal injury or irreversible damage of data, software, or the operating system will occur if the user performs this action. A warning does not indicate potential events; if the action is performed, catastrophic events will occur.
Typographical Conventions
Courier is used for the names of commands, files, directories, programming code, and on-screen computer output; for example:
Use ls -al to list all files.
system% You have mail.
Courier is also used to indicate programming constructs, such as class names, methods, and keywords; for example:
The getServletInfo method is used to get author information.
The java.awt.Dialog class contains Dialog constructor.
example:
To list the files in this directory, type:
# ls
referenced in a textual description; for example:
To delete a file, use the rm filename command.
be entered by the student as part of an activity; for example:
rights for filename to world, group, and users
Palatino italic is used for book titles, new words or terms, or words that you want to emphasize; for example:
Read Chapter 6 in the User’s Guide.
These are called class options.
Sun Fire™ High-End Server Product and Architecture Review and Assessment
For additional information concerning SMS 1.4.1 software, refer to the following Universal Resource Locator (URL):
http://webhome.eng.sun.com/starcatdocs/SMS/SMS141/mansched.htm
Review and refresh your knowledge of the Sun Fire High-End Server Product Introduction and Architecture Tech Talk.
Additional Resources
Additional resources – The following references provide additional details on the topics described in this module:
● Sun Microsystems, Inc. Sun Fire™ 15K/12K Software Overview Guide, part number 817-3075-10.
● Sun Microsystems, Inc. System Management Services 1.4 Reference Manual, part number 817-3057-10.
● Sun Microsystems, Inc. System Management Services 1.4 Administrator Guide, part number 817-3056-10.
● Sun Microsystems, Inc. System Management Services 1.4 Installation Guide and Release Notes, part number 817-3055-10.
● Sun Microsystems, Inc. Sun Fire™ 15K/12K Service Manual, part number 806-3512.
Reviewing Sun Fire HES
This course is designed to provide you with the information to:
● Remove and replace FRUs on the high-end servers
● Install and configure the System Management Services (SMS) software
● Manage the platform with the SMS software
● Create and manage domains on the HES
● Perform dynamic reconfiguration
● Perform basic troubleshooting using both manual and automatic diagnosis techniques
Figure 1-1 Sun Fire HES
This module reviews the prerequisite Tech Talk introducing the product and the product architecture. The first part is a self-assessment review quiz.
Self-Assessment Review
Answer the following questions:
1. What components comprise a board set?
_
2. True or false: All domains in the Sun Fire HES must be running the same version of the Solaris OS.
3. What can two running domains in the Sun Fire HES share?
a. Memory
b. Central processing unit (CPU) cycles
c. Input/Output (I/O) cards
d. None of the above
4. True or false: It is possible to set up a domain with no I/O boards.
5. What are the types of boards that can occupy Slot 0 of a board set (circle all that apply)?
a. MaxCPU board
b. UltraSPARC® III system board
c. UltraSPARC IV system board
d. MaxIO board
6. Which of the following is true (circle all that apply)?
a. There are eight board sets on the front and ten on the back (because the front has room for two system controller (SC) board sets).
b. There are nine board sets on each side.
c. The SCs are facing each other (on the right-front and left-back).
d. The SC on each side runs the board sets on that side.
e. You can run the whole platform with only a single SC. The other one is only for redundancy.
7. What is the term used when Slot 0 and Slot 1 boards are configured within a common expander board and are in different domains?
8. The Sun Fire HES data interconnect has four levels of application-specific integrated circuits (ASICs). Match the data interconnect levels to their definitions in Table 1-1.
9. The three types of centerplane buses connecting the board sets are:
a. Address, data, reset
b. Address, data, response
c. Analog, digital, central
d. Address, greyhound, trailways
10. True or false: Level 1 connectivity is always on the same Slot 0 board or Slot 1 board.
11. True or false: Level 3 connectivity is all within the same board set.
Table 1-1 Data Interconnect Levels and Their Definitions
2 The three-port system data interface connects two boards to the system data crossbar.
12 Which of the following is true (circle all that apply)?
a If there is more than one CPU with access to the same random access memory (RAM), no CPU caches any of the RAM to avoid coherency problems
b CPUs can read any copy of the data that they like. “Cache coherency problem” is just a marketing buzzword
c It is possible that data in a CPU’s cache might be more up to date than the copy in RAM
d It would be acceptable to let a CPU read the RAM copy even if another CPU’s cached copy were more up to date
e It is not acceptable to let a CPU read the RAM copy if another CPU’s cached copy is more up to date
13 The process by which address requests are broadcast and “intercepted” by parties with more up-to-date copies is called:
a Snoopy coherency
b Broadband coherency
c Woodstock coherency
d Interception coherency
14 True or false: Precursors to the Sun Fire HES (including the midframe servers) use snoopy coherency across the entire OS
15 The reason the high-end servers do not use snoopy coherency across the entire OS is:
a Each individual address transaction is far too slow with snoopy coherency
b Snoopy coherency does not scale well as the number of address requests and the number of possible locations go up
c The high-end servers do always use snoopy coherency across the entire Solaris OS
16 If a requesting board set’s expander determines that an address request needs to go to another board set, it goes:
a To the next highest board set
b To all other board sets
c There is no way an address request would need to go to another board set
17 True or false: It is guaranteed that the home board set can directly return the data to a requesting board set.
18 The coherency directory cache (CDC):
a Contains the identity of the home board set for each memory address
b Lets a home board set know which other board sets might have the cached copy of the memory
c Contains the cached data corresponding to addresses on the board
19 Which components comprise a system controller board set (circle all that apply)?
a System board
b I/O board
c System controller peripheral board
d Control board
e System controller CPU board
20 The CP1500 and CP2140 are:
a The two system controller CPU boards supported
b Compact peripheral component interface (PCI)-sized boards with memory and I/O but no CPU
c The types of Ethernet chips in the system controller CPU board
24 The centerplane support board (circle all that apply):
a Is simply a piece of metal physically propping up the centerplane
b Connects the control board to the centerplane
c Provides power conversion for the centerplane
d Connects the control board to its peripheral board
25 True or false: Eighteen of the Ethernet chips inside the control board connect it on physically separate networks to each possible domain
Product Review
The Sun Fire HES contains:
● A cabinet with up to nine board sets on each side
● A system controller board set on each side
Sun Fire HES Products
The Sun Fire 15K server was the original product, containing up to 18 board sets and using the UltraSPARC III+ family of processors.
The Sun Fire 12K server is the same as the Sun Fire 15K server, but with only nine board sets. It was introduced as a separate product to provide a marketing price point. Nothing prevents you from buying a Sun Fire 12K server and adding more board sets. You could call it whatever you want at that point.
The Sun Fire E25K and Sun Fire E20K servers are the next-generation high-end servers, which include the UltraSPARC IV family of processors among other changes.
The architecture and operation of the entire line of servers is the same. Mixing and matching UltraSPARC III and UltraSPARC IV system boards in the same platform, and even in the same domain, will be supported.
At the time of writing of this revision of the course, the Sun Fire E25K and Sun Fire E20K servers have yet to be announced. There are no pointers yet to reference guides or manuals that cover these products. The references at the beginning of each module in this course still refer in some cases only to the Sun Fire 15K/12K products, and certain numbers (such as power statistics in Module 2) are not available for the Sun Fire E25K and Sun Fire E20K servers at the time of writing.
Domain Configurable Units (DCUs)
A domain configurable unit (DCU) is a unit of hardware that can be assigned to a single domain; DCUs are the hardware components from which domains are constructed.
The Sun Fire HES DCUs are:
● Slot 0 boards – CPU/memory boards
● Slot 1 boards – Sun Fire I/O assemblies and MaxCPU boards
Sun Fire HES hardware requires at least one of each type of board in each configured domain: at least one board containing CPUs and memory, and at least one of the I/O board types. Centerplane support boards, expander boards, and the system controller are not DCUs.
When Slot 0 and Slot 1 boards are configured within a common expander board and are in different domains, the configuration is called a split expander. The expander board keeps transactions separate for each system board.
Because split-expander hardware is shared between two domains, its failure brings down both domains.
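The DCU rules above can be sketched in a few lines of Python. This is only an illustrative model, not SMS software: the Board class, the board-kind labels ("cpu_mem", "io", "maxcpu"), and both function names are hypothetical, and the sketch assumes that the minimum requirement is one Slot 0 CPU/memory board plus one I/O board per domain.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Set


@dataclass(frozen=True)
class Board:
    """One DCU. All field names and kind labels are hypothetical."""
    expander: int          # expander (board set) number
    slot: int              # 0 or 1
    kind: str              # "cpu_mem", "io", or "maxcpu"
    domain: Optional[str]  # None when the board is not assigned


def domain_meets_minimum(boards: Iterable[Board], domain: str) -> bool:
    """True if the domain has a CPU/memory board and an I/O board."""
    kinds = {b.kind for b in boards if b.domain == domain}
    return "cpu_mem" in kinds and "io" in kinds


def split_expanders(boards: Iterable[Board]) -> Dict[int, Set[str]]:
    """Map each split expander to the domains that share it.

    An expander is split when its Slot 0 and Slot 1 boards are both
    assigned and belong to different domains. A failure of that
    expander affects every domain in the returned set.
    """
    by_expander: Dict[int, Dict[int, str]] = {}
    for b in boards:
        if b.domain is not None:
            by_expander.setdefault(b.expander, {})[b.slot] = b.domain
    return {
        exp: set(slots.values())
        for exp, slots in by_expander.items()
        if len(slots) == 2 and slots[0] != slots[1]
    }
```

For example, if expander 0 holds a Slot 0 board in domain A and a Slot 1 board in domain B, split_expanders reports expander 0 as shared by A and B, which is the set of domains that would go down with it.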
Reviewing the System Controllers
The Sun Fire HES architecture supports two integrated service processors. These service processors are implemented as system controller boards.
The SC is the heart of the Sun Fire HES because it provides all critical resources for the entire platform. The system controller provides the following capabilities:
● Provides an interface to the customer’s administrative network or other user network
● Provides a programmable system and processor clock
● Sets up the server and coordinates the boot process
● Monitors environmental sensors
● Indicates the status and control of power supplies
● Analyzes errors and takes corrective action
● Provides the server console functionality
● Provides a fully redundant system controller board
● Sets up the server domains
● Provides centralized time-of-day
● Provides centralized reset logic
Reviewing the Fireplane Operations
The Sun Fire HES are members of the Sun Fire family, which comprises the Sun Fire E25K/15K and Sun Fire E20K/12K servers, as well as the Sun Fire 3800, 4800, 4810, and 6800 servers. The Sun Fire family uses the same UltraSPARC III-based technology that is common to smaller systems as well, including the Sun Blade™ 1000 workstation and the Sun Fire 280R, V480, and V880 workgroup servers.
The members of the UltraSPARC III and UltraSPARC IV technology families use similar bus interconnects and nearly identical components at the ASIC level. They differ in the number of components, such as the number of CPUs, and in the complexity of their bus implementations.
The smallest UltraSPARC III-based systems (workstation and workgroup servers) have all of their bus-level components connected by point-to-point, single-level address and data buses. The Sun Fire mid-range servers use a two-level structure, where components connected at the board level communicate with other boards by using second-level broadcast address and data buses.
The Sun Fire high-end servers add a third-level interconnect that connects as many as 18 individual domains. Architecturally, the Sun Fire E25K and Sun Fire 15K servers appear as a collection of 18 small Sun Fire mid-range servers. The Sun Fire E25K and Sun Fire 15K servers can be configured with 18 individual domains or can be configured with all of the expanders appearing as one large domain to the Solaris OS.
The Sun Fire E20K and Sun Fire 12K servers are the same physical systems as the Sun Fire E25K and Sun Fire 15K servers. The difference is that the nine higher-numbered expander board slots (9 through 17) in the Sun Fire E20K and Sun Fire 12K servers are not populated. There are still two system controllers. Otherwise, all commands, configuration, operation, and installation are the same for the two servers.
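The slot numbering described above can be written down as a small lookup. This is a minimal sketch for study purposes; the function and constant names are hypothetical, and it describes the servers as shipped, before any board sets are added.

```python
# Hypothetical names; the slot ranges come from the paragraph above.
FULLY_POPULATED = {"Sun Fire 15K", "Sun Fire E25K"}
HALF_POPULATED = {"Sun Fire 12K", "Sun Fire E20K"}
TOTAL_EXPANDER_SLOTS = 18   # slots are numbered 0 through 17
SYSTEM_CONTROLLERS = 2      # the same for every model


def populated_expander_slots(model: str) -> range:
    """Return the expander slot numbers populated in the given model."""
    if model in FULLY_POPULATED:
        return range(0, TOTAL_EXPANDER_SLOTS)
    if model in HALF_POPULATED:
        # Slots 9 through 17 are left empty.
        return range(0, 9)
    raise ValueError(f"not a Sun Fire high-end server: {model!r}")


if __name__ == "__main__":
    for name in sorted(FULLY_POPULATED | HALF_POPULATED):
        slots = populated_expander_slots(name)
        print(f"{name}: expander slots {slots.start} through {slots.stop - 1}")
```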
The system boards and I/O boards only receive the messages to and from the boards in their own board set, unless external messages are passed in by the expander.
The Sun Fire HES components work together to provide a very high-bandwidth coherent memory system. Figure 1-2 shows a logical overview of the system interconnect. Note that the small numbers in the block diagram are peak bidirectional data bandwidths at each level of the interconnect.
Figure 1-2 Sun Fire 15K Server System Interconnect
Note – The Sun Fire 12K server has nine board sets.
[Figure 1-2 labels: CPU, Mem, PCI ctl, PCI card, Dual 3X3, 43 GB/s centerplane, System Control board set]