Terms from other standards
For the purpose of this Standard, the terms and definitions from ECSS-S-ST-00-01 apply.
Terms specific to the present standard
3.2.1 achieved availability probability that a system, subsystem or equipment, when used under stated conditions in an ideal support environment, operates satisfactorily at a given time
NOTE The downtime is associated only to the active preventive and corrective maintenance
3.2.2 active redundancy every entity is operating and the system can continue to operate without downtime or defects despite the loss of one or more entities
3.2.3 corrective maintenance maintenance performed to restore system hardware integrity following anomalies or equipment problems encountered during system operations
3.2.4 flight segment product or a set of products intended to be operated in space
3.2.5 ground segment all ground infrastructure elements that are used to support the preparation activities leading up to mission operations, the conduct of mission operations and all post-operational activities
3.2.6 hot redundancy redundancy entity is “ON”, but not necessarily in the right configuration to accomplish the function
3.2.7 instantaneous availability (intrinsic or inherent) probability that an item is in a state to perform a required function at a given instant, under given conditions, assuming that the necessary external resources are provided
NOTE Preventive maintenance is generally not taken into account for intrinsic availability
3.2.8 instantaneous availability (operational) probability that an item is in a state to perform a required function at a given instant, taking into account the maintenance strategy, the spare parts policy, and the associated logistic delays and constraints
3.2.9 lead time (supplier delay) mean time for supplier to provide spares (including shipping time)
3.2.10 logistic delay mean time for human and material maintenance means to be available (call-out time)
3.2.11 mean availability (intrinsic or inherent) percentage of a defined time period in which a system, subsystem or equipment operates satisfactorily under stated conditions, without any scheduled or preventive maintenance, in an ideal logistic support environment
3.2.12 mean availability (operational) percentage of a defined time period in which a system, subsystem or equipment operates satisfactorily when used under stated conditions in an actual support environment
NOTE The down time is relevant to the corrective maintenance, preventive maintenance, logistic and administrative delays
3.2.13 mean down time mean time between service interruption and service resumption
NOTE See Figure 3-1
[Figure 3-1 timeline, starting at time 0: correct operation, initial failure, waiting, start of work, repair, restart, correct operation, second failure]
Figure 3-1: Relations between the various values that characterize the reliability, maintainability and availability of equipment
3.2.14 mean time between failures mean time between two consecutive failures
3.2.15 mean time between outages mean time of operation of an entity between two consecutive non-operational phases caused by corrective or preventive maintenance activities
3.2.16 mean time to failure mean time of working of an entity before its first failure
NOTE Also known as “mean time to first failure”
3.2.17 mean time to outage mean time of working of an entity before its first outage
3.2.18 mean time to repair mean duration to repair equipment with human and material maintenance means being available
3.2.19 mean up time mean time of working of an entity after corrective maintenance (covering repair and replacement)
3.2.20 outage state of an item of being unable to perform its required function
NOTE 1 Causes of outages can be failures, upsets or planned and unplanned events
NOTE 2 The failures can be due to cataleptic intrinsic events or external events
3.2.21 passive redundancy redundancy not activated before necessary
NOTE Also known as “standby redundancy” or “cold redundancy”
3.2.22 preventive maintenance scheduled or on-condition maintenance actions performed on equipment to reduce its probability of failure or degradation
NOTE Preventive maintenance is performed to keep the system at designed reliability and safety levels before failure occurrence
3.2.23 steady-state availability (asymptotic availability) limit, if any, of the instantaneous availability as time approaches infinity
Abbreviated terms
For the purpose of this Standard, the abbreviated terms from ECSS-S-ST-00-01 and the following apply:
Abbreviation Meaning
FMECA failure modes, effects and criticality analysis
MTBF mean time between failures
MTBO mean time between outages
MTTF mean time to failure
MTTFF mean time to first failure
MTTO mean time to outage
MTTR mean time to repair
RAM reliability, availability and maintainability
TWT travelling wave tube
w.r.t. with respect to
The availability analysis is developed in order to
• verify the conformance of the selected system design with the applicable availability requirements, and
• provide inputs to estimate the life cycle cost of the system
The above design activity leads to the optimization of the system concept definition with respect to design baseline, operations and logistics provisions
The availability analysis identifies the unavailability contributors in order to quantify their impact in supporting the
• risk evaluation, reduction and control (see ECSS-M-ST-80)
The availability activity is fully integrated into the development programme to ensure the correct support to the other disciplines (e.g. engineering, operations and logistics).
Specifying availability and the use of metrics
General
Introduction
The criteria for mission success can be defined probabilistically in various ways. Therefore, the choice of the most suitable dependability requirement depends on the specific operational constraints and objectives of the mission.
Availability requirements
a Availability requirements shall respect the mandatory characteristics defined by the system engineering process
Each availability requirement must be traceable, identified, unique, and unambiguous, with a corresponding verification method. These requirements should be quantitative and user-oriented, focusing on the availability of mission services rather than on design aspects. Additionally, the process for defining these requirements must encompass the essential characteristics relevant to the project under development.
NOTE For example, what is the “threshold” between nominal behaviour and failure mode? What are the contributors to mission success under system visibility and responsibility?
NOTE For example, for which environment, interfaces, provisions,… shall the above objectives be met?
NOTE For example, for which period, at what date
4 Unavailability contributors to be taken into account in the analysis on the basis of the supplier’s visibility and responsibility for the logistic scenario or support
NOTE For example, detection, logistics, and administrative delays
f Availability requirements shall be specified according to one or several of the classes of availability specifications detailed in clause 5.2.
Different ways of specifying availability
Probability figure convention
For each type of availability requirement, the specified figures are to be defined as "mean" or "best estimate" probability figures, that is, point estimates. Typically, unit failure rates are calculated in this way, often at a 60 % confidence level.
Availability during mission lifetime for a specified service
Availability during the mission's lifetime is utilized for missions designed to provide a "steady-state" nominal service, allowing for a specific percentage of mission time to be defined as a measure of availability performance.
The availability during mission lifetime applies to maintainable systems, on-ground or in-orbit (e.g. Space Station), and to non-maintainable systems (e.g. satellites).
Potential contributors to outage periods include maintenance activities (both preventive and corrective), periodic maneuvers, delays in reconfiguring redundant payloads, recoveries from safe mode, upsets, and eclipses.
In some applications, the mission lifetime can be subdivided into several periods for which the availability requirement applies
NOTE For example, “The system shall be operational during 11 months per year during the mission lifetime”
5.2.2.2 Requirements
a If the operative scenario duration is longer than the system or equipment mean down time (more than 5 MDT), the instantaneous or mean requirement shall be formulated in terms of steady-state availability, assuring a simplifying (and generally conservative) approach
b The availability during mission lifetime shall be computed as the ratio of time during which service is fulfilled over the total mission lifetime
c For non-maintainable systems, the availability during mission lifetime requirements shall be established considering that the mission is still operational at end of life
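As an illustration only, the following Python sketch applies requirement b (ratio of service time to total mission lifetime) and the "more than 5 MDT" criterion of requirement a. The function names, the outage log and the 10-year mission duration are hypothetical and are not part of this Standard.

```python
def mission_availability(mission_hours, outage_hours):
    """Ratio of the time during which service is fulfilled over the mission lifetime."""
    if mission_hours <= 0:
        raise ValueError("mission_hours must be positive")
    downtime = sum(outage_hours)
    if downtime > mission_hours:
        raise ValueError("cumulated outage time exceeds the mission lifetime")
    return (mission_hours - downtime) / mission_hours


def steady_state_applicable(scenario_hours, mdt_hours, factor=5.0):
    """True if the operative scenario is longer than `factor` times the MDT."""
    return scenario_hours > factor * mdt_hours


# Hypothetical example: 10-year mission with four recorded outages (hours).
mission_hours = 10 * 8760.0
outages = [12.0, 3.5, 48.0, 0.5]

availability = mission_availability(mission_hours, outages)
mdt = sum(outages) / len(outages)  # mean down time estimated from the outage log

print(f"availability during mission lifetime: {availability:.6f}")
print(f"steady-state formulation applicable: {steady_state_applicable(mission_hours, mdt)}")
```

The outage list would in practice come from the contributors identified for the mission (maintenance, manoeuvres, safe-mode recoveries and so on).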
Single point failures are not regarded as contributors to unavailability. The mission's availability must consider the effects of radiation, including logic part upsets, single event transients (SET) for opto and linear components, and latch-up occurrences. Additionally, it is essential to assess the functional impacts of radiation-induced single events on equipment or subsystems to provide quantified data for availability analyses.
NOTE ECSS-Q-ST-60 branch standard gives methodology to evaluate behaviour of electronic parts within their functional conditions.
Availability at a specific time (or time interval) for a specified service
Availability at a designated time is crucial for systems that have critical operations scheduled within their mission timelines. Common examples include the availability of a launcher control bench at a specific moment and a scientific satellite engaged in a planned comet rendezvous mission.
5.2.3.2 Requirements
a The availability at a specific time requirement shall address the probability that this “quasi-instantaneous” operation is successfully handled
b For non-maintainable systems, the availability at a specific time requirement shall specify by a single requirement both the availability and reliability characteristic
c Availability at a specific time shall be computed considering the mission loss probability.
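The following Python sketch illustrates the difference between a maintainable and a non-maintainable item at a specific time t. It assumes a single item that is operating at t = 0, with constant failure rate and constant repair rate (exponential laws); these assumptions and all numerical values are illustrative and are not prescribed by this Standard. With no repair, the availability at time t reduces to the reliability, which is consistent with requirement b.

```python
import math


def reliability(t, failure_rate):
    """Probability of no failure up to time t (constant failure rate assumed)."""
    return math.exp(-failure_rate * t)


def instantaneous_availability(t, failure_rate, repair_rate):
    """Availability at time t of a single repairable item that is up at t = 0.

    Assumes exponential times to failure and to repair. With repair_rate = 0
    the expression reduces to the reliability of a non-maintainable item.
    """
    total = failure_rate + repair_rate
    steady = repair_rate / total
    return steady + (failure_rate / total) * math.exp(-total * t)


# Hypothetical figures: MTTF = 10 000 h, MTTR = 24 h, critical operation at t = 500 h.
lam = 1.0 / 10_000.0
mu = 1.0 / 24.0
t_op = 500.0

print(f"maintainable item, A(t):     {instantaneous_availability(t_op, lam, mu):.6f}")
print(f"non-maintainable item, R(t): {reliability(t_op, lam):.6f}")
```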
Percentage or number of successfully delivered products
For some applications, the user-oriented approach characterizes the system in a “black-box” manner and specifies availability according to the number of, for instance, delivered products, services, or mission data with respect to user demands or the nominal scenario.
5.2.4.2 Requirements
a If the availability is specified by a percentage or number of successfully delivered products, the availability requirement shall be expressed as follows:
1 A ratio of successfully delivered products over number of requested products
NOTE For instance, w.r.t. the applicable criteria for performance, delay, and coverage
2 Cumulated service hours during the mission
NOTE For example, expected number of TWTs × hours operation from a 12 out of 16 redundant channels configuration over 10 years
3 Acquired database volume or percentage
NOTE For example, geographical coverage for an
Outage probability distribution
a In the specific case that the availability is specified by an outage distribution and duration, and if a maximum duration is specified, a probability of exceeding this duration shall be associated
NOTE 1 This can apply at subsystem level when a short service interruption is masked or filtered by the upper level function
NOTE 2 For example, a temporary outage of a GPS receiver is typically tolerated by a navigation model. For this type of application, numerous short outages would be preferable to a few long ones
b If several classes of outage are identified, an availability specified through an outage probability distribution shall be allocated for each class (associated duration and probabilities).
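A minimal Python sketch of how a probability of exceeding a maximum outage duration can be evaluated is given below. Two variants are shown: one assumes exponentially distributed outage durations (an assumption of this example, not of this Standard), the other simply counts exceedances in a recorded outage log. Class names, durations and limits are hypothetical.

```python
import math


def exceedance_probability_exponential(max_duration, mean_duration):
    """P(outage duration > max_duration), assuming exponentially distributed durations."""
    if mean_duration <= 0:
        raise ValueError("mean_duration must be positive")
    return math.exp(-max_duration / mean_duration)


def empirical_exceedance(durations, max_duration):
    """Fraction of recorded outages that lasted longer than max_duration."""
    if not durations:
        raise ValueError("at least one recorded outage is needed")
    return sum(1 for d in durations if d > max_duration) / len(durations)


# Hypothetical outage classes: (mean duration in hours, maximum tolerated duration in hours).
outage_classes = {
    "short interruption": (0.05, 0.25),
    "reconfiguration": (2.0, 8.0),
    "safe-mode recovery": (24.0, 72.0),
}

for name, (mean_h, max_h) in outage_classes.items():
    p = exceedance_probability_exponential(max_h, mean_h)
    print(f"{name}: P(duration > {max_h} h) = {p:.4f}")
```

Keeping one entry per outage class mirrors requirement b, where each class carries its own duration and probability.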
Metrics commonly used
a The availability requirements shall be quantified using one or several of the following metrics:
8 MUT and MDT (or mean time to restore);
9 MTBF or MTBO, and MTTR;
11 amount of successfully delivered products.
Metrics mapping
General
a Following the definition of the system level availability requirements according to clause 5.2, the availability metrics and supporting metrics shall be selected according to Table 5-1.
Metrics mapping at system or subsystem level
The selection of metrics at the system level is influenced by mission characteristics, particularly the decision between instantaneous, mean, or steady-state availability, which should align with the mission's time schedule. Additionally, the distinction between inherent and operational availability relies on the accessibility of information from the logistic support analysis, which is essential for evaluating operational availability.
If logistic and administrative delays hinder the assessment of operational availability, achieved availability can serve as a metric for considering preventive maintenance. The decision to use mean up time (MUT) with mean down time (MDT), or an outage distribution, should depend on whether the mission is sensitive to the frequency or duration of specific events, or only to the average values of uptime and downtime.
Metrics mapping at equipment level
The decision to use Mean Time Between Failures (MTBF) alongside Mean Time To Repair (MTTR) or outage distribution should be guided by the frequency or duration of mission-specific events, rather than solely relying on average values for uptime and downtime.
NOTE For availability considerations, the equipment level refers to the lowest level of replaceable unit (LRU level)
Table 5-1 Availability and supporting metrics applicable at system and subsystem level
Metrics covered: inherent instantaneous availability; operational instantaneous availability; inherent mean availability; operational mean availability; inherent steady-state availability; operational steady-state availability; outage duration and occurrence; MUT and MDT; MTBF or MTBO, and MTTR; MTTF or MTTO; amount of successfully delivered products.
System/ Subsystem level Availability during mission lifetime
Availability at a specific time interval
Percentage of successfully delivered products
Equipment level Availability during mission lifetime
Availability at a specific time interval
Percentage of successfully delivered products: not applicable at equipment level
Overview of the assessment process
The availability assessment process is illustrated in Figure 6-1, with detailed steps discussed in clauses 6.2 to 6.4 and Annex A, which covers the methods for assessing availability.
Availability allocation
a The availability allocation shall be based on the following:
1 subsystem failure’s effect on the mission derived from the system analysis,
2 previous experience from similar programmes,
4 previously designed and developed subsystem
The order of priority of the criteria depends on the application. It is essential to address the availability requirement allocation process early in the design phases, as outlined in clause 7, to effectively assess the criticality of each system section and determine the most suitable baseline.
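This Standard does not prescribe an allocation formula. As one possible illustration, the Python sketch below allocates a system-level steady-state availability target to subsystems that are assumed to be independent and all necessary for the service (series logic). The weights are hypothetical and stand for the relative share of unavailability granted to each subsystem on the basis of the criteria listed above.

```python
import math


def allocate_availability(system_target, weights):
    """Allocate a system availability target to subsystems in series.

    Each subsystem receives A_i = A_sys ** (w_i / sum(w)), so that the product
    of the allocated values equals the system target. A larger weight grants
    a larger share of the unavailability budget.
    """
    if not 0.0 < system_target <= 1.0:
        raise ValueError("system_target must be in (0, 1]")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return {name: system_target ** (w / total) for name, w in weights.items()}


# Hypothetical weights reflecting complexity and previous experience.
weights = {"payload": 3.0, "platform": 2.0, "ground segment": 5.0}
allocation = allocate_availability(0.99, weights)

for name, value in allocation.items():
    print(f"{name}: {value:.5f}")
print(f"product check: {math.prod(allocation.values()):.5f}")
```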
Iterative availability assessment
a A preliminary availability evaluation based on previous experience or judgement expertise shall be performed in order to assess a risk of not meeting the requirements
NOTE Such a preliminary availability evaluation is performed during the allocation process if a realistic allocation cannot be achieved
b The assessment process shall be conducted as follows:
2 Identification of the most appropriate method for availability assessment (see Annex A)
1 Collection and verification of data coming from the lower level analyses
2 System availability assessment (including compliance verification) and identification of the project criticalities
3 Architecture, operations or logistics modifications or more accurate analysis to reach the availability objective
NOTE 1 This can imply the subsystem or equipment level contribution
NOTE 2 Example of a more accurate analysis is a refinement of the working hypothesis on the stand-by failure rates, more realistic modelling of the functional redundancies
4 Decision making process to eliminate (or reduce the impact of) the criticalities
The assessment process should be reiterated in each project phase to align with the evolution of the system design. It is essential to select an appropriate assessment method, such as the analytical method, Markov process or Monte-Carlo simulation (see Annex A), and to provide a clear explanation and justification for the chosen approach. Additionally, it is important to include the sources of the numerical data that support the assessment.
When estimating the availability of each equipment item, it is essential to consider both random and deterministic events, utilizing internal databases from supplier data and referencing standard handbooks like MIL HDBK 217 or UTEC 80810 for calculations.
Dynamic behaviour models can be represented as depicted in Figure 6-2, with the potential for more intricate flow charts depending on the system architecture and the renewal process characteristics. Availability analyses must be consistently updated throughout the design, integration, and operational engineering phases to accurately represent the current system baseline. Additionally, for flight equipment, it is essential to consider radiation effects in the availability analysis.
Radiation single events can cause functional effects on flight equipment, necessitating an evaluation to provide quantified inputs for the availability analysis. The effects to be considered include upsets in logic parts, single event transients (SET) for opto and linear parts, and latch-up.
NOTE The ECSS-Q-ST-60 branch standard describes a methodology to evaluate behaviour of electronic parts within their functional conditions
[Figure 6-2, flow chart step: "Remove a spare from the stock and order another one if relevant"]
Figure 6-2: Example of a dynamic behaviour model
Availability report content
a The availability analysis performed in each project phase shall contribute to the preparation of the following:
The availability assessment report must include a dedicated section detailing the specifications and requirements established during the allocation process. This section should also encompass additional information such as logistics constraints, operational provisions, and reference mission scenarios essential for the effective implementation of the requirements. Furthermore, the report must clearly present the availability evaluations and considerations, supported by the relevant data and assumptions. It should provide enough information to allow a proper understanding of the evaluations performed and the integration of the results into higher-level analyses. The availability assessment report covers the following aspects:
1 A self-standing description of the system or equipment baseline, logistics support and operations
2 The content, derived from the relevant reports, useful for acquiring all the elements taken into account in the availability model
3 The availability requirements description and interpretation (to enable the verification of the correct requirement implementation)
4 The availability model description (including details of the selected mathematical approach and relevant assumptions or hypotheses)
5 Inputs (e.g. reliability data, logistic times, and working hypotheses)
7 The conclusions and recommendations
g The availability assessment reports shall be delivered at project review as per business agreement’s SOW
Overview
Availability is regularly integrated into the design process. The availability characteristics can be traded with other system attributes such as cost and performance during the optimization of the design.
Availability teams are regularly integrated into the development teams during the design process. Availability analysis should be performed in close interaction with the following functions:
Availability activities and programme phases
Feasibility phase (Phase A)
a During Phase A, the availability analysis shall cover the following aspects:
1 Identification of the methodology for the most realistic evaluation of the availability figures
NOTE The methodology can be improved or even changed in the following phases
2 Support to the preliminary design definition in terms of trade-off studies, rough availability estimations, identification of critical areas
3 Evaluation of the availability performance of the selected reference system or equipment baseline
4 Allocation (where necessary) of the applicable requirements at lower level
5 Planning of the availability tasks for the design definition phase (Phase B or Phase C).
Preliminary definition phase (Phase B)
a During Phase B, the availability analysis shall cover the following aspects:
1 Finalization of the availability methodology
2 A review of the lower level analyses
3 Support to local trade-off studies and design definition
4 Contribution to maintenance strategy definition
5 Definition of input data for the availability model
NOTE For example, manufacturer data, lower level outputs, data sources, and logistics information
6 Evaluation of the availability performance of the selected reference system or equipment baseline
7 Revision of the allocation process (where necessary)
8 Support to preparation of availability specifications
9 Identification of the critical areas and support to the decision making process
10 Planning of the availability tasks for the detailed design definition phase and development and preparation of the relevant section in the PA plan.
Detailed definition and production phases (Phase C/D)
a During Phase C/D, the availability analysis shall cover the following aspects:
1 A review of the lower level analyses
2 Consolidation of the input data (input data consistency check)
3 Support to the design, logistics and operations activities
5 Evaluation of the availability performance of the system or equipment baseline
6 Identification of the critical parameters or points to be monitored or controlled
7 Support to quality assurance activity during manufacture, integration and test, nonconformance review board (NRB) and failure review board
Utilization phase (Phase E)
a During Phase E, the availability analysis shall cover the following aspects:
1 Support to ground and flight operations
2 Evaluation of the design and operational changes and their impacts on availability
3 Collection of availability data during operation to assess the operational availability and issue of the operational availability report (when required)
Annex A (informative) Suitable methods for availability assessment
Overview
This annex provides a short description of the main methods available to assess availability performance
The use of probability theory in addressing availability issues has resulted in various methodologies that handle practical situations with the precision demanded by clients. The choice of a specific mathematical approach is influenced by multiple factors:
• a probability density function associated with the parameters involved;
• complexity of the system design and associated operations and logistics support;
• time constraints for project development;
• preventive maintenance planned during the system’s operating life;
The main methods are listed in this annex; for further details, refer to the technical literature on reliability and availability engineering.
Analytical method
The calculations use the following mathematical modelling:
\[ \text{steady-state availability} = \frac{\mathrm{MUT}}{\mathrm{MUT} + \mathrm{MDT}} \]
This generic formula can be adapted to the application (e.g. for operational or intrinsic availability, at system as well as equipment level).
For components or functions that are physically independent, the resulting availability is evaluated using the basic formulae shown in Figure A-1, depending on the redundancy scheme
[Figure A-1, excerpt: Function 1, availability A1 (full time), operational use X %]
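The analytical method can be sketched in Python as follows. The steady-state ratio MUT / (MUT + MDT) is taken from this annex; the series, parallel and k-out-of-n combinations are the usual textbook formulae for independent items and are given here as an assumption, since they may differ in presentation from Figure A-1. All numerical values are hypothetical.

```python
import math


def steady_state_availability(mut, mdt):
    """Steady-state availability = MUT / (MUT + MDT)."""
    return mut / (mut + mdt)


def series(availabilities):
    """All independent items are needed: product of the availabilities."""
    return math.prod(availabilities)


def parallel(availabilities):
    """At least one independent item is needed (full redundancy)."""
    return 1.0 - math.prod(1.0 - a for a in availabilities)


def k_out_of_n(k, n, a):
    """At least k of n identical, independent items (each of availability a) are needed."""
    return sum(math.comb(n, i) * a**i * (1.0 - a) ** (n - i) for i in range(k, n + 1))


# Hypothetical equipment: MUT = 990 h, MDT = 10 h.
a_unit = steady_state_availability(990.0, 10.0)

print(f"single unit:            {a_unit:.6f}")
print(f"two units in series:    {series([a_unit, a_unit]):.6f}")
print(f"two units in parallel:  {parallel([a_unit, a_unit]):.6f}")
print(f"12 out of 16 channels:  {k_out_of_n(12, 16, a_unit):.6f}")
```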
Markov process
This approach, shown in Figure A-2, is based on the exponential law for the time to failure and the time to repair. Markov process theory is important because:
• it provides a good representation of system behaviour for communication with the engineering teams, and
• it allows the estimation of good approximations for the asymptotic (or steady-state) availability of some space applications, and has, for example, been efficiently applied to space ground segments
The complexity of the system can lead to a large number of states, which affects both the calculation time and the accuracy. In addition, logistic times, which typically follow normal or log-normal distributions, are difficult to represent accurately with exponential laws. The Markov graph in Figure A-2 illustrates a simple parallel model, where states 1 and 2 denote a functioning system, with or without redundancy available.
Figure A-2: Example of Markov graph
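As a sketch of the Markov approach for a simple parallel model, the Python code below computes the steady-state probabilities of a three-state chain: state 1 (both units up), state 2 (one unit up, no redundancy left) and state 3 (system failed). The third state, the active redundancy (both units exposed to failure), the single repair team and the numerical rates are assumptions of this example and are not taken from Figure A-2.

```python
def parallel_pair_steady_state(failure_rate, repair_rate):
    """Steady-state probabilities of a two-unit parallel system.

    Transitions assumed: 1 -> 2 at rate 2*lambda, 2 -> 3 at rate lambda,
    2 -> 1 and 3 -> 2 at rate mu (one repair at a time).
    The balance equations of this birth-death chain give
    p2 = 2*(lambda/mu)*p1 and p3 = (lambda/mu)*p2.
    """
    if failure_rate < 0 or repair_rate <= 0:
        raise ValueError("failure_rate must be >= 0 and repair_rate must be > 0")
    r = failure_rate / repair_rate
    weights = [1.0, 2.0 * r, 2.0 * r * r]  # unnormalised p1, p2, p3
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical rates: lambda = 1e-3 per hour, mu = 0.1 per hour.
p1, p2, p3 = parallel_pair_steady_state(1e-3, 0.1)
availability = p1 + p2  # the system delivers its function in states 1 and 2

print(f"P(state 1) = {p1:.6f}, P(state 2) = {p2:.6f}, P(state 3) = {p3:.6f}")
print(f"steady-state availability = {availability:.6f}")
```

For larger graphs the same balance equations are normally solved numerically, which is where the growth in the number of states becomes a practical limit.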
Monte-Carlo simulation
This numerical technique allows the evaluation of availability taking into account, in a realistic way, all aspects associated with the design, logistics and operations
Petri nets are widely used to model system operating scenarios, as illustrated in Figure A-3. One of the key benefits of Monte-Carlo simulation is its capability to handle complex system scenarios that include both deterministic and probabilistic delays, as well as one-shot reliability. Nonetheless, the method has the following drawbacks:
• heavy effort for system modelling (not recommended for short-term programmes), and
• long calculation times (not acceptable during the trade-off or feasibility study)
Figure A-3: Example of Petri net modelling
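A minimal Monte-Carlo sketch in Python is given below for a single repairable item. It combines a probabilistic time to failure, a deterministic logistic delay and a probabilistic repair time; the exponential laws, the parameter values and the function name are assumptions of this example. A real model would follow the system scenario (for instance a Petri net) rather than this single up/down cycle.

```python
import random


def simulate_availability(mttf, mean_repair, logistic_delay, horizon, runs, seed=0):
    """Estimate the mean availability over `horizon` hours by simulation.

    Each run alternates an exponentially distributed up time (mean `mttf`)
    with a down time made of a fixed logistic delay plus an exponentially
    distributed repair time (mean `mean_repair`). The item starts in the up state.
    """
    rng = random.Random(seed)  # seeded generator so that runs are repeatable
    total_up = 0.0
    for _ in range(runs):
        t = 0.0
        up = 0.0
        while t < horizon:
            time_to_failure = rng.expovariate(1.0 / mttf)
            up += min(time_to_failure, horizon - t)  # count up time within the horizon only
            t += time_to_failure
            if t >= horizon:
                break
            t += logistic_delay + rng.expovariate(1.0 / mean_repair)
        total_up += up
    return total_up / (runs * horizon)


# Hypothetical figures: MTTF = 1 000 h, 30 h logistic delay, 20 h mean repair time.
estimate = simulate_availability(1000.0, 20.0, 30.0, horizon=100_000.0, runs=200, seed=1)
reference = 1000.0 / (1000.0 + 30.0 + 20.0)  # MUT / (MUT + MDT) for comparison

print(f"simulated availability: {estimate:.4f}")
print(f"analytical reference:   {reference:.4f}")
```

Comparing the estimate with the analytical ratio is a simple way to check a simulation model before adding the features that only simulation can handle.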
Element 2; λ: failure rate; μ: repair rate
Annex B (informative) Typical work package description for availability activities
The RAM group at the system or subsystem level is tasked with several critical activities as outlined in the business agreement's SOW. These include reviewing and verifying availability requirements through preliminary evaluations to prevent the implementation of unrealistic expectations, identifying suitable availability models based on mission scenarios and project constraints, and preparing detailed specifications to translate system availability needs. Additionally, the group will define the system availability model, review lower-level availability reports, and consolidate inputs from various design areas such as engineering and logistics. They will also evaluate system availability, conduct trade-off analyses, and support project management in finalizing operational costs. Progress reporting on availability activities, assistance during design reviews, and audits of subcontractors' knowledge in the availability discipline are also essential tasks. Furthermore, the RAM group will support the logistics and operations department with probabilistic assessments and provide assistance during the system exploitation phase.
EN reference Reference in text Title
EN 16601-00 ECSS-S-ST-00 ECSS system – Description, implementation and general requirements
EN 16602-30 ECSS-Q-ST-30 Space product assurance - Dependability
EN 16601-10 ECSS-M-ST-10 Space project management – Project planning and implementation
EN 16601-80 ECSS-M-ST-80 Space project management - Risk management
MIL HDBK 217 Military handbook - Reliability prediction of electronic equipment
UTEC 80810 Modèle universel pour le calcul de la fiabilité prévisionnelle des composants, cartes et équipements électroniques, CNET