Reference number ISO 4259:2006(E). © ISO 2006. INTERNATIONAL STANDARD ISO 4259, Third edition, 2006-08-01. Petroleum products — Determination and application of precision data i[.]
General
The stages in planning an inter-laboratory test programme are as follows:
⎯ preparing a draft method of test;
⎯ planning a pilot programme with at least two laboratories;
⎯ planning the inter-laboratory programme;
⎯ executing the inter-laboratory programme.
The four stages are described in turn in 4.2 to 4.5.
Preparing a draft method of test
This shall contain all the necessary details for carrying out the test and reporting the results. Any condition that could alter the results shall be specified.
The clause on precision is included in the draft test method at this stage only as a heading. It is recommended that the lower limit of the scope of the test method is not less than the lowest value tested in the inter-laboratory programme and is at least 2R greater than the lowest achievable result, where R is the reproducibility estimate. Similarly, the upper limit of the scope should not exceed the highest value tested and should be at least 2R less than the highest achievable result.
Planning a pilot programme with at least two laboratories
A pilot programme is necessary to verify the details of the operation of the test, to find out how well operators can follow the instructions of the test method, to check the precautions needed in handling the samples, and to estimate roughly the precision of the test.
At least two samples are required, covering the range of results to which the test method is intended to apply, and at least 12 laboratory/sample combinations shall be included. Each sample is tested twice under repeatability conditions. Any errors or omissions found in the draft test method shall be corrected. The results shall be analysed for bias and precision; if either is considered too large, the test method shall be modified.
Planning the inter-laboratory programme
There shall be at least five participating laboratories, but it is preferable that there are more in order to reduce the number of samples required.
The number of samples shall be sufficient to cover the range of the property at approximately equidistant intervals and to give reliable precision estimates. If precision was found to vary with the level of results in the pilot programme, at least five samples shall be used in the inter-laboratory programme. In addition, at least 30 degrees of freedom shall be obtained for both repeatability and reproducibility. For repeatability, this means obtaining a total of at least 30 pairs of results.
For reproducibility, Table A.1 gives the minimum number of samples required in terms of the number of laboratories, L, and the variance component ratios P and Q estimated from the pilot programme. P is the ratio of the interaction component to the repeats component, and Q is the ratio of the laboratories component to the repeats component. If Q is much larger than P, 30 degrees of freedom cannot be achieved. The blank entries in Table A.1 correspond to this situation or the approach to it, that is, cases where more than 20 samples would be required. In such cases, significant bias between laboratories is likely. Annex B gives the derivation of the equation used.
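The repeatability part of this requirement can be checked with simple arithmetic. The sketch below assumes that every laboratory tests every sample in duplicate, so that each laboratory/sample combination contributes one pair of results and hence one degree of freedom; the function name is illustrative only. It does not replace Table A.1, which is still needed for reproducibility.

```python
import math

def min_samples_for_repeatability(n_labs: int, required_pairs: int = 30) -> int:
    """Smallest number of samples that yields at least `required_pairs` pairs
    of results when every laboratory tests every sample in duplicate."""
    if n_labs < 1:
        raise ValueError("n_labs must be a positive integer")
    return math.ceil(required_pairs / n_labs)

for labs in (5, 6, 10):
    print(labs, "laboratories ->", min_samples_for_repeatability(labs), "samples")
```

With the minimum of five laboratories this gives six samples, which illustrates why a larger number of laboratories reduces the number of samples required.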
Executing the inter-laboratory programme
One person shall be responsible for the entire programme, from the distribution of the texts of the test method and the samples to the final appraisal of the results. This person shall be familiar with the test method but shall not take part in the tests.
The text of the test method shall be distributed to all laboratories in time for any queries to be raised before the tests begin. A laboratory that wishes to practise the method in advance shall do so with samples other than those used in the programme.
Copyright International Organization for Standardization
Provided by IHS under license with ISO
The organizer shall accumulate, subdivide and distribute the samples with care, keeping a reserve for emergencies, and shall ensure that each laboratory's portion is homogeneous. Samples shall be blind coded before distribution and accompanied by the following information:
⎯ the agreed test method;
⎯ handling and storage instructions for the samples;
⎯ the order in which the samples are to be tested (a different random order for each laboratory);
⎯ a statement that two results are to be obtained consecutively on each sample by the same operator with the same apparatus; for statistical reasons the second result must be independent of the first, so the pair should be obtained blind if necessary, but within a short period of time;
⎯ the period within which repeated results are to be obtained and the period within which all samples are to be tested;
⎯ a blank form for reporting the results, with space for the date of testing, the results and any unusual occurrences for each sample, and a statement of the unit of accuracy for reporting the results;
⎯ a statement that the test is to be carried out under normal conditions by operators with good experience but not exceptional knowledge, and that the duration of the test is to be the same as normal.
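Blind coding and a separate random test order for each laboratory are easy to generate programmatically. The following is a minimal sketch, not a prescribed procedure: the code format (`S001`, `S002`, ...), the function name and the sample descriptions are all hypothetical.

```python
import random

def make_test_plan(labs, samples, seed=None):
    """Return (key, plan): `key` maps blind codes to the true sample identities,
    `plan` maps each laboratory to its own random order of blind codes."""
    rng = random.Random(seed)
    numbers = list(range(1, len(samples) + 1))
    rng.shuffle(numbers)  # the code numbers carry no information about the sample
    key = {f"S{n:03d}": sample for n, sample in zip(numbers, samples)}
    plan = {}
    for lab in labs:
        order = sorted(key)
        rng.shuffle(order)  # independent random sequence for each laboratory
        plan[lab] = order
    return key, plan

key, plan = make_test_plan(["A", "B", "C"], ["sample 1", "sample 2", "sample 3"], seed=1)
for lab, order in plan.items():
    print(lab, order)
```

The key is kept by the organizer only; laboratories receive the coded samples and their own test order.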
The pilot-programme operators may take part in the inter-laboratory programme. Having tested a few more samples, they have extra experience; if this produces a noticeable difference in their results, it is a warning that the test method may not be satisfactory. These operators shall be identified in the report of the results so that any such effect can be noted and investigated.
5 Inspection of inter-laboratory results for uniformity and for outliers
General
The procedures in 5.2 to 5.7 are used to inspect the results of an inter-laboratory programme. They:
⎯ assess the independence or dependence of precision and the level of results;
⎯ assess the uniformity of precision from laboratory to laboratory;
⎯ detect the presence of outliers.
The procedures are described in mathematical terms based on the notation of Annex C. Throughout 5.2 to 5.7 (and Clause 6), each procedure is first specified and then illustrated by a worked example using the data (calculation of bromine number) set out in Annex D.
This clause assumes that all results come from a single normal distribution, or can be transformed into one. Cases that do not meet this assumption are rare and require treatment that is beyond the scope of this International Standard. See Reference [8] for statistical tests of normality.
Although the procedures are given in a form suitable for manual calculation, it is recommended that a computer with validated software be used to store and analyse inter-laboratory test results in accordance with this International Standard.
Transformation of data
In many test methods the precision depends on the level of the test result, so that the variability of the reported results differs from sample to sample. The method of analysis in this International Standard requires that this is not so. Where necessary, the position is rectified by a transformation of the data.
The laboratories standard deviations, D_j, and the repeats standard deviations, d_j, are calculated for each sample and plotted against the sample means, m_j. If the points can be considered as scattered about lines parallel to the m-axis, no transformation is needed. If the plotted points describe non-horizontal straight lines or curves of the form D = f1(m) and d = f2(m), a transformation is necessary.
In general, the relationships D = f1(m) and d = f2(m) are not identical. This International Standard requires, however, that the same transformation be applicable to both repeatability and reproducibility. The two relationships are therefore combined into a single dependency, D = f(m) (where D now includes d), by including a dummy variable, T. This takes account of the difference between the relationships, if one exists, and provides a means of testing for this difference (see Clause F.1).
The relationship D = f(m) is best estimated by weighted linear regression analysis, although an unweighted regression often gives a satisfactory approximation. The derivation of the weights is described in Clause F.2 and the computational procedure for the regression in Clause F.3. Typical forms of the dependence D = f(m) are given in Clause E.1, all expressed in terms of the transformation parameters B and B0.
The estimation of B and B0, and the transformation procedure that follows, are summarized in Clause E.2. This includes statistical tests, at the 5 % significance level, of the significance of the regression (that is, whether the relationship D = f(m) is parallel to the m-axis) and of the difference between the repeatability and reproducibility relationships. If such a difference is found to exist, or if no suitable transformation exists, the alternative sample-by-sample procedures of ISO 5725-2 shall be used. In that case it is not possible to test for laboratory bias over all samples (see 5.6) or to estimate the interaction component of variance separately (see 6.2).
If a significant regression of the form D = f(m) is demonstrated at the 5 % significance level, the appropriate transformation y = F(x), where x is the reported result, is given by
$$F(x) = K \int \frac{dx}{f(x)}$$
where K is a constant. In that event, all results shall be transformed accordingly and the remainder of the analysis carried out in terms of the transformed results. Typical transformations are given in Clause E.1.
The choice of transformation is difficult to make the subject of formal rules, and qualified statistical assistance may be required in particular cases. The presence of outliers can also affect the choice of transformation (see 5.7).
Table 1 lists the values of m, D and d for the eight samples in the example given in Annex D, correct to three significant digits. Corresponding degrees of freedom are in parentheses.
Inspection of Table 1 shows that both D and d increase with m, the rate of increase diminishing as m increases. A plot of log D and log d against log m indicates that the points can reasonably be considered as lying about two straight lines, that is, a power-law relationship. The example calculations in Clause F.4 show that the gradients of these lines are the same, with an estimated value of 0,638. Allowing for the errors of estimation, this gradient may conveniently be taken as 2/3.
Hence, the same transformation is appropriate both for repeatability and reproducibility, and is given by the equation:
$$\int x^{-2/3}\,dx = 3x^{1/3}$$
Since the constant multiplier can be disregarded, the transformation reduces to taking the cube roots of the reported bromine numbers. This yields the transformed data shown in Table D.2, in which the cube roots are quoted correct to three decimal places.
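The link between the fitted gradient and the cube-root transformation can be sketched in a few lines. The example below uses an unweighted least-squares fit of log(standard deviation) against log(mean), which the text describes only as an approximation to the weighted regression, and it omits the dummy variable T. The data are synthetic and constructed to follow a power law with exponent 2/3; they are not the Annex D values.

```python
import math

def loglog_slope(means, sds):
    """Unweighted least-squares gradient of log(sd) against log(mean)."""
    x = [math.log(v) for v in means]
    y = [math.log(v) for v in sds]
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx

def power_transform(x, b):
    """Transformation for sd proportional to mean**b, i.e. the integral of
    x**(-b) with the constant multiplier dropped (log(x) when b == 1)."""
    if b == 1:
        return math.log(x)
    return x ** (1.0 - b)

means = [1.0, 8.0, 27.0, 64.0, 125.0]             # synthetic sample means
sds = [0.05 * m ** (2.0 / 3.0) for m in means]    # synthetic power-law scatter
b = loglog_slope(means, sds)
print("estimated gradient:", round(b, 3))
print("transformed value of 27:", power_transform(27.0, 2.0 / 3.0))
```

With a gradient of 2/3 the exponent of the transformation is 1 − 2/3 = 1/3, which is the cube root used in the worked example.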
Tests for outliers
The reported data, or the transformed results if a transformation has been made, shall be inspected for outliers. These are values so different from the remainder that they can only be attributed to a fault in applying the test method or to testing the wrong sample. Many tests, at various significance levels, could be used, but those specified in this International Standard are considered appropriate. All the outlier tests assume that errors are normally distributed.
The first outlier test detects discordant results within a pair of repeat results. The values of e²_ij are calculated for all laboratory/sample combinations. Cochran's criterion is then applied at the 1 % significance level: the ratio of the largest e²_ij to the sum of all the e²_ij is compared with the value given in Table D.3 for one degree of freedom, n being the number of pairs available for comparison. If the ratio exceeds this value, the member of the pair farthest from the sample mean is rejected, and the process is repeated with n reduced by one until no further rejections are called for. In some cases this leads to a "snowball" effect, with an unacceptably large proportion of rejections.
If the proportion of rejections becomes unacceptably large (say more than 10 %), this rejection test shall be abandoned and some or all of the rejected results shall be retained. An arbitrary decision based on judgement is necessary in this case.
The example in Annex D uses the absolute differences (ranges) between the transformed repeat results, that is, between the pairs of numbers in Table D.2, expressed in units of the third decimal place.
The largest range is 0,078 for laboratory G on sample 3. The sum of squares of all the ranges is 0,138.
Thus, the ratio to be compared with Cochran's criterion is
$$\frac{0{,}078^2}{0{,}138} = 0{,}043\,9$$
There are 72 ranges and, as the criterion from Table D.3 for 80 ranges is 0,170 9, this ratio is not significant.
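The ratio used in this test is straightforward to compute. The sketch below follows the description in the text (largest squared range divided by the sum of all squared ranges); the function names and the short list of ranges are hypothetical, and the critical value still has to be read from Table D.3.

```python
def cochran_ratio(ranges):
    """Largest squared range divided by the sum of all squared ranges."""
    squares = [r * r for r in ranges]
    total = sum(squares)
    if total == 0:
        raise ValueError("all ranges are zero; the ratio is undefined")
    return max(squares) / total

def exceeds_cochran_criterion(ranges, critical_value):
    """True if the pair with the largest range is a candidate for rejection."""
    return cochran_ratio(ranges) > critical_value

ranges = [0.042, 0.021, 0.078, 0.026, 0.000, 0.015]  # hypothetical ranges
ratio = cochran_ratio(ranges)
print(ratio, exceeds_cochran_criterion(ranges, critical_value=0.1709))
```

After a rejection the function is simply called again on the remaining ranges, with the critical value for the reduced number of pairs.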
The following outlier tests concern the uniformity of the reproducibility estimate. They are designed to detect either a discordant pair of results from one laboratory on a particular sample, or a discordant set of results from one laboratory on all samples. Hawkins' test [2] is used for both purposes.
This involves forming, for each sample and finally for the overall laboratory averages (see 5.6), the ratio of the largest absolute deviation of a laboratory mean from the sample (or overall) mean to the square root of the appropriate sums of squares (see Clause C.6).
The ratio corresponding to the largest absolute deviation shall be compared with the critical 1 % values given in Table D.4. Here n is the number of laboratory/sample cells in the sample concerned, or the number of overall laboratory means, and ν is the number of degrees of freedom of the sum of squares that is additional to those of the sample in question. In the test for laboratory/sample cells, ν refers to the other samples; in the test for overall laboratory averages, ν is zero.
If a significant value is found for an individual sample, the corresponding extreme values shall be omitted and the process repeated. If extreme values are found in the overall laboratory averages, all the results from that laboratory shall be rejected.
If the test leads to a "snowball" effect, with an unacceptably large proportion of rejections (say more than 10 %), this rejection test shall be abandoned and some or all of the rejected results shall be retained. An arbitrary decision based on judgement is necessary in this case.
The application of Hawkins' test to cell means within samples is shown below.
The first step is to calculate the deviations of cell means from the respective sample means over the whole array. These are shown in Table 3, in units of the third decimal place.
The sums of squares of the deviations are then calculated for each sample. These are also shown in Table 3, in units of the third decimal place.
The cell tested is the one with the most extreme deviation. This was obtained by laboratory D from sample 1. The appropriate Hawkins' test ratio is therefore:
The critical value, for n = 9 cells in sample 1 and ν = 56 extra degrees of freedom from the other samples, is interpolated from Table D.4 as 0,372 9. The test value is greater than the critical value, and so the results from laboratory D on sample 1 are rejected.
Following this rejection, the mean value, deviations and sum of squares are recalculated for sample 1, and the procedure is repeated. The next cell to be tested is that of laboratory F on sample 2. Hawkins' test ratio for this cell is then calculated.
The critical value, for n = 9 cells in sample 2 and ν = 55 extra degrees of freedom, is interpolated from Table D.4 as 0,375 6. As the test ratio is less than the critical value, no further rejections are necessary.
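The form of the Hawkins' ratio can be illustrated as follows. This is a sketch under a stated assumption: the denominator is taken as the square root of the sum of squares of the deviations in the sample under test plus the sums of squares from the other samples, which is how the text describes the extra degrees of freedom ν. The exact sums of squares are defined in Clause C.6, which is not reproduced here, and the numbers below are hypothetical.

```python
import math

def hawkins_ratio(deviations, other_sums_of_squares=()):
    """Return (index, ratio) for the most extreme deviation in one sample.

    deviations: deviations of cell means from the sample mean.
    other_sums_of_squares: sums of squared deviations of the other samples
    (assumed here to be pooled into the denominator).
    """
    own_ss = sum(d * d for d in deviations)
    total_ss = own_ss + sum(other_sums_of_squares)
    if total_ss == 0:
        raise ValueError("no variation in the data")
    worst = max(range(len(deviations)), key=lambda k: abs(deviations[k]))
    return worst, abs(deviations[worst]) / math.sqrt(total_ss)

# Hypothetical deviations for nine laboratories on one sample.
deviations = [0.02, -0.01, 0.03, -0.31, 0.05, 0.04, 0.06, 0.07, 0.05]
index, ratio = hawkins_ratio(deviations, other_sums_of_squares=[0.015, 0.020])
print("cell", index, "ratio", round(ratio, 4))
```

The ratio returned is compared with the 1 % value from Table D.4 for the appropriate n and ν; the comparison itself is not coded here because the table is not part of this section.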
Rejection of complete data from a sample
The laboratories standard deviation and the repeats standard deviation shall be examined for any outlying samples. If a transformation has been carried out or any rejection made, new standard deviations shall be calculated.
If the standard deviation for any sample is excessively large, it shall be examined with a view to rejecting the results from that sample.
When the standard deviations are based on the same number of degrees of freedom, Cochran's criterion at the 1 % significance level can be used. The ratio of the largest of the corresponding sums of squares (laboratories or repeats, as appropriate) to their total is calculated and compared with the critical value given in Table D.3, with n taken as the number of samples and ν as the degrees of freedom. If the ratio exceeds the critical value, all the results from the sample in question shall be rejected. Before rejecting, it is essential to confirm that the extreme standard deviation is not caused by an inappropriate transformation or by undetected outliers.
When the variances are based on different numbers of degrees of freedom, there is no optimal test. However, the ratio of the largest variance to the variance pooled from the remaining samples follows an F-distribution with ν1 and ν2 degrees of freedom, where ν1 is the degrees of freedom of the variance in question and ν2 is the degrees of freedom from the remaining samples. If the ratio is greater than the critical value given in Tables D.6 to D.10, corresponding to a significance level of 0,01/S, where S is the number of samples, the results from the sample in question shall be rejected.
The standard deviations of the transformed results, after the rejection of the pair of results by laboratory D on sample 1, are given in Table 4 in ascending order of sample mean, correct to three significant digits. Corresponding degrees of freedom are in parentheses.
Inspection shows that there is no outlying sample among these. The standard deviations are now independent of the sample means, which was the purpose of the transformation.
The figures in Table 5, taken from a test programme on bromine numbers over 100, illustrate the case of a sample rejection.
The laboratories standard deviation of sample 93, at 15,26, is far greater than those of the other samples. The repeats standard deviation for this sample is also large.
Since the laboratories degrees of freedom are not the same for all samples, the variance ratio test is used. The variance pooled over all samples except sample 93 is the sum of their sums of squares divided by the sum of their degrees of freedom.
The variance ratio is then calculated as 15,26²/19,96 = 11,66.
From Tables D.6 to D.10, the critical value corresponding to a significance level of 0,01/8 = 0,001 25, for 8 and 63 degrees of freedom, is approximately 4. This is less than the test ratio, and the results from sample 93 shall therefore be rejected.
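The arithmetic of the variance ratio test can be reproduced directly from the figures quoted above. In this sketch the pooled variance (19,96) and the approximate critical value (4) are taken from the text; the helper `pooled_variance` is shown with hypothetical inputs only, since Table 5 is not reproduced here.

```python
def pooled_variance(sums_of_squares, degrees_of_freedom):
    """Variance pooled over several samples: total sum of squares / total df."""
    return sum(sums_of_squares) / sum(degrees_of_freedom)

def variance_ratio(suspect_sd, pooled_var):
    """Ratio of the suspect sample's variance to the pooled variance."""
    return suspect_sd ** 2 / pooled_var

def per_sample_significance(overall_level, n_samples):
    """Significance level used for each sample, 0.01/S in the text."""
    return overall_level / n_samples

ratio = variance_ratio(15.26, 19.96)          # figures quoted in the text
alpha = per_sample_significance(0.01, 8)
critical = 4.0                                # approximate value quoted in the text
print(ratio, alpha, ratio > critical)

# Hypothetical inputs, only to show how a pooled variance is formed.
print(pooled_variance([120.0, 95.0, 150.0], [8, 7, 8]))
```

The computed ratio agrees with the quoted 11,66 to within rounding of the inputs.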
Cochran's test can be used for the repeats standard deviations, since the degrees of freedom are the same for every sample. The test ratio is the largest sum of squares (sample 93) divided by the sum of all the sums of squares, that is:
This is greater than the critical value of 0,352 corresponding to n = 8 and ν = 8 (see Table D.3), and confirms that the results from sample 93 shall be rejected.
Estimating missing or rejected values
5.5.1 One of the two repeat values missing or rejected
If one of a pair of repeats (y_ij1 or y_ij2) is missing or rejected, it shall be considered to have the same value as the other repeat, in accordance with the least squares method.
5.5.2 Both repeat values missing or rejected
If both repeat values are missing or rejected, estimates of a_ij (= y_ij1 + y_ij2) are made by forming the laboratories × samples interaction sum of squares, with the missing pair sums included as unknown variables. If all results from a particular laboratory or sample have been rejected, that laboratory or sample is excluded and new values of L and S are used. The estimates of the missing or rejected values are those that minimize the interaction sum of squares: its partial derivative with respect to each unknown is set to zero, and the resulting simultaneous equations are solved.
If the value of one pair sum, a_ij, has to be estimated, the estimate is given by Equation (4):
$$a_{ij} = \frac{1}{(L-1)(S'-1)}\left(L L_1 + S' S_1 - T_1\right) \qquad (4)$$
where
S′ is S minus the number of samples rejected in 5.4;
L_1 is the total of remaining pairs in the ith laboratory;
S_1 is the total of remaining pairs in the jth sample;
T_1 is the total of all pairs except a_ij.
Equation (4) applies directly when a single pair sum has to be estimated. If more estimates are to be made, the technique of successive approximation is used: each pair sum is estimated in turn from Equation (4), using values of L_1, S_1 and T_1 that contain the latest estimates of the other missing pairs. Initial values can be based on the appropriate sample mean, and the process usually converges to the required level of accuracy within three complete iterations. See Reference [5] for details.
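The two results from laboratory D on sample 1 were rejected (see 5.3.3) and thus a_ij has to be estimated.
⎯ total of remaining results in laboratory 4 = 36,354;
⎯ total of remaining results in sample 1 = 19,845;
⎯ total of all the results except a_ij = 348,358.
Hence, with nine laboratories and eight samples, the estimate of a_ij is given by
$$a_{ij} = \frac{(9 \times 36{,}354) + (8 \times 19{,}845) - 348{,}358}{(9-1)(8-1)} = \frac{137{,}588}{56} = 2{,}457$$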
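The estimate can be checked numerically. The sketch below uses the standard least-squares estimate for one missing cell in a two-way layout, (L·L1 + S′·S1 − T1)/((L − 1)(S′ − 1)), which is assumed here to be the form of Equation (4); the function name is illustrative. The inputs are the three totals quoted above, with nine laboratories and eight samples.

```python
def estimate_missing_pair_sum(n_labs, n_samples, lab_total, sample_total, grand_total):
    """Least-squares estimate of one missing pair sum in a two-way array.

    lab_total: total of the remaining pairs in the laboratory concerned (L1)
    sample_total: total of the remaining pairs in the sample concerned (S1)
    grand_total: total of all pairs except the missing one (T1)
    """
    numerator = n_labs * lab_total + n_samples * sample_total - grand_total
    return numerator / ((n_labs - 1) * (n_samples - 1))

a_ij = estimate_missing_pair_sum(9, 8, 36.354, 19.845, 348.358)
print(round(a_ij, 3))
```

When several pairs are missing, the same function can be applied to each missing cell in turn, updating the totals with the latest estimates after each pass, in line with the successive approximation described in 5.5.2.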
Rejection test for outlying laboratories
A final rejection test determines whether the complete set of results from any particular laboratory needs to be rejected. It can only be carried out at this stage, once the earlier inspections have been completed and any missing values estimated. Hawkins' test is again used, applied to the laboratory averages over all samples, with any estimated results included. If any laboratory is rejected on all samples, new estimates shall be calculated for any remaining missing values.
The procedure on the laboratory averages shown in Table 6 follows exactly that specified in 5.3.3.
The deviations of laboratory averages from the overall mean are given in Table 7 in units of the fourth decimal place, together with the sum of squares.
Hawkins' test ratio, B, is then formed from these values. Comparison with the value tabulated in Table D.4, for n = 9 and ν = 0, shows that this ratio is not significant and, therefore, no complete laboratory rejections are necessary.
Confirmation of selected transformation
At this stage it is necessary to check that the rejections carried out have not invalidated the transformation used. If necessary, the procedure of 5.2 shall be repeated with the outliers deleted, and if a new transformation is selected, the outlier tests shall be reapplied with the new transformation.
It is not considered necessary in this case to repeat the calculations from 5.2 with the outlying pair deleted.
6 Analysis of variance, calculation and expression of precision estimates
General
After the data have been inspected for uniformity, any necessary transformation has been performed and any outliers have been rejected (see Clause 5), an analysis of variance is carried out. First an analysis of variance table is constructed, and then the precision estimates are derived from it.
Analysis of variance
6.2.1 Forming the sums of squares for the laboratories × samples interaction sum of squares
The estimated values, if any, shall be put in the array and an approximate analysis of variance performed.
Mean correction,
$$M_c = \frac{T^2}{2L'S'} \qquad (5)$$
where L′ is L minus the number of laboratories rejected in 5.6 minus the number of laboratories with no remaining results after rejections in 5.3.3.
The laboratories × samples interaction sum of squares, I, is given by:
I = (pairs sum of squares) − (laboratories sum of squares) − (samples sum of squares)
Ignoring any pairs in which there are estimated values, the repeats sum of squares, E, is half the sum of the squared differences between repeats:
$$E = \frac{1}{2}\sum_i \sum_j e_{ij}^2$$
The purpose of this approximate analysis of variance is to obtain the minimized laboratories × samples interaction sum of squares, I. This is then used, as described in 6.2.2, to obtain the laboratories sum of squares.
If there were no estimated values, the above analysis of variance is exact and 6.2.2 shall be disregarded.
Table 8 can then be derived
Table 8 Source of variation Sum of squares
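For a complete array the sums of squares can be formed in a few lines. The sketch below assumes the usual two-way definitions, with the mean correction M_c subtracted from each of the samples, laboratories and pairs sums of squares; these individual definitions are not spelled out in this section, so they should be read as an assumption. The data are hypothetical and purely additive, so the interaction term comes out as zero.

```python
def approximate_anova(pairs):
    """Sums of squares for a complete laboratories x samples array.

    pairs[i][j] is the tuple (y_ij1, y_ij2) for laboratory i on sample j.
    """
    n_labs = len(pairs)
    n_samples = len(pairs[0])
    a = [[y1 + y2 for y1, y2 in row] for row in pairs]   # pair sums a_ij
    e = [[y1 - y2 for y1, y2 in row] for row in pairs]   # pair differences e_ij
    h = [sum(row) for row in a]                           # laboratory totals
    g = [sum(a[i][j] for i in range(n_labs)) for j in range(n_samples)]  # sample totals
    t = sum(h)                                            # grand total T
    mc = t * t / (2 * n_labs * n_samples)                 # mean correction, Equation (5)
    samples_ss = sum(v * v for v in g) / (2 * n_labs) - mc
    labs_ss = sum(v * v for v in h) / (2 * n_samples) - mc
    pairs_ss = sum(v * v for row in a for v in row) / 2 - mc
    return {
        "samples": samples_ss,
        "laboratories": labs_ss,
        "interaction": pairs_ss - labs_ss - samples_ss,
        "repeats": sum(v * v for row in e for v in row) / 2,
    }

# Hypothetical additive data: 2 laboratories x 3 samples, repeats differing by 0.2.
lab_effect = [0.0, 1.0]
sample_level = [0.0, 2.0, 5.0]
pairs = [[(lab + s + 0.1, lab + s - 0.1) for s in sample_level] for lab in lab_effect]
for source, value in approximate_anova(pairs).items():
    print(source, round(value, 4))
```

If the array contains estimated pairs, this calculation is only the approximate analysis; the exact laboratories sum of squares is then obtained as described in 6.2.2.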
6.2.2 Forming the sum of squares for the exact analysis of variance
In 6.2.2, all the estimated pairs are disregarded and new values of g_j are calculated. The following sums of squares for the exact analysis of variance [3] are formed.
Uncorrected samples sum of squares
$$= \sum_j \frac{g_j^2}{S_j} \qquad (10)$$
where S_j = 2 (L′ − number of missing pairs in that sample).
Uncorrected pairs sum of squares
$$= \frac{1}{2}\sum_i \sum_j a_{ij}^2$$
The laboratories sum of squares is equal to (pairs sum of squares) − (samples sum of squares) − (the minimized laboratories × samples interaction sum of squares).
Uncorrected samples sum of squares
$$= \frac{19{,}845^2}{16} + \frac{75{,}512^2}{18} + \dots + \frac{19{,}192^2}{18} = 1\,145{,}183\,4$$
Uncorrected pairs sum of squares
$$= \frac{1}{2}\left(2{,}520^2 + 8{,}041^2 + \dots + 2{,}238^2\right) = 1\,145{,}332\,9$$
Therefore, laboratories sum of squares = 1 145,332 9 − 1 145,183 4 − 0,114 3 = 0,035 2.
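The two steps of 6.2.2 can be written out as follows. The divisor S_j = 2(L′ − number of missing pairs) is taken from the text, and the final subtraction uses the three values quoted in the example; the function names are illustrative and the single-sample call merely reproduces the first term of the samples sum of squares.

```python
def uncorrected_samples_ss(sample_totals, missing_pairs, n_labs):
    """Sum over samples of g_j**2 / S_j, with S_j = 2 * (L' - missing pairs)."""
    total = 0.0
    for g_j, missing in zip(sample_totals, missing_pairs):
        s_j = 2 * (n_labs - missing)  # number of results remaining in sample j
        total += g_j ** 2 / s_j
    return total

def laboratories_ss(pairs_ss, samples_ss, minimized_interaction_ss):
    """Exact laboratories sum of squares as defined in 6.2.2."""
    return pairs_ss - samples_ss - minimized_interaction_ss

# First term of the worked example: sample 1 has one missing pair out of nine.
print(uncorrected_samples_ss([19.845], [1], 9))

# Final subtraction with the values quoted in the example.
print(round(laboratories_ss(1145.3329, 1145.1834, 0.1143), 4))
```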
The degrees of freedom for the laboratories are (L′ − 1). The degrees of freedom for the laboratories × samples interaction are (L′ − 1)(S′ − 1) for a complete array and are reduced by one for each pair that is estimated.
The degrees of freedom for repeats are (L′S′) and are reduced by one for each pair in which one or both values are estimated.
There are eight samples and nine laboratories in this example. As no complete laboratories or samples were rejected, S′ = 8 and L′ = 9.
Laboratories × samples interaction degrees of freedom, if there had been no estimates, would have been
(9 − 1)(8 − 1) = 56. However, because one pair was estimated, the laboratories × samples interaction degrees of freedom are reduced to 55. If no estimates had been made, the repeats degrees of freedom would have been 72. Because of the estimated pair, the repeats degrees of freedom are reduced to 71.
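The degrees-of-freedom rules above translate directly into code. In this sketch the two counts are kept separate because the text distinguishes between pairs that are wholly estimated (which reduce the interaction degrees of freedom) and pairs in which one or both values are estimated (which reduce the repeats degrees of freedom); the function and argument names are illustrative.

```python
def degrees_of_freedom(n_labs, n_samples, estimated_pairs=0, pairs_with_estimates=0):
    """Degrees of freedom for the analysis of variance.

    estimated_pairs: number of pairs that were estimated as a whole.
    pairs_with_estimates: number of pairs with one or both values estimated.
    """
    return {
        "laboratories": n_labs - 1,
        "interaction": (n_labs - 1) * (n_samples - 1) - estimated_pairs,
        "repeats": n_labs * n_samples - pairs_with_estimates,
    }

# Worked example: nine laboratories, eight samples, one estimated pair.
print(degrees_of_freedom(9, 8, estimated_pairs=1, pairs_with_estimates=1))
```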
6.2.4 Mean squares and analysis of variance
The mean square in each case is the sum of squares divided by the degrees of freedom This leads to the analysis of variance shown in Table 9
Table 9
Source of variation | Degrees of freedom | Sum of squares | Mean square
Laboratories | L′ − 1 | Laboratories sum of squares | \( M_L \)
Laboratories × samples | (L′ − 1)(S′ − 1) − number of estimated pairs | I | \( M_{LS} \)
Repeats | L′S′ − number of pairs in which one or both values are estimated | E | \( M_r \)
The ratio \( M_L/M_{LS} \) is distributed as F with the corresponding laboratories and interaction degrees of freedom (see Clause C.7). If this ratio exceeds the 5 % critical value given in Table D.6, bias between the laboratories is implied and the programme organizer shall be informed (see 4.5); further standardization of the test method may be necessary.
The analysis of variance is shown in Table 10
Table 10 Source of variation Degrees of freedom Sum of squares Mean square
The ratio \( M_L/M_{LS} \) = 0,004 4/0,002 078 has a value of 2,117. This is greater than the 5 % critical value obtained from Table D.6, indicating bias between laboratories.
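As a check on the arithmetic, the mean squares and their ratio can be recomputed from the sums of squares and degrees of freedom. The sketch below assumes that the interaction sum of squares is 0,114 3 (the value subtracted in the laboratories sum of squares, which is consistent with the quoted mean square of 0,002 078). The function name is hypothetical. The critical value from Table D.6 is not reproduced here and would have to be supplied by the user.

```python
def variance_ratio(ss_labs, df_labs, ss_interaction, df_interaction):
    """Return (M_L, M_LS, M_L / M_LS) from sums of squares and degrees of freedom."""
    m_l = ss_labs / df_labs
    m_ls = ss_interaction / df_interaction
    return m_l, m_ls, m_l / m_ls


# Example values: laboratories SS = 0.0352 on 8 degrees of freedom;
# interaction SS = 0.1143 (assumed, see lead-in) on 55 degrees of freedom.
m_l, m_ls, ratio = variance_ratio(0.0352, 8, 0.1143, 55)
print(f"M_L = {m_l:.4f}  M_LS = {m_ls:.6f}  ratio = {ratio:.3f}")
```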
Expectation of mean squares and calculation of precision estimates
6.3.1 Expectation of mean squares with no estimated values
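For a complete array with no estimated values, the expectations of mean squares are:
Laboratories: \( \sigma_0^2 + 2\sigma_1^2 + 2S'\sigma_2^2 \)
Laboratories × samples: \( \sigma_0^2 + 2\sigma_1^2 \)
Repeats: \( \sigma_0^2 \)
where \( \sigma_1^2 \) is the component of variance due to interaction between laboratories and samples; \( \sigma_2^2 \) is the component of variance due to differences between laboratories.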
Copyright International Organization for Standardization
Provided by IHS under license with ISO
6.3.2 Expectation of mean squares with estimated values
The coefficients of \( \sigma_0^2 \) and \( \sigma_2^2 \) in the expectation of mean squares are altered in the cases where there are estimated values. The expectations of mean squares then become:
Laboratories: \( \alpha\sigma_0^2 + 2\sigma_1^2 + \beta\sigma_2^2 \)
Laboratories × samples: \( \gamma\sigma_0^2 + 2\sigma_1^2 \)
Repeats: \( \sigma_0^2 \)
where \( \beta = 2(K - S')/(L' - 1) \); K is the number of laboratory × sample cells containing at least one result; α and γ are computed as follows.
⎯ If there are no cells with only a single estimated result, then α = γ = 1.
⎯ If there are no empty cells (i.e. every laboratory has tested every sample at least once, and K = L′ × S′), then α and γ are both 1 plus the proportion of cells with only a single result.
⎯ Otherwise, for each laboratory, compute the proportion \( p_i \) of the samples it tested for which there is only one result, and the sum \( P \) of these proportions over all laboratories. For each sample, compute the proportion \( q_j \) of the laboratories that tested it for which there is only one result, and the sum \( Q \) of these proportions over all samples. Compute the total number \( W \) of cells with only one result and the proportion \( W/K \) of these among all non-empty cells.
NOTE The development in 6.3.2 is based upon the assumption that both samples and laboratories are “random effects”
For the example that has eight samples and nine laboratories, one cell is empty (laboratory D for sample 1), so K = 71 and
\( \beta = 2(71 - 8)/(9 - 1) = 15{,}75 \)
None of the non-empty cells has a single result, so α = γ = 1.
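The coefficient β for the example can be checked numerically. The sketch below assumes that β takes the form 2(K − S′)/(L′ − 1), which reproduces the quoted value of 15,75 and reduces to 2S′ for a complete array. It covers only the simple case for α and γ in which there are no empty cells. The function names are hypothetical.

```python
def beta_coefficient(non_empty_cells, n_labs, n_samples):
    """beta = 2(K - S') / (L' - 1), an assumed reading of the expression in the text."""
    return 2 * (non_empty_cells - n_samples) / (n_labs - 1)


def alpha_gamma_no_empty_cells(single_result_cells, n_labs, n_samples):
    """alpha = gamma = 1 + proportion of cells holding a single result.

    Only valid when there are no empty cells, i.e. K = L' * S'.
    """
    return 1 + single_result_cells / (n_labs * n_samples)


# Example from the text: K = 71 non-empty cells, nine laboratories, eight samples.
print(beta_coefficient(71, 9, 8))
```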
The repeatability variance is twice the mean square for repeats. The repeatability estimate is the product of the repeatability standard deviation and the t-value, \( t_\nu \), with the appropriate degrees of freedom ν (see Table D.5), corresponding to a two-sided probability of 95 %.
This calculated estimate shall be rounded to no fewer than three and no more than four significant digits
Note that if a transformation y = F(x) has been used, then
\( r(x) = \left| \dfrac{\mathrm{d}x}{\mathrm{d}y} \right| r(y) \) (13)
where r(x), r(y) are the corresponding repeatability functions (see Table E.1). A similar relationship applies to the reproducibility functions R(x), R(y).
Repeatability variance \( V_r = 2\sigma_0^2 = 0{,}000\,616 \)
Repeatability of y \( = t_{71}\sqrt{0{,}000\,616} = 0{,}049\,5 \)
Repeatability of x \( = 3x^{2/3} \times 0{,}049\,5 = 0{,}148\,x^{2/3} \)
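The repeatability calculation for the example can be reproduced as follows. The sketch assumes a two-sided 95 % t-value of 1,994 for 71 degrees of freedom (Table D.5 is not reproduced here, so this value is an assumption) and the cube-root transformation y = x^(1/3), for which dx/dy = 3x^(2/3). The function names are hypothetical.

```python
import math

T_71 = 1.994  # assumed two-sided 95 % Student's t for 71 degrees of freedom


def repeatability_of_y(mean_square_repeats, t_value):
    """r(y) = t * sqrt(V_r), with V_r = 2 * (mean square for repeats)."""
    variance = 2 * mean_square_repeats
    return t_value * math.sqrt(variance)


def repeatability_of_x(x, r_y):
    """Back-transform for y = x**(1/3): r(x) = |dx/dy| * r(y) = 3 * x**(2/3) * r(y)."""
    return 3 * x ** (2 / 3) * r_y


r_y = repeatability_of_y(0.000308, T_71)   # mean square for repeats = V_r / 2
print(round(r_y, 4), round(repeatability_of_x(1.0, r_y), 3))
```

The second printed figure is the coefficient of x^(2/3) in the repeatability function.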
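The reproducibility variance, \( V_R \), is expressed as
\( V_R = 2(\sigma_0^2 + \sigma_1^2 + \sigma_2^2) \)
and can be calculated using Equation (14):
\( V_R = \dfrac{2}{\beta} M_L + \left(1 - \dfrac{2}{\beta}\right) M_{LS} + \left(2 - \gamma + \dfrac{2}{\beta}(\gamma - \alpha)\right) M_r \) (14)
where the symbols are as set out in 6.2.4 and 6.3.2.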
The reproducibility estimate is the product of the reproducibility standard deviation and the t-value, \( t_\nu \), with the appropriate degrees of freedom ν (see Table D.5), corresponding to a two-sided probability of 95 %. An approximation to the degrees of freedom of the reproducibility variance is given by Equation (15):
\( \nu = \dfrac{V_R^2}{\dfrac{r_1^2}{L' - 1} + \dfrac{r_2^2}{\nu_{LS}} + \dfrac{r_3^2}{\nu_r}} \) (15)
where \( r_1 \), \( r_2 \) and \( r_3 \) are the three successive terms in Equation (14); \( \nu_{LS} \) is the degrees of freedom for laboratories × samples; \( \nu_r \) is the degrees of freedom for repeats.
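The two calculations can be sketched together. The three terms below are obtained by solving the expectations of mean squares (coefficients α, β, γ) for 2(σ0² + σ1² + σ2²). This form is an assumption in so far as it is derived here rather than quoted. The degrees of freedom follow the usual Satterthwaite-type approximation. All function names are hypothetical.

```python
def reproducibility_terms(m_l, m_ls, m_r, alpha=1.0, beta=None, gamma=1.0):
    """Three successive terms r1, r2, r3 whose sum is V_R = 2(s0^2 + s1^2 + s2^2)."""
    r1 = (2 / beta) * m_l
    r2 = (1 - 2 / beta) * m_ls
    r3 = (2 - gamma + (2 / beta) * (gamma - alpha)) * m_r
    return r1, r2, r3


def reproducibility_df(terms, n_labs, df_interaction, df_repeats):
    """Approximate degrees of freedom of V_R from its three terms."""
    r1, r2, r3 = terms
    v_r = r1 + r2 + r3
    denominator = (r1 ** 2 / (n_labs - 1)
                   + r2 ** 2 / df_interaction
                   + r3 ** 2 / df_repeats)
    return v_r ** 2 / denominator


# Example mean squares from the text, with beta = 15.75 and alpha = gamma = 1.
terms = reproducibility_terms(0.0044, 0.002078, 0.000308, beta=15.75)
print(sum(terms), reproducibility_df(terms, 9, 55, 71))
```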
The calculated estimate of reproducibility shall also be rounded to no fewer than three and no more than four significant digits
Substantial bias between laboratories reduces the degrees of freedom estimated by Equation (15). If the reproducibility degrees of freedom are less than 30, the programme organizer shall be informed (see 4.5); further standardization of the test method may be necessary.
Expression of precision estimates of a method of test
When the precision of a test method has been determined in accordance with the procedures set out in this International Standard, it shall be included in the method as follows:
“The precision, as determined by statistical examination in accordance with ISO 4259 of inter-laboratory test results on (type of products) in the range (x to y), is given in X.2 and X.3.
The difference between two test results obtained by the same operator with the same apparatus under constant operating conditions on identical test material would, in the long run, in the normal and correct operation of the test method, exceed the value given in Table M (or Figure N) in only one case in twenty.
The difference between two single and independent test results obtained by different operators working in different laboratories on identical test material would, in the long run, in the normal and correct operation of the test method, exceed the value given in Table M (or Figure N) in only one case in twenty.
\( R = f_R(x) \) where x is the average of the test results being compared.”
6.4.2 Only in exceptional cases shall a precision estimate not based upon ISO 4259 be allowed In those cases, the alternative introductory text below shall be used:
The precision evaluation program for the sample matrix containing (p) contents within the range of (q to r) did not meet the requirements specified in ISO 4259 As a result, only an estimated measure of precision based on inter-laboratory data could be provided.
6.4.3 The size of the matrix of samples used to generate the precision statement shall not be quoted unless the exception in 6.4.2 has been exercised.
7 Significance of repeatability ( r ) and reproducibility ( R )
General
The values of repeatability and reproducibility are estimated by analysis of variance (two-factor with replication) of the results of a statistically designed inter-laboratory programme in which different laboratories each test a range of samples. Both values are normally given in each published test method. When calculated in accordance with this International Standard, reproducibility is usually greater than repeatability, because it also includes the variability between laboratories.
See Annex H for an account of the statistical reasoning underlying the equations in this clause.
Repeatability, r
Most laboratories do not carry out more than one test on each sample for routine quality control, except in unusual circumstances such as a dispute or when the test operator wishes to confirm the technique. When multiple results are obtained in such circumstances, their consistency should be checked against the repeatability of the method, as outlined in 7.2.2. It is also useful to know what confidence can be placed in the average result; the procedure for determining this is given in 7.2.3.
When only two results are obtained under repeatability conditions and their difference is less than or equal to r, the test operator may consider the work to be under control and may take the average of the two results as the estimated value of the property being tested.
If the two results differ by more than r, both shall be considered suspect and at least three more results shall be obtained. The difference between the most divergent result and the average of the remaining results is then calculated and compared with a new value, \( r_1 \), instead of r, as given by Equation (16):
\( r_1 = r\sqrt{\dfrac{k}{2(k - 1)}} \) (16)
where k is the total number of results obtained.
If the difference is less than or equal to \( r_1 \), all results are accepted. If the difference exceeds \( r_1 \), the most divergent result is rejected and the procedure is repeated until an acceptable set of results is obtained.
The estimated value of the property is the average of the acceptable results. However, if two or more results out of a total of not more than 20 have been rejected, the operating procedure and the apparatus shall be checked and a new series of tests made, if possible.
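The acceptance procedure of 7.2.2 can be sketched as a short loop. The sketch assumes r1 = r·sqrt(k / (2(k − 1))), which follows from comparing one result with the mean of the other k − 1 results and equals r for k = 2. It also assumes that k is taken as the number of results still under consideration at each pass. The function names are hypothetical.

```python
import math


def r1_limit(r, k):
    """Limit for |most divergent result - mean of the other k - 1 results|."""
    return r * math.sqrt(k / (2 * (k - 1)))


def screen_results(results, r):
    """Return (accepted, rejected) for results obtained under repeatability conditions."""
    accepted, rejected = list(results), []
    if len(accepted) == 2:
        if abs(accepted[0] - accepted[1]) <= r:
            return accepted, rejected
        raise ValueError("results differ by more than r: obtain at least three more")
    while len(accepted) > 2:
        k = len(accepted)

        def gap(i):
            # Distance between result i and the mean of the remaining results.
            rest = accepted[:i] + accepted[i + 1:]
            return abs(accepted[i] - sum(rest) / len(rest))

        worst = max(range(k), key=gap)
        if gap(worst) <= r1_limit(r, k):
            break
        rejected.append(accepted.pop(worst))
    return accepted, rejected


accepted, rejected = screen_results([10.0, 10.5, 10.1, 10.05, 10.0], r=0.2)
print(sum(accepted) / len(accepted), rejected)
```

The average of the accepted results is the estimated value of the property. A caller would still need to apply the check on the number of rejected results described above.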
When a single test operator, working within the precision limits of the method, obtains a series of k results under repeatability conditions with average \( \bar{X} \), it can be assumed with 95 % confidence that the true value, μ, of the characteristic lies within the following limits:
\( \bar{X} - \dfrac{R_1}{\sqrt{2}} \le \mu \le \bar{X} + \dfrac{R_1}{\sqrt{2}} \) (17)
where
\( R_1 = \sqrt{R^2 - r^2\left(1 - \dfrac{1}{k}\right)} \) (18)
Similarly, for the single limit situation, when only one limit is fixed (upper or lower), it can be assumed with 95 % confidence that the true value, μ, of the characteristic is limited as follows:
\( \mu \le \bar{X} + 0{,}59R_1 \) (upper limit) (19)
or
\( \mu \ge \bar{X} - 0{,}59R_1 \) (lower limit) (20)
The factor 0,59 is the ratio \( 0{,}84/\sqrt{2} \), where 0,84 is derived in Annex H.
However, since for most test methods r is much smaller than R, little improvement in the precision of the average is obtained by carrying out multiple testing under repeatability conditions
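The limited benefit of replicate testing can be illustrated numerically. The sketch assumes R1 = sqrt(R² − r²(1 − 1/k)) for the average of k results from one operator, with two-sided 95 % limits of ±R1/√2 and a one-sided margin of 0,59·R1. The function names and the example values of R and r are hypothetical.

```python
import math


def r1_of_average(big_r, small_r, k):
    """R1 for the average of k results obtained under repeatability conditions."""
    return math.sqrt(big_r ** 2 - small_r ** 2 * (1 - 1 / k))


def double_limits(mean, big_r, small_r, k):
    """Two-sided 95 % limits for the true value around the average."""
    half_width = r1_of_average(big_r, small_r, k) / math.sqrt(2)
    return mean - half_width, mean + half_width


def single_limit_margin(big_r, small_r, k):
    """One-sided 95 % margin, using the factor 0.59 = 0.84 / sqrt(2)."""
    return 0.59 * r1_of_average(big_r, small_r, k)


# Hypothetical method with R = 1.0 and r = 0.3: compare one result with four.
for k in (1, 4):
    print(k, round(r1_of_average(1.0, 0.3, k), 4))
```

With r well below R, quadrupling the number of results narrows R1 by only a few per cent.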
If the reproducibility R of a test method is much greater than its repeatability r, the reasons for the large R/r ratio should be investigated and improvement of the test method considered.
Reproducibility, R
This procedure is used to judge the acceptability of results obtained by different laboratories in normal day-to-day operations and transactions. In the event of a dispute between a supplier and a recipient, the procedure set out in Clauses 8 to 10 shall be adopted.
When single results are obtained in two laboratories and their difference is less than or equal to R, the two results shall be considered acceptable. The average of the two results shall be taken as the estimated value of the tested property.
If the two results differ by more than R, both shall be considered as suspect. Each laboratory shall then obtain at least three other acceptable results (see 7.2.2).
In this case, the difference between the averages of all acceptable results of each laboratory shall be judged for conformity using a new value, \( R_2 \), instead of R, as given by Equation (21):
\( R_2 = \sqrt{R^2 - r^2\left(1 - \dfrac{1}{2k_1} - \dfrac{1}{2k_2}\right)} \) (21)
where
R is the reproducibility of the method;
r is the repeatability of the method;
\( k_1 \) is the number of results of the first laboratory;
\( k_2 \) is the number of results of the second laboratory.
If the difference between the averages is less than or equal to \( R_2 \), the averages are acceptable and their overall average shall be taken as the estimated value of the tested property. If the difference between the averages is greater than \( R_2 \), the procedure set out in Clauses 8 to 10 shall be adopted.
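The comparison of two laboratory averages can be sketched as follows. The sketch assumes R2 = sqrt(R² − r²(1 − 1/(2k1) − 1/(2k2))), which reduces to R when each laboratory reports a single result. The function names and the numerical values are hypothetical.

```python
import math


def r2_between_averages(big_r, small_r, k1, k2):
    """Limit for the difference between two laboratory averages of k1 and k2 results."""
    return math.sqrt(big_r ** 2 - small_r ** 2 * (1 - 1 / (2 * k1) - 1 / (2 * k2)))


def averages_agree(mean1, k1, mean2, k2, big_r, small_r):
    """True if the two laboratory averages differ by no more than R2."""
    return abs(mean1 - mean2) <= r2_between_averages(big_r, small_r, k1, k2)


# Hypothetical method with R = 1.0 and r = 0.3; three acceptable results per laboratory.
print(round(r2_between_averages(1.0, 0.3, 3, 3), 4))
print(averages_agree(10.0, 3, 10.5, 3, 1.0, 0.3))
```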
If more than two laboratories (N + 1 > 2) each supply acceptable results, the difference between the most divergent laboratory average and the average of the N other laboratory averages shall be compared with \( R_3 \).
R 1 is given in Equation (18), and corresponds to the most divergent laboratory average
If the difference between the results is less than or equal to R 3 in absolute value, all results are considered acceptable The average of these acceptable results will then be used as the estimated value of the property.
If the difference is greater than R 3 , the most divergent laboratory average shall be rejected and the comparison using Equations (22) and (23) repeated until an acceptable set of laboratory averages is obtained
The average of these laboratory averages shall be taken as the estimated value of the property However, if two or more laboratory averages from a total of not more than 20 have been rejected, the operating procedure and the apparatus shall be checked and a new series of tests made, if possible
When N laboratories obtain acceptable results under reproducibility conditions, giving an average \( \bar{X} \), it can be assumed with 95 % confidence that the true value, μ, of the characteristic lies within limits around this average.
Similarly, for the single limit situation, when only one limit is fixed (upper or lower), it may be assumed with 95 % confidence that the true value, μ, of the characteristic is limited as follows:
These equations also allow a given laboratory (N = 1) to determine the confidence level that can be assigned to the average of results by comparison with the true value
Aim of specifications
The purpose of a specification is to fix a limit or limits to the true value of the property considered. In practice, however, this true value can never be established exactly. The property is measured in the laboratory by a standardized test method, whose results show some scatter, as defined by the repeatability and reproducibility. There is therefore always some uncertainty as to the true value of the property tested.
Petroleum product specifications are controlled in accordance with Clauses 9 and 10. By prior agreement, a supplier and recipient can use the alternative procedures described in Annex I.
It is important that a test method is selected that is sufficiently precise to determine whether or not the product satisfies the specifications.
Construction of specifications limits in relation to precision
Specifications usually place limits on the value of a property. To avoid ambiguity, such limits are normally expressed as "not less than" or "not greater than". Limits are of two types:
⎯ a double limit, upper and lower, for example viscosity not less than 5 mm²/s and not greater than
⎯ a single limit, upper or lower, for example sulfur content not greater than 2 %; lead content not greater than 3,0 g/l; solubility of bitumen not less than 99 %.
In many cases the single limit situation is in effect a double limit situation, because there is an additional implied limit (in the examples above, 0 %, 0 g/l and 100 %, respectively). In true single limit situations, for example flash point not less than 60 °C, this does not apply. In Clauses 8 to 10, A1 denotes the upper limit and A2 the lower limit.
The value chosen for a specification limit shall take into account the reproducibility of the test method adopted, as follows:
⎯ for a double limit (A1 and A2), the specified range (stated or implied) shall be not less than four times the reproducibility R, i.e. (A1 − A2) ≥ 4R;
⎯ for a single limit (A1 or A2), the specified limit shall be at least twice the reproducibility R away from the implied limit, i.e. (A1 − 0) ≥ 2R where an upper limit has an implied lower limit of zero, or (100 − A2) ≥ 2R where a lower limit has an implied upper limit of 100 %.
The requirements of this International Standard apply to specifications drawn up in accordance with these principles
If, for practical reasons, the value of (A1 − A2) is less than 4R, the results obtained will be of doubtful significance in determining whether a sample does or does not satisfy the requirements of the specification. From statistical reasoning, it is desirable for (A1 − A2) to be considerably greater than 4R.
If this is not the case, one or both of the following courses should be considered: examine whether the specification limits can be widened to fit the precision of the test method; examine whether the precision of the test method can be improved, or whether an alternative test method with better precision can be adopted.
It is recommended that the lower limit of the test method scope be at least 2R above the lowest achievable result and that the upper limit be at least 2R below the highest achievable result.
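The rules relating specification limits to reproducibility can be checked mechanically. The following is a minimal sketch with hypothetical function names and illustrative values of R; the limits themselves are the examples quoted in this clause.

```python
def double_limit_is_wide_enough(upper, lower, big_r):
    """Double limit: the specified range shall be not less than 4R."""
    return (upper - lower) >= 4 * big_r


def single_limit_is_far_enough(limit, implied_limit, big_r):
    """Single limit: at least 2R between the specified and the implied limit."""
    return abs(limit - implied_limit) >= 2 * big_r


# Sulfur content not greater than 2 % (implied lower limit 0 %), assuming R = 0.5 %.
print(single_limit_is_far_enough(2.0, 0.0, big_r=0.5))
# Solubility not less than 99 % (implied upper limit 100 %), assuming R = 0.75 %.
print(single_limit_is_far_enough(99.0, 100.0, big_r=0.75))
```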
General
Clause 9 provides general information that allows the supplier or the recipient to judge the quality of a product against the specification on the basis of the test results available. If further action is necessary after examining these results, the procedure set out in Clause 10 shall be followed.
Testing margin at the supplier
A supplier who has no other source of information on the true value of a characteristic than a single result shall consider that the product meets the specification limit, with 95 % confidence, only if the result, X, is such that:
⎯ in the case of a single upper limit, A1:
\( X \le A_1 - 0{,}59R \) (27)
⎯ in the case of a single lower limit, A2:
\( X \ge A_2 + 0{,}59R \) (28)
⎯ in the case of a double limit (A1 and A2), both these conditions are satisfied (see 7.2.3).
Equations (27) and (28) are intended as guidance for the supplier and are not to be interpreted as an obligation. A reported value between the specification value and the limit from Equation (27) or (28) is not proof of non-compliance.
Testing margin at the recipient
A recipient who has no other source of information on the true value of a characteristic than a single result shall consider that the product fails the specification limit, with 95 % confidence, only if the result, X, is such that:
⎯ in the case of a single upper limit A1,
\( X > A_1 + 0{,}59R \) (29)
⎯ in the case of a single lower limit A2,
\( X < A_2 - 0{,}59R \) (30)
⎯ in the case of a double limit (A1 and A2), either of these conditions applies.
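The supplier and recipient testing margins can be sketched side by side. The sketch assumes a one-sided margin of 0,59R on a single result (the factor given in 7.2.3), applied inside the limit for the supplier and outside it for the recipient. The function names and example values are hypothetical.

```python
MARGIN_FACTOR = 0.59  # one-sided 95 % factor quoted in 7.2.3


def supplier_confident_of_compliance(x, big_r, upper=None, lower=None):
    """True if a single result x lies inside every limit by at least 0.59 R."""
    ok = True
    if upper is not None:
        ok = ok and x <= upper - MARGIN_FACTOR * big_r
    if lower is not None:
        ok = ok and x >= lower + MARGIN_FACTOR * big_r
    return ok


def recipient_confident_of_failure(x, big_r, upper=None, lower=None):
    """True if a single result x lies outside any limit by more than 0.59 R."""
    fails_upper = upper is not None and x > upper + MARGIN_FACTOR * big_r
    fails_lower = lower is not None and x < lower - MARGIN_FACTOR * big_r
    return fails_upper or fails_lower


# Upper limit 2.0 with an assumed reproducibility R = 0.2.
print(supplier_confident_of_compliance(1.85, 0.2, upper=2.0))
print(recipient_confident_of_failure(2.05, 0.2, upper=2.0))
```

A result between the two margins supports neither conclusion with 95 % confidence.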
If the supplier and the recipient cannot reach agreement about the quality of the product on the basis of their existing results, the procedures set out in 10.2 to 10.5 shall be adopted.
Each laboratory shall reject its original results and obtain at least three other acceptable results on its own check sample, to ensure that the work has been carried out under repeatability conditions.
The average of the acceptable results in each laboratory shall then be computed, divergent results being discarded as indicated in 7.2.2. If the disagreement remains, proceed as follows. Let
X S be the average of the supplier;
X R be the average of recipient;
A 1 be the upper limit of the specification;
A 2 be the lower limit of the specification where
This means that X S and X R should be compared as follows with A 1 and A 2
⎯ product meets specification if \( X_S - X_R \le 0{,}84R_2 \) (for \( R_2 \), see 7.3.1);
If it is not possible to state with confidence whether the product does or does not comply with the specification limit, the dispute may be resolved by negotiation.
If the dispute continues, the two laboratories shall contact each other and compare their operating procedures and apparatus. Following these investigations, a correlation test between the two laboratories shall be carried out on their check samples. The average of at least three acceptable results shall be computed in each laboratory and these averages compared as indicated in 10.2.
If the disagreement remains, a third laboratory (neutral, expert and accepted by the two parties) shall be invited to carry out the test using a third sample. The average of the acceptable results from the three laboratories is denoted \( X_E \). If the difference between the most divergent laboratory average and the average of the two other laboratory averages is less than or equal to \( R_3 \) (see 7.3.1), the following procedure shall be adopted.
10.5 If the difference between the most divergent laboratory average and the average, X, of the two other laboratory averages is more than R 3 ,the following procedure shall be adopted:
If X ≤ A1 or ≥ A2, product meets specification.
If X > A1 or < A2, product fails specification.
(informative) Specifications that relate to a specified degree of criticality
Determination of number of samples required
Table A.1 — Determination of number of samples required
L = number of participating laboratories
P = interaction variance component / repeats variance component
Q = laboratories variance component / repeats variance component
Derivation of equation for calculating the number of samples required
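An analysis of variance is carried out on the results of the pilot programme. This yields rough estimates of the three components of variance, namely \( \sigma_0^2 \) (repeats), \( \sigma_1^2 \) (laboratories × samples interaction) and \( \sigma_2^2 \) (laboratories).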
Substituting the above into Equation (15) (see 6.3.3.3) for calculating the reproducibility degrees of freedom, this becomes:
where P is the ratio \( \sigma_1^2/\sigma_0^2 \); Q is the ratio \( \sigma_2^2/\sigma_0^2 \); ν is the reproducibility degrees of freedom;
L is the number of laboratories;
S is the number of samples
The equation rearranges into the form aS + b = 0, where \( a = \nu Q^2 - (1 + P + Q)^2 (L - 1) \)
Therefore \( S = -b/a \) (B.2) gives the values of S for given values of L, P, Q and ν.
Table A.1 is based on ν = 30 degrees of freedom. For non-integral values of P and Q, S can be estimated by
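The calculation of S can be sketched numerically. The coefficient b is not shown above, so the sketch derives it, as an assumption, by substituting the complete-array expectations of mean squares (β = 2S, α = γ = 1, interaction degrees of freedom (L − 1)(S − 1), repeats degrees of freedom LS) into the degrees-of-freedom approximation. This substitution reproduces the stated coefficient a and gives b = (ν/4)[(1 + 2P)(1 + 2P + 4Q) + (L − 1)/L]. The function names are hypothetical, and the result has not been checked against Table A.1.

```python
import math


def samples_required(n_labs, p, q, nu=30.0):
    """Solve a*S + b = 0 for S (b derived under the assumptions in the lead-in)."""
    a = nu * q ** 2 - (1 + p + q) ** 2 * (n_labs - 1)
    b = (nu / 4) * ((1 + 2 * p) * (1 + 2 * p + 4 * q) + (n_labs - 1) / n_labs)
    if a >= 0:
        raise ValueError("nu cannot be reached with this number of laboratories")
    return -b / a


def reproducibility_df(n_labs, n_samples, p, q):
    """Degrees of freedom of V_R for a complete array, in units of sigma_0^2."""
    s = n_samples
    r1 = (1 + 2 * p + 2 * s * q) / s          # (2 / beta) * M_L with beta = 2S
    r2 = (s - 1) * (1 + 2 * p) / s            # (1 - 2 / beta) * M_LS
    r3 = 1.0                                  # M_r
    denominator = (r1 ** 2 / (n_labs - 1)
                   + (s - 1) * (1 + 2 * p) ** 2 / (s ** 2 * (n_labs - 1))
                   + r3 ** 2 / (n_labs * s))
    return (r1 + r2 + r3) ** 2 / denominator


s = samples_required(9, p=1.0, q=1.0)
print(s, math.ceil(s), reproducibility_df(9, s, 1.0, 1.0))
```

Substituting the returned S back into the degrees-of-freedom approximation recovers ν, which is a consistency check on the derivation rather than confirmation of the tabulated values. In practice S would be rounded up to a whole number of samples.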
Throughout this International Standard the following notation is used:
⎯ S is the number of samples;
⎯ L is the number of laboratories;
⎯ i is the subscript denoting laboratory number;
⎯ j is the subscript denoting sample number;
⎯ x is an individual test result;
⎯ a is the sum of duplicate test results;
⎯ e is the difference between duplicate test results;
⎯ ν is the degrees of freedom
C.2 Array of duplicate results from each of L laboratories on S samples and corresponding means m j
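Laboratory | Sample 1 | Sample 2 | … | Sample j | … | Sample S
1 | \( x_{111},\ x_{112} \) | \( x_{121},\ x_{122} \) | … | \( x_{1j1},\ x_{1j2} \) | … | \( x_{1S1},\ x_{1S2} \)
2 | \( x_{211},\ x_{212} \) | \( x_{221},\ x_{222} \) | … | \( x_{2j1},\ x_{2j2} \) | … | \( x_{2S1},\ x_{2S2} \)
⋮
i | \( x_{i11},\ x_{i12} \) | \( x_{i21},\ x_{i22} \) | … | \( x_{ij1},\ x_{ij2} \) | … | \( x_{iS1},\ x_{iS2} \)
⋮
L | \( x_{L11},\ x_{L12} \) | \( x_{L21},\ x_{L22} \) | … | \( x_{Lj1},\ x_{Lj2} \) | … | \( x_{LS1},\ x_{LS2} \)
Mean | \( m_1 \) | \( m_2 \) | … | \( m_j \) | … | \( m_S \)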
NOTE If a transformation y = F(x) of the reported data is necessary (see 5.2), then corresponding symbols y ij1 and y ij2 are used in place of x ij1 and x ij2
C.3 Array of sums of duplicate results, of laboratory totals h i and sample totals g j
Total: \( g_1 \), \( g_2 \), …, \( g_j \), …, \( g_S \), T
where
\( a_{ij} = x_{ij1} + x_{ij2} \) (or \( a_{ij} = y_{ij1} + y_{ij2} \), if a transformation has been used);
\( e_{ij} = x_{ij1} - x_{ij2} \) (or \( e_{ij} = y_{ij1} - y_{ij2} \), if a transformation has been used).
If any results are missing from the complete array, then the divisor in the expression for m j is correspondingly reduced
C.4 Sums of squares and variances
Repeats variance for sample j:
\( d_j^2 = \dfrac{\sum_{i=1}^{L} e_{ij}^2}{2L} \) (C.1)
where L is the repeats degrees of freedom for sample j, one for each laboratory/sample pair. If either or both of a laboratory/sample pair of results is missing, the corresponding term in the numerator is omitted and the factor L is reduced by one.
Between cells variance for sample j:
=⎜⎝ −∑ ⎟⎠ ⎣ − ⎦ n ij is the number of results obtained by laboratory i from sample j;
S j is the total number of results obtained from sample j;
L is the number of cells in sample j containing at least one result
Laboratories degrees of freedom for sample j is given approximately [4] by
(rounded to the nearest integer) (C.4)
If either or both of a laboratory/sample pair of results is missing, the factor L is reduced by 1
If both of a laboratory/sample pair of results is missing, the factor (L − 1) is reduced by 1
The largest sum of squares, \( SS_k \), out of a set of n mutually independent sums of squares each based on ν degrees of freedom, can be tested for conformity in accordance with Cochran's criterion:
\( \dfrac{SS_k}{\sum_{i=1}^{n} SS_i} \) (C.5)
The test ratio is identical if sums of squares are replaced by mean squares (variance estimates). If the calculated ratio exceeds the critical value given in Table D.3, the sum of squares in question, \( SS_k \), is significantly greater than the others with a probability of 99 %. Examples of \( SS_i \) include \( e_{ij}^2 \) and \( d_j^2 \).
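The test ratio is simple to compute. The sketch below assumes the usual form of Cochran's criterion, the largest sum of squares divided by the total. The critical value has to be read from Table D.3 for the relevant n and ν, so it is passed in by the caller. The function names and the example data are hypothetical.

```python
def cochran_ratio(sums_of_squares):
    """Largest sum of squares divided by the total of all n sums of squares."""
    return max(sums_of_squares) / sum(sums_of_squares)


def exceeds_cochran_critical(sums_of_squares, critical_value):
    """True if the largest sum of squares is significantly greater than the others."""
    return cochran_ratio(sums_of_squares) > critical_value


# Hypothetical squared differences between duplicates for four laboratories.
print(cochran_ratio([1.0, 1.0, 1.0, 7.0]))
```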
An extreme value in a data set can be tested as an outlier by comparing its deviation from the mean of the data set with the square root of the sum of squares of all such deviations. The comparison is made in the form of a ratio. Extra information on variability can be provided by including independent sums of squares, which are based on ν degrees of freedom and have the same population variance as the data set.
Table C.3 shows the values which are required to apply Hawkins' test to individual samples
Sum of squares: \( SS_1 \), \( SS_2 \), …, \( SS_j \), …, \( SS_S \)
where
\( n_j \) is the number of cells in sample j which contain at least one result;
\( m'_j \) is the mean of cell means in sample j;
\( SS_j \) is the sum of squares of deviations of cell means, \( a_{ij}/n_{ij} \), from the mean of cell means, \( m'_j \), and is given by:
\( SS_j = \sum_{i} \left( \dfrac{a_{ij}}{n_{ij}} - m'_j \right)^2 \) (C.6)
The test procedure is as follows.
a) Identify the sample k and cell mean \( a_{ik}/n_{ik} \) with the most extreme absolute deviation \( |m'_k - a_{ik}/n_{ik}| \). This cell is the candidate for the outlier test, whether it is high or low.
b) Calculate the total sum of squares of deviations, \( SS = \sum_{j=1}^{S} SS_j \).
c) Calculate the test ratio
\( B^* = \dfrac{|m'_k - a_{ik}/n_{ik}|}{\sqrt{SS}} \) (C.7)
d) Compare the test ratio with the critical value from Table D.4, for \( n = n_k \) and extra degrees of freedom ν where
\( \nu = \sum_{j \ne k} (n_j - 1) \) (C.8)
e) If B* exceeds the critical value, reject results from the cell in question (sample k, laboratory i), modify the \( n_k \), \( m'_k \) and \( SS_k \) values accordingly, and repeat from list item a).
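One pass of this procedure can be sketched as follows. The sketch assumes that the test ratio is the most extreme absolute deviation divided by the square root of the sum of squares pooled over all samples, with extra degrees of freedom taken from the samples other than k. Comparison with Table D.4 is left to the caller. The function name, data layout and example values are hypothetical.

```python
import math


def hawkins_test_ratio(cell_means):
    """cell_means: one dict per sample, mapping laboratory label -> cell mean.

    Returns the candidate cell, its test ratio, and the n and extra degrees of
    freedom needed to look up the critical value.
    """
    total_ss = 0.0
    candidate = None  # (deviation, sample index, laboratory label)
    for k, cells in enumerate(cell_means):
        mean_of_cells = sum(cells.values()) / len(cells)
        total_ss += sum((value - mean_of_cells) ** 2 for value in cells.values())
        for lab, value in cells.items():
            deviation = abs(mean_of_cells - value)
            if candidate is None or deviation > candidate[0]:
                candidate = (deviation, k, lab)
    deviation, k, lab = candidate
    extra_df = sum(len(cells) - 1 for j, cells in enumerate(cell_means) if j != k)
    return {
        "sample": k,
        "laboratory": lab,
        "ratio": deviation / math.sqrt(total_ss),
        "n": len(cell_means[k]),
        "extra_df": extra_df,
    }


data = [
    {"A": 1.0, "B": 1.1, "C": 0.9},
    {"A": 2.0, "B": 2.0, "C": 2.9},
]
print(hawkins_test_ratio(data))
```

If the ratio exceeds the critical value, the cell would be removed from `data` and the function called again, which corresponds to repeating from list item a).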
Hawkins' test is designed primarily to detect a single outlier in a sample While repeated testing for multiple outliers—starting with the most deviating data point—affects the interpretation of critical values, Hawkins demonstrated that for sample sizes greater than five and total degrees of freedom exceeding twenty, these effects become negligible Moreover, issues such as masking, where one outlier conceals another, and swamping, where rejecting one outlier leads to the rejection of others, are minimal under these conditions, ensuring the test's reliability for practical applications.
When analyzing laboratory test results averaged across all samples, Table C.3 simplifies to a single column where "n" represents the number of laboratories (L), and "m" denotes the overall mean calculated as T divided by N, with N being the total number of results in the dataset.
SS is the sum of squares of deviations of laboratory means from the overall mean, and is given by
\( SS = \sum_{i=1}^{L} \left( \dfrac{h_i}{n_i} - m \right)^2 \) (C.9)
where \( n_i \) is the number of results in laboratory i.
In the test procedure, therefore, identify the laboratory mean, \( h_i/n_i \), that differs most from the overall mean, m.
The corresponding test ratio then becomes:
\( B^* = \dfrac{|m - h_i/n_i|}{\sqrt{SS}} \)
Compare this ratio with the critical value from Table D.4 as before, but now with extra degrees of freedom ν = 0. If a laboratory is rejected, adjust the values of n, m and SS accordingly and repeat the calculations.
A variance estimate, V1, based on ν1 degrees of freedom, can be compared with a second estimate, V2, based on ν2 degrees of freedom, by calculating the ratio:
$$F = \frac{V_1}{V_2}$$
If the ratio exceeds the critical value given in Tables D.6 to D.9, where ν1 corresponds to the numerator (the larger variance estimate) and ν2 to the denominator, then V1 is significantly greater than V2 at the chosen level of significance.
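The variance ratio comparison can be sketched in Python as follows. The function names and the two sets of results are invented for illustration. The critical value is deliberately left as an argument, because it has to be read from Tables D.6 to D.9 for the relevant ν1 and ν2.

```python
from statistics import variance

def variance_ratio(v1, v2):
    """Return F = V1 / V2, with V1 the estimate placed in the numerator."""
    if v2 <= 0:
        raise ValueError("V2 must be positive")
    return v1 / v2

def v1_exceeds_v2(v1, v2, critical_value):
    """True if the ratio exceeds the tabulated critical value supplied."""
    return variance_ratio(v1, v2) > critical_value

# Invented results, used only to show how V and its degrees of freedom arise.
results_1 = [10.2, 9.6, 10.9, 9.1, 10.7]
results_2 = [10.0, 10.1, 9.9, 10.2, 9.8]
v1, nu1 = variance(results_1), len(results_1) - 1
v2, nu2 = variance(results_2), len(results_2) - 1
f_ratio = variance_ratio(v1, v2)
```

Here `nu1` and `nu2` are the degrees of freedom used to select the table entry.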
Copyright International Organization for Standardization
Provided by IHS under license with ISO
Example results of test for determination of bromine number and statistical tables
Table D.1 — Bromine number for low boiling samples
Table D.2 — Cube root of the bromine number for low boiling samples
Table D.3 — Critical 1 % values of Cochran's criterion for n variance estimates and ν degrees of freedom
These values are conservative approximations based on Bonferroni's inequality, calculated as the upper 0.01/n fractile of the beta distribution. Intermediate values along the n-axis may be obtained by linear interpolation of the reciprocals of the tabulated values. Intermediate values along the ν-axis may be obtained by second-order interpolation of the reciprocals of the tabulated values.
Table D.4 — Critical values of Hawkins' 1 % outlier test for n = 3 to 50 and ν = 0 to 200
NOTE The critical values given in Table D.4 are correct to the 4th decimal place in the range n = 3 to 30 and ν = 0, 5, 15 and 30 [2]. Other values were derived from the Bonferroni inequality as:
$$B^* = t \sqrt{\frac{n - 1}{n \left( n + \nu - 2 + t^2 \right)}} \qquad \text{(D.1)}$$
where t is the upper 0.005/n fractile of a t-distribution with n + ν − 2 degrees of freedom. The values so computed are only slightly conservative, with a maximum error of approximately 0.0002 above the true value. If critical values are required for intermediate values of n and ν, they may be estimated by second-order interpolation using the squares of the reciprocals of the tabulated values. Similarly, second-order extrapolation can be used to estimate values beyond n = 50 and ν = 200.
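A Bonferroni-type critical value of this kind can be sketched in Python as below. The closed form used is the usual relation between a maximum-deviation ratio and Student's t, and is an assumption of this sketch rather than a quotation from the standard. The function names are hypothetical. The t fractile is computed in closed form only for one degree of freedom (n = 3, ν = 0), where the t-distribution is a Cauchy distribution. Any other case needs a t quantile from a statistics library or tables.

```python
import math

def hawkins_bonferroni_critical(n, nu, t):
    """Critical ratio from t, the upper 0.005/n fractile of Student's t
    with n + nu - 2 degrees of freedom (assumed relation)."""
    return t * math.sqrt((n - 1) / (n * (n + nu - 2 + t * t)))

def t_upper_fractile_1df(alpha):
    """Upper-alpha fractile of Student's t with 1 degree of freedom.

    With 1 degree of freedom the t-distribution is the standard Cauchy
    distribution, whose quantile function is tan(pi * (q - 0.5)).
    """
    return math.tan(math.pi * (0.5 - alpha))

n, nu = 3, 0                           # n + nu - 2 = 1 degree of freedom
t = t_upper_fractile_1df(0.005 / n)
critical = hawkins_bonferroni_critical(n, nu, t)
```

For n = 3 the ratio can never exceed sqrt(2/3), so the computed value sits just below that bound.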
See Reference [7] in the bibliography for the source of Tables D.6 to D.9
D.1.2 Approximate equation for critical values of F
Critical values of F for untabulated values of ν1 and ν2 may be approximated by second-order interpolation from Tables D.6 to D.9.
Critical values of F corresponding to ν1 > 30 and ν2 > 30 degrees of freedom and a significance level 100 (1 − p) %, where p is the probability, can also be approximated from the equation:
Values of A(p), B(p) and C(p) are given in Table D.10 for typical values of significance level 100 (1 − p) %.
Table D.10 — Typical values of equation parameters
For values of p not given in Table D.10, critical values of F may be obtained by second-order interpolation/extrapolation of log10(F) (either tabulated or estimated from the equation) against log10(1 − p).
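Second-order interpolation of log10(F) against log10(1 − p) can be sketched with a three-point Lagrange polynomial, as below. The function names are hypothetical, and the choice of the Lagrange form is an assumption: the text only asks for second-order interpolation. The three F values in the example are placeholders, not entries from Tables D.6 to D.9.

```python
import math

def lagrange_quadratic(xs, ys, x):
    """Evaluate at x the parabola through three points (xs[i], ys[i])."""
    (x0, x1, x2), (y0, y1, y2) = xs, ys
    return (y0 * (x - x1) * (x - x2) / ((x0 - x1) * (x0 - x2))
            + y1 * (x - x0) * (x - x2) / ((x1 - x0) * (x1 - x2))
            + y2 * (x - x0) * (x - x1) / ((x2 - x0) * (x2 - x1)))

def interpolate_f(ps, fs, p):
    """Interpolate a critical F at probability p from three known
    (p, F) pairs, working in log10(F) against log10(1 - p)."""
    xs = [math.log10(1 - q) for q in ps]
    ys = [math.log10(f) for f in fs]
    return 10 ** lagrange_quadratic(xs, ys, math.log10(1 - p))

# Placeholder F values for illustration only.
ps = (0.95, 0.975, 0.99)
fs = (2.0, 2.5, 3.0)
f_at_098 = interpolate_f(ps, fs, 0.98)
```

The same helper extrapolates if p lies outside the three supplied probabilities, although the result becomes less reliable further from the known points.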
D.2 Critical values of the normal distribution
Critical values, Z, corresponding to a single-sided probability, p, or to a double-sided significance level 2(1 − p), are given in Table D.11 in terms of the “standard normal deviate”, where
$$Z = \frac{x - \mu}{\sigma} \qquad \text{(D.3)}$$
and where μ and σ are the mean and standard deviation respectively of the normal distribution.
Table D.11 — Critical values of the normal distribution
p    0,70    0,80    0,90    0,95    0,975   0,99    0,995
Z    0,524   0,842   1,282   1,645   1,960   2,326   2,576
When p is less than 0,5, the appropriate critical value is the negative of the value corresponding to a probability (1 − p)
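Critical values of the standard normal deviate for probabilities not listed can be computed directly, for example with the Python standard library as sketched below. The function name `normal_critical_value` is a hypothetical helper introduced here.

```python
from statistics import NormalDist

def normal_critical_value(p):
    """Z such that a standard normal variable falls below Z with
    single-sided probability p."""
    return NormalDist().inv_cdf(p)

p_levels = [0.70, 0.80, 0.90, 0.95, 0.975, 0.99, 0.995]
z_values = [round(normal_critical_value(p), 3) for p in p_levels]

# For p below 0.5 the value is the negative of the one for (1 - p).
z_low = normal_critical_value(0.30)
z_high = normal_critical_value(0.70)
```

The last two lines illustrate the symmetry rule for p less than 0,5.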
Types of dependence and corresponding transformations
The forms of dependence given in Table E.1 are shown graphically in Figures E.1 to E.8. In all cases, K can be any positive constant and “ln” denotes the Naperian logarithm. The form of the line to be fitted includes a dummy variable, T, which makes it possible to test for a difference in the transformation as applied to repeatability and to reproducibility.
Form of dependence | Transformation | Form of line to be fitted | dx/dy | Remarks

Remarks, by type of transformation:
1 (log): Care shall be taken if (x + B) is small, since rounding becomes critical.
2 (power): The fitted line will pass through the origin.
3 (power with intercept): The fitted line will not pass through the origin.
4 (arcsin): This case often arises when results are reported as percentages or qualitatively as “scores”. If x is always small, the transformation reduces to y = √x, a special case of 2 above.
5 (logistic): This case arises when results are reported on a rating scale of 0 to B. If x is always small, then the transformation reduces to y = ln(x), a special case of 1 above.
6 (arctan): The fitted line does not pass through the origin. If B is small, the transformation reduces to y = 1/x, a special case of 2 above.
The procedure for identifying the correct type of transformation and its parameters B and B0 is as follows.
a) Plot the laboratories standard deviations, D, and the repeats standard deviations, d, against the sample means in the form of scatter diagrams, as in Figures E.1 to E.8. Use these diagrams to decide whether a transformation is needed and, if so, of which type.
b) For all transformations other than the power transformations (types 2 and 3 in Table E.1), estimate the parameter B from the scatter diagrams. For the arcsin and logistic transformations (types 4 and 5), B is already known, since it is the upper limit of the rating scale. For the log transformation (type 1), B is calculated from the estimated intercept and slope, and for the arctan transformation (type 6) from the intercept. Round B to a value that is meaningful for both the laboratories and the repeats standard deviations. For the power-with-intercept transformation (type 3), B0 is likewise estimated from the scatter diagrams.
c) Fit the line specified in Table E.1, following the procedure of Clause F.3. For both power transformations, the coefficient b1 shall differ significantly from zero; it then provides an estimate of B, which shall be rounded to a meaningful value. For the arcsin transformation, b1 shall not differ significantly from 0.5. For the logistic, log and arctan transformations, b1 shall not differ significantly from 1. In every case, the test specified in Table E.1 shall be applied at the 5 % significance level. Failure of this test implies that either the type of transformation or the parameters B and/or B0 are incorrect. The coefficient b3 shall always be tested for being zero. Failure of this test indicates that the transformation differs between repeatability and reproducibility, which may be due to the presence of outliers.
d) If the tests in list items a) to c) are satisfactory, transform all the results accordingly, recalculate the means and standard deviations using the transformed results, and draw new scatter diagrams. These should now show a uniform level for the laboratories standard deviations and a uniform level for the repeats standard deviations. A statistical test for uniformity is given in 5.4.
For the power-with-intercept transformation, it is important to note that parameters B and B0 cannot be estimated simultaneously using the linear least squares method outlined in Clause F.3 Instead, a non-linear, iterative approach, such as the Nelder and Mead simplex procedure [10], is necessary This process requires the use of specialized computer software [9] to effectively perform the estimation.
F.1 Explanation for the use of a dummy variable
Two different variables, Y1 and Y2, when plotted against the same independent variable, X, in general give different linear relationships of the form:
$$Y_1 = b_{10} + b_{11} X$$
$$Y_2 = b_{20} + b_{21} X \qquad \text{(F.1)}$$
where the coefficients b_ij are estimated by regression analysis. In order to compare the two relationships, a dummy variable, T, can be defined such that:
T = T1, a constant value for every observation of Y1
T = T2, a constant value for every observation of Y2
T1 ≠ T2
Letting Y represent the combination of Y1 and Y2, plot a single relationship,
$$Y = b_0 + b_1 X + b_2 T + b_3 T X \qquad \text{(F.2)}$$
where, as before, the coefficients b_i are estimated by regression analysis. By comparing Equations (F.1) and (F.2), it is evident that
$$b_{10} = b_0 + b_2 T_1$$
$$b_{20} = b_0 + b_2 T_2 \qquad \text{(F.3)}$$
and that therefore
$$b_{10} - b_{20} = b_2 (T_1 - T_2) \qquad \text{(F.4)}$$
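The link between the two separate lines and the single dummy-variable relationship can be illustrated in Python as below. This sketch rests on one observation: the combined model has a separate intercept and slope available for each value of T, so its least-squares coefficients can be recovered from the two separate fits. The intercept relations are those of Equations (F.3) and (F.4). The slope relations, b11 = b1 + b3 T1 and b21 = b1 + b3 T2, follow from the same comparison of coefficients and are written out here as part of the sketch. The function names, the data and the values T1 = 1, T2 = 2 are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def combined_coefficients(b10, b11, b20, b21, t1, t2):
    """Coefficients (b0, b1, b2, b3) of Y = b0 + b1*X + b2*T + b3*T*X
    that reproduce the two separate lines; requires t1 != t2."""
    b2 = (b10 - b20) / (t1 - t2)   # from b10 - b20 = b2 (T1 - T2)
    b3 = (b11 - b21) / (t1 - t2)   # same argument applied to the slopes
    b0 = b10 - b2 * t1
    b1 = b11 - b3 * t1
    return b0, b1, b2, b3

# Invented data lying exactly on Y1 = 1 + 2X and Y2 = 3 + 0.5X.
xs = [0.0, 1.0, 2.0, 3.0]
y1 = [1.0 + 2.0 * x for x in xs]
y2 = [3.0 + 0.5 * x for x in xs]
b10, b11 = fit_line(xs, y1)
b20, b21 = fit_line(xs, y2)
b0, b1, b2, b3 = combined_coefficients(b10, b11, b20, b21, t1=1.0, t2=2.0)
```

A value of b2 close to zero would indicate that the two intercepts agree, and a value of b3 close to zero that the two slopes agree.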