ASTM Proficiency Testing Programs for Biofuels 1

Part of the document ASTM MNL 77 2016 (pages 186-200)

Proficiency testing programs (PTPs) play an important role in maintaining the quality of data produced in a laboratory. PTPs are useful statistical quality assurance tools to monitor the strengths and weaknesses of a laboratory’s performance. Many laboratory accreditation agencies require laboratory participation in such programs. Two significant PTPs sponsored by ASTM Committee D02 on Petroleum Products and Lubricants have been those on biodiesel and fuel ethanol that were launched in the past few years.

ASTM Proficiency Testing Programs

A PTP can be defined as a testing program in which identical samples are sent to participating laboratories, analyzed there using standard test methods, and the results reported for statistical treatment. PTPs, also known as crosschecks, check schemes, and round-robins, provide an expedient means of assessing the performance bias of one laboratory relative to other laboratories with the same industrial focus.

Several papers have been published describing the details of ASTM Committee D02’s proficiency testing programs [1–8]. A brief summary follows. In general, ASTM PTPs are designed to provide participating laboratories (who pay a fee to participate) with a way to periodically compare their level and consistency of testing with that of other participating laboratories utilizing ASTM standard test methods. Participants receive sample material and specific instructions for performing the tests. Each laboratory electronically records and submits its test data from a program trial to ASTM for processing into an overall statistical summary report.

Robust statistics are used to calculate the data mean and standard deviation [9]. The use of robust statistics versus conventional statistics limits the data rejected as outliers given the wide diversity of laboratories involved and sometimes the wide range of results received. However, because of the use of robust statistics, for a number of program tests, the reproducibility of the data generated from these crosschecks is worse than the expected reproducibility estimates published in the ASTM standard test methods. In spite of a sometimes wide range of results reported, no outliers are rejected, resulting in overall poor precision. The Anderson-Darling (AD) statistic is used to test if a sample of data came from a population with a specific distribution. Essentially, for all ASTM PTP test data generated, the AD statistical application performs a “goodness of fit” or “normalcy” test to determine if the data are from a normal distribution. The PTP test data population is tested against normal behavior and measurement resolution adequacy using the AD statistic. Gross violations (AD >1.0) are flagged and a comment is added to the data tables stating that the data should not be interpreted using conventional normal distribution based probabilities.
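As a rough illustration of the normality check described above, the following Python sketch computes the textbook Anderson-Darling A² statistic against a normal distribution and applies the AD >1.0 flag quoted in the text. It is only a sketch under stated assumptions: the text does not say which estimates of center and spread, or which small-sample adjustments, the ASTM program uses, so the sample mean and sample standard deviation are assumed here, and the function names are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

AD_FLAG_LIMIT = 1.0  # "gross violation" threshold quoted in the text


def log_normal_cdf(z):
    """Natural log of the standard normal CDF, computed via erfc for tail accuracy."""
    return math.log(0.5 * math.erfc(-z / math.sqrt(2.0)))


def anderson_darling_normal(results):
    """Textbook A^2 statistic for normality.

    Assumption: the data are standardized with the sample mean and sample
    standard deviation; the ASTM program may standardize differently.
    """
    n = len(results)
    m, s = mean(results), stdev(results)
    z = sorted((x - m) / s for x in results)
    total = 0.0
    for i in range(1, n + 1):
        # ln F(z_i) + ln(1 - F(z_{n+1-i})); note 1 - F(z) equals F(-z)
        total += (2 * i - 1) * (log_normal_cdf(z[i - 1]) + log_normal_cdf(-z[n - i]))
    return -n - total / n


def flag_non_normal(results):
    """True when the data set would be flagged under the AD > 1.0 rule."""
    return anderson_darling_normal(results) > AD_FLAG_LIMIT


# Illustrative data only (not program results): 30 evenly spaced normal
# quantiles, and a two-cluster data set that is clearly not bell shaped.
demo_normal = [NormalDist().inv_cdf((i - 0.5) / 30) for i in range(1, 31)]
demo_bimodal = [0.0] * 15 + [10.0] * 15

print(flag_non_normal(demo_normal), flag_non_normal(demo_bimodal))
```

The standard library is used throughout so that the sketch has no external dependencies; a statistics package could be substituted if its normalization is known to match the program's.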

It is suspected that this poor reproducibility is partly due to laboratories not correctly following the test method details or not performing adequate and frequent calibrations and routine quality control (or to both). One final information tool found useful is the test performance index (TPI), which is calculated and reported for each test. The TPI is defined as the ratio of the test method reproducibility to the robust reproducibility of the PTP data set. Index criteria for deciding what the TPI means are as follows, as excerpted from ASTM Standard Guide D7372 (Table 8.1).

TPI values below 0.8 or in the range 0.8 to 1.2 should lead all laboratories to consider their own contribution to the relatively poor performance by the industry group. The subcommittee responsible for the test should also review whether the published precision reflects reality.

Whether improvement in the test methods used in a PTP is needed or not can be judged from the TPI. Some TPIs are satisfactory, but many are not, indicating (1) improvement in the laboratory performance is needed, (2) test method improvement is needed, (3) alternate test methods need to be investigated, or (4) product specification limits need to be reviewed to be in conformance with the expected test variability (or any combination of these factors).
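The TPI calculation and the Table 8.1 criteria can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, and treating the boundary values 0.8 and 1.2 as belonging to the "0.8 to 1.2" band is an assumption, since the table does not say how the endpoints are handled.

```python
def compute_tpi(method_reproducibility, ptp_robust_reproducibility):
    """TPI = published test method reproducibility / robust reproducibility of the PTP data."""
    if ptp_robust_reproducibility <= 0:
        raise ValueError("PTP robust reproducibility must be positive")
    return method_reproducibility / ptp_robust_reproducibility


def tpi_implication(tpi):
    """Map a TPI value to the Table 8.1 band (endpoint handling is assumed)."""
    if tpi > 1.2:
        return "probably satisfactory"
    if tpi >= 0.8:
        return "may be marginal"
    return "not consistent with published precision"


# Made-up numbers for illustration: the crosscheck data scatter twice as
# widely as the published reproducibility predicts.
tpi = compute_tpi(method_reproducibility=1.0, ptp_robust_reproducibility=2.0)
print(tpi, tpi_implication(tpi))
```

Because the crosscheck reproducibility sits in the denominator, a larger scatter among the participating laboratories drives the TPI down, which is why low values signal a problem.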

ASTM final statistical summary reports are provided to each participating laboratory that submitted data for the trial. Each laboratory’s data are coded, with only the participant knowing its own laboratory identification code. The various standards-writing subcommittees of Committee D02 also receive these anonymous data reports as a review of the status of their test methods to determine if any revisions to the method are warranted.

Because of electronic enhancements, the entire cycle of data reporting and the publishing of the final report is done electronically, thus reducing the turnaround time of final reports from the data submission deadline. Typically, all final summary reports for each test trial are delivered within 30 days of the data submission deadline. This enables the participants to quickly take any corrective actions necessary that affect their quality assurance activities.

1 Some text in this chapter is excerpted and updated from Reference 8.10.

DOI: 10.1520/MNL772015001108

Each laboratory analyzes the crosscheck sample only once, except for the ultra-low sulfur diesel (ULSD) program where it is analyzed in duplicate because of the request from the U.S. Environmental Protection Agency. Most programs are conducted three times a year, with the exception of the reformulated gasoline and ULSD programs, which are conducted monthly.

The potential benefits of participating in a PTP exercise are multiple:

• It provides a laboratory with a means to measure its competence against industry laboratories.

• It provides a consistent foundation of performance that customers and data users can rely on.

• It monitors the strengths and weaknesses of a laboratory’s test performance.

• It provides a quality control tool for long-term performance.

• It provides a material with assigned mean values and variance limits that can later be used for laboratory quality control.

• It provides information to the sponsoring standards-writing subcommittees on how well their test methods are performing in the “real world” to measure product properties of interest with sufficient precision and accuracy.

• Participation in a PTP satisfies laboratory accreditation requirements for many organizations (e.g., American Association for Laboratory Accreditation, American Petroleum Institute [API], American Chemistry Council, etc.).

The final statistical summary report contains the number of valid data points, each laboratory’s (coded for anonymity) test results, their deviations from the mean value, flagged outliers, and other pertinent information. It contains a summary box giving robust mean, robust standard deviation, the crosscheck reproducibility versus the test method reproducibility, an Anderson-Darling statistic, and calculated TPI value (Fig. 8.1). Because only a single aliquot is analyzed, only reproducibility can be calculated, not repeatability, except for the ULSD program where duplicate samples are analyzed by each laboratory and a repeatability value is reported. The summary report also contains Z-scores for each laboratory. A Z-score is a standardized and dimensionless measure of the difference between an individual result in a data set and the robust mean of the data set. It is expressed in units of standard deviation of the data set by dividing the actual difference between the lab result and the robust mean by the robust standard deviation of the overall data set.

\[
\text{Lab Z-score} = \frac{\text{Lab Result} - \text{Robust Mean}}{\text{Overall Robust Standard Deviation}}
\]

The sign (+ or −) preceding the Z-score reflects the direction of the relative bias of the individual result versus the robust mean of the sample group. The sign and the magnitude of the mean Z-score for a series over a selected time period are an indication of the long-term relative bias. The significance of bias increases as the mean Z-score moves farther away from zero.
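The Z-score definition above translates directly into code. The short Python sketch below takes the robust mean and robust standard deviation as given inputs, because the text does not specify how the program computes them; the function names and the sample numbers are hypothetical.

```python
from statistics import mean


def z_score(lab_result, robust_mean, robust_sd):
    """(lab result - robust mean) / overall robust standard deviation."""
    if robust_sd <= 0:
        raise ValueError("robust standard deviation must be positive")
    return (lab_result - robust_mean) / robust_sd


def long_term_bias(z_scores):
    """Mean Z-score over a series of trials; its sign gives the direction of bias."""
    return mean(z_scores)


# Made-up example: one laboratory's results over three trials, each paired
# with that trial's robust mean and robust standard deviation.
trials = [(7.5, 7.0, 0.25), (7.25, 7.0, 0.25), (7.75, 7.5, 0.25)]
scores = [z_score(result, rm, rsd) for result, rm, rsd in trials]
print(scores, long_term_bias(scores))
```

In this made-up series every score is positive, so the mean Z-score is also positive, which would suggest a consistent high bias relative to the group.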

Additionally, each report contains a chart plotting test results by laboratory code for each test (e.g., Fig. 8.2 for sulfur determination by ASTM D5453 in analysis of the ETOH 1304 sample) and a histogram distribution of the results about the sample mean (e.g., Fig. 8.3 for pHe determination by ASTM D6423 in the ETOH 1304 sample). Other figures demonstrate different histogram patterns, such as Fig. 8.4 for flash point determination by ASTM D93 Procedure C in a sample of biodiesel (BIOD 1308) and Fig. 8.5 for methanol content by gas chromatography (ASTM D5501) in a sample of ETOH 1308. Not all the histograms have a normal or ideal bell shape, indicating uneven and biased distribution of the data.

In cases where the same parameter is determined using multiple test methods, a box and whisker plot is provided to compare alternate methods for the same analysis (e.g., Fig. 8.6 showing acid number determination by ASTM D664 and ASTM D974 in a sample of BIOD 1308, Fig. 8.7 showing results for carbon residue by ASTM D189 and ASTM D4530 in a sample of BIOD 1308, and Fig. 8.8 showing the results of flash point determination in a sample of BIOD 1308). Ideally, in all these examples, the mean of the results for a sample using alternate tests should be equivalent. This may or may not be the case in these real examples. Precisions of individual test methods purportedly determining the same parameter may also be quite different. Thus, the choice of the best test method for an analysis should take this history into account.

A number of D02 Committee statistical standards are utilized in processing the large amount of data produced in this program (Table 8.2).

Thus, the PTPs offer a way for those involved in laboratory operations, writers of product specifications, and those involved in development of precision statements for the test methods to gain insights into performance in the operations of the real world.

ASTM D7372, Standard Guide for Analysis and Interpretation of Proficiency Test Program Results, can be used to evaluate the performance of a laboratory or a group of laboratories participating in any of the 28 proficiency testing programs involving petroleum products and lubricants.

Table 8.1 Implications of Test Performance Indexes

Test Performance Index | Implications
>1.2 | The performance of the group providing data is probably satisfactory relative to the corresponding ASTM published precision.
0.8 to 1.2 | The performance of the group providing data may be marginal and each laboratory should consider reviewing the test method procedures to identify opportunities for improvement.
<0.8 | The performance of the test method as practiced by the group is not consistent with the ASTM published precision and laboratory method and performance improvements should be investigated by all the laboratories.

ASTM International has been a pioneer in PTPs for a number of years. The most prominent among such programs is the one sponsored by ASTM Committee D02 on Petroleum Products and Lubricants, entitled the ASTM Interlaboratory Crosscheck Program (ILCP). This program was initiated in 1993 and now covers 28 products spanning virtually all products produced or analyzed in the oil industry (or both). More than 3,300 laboratories worldwide are participating in these programs; of these, more than 57 % are non-U.S.-based.

Biofuel Proficiency Testing Programs

Biofuel PTPs [10] were added to the scope of Committee D02’s PTPs. The biodiesel program began in 2005 and now involves 118 labs and about 30 test methods. The fuel ethanol program was started in 2007 and now involves 115 labs and 14 test methods. Each program is conducted three times a year: April, August, and November for biodiesel, and April, August, and December for fuel ethanol. The costs of the two programs (in 2015) were $689 and $717 per year per participant, respectively. In recent years, the most significant growth in the PTPs for petroleum products has been in these two programs. The status of these two programs is given in Table 8.3. However, it should be noted that in spite of the seemingly substantial number of total labs participating in these two programs, far fewer results are actually returned for individual tests. The tests used in these two ILCPs are included in Table 8.4.

Most of the tests in each program appear in the biofuel specifications for each product (Table 8.4). Some of the tests selected from the specifications are not appropriate for the specific analysis desired on the biofuel. Not all laboratories report results for each of these program tests. Laboratory participants are allowed the freedom to report the results for those tests they typically perform and that they are capable of carrying out. Table 8.5 summarizes the data for fuel ethanol test trials, and Table 8.6 summarizes the data for biodiesel test trials, in both cases from 2012 through 2014.

Fig. 8.1 Analytical data report page for ASTM D7328, Determination of Total Inorganic Chloride in ETOH 1304 Sample.

Several of the analyses can be done using alternate test methods where available. Generally, the alternate test methods give equivalent results within their individual test method variability (see Table 8.7 for biodiesel and Table 8.8 for fuel ethanol). Some of the data are shown as box and whisker graphs in Figs. 8.6–8.8 for acid number (ASTM D664 and ASTM D974), carbon residue (ASTM D189 and ASTM D4530), and flash point (ASTM D93A, ASTM D93C, and ASTM D3828), respectively, for sample BIOD 1308.

As mentioned earlier, the TPI is a way of measuring not only a laboratory’s proficiency but also how well the industry as a group is performing, including how well the selected test method is suited for the desired analysis. These data are summarized in Table 8.9 for the six trials conducted in 2013 for both biofuels.

As one can easily interpret from this table, a number of tests required in the product specifications for both biodiesel fuel and fuel ethanol have not produced satisfactory precision (i.e., a TPI >1.00). In addition, a large number of tests are not being conducted by many laboratories. It would be interesting to find out how the product batches are certified as meeting the product specification requirements when these tests are not being carried out by many of the production laboratories participating in the PTP. More importantly, it is recognized that many of the tests required in the product specifications do not list biofuels as applicable in their scopes, and no precision data for biofuels are given for such analyses. Thus, how would product quality disagreements be resolved, especially since this requires reproducibility data per ASTM D3244, Practice for Utilization of Test Data to Determine Conformance with Specifications, or its equivalent ISO 4259 standard? Some of these tests are conducted by a large number of laboratories; however, no TPI can be calculated because of the lack of a reproducibility value or calculation provided in the test methods (e.g., ASTM D445 kinematic viscosity, ASTM D874 sulfated ash, ASTM D4951 phosphorus, ASTM D482 ash, etc.). For a large number of tests, however, the TPIs leave much to be desired (Table 8.9). This could be because the laboratories do not have enough experience in conducting these tests, the tests need modifications to suit the biofuel matrix, or the test method reproducibility used in calculating the TPI is not valid for this matrix. Also, some designated tests are completely wrong for the required analyses (e.g., ASTM D4951, an inductively coupled plasma atomic emission spectrometry method for metals, is meant for additives and lubricating oils with a high level of phosphorus, not for the trace quantities present in biodiesel; the same is true for ASTM D1688 for copper in fuel ethanol).

Fig. 8.2 S-chart for sulfur determination by test method ASTM D5453 in ETOH 1304 sample. (Axes: deviation versus lab code; 3-sigma range limits are drawn for the ASTM standard reproducibility and for these test data.)

Fig. 8.3 Histogram of pHe determination results using test method ASTM D6423 in ETOH 1304 sample. (Axes: number of labs versus distribution range of test results, binned from 6.98 to 8.37; the median and the 1 % and 99 % points are marked.)

Fig. 8.4 Histogram of flash point determination results using test method ASTM D93 Procedure C in BIOD 1308 sample. (Axes: number of labs versus distribution range of test results, binned from 87.5 to 177.05; the median and the 1 % and 99 % points are marked.)

Fig. 8.5 Histogram of ethanol/methanol results using test method ASTM D5501 in ETOH 1308 sample. (Axes: number of labs versus distribution range of test results, binned from -0.01 to 0.18; the median and the 1 % and 99 % points are marked.)

Overall ranges of TPIs obtained over the years of the biofuels programs are qualitatively summarized in Table 8.10. Several tests could be improved for biofuels analysis.

In Figs. 8.1–8.8, data generated in this program are shown as examples of the charts produced. The examples are typical of those obtained for biodiesel and ethanol fuel samples routinely analyzed in this program. Fig. 8.1 shows a page of the results from all participating laboratories for the April 2013 cycle of ethanol fuel for total inorganic chloride using ASTM D7328. Fig. 8.2 depicts the results of sulfur by ASTM D5453 in an ethanol fuel sample analyzed in April 2013. Fig. 8.3 shows the distribution of results for pHe by ASTM D6423 using a histogram. The histogram is reasonably close to a normal bell-shaped curve. Although most of the histograms for all test analyses look satisfactory, scrutiny of individual diagrams shows some unsatisfactory results, such as abnormal distribution of data and worrisome trends in the data sets. In Fig. 8.4, which shows flash point determination using ASTM D93 Procedure C on a biodiesel sample from August 2013, some laboratories are getting very high or very low results. Similarly, Fig. 8.5 shows a wide variation in the results distribution for methanol content as determined by ASTM D5501 in an ethanol fuel sample from August 2013. The histogram is in very poor shape, far from a normal distribution. These laboratories need to check whether they are doing something different from the rest of the laboratories and take corrective actions where necessary.

Figs. 8.6–8.8 show the data from biodiesel samples analyzed in August 2013, comparing the alternate test methods used for the determination of acid number by ASTM D664 and ASTM D974, the determination of carbon residue by ASTM D189 and ASTM D4530, and the determination of flash point by ASTM D93 Procedures A and C and ASTM D3828. These box and whisker graphs are meant to provide a cross reference of test data generated from different procedures and methods that measure the same test parameter. The box and whisker graph separates the test data by standard, with the shaded box representing the middle 50 % of test data centered around the median. The horizontal line within the box represents the median of the reported data. The whisker length has been adjusted to the last data point that falls within 1.5 times the difference between the upper and lower value of the shaded box. Data points above or below the whisker are included, unless the data are off the “Y” axis scale. Data that would be plotted off the graph are indicated with an arrow. Outlier data are not included in this graph. The “Y” axis scale represents the absolute value; the “X” axis scale provides an identification of the test method and number of reporting data points. In these particular cases, the alternate test methods used for these analyses by different laboratories have given results within the reproducibility of the methods.

Fig. 8.6 Box and whisker graph of acid number results in BIOD 1308 sample. (Axes: results, 0.18 to 0.34, by test method: D664 Method B, 72 labs; D974, 30 labs.)
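The box and whisker construction described above (a box spanning the middle 50 % of the data, a median line, and whiskers reaching the last data point within 1.5 times the box height) can be sketched as follows. This is a minimal Python illustration, not the program's plotting code: the quartile convention (the "inclusive" method of the standard library) is an assumption, and the function name and sample values are hypothetical.

```python
from statistics import quantiles


def box_whisker_summary(results):
    """Box edges, median, and whisker ends for one test method's results."""
    data = sorted(results)
    # Assumed quartile convention; the text does not say which one is used.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    reach = 1.5 * (q3 - q1)  # 1.5 times the height of the shaded box
    low_whisker = min(x for x in data if x >= q1 - reach)
    high_whisker = max(x for x in data if x <= q3 + reach)
    beyond = [x for x in data if x < low_whisker or x > high_whisker]
    return {
        "box_low": q1,
        "median": median,
        "box_high": q3,
        "low_whisker": low_whisker,
        "high_whisker": high_whisker,
        "beyond_whiskers": beyond,
    }


# Made-up results for illustration; the value 30 lies beyond the upper whisker.
summary = box_whisker_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
print(summary)
```

Computing one such summary per test method and placing them side by side gives the kind of comparison between alternate methods that the figures are meant to show.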

In these box and whisker examples, acid number results overlap for each of the test methods, although the precision of each test method is quite different. In the example of carbon residue determination, the means are about the same, but there is a wide disparity between the precisions obtained for the ASTM D189 and ASTM D4530 test methods. Finally, in the case of flash point determination, three test methods have produced significantly different mean results as well as a wide spread of the results.

In summary, there are many advantages in participating in the ASTM Proficiency Testing Program, particularly for new matrices such as biofuels. The biggest take-home advantage is that two parties gain from the program. The first party is the participating laboratory, which can readily identify and correct its testing weaknesses, if necessary, and ultimately improve its overall quality assurance. The second party is the ASTM standardization authorities, who are responsible for the technical credibility and use of the published ASTM product specifications and supporting test methods. Through both ASTM processes, the standards for the industry are being improved to best serve the new marketplace needs that come with new product matrices.

Fig. 8.7 Box and whisker graph of carbon residue results in BIOD 1308 sample. (Axes: results, 0 to 0.18, by test method: D189 Conradson, 7 labs; D4530 Micro Method, 32 labs.)
