Statistical Analysis Methods for Chemists: A Software Based Approach - Softcover

Gardiner, William P

 

Synopsis

This useful book gives unique coverage of the statistical skills and techniques required in modern chemical experimentation and will prove invaluable to students and researchers.

Excerpt. © Reprinted by permission. All rights reserved.

Statistical Analysis Methods for Chemists

A Software-based Approach

By William P. Gardiner

The Royal Society of Chemistry

Copyright © 1997 The Royal Society of Chemistry
All rights reserved.
ISBN: 978-0-85404-549-5

Contents

Glossary
Chapter 1 Introduction
Chapter 2 Simple Chemical Experiments: Parametric Inferential Data Analysis
Chapter 3 One Factor Experimental Designs for Chemical Experimentation
Chapter 4 Factorial Experimental Designs for Chemical Experimentation
Chapter 5 Regression Modelling in the Chemical Sciences
Chapter 6 Non-parametric Inferential Data Analysis
Chapter 7 Two-level Factorial Designs in Chemical Experimentation
Chapter 8 Multivariate Analysis Methods in Chemistry
Appendix A: Statistical Tables
Appendix B: Tables of Large Data Sets
Answers to Exercises
Subject Index


CHAPTER 1

Introduction


1 INTRODUCTION

Most analytical experiments produce measurement data which need to be presented, analysed, and interpreted in respect of the chemical phenomena being studied. For such data and related analysis to have validity, methods which can produce the interpretational information sought need to be utilised. Statistics provides such methods through the rich diversity of presentational and interpretational procedures available to aid scientists in their data collection and analysis so that information within the data can be turned into useful and meaningful scientific knowledge.

Pioneering work on statistical concepts and principles began in the eighteenth century through Bayes, Bernoulli, Gauss, and Laplace. Individuals such as Francis Galton, Karl Pearson, Ronald Fisher, Egon Pearson, and Jerzy Neyman continued the development in the first half of the twentieth century. Development of many fundamental exploratory and inferential data analysis techniques stemmed from real biological problems such as Darwin's theory of evolution, Mendel's theory of genetic inheritance, and Fisher's work on agricultural experiments. In such problems, understanding and quantification of the biological effects of intra- and inter-species variation was vital to interpretation of the findings of the research. Statistical techniques are still developing mostly in relation to practical needs with the likes of artificial neural networks (ANN), fuzzy methods, and structure-activity relationships (SAR) finding favour in the chemical sciences.

Statistics can be applied within a wide range of disciplines to aid data collection and interpretation. Two quotations neatly summarise the role statistics can play as an integral part of chemical experimentation, in particular:

'The science of Statistics may be defined as the study of chance variations, and statistical methods are applicable whenever such variations affect the phenomena being studied.'

'Statistics is a science concerned with the collection, classification, and interpretation of quantitative data, and with the application of probability theory to the analysis and estimation of population parameters.'


Both quotations highlight that statistics is a scientifically-based tool appropriate to all aspects of experimentation from planning through to data analysis to help understand the data and to provide interpretations relevant to experimental objectives. Since all chemical measurements are subject to inherent variation, statistical methods provide a beneficial tool for explaining the features within the data accounting for such inherent variation. Knowledge of statistical principles and methods (strengths as well as weaknesses) should therefore be part of the skills of any scientist concerned with collecting and interpreting data and should also be an integral part of design planning. Statistics should not be considered as an afterthought only to be brought into play after data are collected, the 'square peg into round hole' syndrome, which is how the application of statistical methods is often viewed within the scientific community.

Applied chemical experimentation generally falls into one of three categories: monitoring, optimisation, and modelling. Monitoring is primarily concerned with process checking such as monitoring pollution levels, investigating how data are structured, quality assurance of analytical laboratories, and quality control of experimental material such as house reference materials (HRMs) and certified reference materials (CRMs). Optimisation, often through exploratory or investigative studies, comes into play when wishing to optimise a chemical process which may be influenced by a number of inter-related factors. Instances where such experimentation may occur include optimisation of analytical procedures, optimisation of a new chemical process, and assessment of how different chemical factors cause changes to a chemical outcome. Often, this type of experimentation is based on the classical one-factor-at-a-time (OFAT) approach which is inefficient and provides only partial outcome information. Through simple and logical modification of the OFAT structure to ensure that all possible factor combinations are tested, the experiment can be made more efficient and provide more relevant information on factor effects, such as factor interaction. Modelling, on the other hand, attempts to build a model of the chemical process under investigation for predictive purposes. It is often also based on the results obtained from an optimisation experiment where the importance of factors has been assessed and the most important factors retained for the purpose of model building.
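The difference between the OFAT structure and one that tests all possible factor combinations can be sketched in a few lines of Python. The factor names and levels below are hypothetical, chosen only for illustration, and the OFAT scheme shown (a baseline run plus one run per changed level) is one simple reading of the approach rather than a definitive implementation.

```python
from itertools import product

# Hypothetical factors and levels, invented purely for illustration.
factors = {
    "temperature_C": [25, 40],
    "pH": [5, 7],
    "catalyst_mM": [1, 2],
}


def ofat_runs(factors):
    """Baseline run plus one run per alternative level, changing one factor at a time."""
    names = list(factors)
    baseline = tuple(factors[name][0] for name in names)
    runs = [baseline]
    for i, name in enumerate(names):
        for level in factors[name][1:]:
            run = list(baseline)
            run[i] = level  # only this factor departs from the baseline
            runs.append(tuple(run))
    return runs


def factorial_runs(factors):
    """Every combination of factor levels (a full factorial structure)."""
    return list(product(*factors.values()))


ofat = ofat_runs(factors)
full = factorial_runs(factors)
# Combinations in which two or more factors change together are never run under OFAT.
missing = [run for run in full if run not in ofat]

print(len(ofat), "OFAT runs;", len(full), "factorial runs")
print("Combinations OFAT never tests:", missing)
```

Because the OFAT runs never change two factors together, they carry no information on how the effect of one factor depends on the level of another, which is the factor interaction referred to above.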

I will consider all of these forms of applied chemical experimentation in relation to illustrating how statistical methods can be used to provide understanding and interpretations of collected data in relation to the experimental objectives. Chapter 2 provides an introduction to exploratory data analysis (plots and summaries) and inferential data analysis (hypothesis testing and estimation) for one- and two-sample experimentation. Chapters 3 and 4 extend this introduction into more formal design structures for one-, two-, and three-factor experimentation with Chapter 4 concentrating on factorial designs, the easily implemented alternative to the classical OFAT approach. An introduction to modelling is provided in Chapter 5 through regression methods for the fitting of relationships (linear, multiple) to chemical data. Analytical applications of these techniques in the form of calibration and comparison of two linear equations will also be discussed. Chapter 6 introduces non-parametric methods as alternatives to the previously discussed parametric procedures. Experimental methods pertaining to optimisation are further developed in Chapter 7 through two-level factorial designs for multi-factor experimentation. The final chapter, Chapter 8, introduces multivariate methods appropriate to the handling of multi-response data sets. Many of the techniques and principles that will be explored are often discussed under the heading of Chemometrics, the name given to the cross-disciplinary approach of using mathematical and statistical methods to help extract relevant information from chemical data.

The increased power and availability of computers and software have enabled statistical methods to become more readily available for the treatment of chemical data. On this basis, all analysis concepts will be geared to using software (Excel and Minitab) to provide the data presentation on which analysis can be based. The mathematical and calculational aspects of statistics will be ignored, intentionally so, in order to be able to build up a picture of how statistics can turn chemical measurements into chemical information through interpretation of software output. Most of the methods discussed are of classical type though application methods are still developing.


2 WHY USE STATISTICS?

A question often asked by chemists is 'What use and relevance has statistics for chemistry?'. Statistics can best be described as a combination of techniques which cover the design of experiments, the collection of experimental data, the modes of presentation of data, and the ways in which data can be analysed for the information they contain. Statistical concepts, therefore, are relevant to all aspects of experimentation ranging from planning to interpretation. The latter can be subjective (exploratory data analysis, EDA) as well as objective (inferential data analysis, estimation) but the basic rule must be to understand the data as fully as possible by presenting and analysing them in a form whereby the information sought can be readily found.

Examples where statistical methods could be useful include:

• Assessing whether analytical procedures and/or laboratories differ in accuracy (systematic error) and precision (random error) of reported measurements,

• Assessing how changing experimental conditions affect a particular chemical outcome,

• Assessing the effect of many factors on the fluorescence of a chemical complex.


Such experimentation would produce numerical data which would need to be presented and analysed in order to extract the information they provide in respect of the experimental objective. Statistics, through its presentational and interpretational procedures, can provide such means of turning data into useful chemical information which explains the phenomena investigated.
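As a minimal sketch of the first example, accuracy can be summarised by the bias (mean of the replicates minus the known value) and precision by the standard deviation of the replicates. The measurements, the reference value of 50.0 mg/L, and the function name below are all assumptions invented for illustration; they are not data from the book.

```python
from statistics import mean, stdev

# Hypothetical replicate measurements (mg/L) of a reference sample whose
# known content is assumed to be 50.0 mg/L. All values are invented.
known_value = 50.0
procedure_a = [50.1, 49.9, 50.2, 49.8, 50.0]
procedure_b = [51.0, 52.5, 50.5, 53.0, 50.5]


def summarise(measurements, reference):
    """Return (bias, standard deviation) for a set of replicate measurements.

    Bias reflects accuracy (systematic error); the sample standard
    deviation reflects precision (random error).
    """
    bias = mean(measurements) - reference
    spread = stdev(measurements)
    return bias, spread


bias_a, sd_a = summarise(procedure_a, known_value)
bias_b, sd_b = summarise(procedure_b, known_value)

print(f"Procedure A: bias = {bias_a:+.3f}, standard deviation = {sd_a:.3f}")
print(f"Procedure B: bias = {bias_b:+.3f}, standard deviation = {sd_b:.3f}")
```

These summaries are exploratory only. Deciding whether the two procedures genuinely differ would call for the inferential methods (significance tests and confidence intervals) introduced in Chapter 2.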

Statistics can also provide tools for designing experiments ranging from simple laboratory experiments to complex experiments for analytical procedures. As assessment of chemical data is becoming more technical and demanding, this, in turn, is requiring chemists to consider more actively design structures that are efficient and to put greater emphasis on how they present and analyse their data using statistical methods. Such pressure encourages a greater awareness of the role of statistics in scientific experimentation together with a greater level of usage.

Use of statistical techniques is advocated by professional bodies such as The Royal Society of Chemistry (RSC) and the Association of Official Analytical Chemists (AOAC) for the handling and assessment of analytical data to ensure their quality and reliability. Statistical procedures appropriate to this type of approach form the basis of the Valid Analytical Measurement (VAM) scheme produced by the Laboratory of the Government Chemist (LGC), the National Measurement and Accreditation Service (NAMAS) of the United Kingdom Accreditation Service (UKAS), and other schemes including ISO9000, BS5750, and GLP for the reporting of analytical measurements. These support initiatives and accreditation schemes highlight the importance placed on using statistical methods as integral to chemical data handling.


3 PLANNING AND DESIGN OF EXPERIMENTS

In designing an experiment, we need to have a clear understanding of the purpose of the experiment (objective), how and what response data are to be collected (measurements to be made), and how these are to be displayed and analysed (statistical analysis methods). Design and statistical analysis must be considered as one entity and not separate parts to be put together as necessary. A well planned experiment will produce useful chemical data which will be easy to analyse by the statistical methods chosen. A badly designed and planned experiment will not be easy to analyse even if statistical methods are applied.

Why is design so important? Inadequate designs provide inadequate data, so if we wish to assess experimental objectives properly, we need to design the experiment so that appropriate information for assessing the experimental objective is forthcoming. In addition to the statistical considerations of design structure, we also need to ensure that instruments are properly calibrated, experimental material is uncontaminated, the experiment is performed properly, and the data being recorded are suitable for their intended purpose. We must also ensure that there are no trends in the data through, for example, technicians operating instruments differently and batches of material being non-uniform, and that the influence of unrecognised causal factors is minimised. In comparing the measurement of two analytical procedures, for instance, it would be advisable to use comparable samples of known chemical content or else it may be impossible to know whether the procedures are efficient in their recording of the chemical response. In the chemical sciences, reduction in response variability (improved precision) by appropriate choice of factor levels may also be an important consideration. Cost, problem knowledge, and ease of experimentation also come into play when designing a chemical experiment.

It is therefore important that an experiment be carefully planned before implementation and data collection. If necessary, advice on structure and analysis should be sought in order to ensure that choice of, for example, number of samples to be tested, amount of replication to carry out, statistical analysis routine, and software are most appropriate for the experimentation planned. With such advice, experimentation, data collection, and data analysis can readily take place with the experimenter knowing how each part comes together to address the experimental objectives. Planning of experiments is not an easy process but by producing an experimental plan, or protocol as it is referred to in clinical trials, we can develop a useful step-by-step guide to the experimentation and subsequent data analysis. The four aspects associated with the specification of an experimental plan are as follows:

1 Statement of the objectives of the investigation

This refers to a clear statement of the aims and objectives of the proposed experiment. Specification of the experimental objective(s) is the most important and fundamental aspect of scientific experimentation as it lays down the question(s) the experiment is going to try to answer. This, in turn, helps focus the subsequent planning, data collection, and data analysis towards the goal(s) of the experiment.

2 Planning of the experiment

Planning entails considering how best to implement the experiment to generate relevant chemical responses. It encompasses choice of factors and ranges for experimentation, how such are to be controlled, how the experimental material is to be prepared, choice of most appropriate chemical outcome best reflective of the objective(s), the decision on how many measurements to collect, and how best to display and analyse the outcome measured (the statistical data analysis). These decisions are largely within the control of the experimenter through their knowledge of the subject area and any constraints affecting experimentation such as instrument usage and preparation of experimental material. The statistical data analysis components chosen may also influence these aspects of experimental planning.

3 Data collection

This refers to the physical implementation aspect of the experiment which will produce the chemical response data. Consideration must be given to whether instrument calibration is necessary, how experimental material is to be prepared and stored, how the experiment itself is to be conducted, and how the chosen chemical response is to be recorded through either measurement or observation.

4 Data analysis

Statistical methods, incorporating exploratory and inferential data analysis, should be employed in the analysis of the experimental data though choice of which technique(s) depends on the experimental objective(s), the design structure, and the type of chemical response to be measured. Inferential data analysis (significance tests and confidence intervals) enables conclusions to be objective rather than subjective, providing an impartial basis for deciding on the chemical implications of the findings. The relevance and chemical validity of these conclusions hinge on the experimenter's ability to translate the statistical findings into useful and meaningful chemical information.


Choice of experimental design structure is important to the conduct of a good experiment. Why design choice is so important in chemical experimentation can be simply summarised through the following points:

• The experiment should have specified objective(s) to assess in respect of the chemical phenomena associated with it.

• The design should be efficient by maximising the information gained using the minimum of experimental effort (small and efficient designs).

• The design should be practical (easy to implement and analyse) and, where practicable, follow a well documented design structure (commonly used design, known structure to data analysis).


These points reinforce the need to consider a planned experiment carefully and to try to use a design structure which will provide requisite data as efficiently as possible. In addition, they show that structure should also be such that the data collected can be analysed using simple and easily understood statistical methods.

Design efficiency can be measured by the experimental error which arises from the variation between experimental units and the variation from the lack of uniformity in the execution of the experiment. The smaller the experimental error the more efficient the design. By introducing various kinds of control such as increasing number of experimental units and number of factors in the experiment, the effects of this uncontrolled variation (noise) may be reduced and the design made more efficient. Statistical methods essentially attempt to separate the signal (the response) from the noise (the error) so that the level of the signal relative to the noise can be measured, large values being indicative of significant explanatory effect and small values providing evidence of chance, and not explanatory, effect.
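One common way of expressing signal relative to noise is the F ratio of a one-way analysis of variance, in which the variation between factor levels is divided by the variation within them. The excerpt does not name this statistic at this point, so its use here is an assumption, and the response values and level names below are invented for illustration.

```python
from statistics import mean

# Hypothetical responses at three levels of a single factor (invented values).
groups = {
    "level_1": [10.0, 12.0, 11.0],
    "level_2": [14.0, 15.0, 16.0],
    "level_3": [20.0, 19.0, 21.0],
}


def f_ratio(groups):
    """Between-level mean square divided by within-level mean square."""
    samples = list(groups.values())
    all_values = [x for sample in samples for x in sample]
    grand_mean = mean(all_values)
    k = len(samples)           # number of factor levels
    n_total = len(all_values)  # total number of measurements

    # Signal: variation of the level means about the grand mean.
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    # Noise: variation of the measurements about their own level mean.
    ss_within = sum((x - mean(s)) ** 2 for s in samples for x in s)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


f = f_ratio(groups)
print(f"F ratio (signal relative to noise) = {f:.2f}")
```

A large ratio suggests the factor has a real effect on the response, while a ratio near one suggests the differences between levels are no larger than chance variation would produce. Judging how large is large enough requires the statistical tables or software output discussed in later chapters.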


(Continues...)
Excerpted from Statistical Analysis Methods for Chemists by William P. Gardiner. Copyright © 1997 The Royal Society of Chemistry. Excerpted by permission of The Royal Society of Chemistry.
All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.
Excerpts are provided by Dial-A-Book Inc. solely for the personal use of visitors to this web site.