The petroleum industry is increasingly required to process petroleum that is more corrosive than that traditionally refined, part of the corrosion occurring during such processing being associated with the presence of naphthenic acids.
Naphthenic corrosion occurs in the temperature band between 180° C. and 370° C., which band is attained in atmospheric and vacuum distillation heating furnaces, in transfer lines from such furnaces to towers, in some distillation tower trays and cut lines, and in some heat exchangers.
The method presently employed to determine petroleum acidity is ASTM D-664, “Standard Test Method for Acid Number of Petroleum Products by Potentiometric Titration”, realised by means of potentiometric titration with potassium hydroxide. The total acid number (TAN) determined in this manner corresponds to the quantity of potassium hydroxide (KOH), in milligrams, required to neutralise each and every type of acid present in 1 g of sample. However, the method is not selective for organic acids such as, for example, naphthenic acids, and may also measure acidity arising from the presence of phenols and inorganic compounds having an acid reaction, generating a result which does not correlate with the corrosiveness of the petroleum.
It has been observed that petroleum having a relatively low TAN, in respect whereof naphthenic corrosion was not expected, presents such corrosive characteristic in some of the fractions thereof. This occurs by virtue of the fact that this type of corrosiveness depends not solely on the quantity of naphthenic acids present in petroleum but additionally on the molecular weight of such acids, the types of bonds in the structure thereof, and the boiling points thereof. In addition, in spite of its being widely utilised, the ASTM D-664 method is not the most appropriate for use with petroleum, being appropriate rather for use with petroleum derivatives. Such fact impairs the precision of the said method and renders the same of limited use in monitoring the processes currently being studied for reducing the naphthenic corrosiveness of petroleum.
With a view to eliminating such difficulties of correlation of TAN with the corrosiveness of petroleum and its derivatives, a method has been developed for determination of the acidity of petroleum samples arising solely from carboxylic acids present in the sample. The method, published in the journal Analytical Chemistry, v. 73, pp. 703-707, 2001, by JONES, D. M., et al., “Determination of naphthenic acids in crude oils using non-aqueous ion exchange solid-phase extraction”, discloses a technique wherein organic acids are extracted from the sample by a solid sorbent, being then quantitatively desorbed and eluted, the concentrations thereof being quantified by mid-infrared spectroscopy. Such quantification, on stoichiometric conversion into milligrams of KOH per gram of sample, is denominated naphthenic acid number (NAN). Said publication additionally discloses that the infrared calibration curve is generated by reading solutions of known concentration of a standard commercial range of naphthenic acids at the maximum absorption thereof at approximately 1710 cm−1, considering that the absorbance of the carbonyl is not influenced by the chemical structure of the acids present.
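The stoichiometric conversion mentioned above may be illustrated by the following minimal sketch. It assumes that each acid molecule carries a single carboxyl group, so that one mole of acid consumes one mole of KOH; the function names, the sample values and the mean molar mass of 250 g/mol are hypothetical and are not taken from the cited publication.

```python
KOH_MOLAR_MASS = 56.11  # g/mol, equivalently mg/mmol


def nan_from_molar_content(acid_mmol_per_g):
    """Convert carboxylic acid content (mmol per g of sample) to mg KOH/g.

    Assumption: monocarboxylic acids, i.e. 1 mol acid neutralises 1 mol KOH.
    """
    return acid_mmol_per_g * KOH_MOLAR_MASS


def nan_from_mass_content(acid_mg_per_g, mean_acid_molar_mass):
    """Convert acid content in mg per g of sample to mg KOH/g.

    mean_acid_molar_mass (g/mol) is an assumed average for the acid mixture.
    """
    acid_mmol_per_g = acid_mg_per_g / mean_acid_molar_mass
    return nan_from_molar_content(acid_mmol_per_g)


# Hypothetical sample: 5 mg of acids per g, assumed mean molar mass 250 g/mol
print(round(nan_from_mass_content(5.0, 250.0), 3))
```

The second function shows why an assumed mean molar mass is needed whenever the infrared calibration is expressed in mass rather than molar concentration.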
The objective of infrared (IR) spectroscopy is to identify functional groups in a material by virtue of the fact that each of such groups absorbs at a characteristic infrared radiation frequency. The IR spectrum is created as a consequence of absorption of electromagnetic radiation at frequencies related with the vibration of a given group of chemical bonds in a molecule. The IR spectrum region corresponds to the range in the electromagnetic spectrum lying between visible and radio wave radiation. Said region is divided into three parts: far-infrared (FIR) between 20 and 400 cm−1, wherein principally the rotational spectrum of molecules appears; mid-infrared (MIR) between 400 and 4000 cm−1, the bands whereof are generated by fundamental transitions and wherein practically all functional groups of organic molecules absorb; and near-infrared (NIR) between 4000 and 12 800 cm−1, wherein bands are generated by harmonic transitions and combinations of the fundamental transitions observed in MIR, principally occurring by virtue of the presence of functional groups containing chemical bonds with atoms of hydrogen.
The vibration spectrum of a product is considered as being a unique physical property characteristic of a molecule. Thus an IR spectrum may be used akin to a fingerprint for identification of an unknown pure substance through comparison with a reference spectrum. In the absence of a database including such sample, or in the case of a mixture of products, such analysis assists chemical characterisation through qualitative identification of the functional groups present.
One of the first applications of infrared spectroscopy as an analytical tool occurred during the 1940s in quality control in German chemical industries (COATES, J. P. “Appl. Spectrosc. Rev.”, v. 31, p. 179, 1996). Although infrared spectroscopy furnishes a large quantity of data with respect to the sample, the use thereof in the resolution of quantitative analytical chemistry problems became popular only with the technological advances in instrumentation and computing in the 1980s. Instruments capable of generating large quantities of data of high complexity led to the development of chemometrics, comprising the application of mathematical and statistical methods for the analysis of chemical and instrumental data (COSTA FILHO, P. A.; POPPI, R. J. “Aplicação de algorítimos genéticos na seleção de variáveis em espectroscopia no infravermelho médio. Determinação simultânea de glicose, maltose e frutose” [“Application of genetic algorithms in the selection of variables in mid-infrared spectroscopy. Simultaneous determination of glucose, maltose and fructose”], Química Nova, v. 25, pp. 46-52, 2002).
The chemometric methods most utilised for multivariate quantitative analysis are principal component regression (PCR) and partial least squares regression (PLSR), together with analogues thereof.
The principal advantage of utilising multivariate analytical methods is the ability to predict accurately and precisely the value of the desired property in complex sample matrices subject to chemical or physical interference. This is possible by virtue of the fact that such methods are capable of minimising the influence of interference by modelling the spectral variations caused by the interferent, this being achieved through inclusion of samples possessing the interferents in the multivariate regression model.
Through such methods deriving from principal component analysis (PCA), spectral data (absorption spectra in the mid-infrared region) are grouped in a data matrix X, wherein the samples are recorded in the rows, the independent variables (absorbance read at different wave numbers) being in the columns. Correspondingly, the values of the property of interest, also known as dependent variables, are grouped in a matrix Y.
The matrix X is decomposed into various components, also known as latent variables or factors, each constituted by two vectors denominated “loadings” and “scores”. The manner whereby such decomposition is realised is the principal difference between the method based on PLSR and that based on PCR.
In PLSR latent variables are found through an iterative process wherein there is an exchange of data between matrix X and matrix Y. Such process leads to rapid convergence of results and maximises the relationship between dependent and independent variables. This renders the use of PLSR more advantageous than PCR, wherein the decomposition of matrix X is independent of Y.
The mathematical model responsible for predicting the property of interest, also being called multivariate regression model, is constructed from the product of decomposition of matrix X and the data from matrix Y.
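By way of illustration only, the decomposition of matrix X into scores and loadings, guided by the property values, may be sketched as below for a single property (so-called PLS1) using a textbook NIPALS-style deflation. This is a simplified sketch in plain Python and is not asserted to be the algorithm of any cited reference; the function names and the numerical data are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pls1_fit(X, y, n_factors):
    """Fit a PLS1 model. X: list of sample rows (spectra), y: property values."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # centred X
    f = [v - y_mean for v in y]                                # centred y
    factors = []
    for _ in range(n_factors):
        # Weight vector: direction in X most covariant with the residual of y
        w = [dot([E[i][j] for i in range(n)], f) for j in range(m)]
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:          # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [dot(row, w) for row in E]                         # scores
        tt = dot(t, t)
        p = [dot([E[i][j] for i in range(n)], t) / tt for j in range(m)]  # loadings
        q = dot(f, t) / tt                                     # y loading
        # Deflate X and y before extracting the next latent variable
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        factors.append((w, p, q))
    return x_mean, y_mean, factors


def pls1_predict(model, x):
    """Predict the property for one new spectrum x."""
    x_mean, y_mean, factors = model
    e = [a - b for a, b in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in factors:
        t = dot(e, w)
        y_hat += q * t
        e = [ei - t * pj for ei, pj in zip(e, p)]
    return y_hat


# Hypothetical data: 6 samples, 3 "wave numbers", property exactly linear in X
X = [[1.0, 0.2, 0.5], [0.8, 0.4, 0.1], [0.3, 0.9, 0.7],
     [0.5, 0.5, 0.2], [0.9, 0.1, 0.8], [0.2, 0.7, 0.4]]
y = [2.0 * r[0] + 0.5 * r[1] - r[2] + 1.0 for r in X]
model = pls1_fit(X, y, n_factors=3)
print(pls1_predict(model, [0.6, 0.3, 0.3]))
```

In PCR the weight vector would be derived from X alone; in the sketch it is derived from both X and the residual of y at every step, which is the exchange of data between the two matrices referred to above.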
The ASTM E-1655 method “Infrared Multivariate Quantitative Analysis” and diverse articles such as, for example, those by P. GELADI and B. R. KOWALSKI, “Partial Least Squares Regression: A Tutorial”, Analytica Chimica Acta, v. 185, pp. 1-17 (1986), K. R. BEEBE and B. R. KOWALSKI, “An Introduction to Multivariate Calibration and Analysis”, Analytical Chemistry, v. 59, no. 17, Sep. 1, 1987, pp. 1007A-1017A, and MARTENS H. and NAES T., “Multivariate Calibration”, John Wiley & Sons, New York, 1989, describe the functioning of the PLSR mathematical algorithm responsible for decomposition of matrices X and Y, resulting in the multivariate regression model.
Construction of a multivariate regression model based on PLSR may be divided into two stages: calibration and validation. Calibration utilises the absorption spectra of the samples of the calibration population for construction of the mathematical model best adjusted to the spectral data and the values of the desired property. In this stage it is customary to make a selection of the independent variables which shall be utilised in calibration of the multivariate regression model, the objective whereof is to increase the robustness thereof. Through validation the robustness of the model constructed is verified. This is carried out by evaluating the prediction error of the samples of the calibration population (internal validation, also known as cross validation) or of external samples not participating in said calibration (external validation). Normally multivariate regression models are evaluated based on the value of the correlation coefficient (R²) between the values obtained by the proposed alternative technique and the reference values, which should be as close as possible to one.
Another parameter employed is prediction error. There exist several ways of expressing the quality of the model based on the prediction error such as, for example, root mean square error of cross validation (RMSECV) and root mean square error of prediction (RMSEP). Of these, that most utilised is RMSEP.
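Internal (cross) validation and the correlation coefficient may be illustrated by the following minimal sketch. It assumes the leave-one-out variant of cross validation and, purely as a stand-in for the multivariate model, a straight line fitted by least squares to a single variable; the function names and data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares line y = a + b*x (stand-in for the real regression model)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def leave_one_out_predictions(xs, ys):
    """Predict each calibration sample from a model built without that sample."""
    predictions = []
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        predictions.append(a + b * xs[i])
    return predictions


def r_squared(reference, predicted):
    """Squared correlation coefficient between reference and predicted values."""
    n = len(reference)
    mr, mp = sum(reference) / n, sum(predicted) / n
    srp = sum((r - mr) * (p - mp) for r, p in zip(reference, predicted))
    srr = sum((r - mr) ** 2 for r in reference)
    spp = sum((p - mp) ** 2 for p in predicted)
    return srp ** 2 / (srr * spp)


# Hypothetical calibration population: absorbance against reference property
absorbance = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60]
reference = [0.52, 0.98, 1.55, 2.01, 2.47, 3.05]
cv_predictions = leave_one_out_predictions(absorbance, reference)
print(r_squared(reference, cv_predictions))
```

For external validation the same `r_squared` function would be applied to samples which took no part in the fitting.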
Equations 1, 2 and 3 show the definitions most used in such evaluation, wherein yi is the value of the property determined by the method of reference, ŷi is the value predicted for sample i of the validation population for the calculation of RMSEP, and N is the number of samples in the population.
\[ \text{Residuals (prediction error)} = y_i - \hat{y}_i \tag{1} \]

\[ \text{Residual variation} = \frac{\sum (y_i - \hat{y}_i)^2}{N} \tag{2} \]

\[ \text{RMSEP} = \sqrt{\frac{\sum (y_i - \hat{y}_i)^2}{N}} \tag{3} \]
RMSEP quantifies the magnitude of residuals of the predicted property for the validation samples and is used to determine the precision of predictions for unknown samples. RMSEP should approximate to the error of the method of reference utilised for calibration, and is the error, in the original units, expected in future predictions.
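The three quantities defined in Equations 1, 2 and 3 may be computed as in the following minimal sketch. The function names and the four validation samples are hypothetical and serve solely to show the arithmetic.

```python
import math


def residuals(reference, predicted):
    """Equation 1: prediction error of each validation sample."""
    return [y - y_hat for y, y_hat in zip(reference, predicted)]


def residual_variation(reference, predicted):
    """Equation 2: mean of the squared residuals over the N samples."""
    errors = residuals(reference, predicted)
    return sum(e * e for e in errors) / len(errors)


def rmsep(reference, predicted):
    """Equation 3: square root of the residual variation, in original units."""
    return math.sqrt(residual_variation(reference, predicted))


# Hypothetical validation population (values in mg KOH/g)
reference = [0.50, 1.20, 2.00, 3.10]
predicted = [0.60, 1.10, 2.20, 3.00]
print(round(rmsep(reference, predicted), 3))  # approximately 0.132
```

The squared residuals here sum to 0.07, giving a residual variation of 0.0175 and an RMSEP of approximately 0.132 mg KOH/g, which would then be compared with the error of the method of reference.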
Low RMSECV or RMSEP values may indicate that the model constructed is suitable for predicting the desired parameter of unknown samples, whilst high values suggest that such model is of poor quality.
In the validation stage it is additionally very important to determine the number of latent variables required for construction of the PLSR-based models. For this purpose it is common to utilise the graph of the number of latent variables against the RMSECV value or RMSEP value, depending on the type of validation utilised.
A very low number of factors may result in high prediction errors by virtue of exclusion of variables having important information with reference to the property of interest. Conversely, excessive use of factors, in addition to increasing the complexity of the model, may lead to an increase in prediction error due to excessive adjustment (overfitting) of the model, wherein noise has been included.
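One simple way of reading the graph of prediction error against number of latent variables is sketched below. It assumes the rule of taking the smallest number of factors whose error lies within a chosen relative tolerance of the minimum; this is only one of several rules in use, and the function name, the tolerance and the error values are hypothetical.

```python
def select_number_of_factors(error_curve, tolerance=0.0):
    """Return the smallest number of latent variables with near-minimal error.

    error_curve[k - 1] is the RMSECV (or RMSEP) obtained with k factors.
    tolerance is the accepted relative excess over the minimum (0.0 = none).
    """
    lowest = min(error_curve)
    for k, error in enumerate(error_curve, start=1):
        if error <= lowest * (1.0 + tolerance):
            return k
    return len(error_curve)


# Hypothetical RMSECV values for models built with 1 to 7 latent variables:
# the error falls while information is added, then rises as noise is fitted.
rmsecv_curve = [0.42, 0.25, 0.14, 0.11, 0.12, 0.15, 0.19]
print(select_number_of_factors(rmsecv_curve))
print(select_number_of_factors(rmsecv_curve, tolerance=0.30))
```

Allowing a tolerance favours the simpler model when the additional factor yields only a marginal reduction in error.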