Laboratory of Computational and Theoretical Chemistry, Faculty of Chemistry, University of Havana, Havana 10400, Cuba
2.
Departamento de Química, and Centro de Química Universidade de Coimbra, 3004-535 Coimbra, Portugal
3.
Departamento de Física, Facultad de Ciencias, Universidad de Chile, Casilla, 653 Santiago, Chile
4.
Centro para el desarrollo de la Nanociencias y Nanotecnología, CEDENNA, Av. Ecuador 3493, Santiago, Chile
5.
Escuela Superior de Física y Matemáticas, Instituto Politécnico Nacional, Edificio 9, U.P. A.L.M., Col. San Pedro Zacatenco, C.P. 07738, Ciudad de México, México
6.
Research Computing Center, University of North Carolina, Chapel Hill, NC 27599-3420, USA
7.
Department of Chemistry & Chemical Biology; McMaster University; Hamilton, Ontario L8S 4M1, Canada
8.
Department of Inorganic and Physical Chemistry, Ghent University, Krijgslaan 281 (S3), 9000 Gent, Belgium
9.
Center for Molecular Modeling, Ghent University, Technologiepark 903, 9052 Zwijnaarde, Belgium
Received Date:
13 September 2017 Revised Date:
13 October 2017 Accepted Date:
30 October 2017 Available Online:
2 June 2017
Fund Project:
CC acknowledges support by FONDECYT (1140313), Financiamiento Basal para Centros Científicos y Tecnológicos de Excelencia-FB0807, and project RC-130006 CILIS; Chile. PWA acknowledges support from NSERC, the Canada Research Chairs, and Compute Canada; Canada
Quantitative correlation of several theoretical electrophilicity measures over different families of organic compounds are examined relative to the experimental values of Mayr et al. Notably, the ability to predict these values accurately will help to elucidate the reactivity and selectivity trends observed in charge-transfer reactions. A crucial advantage of this theoretical approach is that it provides this information without the need of experiments, which are often demanding and time-consuming. Here, two different types of electrophilicity measures were analyzed. First, models derived from conceptual density functional theory (c-DFT), including Parr's original proposal and further generalizations of this index, are investigated. For instance, the approaches of Gázquez et al. and Chamorro et al. are considered, whereby it is possible to distinguish between processes in which a molecule gains or loses electrons. Further, we also explored two novel electrophilicity definitions. On one hand, the potential of environmental perturbations to affect electron incorporation into a system is analyzed in terms of recent developments in c-DFT. These studies highlight the importance of considering the molecular surroundings when a consistent description of chemical reactivity is needed. On the other hand, we test a new definition of electrophilicity that is free from inconsistencies (so-called thermodynamic electrophilicity). This approach is based on Parr's pioneering insights, though it corrects issues present in the standard working expression for the calculation of electrophilicity. Additionally, we use machine-learning tools (i.e., symbolic regression) to identify the models that best fit the experimental values. In this way, the best possible description of the electrophilicity values in terms of different electronic structure quantities is obtained. Overall, this straightforward approach enables one to obtain good correlations between the theoretical and experimental quantities by using the simple, yet powerful, interpretative advantage of c-DFT methods. In general, we observed that the correlations found at the HF/6-31G(d) level of theory are of semi-quantitative value. To obtain more accurate results, we showed that working with families of compounds with similar functional groups is indispensable.
The concepts of electrophilicity and nucleophilicity are cornerstones of our understanding of chemical reactivity 1, 2. Usually, they are merely considered in a qualitative way, together with the notions of electron accepting/donating groups, to rationalize chemical reactivity trends. However, their remarkable utility has also inspired researchers to find ways to quantify the electrophilic and nucleophilic character of a given molecule. These efforts have deepened our understanding of what makes a compound a better or worse electrophile/nucleophile, how these properties could be tuned, and how they relate to other quantities. We view the work of Mayr et al.3-7 as especially important. These authors have established electrophilicity and nucleophilicity scales by means of extensive experimental tests. The defining element of these studies is the following expression:
logk20∘C=s(EMayr+NMayr)
where k20 ℃ is the rate constant of the reaction (at 20 ℃) between the nucleophile and the electrophile, while s and NMayr are parameters specific to the former, and EMayr to the latter. Analyzing several combinations of electrophile-nucleophile pairs, the corresponding parameters can be determined. These definitions of the nucleophilicity and electrophilicity is clearly based on the concept that strong electrophiles react with strong nucleophiles rapidly and vigorously, while weak electrophiles and weak nucleophiles react slowly and reluctantly, if at all.
Parallel to these experimental determinations, there is steadfast interest in finding mathematical equations to quantify the electrophilicity of a molecule 8-10. Perhaps the best example of this is the early work of Maynard et al. 11, which proposed a simple expression for the electrophilicity in terms of electron attachment/removal energies. Parr et al.12 then provided a theoretical justification for Maynard's proposal within the framework of conceptual density functional theory (c-DFT) 13-23. This landmark contribution gave chemists a simple, yet powerful, tool that has been successfully applied to the analysis of regioselectivity of cycloaddition reactions 24-27, the study of leaving groups 28, 29, transition states 30-32, and redox reactions 33-35. This c-DFT electrophilicity measure has also been used as a descriptor in several quantitative structure activity relationship (QSAR) studies 36-39. But arguably the biggest impact of the precise quantitative definition of the electrophilicity is that it provided a way to prove the minimum electrophilicity principle (MEP) 40-44. Expanding on Parr's work, and also within the c-DFT formalism, Gázquez et al. 45, and Chamorro et al. 46, 47 have generalized this index. Recently, new working expressions for this index has been proposed, which are free from some inconsistencies found in previous versions 48, 49.
The proliferation of theoretical electrophilicity measures makes it desirable to assess their performance. This is main goal of this work. As a reference for comparison, we will use Mayr's parameter. Since the theoretical and experimental measures of electrophilicity refer to different effects, we do not expect them to be exactly the same. However, we believe that a good theoretical measure of electrophilicity must capture the trends from Mayr's work. In addition to testing existing theoretical definitions, we will also attempt to use machine learning techniques to find a mathematical expression that reproduces the Mayr electrophilicity.
2.
Conceptual DFT electrophilicity measures
Parr et al. 12 defined the electrophilicity, ω, as the stabilization energy associated with the electronic saturation of the system (e.g., molecule). In other words, how the energy of a system changes when it reaches equilibrium with a perfect electron donor. Evaluating this definition requires a model for how the electronic energy, E, changes with respect to the number of electrons, N. If we neglect the changes in the external potential (e.g., molecular geometry) during the electron saturation process, we can estimate energy variation in terms of the following (truncated) Taylor expansion 50:
E(N)=E(N0)+(∂E∂N)v(r)(N−N0)+12(∂2E∂N2)v(r)(N−N0)2
Within the c-DFT framework is customary to identify the coefficients of this expansion with different reactivity descriptors. For example, Parr et al. 51 proposed to identify the coefficient of the linear term with the (electronic) chemical potential of the system, μ, which is in turn defined as the additive inverse of the electronegativity, χ:
μ≡−χ≡(∂E∂N)v(r)
In a similar way, Parr and Pearson 50 identified the coefficient of the quadratic term with the chemical hardness, η, which is in turn defined as the multiplicative inverse of the chemical softness, S:
η≡1S=(∂2E∂N2)v(r)
In c-DFT, a perfect electron donor is defined as a system with zero electronegativity (e.g., zero chemical potential). This means that Parr's electrophilicity definition can be written as:
ω=Eμ=μ0−Eμ=0
where μ0 is the chemical potential of the isolated system.
The standard way to approximate this index is to notice that, according to Sanderson's electronegativity equalization principle 52, the system will reach electronic saturation when:
(∂E∂N)v(r)=0
Substituting Eq. (2) in this expression we can calculate the maximum number of bound electrons, △Nmax:
Notice that we have explicitly indicated that this expression corresponds to the quadratic energy model used in Eq. (2). (It is possible to generalize this expression to differentiate between the electrophilicity of spin up and spin down electrons 53-60, but this will not be considered here.)
To obtain a working expression for the electrophilicity we need to express the chemical potential and chemical hardness in terms of quantities from electronic structure theory that can be computed using standard quantum chemistry software. This can be done by viewing Eq. (2) as a parabolic interpolation of the electronic energy of the system with N0, N0
+ 1, and N0 -1 electrons or, equivalently, using a three-point finite differences approximation with △N = 1 61:
\mu = - \frac{{I + A}}{2}
\eta = I - A
here, I and A are the ionization energy and electron affinity, respectively. Substituting these expressions in Eq. (8) we obtain:
Motivated by the success of the parabolic model, Gázquez, Cedillo, and Vela (GCV) proposed a two-parabolas model that allowed them to describe electron transfer processes in a more realistic way 45. The key insight in their work was to distinguish between the electron-accepting and the electron-donating processes. (Since we will be only concerned here with the ability of a system to incorporate electron, we will neglect the latter.) In their formulation, the electron-rich and electron-poor regimes are described by different parabolas (e.g., different EvsN models) with the same curvature. Imposing different conditions on this simple model, they obtained the expressions for the chemical potentials of the system when is gaining, μGCV+ electrons:
Their model is completed with the assumption that the chemical hardness for both electron-accepting and electron-donating processes is equal:
\eta _{\text{GCV}}= \frac{{I - A}}{2}
These new expressions can be substituted in the general working equation obtained by Parr (cf. Eq. (8)), resulting in the following electroaccepting, ωGCV+, power45:
Inspired by the work of GCV, Chamorro, Duque, and Pérez (CDP) 46, 47 rewrote Eq. (8) using the electron accepting, μ+, chemical potential obtained from the piecewise linear interpolation of Perdew, Parr, Levy, and Balduz 62:
Additionally, these authors also proposed a simpler ansatz, where the chemical hardness of a system gaining electrons could be related to the chemical potential of the inverse process. In other words, making 46, 47:
Recently, it has been pointed out that there are several problems with the classical way to calculate the electrophilicity, Eq. (11) 49. Most notably, this formula gives unreasonable results for polycations (where △Nmax > > 1) and for species with small electron affinities (where △Nmax should be close to zero, but may not be). These, and other, problems could be traced back to the fact that while the parabolic interpolation is mathematically consistent 63, it lacks a rigorous physical justification 64, 65 (except for some cases) 66, 67. This situation motivated the analysis of the electrophilicity index using a finite temperature formulation of the grand-canonical ensemble (an approach that has provided rigorous foundations to several c-DFT results) 48, 68-75. Within this formulation, it is easy to obtain an exact working expression that preserves the physical meaning of Parr's definition. This thermodynamic electrophilicity, ωTD, is simply given by 49:
where M0Ak is the kth electron affinity of the system with M0 particles, and θ(x) is Heaviside (step) function. If we are only considering the same three states used in the parabolic model (e.g., "neutral",
"cationic", and "anionic" states), this result is reduced to:
{{\omega }_{\text{TD}}} = A\, \theta \left( A \right)
This merely takes the electrophilicity to be equal to A when A > 0, and zero in the other cases. This is the simplified expression that we will be using in the following.
Finally, it has been also shown that a consistent description of chemical reactivity within c-DFT requires the descriptors to incorporate the effects of the molecular surroundings 75-79. The simplest way to do so is using the insights obtained from the Klopman-Salem frontier molecular orbital treatment 80-86, which allows us to obtain model expressions for the perturbed chemical potential, μP, and the perturbed chemical hardness, ηP76:
where γ and ζ are non-negative parameters that model the interaction with the environment. Including these modified descriptors in Eq. (8) we obtain a more general, perturbed electrophilicity, ωP:
In the following, we will analyze how the previously discussed electrophilicity measures correlate with Mayr's electrophilicity parameter. To do this, we will work with a simple linear correlation:
{{E}_{\text{Mayr}}} = m\omega + b
Since the effect of the \frac{1}{{2\xi {{(1 + \gamma )}^2}}} factor in Eq. (23) can be included in the slope, m, of the linear regression, we will work with a simplified version of the perturbed electrophilicity, namely:
{\tilde \omega _\text{P}} = \frac{{{{(\gamma I + A)}^2}}}{{I - A}}
3.
Electrophilicity measures from symbolic regression (SR)
The indices presented in the previous section are chemically motivated and, as such, they help us determine the factors which govern the acceptance of electrons by an electrophile. While this is a valuable approach, sometimes we are more interested in obtaining an accurate estimate of the electrophilicity, regardless of the insight, it might provide. An attractive alternative in this respect is the use of machine learning (ML) tools 87-91.
Here, we will work with a very powerful data mining tool: symbolic regression (SR) 92-100. SR is a supervised method, which means that we provide the algorithm with a "training set" in order to find correlations between different descriptors and a property of interest. The goal of SR is to use a training set to obtain a mathematical function that relates the desired properties to the data. In this way, SR gives the best functional form (of a given pre-specified complexity), along with the best numerical parameters, for a given training set. In more technical terms, given a set of data values \{ \vec x;y\}
(training set), with \vec x representing a vector of independent variables (e.g., descriptors), and y the dependent variable (which will be predicted from \vec x ), SR finds the function \bar y = f(\vec x) that best describes the overall data. The optimization process takes place over a function space, expanded by mathematical operators and constants 100. If the function space includes all possible functional forms, that is, the length of the mathematical expression can go up to infinity and there is no restriction on the operators used, the method will find many different functions which fit all the data exactly. However, this might cause problems when describing points/observations outside the training set. It is therefore common to enforce some constraints during the SR procedure (e.g., the length of the functions, the number of operands and numerical parameters used, etc.)
Once we restrict the accessible function space, an intuitive choice would be to select a set of basis functions to expand it. However, while this will turn the SR problem into a more familiar linear regression, the size of the basis needed makes this impractical in most cases. This demands the use of different optimization methods, with evolutionary algorithms being a popular choice. Here we will focus on two such algorithms: genetic programming (GP) 100-103 and grammatical evolution (GE) 104-106.
3.1
Genetic programming (GP)
Genetic programming (GP) 100-102 draws inspiration from the mechanism of DNA replication, with new "generations" (e.g., functional forms) leading towards better solutions. On this algorithm, each possible solution is referred to as an "individual". In the case of an SR, these are functions proposed to fit the data. These individuals are commonly represented as a tree (see Fig. 1). The leaves represent constants or data variables, and the nodes represent the allowed mathematical operators. It is important to remark that it is possible if desired, to use basically any function as an operator, and not only the four basic arithmetic operators.
图 1
图 1
Tree representation of Parr's electrophilicity equation.
Figure 1.
Tree representation of Parr's electrophilicity equation.
The algorithm starts with a random generation of possible solutions (trees) that fulfill the restrictions imposed (e.g., the operators allowed and the maximum length of the solution). Then, a new generation of solutions is created, following three steps: breeding, evaluation, and selection. The breeding involves the creation of a new set of alternative solutions. This can be done in several ways 107: copying a solution to the new generation; combining two solutions to produce two new ones by exchanging two randomly selected subtrees between the parents (crossover) 108; changing part of a solution (subtree) for a new, randomly generated, one (mutation) 103 (see Figs. 2 and 3). The crossover operator ensures that the individuals of the new generation share characteristics of their parents; while the mutation process prevents premature convergence of the algorithm by adding randomness. These changes occur in an uncontrolled way. Then, the evaluation step consists in the evaluation of the fitness function for each of the newly created candidate solutions. The last step consists of the selection of some of the newly created solutions in order to conform the population of the new generation. The probability to select an individual increases with the quality of the fitness calculated in the evaluation step. This process is repeated until a termination criterion is fulfilled (typically given by the number of iterations or a fitness threshold).
图 2
图 2
Effect of the mutation operator on a tree.
Figure 2.
Effect of the mutation operator on a tree.
Grammatical evolution (GE) 104-106, a more recent evolutionary algorithm, uses a grammar to map an integer array into a program (in the case of an SR, into a function). To do the mapping, GE uses a context-free grammar 109-111, which along with an alphabet of terminal symbols, a set of non-terminal symbols, production rules, and a start symbol, is capable of deriving a model from an integer string. These integer strings are then optimized by an evolutionary technique. At every step, the composition rules are used to map from the array to the corresponding model, in order to evaluate the fitness of the latter. This resembles the natural process in which the genotype is separated from the phenotype. The optimization procedure can mimic the steps used in the GP case, but other algorithms such as particle swarm optimization 112 can also be used.
4.
Model systems and computational tools
The different electrophilicity measures were tested over a set of 58 molecules for which Mayr's EMayr parameter is known113 (see Table 1). This global set includes several families of electrophiles, grouped following Mayr's "star classification system" 114. According to this criterion, we have a group of 3 1-star electrophiles, 21 3-stars electrophiles, 22 5-stars electrophiles, and 12 neutral electrophiles.
表 1
表 1
Studied molecules, their classification according to Mayr's system, their experimental EMayr values 113, and their calculated I and A (given in eV).
Table 1.
Studied molecules, their classification according to Mayr's system, their experimental EMayr values 113, and their calculated I and A (given in eV).
The structures of all the molecules were optimized at the HF/6-31g(d) level of theory. Then, single-point calculations were performed for these geometries after adding one electron, and after removing one electron. In this way, we obtain the necessary data to calculate the vertical ionization potentials and electron affinities (see Table 1). All the calculations were performed using Gaussian 09 115.
The SR calculations were performed using GP and GE algorithms, with different "tree lengths". The mutation probability was set to 15%, and the maximum number of generations was set to 50. For each case, three runs were performed, as a way to check the stability of the SR results. The final expressions were simplified, when possible. These calculations were performed using the HeuristicLab-3.3.12 software package 116.
5.
Results
5.1
Conceptual DFT results
In Table 2 we show the values of the fitted parameters for every tested model. We have performed 5 different fits, one for each of the 4 separate families of compounds, and one that takes into account all the molecules. It is interesting to note that in the case of the 1-star family the slope of the linear regression (cf. Eq. (24)) is negative when we use the ωparabolic,
ωTD, and {\tilde \omega _{\rm{P}}} indices. This is a surprising result since these are precisely the electrophilicity measures that have the strongest theoretical justification. This implies that these indices can even give qualitatively wrong results, namely, they are in some cases incapable of predicting the relative order of electrophilicity observed experimentally. This is a reminder that these simple electronic structure-based indices miss many of the factors that govern the tendency of a system to gain electrons during the course of an actual chemical reaction. Nonetheless, it is reassuring that when we work with bigger datasets, all the studied models have positive slopes. This means that, overall, these indices are able to recover the rough trends in Mayr's electrophilicity parameter.
表 2
表 2
Fit parameters for all the c-DFT-based electrophilicity measures studied.
Table 2.
Fit parameters for all the c-DFT-based electrophilicity measures studied.
With the parameters given in Table 1, it is easy to evaluate the quality of the different fits (see Table 3). In general, good correlations are obtained in all cases, with the exception of the 3-stars family. This is probably because the 3-star family is the most structurally diverse group. For example, here we can find heterosubstituted carbocations (like flavylium and 1, 3-dithianylium), methal-stabilized carbocations (like the 1, 3-diphenylpropyn-1-ylium-Cr(CO)3), benzhydril cations (like pop(tol) CH+ and (pfp)2CH+), cyclic conjugated carbocations (like the tropylium ion), and other π-delocalized carbocations (like 1, 1, 3-triphenylallylium and 1-(4-chlorophenyl) cyclopent-2-enylium). In this case, even the best fit (given by ωTD) still results in an RMSD of 2.950, and a correlation coefficient lower than 0.6. This is a strong indication that to predict the electrophilicity of a given compound we should use a model trained on a set of molecules with similar characteristics. An alternative, in the case we are especially interested in structurally diverse sets, could be to include information regarding condensed local reactivity descriptors 117-120.
表 3
表 3
Average error (AE), mean absolute error (MAE), root mean square deviation (RMSD), and correlation coefficient (R2) for all the c-DFT-based electrophilicity measures studied.
Table 3.
Average error (AE), mean absolute error (MAE), root mean square deviation (RMSD), and correlation coefficient (R2) for all the c-DFT-based electrophilicity measures studied.
This insight is supported by the very good fits obtained for the 5-stars and neutral groups (particularly, when we consider the values of R2). These families are mostly formed by structurally similar compounds: benzhydryl cations in the 5-star case, and quinone methides (with a few alkenes substituted with electron-withdrawing groups) in the neutral case. The structural variety of the neutral family is reflected instead in the RMSD values, which are significantly higher than those found in the 5-star case. Overall, all the electrophilicity measures studied have very similar performances over these sets. Nonetheless, in terms of RMSD and R2, the ωCDP1+ and ωCDP2+ indices are outperformed by others. These trends are preserved when we perform the fits including all the studied molecules. Finally, we should note that the simple model based on ωTD consistently ranks among the best. Recall that is simply equal to the electron affinity; it is not surprising that the electrophilicity and the electron affinity are closely related.
5.2
SR results
In Tables 4 and 5 we show the expressions obtained using GE and GP for all families, respectively. In cases where the optimization algorithm resulted in more than one expression, we will only consider the two that have the highest correlation coefficient values. All the solutions with tree lengths up to 4 are explicitly presented. Additionally, we will also discuss some examples of more complex expressions obtained with greater tree lengths.
表 4
表 4
Equations obtained using grammatical evolution.
Table 4.
Equations obtained using grammatical evolution.
- 29.00367 - 13.69916\left( { - A + \frac{A}{{\rm{I}}}} \right)
all
3
ge_A3a
- 14.29564 + 0.42896{A^2}
4
ge_A4a
- 13.01861 + 0.06894{A^3}
The equation codes reflect the method (ge: grammatical evolution), the group ("1": 1-star, "3": 3-stars, "5": 5-stars, "N": neutral, "A": all), the three length (3 or 4), and an identifier that indicates if the equation corresponds to the best ("a") or second best ("b") value of R2.
表 5
表 5
Equations obtained using genetic programming.
Table 5.
Equations obtained using genetic programming.
The equation codes reflect the method (gp: genetic programming), the group ("1": 1-star, "3": 3-stars, "5": 5-stars, "N": neutral, "A": all), the three length (3 or 4), and an identifier that indicates if the equation corresponds to the best ("a") or second best ("b") value of R2.
Unsurprisingly, given the choice to select between I or A to construct a function to fit EMayr, both algorithms choose the latter. However, here we see much more general functional forms including the electron affinity. For example, now we have a greater variety of terms with different powers of A (ranging from -1 to +3). In all these cases (except in the pathological 1-star family), the final equations are monotonically increasing functions of A.
The results of the different statistical measures corresponding to the SR fits can be found in Table 6. Overall, the performance of these methods is very similar to the c-DFT electrophilicity measures. For example, for the 3-stars family, the RMSD and R2 values obtained both by GE and GP algorithms are virtually identical to those corresponding to the best c-DFT indices (ωTD and {\tilde \omega _{\rm{P}}}). For this family, the quality of the fit remains essentially unaltered if we increase the tree length using the GE algorithm (increasing the tree length up to 7 only increases R2 up to 0.539182). On the other hand, increasing the tree length does improve the fit quality when we use GP. In this case, a tree length of 7 gave an R2 of 0.703165. The corresponding expression (which, in our notation, would be equation gp_73a) is:
表 6
Average error (AE), mean absolute error (MAE), root mean square deviation (RMSD), and correlation coefficient (R2) for all the SR-based electrophilicity measures studied.
Table 6.
Average error (AE), mean absolute error (MAE), root mean square deviation (RMSD), and correlation coefficient (R2) for all the SR-based electrophilicity measures studied.
Nonetheless, requiring an extremely complicated expression to get a semi-quantitative fit is a reminder of the difficulty of predicting experimental electrophilicities with a single expression over a structurally diverse set of compounds.
For the 5-stars family the SR results are excellent, and increasing the tree length in the latter case does not produce a noticeable improvement in the quality of the fit. On the other hand, it is interesting to note that when we work with all the molecules at the same time, the c-DFT indices perform slightly better than the SR expressions. However, when we increase the tree length both GP and GE algorithms provide significantly better solutions. For example, the equation ge_A7a would be:
These results indicate that we need to increase the flexibility of SR methods when the complexity of the system under study increases. This is to be expected since these methods need to "learn" the physical insights that are already built in the c-DFT descriptors.
6.
Conclusions
In this work we have assessed how different theoretical electrophilicity measures can be used to predict the experimental values of this index determined by Mayr et al. The different c-DFT indices used provided acceptable results, as long as they were applied to molecules with similar structures. From a quantitative point of view, it is reassuring that a measure as simple as the electron affinity, ωTD, provides results of comparable (if not better) quality than models based on more complex indices. This means that we can predict experimental electrophilicity values without computing the ionization energy, which reduces the computational effort.
Perhaps the biggest appealing of the c-DFT approach is that one uses simple expressions, with chemically-motivated terms. In this way, we gain valuable insights into the factors that determinate the electrophilicity at a molecular level. However, we noticed that there are cases when the c-DFT measures were unable to provide the correct qualitative ordering corresponding to the experimental electrophilicity. This serves as an indication that still more effort is needed to improve the c-DFT descriptors, so they can provide a more realistic description of the electron uptake processes. For example, local reactivity indicators are probably needed in order to compare the electrophilicities of diverse families of molecules that include several different types of electrophilic reactive sites (A similar observation has been made for quantitative studies of the quality of leaving groups 28).
To the best of our knowledge, we are reporting the first application of symbolic regression techniques to the prediction of experimental electrophilicity values and the first study of symbolic regression within the conceptual DFT framework. Symbolic regression has the advantage that the resulting models can be improved systematically, by removing restrictions from the optimization process. We showed that this is a key factor for making adequate predictions over relatively complex sets. The simplest models once again proved that the best way to predict the electrophilicity, which often only requires calculating the electron affinity. In general, the expressions obtained using GE are simpler than those obtained using GP, but in most cases the corresponding fits are equivalent. Finally, it should be remarked that, while attractive, these models could be prone to overfitting.
In general, when used carefully, the different theoretical methods studied here provide adequate predictions of the experimental electrophilicity. Further studies are necessary, however, to assess their utility over a broader range of compounds, including their validation using independent "test" sets of molecules. It is also interesting to check whether using the exact electrophilicity formula given in Eq. (19) will improve the quantitative predictions.
References
[1]
Miller, B. Advanced Organic Chemistry: Reactions and Mechanisms; Prentice-Hall: Upper Saddle River, NJ, USA, 1998.doi: 10.1021/ed075p1558
doi: 10.1021/ed075p1558
[2]
March, J. Advanced Organic Chemistry; Wiley-Interscience: New York, NY, USA, 1992. doi: 10.1002/0470084960
doi: 10.1002/0470084960
[3]
Mayr, H.; Kempf, B.; Ofial, A. R. Acc. Chem. Res. 2003, 36, 66.doi: 10.1021/ar020094c
doi: 10.1021/ar020094c
[4]
Mayr, H.; Bug, T.; Gotta, M. F.; Hering, N.; Irrgang, B.; Janker, B.; Kempf, B.; Loos, R.; Ofial, A. R.; Remennikov, G.; et al. J. Am. Chem. Soc. 2001, 123, 9500. doi: 10.1021/ja010890y
doi: 10.1021/ja010890y
Mayr, H.; Ofial, A. R. J. Phys. Org. Chem. 2008, 21, 584.doi: 10.1002/poc.1325
doi: 10.1002/poc.1325
[7]
Mayr, H.; Ofial, A. R. Pure Appl. Chem. 2005, 77, 1807.doi: 10.1351/pac200577111807
doi: 10.1351/pac200577111807
[8]
Liu, S. B. In Chemical Reactivity Theory: A Density Functional View; Chattaraj, P. K., Ed.; Taylor and Francis: Boca Raton, FL, USA, 2009; p. 179.
[9]
Chattaraj, P. K.; Giri, S. Ann. Rep. Prog. Chem. C 2009, 105, 13.doi: 10.1039/B802832J
doi: 10.1039/B802832J
[10]
Chattaraj, P. K.; Giri, S.; Duley, S. Chem. Rev. 2011, 111, PR43.doi: 10.1021/cr100149p
doi: 10.1021/cr100149p
[11]
Maynard, A. T.; Huang, M.; Rice, W. G.; Covell, D. G. Proc. Natl. Acad. Sci. USA 1998, 95, 11578. doi: 10.1073/pnas.95.20.11578
doi: 10.1073/pnas.95.20.11578
[12]
Parr, R. G.; von Szentpály, L.; Liu, S. B. J. Am. Chem. Soc. 1999, 121, 1922. doi: 10.1021/ja983494x
doi: 10.1021/ja983494x
Parr, R. G.; Yang, W. Density-Functional Theory of Atoms and Molecules; Oxford UP: New York, NY, USA, 1989.
[15]
Geerlings, P.; De Proft, F.; Langenaeker, W. Chem. Rev. 2003, 103, 1793. doi: 10.1021/cr990029
doi: 10.1021/cr990029
[16]
Johnson, P. A.; Bartolotti, L. J.; Ayers, P. W.; Fievez, T.; Geerlings, P. Modern Charge Density Analysis; Gatti, C., Macchi, P., Eds.; Springer: New York, NY, USA, 2012; p. 715.
[17]
Ayers, P. W.; Anderson, J. S. M.; Bartolotti, L. J. Int. J. Quantum Chem. 2005, 101, 520. doi: 10.1002/qua.20307
doi: 10.1002/qua.20307
[18]
Ayers, P. W.; Parr, R. G. J. Am. Chem. Soc. 2000, 122, 2010.doi: 10.1021/ja9924039
doi: 10.1021/ja9924039
[19]
Miranda-Quintana, R. A. Conceptual Density Functional Theory and its Applications in the Chemical Domain; Islam, N., Kaya, S., Eds.; Apple Academic Press: NJ, USA, in press.
[20]
Chemical Reactivity Theory: A Density Functional View; Chattaraj, P. K., Ed.; CRC Press: Boca Raton, FL, USA, 2009.
[21]
Fuentealba, P.; Cárdenas, C. Chemical Modelling; Springborg, M., Ed.; The Royal Society of Chemistry: London, UK, 2015; Vol. 11, p. 151.
[22]
Liu, S. B. Acta Phys. -Chim. Sin. 2009, 25, 590. doi: 10.3866/PKU.WHXB20090332
doi: 10.3866/PKU.WHXB20090332
[23]
Gazquez, J. L. J. Mex. Chem. Soc. 2008, 52, 3.
[24]
Domingo, L. R.; Perez, P.; Saez, J. A. RSC Adv. 2013, 3, 1486.doi: 10.1039/C2RA22886F
doi: 10.1039/C2RA22886F
[25]
Domingo, L. R.; Zaragoza, R. J.; Saez, J. A.; Arno, M. Molecules 2012, 17, 1335. doi: 10.3390/molecules17021335
doi: 10.3390/molecules17021335
[26]
Domingo, L. R.; Perez, P.; Contreras, R. Tetrahedron 2004, 60, 6585. doi: 10.1016/j.tet.2004.06.003
doi: 10.1016/j.tet.2004.06.003
[27]
Domingo, L. R.; Aurell, M. J.; Perez, P.; Contreras, R. J. Phys. Chem. A 2002, 106, 6871. doi: 10.1021/jp020715j
doi: 10.1021/jp020715j
[28]
Anderson, J. S. M.; Liu, Y. L.; Thomson, J. W.; Ayers, P. W. J. Mol. Struct.: THEOCHEM 2010, 943, 168.doi: 10.1016/j.theochem.2009.12.013
doi: 10.1016/j.theochem.2009.12.013
[29]
Ayers, P. W.; Anderson, J. S. M.; Rodriguez, J. I.; Jawed, Z. Phys. Chem. Chem. Phys. 2005, 7, 1918. doi: 10.1039/B500996K
doi: 10.1039/B500996K
[30]
Chamorro, E.; Chattaraj, P. K.; Fuentealba, P. J. Phys. Chem. A 2003, 107, 7068. doi: 10.1021/jp035435y
doi: 10.1021/jp035435y
[31]
Parthasarathi, R.; Elango, M.; Subramanian, V.; Chattaraj, P. K. Theor. Chem. Acc. 2005, 113, 257. doi: 10.1007/s00214-005-0634-3
doi: 10.1007/s00214-005-0634-3
[32]
González, M. M.; Hernández-Castillo, D.; Montero-Cabrera, L. A.; Miranda-Quintana, R. A. Int. J. Quantum Chem. 2017, e25444.doi: 10.1002/qua.25444
doi: 10.1002/qua.25444
[33]
Moens, J.; Jaque, P.; De Proft, F.; Geerlings, P. J. Phys. Chem. A 2008, 112, 6023. doi: 10.1021/jp711652a
doi: 10.1021/jp711652a
[34]
Moens, J.; Geerlings, P.; Roos, G. Chem. -A Eur. J. 2007, 13, 8174. doi: 10.1002/chem.200601896
doi: 10.1002/chem.200601896
[35]
Moens, J.; Roos, G.; Jaque, P.; Proft, F.; Geerlings, P. Chem. -A Eur. J. 2007, 13, 9331. doi: 10.1002/chem.200700547
doi: 10.1002/chem.200700547
[36]
Parthasarathi, R.; Padmanabhan, J.; Subramanian, V.; Maiti, B.; Chattaraj, P. K. Curr. Sci. 2004, 86, 535.
[37]
Parthasarathi, R.; Subramanian, V.; Roy, D. R.; Chattaraj, P. K. Biorg. Med. Chem. 2004, 12, 5533. doi: 10.1016/j.bmc.2004.08.013
doi: 10.1016/j.bmc.2004.08.013
Galvan, M.; Vela, A.; Gazquez, J. L. J. Phys. Chem. 1988, 92, 6470. doi: 10.1021/j100333a056
doi: 10.1021/j100333a056
[54]
Vargas, R.; Galvan, M.; Vela, A. J. Phys. Chem. A 1998, 102, 3134. doi: 10.1021/jp972984t
doi: 10.1021/jp972984t
[55]
Galvan, M.; Vargas, R. J. Phys. Chem. 1992, 96, 1625.doi: 10.1021/j100183a026
doi: 10.1021/j100183a026
[56]
Ghanty, T. K.; Ghosh, S. K. J. Am. Chem. Soc. 1994, 116, 3943. doi: 10.1021/ja00088a033
doi: 10.1021/ja00088a033
[57]
Chamorro, E.; Santos, J. C.; Escobar, C. A.; Perez, P. Chem. Phys. Lett. 2006, 431, 210. doi: 10.1016/j.cplett.2006.09.072
doi: 10.1016/j.cplett.2006.09.072
[58]
Chamorro, E.; Perez, P.; De Proft, F.; Geerlings, P. J. Chem. Phys. 2006, 124, 044105. doi: 10.1063/1.2161187
doi: 10.1063/1.2161187
[59]
Perez, P.; Chamorro, E.; Ayers, P. W. J. Chem. Phys. 2008, 128, 204108. doi: 10.1063/1.2916714
doi: 10.1063/1.2916714
[60]
Miranda-Quintana, R. A.; Ayers, P. W. Theor. Chem. Acc. 2016, 135, 239. doi: 10.1007/s00214-016-1995-5
doi: 10.1007/s00214-016-1995-5
[61]
Cardenas, C.; Heidar Zadeh, F.; Ayers, P. W. Phys. Chem. Chem. Phys. 2016, 18, 25721. doi: 10.1039/C6CP04533B
doi: 10.1039/C6CP04533B
[62]
Perdew, J. P.; Parr, R. G.; Levy, M.; Balduz, J. L., Jr. Phys. Rev. Lett. 1982, 49, 1691. doi: 10.1103/PhysRevLett.49.1691
doi: 10.1103/PhysRevLett.49.1691
[63]
Heidar Zadeh, F.; Miranda-Quintana, R. A.; Verstraelen, T.; Bultinck, P.; Ayers, P. W. J. Chem. Theory Comp. 2016, 12, 5777. doi: 10.1021/acs.jctc.6b00494
doi: 10.1021/acs.jctc.6b00494
[64]
Miranda-Quintana, R. A.; Ayers, P. W. Conceptual Density Functional Theory and Its Applications in the Chemical Domain; Islam, N., Kaya, S., Eds.; Apple Academic Press: NJ, USA, in press.
[65]
Miranda-Quintana, R. A.; Ayers, P. W. J. Chem. Phys. 2016, 144, 244112. doi: 10.1063/1.4953557
doi: 10.1063/1.4953557
[66]
Ayers, P. W.; Parr, R. G. J. Chem. Phys. 2008, 129, 054111. doi: 10.1063/1.2957900
doi: 10.1063/1.2957900
[67]
Ayers, P. W.; Parr, R. G. J. Chem. Phys. 2008, 128, 184108. doi: 10.1063/1.2918731
doi: 10.1063/1.2918731
[68]
Franco-Pérez, M.; Ayers, P. W.; Gazquez, J. L.; Vela, A. J. Chem. Phys. 2017, 147, 094105. doi: 10.1063/1.4999761
doi: 10.1063/1.4999761
[69]
Franco-Pérez, M.; Heidar-Zadeh, F.; Ayers, P. W.; Gazquez, J. L.; Vela, A. Phys. Chem. Chem. Phys. 2017, 19, 11588.doi: 10.1039/C7CP00224F
doi: 10.1039/C7CP00224F
[70]
Franco-Pérez, M.; Ayers, P. W.; Gazquez, J. L.; Vela, A. Phys. Chem. Chem. Phys. 2017, 19, 13687. doi: 10.1039/C7CP00692F
doi: 10.1039/C7CP00692F
[71]
Polanco-Ramírez, C. A.; Franco-Pérez, M.; Carmona-Espíndola, J.; Gazquez, J. L.; Ayers, P. W. Phys. Chem. Chem. Phys. 2017, 19, 12355. doi: 10.1039/C7CP00691H
doi: 10.1039/C7CP00691H
[72]
Franco-Pérez, M.; Ayers, P.; Gazquez, J. L.; Vela, A. J. Chem. Phys. 2015, 143, 244117. doi: 10.1063/1.4938422
doi: 10.1063/1.4938422
[73]
Franco-Pérez, M.; Gazquez, J. L.; Ayers, P.; Vela, A. J. Chem. Phys. 2015, 143, 154103. doi: 10.1063/1.4932539
doi: 10.1063/1.4932539
[74]
Malek, A.; Balawender, R. J. Chem. Phys. 2015, 142, 054104.doi: 10.1063/1.4906555
doi: 10.1063/1.4906555
[75]
Miranda-Quintana, R. A.; Ayers, P. W. Phys. Chem. Chem. Phys. 2016, 18, 15070. doi: 10.1039/c6cp00939e
doi: 10.1039/c6cp00939e
[76]
Miranda-Quintana, R. A. Theor. Chem. Acc. 2017, 136, 76.doi: 10.1007/s00214-017-2109-8
doi: 10.1007/s00214-017-2109-8
[77]
Miranda-Quintana, R. A.; Ayers, P. W. Theor. Chem. Acc. 2016, 135, 172. doi: 10.1007/s00214-016-1924-7
doi: 10.1007/s00214-016-1924-7
[78]
Miranda-Quintana, R. A. Theor. Chem. Acc. 2016, 135, 189.doi: 10.1007/s00214-016-1945-2
doi: 10.1007/s00214-016-1945-2
[79]
Miranda-Quintana, R. A.; González, M. M.; Ayers, P. W. Phys. Chem. Chem. Phys. 2016, 18, 22235. doi: 10.1039/c6cp03213c
doi: 10.1039/c6cp03213c
[80]
Klopman, G. J. Am. Chem. Soc. 1968, 90, 223.doi: 10.1021/ja01004a002
doi: 10.1021/ja01004a002
[81]
Klopman, G.; Hudson, R. F. Theor. Chim. Act. 1967, 8, 165.doi: 10.1007/bf00526373
doi: 10.1007/bf00526373
[82]
Klopman, G.; Klopman, G. Chemical Reactivity and Reaction Paths; Wiley-Interscience: New York, NY, USA, 1974; p. 55.
[83]
Hudson, R. F.; Klopman, G. Tetrahedron Lett. 1967, 12, 1103.doi: 10.1016/S0040-4039(00)90645-2
doi: 10.1016/S0040-4039(00)90645-2
[84]
Salem, L. J. Am. Chem. Soc. 1968, 90, 553.doi: 10.1021/ja01005a002
doi: 10.1021/ja01005a002
[85]
Salem, L. J. Am. Chem. Soc. 1968, 90, 543.doi: 10.1021/ja01005a001
doi: 10.1021/ja01005a001
[86]
Salem, L. Chem. Br. 1969, 5, 449.
[87]
Witten, I. H.; Frank, E.; Hall, M. A.; Pal, C. J. Data Mining: Practical Machine Learning Tools and Techniques, 4th ed.; Elsevier: Cambridge, MA, USA, 2017.
[88]
Gertrudes, J. C.; Maltarollo, V. G.; Silva, R. A.; Oliveira, P. R.; Honorio, K. M.; da Silva, A. B. F. Curr. Med. Chem. 2012, 19, 4289. doi: 10.2174/092986712802884259
doi: 10.2174/092986712802884259
[89]
Carrera, G.; Gupta, S.; Aires-de-Sousa, J. J. Comput. Aided Mol. Des. 2009, 23, 419. doi: 10.1007/s10822-009-9275-2
doi: 10.1007/s10822-009-9275-2
[90]
Dietterich, T. G. Ai Mag. 1997, 18, 97. doi: 10.1609/aimag.v18i4.1324
doi: 10.1609/aimag.v18i4.1324
[91]
Carbonell, J. G.; Michalski, R. S.; Mitchell, T. M. Machine Learning: an Artificial Intelligence Approach; Michalski, R. S., Carbonell, J. G., Mitchell, T. M., Eds.; Springer Berlin Heidelberg: Berlin, Heidelberg, Germany, 1983; p. 3.doi: 10.1007/978-3-662-12405-5_1
doi: 10.1007/978-3-662-12405-5_1
[92]
Lino, A.; Rocha, A.; Sizo, A. J. Intell. Fuzzy Syst. 2016, 31, 2061. doi: 10.3233/JIFS-169045
doi: 10.3233/JIFS-169045
[93]
Affenzeller, M.; Winkler, S. M.; Kronberger, G.; Kommenda, M.; Burlacu, B.; Wagner, S. Genetic Programming Theory and Practice XI; Riolo, R., Moore, J. H., Kotanchek, M., Eds.; Springer New York: New York, NY, USA, 2014; p. 175.doi: 10.1007/978-1-4939-0375-7_10
doi: 10.1007/978-1-4939-0375-7_10
[94]
Billard, L.; Diday, E. Classification, Clustering, and Data Analysis: Recent Advances and Applications; Jajuga, K., Sokolowski, A., Bock, H. H., Eds.; Springer-Verlag: Berlin, Germany, 2012.
[95]
Billard, L.; Diday, E. J. Am. Stat. Assoc. 2003, 98, 470.doi: 10.1198/016214503000242
doi: 10.1198/016214503000242
[96]
Billard, L.; Diday, E. Data Analysis, Classification, and Related Methods; Kiers, H. A. L., Rasson, J.-P., Groenen, P. J. F., Schader, M., Eds.; Springer Berlin Heidelberg: Berlin, Heidelberg, Germany, 2000; p. 369. doi: 10.1007/978-3-642-59789-3_58
doi: 10.1007/978-3-642-59789-3_58
[97]
Schmidt, M. D.; Vallabhajosyula, R. R.; Jenkins, J. W.; Hood, J. E.; Soni, A. S.; Wikswo, J. P.; Lipson, H. Phys. Biol. 2011, 8, 055011. doi: 10.1088/1478-3975/8/5/055011
doi: 10.1088/1478-3975/8/5/055011
[98]
Schmidt, M. W.; Lipson, H. Science 2009, 324, 81.doi: 10.1126/science.1165893
doi: 10.1126/science.1165893
[99]
Bongard, J.; Lipson, H. Proc. Natl. Acad. Sci. USA 2007, 104, 9943.doi: 10.1073/pnas.0609476104
doi: 10.1073/pnas.0609476104
[100]
Quade, M.; Abel, M.; Shafi, K.; Niven, R. K. Phys. Rev. B 2016, 94, 012214. doi: 10.1103/PhysRevE.94.012214
doi: 10.1103/PhysRevE.94.012214
[101]
Koza, J. R. 2nd International IEEE Conference on Tools for Artificial Intelligence, 1990; p. 819.
[102]
Sharma, S.; Tambe, S. S. Biochem. Eng. J. 2014, 85, 89.doi: 10.1016/j.bej.2014.02.007
doi: 10.1016/j.bej.2014.02.007
[103]
Langdon, W. B.; Poli, R. Found. Genet. Program.; Springer-Verlag: Berlin, Germany, 2013.
[104]
O'Neil, M.; Ryan, C. Grammatical Evolution: Evolutionary Automatic Programming in an Arbitrary Language; Springer US: Boston, MA, USA, 2003; p. 33. doi: 10.1007/978-1-4615-0447-4_4
doi: 10.1007/978-1-4615-0447-4_4
[105]
O'Neil, M.; Ryan, C. Grammatical Evolution: Evolutionary Automatic Programming in an Arbitrary Language; Springer US: Boston, MA, USA, 2003; p. 79. doi: 10.1007/978-1-4615-0447-4_7
doi: 10.1007/978-1-4615-0447-4_7
[106]
Ryan, C.; Collins, J.; Neill, M. O. Genetic Programming: First European Workshop, EuroGP'98 Paris, France, April 14–15, 1998 Proceedings; Banzhaf, W., Poli, R., Schoenauer, M., Fogarty, T. C., Eds.; Springer: Berlin, Heidelberg, Germany, 1998;, p. 83.doi: 10.1007/BFb0055930
doi: 10.1007/BFb0055930
[107]
Luke, S.; Spector, L. Genetic Programming 1997: Proceedings of the Second Annual Conference (GP97); Koza, J., Ed.; San Francisco, CA, USA, 1997; p. 240.
[108]
Spears, W. M.; Anand, V. Methodologies for Intelligent Systems: 6th International Symposium, ISMIS '91 Charlotte, N. C., USA, October 16–19, 1991 Proceedings; Ras, Z. W., Zemankova, M., Eds.; Springer: Berlin, Heidelberg, Germany, 1991; p. 409.doi: 10.1007/3-540-54563-8_104
doi: 10.1007/3-540-54563-8_104
[109]
Chomsky, N. IRE Trans. Inf. Theory 1956, 2, 113. doi: 10.1109/TIT.1956.1056813
doi: 10.1109/TIT.1956.1056813
[110]
Ginsburg, S. The Mathematical Theory of Context Free Languages; McGraw-Hill, New York, NY, USA, 1966.
[111]
Temkin, J. M.; Gilder, M. R. Bioinformatics 2003, 19, 2046.doi: 10.1093/bioinformatics/btg279
doi: 10.1093/bioinformatics/btg279
Mayr, H. http://www.cup.lmu.de/oc/mayr/reaktionsdatenbank/; Vol. 2017 (accessed Oct 30, 2017).
[114]
Mayr, H. http://www.cup.uni-muenchen.de/oc/mayr/DBintro.html; Vol. 2017 (accessed Oct 30, 2017).
[115]
Frisch, M. J.; Trucks, G. W.; Schlegel, H. B.; Scuseria, G. E.; Robb, M. A.; Cheeseman, J. R.; Scalmani, G.; Barone, V.; Mennucci, B.; Petersson, G. A.; et al. Gaussian 09; Gaussian Inc.: Wallingford, CT, USA, 2009.
[116]
Wagner, S.; Kronberger, G.; Beham, A.; Kommenda, M.; Scheibenpflug, A.; Pitzer, E.; Vonolfen, S.; Kofler, M.; Winkler, S.; Dorfer, V.; et al. Advanced Methods and Applications in Computational Intelligence; Klempous, R., Nikodem, J., Jacak, W., Chaczko, Z., Eds.; Springer International Publishing: Heidelberg, Germany, 2014; p. 197. doi: 10.1007/978-3-319-01436-4_10
doi: 10.1007/978-3-319-01436-4_10
[117]
Yang, W. T.; Mortier, W. J. J. Am. Chem. Soc. 1986, 108, 5708.doi: 10.1021/ja00279a008
doi: 10.1021/ja00279a008
[118]
Miranda-Quintana, R. A. Chem. Phys. Lett. 2016, 658, 328.doi: 10.1016/j.cplett.2016.06.068
doi: 10.1016/j.cplett.2016.06.068
[119]
Zielinski, F.; Tognetti, V.; Joubert, L. Chem. Phys. Lett. 2012, 527, 67. doi: 10.1016/j.cplett.2012.01.011
doi: 10.1016/j.cplett.2012.01.011
[120]
Bultinck, P.; Fias, S.; Alsenoy, C. V.; Ayers, P. W.; Carbó-Dorca, R. J. Chem. Phys. 2007, 127, 034102. doi: 10.1063/1.2749518
doi: 10.1063/1.2749518
[1]
Miller, B. Advanced Organic Chemistry: Reactions and Mechanisms; Prentice-Hall: Upper Saddle River, NJ, USA, 1998.doi: 10.1021/ed075p1558
doi: 10.1021/ed075p1558
[2]
March, J. Advanced Organic Chemistry; Wiley-Interscience: New York, NY, USA, 1992. doi: 10.1002/0470084960
doi: 10.1002/0470084960
[3]
Mayr, H.; Kempf, B.; Ofial, A. R. Acc. Chem. Res. 2003, 36, 66.doi: 10.1021/ar020094c
doi: 10.1021/ar020094c
[4]
Mayr, H.; Bug, T.; Gotta, M. F.; Hering, N.; Irrgang, B.; Janker, B.; Kempf, B.; Loos, R.; Ofial, A. R.; Remennikov, G.; et al. J. Am. Chem. Soc. 2001, 123, 9500. doi: 10.1021/ja010890y
doi: 10.1021/ja010890y
Mayr, H.; Ofial, A. R. J. Phys. Org. Chem. 2008, 21, 584.doi: 10.1002/poc.1325
doi: 10.1002/poc.1325
[7]
Mayr, H.; Ofial, A. R. Pure Appl. Chem. 2005, 77, 1807.doi: 10.1351/pac200577111807
doi: 10.1351/pac200577111807
[8]
Liu, S. B. In Chemical Reactivity Theory: A Density Functional View; Chattaraj, P. K., Ed.; Taylor and Francis: Boca Raton, FL, USA, 2009; p. 179.
[9]
Chattaraj, P. K.; Giri, S. Ann. Rep. Prog. Chem. C 2009, 105, 13.doi: 10.1039/B802832J
doi: 10.1039/B802832J
[10]
Chattaraj, P. K.; Giri, S.; Duley, S. Chem. Rev. 2011, 111, PR43.doi: 10.1021/cr100149p
doi: 10.1021/cr100149p
[11]
Maynard, A. T.; Huang, M.; Rice, W. G.; Covell, D. G. Proc. Natl. Acad. Sci. USA 1998, 95, 11578. doi: 10.1073/pnas.95.20.11578
doi: 10.1073/pnas.95.20.11578
[12]
Parr, R. G.; von Szentpály, L.; Liu, S. B. J. Am. Chem. Soc. 1999, 121, 1922. doi: 10.1021/ja983494x
doi: 10.1021/ja983494x
Parr, R. G.; Yang, W. Density-Functional Theory of Atoms and Molecules; Oxford UP: New York, NY, USA, 1989.
[15]
Geerlings, P.; De Proft, F.; Langenaeker, W. Chem. Rev. 2003, 103, 1793. doi: 10.1021/cr990029
doi: 10.1021/cr990029
[16]
Johnson, P. A.; Bartolotti, L. J.; Ayers, P. W.; Fievez, T.; Geerlings, P. Modern Charge Density Analysis; Gatti, C., Macchi, P., Eds.; Springer: New York, NY, USA, 2012; p. 715.
[17]
Ayers, P. W.; Anderson, J. S. M.; Bartolotti, L. J. Int. J. Quantum Chem. 2005, 101, 520. doi: 10.1002/qua.20307
doi: 10.1002/qua.20307
[18]
Ayers, P. W.; Parr, R. G. J. Am. Chem. Soc. 2000, 122, 2010.doi: 10.1021/ja9924039
doi: 10.1021/ja9924039
[19]
Miranda-Quintana, R. A. Conceptual Density Functional Theory and its Applications in the Chemical Domain; Islam, N., Kaya, S., Eds.; Apple Academic Press: NJ, USA, in press.
[20]
Chemical Reactivity Theory: A Density Functional View; Chattaraj, P. K., Ed.; CRC Press: Boca Raton, FL, USA, 2009.
[21]
Fuentealba, P.; Cárdenas, C. Chemical Modelling; Springborg, M., Ed.; The Royal Society of Chemistry: London, UK, 2015; Vol. 11, p. 151.
[22]
Liu, S. B. Acta Phys. -Chim. Sin. 2009, 25, 590. doi: 10.3866/PKU.WHXB20090332
doi: 10.3866/PKU.WHXB20090332
[23]
Gazquez, J. L. J. Mex. Chem. Soc. 2008, 52, 3.
[24]
Domingo, L. R.; Perez, P.; Saez, J. A. RSC Adv. 2013, 3, 1486.doi: 10.1039/C2RA22886F
doi: 10.1039/C2RA22886F
[25]
Domingo, L. R.; Zaragoza, R. J.; Saez, J. A.; Arno, M. Molecules 2012, 17, 1335. doi: 10.3390/molecules17021335
doi: 10.3390/molecules17021335
[26]
Domingo, L. R.; Perez, P.; Contreras, R. Tetrahedron 2004, 60, 6585. doi: 10.1016/j.tet.2004.06.003
doi: 10.1016/j.tet.2004.06.003
[27]
Domingo, L. R.; Aurell, M. J.; Perez, P.; Contreras, R. J. Phys. Chem. A 2002, 106, 6871. doi: 10.1021/jp020715j
doi: 10.1021/jp020715j
[28]
Anderson, J. S. M.; Liu, Y. L.; Thomson, J. W.; Ayers, P. W. J. Mol. Struct.: THEOCHEM 2010, 943, 168.doi: 10.1016/j.theochem.2009.12.013
doi: 10.1016/j.theochem.2009.12.013
[29]
Ayers, P. W.; Anderson, J. S. M.; Rodriguez, J. I.; Jawed, Z. Phys. Chem. Chem. Phys. 2005, 7, 1918. doi: 10.1039/B500996K
doi: 10.1039/B500996K
[30]
Chamorro, E.; Chattaraj, P. K.; Fuentealba, P. J. Phys. Chem. A 2003, 107, 7068. doi: 10.1021/jp035435y
doi: 10.1021/jp035435y
[31]
Parthasarathi, R.; Elango, M.; Subramanian, V.; Chattaraj, P. K. Theor. Chem. Acc. 2005, 113, 257. doi: 10.1007/s00214-005-0634-3
doi: 10.1007/s00214-005-0634-3
[32]
González, M. M.; Hernández-Castillo, D.; Montero-Cabrera, L. A.; Miranda-Quintana, R. A. Int. J. Quantum Chem. 2017, e25444.doi: 10.1002/qua.25444
doi: 10.1002/qua.25444
[33]
Moens, J.; Jaque, P.; De Proft, F.; Geerlings, P. J. Phys. Chem. A 2008, 112, 6023. doi: 10.1021/jp711652a
doi: 10.1021/jp711652a
[34]
Moens, J.; Geerlings, P.; Roos, G. Chem. -A Eur. J. 2007, 13, 8174. doi: 10.1002/chem.200601896
doi: 10.1002/chem.200601896
[35]
Moens, J.; Roos, G.; Jaque, P.; Proft, F.; Geerlings, P. Chem. -A Eur. J. 2007, 13, 9331. doi: 10.1002/chem.200700547
doi: 10.1002/chem.200700547
[36]
Parthasarathi, R.; Padmanabhan, J.; Subramanian, V.; Maiti, B.; Chattaraj, P. K. Curr. Sci. 2004, 86, 535.
[37]
Parthasarathi, R.; Subramanian, V.; Roy, D. R.; Chattaraj, P. K. Biorg. Med. Chem. 2004, 12, 5533. doi: 10.1016/j.bmc.2004.08.013
doi: 10.1016/j.bmc.2004.08.013
Galvan, M.; Vela, A.; Gazquez, J. L. J. Phys. Chem. 1988, 92, 6470. doi: 10.1021/j100333a056
doi: 10.1021/j100333a056
[54]
Vargas, R.; Galvan, M.; Vela, A. J. Phys. Chem. A 1998, 102, 3134. doi: 10.1021/jp972984t
doi: 10.1021/jp972984t
[55]
Galvan, M.; Vargas, R. J. Phys. Chem. 1992, 96, 1625.doi: 10.1021/j100183a026
doi: 10.1021/j100183a026
[56]
Ghanty, T. K.; Ghosh, S. K. J. Am. Chem. Soc. 1994, 116, 3943. doi: 10.1021/ja00088a033
doi: 10.1021/ja00088a033
[57]
Chamorro, E.; Santos, J. C.; Escobar, C. A.; Perez, P. Chem. Phys. Lett. 2006, 431, 210. doi: 10.1016/j.cplett.2006.09.072
doi: 10.1016/j.cplett.2006.09.072
[58]
Chamorro, E.; Perez, P.; De Proft, F.; Geerlings, P. J. Chem. Phys. 2006, 124, 044105. doi: 10.1063/1.2161187
doi: 10.1063/1.2161187
[59]
Perez, P.; Chamorro, E.; Ayers, P. W. J. Chem. Phys. 2008, 128, 204108. doi: 10.1063/1.2916714
doi: 10.1063/1.2916714
[60]
Miranda-Quintana, R. A.; Ayers, P. W. Theor. Chem. Acc. 2016, 135, 239. doi: 10.1007/s00214-016-1995-5
doi: 10.1007/s00214-016-1995-5
[61]
Cardenas, C.; Heidar Zadeh, F.; Ayers, P. W. Phys. Chem. Chem. Phys. 2016, 18, 25721. doi: 10.1039/C6CP04533B
doi: 10.1039/C6CP04533B
[62]
Perdew, J. P.; Parr, R. G.; Levy, M.; Balduz, J. L., Jr. Phys. Rev. Lett. 1982, 49, 1691. doi: 10.1103/PhysRevLett.49.1691
doi: 10.1103/PhysRevLett.49.1691
[63]
Heidar Zadeh, F.; Miranda-Quintana, R. A.; Verstraelen, T.; Bultinck, P.; Ayers, P. W. J. Chem. Theory Comp. 2016, 12, 5777. doi: 10.1021/acs.jctc.6b00494
doi: 10.1021/acs.jctc.6b00494
[64]
Miranda-Quintana, R. A.; Ayers, P. W. Conceptual Density Functional Theory and Its Applications in the Chemical Domain; Islam, N., Kaya, S., Eds.; Apple Academic Press: NJ, USA, in press.
[65]
Miranda-Quintana, R. A.; Ayers, P. W. J. Chem. Phys. 2016, 144, 244112. doi: 10.1063/1.4953557
doi: 10.1063/1.4953557
[66]
Ayers, P. W.; Parr, R. G. J. Chem. Phys. 2008, 129, 054111. doi: 10.1063/1.2957900
doi: 10.1063/1.2957900
[67]
Ayers, P. W.; Parr, R. G. J. Chem. Phys. 2008, 128, 184108. doi: 10.1063/1.2918731
doi: 10.1063/1.2918731
[68]
Franco-Pérez, M.; Ayers, P. W.; Gazquez, J. L.; Vela, A. J. Chem. Phys. 2017, 147, 094105. doi: 10.1063/1.4999761
doi: 10.1063/1.4999761
[69]
Franco-Pérez, M.; Heidar-Zadeh, F.; Ayers, P. W.; Gazquez, J. L.; Vela, A. Phys. Chem. Chem. Phys. 2017, 19, 11588.doi: 10.1039/C7CP00224F
doi: 10.1039/C7CP00224F
[70]
Franco-Pérez, M.; Ayers, P. W.; Gazquez, J. L.; Vela, A. Phys. Chem. Chem. Phys. 2017, 19, 13687. doi: 10.1039/C7CP00692F
doi: 10.1039/C7CP00692F
[71]
Polanco-Ramírez, C. A.; Franco-Pérez, M.; Carmona-Espíndola, J.; Gazquez, J. L.; Ayers, P. W. Phys. Chem. Chem. Phys. 2017, 19, 12355. doi: 10.1039/C7CP00691H
doi: 10.1039/C7CP00691H
[72]
Franco-Pérez, M.; Ayers, P.; Gazquez, J. L.; Vela, A. J. Chem. Phys. 2015, 143, 244117. doi: 10.1063/1.4938422
doi: 10.1063/1.4938422
[73]
Franco-Pérez, M.; Gazquez, J. L.; Ayers, P.; Vela, A. J. Chem. Phys. 2015, 143, 154103. doi: 10.1063/1.4932539
doi: 10.1063/1.4932539
[74]
Malek, A.; Balawender, R. J. Chem. Phys. 2015, 142, 054104.doi: 10.1063/1.4906555
doi: 10.1063/1.4906555
[75]
Miranda-Quintana, R. A.; Ayers, P. W. Phys. Chem. Chem. Phys. 2016, 18, 15070. doi: 10.1039/c6cp00939e
doi: 10.1039/c6cp00939e
[76]
Miranda-Quintana, R. A. Theor. Chem. Acc. 2017, 136, 76.doi: 10.1007/s00214-017-2109-8
doi: 10.1007/s00214-017-2109-8
[77]
Miranda-Quintana, R. A.; Ayers, P. W. Theor. Chem. Acc. 2016, 135, 172. doi: 10.1007/s00214-016-1924-7
doi: 10.1007/s00214-016-1924-7
[78]
Miranda-Quintana, R. A. Theor. Chem. Acc. 2016, 135, 189.doi: 10.1007/s00214-016-1945-2
doi: 10.1007/s00214-016-1945-2
[79]
Miranda-Quintana, R. A.; González, M. M.; Ayers, P. W. Phys. Chem. Chem. Phys. 2016, 18, 22235. doi: 10.1039/c6cp03213c
doi: 10.1039/c6cp03213c
[80]
Klopman, G. J. Am. Chem. Soc. 1968, 90, 223.doi: 10.1021/ja01004a002
doi: 10.1021/ja01004a002
[81]
Klopman, G.; Hudson, R. F. Theor. Chim. Act. 1967, 8, 165.doi: 10.1007/bf00526373
doi: 10.1007/bf00526373
[82]
Klopman, G.; Klopman, G. Chemical Reactivity and Reaction Paths; Wiley-Interscience: New York, NY, USA, 1974; p. 55.
[83]
Hudson, R. F.; Klopman, G. Tetrahedron Lett. 1967, 12, 1103.doi: 10.1016/S0040-4039(00)90645-2
doi: 10.1016/S0040-4039(00)90645-2
[84]
Salem, L. J. Am. Chem. Soc. 1968, 90, 553.doi: 10.1021/ja01005a002
doi: 10.1021/ja01005a002
[85]
Salem, L. J. Am. Chem. Soc. 1968, 90, 543.doi: 10.1021/ja01005a001
doi: 10.1021/ja01005a001
[86]
Salem, L. Chem. Br. 1969, 5, 449.
[87]
Witten, I. H.; Frank, E.; Hall, M. A.; Pal, C. J. Data Mining: Practical Machine Learning Tools and Techniques, 4th ed.; Elsevier: Cambridge, MA, USA, 2017.
[88]
Gertrudes, J. C.; Maltarollo, V. G.; Silva, R. A.; Oliveira, P. R.; Honorio, K. M.; da Silva, A. B. F. Curr. Med. Chem. 2012, 19, 4289. doi: 10.2174/092986712802884259
doi: 10.2174/092986712802884259
[89]
Carrera, G.; Gupta, S.; Aires-de-Sousa, J. J. Comput. Aided Mol. Des. 2009, 23, 419. doi: 10.1007/s10822-009-9275-2
doi: 10.1007/s10822-009-9275-2
[90]
Dietterich, T. G. Ai Mag. 1997, 18, 97. doi: 10.1609/aimag.v18i4.1324
doi: 10.1609/aimag.v18i4.1324
[91]
Carbonell, J. G.; Michalski, R. S.; Mitchell, T. M. Machine Learning: an Artificial Intelligence Approach; Michalski, R. S., Carbonell, J. G., Mitchell, T. M., Eds.; Springer Berlin Heidelberg: Berlin, Heidelberg, Germany, 1983; p. 3.doi: 10.1007/978-3-662-12405-5_1
doi: 10.1007/978-3-662-12405-5_1
[92]
Lino, A.; Rocha, A.; Sizo, A. J. Intell. Fuzzy Syst. 2016, 31, 2061. doi: 10.3233/JIFS-169045
doi: 10.3233/JIFS-169045
[93]
Affenzeller, M.; Winkler, S. M.; Kronberger, G.; Kommenda, M.; Burlacu, B.; Wagner, S. Genetic Programming Theory and Practice XI; Riolo, R., Moore, J. H., Kotanchek, M., Eds.; Springer New York: New York, NY, USA, 2014; p. 175.doi: 10.1007/978-1-4939-0375-7_10
doi: 10.1007/978-1-4939-0375-7_10
[94]
Billard, L.; Diday, E. Classification, Clustering, and Data Analysis: Recent Advances and Applications; Jajuga, K., Sokolowski, A., Bock, H. H., Eds.; Springer-Verlag: Berlin, Germany, 2012.
[95]
Billard, L.; Diday, E. J. Am. Stat. Assoc. 2003, 98, 470.doi: 10.1198/016214503000242
doi: 10.1198/016214503000242
[96]
Billard, L.; Diday, E. Data Analysis, Classification, and Related Methods; Kiers, H. A. L., Rasson, J.-P., Groenen, P. J. F., Schader, M., Eds.; Springer Berlin Heidelberg: Berlin, Heidelberg, Germany, 2000; p. 369. doi: 10.1007/978-3-642-59789-3_58
doi: 10.1007/978-3-642-59789-3_58
[97]
Schmidt, M. D.; Vallabhajosyula, R. R.; Jenkins, J. W.; Hood, J. E.; Soni, A. S.; Wikswo, J. P.; Lipson, H. Phys. Biol. 2011, 8, 055011. doi: 10.1088/1478-3975/8/5/055011
doi: 10.1088/1478-3975/8/5/055011
[98]
Schmidt, M. W.; Lipson, H. Science 2009, 324, 81.doi: 10.1126/science.1165893
doi: 10.1126/science.1165893
[99]
Bongard, J.; Lipson, H. Proc. Natl. Acad. Sci. USA 2007, 104, 9943.doi: 10.1073/pnas.0609476104
doi: 10.1073/pnas.0609476104
[100]
Quade, M.; Abel, M.; Shafi, K.; Niven, R. K. Phys. Rev. B 2016, 94, 012214. doi: 10.1103/PhysRevE.94.012214
doi: 10.1103/PhysRevE.94.012214
[101]
Koza, J. R. 2nd International IEEE Conference on Tools for Artificial Intelligence, 1990; p. 819.
[102]
Sharma, S.; Tambe, S. S. Biochem. Eng. J. 2014, 85, 89.doi: 10.1016/j.bej.2014.02.007
doi: 10.1016/j.bej.2014.02.007
[103]
Langdon, W. B.; Poli, R. Found. Genet. Program.; Springer-Verlag: Berlin, Germany, 2013.
[104]
O'Neil, M.; Ryan, C. Grammatical Evolution: Evolutionary Automatic Programming in an Arbitrary Language; Springer US: Boston, MA, USA, 2003; p. 33. doi: 10.1007/978-1-4615-0447-4_4
doi: 10.1007/978-1-4615-0447-4_4
[105]
O'Neil, M.; Ryan, C. Grammatical Evolution: Evolutionary Automatic Programming in an Arbitrary Language; Springer US: Boston, MA, USA, 2003; p. 79. doi: 10.1007/978-1-4615-0447-4_7
doi: 10.1007/978-1-4615-0447-4_7
[106]
Ryan, C.; Collins, J.; Neill, M. O. Genetic Programming: First European Workshop, EuroGP'98 Paris, France, April 14–15, 1998 Proceedings; Banzhaf, W., Poli, R., Schoenauer, M., Fogarty, T. C., Eds.; Springer: Berlin, Heidelberg, Germany, 1998;, p. 83.doi: 10.1007/BFb0055930
doi: 10.1007/BFb0055930
[107]
Luke, S.; Spector, L. Genetic Programming 1997: Proceedings of the Second Annual Conference (GP97); Koza, J., Ed.; San Francisco, CA, USA, 1997; p. 240.
[108]
Spears, W. M.; Anand, V. Methodologies for Intelligent Systems: 6th International Symposium, ISMIS '91 Charlotte, N. C., USA, October 16–19, 1991 Proceedings; Ras, Z. W., Zemankova, M., Eds.; Springer: Berlin, Heidelberg, Germany, 1991; p. 409.doi: 10.1007/3-540-54563-8_104
doi: 10.1007/3-540-54563-8_104
[109]
Chomsky, N. IRE Trans. Inf. Theory 1956, 2, 113. doi: 10.1109/TIT.1956.1056813
doi: 10.1109/TIT.1956.1056813
[110]
Ginsburg, S. The Mathematical Theory of Context Free Languages; McGraw-Hill, New York, NY, USA, 1966.
[111]
Temkin, J. M.; Gilder, M. R. Bioinformatics 2003, 19, 2046.doi: 10.1093/bioinformatics/btg279
doi: 10.1093/bioinformatics/btg279
Mayr, H. http://www.cup.lmu.de/oc/mayr/reaktionsdatenbank/; Vol. 2017 (accessed Oct 30, 2017).
[114]
Mayr, H. http://www.cup.uni-muenchen.de/oc/mayr/DBintro.html; Vol. 2017 (accessed Oct 30, 2017).
[115]
Frisch, M. J.; Trucks, G. W.; Schlegel, H. B.; Scuseria, G. E.; Robb, M. A.; Cheeseman, J. R.; Scalmani, G.; Barone, V.; Mennucci, B.; Petersson, G. A.; et al. Gaussian 09; Gaussian Inc.: Wallingford, CT, USA, 2009.
[116]
Wagner, S.; Kronberger, G.; Beham, A.; Kommenda, M.; Scheibenpflug, A.; Pitzer, E.; Vonolfen, S.; Kofler, M.; Winkler, S.; Dorfer, V.; et al. Advanced Methods and Applications in Computational Intelligence; Klempous, R., Nikodem, J., Jacak, W., Chaczko, Z., Eds.; Springer International Publishing: Heidelberg, Germany, 2014; p. 197. doi: 10.1007/978-3-319-01436-4_10
doi: 10.1007/978-3-319-01436-4_10
[117]
Yang, W. T.; Mortier, W. J. J. Am. Chem. Soc. 1986, 108, 5708.doi: 10.1021/ja00279a008
doi: 10.1021/ja00279a008
[118]
Miranda-Quintana, R. A. Chem. Phys. Lett. 2016, 658, 328.doi: 10.1016/j.cplett.2016.06.068
doi: 10.1016/j.cplett.2016.06.068
[119]
Zielinski, F.; Tognetti, V.; Joubert, L. Chem. Phys. Lett. 2012, 527, 67. doi: 10.1016/j.cplett.2012.01.011
doi: 10.1016/j.cplett.2012.01.011
[120]
Bultinck, P.; Fias, S.; Alsenoy, C. V.; Ayers, P. W.; Carbó-Dorca, R. J. Chem. Phys. 2007, 127, 034102. doi: 10.1063/1.2749518
doi: 10.1063/1.2749518
[1]
Ling Tang
, Yan Wan
, Yangming Lin
. Lowering the kinetic barrier via enhancing electrophilicity of surface oxygen to boost acidic oxygen evolution reaction. Chinese Journal of Structural Chemistry,
2024, 43(11): 100345-100345.
doi: 10.1016/j.cjsc.2024.100345
[2]
Lei Shi
. Nucleophilicity and Electrophilicity of Radicals. University Chemistry,
2024, 39(11): 131-135.
doi: 10.3866/PKU.DXHX202402018
Jing Cao
, Dezheng Zhang
, Bianqing Ren
, Ping Song
, Weilin Xu
. Mn incorporated RuO2 nanocrystals as an efficient and stable bifunctional electrocatalyst for oxygen evolution reaction and hydrogen evolution reaction in acid and alkaline. Chinese Chemical Letters,
2024, 35(10): 109863-.
doi: 10.1016/j.cclet.2024.109863
[11]
Yunfei Shen
, Long Chen
. Gradient imprinted Zn metal anodes assist dendrites-free at high current density/capacity. Chinese Journal of Structural Chemistry,
2024, 43(10): 100321-100321.
doi: 10.1016/j.cjsc.2024.100321
Uttam Pandurang Patil
. Porous carbon catalysis in sustainable synthesis of functional heterocycles: An overview. Chinese Chemical Letters,
2024, 35(8): 109472-.
doi: 10.1016/j.cclet.2023.109472