Application of genetic algorithms to the identification of hydrodynamic parameters

In this work, we adapt the Non-dominated Sorting Genetic Algorithm-II (NSGA-II), proposed by Deb et al. (2002) for multi-objective problems, to the solution of single-objective problems. Contrary to most genetic algorithms, we do not define a crossover probability. After applying our algorithm to test functions, we used it to identify hydrogeological parameters in a setting where, besides the permeability, the boundary values and the source term are assumed unknown. The direct problem was solved with the Galerkin finite element method in FreeFem++, and the genetic algorithm was programmed in Matlab. We then coupled the two codes to identify the parameters.

AMS Subject Classification: 35A15, 35R30, 65L09, 65L10, 65L12.


Introduction
For the numerical simulation of flow and of the transport of substances in the subsurface, knowledge of the boundary values, the source terms and certain physical parameters of the geological strata is essential. However, these quantities, which are generally determined experimentally, are difficult to obtain. In this work, we adapt the genetic algorithm Non-dominated Sorting Genetic Algorithm-II (NSGA-II), proposed by Deb et al. (2002) [11] for multi-objective optimization, to the solution of an unconstrained single-objective problem. We first tested it on examples, then used it to solve an inverse problem in hydrogeology based on actual data from the study site of the TRANSPOL II project (INERIS 2003) [1].

Presentation of the algorithm used
This algorithm maximizes a positive function f, called the fitness or evaluation function of the individual. The individuals represent the variables.

Coding and creation of the initial population
The real-valued coding used represents the actual values of the variable directly. We subdivided the admissible domain into several subdomains, and the initial population was created at random by drawing from the uniform distribution in each subdomain. This gives a diversified population from the start and accelerates convergence. The population has size n: we create a table of n variables.
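The initialization described above can be sketched as follows. This is only an illustration in Python, not the authors' Matlab code: the function name `initial_population`, the number of subintervals `n_sub`, and the choice to subdivide each coordinate separately and shuffle the assignment are all assumptions, since the paper does not say how the admissible domain was partitioned.

```python
import random


def initial_population(bounds, n, n_sub, seed=None):
    """Draw n real-coded individuals, spreading them over n_sub subintervals.

    bounds : list of (low, high) pairs, one per coordinate.
    Each coordinate range is cut into n_sub equal subintervals and every
    subinterval receives about n / n_sub uniform draws (an assumed scheme).
    """
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        width = (high - low) / n_sub
        # Assign subinterval indices round-robin, then shuffle the assignment
        # so that coordinates are not correlated with each other.
        indices = [k % n_sub for k in range(n)]
        rng.shuffle(indices)
        # Uniform draw inside the assigned subinterval.
        columns.append([low + s * width + rng.random() * width for s in indices])
    # Transpose: one list of coordinates per individual.
    return [[col[k] for col in columns] for k in range(n)]


population = initial_population([(-5.0, 5.0)] * 3, n=100, n_sub=4, seed=0)
print(len(population), len(population[0]))
```

Compared with a single uniform draw over the whole domain, this guarantees that no region of a coordinate range is left empty in the first generation.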

Selection operator
We used Goldberg's roulette-wheel selection [19]. Parents are selected according to their performance. In this method, the probability $p(x_i)$ with which an individual i, represented by a variable $x_i$ of fitness $f_i$ (the value of the function at $x_i$), is reintroduced into a new population of size n is:

$$p(x_i) = \frac{f_i}{\sum_{j=1}^{n} f_j}.$$
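A minimal sketch of roulette-wheel selection, assuming the usual rule in which individual i is drawn with probability f_i divided by the sum of all fitness values. The function name `roulette_select` and the use of a cumulative table with bisection are illustrative choices, not taken from the paper.

```python
import bisect
import random
from itertools import accumulate


def roulette_select(fitness, n, rng=random):
    """Return n indices, index i being drawn with probability f_i / sum(f).

    fitness must contain non-negative values with a positive sum.
    """
    total = sum(fitness)
    # cumulative[i] = p_0 + ... + p_i, the right edge of slot i on the wheel.
    cumulative = list(accumulate(f / total for f in fitness))
    selected = []
    for _ in range(n):
        r = rng.random()  # position of the ball on the wheel, in [0, 1)
        i = bisect.bisect_right(cumulative, r)
        # Guard against the last cumulative value rounding slightly below 1.
        selected.append(min(i, len(fitness) - 1))
    return selected


rng = random.Random(0)
picks = roulette_select([1.0, 0.0, 3.0], 4000, rng)
print(picks.count(2) / len(picks))  # close to 3 / 4
```

Python's standard `random.choices(population, weights=fitness, k=n)` performs the same weighted draw in one call; the explicit version is shown only to make the wheel visible.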

Crossover operator
Barycentric crossover is used, but without a crossover probability. In this kind of crossover, two genes $P_1(i)$ and $P_2(i)$ are selected, one from each parent, at the same position i. They define two new genes $C_1(i)$ and $C_2(i)$ by linear combination:

$$C_1(i) = \alpha P_1(i) + (1 - \alpha) P_2(i), \qquad C_2(i) = (1 - \alpha) P_1(i) + \alpha P_2(i),$$

where $\alpha$ is a weighting coefficient. In this work, we crossed the whole parent population to obtain a child population of size n.
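As an illustration, a barycentric crossover is commonly written C1 = a P1 + (1 - a) P2 and C2 = (1 - a) P1 + a P2. The sketch below assumes a single weight `a` drawn uniformly in [0, 1) for each pair of parents; the paper does not state how the weight is chosen, so that choice and the function name `barycentric_crossover` are assumptions.

```python
import random


def barycentric_crossover(parent1, parent2, rng=random):
    """Return two children that are complementary convex combinations.

    The weight is drawn once per pair (an assumed convention); every gene
    of a child then lies between the corresponding genes of the parents.
    """
    a = rng.random()
    child1 = [a * u + (1.0 - a) * v for u, v in zip(parent1, parent2)]
    child2 = [(1.0 - a) * u + a * v for u, v in zip(parent1, parent2)]
    return child1, child2


c1, c2 = barycentric_crossover([0.0, 2.0, -1.0], [4.0, 2.0, 3.0], random.Random(3))
print(c1, c2)
```

Because the two weights sum to one, the children stay inside the box spanned by their parents, and gene by gene the sum of the children equals the sum of the parents.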

Mutation operator
A Gaussian mutation is applied to the population. For each individual x, a random number p is drawn. If p is lower than the mutation probability $p_m$, Gaussian noise is added to x, i.e. x is replaced by $x + \varepsilon$, where $\varepsilon$ is a random value drawn from a Gaussian distribution. The newly created individual replaces the former one only if it is better and lies in the admissible domain.
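The mutation rule above, including its acceptance test, might look like this. The sketch assumes zero-mean noise with standard deviation `sigma` added independently to every coordinate, and box bounds for the admissible domain; neither detail is stated in the paper, and the name `gaussian_mutation` is illustrative.

```python
import random


def gaussian_mutation(x, fitness, bounds, p_m, sigma, rng=random):
    """Mutate x with probability p_m; keep the mutant only if it improves.

    x       : list of floats (one individual)
    fitness : callable to maximize
    bounds  : list of (low, high) pairs describing the admissible box
    """
    if rng.random() >= p_m:
        return x  # no mutation attempted for this individual
    candidate = [xi + rng.gauss(0.0, sigma) for xi in x]
    admissible = all(lo <= c <= hi for c, (lo, hi) in zip(candidate, bounds))
    # Accept only an admissible candidate that is strictly better.
    if admissible and fitness(candidate) > fitness(x):
        return candidate
    return x
```

With this acceptance test, mutation can never degrade an individual, which makes it behave like a small local search step rather than a pure random perturbation.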

New population
After the selection, crossover and mutation operations, an intermediate population R_t of size 2n is created by merging the parent and child populations. The new parent population is obtained by keeping the n best individuals. The resulting algorithm is:

Algorithm 1 Algorithm used
For each iteration t do
    Calculate the score of each individual of P_t
    Generate a new child population Q_t by applying the selection, crossover and mutation operators
    Form R_t = P_t ∪ Q_t
    Sort the individuals of R_t in decreasing order of score
    Keep the n best individuals of R_t to form the new parent population P_{t+1}
    t = t + 1 (increment the generation counter)
End for
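The generation loop can be put together in a few lines. The following is a compact sketch under several assumptions: a plain uniform initial population, one crossover weight per pair, an even population size, and illustrative default values for `p_m` and `sigma`. It is meant to show the elitist replacement step (merge parents and children, sort, keep the n best), not to reproduce the authors' code.

```python
import random


def evolve(fitness, bounds, n=100, iterations=100, p_m=0.1, sigma=0.5, seed=0):
    """Maximize a positive fitness with selection, crossover, mutation, elitism.

    n is assumed even so that crossing n/2 pairs yields exactly n children.
    """
    rng = random.Random(seed)
    parents = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n)]
    for _ in range(iterations):
        scores = [fitness(x) for x in parents]
        # Selection: roulette wheel, probability proportional to fitness.
        mates = rng.choices(parents, weights=scores, k=n)
        # Crossover: every selected pair is crossed (no crossover probability).
        children = []
        for p1, p2 in zip(mates[0::2], mates[1::2]):
            a = rng.random()
            children.append([a * u + (1 - a) * v for u, v in zip(p1, p2)])
            children.append([(1 - a) * u + a * v for u, v in zip(p1, p2)])
        # Mutation: Gaussian noise, kept only if admissible and better.
        for k, child in enumerate(children):
            if rng.random() < p_m:
                cand = [c + rng.gauss(0.0, sigma) for c in child]
                ok = all(lo <= c <= hi for c, (lo, hi) in zip(cand, bounds))
                if ok and fitness(cand) > fitness(child):
                    children[k] = cand
        # Elitist replacement: merge (size 2n), sort by score, keep the n best.
        merged = parents + children
        merged.sort(key=fitness, reverse=True)
        parents = merged[:n]
    return max(parents, key=fitness)


def demo_fitness(x):
    """Positive test fitness with its maximum, 1, at (1, 1)."""
    return 1.0 / (1.0 + sum((xi - 1.0) ** 2 for xi in x))


best = evolve(demo_fitness, [(-5.0, 5.0)] * 2, n=40, iterations=50, seed=1)
print(best, demo_fitness(best))
```

Because the best individual of the merged population is always kept, the best score can never decrease from one generation to the next.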

Application to the test functions
Before using our algorithm for the identification of the parameters in hydrogeology, we tested it on classical functions.

Rosenbrock function
The Rosenbrock function is a unimodal, nonconvex function of n variables, used as a test problem in mathematical optimization. It is defined by:

$$f(x) = \sum_{i=1}^{n-1} \left[ 100 \left( x_{i+1} - x_i^2 \right)^2 + \left( 1 - x_i \right)^2 \right].$$

In the literature, this function is regarded as a difficult problem because of the nonlinear interaction between the variables [20]. The global minimum is attained at the point (1, . . . , 1), where the function is equal to 0.
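For reference, the standard Rosenbrock function is easy to evaluate directly. Since the algorithm maximizes a positive fitness while this test function is to be minimized, some conversion is needed; the paper does not say which one was used, so `fitness_from_cost` below is a hypothetical choice shown only to make the point concrete.

```python
def rosenbrock(x):
    """Standard Rosenbrock function for a point x of dimension n >= 2."""
    return sum(
        100.0 * (x[i + 1] - x[i] ** 2) ** 2 + (1.0 - x[i]) ** 2
        for i in range(len(x) - 1)
    )


def fitness_from_cost(cost_value):
    """Hypothetical conversion (not from the paper): cost >= 0 -> fitness in (0, 1]."""
    return 1.0 / (1.0 + cost_value)


print(rosenbrock([1.0, 1.0, 1.0]))  # 0.0, the global minimum
print(rosenbrock([0.0, 0.0, 0.0]))  # 2.0
print(fitness_from_cost(rosenbrock([1.0, 1.0, 1.0])))  # 1.0
```

Any strictly decreasing positive transformation of the cost would serve the same purpose, since it turns the global minimum of the cost into the global maximum of the fitness.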

Rastrigin function
It is a strongly multimodal function of n variables, defined by:

$$f(x) = 10\,n + \sum_{i=1}^{n} \left[ x_i^2 - 10 \cos(2 \pi x_i) \right].$$

The local minima are regularly distributed. The global minimum is at the origin, where the function is equal to zero.
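A short sketch of the standard Rastrigin function shows the regular pattern of minima: the value is zero at the origin, small near every other point with integer coordinates, and large halfway between them.

```python
import math


def rastrigin(x):
    """Standard Rastrigin function for a point x of any dimension."""
    return 10.0 * len(x) + sum(
        xi ** 2 - 10.0 * math.cos(2.0 * math.pi * xi) for xi in x
    )


print(rastrigin([0.0, 0.0, 0.0]))  # 0.0, the global minimum
print(rastrigin([1.0, 1.0, 1.0]))  # about 3, close to a local minimum
print(rastrigin([0.5, 0.5, 0.5]))  # about 60.75, on a ridge between minima
```

This is why the function is a useful test of an algorithm's ability to escape local minima.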

Test results
In all the examples, we used the same parameters, given in the table below:

p_m        σ      Population size    Number of iterations
0.00001    0.5    100                100

We applied our algorithm to these functions in dimension three. We then compared our results with those obtained by the genetic algorithm provided by Matlab (the function ga). For each test function, we carried out five simulations. The tables below give the optimization results. The results obtained with our code are better than those obtained with the Matlab function ga, but our code is slower.

The site in question is 500 meters long in the direction of groundwater flow (SN) and 300 meters wide. The unsaturated zone is 6 meters thick.

Boundary conditions
The map in Figure 1 shows the domain to be modeled. The river flows from southwest to northeast. The upstream boundary coincides with the river and is assigned a hydraulic head that is constant in time. The downstream boundary corresponds to an imaginary line perpendicular to the stream, on which a constant head is imposed. The other lateral boundaries and the lower boundary are no-flow boundaries.

Mathematical model of the flow
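The fluid is assumed incompressible and single-phase, the porous medium Ω is saturated, and the flow is steady. The law of conservation of mass and Darcy's law [13] applied to our site give:

$$\begin{cases} u = -K \nabla p & \text{in } \Omega, \\ \operatorname{div} u = f & \text{in } \Omega, \\ p = d & \text{on } \Gamma_0, \\ u \cdot n = 0 & \text{on } \partial\Omega \setminus \Gamma_0, \end{cases} \qquad (1)$$

where:
• u(x): Darcy velocity,
• p(x): hydraulic potential,
• f: source term,
• K: the hydraulic conductivity (constant),
• Ω: the domain,
• Γ 0 : upstream and downstream boundaries, where the hydraulic potential takes the constant values d,
• n: the unit outward normal to the boundary of Ω.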

Direct problem solving
Theorem 1 Under the flow assumptions above, problem (1) admits one and only one solution.
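Proof. Eliminating u, problem (1) is equivalent to:

$$-\operatorname{div}(K \nabla p) = f \ \text{in } \Omega, \qquad p = d \ \text{on } \Gamma_0, \qquad K \frac{\partial p}{\partial n} = 0 \ \text{on } \partial\Omega \setminus \Gamma_0. \qquad (2)$$

Let V be the space defined by

$$V = \{ v \in H^1(\Omega) : v = 0 \text{ on } \Gamma_0 \}.$$

Since mes(Γ 0 ) > 0, according to [21] we can choose on V the norm

$$\| v \|_V = \left( \int_\Omega |\nabla v|^2 \, dx \right)^{1/2}.$$

Let v ∈ V be a test function. Multiplying (2) by v and integrating by parts, the variational formulation associated with problem (2) is: find p ∈ H 1 (Ω) such that p = d on Γ 0 and such that

$$\int_\Omega K \nabla p \cdot \nabla v \, dx = \int_\Omega f v \, dx \qquad \forall v \in V.$$

We denote by γ 0 the trace operator. Let r d ∈ H 1 (Ω) be such that γ 0 (r d ) = d, and set p 0 = p − r d . The variational formulation becomes: find p 0 ∈ V such that

$$\int_\Omega K \nabla p_0 \cdot \nabla v \, dx = \int_\Omega f v \, dx - \int_\Omega K \nabla r_d \cdot \nabla v \, dx \qquad \forall v \in V, \qquad (4)$$

with f ∈ L 2 (Ω). Let the bilinear form α : V × V −→ R be defined by:

$$\alpha(p_0, v) = \int_\Omega K \nabla p_0 \cdot \nabla v \, dx.$$

Let the linear form L : V −→ R be defined by:

$$L(v) = \int_\Omega f v \, dx - \int_\Omega K \nabla r_d \cdot \nabla v \, dx.$$

The space V is a Hilbert space for the norm $\| \cdot \|_V$. The bilinear form α is continuous and coercive (K being a positive constant), and the linear form L is also continuous. The Lax-Milgram theorem [15] therefore ensures the existence and uniqueness of a solution to the variational problem (4), and consequently the existence and uniqueness of a solution of (2).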
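Let T h be a triangulation of Ω. Let P 1 denote the space of continuous, piecewise affine functions on Ω, i.e. the space of continuous functions that are affine in (x, y) on each triangle of T h . We set V h = P 1 ∩ V . V h is a finite-dimensional vector space. We denote by N its dimension and by φ 1 , . . . , φ N a basis. The approximate problem is: find $p_{0h} \in V_h$ such that

$$\alpha(p_{0h}, v_h) = L(v_h) \qquad \forall v_h \in V_h. \qquad (6)$$

Let $p_{0h} = \sum_{j=1}^{N} p_j \varphi_j$ and take v h = φ i for i = 1, . . . , N ; equation (6) is equivalent to

$$\sum_{j=1}^{N} \alpha(\varphi_j, \varphi_i) \, p_j = L(\varphi_i), \qquad i = 1, \dots, N.$$

This gives the system Ax = b, where:

$$A_{ij} = \alpha(\varphi_j, \varphi_i) = \int_\Omega K \nabla \varphi_j \cdot \nabla \varphi_i \, dx, \qquad b_i = L(\varphi_i), \qquad x = (p_1, \dots, p_N)^T.$$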

Data
The locations of the monitoring points are given in Figure 1, and the observed values are given in the table below.

Parametric identification problem
We suppose that the permeability K is constant and unknown. We also assume that the Dirichlet boundary conditions (three values) and the value of the source term are unknown. The problem is to find the values of these constants that minimize J, defined by

$$J = \frac{1}{2} \sum_{i=1}^{nobs} \left( p_s(x_i, y_i) - p_{ob}(x_i, y_i) \right)^2,$$

where:
• p s (x i , y i ) is the simulated pressure head at the point (x i , y i ),
• p ob (x i , y i ) is the pressure head observed at the point (x i , y i ),
• nobs is the number of observations.
We use the genetic algorithm to minimize J.
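The misfit J is a half sum of squared differences between simulated and observed heads (the factor 1/2 is the one that appears in Algorithm 2). A minimal sketch, in which the function name `cost` and the sample head values are purely illustrative and not data from the site:

```python
def cost(simulated, observed):
    """Least-squares misfit J = 1/2 * sum((p_s - p_ob)^2) over the observations."""
    if len(simulated) != len(observed):
        raise ValueError("simulated and observed must have the same length")
    return 0.5 * sum((s - o) ** 2 for s, o in zip(simulated, observed))


# Illustrative values only (not measurements from the study site).
print(cost([1.0, 2.0], [1.0, 4.0]))  # 2.0
```

J is zero only when the simulation reproduces every observation exactly, so a small J indicates parameter values consistent with the measurements.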

Algorithm for computing the cost function
Since the measurement points (x i , y i ) do not necessarily coincide with the discretization points where the solution is computed, the simulated value at (x i , y i ) must be approximated. The points P 17 and P 30 were not used in the identification procedure; they serve as test points.
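For piecewise affine (P1) finite elements, one common way to evaluate the solution at an arbitrary point is to locate the triangle containing it and combine the three vertex values with barycentric coordinates. The sketch below assumes a mesh given as plain Python lists and uses a brute-force search over the triangles; the names `barycentric_coords` and `interpolate_p1` are illustrative, and in the paper this step is carried out inside FreeFem++.

```python
def barycentric_coords(tri, x, y):
    """Barycentric coordinates of (x, y) in the triangle tri = [(x1, y1), (x2, y2), (x3, y3)]."""
    (x1, y1), (x2, y2), (x3, y3) = tri
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    l1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    l2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    return l1, l2, 1.0 - l1 - l2


def interpolate_p1(vertices, triangles, values, x, y, tol=1e-12):
    """Evaluate a P1 function, given by its vertex values, at the point (x, y)."""
    for tri in triangles:
        lam = barycentric_coords([vertices[i] for i in tri], x, y)
        # The point is inside the triangle when no coordinate is negative.
        if all(l >= -tol for l in lam):
            return sum(l * values[i] for l, i in zip(lam, tri))
    raise ValueError("point lies outside the mesh")


# Unit square split into two triangles, carrying the affine field 2x + 3y + 1.
vertices = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
triangles = [(0, 1, 2), (0, 2, 3)]
values = [1.0, 3.0, 6.0, 4.0]
print(interpolate_p1(vertices, triangles, values, 0.75, 0.25))  # 3.25
```

P1 interpolation reproduces affine fields exactly, which gives a simple check of the routine before it is applied to a computed solution.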
Algorithm 2 Algorithm to calculate the cost function
J ←− 0
for i = 1 to nobs − 2 do
    Determine the triangle T i such that (x i , y i ) ∈ T i
    Determine the approximate solution P i (x, y) for (x, y) ∈ T i
    J ←− J + 1/2 (P i (x i , y i ) − P ob (x i , y i ))^2
endfor

Inverse problem solving: coupling FreeFem++/Matlab
We programmed this algorithm using FreeFem++ and Matlab. The cost function, which uses the solution of the direct problem, is calculated in FreeFem++. The communication between FreeFem++ and Matlab is made through two files. At each iteration, the Matlab program creates a file containing the new population (population.dat), and FreeFem++ creates a file containing the scores of the individuals (scores.dat). Thus, at each iteration, the FreeFem++ program calculates the scores of the individuals by solving the direct problem, taking as data the parameter values stored in population.dat. The Matlab program then uses the file scores.dat for the genetic algorithm optimization.

Value of the cost function J: 0.734

Simulated groundwater level
Groundwater levels simulated with the identified values are given in the table below.

Conclusion
In this work, we adapted a genetic algorithm that optimizes positive functions effectively. Tests carried out on selected functions showed that the algorithm is effective at finding the global optimum.