The source file systems_of_equations_ex3.C with comments:
#include <iostream>
#include <algorithm>
#include <sstream>
#include <math.h>
Basic include file needed for the mesh functionality.
#include "libmesh/libmesh.h"
#include "libmesh/mesh.h"
#include "libmesh/mesh_generation.h"
#include "libmesh/exodusII_io.h"
#include "libmesh/equation_systems.h"
#include "libmesh/fe.h"
#include "libmesh/quadrature_gauss.h"
#include "libmesh/dof_map.h"
#include "libmesh/sparse_matrix.h"
#include "libmesh/numeric_vector.h"
#include "libmesh/dense_matrix.h"
#include "libmesh/dense_vector.h"
#include "libmesh/linear_implicit_system.h"
#include "libmesh/transient_system.h"
#include "libmesh/perf_log.h"
#include "libmesh/boundary_info.h"
#include "libmesh/utility.h"
For systems of equations the \p DenseSubMatrix
and \p DenseSubVector provide convenient ways for
assembling the element matrix and vector on a
component-by-component basis.
#include "libmesh/dense_submatrix.h"
#include "libmesh/dense_subvector.h"
The definition of a geometric element
#include "libmesh/elem.h"
Bring in everything from the libMesh namespace
using namespace libMesh;
Function prototype. This function will assemble the system
matrix and right-hand-side.
void assemble_stokes (EquationSystems& es,
const std::string& system_name);
The main program.
int main (int argc, char** argv)
{
Initialize libMesh.
LibMeshInit init (argc, argv);
Skip this 2D example if libMesh was compiled as 1D-only.
libmesh_example_assert(2 <= LIBMESH_DIM, "2D support");
Trilinos solver NaNs by default on the zero pressure block.
We'll skip this example for now.
if (libMesh::default_solver_package() == TRILINOS_SOLVERS)
{
std::cout << "We skip example 13 when using the Trilinos solvers.\n"
<< std::endl;
return 0;
}
Create a mesh.
Mesh mesh;
Use the MeshTools::Generation mesh generator to create a uniform
2D grid on the square [-1,1]^2. We instruct the mesh generator
to build a mesh of 8x8 \p Quad9 elements in 2D. Building these
higher-order elements allows us to use higher-order
approximation, as in example 3.
MeshTools::Generation::build_square (mesh,
20, 20,
0., 1.,
0., 1.,
QUAD9);
Print information about the mesh to the screen.
mesh.print_info();
Create an equation systems object.
EquationSystems equation_systems (mesh);
Declare the system and its variables.
Creates a transient system named "Navier-Stokes"
TransientLinearImplicitSystem & system =
equation_systems.add_system<TransientLinearImplicitSystem> ("Navier-Stokes");
Add the variables "u" & "v" to "Navier-Stokes". They
will be approximated using second-order approximation.
system.add_variable ("u", SECOND);
system.add_variable ("v", SECOND);
Add the variable "p" to "Navier-Stokes". This will
be approximated with a first-order basis,
providing an LBB-stable pressure-velocity pair.
system.add_variable ("p", FIRST);
Add a scalar Lagrange multiplier to constrain the
pressure to have zero mean.
system.add_variable ("alpha", FIRST, SCALAR);
Give the system a pointer to the matrix assembly
function.
system.attach_assemble_function (assemble_stokes);
Initialize the data structures for the equation system.
equation_systems.init ();
Prints information about the system to the screen.
equation_systems.print_info();
Create a performance-logging object for this example
PerfLog perf_log("Systems Example 3");
Get a reference to the Stokes system to use later.
TransientLinearImplicitSystem& navier_stokes_system =
equation_systems.get_system<TransientLinearImplicitSystem>("Navier-Stokes");
Now we begin the timestep loop to compute the time-accurate
solution of the equations.
const Real dt = 0.01;
navier_stokes_system.time = 0.0;
const unsigned int n_timesteps = 15;
The number of steps and the stopping criterion are also required
for the nonlinear iterations.
const unsigned int n_nonlinear_steps = 15;
const Real nonlinear_tolerance = 1.e-3;
We also set a standard linear solver flag in the EquationSystems object
which controls the maxiumum number of linear solver iterations allowed.
equation_systems.parameters.set<unsigned int>("linear solver maximum iterations") = 250;
Tell the system of equations what the timestep is by using
the set_parameter function. The matrix assembly routine can
then reference this parameter.
equation_systems.parameters.set<Real> ("dt") = dt;
The first thing to do is to get a copy of the solution at
the current nonlinear iteration. This value will be used to
determine if we can exit the nonlinear loop.
AutoPtr<NumericVector<Number> >
last_nonlinear_soln (navier_stokes_system.solution->clone());
for (unsigned int t_step=0; t_step<n_timesteps; ++t_step)
{
Incremenet the time counter, set the time and the
time step size as parameters in the EquationSystem.
navier_stokes_system.time += dt;
A pretty update message
std::cout << "\n\n*** Solving time step " << t_step <<
", time = " << navier_stokes_system.time <<
" ***" << std::endl;
Now we need to update the solution vector from the
previous time step. This is done directly through
the reference to the Stokes system.
*navier_stokes_system.old_local_solution = *navier_stokes_system.current_local_solution;
At the beginning of each solve, reset the linear solver tolerance
to a "reasonable" starting value.
const Real initial_linear_solver_tol = 1.e-6;
equation_systems.parameters.set<Real> ("linear solver tolerance") = initial_linear_solver_tol;
Now we begin the nonlinear loop
for (unsigned int l=0; l<n_nonlinear_steps; ++l)
{
Update the nonlinear solution.
last_nonlinear_soln->zero();
last_nonlinear_soln->add(*navier_stokes_system.solution);
Assemble & solve the linear system.
perf_log.push("linear solve");
equation_systems.get_system("Navier-Stokes").solve();
perf_log.pop("linear solve");
Compute the difference between this solution and the last
nonlinear iterate.
last_nonlinear_soln->add (-1., *navier_stokes_system.solution);
Close the vector before computing its norm
last_nonlinear_soln->close();
Compute the l2 norm of the difference
const Real norm_delta = last_nonlinear_soln->l2_norm();
How many iterations were required to solve the linear system?
const unsigned int n_linear_iterations = navier_stokes_system.n_linear_iterations();
What was the final residual of the linear system?
const Real final_linear_residual = navier_stokes_system.final_linear_residual();
Print out convergence information for the linear and
nonlinear iterations.
std::cout << "Linear solver converged at step: "
<< n_linear_iterations
<< ", final residual: "
<< final_linear_residual
<< " Nonlinear convergence: ||u - u_old|| = "
<< norm_delta
<< std::endl;
Terminate the solution iteration if the difference between
this nonlinear iterate and the last is sufficiently small, AND
if the most recent linear system was solved to a sufficient tolerance.
if ((norm_delta < nonlinear_tolerance) &&
(navier_stokes_system.final_linear_residual() < nonlinear_tolerance))
{
std::cout << " Nonlinear solver converged at step "
<< l
<< std::endl;
break;
}
Otherwise, decrease the linear system tolerance. For the inexact Newton
method, the linear solver tolerance needs to decrease as we get closer to
the solution to ensure quadratic convergence. The new linear solver tolerance
is chosen (heuristically) as the square of the previous linear system residual norm.
Real flr2 = final_linear_residual*final_linear_residual;
equation_systems.parameters.set<Real> ("linear solver tolerance") =
std::min(Utility::pow<2>(final_linear_residual), initial_linear_solver_tol);
} // end nonlinear loop
Write out every nth timestep to file.
const unsigned int write_interval = 1;
#ifdef LIBMESH_HAVE_EXODUS_API
if ((t_step+1)%write_interval == 0)
{
std::ostringstream file_name;
We write the file in the ExodusII format.
file_name << "out_"
<< std::setw(3)
<< std::setfill('0')
<< std::right
<< t_step + 1
<< ".e";
ExodusII_IO(mesh).write_equation_systems (file_name.str(),
equation_systems);
}
#endif // #ifdef LIBMESH_HAVE_EXODUS_API
} // end timestep loop.
All done.
return 0;
}
The matrix assembly function to be called at each time step to
prepare for the linear solve.
void assemble_stokes (EquationSystems& es,
const std::string& system_name)
{
It is a good idea to make sure we are assembling
the proper system.
libmesh_assert_equal_to (system_name, "Navier-Stokes");
Get a constant reference to the mesh object.
const MeshBase& mesh = es.get_mesh();
The dimension that we are running
const unsigned int dim = mesh.mesh_dimension();
Get a reference to the Stokes system object.
TransientLinearImplicitSystem & navier_stokes_system =
es.get_system<TransientLinearImplicitSystem> ("Navier-Stokes");
Numeric ids corresponding to each variable in the system
const unsigned int u_var = navier_stokes_system.variable_number ("u");
const unsigned int v_var = navier_stokes_system.variable_number ("v");
const unsigned int p_var = navier_stokes_system.variable_number ("p");
const unsigned int alpha_var = navier_stokes_system.variable_number ("alpha");
Get the Finite Element type for "u". Note this will be
the same as the type for "v".
FEType fe_vel_type = navier_stokes_system.variable_type(u_var);
Get the Finite Element type for "p".
FEType fe_pres_type = navier_stokes_system.variable_type(p_var);
Build a Finite Element object of the specified type for
the velocity variables.
AutoPtr<FEBase> fe_vel (FEBase::build(dim, fe_vel_type));
Build a Finite Element object of the specified type for
the pressure variables.
AutoPtr<FEBase> fe_pres (FEBase::build(dim, fe_pres_type));
A Gauss quadrature rule for numerical integration.
Let the \p FEType object decide what order rule is appropriate.
QGauss qrule (dim, fe_vel_type.default_quadrature_order());
Tell the finite element objects to use our quadrature rule.
fe_vel->attach_quadrature_rule (&qrule);
fe_pres->attach_quadrature_rule (&qrule);
Here we define some references to cell-specific data that
will be used to assemble the linear system.
The element Jacobian * quadrature weight at each integration point.
The element Jacobian * quadrature weight at each integration point.
const std::vector<Real>& JxW = fe_vel->get_JxW();
The element shape functions evaluated at the quadrature points.
const std::vector<std::vector<Real> >& phi = fe_vel->get_phi();
The element shape function gradients for the velocity
variables evaluated at the quadrature points.
const std::vector<std::vector<RealGradient> >& dphi = fe_vel->get_dphi();
The element shape functions for the pressure variable
evaluated at the quadrature points.
const std::vector<std::vector<Real> >& psi = fe_pres->get_phi();
The value of the linear shape function gradients at the quadrature points
const std::vector >& dpsi = fe_pres->get_dphi();
A reference to the \p DofMap object for this system. The \p DofMap object handles the index translation from node and element numbers to degree of freedom numbers. We will talk more about the \p DofMap in future examples.
A reference to the \p DofMap object for this system. The \p DofMap object handles the index translation from node and element numbers to degree of freedom numbers. We will talk more about the \p DofMap in future examples.
const DofMap & dof_map = navier_stokes_system.get_dof_map();
Define data structures to contain the element matrix
and right-hand-side vector contribution. Following
basic finite element terminology we will denote these
"Ke" and "Fe".
DenseMatrix<Number> Ke;
DenseVector<Number> Fe;
DenseSubMatrix<Number>
Kuu(Ke), Kuv(Ke), Kup(Ke),
Kvu(Ke), Kvv(Ke), Kvp(Ke),
Kpu(Ke), Kpv(Ke), Kpp(Ke);
DenseSubMatrix<Number> Kalpha_p(Ke), Kp_alpha(Ke);
DenseSubVector<Number>
Fu(Fe),
Fv(Fe),
Fp(Fe);
This vector will hold the degree of freedom indices for
the element. These define where in the global system
the element degrees of freedom get mapped.
std::vector<dof_id_type> dof_indices;
std::vector<dof_id_type> dof_indices_u;
std::vector<dof_id_type> dof_indices_v;
std::vector<dof_id_type> dof_indices_p;
std::vector<dof_id_type> dof_indices_alpha;
Find out what the timestep size parameter is from the system, and
the value of theta for the theta method. We use implicit Euler (theta=1)
for this simulation even though it is only first-order accurate in time.
The reason for this decision is that the second-order Crank-Nicolson
method is notoriously oscillatory for problems with discontinuous
initial data such as the lid-driven cavity. Therefore,
we sacrifice accuracy in time for stability, but since the solution
reaches steady state relatively quickly we can afford to take small
timesteps. If you monitor the initial nonlinear residual for this
simulation, you should see that it is monotonically decreasing in time.
const Real dt = es.parameters.get<Real>("dt");
const Real time = es.parameters.get("time");
const Real theta = 1.;
Now we will loop over all the elements in the mesh that
live on the local processor. We will compute the element
matrix and right-hand-side contribution. Since the mesh
will be refined we want to only consider the ACTIVE elements,
hence we use a variant of the \p active_elem_iterator.
MeshBase::const_element_iterator el = mesh.active_local_elements_begin();
const MeshBase::const_element_iterator end_el = mesh.active_local_elements_end();
for ( ; el != end_el; ++el)
{
Store a pointer to the element we are currently
working on. This allows for nicer syntax later.
const Elem* elem = *el;
Get the degree of freedom indices for the
current element. These define where in the global
matrix and right-hand-side this element will
contribute to.
dof_map.dof_indices (elem, dof_indices);
dof_map.dof_indices (elem, dof_indices_u, u_var);
dof_map.dof_indices (elem, dof_indices_v, v_var);
dof_map.dof_indices (elem, dof_indices_p, p_var);
dof_map.dof_indices (elem, dof_indices_alpha, alpha_var);
const unsigned int n_dofs = dof_indices.size();
const unsigned int n_u_dofs = dof_indices_u.size();
const unsigned int n_v_dofs = dof_indices_v.size();
const unsigned int n_p_dofs = dof_indices_p.size();
Compute the element-specific data for the current
element. This involves computing the location of the
quadrature points (q_point) and the shape functions
(phi, dphi) for the current element.
fe_vel->reinit (elem);
fe_pres->reinit (elem);
Zero the element matrix and right-hand side before
summing them. We use the resize member here because
the number of degrees of freedom might have changed from
the last element. Note that this will be the case if the
element type is different (i.e. the last element was a
triangle, now we are on a quadrilateral).
Ke.resize (n_dofs, n_dofs);
Fe.resize (n_dofs);
Reposition the submatrices... The idea is this:
- - - - | Kuu Kuv Kup | | Fu | Ke = | Kvu Kvv Kvp |; Fe = | Fv | | Kpu Kpv Kpp | | Fp | - - - -
The \p DenseSubMatrix.repostition () member takes the (row_offset, column_offset, row_size, column_size).
Similarly, the \p DenseSubVector.reposition () member takes the (row_offset, row_size)
- - - - | Kuu Kuv Kup | | Fu | Ke = | Kvu Kvv Kvp |; Fe = | Fv | | Kpu Kpv Kpp | | Fp | - - - -
The \p DenseSubMatrix.repostition () member takes the (row_offset, column_offset, row_size, column_size).
Similarly, the \p DenseSubVector.reposition () member takes the (row_offset, row_size)
Kuu.reposition (u_var*n_u_dofs, u_var*n_u_dofs, n_u_dofs, n_u_dofs);
Kuv.reposition (u_var*n_u_dofs, v_var*n_u_dofs, n_u_dofs, n_v_dofs);
Kup.reposition (u_var*n_u_dofs, p_var*n_u_dofs, n_u_dofs, n_p_dofs);
Kvu.reposition (v_var*n_v_dofs, u_var*n_v_dofs, n_v_dofs, n_u_dofs);
Kvv.reposition (v_var*n_v_dofs, v_var*n_v_dofs, n_v_dofs, n_v_dofs);
Kvp.reposition (v_var*n_v_dofs, p_var*n_v_dofs, n_v_dofs, n_p_dofs);
Kpu.reposition (p_var*n_u_dofs, u_var*n_u_dofs, n_p_dofs, n_u_dofs);
Kpv.reposition (p_var*n_u_dofs, v_var*n_u_dofs, n_p_dofs, n_v_dofs);
Kpp.reposition (p_var*n_u_dofs, p_var*n_u_dofs, n_p_dofs, n_p_dofs);
Also, add a row and a column to constrain the pressure
Kp_alpha.reposition (p_var*n_u_dofs, p_var*n_u_dofs+n_p_dofs, n_p_dofs, 1);
Kalpha_p.reposition (p_var*n_u_dofs+n_p_dofs, p_var*n_u_dofs, 1, n_p_dofs);
Fu.reposition (u_var*n_u_dofs, n_u_dofs);
Fv.reposition (v_var*n_u_dofs, n_v_dofs);
Fp.reposition (p_var*n_u_dofs, n_p_dofs);
Now we will build the element matrix and right-hand-side.
Constructing the RHS requires the solution and its
gradient from the previous timestep. This must be
calculated at each quadrature point by summing the
solution degree-of-freedom values by the appropriate
weight functions.
for (unsigned int qp=0; qp<qrule.n_points(); qp++)
{
Values to hold the solution & its gradient at the previous timestep.
Number u = 0., u_old = 0.;
Number v = 0., v_old = 0.;
Number p_old = 0.;
Gradient grad_u, grad_u_old;
Gradient grad_v, grad_v_old;
Compute the velocity & its gradient from the previous timestep
and the old Newton iterate.
for (unsigned int l=0; l<n_u_dofs; l++)
{
From the old timestep:
u_old += phi[l][qp]*navier_stokes_system.old_solution (dof_indices_u[l]);
v_old += phi[l][qp]*navier_stokes_system.old_solution (dof_indices_v[l]);
grad_u_old.add_scaled (dphi[l][qp],navier_stokes_system.old_solution (dof_indices_u[l]));
grad_v_old.add_scaled (dphi[l][qp],navier_stokes_system.old_solution (dof_indices_v[l]));
From the previous Newton iterate:
u += phi[l][qp]*navier_stokes_system.current_solution (dof_indices_u[l]);
v += phi[l][qp]*navier_stokes_system.current_solution (dof_indices_v[l]);
grad_u.add_scaled (dphi[l][qp],navier_stokes_system.current_solution (dof_indices_u[l]));
grad_v.add_scaled (dphi[l][qp],navier_stokes_system.current_solution (dof_indices_v[l]));
}
Compute the old pressure value at this quadrature point.
for (unsigned int l=0; l<n_p_dofs; l++)
{
p_old += psi[l][qp]*navier_stokes_system.old_solution (dof_indices_p[l]);
}
Definitions for convenience. It is sometimes simpler to do a
dot product if you have the full vector at your disposal.
const NumberVectorValue U_old (u_old, v_old);
const NumberVectorValue U (u, v);
const Number u_x = grad_u(0);
const Number u_y = grad_u(1);
const Number v_x = grad_v(0);
const Number v_y = grad_v(1);
First, an i-loop over the velocity degrees of freedom.
We know that n_u_dofs == n_v_dofs so we can compute contributions
for both at the same time.
for (unsigned int i=0; i<n_u_dofs; i++)
{
Fu(i) += JxW[qp]*(u_old*phi[i][qp] - // mass-matrix term
(1.-theta)*dt*(U_old*grad_u_old)*phi[i][qp] + // convection term
(1.-theta)*dt*p_old*dphi[i][qp](0) - // pressure term on rhs
(1.-theta)*dt*(grad_u_old*dphi[i][qp]) + // diffusion term on rhs
theta*dt*(U*grad_u)*phi[i][qp]); // Newton term
Fv(i) += JxW[qp]*(v_old*phi[i][qp] - // mass-matrix term
(1.-theta)*dt*(U_old*grad_v_old)*phi[i][qp] + // convection term
(1.-theta)*dt*p_old*dphi[i][qp](1) - // pressure term on rhs
(1.-theta)*dt*(grad_v_old*dphi[i][qp]) + // diffusion term on rhs
theta*dt*(U*grad_v)*phi[i][qp]); // Newton term
Note that the Fp block is identically zero unless we are using
some kind of artificial compressibility scheme...
Matrix contributions for the uu and vv couplings.
Matrix contributions for the uu and vv couplings.
for (unsigned int j=0; j<n_u_dofs; j++)
{
Kuu(i,j) += JxW[qp]*(phi[i][qp]*phi[j][qp] + // mass matrix term
theta*dt*(dphi[i][qp]*dphi[j][qp]) + // diffusion term
theta*dt*(U*dphi[j][qp])*phi[i][qp] + // convection term
theta*dt*u_x*phi[i][qp]*phi[j][qp]); // Newton term
Kuv(i,j) += JxW[qp]*theta*dt*u_y*phi[i][qp]*phi[j][qp]; // Newton term
Kvv(i,j) += JxW[qp]*(phi[i][qp]*phi[j][qp] + // mass matrix term
theta*dt*(dphi[i][qp]*dphi[j][qp]) + // diffusion term
theta*dt*(U*dphi[j][qp])*phi[i][qp] + // convection term
theta*dt*v_y*phi[i][qp]*phi[j][qp]); // Newton term
Kvu(i,j) += JxW[qp]*theta*dt*v_x*phi[i][qp]*phi[j][qp]; // Newton term
}
Matrix contributions for the up and vp couplings.
for (unsigned int j=0; j<n_p_dofs; j++)
{
Kup(i,j) += JxW[qp]*(-theta*dt*psi[j][qp]*dphi[i][qp](0));
Kvp(i,j) += JxW[qp]*(-theta*dt*psi[j][qp]*dphi[i][qp](1));
}
}
Now an i-loop over the pressure degrees of freedom. This code computes
the matrix entries due to the continuity equation. Note: To maintain a
symmetric matrix, we may (or may not) multiply the continuity equation by
negative one. Here we do not.
for (unsigned int i=0; i<n_p_dofs; i++)
{
Kp_alpha(i,0) += JxW[qp]*psi[i][qp];
Kalpha_p(0,i) += JxW[qp]*psi[i][qp];
for (unsigned int j=0; j<n_u_dofs; j++)
{
Kpu(i,j) += JxW[qp]*psi[i][qp]*dphi[j][qp](0);
Kpv(i,j) += JxW[qp]*psi[i][qp]*dphi[j][qp](1);
}
}
} // end of the quadrature point qp-loop
At this point the interior element integration has
been completed. However, we have not yet addressed
boundary conditions. For this example we will only
consider simple Dirichlet boundary conditions imposed
via the penalty method. The penalty method used here
is equivalent (for Lagrange basis functions) to lumping
the matrix resulting from the L2 projection penalty
approach introduced in example 3.
{
The penalty value. \f$ \frac{1}{\epsilon} \f$
const Real penalty = 1.e10;
The following loops over the sides of the element.
If the element has no neighbor on a side then that
side MUST live on a boundary of the domain.
for (unsigned int s=0; s<elem->n_sides(); s++)
if (elem->neighbor(s) == NULL)
{
AutoPtr<Elem> side (elem->build_side(s));
Loop over the nodes on the side.
for (unsigned int ns=0; ns<side->n_nodes(); ns++)
{
Boundary ids are set internally by
build_square().
0=bottom
1=right
2=top
3=left
Set u = 1 on the top boundary, 0 everywhere else
Set u = 1 on the top boundary, 0 everywhere else
const Real u_value = (mesh.boundary_info->has_boundary_id(elem,s,2)) ? 1. : 0.;
Set v = 0 everywhere
const Real v_value = 0.;
Find the node on the element matching this node on
the side. That defined where in the element matrix
the boundary condition will be applied.
for (unsigned int n=0; n<elem->n_nodes(); n++)
if (elem->node(n) == side->node(ns))
{
Matrix contribution.
Kuu(n,n) += penalty;
Kvv(n,n) += penalty;
Right-hand-side contribution.
Fu(n) += penalty*u_value;
Fv(n) += penalty*v_value;
}
} // end face node loop
} // end if (elem->neighbor(side) == NULL)
} // end boundary condition section
If this assembly program were to be used on an adaptive mesh,
we would have to apply any hanging node constraint equations
dof_map.constrain_element_matrix_and_vector (Ke, Fe, dof_indices);
The element matrix and right-hand-side are now built
for this element. Add them to the global matrix and
right-hand-side vector. The \p SparseMatrix::add_matrix()
and \p NumericVector::add_vector() members do this for us.
navier_stokes_system.matrix->add_matrix (Ke, dof_indices);
navier_stokes_system.rhs->add_vector (Fe, dof_indices);
} // end of element loop
We can set the mean of the pressure by setting Falpha
navier_stokes_system.rhs->add(navier_stokes_system.rhs->size()-1,10.);
That's it.
return;
}
The source file systems_of_equations_ex3.C without comments:
#include <iostream>
#include <algorithm>
#include <sstream>
#include <math.h>
#include "libmesh/libmesh.h"
#include "libmesh/mesh.h"
#include "libmesh/mesh_generation.h"
#include "libmesh/exodusII_io.h"
#include "libmesh/equation_systems.h"
#include "libmesh/fe.h"
#include "libmesh/quadrature_gauss.h"
#include "libmesh/dof_map.h"
#include "libmesh/sparse_matrix.h"
#include "libmesh/numeric_vector.h"
#include "libmesh/dense_matrix.h"
#include "libmesh/dense_vector.h"
#include "libmesh/linear_implicit_system.h"
#include "libmesh/transient_system.h"
#include "libmesh/perf_log.h"
#include "libmesh/boundary_info.h"
#include "libmesh/utility.h"
#include "libmesh/dense_submatrix.h"
#include "libmesh/dense_subvector.h"
#include "libmesh/elem.h"
using namespace libMesh;
void assemble_stokes (EquationSystems& es,
const std::string& system_name);
int main (int argc, char** argv)
{
LibMeshInit init (argc, argv);
libmesh_example_assert(2 <= LIBMESH_DIM, "2D support");
if (libMesh::default_solver_package() == TRILINOS_SOLVERS)
{
std::cout << "We skip example 13 when using the Trilinos solvers.\n"
<< std::endl;
return 0;
}
Mesh mesh;
MeshTools::Generation::build_square (mesh,
20, 20,
0., 1.,
0., 1.,
QUAD9);
mesh.print_info();
EquationSystems equation_systems (mesh);
TransientLinearImplicitSystem & system =
equation_systems.add_system<TransientLinearImplicitSystem> ("Navier-Stokes");
system.add_variable ("u", SECOND);
system.add_variable ("v", SECOND);
system.add_variable ("p", FIRST);
system.add_variable ("alpha", FIRST, SCALAR);
system.attach_assemble_function (assemble_stokes);
equation_systems.init ();
equation_systems.print_info();
PerfLog perf_log("Systems Example 3");
TransientLinearImplicitSystem& navier_stokes_system =
equation_systems.get_system<TransientLinearImplicitSystem>("Navier-Stokes");
const Real dt = 0.01;
navier_stokes_system.time = 0.0;
const unsigned int n_timesteps = 15;
const unsigned int n_nonlinear_steps = 15;
const Real nonlinear_tolerance = 1.e-3;
equation_systems.parameters.set<unsigned int>("linear solver maximum iterations") = 250;
equation_systems.parameters.set<Real> ("dt") = dt;
AutoPtr<NumericVector<Number> >
last_nonlinear_soln (navier_stokes_system.solution->clone());
for (unsigned int t_step=0; t_step<n_timesteps; ++t_step)
{
navier_stokes_system.time += dt;
std::cout << "\n\n*** Solving time step " << t_step <<
", time = " << navier_stokes_system.time <<
" ***" << std::endl;
*navier_stokes_system.old_local_solution = *navier_stokes_system.current_local_solution;
const Real initial_linear_solver_tol = 1.e-6;
equation_systems.parameters.set<Real> ("linear solver tolerance") = initial_linear_solver_tol;
for (unsigned int l=0; l<n_nonlinear_steps; ++l)
{
last_nonlinear_soln->zero();
last_nonlinear_soln->add(*navier_stokes_system.solution);
perf_log.push("linear solve");
equation_systems.get_system("Navier-Stokes").solve();
perf_log.pop("linear solve");
last_nonlinear_soln->add (-1., *navier_stokes_system.solution);
last_nonlinear_soln->close();
const Real norm_delta = last_nonlinear_soln->l2_norm();
const unsigned int n_linear_iterations = navier_stokes_system.n_linear_iterations();
const Real final_linear_residual = navier_stokes_system.final_linear_residual();
std::cout << "Linear solver converged at step: "
<< n_linear_iterations
<< ", final residual: "
<< final_linear_residual
<< " Nonlinear convergence: ||u - u_old|| = "
<< norm_delta
<< std::endl;
if ((norm_delta < nonlinear_tolerance) &&
(navier_stokes_system.final_linear_residual() < nonlinear_tolerance))
{
std::cout << " Nonlinear solver converged at step "
<< l
<< std::endl;
break;
}
equation_systems.parameters.set<Real> ("linear solver tolerance") =
std::min(Utility::pow<2>(final_linear_residual), initial_linear_solver_tol);
} // end nonlinear loop
const unsigned int write_interval = 1;
#ifdef LIBMESH_HAVE_EXODUS_API
if ((t_step+1)%write_interval == 0)
{
std::ostringstream file_name;
file_name << "out_"
<< std::setw(3)
<< std::setfill('0')
<< std::right
<< t_step + 1
<< ".e";
ExodusII_IO(mesh).write_equation_systems (file_name.str(),
equation_systems);
}
#endif // #ifdef LIBMESH_HAVE_EXODUS_API
} // end timestep loop.
return 0;
}
void assemble_stokes (EquationSystems& es,
const std::string& system_name)
{
libmesh_assert_equal_to (system_name, "Navier-Stokes");
const MeshBase& mesh = es.get_mesh();
const unsigned int dim = mesh.mesh_dimension();
TransientLinearImplicitSystem & navier_stokes_system =
es.get_system<TransientLinearImplicitSystem> ("Navier-Stokes");
const unsigned int u_var = navier_stokes_system.variable_number ("u");
const unsigned int v_var = navier_stokes_system.variable_number ("v");
const unsigned int p_var = navier_stokes_system.variable_number ("p");
const unsigned int alpha_var = navier_stokes_system.variable_number ("alpha");
FEType fe_vel_type = navier_stokes_system.variable_type(u_var);
FEType fe_pres_type = navier_stokes_system.variable_type(p_var);
AutoPtr<FEBase> fe_vel (FEBase::build(dim, fe_vel_type));
AutoPtr<FEBase> fe_pres (FEBase::build(dim, fe_pres_type));
QGauss qrule (dim, fe_vel_type.default_quadrature_order());
fe_vel->attach_quadrature_rule (&qrule);
fe_pres->attach_quadrature_rule (&qrule);
const std::vector<Real>& JxW = fe_vel->get_JxW();
const std::vector<std::vector<Real> >& phi = fe_vel->get_phi();
const std::vector<std::vector<RealGradient> >& dphi = fe_vel->get_dphi();
const std::vector<std::vector<Real> >& psi = fe_pres->get_phi();
const DofMap & dof_map = navier_stokes_system.get_dof_map();
DenseMatrix<Number> Ke;
DenseVector<Number> Fe;
DenseSubMatrix<Number>
Kuu(Ke), Kuv(Ke), Kup(Ke),
Kvu(Ke), Kvv(Ke), Kvp(Ke),
Kpu(Ke), Kpv(Ke), Kpp(Ke);
DenseSubMatrix<Number> Kalpha_p(Ke), Kp_alpha(Ke);
DenseSubVector<Number>
Fu(Fe),
Fv(Fe),
Fp(Fe);
std::vector<dof_id_type> dof_indices;
std::vector<dof_id_type> dof_indices_u;
std::vector<dof_id_type> dof_indices_v;
std::vector<dof_id_type> dof_indices_p;
std::vector<dof_id_type> dof_indices_alpha;
const Real dt = es.parameters.get<Real>("dt");
const Real theta = 1.;
MeshBase::const_element_iterator el = mesh.active_local_elements_begin();
const MeshBase::const_element_iterator end_el = mesh.active_local_elements_end();
for ( ; el != end_el; ++el)
{
const Elem* elem = *el;
dof_map.dof_indices (elem, dof_indices);
dof_map.dof_indices (elem, dof_indices_u, u_var);
dof_map.dof_indices (elem, dof_indices_v, v_var);
dof_map.dof_indices (elem, dof_indices_p, p_var);
dof_map.dof_indices (elem, dof_indices_alpha, alpha_var);
const unsigned int n_dofs = dof_indices.size();
const unsigned int n_u_dofs = dof_indices_u.size();
const unsigned int n_v_dofs = dof_indices_v.size();
const unsigned int n_p_dofs = dof_indices_p.size();
fe_vel->reinit (elem);
fe_pres->reinit (elem);
Ke.resize (n_dofs, n_dofs);
Fe.resize (n_dofs);
Kuu.reposition (u_var*n_u_dofs, u_var*n_u_dofs, n_u_dofs, n_u_dofs);
Kuv.reposition (u_var*n_u_dofs, v_var*n_u_dofs, n_u_dofs, n_v_dofs);
Kup.reposition (u_var*n_u_dofs, p_var*n_u_dofs, n_u_dofs, n_p_dofs);
Kvu.reposition (v_var*n_v_dofs, u_var*n_v_dofs, n_v_dofs, n_u_dofs);
Kvv.reposition (v_var*n_v_dofs, v_var*n_v_dofs, n_v_dofs, n_v_dofs);
Kvp.reposition (v_var*n_v_dofs, p_var*n_v_dofs, n_v_dofs, n_p_dofs);
Kpu.reposition (p_var*n_u_dofs, u_var*n_u_dofs, n_p_dofs, n_u_dofs);
Kpv.reposition (p_var*n_u_dofs, v_var*n_u_dofs, n_p_dofs, n_v_dofs);
Kpp.reposition (p_var*n_u_dofs, p_var*n_u_dofs, n_p_dofs, n_p_dofs);
Kp_alpha.reposition (p_var*n_u_dofs, p_var*n_u_dofs+n_p_dofs, n_p_dofs, 1);
Kalpha_p.reposition (p_var*n_u_dofs+n_p_dofs, p_var*n_u_dofs, 1, n_p_dofs);
Fu.reposition (u_var*n_u_dofs, n_u_dofs);
Fv.reposition (v_var*n_u_dofs, n_v_dofs);
Fp.reposition (p_var*n_u_dofs, n_p_dofs);
for (unsigned int qp=0; qp<qrule.n_points(); qp++)
{
Number u = 0., u_old = 0.;
Number v = 0., v_old = 0.;
Number p_old = 0.;
Gradient grad_u, grad_u_old;
Gradient grad_v, grad_v_old;
for (unsigned int l=0; l<n_u_dofs; l++)
{
u_old += phi[l][qp]*navier_stokes_system.old_solution (dof_indices_u[l]);
v_old += phi[l][qp]*navier_stokes_system.old_solution (dof_indices_v[l]);
grad_u_old.add_scaled (dphi[l][qp],navier_stokes_system.old_solution (dof_indices_u[l]));
grad_v_old.add_scaled (dphi[l][qp],navier_stokes_system.old_solution (dof_indices_v[l]));
u += phi[l][qp]*navier_stokes_system.current_solution (dof_indices_u[l]);
v += phi[l][qp]*navier_stokes_system.current_solution (dof_indices_v[l]);
grad_u.add_scaled (dphi[l][qp],navier_stokes_system.current_solution (dof_indices_u[l]));
grad_v.add_scaled (dphi[l][qp],navier_stokes_system.current_solution (dof_indices_v[l]));
}
for (unsigned int l=0; l<n_p_dofs; l++)
{
p_old += psi[l][qp]*navier_stokes_system.old_solution (dof_indices_p[l]);
}
const NumberVectorValue U_old (u_old, v_old);
const NumberVectorValue U (u, v);
const Number u_x = grad_u(0);
const Number u_y = grad_u(1);
const Number v_x = grad_v(0);
const Number v_y = grad_v(1);
for (unsigned int i=0; i<n_u_dofs; i++)
{
Fu(i) += JxW[qp]*(u_old*phi[i][qp] - // mass-matrix term
(1.-theta)*dt*(U_old*grad_u_old)*phi[i][qp] + // convection term
(1.-theta)*dt*p_old*dphi[i][qp](0) - // pressure term on rhs
(1.-theta)*dt*(grad_u_old*dphi[i][qp]) + // diffusion term on rhs
theta*dt*(U*grad_u)*phi[i][qp]); // Newton term
Fv(i) += JxW[qp]*(v_old*phi[i][qp] - // mass-matrix term
(1.-theta)*dt*(U_old*grad_v_old)*phi[i][qp] + // convection term
(1.-theta)*dt*p_old*dphi[i][qp](1) - // pressure term on rhs
(1.-theta)*dt*(grad_v_old*dphi[i][qp]) + // diffusion term on rhs
theta*dt*(U*grad_v)*phi[i][qp]); // Newton term
for (unsigned int j=0; j<n_u_dofs; j++)
{
Kuu(i,j) += JxW[qp]*(phi[i][qp]*phi[j][qp] + // mass matrix term
theta*dt*(dphi[i][qp]*dphi[j][qp]) + // diffusion term
theta*dt*(U*dphi[j][qp])*phi[i][qp] + // convection term
theta*dt*u_x*phi[i][qp]*phi[j][qp]); // Newton term
Kuv(i,j) += JxW[qp]*theta*dt*u_y*phi[i][qp]*phi[j][qp]; // Newton term
Kvv(i,j) += JxW[qp]*(phi[i][qp]*phi[j][qp] + // mass matrix term
theta*dt*(dphi[i][qp]*dphi[j][qp]) + // diffusion term
theta*dt*(U*dphi[j][qp])*phi[i][qp] + // convection term
theta*dt*v_y*phi[i][qp]*phi[j][qp]); // Newton term
Kvu(i,j) += JxW[qp]*theta*dt*v_x*phi[i][qp]*phi[j][qp]; // Newton term
}
for (unsigned int j=0; j<n_p_dofs; j++)
{
Kup(i,j) += JxW[qp]*(-theta*dt*psi[j][qp]*dphi[i][qp](0));
Kvp(i,j) += JxW[qp]*(-theta*dt*psi[j][qp]*dphi[i][qp](1));
}
}
for (unsigned int i=0; i<n_p_dofs; i++)
{
Kp_alpha(i,0) += JxW[qp]*psi[i][qp];
Kalpha_p(0,i) += JxW[qp]*psi[i][qp];
for (unsigned int j=0; j<n_u_dofs; j++)
{
Kpu(i,j) += JxW[qp]*psi[i][qp]*dphi[j][qp](0);
Kpv(i,j) += JxW[qp]*psi[i][qp]*dphi[j][qp](1);
}
}
} // end of the quadrature point qp-loop
{
const Real penalty = 1.e10;
for (unsigned int s=0; s<elem->n_sides(); s++)
if (elem->neighbor(s) == NULL)
{
AutoPtr<Elem> side (elem->build_side(s));
for (unsigned int ns=0; ns<side->n_nodes(); ns++)
{
const Real u_value = (mesh.boundary_info->has_boundary_id(elem,s,2)) ? 1. : 0.;
const Real v_value = 0.;
for (unsigned int n=0; n<elem->n_nodes(); n++)
if (elem->node(n) == side->node(ns))
{
Kuu(n,n) += penalty;
Kvv(n,n) += penalty;
Fu(n) += penalty*u_value;
Fv(n) += penalty*v_value;
}
} // end face node loop
} // end if (elem->neighbor(side) == NULL)
} // end boundary condition section
dof_map.constrain_element_matrix_and_vector (Ke, Fe, dof_indices);
navier_stokes_system.matrix->add_matrix (Ke, dof_indices);
navier_stokes_system.rhs->add_vector (Fe, dof_indices);
} // end of element loop
navier_stokes_system.rhs->add(navier_stokes_system.rhs->size()-1,10.);
return;
}
The console output of the program:
***************************************************************
* Running Example systems_of_equations_ex3:
* mpirun -np 2 example-devel -pc_type bjacobi -sub_pc_type ilu -sub_pc_factor_levels 4 -sub_pc_factor_zeropivot 0 -ksp_right_pc -log_summary
***************************************************************
Mesh Information:
mesh_dimension()=2
spatial_dimension()=3
n_nodes()=1681
n_local_nodes()=863
n_elem()=400
n_local_elem()=200
n_active_elem()=400
n_subdomains()=1
n_partitions()=2
n_processors()=2
n_threads()=1
processor_id()=0
EquationSystems
n_systems()=1
System #0, "Navier-Stokes"
Type "TransientLinearImplicit"
Variables={ "u" "v" } "p" "alpha"
Finite Element Types="LAGRANGE", "JACOBI_20_00" "LAGRANGE", "JACOBI_20_00" "SCALAR", "JACOBI_20_00"
Infinite Element Mapping="CARTESIAN" "CARTESIAN" "CARTESIAN"
Approximation Orders="SECOND", "THIRD" "FIRST", "THIRD" "FIRST", "THIRD"
n_dofs()=3804
n_local_dofs()=1958
n_constrained_dofs()=0
n_local_constrained_dofs()=0
n_vectors()=3
n_matrices()=1
DofMap Sparsity
Average On-Processor Bandwidth <= 39.1646
Average Off-Processor Bandwidth <= 2.16246
Maximum On-Processor Bandwidth <= 1846
Maximum Off-Processor Bandwidth <= 1958
DofMap Constraints
Number of DoF Constraints = 0
Number of Node Constraints = 0
*** Solving time step 0, time = 0.01 ***
Linear solver converged at step: 26, final residual: 0.000542864 Nonlinear convergence: ||u - u_old|| = 505.659
Linear solver converged at step: 15, final residual: 0.000197449 Nonlinear convergence: ||u - u_old|| = 0.669555
Linear solver converged at step: 10, final residual: 1.96553e-05 Nonlinear convergence: ||u - u_old|| = 0.00176534
Linear solver converged at step: 12, final residual: 2.43338e-07 Nonlinear convergence: ||u - u_old|| = 3.10766e-05
Nonlinear solver converged at step 3
*** Solving time step 1, time = 0.02 ***
Linear solver converged at step: 23, final residual: 0.000465526 Nonlinear convergence: ||u - u_old|| = 28.9379
Linear solver converged at step: 12, final residual: 0.000109307 Nonlinear convergence: ||u - u_old|| = 0.0414035
Linear solver converged at step: 10, final residual: 6.26202e-06 Nonlinear convergence: ||u - u_old|| = 0.00060652
Nonlinear solver converged at step 2
*** Solving time step 2, time = 0.03 ***
Linear solver converged at step: 21, final residual: 0.000428265 Nonlinear convergence: ||u - u_old|| = 5.51097
Linear solver converged at step: 9, final residual: 7.35475e-05 Nonlinear convergence: ||u - u_old|| = 0.00952732
Linear solver converged at step: 10, final residual: 3.06116e-06 Nonlinear convergence: ||u - u_old|| = 0.000192065
Nonlinear solver converged at step 2
*** Solving time step 3, time = 0.04 ***
Linear solver converged at step: 19, final residual: 0.000554827 Nonlinear convergence: ||u - u_old|| = 1.99114
Linear solver converged at step: 7, final residual: 0.000210219 Nonlinear convergence: ||u - u_old|| = 0.00289763
Linear solver converged at step: 7, final residual: 2.2537e-05 Nonlinear convergence: ||u - u_old|| = 0.000412887
Nonlinear solver converged at step 2
*** Solving time step 4, time = 0.05 ***
Linear solver converged at step: 18, final residual: 0.00060118 Nonlinear convergence: ||u - u_old|| = 0.90864
Linear solver converged at step: 5, final residual: 0.000198775 Nonlinear convergence: ||u - u_old|| = 0.00154509
Linear solver converged at step: 8, final residual: 2.00619e-05 Nonlinear convergence: ||u - u_old|| = 0.000694022
Nonlinear solver converged at step 2
*** Solving time step 5, time = 0.06 ***
Linear solver converged at step: 18, final residual: 0.000468101 Nonlinear convergence: ||u - u_old|| = 0.466071
Linear solver converged at step: 7, final residual: 0.000136626 Nonlinear convergence: ||u - u_old|| = 0.00105415
Linear solver converged at step: 7, final residual: 9.49046e-06 Nonlinear convergence: ||u - u_old|| = 0.000415496
Nonlinear solver converged at step 2
*** Solving time step 6, time = 0.07 ***
Linear solver converged at step: 17, final residual: 0.000539321 Nonlinear convergence: ||u - u_old|| = 0.256801
Linear solver converged at step: 7, final residual: 0.00012627 Nonlinear convergence: ||u - u_old|| = 0.00178446
Linear solver converged at step: 9, final residual: 6.28023e-06 Nonlinear convergence: ||u - u_old|| = 0.000351046
Nonlinear solver converged at step 2
*** Solving time step 7, time = 0.08 ***
Linear solver converged at step: 16, final residual: 0.000566838 Nonlinear convergence: ||u - u_old|| = 0.148651
Linear solver converged at step: 7, final residual: 0.000144382 Nonlinear convergence: ||u - u_old|| = 0.00200458
Linear solver converged at step: 9, final residual: 1.29436e-05 Nonlinear convergence: ||u - u_old|| = 0.000397021
Nonlinear solver converged at step 2
*** Solving time step 8, time = 0.09 ***
Linear solver converged at step: 16, final residual: 0.000379276 Nonlinear convergence: ||u - u_old|| = 0.0892008
Linear solver converged at step: 8, final residual: 6.82181e-05 Nonlinear convergence: ||u - u_old|| = 0.00162021
Linear solver converged at step: 10, final residual: 3.03984e-06 Nonlinear convergence: ||u - u_old|| = 0.000174316
Nonlinear solver converged at step 2
*** Solving time step 9, time = 0.1 ***
Linear solver converged at step: 15, final residual: 0.000479071 Nonlinear convergence: ||u - u_old|| = 0.055322
Linear solver converged at step: 8, final residual: 0.000104835 Nonlinear convergence: ||u - u_old|| = 0.00320576
Linear solver converged at step: 10, final residual: 6.6986e-06 Nonlinear convergence: ||u - u_old|| = 0.000326132
Nonlinear solver converged at step 2
*** Solving time step 10, time = 0.11 ***
Linear solver converged at step: 14, final residual: 0.000620478 Nonlinear convergence: ||u - u_old|| = 0.0350859
Linear solver converged at step: 7, final residual: 0.000207692 Nonlinear convergence: ||u - u_old|| = 0.00471419
Linear solver converged at step: 9, final residual: 1.51019e-05 Nonlinear convergence: ||u - u_old|| = 0.000568757
Nonlinear solver converged at step 2
*** Solving time step 11, time = 0.12 ***
Linear solver converged at step: 13, final residual: 0.000499587 Nonlinear convergence: ||u - u_old|| = 0.0226644
Linear solver converged at step: 8, final residual: 0.000101303 Nonlinear convergence: ||u - u_old|| = 0.00455112
Linear solver converged at step: 9, final residual: 5.93208e-06 Nonlinear convergence: ||u - u_old|| = 0.000553815
Nonlinear solver converged at step 2
*** Solving time step 12, time = 0.13 ***
Linear solver converged at step: 11, final residual: 0.00050889 Nonlinear convergence: ||u - u_old|| = 0.0150287
Linear solver converged at step: 7, final residual: 0.000166364 Nonlinear convergence: ||u - u_old|| = 0.00472494
Linear solver converged at step: 8, final residual: 1.82086e-05 Nonlinear convergence: ||u - u_old|| = 0.000577442
Nonlinear solver converged at step 2
*** Solving time step 13, time = 0.14 ***
Linear solver converged at step: 11, final residual: 0.000332412 Nonlinear convergence: ||u - u_old|| = 0.00975013
Linear solver converged at step: 8, final residual: 6.49156e-05 Nonlinear convergence: ||u - u_old|| = 0.00322509
Linear solver converged at step: 10, final residual: 2.17793e-06 Nonlinear convergence: ||u - u_old|| = 0.000407714
Nonlinear solver converged at step 2
*** Solving time step 14, time = 0.15 ***
Linear solver converged at step: 9, final residual: 0.000497031 Nonlinear convergence: ||u - u_old|| = 0.00812662
Linear solver converged at step: 8, final residual: 0.000119397 Nonlinear convergence: ||u - u_old|| = 0.00556714
Linear solver converged at step: 10, final residual: 5.62144e-06 Nonlinear convergence: ||u - u_old|| = 0.000470077
Nonlinear solver converged at step 2
----------------------------------------------------------------------------------------------------------------------
| Processor id: 0 |
| Num Processors: 2 |
| Time: Tue Feb 5 13:44:16 2013 |
| OS: Linux |
| HostName: hbar.ices.utexas.edu |
| OS Release: 2.6.32-279.1.1.el6.x86_64 |
| OS Version: #1 SMP Tue Jul 10 11:24:23 CDT 2012 |
| Machine: x86_64 |
| Username: benkirk |
| Configuration: ./configure '--enable-everything' |
| '--prefix=/workspace/libmesh/install' |
| 'CXX=mpicxx' |
| 'CC=mpicc' |
| 'F77=mpif77' |
| 'FC=mpif90' |
| 'PETSC_DIR=/opt/apps/ossw/libraries/petsc/petsc-3.3-p2' |
| 'PETSC_ARCH=intel-12.1-mkl-intel-10.3.12.361-mpich2-1.4.1p1-cxx-opt' |
| 'SLEPC_DIR=/opt/apps/ossw/libraries/slepc/slepc-3.3-p2-petsc-3.3-p2-cxx-opt' |
| 'TRILINOS_DIR=/opt/apps/ossw/libraries/trilinos/trilinos-10.12.2/sl6/intel-12.1/mpich2-1.4.1p1/mkl-intel-10.3.12.361'|
| 'VTK_DIR=/opt/apps/ossw/libraries/vtk/vtk-5.10.0/sl6/intel-12.1' |
----------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------
| Systems Example 3 Performance: Alive time=15.0725, Active time=14.2188 |
-----------------------------------------------------------------------------------------------------------
| Event nCalls Total Time Avg Time Total Time Avg Time % of Active Time |
| w/o Sub w/o Sub With Sub With Sub w/o S With S |
|-----------------------------------------------------------------------------------------------------------|
| |
| linear solve 46 14.2188 0.309104 14.2188 0.309104 100.00 100.00 |
-----------------------------------------------------------------------------------------------------------
| Totals: 46 14.2188 100.00 |
-----------------------------------------------------------------------------------------------------------
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
/workspace/libmesh/examples/systems_of_equations/systems_of_equations_ex3/.libs/lt-example-devel on a intel-12. named hbar.ices.utexas.edu with 2 processors, by benkirk Tue Feb 5 13:44:16 2013
Using Petsc Release Version 3.3.0, Patch 2, Fri Jul 13 15:42:00 CDT 2012
Max Max/Min Avg Total
Time (sec): 1.530e+01 1.00000 1.530e+01
Objects: 4.040e+02 1.00000 4.040e+02
Flops: 3.057e+09 1.13066 2.880e+09 5.760e+09
Flops/sec: 1.997e+08 1.13066 1.882e+08 3.764e+08
MPI Messages: 8.670e+02 1.00000 8.670e+02 1.734e+03
MPI Message Lengths: 8.397e+06 1.00000 9.685e+03 1.679e+07
MPI Reductions: 2.339e+03 1.00000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flops
and VecAXPY() for complex vectors of length N --> 8N flops
Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts %Total Avg %Total counts %Total
0: Main Stage: 1.5304e+01 100.0% 5.7602e+09 100.0% 1.734e+03 100.0% 9.685e+03 100.0% 2.338e+03 100.0%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flops: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
Avg. len: average message length
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %f - percent flops in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flops --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %f %M %L %R %T %f %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
VecMDot 518 1.0 2.4491e-02 3.4 1.44e+07 1.1 0.0e+00 0.0e+00 5.2e+02 0 0 0 0 22 0 0 0 0 22 1144
VecNorm 656 1.0 7.0093e-03 2.5 2.57e+06 1.1 0.0e+00 0.0e+00 6.6e+02 0 0 0 0 28 0 0 0 0 28 712
VecScale 564 1.0 7.1740e-04 1.2 1.10e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2991
VecCopy 108 1.0 2.8038e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 756 1.0 1.3978e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 184 1.0 1.0073e-02 1.0 7.21e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 139
VecMAXPY 564 1.0 5.7144e-03 1.1 1.65e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 5596
VecAssemblyBegin 200 1.0 7.6092e-0240.9 0.00e+00 0.0 1.8e+02 1.2e+03 6.0e+02 0 0 11 1 26 0 0 11 1 26 0
VecAssemblyEnd 200 1.0 2.0933e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecScatterBegin 625 1.0 3.1152e-03 1.3 0.00e+00 0.0 1.2e+03 8.0e+03 0.0e+00 0 0 72 60 0 0 0 72 60 0 0
VecScatterEnd 625 1.0 4.6577e-01345.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0
VecNormalize 564 1.0 6.6085e-03 1.8 3.31e+06 1.1 0.0e+00 0.0e+00 5.6e+02 0 0 0 0 24 0 0 0 0 24 974
MatMult 564 1.0 5.1427e-0110.1 8.79e+07 1.0 1.1e+03 8.7e+03 0.0e+00 2 3 65 58 0 2 3 65 58 0 338
MatSolve 610 1.0 2.5104e-01 1.1 5.24e+08 1.1 0.0e+00 0.0e+00 0.0e+00 2 17 0 0 0 2 17 0 0 0 3989
MatLUFactorNum 46 1.0 1.2974e+00 1.1 2.41e+09 1.1 0.0e+00 0.0e+00 0.0e+00 8 78 0 0 0 8 78 0 0 0 3481
MatILUFactorSym 46 1.0 3.4402e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+02 21 0 0 0 6 21 0 0 0 6 0
MatAssemblyBegin 92 1.0 9.6098e-0220.7 0.00e+00 0.0 2.8e+02 2.4e+04 1.8e+02 0 0 16 39 8 0 0 16 39 8 0
MatAssemblyEnd 92 1.0 2.0310e-02 1.5 0.00e+00 0.0 4.0e+00 2.2e+03 8.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetRowIJ 46 1.0 1.6928e-05 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetOrdering 46 1.0 2.8827e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+02 0 0 0 0 8 0 0 0 0 8 0
MatZeroEntries 48 1.0 2.5649e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPGMRESOrthog 518 1.0 2.9985e-02 2.3 2.89e+07 1.1 0.0e+00 0.0e+00 5.2e+02 0 1 0 0 22 0 1 0 0 22 1870
KSPSetUp 92 1.0 3.4475e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSolve 46 1.0 5.0818e+00 1.0 3.06e+09 1.1 1.1e+03 8.7e+03 1.5e+03 33100 65 58 62 33100 65 58 62 1133
PCSetUp 92 1.0 4.7521e+00 1.1 2.41e+09 1.1 0.0e+00 0.0e+00 3.2e+02 30 78 0 0 14 30 78 0 0 14 950
PCSetUpOnBlocks 46 1.0 4.7515e+00 1.1 2.41e+09 1.1 0.0e+00 0.0e+00 3.2e+02 30 78 0 0 14 30 78 0 0 14 951
PCApply 610 1.0 2.5709e-01 1.1 5.24e+08 1.1 0.0e+00 0.0e+00 0.0e+00 2 17 0 0 0 2 17 0 0 0 3895
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Vector 97 97 1533256 0
Vector Scatter 6 6 6216 0
Index Set 242 242 1100880 0
IS L to G Mapping 5 5 2820 0
Matrix 49 49 238557800 0
Krylov Solver 2 2 19360 0
Preconditioner 2 2 1784 0
Viewer 1 0 0 0
========================================================================================================================
Average time to get PetscTime(): 0
Average time for MPI_Barrier(): 1.00136e-06
Average time for zero size MPI_Send(): 1.65701e-05
#PETSc Option Table entries:
-ksp_right_pc
-log_summary
-pc_type bjacobi
-sub_pc_factor_levels 4
-sub_pc_factor_zeropivot 0
-sub_pc_type ilu
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure run at: Thu Nov 8 11:21:02 2012
Configure options: --with-debugging=false --COPTFLAGS=-O3 --CXXOPTFLAGS=-O3 --FOPTFLAGS=-O3 --with-clanguage=C++ --with-shared-libraries=1 --with-mpi-dir=/opt/apps/ossw/libraries/mpich2/mpich2-1.4.1p1/sl6/intel-12.1 --with-mumps=true --download-mumps=1 --with-metis=true --download-metis=1 --with-parmetis=true --download-parmetis=1 --with-superlu=true --download-superlu=1 --with-superludir=true --download-superlu_dist=1 --with-blacs=true --download-blacs=1 --with-scalapack=true --download-scalapack=1 --with-hypre=true --download-hypre=1 --with-blas-lib="[/opt/apps/sysnet/intel/12.1/mkl/10.3.12.361/lib/intel64/libmkl_intel_lp64.so,/opt/apps/sysnet/intel/12.1/mkl/10.3.12.361/lib/intel64/libmkl_sequential.so,/opt/apps/sysnet/intel/12.1/mkl/10.3.12.361/lib/intel64/libmkl_core.so]" --with-lapack-lib="[/opt/apps/sysnet/intel/12.1/mkl/10.3.12.361/lib/intel64/libmkl_lapack95_lp64.a]"
-----------------------------------------
Libraries compiled on Thu Nov 8 11:21:02 2012 on daedalus.ices.utexas.edu
Machine characteristics: Linux-2.6.32-279.1.1.el6.x86_64-x86_64-with-redhat-6.3-Carbon
Using PETSc directory: /opt/apps/ossw/libraries/petsc/petsc-3.3-p2
Using PETSc arch: intel-12.1-mkl-intel-10.3.12.361-mpich2-1.4.1p1-cxx-opt
-----------------------------------------
Using C compiler: /opt/apps/ossw/libraries/mpich2/mpich2-1.4.1p1/sl6/intel-12.1/bin/mpicxx -wd1572 -O3 -fPIC ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: /opt/apps/ossw/libraries/mpich2/mpich2-1.4.1p1/sl6/intel-12.1/bin/mpif90 -fPIC -O3 ${FOPTFLAGS} ${FFLAGS}
-----------------------------------------
Using include paths: -I/opt/apps/ossw/libraries/petsc/petsc-3.3-p2/intel-12.1-mkl-intel-10.3.12.361-mpich2-1.4.1p1-cxx-opt/include -I/opt/apps/ossw/libraries/petsc/petsc-3.3-p2/include -I/opt/apps/ossw/libraries/petsc/petsc-3.3-p2/include -I/opt/apps/ossw/libraries/petsc/petsc-3.3-p2/intel-12.1-mkl-intel-10.3.12.361-mpich2-1.4.1p1-cxx-opt/include -I/opt/apps/ossw/libraries/mpich2/mpich2-1.4.1p1/sl6/intel-12.1/include
-----------------------------------------
Using C linker: /opt/apps/ossw/libraries/mpich2/mpich2-1.4.1p1/sl6/intel-12.1/bin/mpicxx
Using Fortran linker: /opt/apps/ossw/libraries/mpich2/mpich2-1.4.1p1/sl6/intel-12.1/bin/mpif90
Using libraries: -Wl,-rpath,/opt/apps/ossw/libraries/petsc/petsc-3.3-p2/intel-12.1-mkl-intel-10.3.12.361-mpich2-1.4.1p1-cxx-opt/lib -L/opt/apps/ossw/libraries/petsc/petsc-3.3-p2/intel-12.1-mkl-intel-10.3.12.361-mpich2-1.4.1p1-cxx-opt/lib -lpetsc -lX11 -Wl,-rpath,/opt/apps/ossw/libraries/petsc/petsc-3.3-p2/intel-12.1-mkl-intel-10.3.12.361-mpich2-1.4.1p1-cxx-opt/lib -L/opt/apps/ossw/libraries/petsc/petsc-3.3-p2/intel-12.1-mkl-intel-10.3.12.361-mpich2-1.4.1p1-cxx-opt/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lHYPRE -lpthread -lsuperlu_dist_3.0 -lparmetis -lmetis -lscalapack -lblacs -lsuperlu_4.3 -Wl,-rpath,/opt/apps/sysnet/intel/12.1/mkl/10.3.12.361/lib/intel64 -L/opt/apps/sysnet/intel/12.1/mkl/10.3.12.361/lib/intel64 -lmkl_lapack95_lp64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -Wl,-rpath,/opt/apps/ossw/libraries/mpich2/mpich2-1.4.1p1/sl6/intel-12.1/lib -L/opt/apps/ossw/libraries/mpich2/mpich2-1.4.1p1/sl6/intel-12.1/lib -Wl,-rpath,/opt/apps/sysnet/intel/12.1/composer_xe_2011_sp1.7.256/compiler/lib/intel64 -L/opt/apps/sysnet/intel/12.1/composer_xe_2011_sp1.7.256/compiler/lib/intel64 -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.4.6 -L/usr/lib/gcc/x86_64-redhat-linux/4.4.6 -lmpichf90 -lifport -lifcore -lm -lm -lmpichcxx -ldl -lmpich -lopa -lmpl -lrt -lpthread -limf -lsvml -lipgo -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lirc_s -ldl
-----------------------------------------
----------------------------------------------------------------------------------------------------------------
| libMesh Performance: Alive time=15.3146, Active time=15.2538 |
----------------------------------------------------------------------------------------------------------------
| Event nCalls Total Time Avg Time Total Time Avg Time % of Active Time |
| w/o Sub w/o Sub With Sub With Sub w/o S With S |
|----------------------------------------------------------------------------------------------------------------|
| |
| |
| DofMap |
| SCALAR_dof_indices() 21646 0.0836 0.000004 0.0836 0.000004 0.55 0.55 |
| add_neighbors_to_send_list() 1 0.0401 0.040083 0.0498 0.049771 0.26 0.33 |
| build_sparsity() 1 0.0569 0.056857 0.1025 0.102455 0.37 0.67 |
| create_dof_constraints() 1 0.0008 0.000766 0.0008 0.000766 0.01 0.01 |
| distribute_dofs() 1 0.0202 0.020158 0.0448 0.044844 0.13 0.29 |
| dof_indices() 58246 5.0696 0.000087 5.1564 0.000089 33.23 33.80 |
| prepare_send_list() 1 0.0002 0.000156 0.0002 0.000156 0.00 0.00 |
| reinit() 1 0.0243 0.024322 0.0243 0.024322 0.16 0.16 |
| |
| EquationSystems |
| build_solution_vector() 15 0.0481 0.003204 0.7565 0.050434 0.32 4.96 |
| |
| ExodusII_IO |
| write_nodal_data() 15 0.0796 0.005309 0.0796 0.005309 0.52 0.52 |
| |
| FE |
| compute_shape_functions() 18400 0.3279 0.000018 0.3279 0.000018 2.15 2.15 |
| init_shape_functions() 92 0.0036 0.000039 0.0036 0.000039 0.02 0.02 |
| |
| FEMap |
| compute_affine_map() 18400 0.1946 0.000011 0.1946 0.000011 1.28 1.28 |
| init_reference_to_physical_map() 92 0.0077 0.000084 0.0077 0.000084 0.05 0.05 |
| |
| Mesh |
| find_neighbors() 1 0.0047 0.004724 0.0048 0.004772 0.03 0.03 |
| renumber_nodes_and_elem() 2 0.0005 0.000249 0.0005 0.000249 0.00 0.00 |
| |
| MeshCommunication |
| compute_hilbert_indices() 2 0.0036 0.001809 0.0036 0.001809 0.02 0.02 |
| find_global_indices() 2 0.0015 0.000757 0.0062 0.003102 0.01 0.04 |
| parallel_sort() 2 0.0007 0.000362 0.0008 0.000402 0.00 0.01 |
| |
| MeshOutput |
| write_equation_systems() 15 0.0010 0.000068 0.8382 0.055877 0.01 5.49 |
| |
| MeshTools::Generation |
| build_cube() 1 0.0022 0.002213 0.0022 0.002213 0.01 0.01 |
| |
| MetisPartitioner |
| partition() 1 0.0050 0.004966 0.0077 0.007737 0.03 0.05 |
| |
| Parallel |
| allgather() 9 0.0002 0.000019 0.0002 0.000021 0.00 0.00 |
| max(bool) 1 0.0000 0.000003 0.0000 0.000003 0.00 0.00 |
| max(scalar) 315 0.0008 0.000003 0.0008 0.000003 0.01 0.01 |
| max(vector) 66 0.0004 0.000007 0.0009 0.000014 0.00 0.01 |
| min(bool) 373 0.0009 0.000002 0.0009 0.000002 0.01 0.01 |
| min(scalar) 309 0.0049 0.000016 0.0049 0.000016 0.03 0.03 |
| min(vector) 66 0.0006 0.000009 0.0012 0.000018 0.00 0.01 |
| probe() 12 0.0003 0.000022 0.0003 0.000022 0.00 0.00 |
| receive() 12 0.0001 0.000010 0.0004 0.000032 0.00 0.00 |
| send() 12 0.0001 0.000005 0.0001 0.000005 0.00 0.00 |
| send_receive() 16 0.0003 0.000017 0.0008 0.000047 0.00 0.00 |
| sum() 62 0.0013 0.000020 0.0018 0.000029 0.01 0.01 |
| |
| Parallel::Request |
| wait() 12 0.0000 0.000003 0.0000 0.000003 0.00 0.00 |
| |
| Partitioner |
| set_node_processor_ids() 1 0.0009 0.000949 0.0010 0.001047 0.01 0.01 |
| set_parent_processor_ids() 1 0.0003 0.000334 0.0003 0.000334 0.00 0.00 |
| |
| PetscLinearSolver |
| solve() 46 5.1950 0.112935 5.1950 0.112935 34.06 34.06 |
| |
| System |
| assemble() 46 4.0713 0.088507 9.0186 0.196056 26.69 59.12 |
----------------------------------------------------------------------------------------------------------------
| Totals: 118297 15.2538 100.00 |
----------------------------------------------------------------------------------------------------------------
***************************************************************
* Done Running Example systems_of_equations_ex3:
* mpirun -np 2 example-devel -pc_type bjacobi -sub_pc_type ilu -sub_pc_factor_levels 4 -sub_pc_factor_zeropivot 0 -ksp_right_pc -log_summary
***************************************************************
Systems Example 3 - Navier-Stokes with SCALAR Lagrange Multiplier
This example shows how the transient Navier-Stokes problem from example 13 can be solved using a scalar Lagrange multiplier formulation to constrain the integral of the pressure variable, rather than pinning the pressure at a single point.
C++ include files that we need