The source file exact_solution.C with comments:
#include <math.h>
Mesh library includes
#include "libmesh/libmesh_common.h"
Bring in everything from the libMesh namespace
using namespace libMesh;
/**
* This is the exact solution that
* we are trying to obtain. We will solve
*
* - (u_xx + u_yy) = f
*
* and take a finite difference approximation using this
* function to get f. This is the well-known "method of
* manufactured solutions".
*/
Real exact_solution (const Real x,
const Real y,
const Real z = 0.)
{
static const Real pi = acos(-1.);
return cos(.5*pi*x)*sin(.5*pi*y)*cos(.5*pi*z);
}
The source file introduction_ex4.C with comments:
Introduction Example 4 - Solving a 1D, 2D or 3D Poisson Problem in Parallel
This is the fourth example program. It builds on the third example program by showing how to formulate the code in a dimension-independent way. Very minor changes to the example will allow the problem to be solved in one, two or three dimensions.
This example will also introduce the PerfLog class as a way to monitor your code's performance. We will use it to instrument the matrix assembly code and look for bottlenecks where we should focus optimization efforts.
This example also shows how to extend example 3 to run in parallel. Notice how little has changed! The significant differences are marked with "PARALLEL CHANGE".
C++ include files that we need
#include <iostream>
#include <algorithm>
#include <math.h>
#include <set>
Basic include file needed for the mesh functionality.
#include "libmesh/libmesh.h"
#include "libmesh/mesh.h"
#include "libmesh/mesh_generation.h"
#include "libmesh/exodusII_io.h"
#include "libmesh/gnuplot_io.h"
#include "libmesh/linear_implicit_system.h"
#include "libmesh/equation_systems.h"
Define the Finite Element object.
#include "libmesh/fe.h"
Define Gauss quadrature rules.
#include "libmesh/quadrature_gauss.h"
Define the DofMap, which handles degree of freedom
indexing.
#include "libmesh/dof_map.h"
Define useful datatypes for finite element
matrix and vector components.
#include "libmesh/sparse_matrix.h"
#include "libmesh/numeric_vector.h"
#include "libmesh/dense_matrix.h"
#include "libmesh/dense_vector.h"
Define the PerfLog, a performance logging utility.
It is useful for timing events in a code and giving
you an idea where bottlenecks lie.
#include "libmesh/perf_log.h"
The definition of a geometric element
#include "libmesh/elem.h"
To impose Dirichlet boundary conditions
#include "libmesh/dirichlet_boundaries.h"
#include "libmesh/analytic_function.h"
#include "libmesh/string_to_enum.h"
#include "libmesh/getpot.h"
Bring in everything from the libMesh namespace
using namespace libMesh;
Function prototype. This is the function that will assemble
the linear system for our Poisson problem. Note that the
function will take the \p EquationSystems object and the
name of the system we are assembling as input. From the
\p EquationSystems object we have acess to the \p Mesh and
other objects we might need.
void assemble_poisson(EquationSystems& es,
const std::string& system_name);
Exact solution function prototype.
Real exact_solution (const Real x,
const Real y,
const Real z = 0.);
Define a wrapper for exact_solution that will be needed below
void exact_solution_wrapper (DenseVector<Number>& output,
const Point& p,
const Real)
{
output(0) = exact_solution(p(0),
(LIBMESH_DIM>1)?p(1):0,
(LIBMESH_DIM>2)?p(2):0);
}
Begin the main program.
int main (int argc, char** argv)
{
Initialize libMesh and any dependent libaries, like in example 2.
LibMeshInit init (argc, argv);
Declare a performance log for the main program
PerfLog perf_main("Main Program");
Create a GetPot object to parse the command line
Create a GetPot object to parse the command line
GetPot command_line (argc, argv);
Check for proper calling arguments.
if (argc < 3)
{
if (libMesh::processor_id() == 0)
std::cerr << "Usage:\n"
<<"\t " << argv[0] << " -d 2(3)" << " -n 15"
<< std::endl;
This handy function will print the file name, line number,
and then abort. Currrently the library does not use C++
exception handling.
libmesh_error();
}
Brief message to the user regarding the program name
and command line arguments.
else
{
std::cout << "Running " << argv[0];
for (int i=1; i<argc; i++)
std::cout << " " << argv[i];
std::cout << std::endl << std::endl;
}
Read problem dimension from command line. Use int
instead of unsigned since the GetPot overload is ambiguous
otherwise.
int dim = 2;
if ( command_line.search(1, "-d") )
dim = command_line.next(dim);
Skip higher-dimensional examples on a lower-dimensional libMesh build
libmesh_example_assert(dim <= LIBMESH_DIM, "2D/3D support");
Create a mesh with user-defined dimension.
Read number of elements from command line
int ps = 15;
if ( command_line.search(1, "-n") )
ps = command_line.next(ps);
Read FE order from command line
std::string order = "SECOND";
if ( command_line.search(2, "-Order", "-o") )
order = command_line.next(order);
Read FE Family from command line
std::string family = "LAGRANGE";
if ( command_line.search(2, "-FEFamily", "-f") )
family = command_line.next(family);
Cannot use discontinuous basis.
if ((family == "MONOMIAL") || (family == "XYZ"))
{
if (libMesh::processor_id() == 0)
std::cerr << "ex4 currently requires a C^0 (or higher) FE basis." << std::endl;
libmesh_error();
}
Mesh mesh;
Use the MeshTools::Generation mesh generator to create a uniform
grid on the square [-1,1]^D. We instruct the mesh generator
to build a mesh of 8x8 \p Quad9 elements in 2D, or \p Hex27
elements in 3D. Building these higher-order elements allows
us to use higher-order approximation, as in example 3.
Real halfwidth = dim > 1 ? 1. : 0.;
Real halfheight = dim > 2 ? 1. : 0.;
if ((family == "LAGRANGE") && (order == "FIRST"))
{
No reason to use high-order geometric elements if we are
solving with low-order finite elements.
MeshTools::Generation::build_cube (mesh,
ps,
(dim>1) ? ps : 0,
(dim>2) ? ps : 0,
-1., 1.,
-halfwidth, halfwidth,
-halfheight, halfheight,
(dim==1) ? EDGE2 :
((dim == 2) ? QUAD4 : HEX8));
}
else
{
MeshTools::Generation::build_cube (mesh,
ps,
(dim>1) ? ps : 0,
(dim>2) ? ps : 0,
-1., 1.,
-halfwidth, halfwidth,
-halfheight, halfheight,
(dim==1) ? EDGE3 :
((dim == 2) ? QUAD9 : HEX27));
}
Print information about the mesh to the screen.
mesh.print_info();
Create an equation systems object.
EquationSystems equation_systems (mesh);
Declare the system and its variables.
Create a system named "Poisson"
LinearImplicitSystem& system =
equation_systems.add_system<LinearImplicitSystem> ("Poisson");
Add the variable "u" to "Poisson". "u"
will be approximated using second-order approximation.
unsigned int u_var = system.add_variable("u",
Utility::string_to_enum<Order> (order),
Utility::string_to_enum<FEFamily>(family));
Give the system a pointer to the matrix assembly
function.
system.attach_assemble_function (assemble_poisson);
Construct a Dirichlet boundary condition object
Indicate which boundary IDs we impose the BC on We either build a line, a square or a cube, and here we indicate the boundaries IDs in each case
Indicate which boundary IDs we impose the BC on We either build a line, a square or a cube, and here we indicate the boundaries IDs in each case
std::set<boundary_id_type> boundary_ids;
the dim==1 mesh has two boundaries with IDs 0 and 1
boundary_ids.insert(0);
boundary_ids.insert(1);
the dim==2 mesh has four boundaries with IDs 0, 1, 2 and 3
if(dim>=2)
{
boundary_ids.insert(2);
boundary_ids.insert(3);
}
the dim==3 mesh has four boundaries with IDs 0, 1, 2, 3, 4 and 5
if(dim==3)
{
boundary_ids.insert(4);
boundary_ids.insert(5);
}
Create a vector storing the variable numbers which the BC applies to
std::vector<unsigned int> variables(1);
variables[0] = u_var;
Create an AnalyticFunction object that we use to project the BC
This function just calls the function exact_solution via exact_solution_wrapper
AnalyticFunction<> exact_solution_object(exact_solution_wrapper);
DirichletBoundary dirichlet_bc(boundary_ids,
variables,
&exact_solution_object);
We must add the Dirichlet boundary condition _before_
we call equation_systems.init()
system.get_dof_map().add_dirichlet_boundary(dirichlet_bc);
Initialize the data structures for the equation system.
equation_systems.init();
Print information about the system to the screen.
equation_systems.print_info();
mesh.print_info();
Solve the system "Poisson", just like example 2.
system.solve();
After solving the system write the solution
to a GMV-formatted plot file.
if(dim == 1)
{
GnuPlotIO plot(mesh,"Introduction Example 4, 1D",GnuPlotIO::GRID_ON);
plot.write_equation_systems("gnuplot_script",equation_systems);
}
#ifdef LIBMESH_HAVE_EXODUS_API
else
{
ExodusII_IO (mesh).write_equation_systems ((dim == 3) ?
"out_3.e" : "out_2.e",equation_systems);
}
#endif // #ifdef LIBMESH_HAVE_EXODUS_API
All done.
return 0;
}
We now define the matrix assembly function for the
Poisson system. We need to first compute element
matrices and right-hand sides, and then take into
account the boundary conditions.
void assemble_poisson(EquationSystems& es,
const std::string& system_name)
{
It is a good idea to make sure we are assembling
the proper system.
libmesh_assert_equal_to (system_name, "Poisson");
Declare a performance log. Give it a descriptive
string to identify what part of the code we are
logging, since there may be many PerfLogs in an
application.
PerfLog perf_log ("Matrix Assembly");
Get a constant reference to the mesh object.
const MeshBase& mesh = es.get_mesh();
The dimension that we are running
const unsigned int dim = mesh.mesh_dimension();
Get a reference to the LinearImplicitSystem we are solving
LinearImplicitSystem& system = es.get_system<LinearImplicitSystem>("Poisson");
A reference to the \p DofMap object for this system. The \p DofMap
object handles the index translation from node and element numbers
to degree of freedom numbers. We will talk more about the \p DofMap
in future examples.
const DofMap& dof_map = system.get_dof_map();
Get a constant reference to the Finite Element type
for the first (and only) variable in the system.
FEType fe_type = dof_map.variable_type(0);
Build a Finite Element object of the specified type. Since the
\p FEBase::build() member dynamically creates memory we will
store the object as an \p AutoPtr. This can be thought
of as a pointer that will clean up after itself.
AutoPtr<FEBase> fe (FEBase::build(dim, fe_type));
A 5th order Gauss quadrature rule for numerical integration.
QGauss qrule (dim, FIFTH);
Tell the finite element object to use our quadrature rule.
fe->attach_quadrature_rule (&qrule);
Declare a special finite element object for
boundary integration.
AutoPtr<FEBase> fe_face (FEBase::build(dim, fe_type));
Boundary integration requires one quadraure rule,
with dimensionality one less than the dimensionality
of the element.
QGauss qface(dim-1, FIFTH);
Tell the finte element object to use our
quadrature rule.
fe_face->attach_quadrature_rule (&qface);
Here we define some references to cell-specific data that
will be used to assemble the linear system.
We begin with the element Jacobian * quadrature weight at each
integration point.
const std::vector<Real>& JxW = fe->get_JxW();
The physical XY locations of the quadrature points on the element.
These might be useful for evaluating spatially varying material
properties at the quadrature points.
const std::vector<Point>& q_point = fe->get_xyz();
The element shape functions evaluated at the quadrature points.
const std::vector<std::vector<Real> >& phi = fe->get_phi();
The element shape function gradients evaluated at the quadrature
points.
const std::vector<std::vector<RealGradient> >& dphi = fe->get_dphi();
Define data structures to contain the element matrix
and right-hand-side vector contribution. Following
basic finite element terminology we will denote these
"Ke" and "Fe". More detail is in example 3.
DenseMatrix<Number> Ke;
DenseVector<Number> Fe;
This vector will hold the degree of freedom indices for
the element. These define where in the global system
the element degrees of freedom get mapped.
std::vector<dof_id_type> dof_indices;
Now we will loop over all the elements in the mesh.
We will compute the element matrix and right-hand-side
contribution. See example 3 for a discussion of the
element iterators.
MeshBase::const_element_iterator el = mesh.active_local_elements_begin();
const MeshBase::const_element_iterator end_el = mesh.active_local_elements_end();
for ( ; el != end_el; ++el)
{
Start logging the shape function initialization.
This is done through a simple function call with
the name of the event to log.
perf_log.push("elem init");
Store a pointer to the element we are currently
working on. This allows for nicer syntax later.
const Elem* elem = *el;
Get the degree of freedom indices for the
current element. These define where in the global
matrix and right-hand-side this element will
contribute to.
dof_map.dof_indices (elem, dof_indices);
Compute the element-specific data for the current
element. This involves computing the location of the
quadrature points (q_point) and the shape functions
(phi, dphi) for the current element.
fe->reinit (elem);
Zero the element matrix and right-hand side before
summing them. We use the resize member here because
the number of degrees of freedom might have changed from
the last element. Note that this will be the case if the
element type is different (i.e. the last element was a
triangle, now we are on a quadrilateral).
Ke.resize (dof_indices.size(),
dof_indices.size());
Fe.resize (dof_indices.size());
Stop logging the shape function initialization.
If you forget to stop logging an event the PerfLog
object will probably catch the error and abort.
perf_log.pop("elem init");
Now we will build the element matrix. This involves
a double loop to integrate the test funcions (i) against
the trial functions (j).
We have split the numeric integration into two loops so that we can log the matrix and right-hand-side computation seperately.
Now start logging the element matrix computation
We have split the numeric integration into two loops so that we can log the matrix and right-hand-side computation seperately.
Now start logging the element matrix computation
perf_log.push ("Ke");
for (unsigned int qp=0; qp<qrule.n_points(); qp++)
for (unsigned int i=0; i<phi.size(); i++)
for (unsigned int j=0; j<phi.size(); j++)
Ke(i,j) += JxW[qp]*(dphi[i][qp]*dphi[j][qp]);
Stop logging the matrix computation
perf_log.pop ("Ke");
Now we build the element right-hand-side contribution.
This involves a single loop in which we integrate the
"forcing function" in the PDE against the test functions.
Start logging the right-hand-side computation
Start logging the right-hand-side computation
perf_log.push ("Fe");
for (unsigned int qp=0; qp<qrule.n_points(); qp++)
{
fxy is the forcing function for the Poisson equation.
In this case we set fxy to be a finite difference
Laplacian approximation to the (known) exact solution.
We will use the second-order accurate FD Laplacian approximation, which in 2D on a structured grid is
u_xx + u_yy = (u(i-1,j) + u(i+1,j) + u(i,j-1) + u(i,j+1) + -4*u(i,j))/h^2
Since the value of the forcing function depends only on the location of the quadrature point (q_point[qp]) we will compute it here, outside of the i-loop
We will use the second-order accurate FD Laplacian approximation, which in 2D on a structured grid is
u_xx + u_yy = (u(i-1,j) + u(i+1,j) + u(i,j-1) + u(i,j+1) + -4*u(i,j))/h^2
Since the value of the forcing function depends only on the location of the quadrature point (q_point[qp]) we will compute it here, outside of the i-loop
const Real x = q_point[qp](0);
#if LIBMESH_DIM > 1
const Real y = q_point[qp](1);
#else
const Real y = 0.;
#endif
#if LIBMESH_DIM > 2
const Real z = q_point[qp](2);
#else
const Real z = 0.;
#endif
const Real eps = 1.e-3;
const Real uxx = (exact_solution(x-eps,y,z) +
exact_solution(x+eps,y,z) +
-2.*exact_solution(x,y,z))/eps/eps;
const Real uyy = (exact_solution(x,y-eps,z) +
exact_solution(x,y+eps,z) +
-2.*exact_solution(x,y,z))/eps/eps;
const Real uzz = (exact_solution(x,y,z-eps) +
exact_solution(x,y,z+eps) +
-2.*exact_solution(x,y,z))/eps/eps;
Real fxy;
if(dim==1)
{
In 1D, compute the rhs by differentiating the
exact solution twice.
const Real pi = libMesh::pi;
fxy = (0.25*pi*pi)*sin(.5*pi*x);
}
else
{
fxy = - (uxx + uyy + ((dim==2) ? 0. : uzz));
}
Add the RHS contribution
for (unsigned int i=0; i<phi.size(); i++)
Fe(i) += JxW[qp]*fxy*phi[i][qp];
}
Stop logging the right-hand-side computation
perf_log.pop ("Fe");
If this assembly program were to be used on an adaptive mesh,
we would have to apply any hanging node constraint equations
Also, note that here we call heterogenously_constrain_element_matrix_and_vector
to impose a inhomogeneous Dirichlet boundary conditions.
dof_map.heterogenously_constrain_element_matrix_and_vector (Ke, Fe, dof_indices);
The element matrix and right-hand-side are now built
for this element. Add them to the global matrix and
right-hand-side vector. The \p SparseMatrix::add_matrix()
and \p NumericVector::add_vector() members do this for us.
Start logging the insertion of the local (element)
matrix and vector into the global matrix and vector
perf_log.push ("matrix insertion");
system.matrix->add_matrix (Ke, dof_indices);
system.rhs->add_vector (Fe, dof_indices);
Start logging the insertion of the local (element)
matrix and vector into the global matrix and vector
perf_log.pop ("matrix insertion");
}
That's it. We don't need to do anything else to the
PerfLog. When it goes out of scope (at this function return)
it will print its log to the screen. Pretty easy, huh?
}
The source file exact_solution.C without comments:
#include <math.h>
#include "libmesh/libmesh_common.h"
using namespace libMesh;
/**
* This is the exact solution that
* we are trying to obtain. We will solve
*
* - (u_xx + u_yy) = f
*
* and take a finite difference approximation using this
* function to get f. This is the well-known "method of
* manufactured solutions".
*/
Real exact_solution (const Real x,
const Real y,
const Real z = 0.)
{
static const Real pi = acos(-1.);
return cos(.5*pi*x)*sin(.5*pi*y)*cos(.5*pi*z);
}
The source file introduction_ex4.C without comments:
#include <iostream>
#include <algorithm>
#include <math.h>
#include <set>
#include "libmesh/libmesh.h"
#include "libmesh/mesh.h"
#include "libmesh/mesh_generation.h"
#include "libmesh/exodusII_io.h"
#include "libmesh/gnuplot_io.h"
#include "libmesh/linear_implicit_system.h"
#include "libmesh/equation_systems.h"
#include "libmesh/fe.h"
#include "libmesh/quadrature_gauss.h"
#include "libmesh/dof_map.h"
#include "libmesh/sparse_matrix.h"
#include "libmesh/numeric_vector.h"
#include "libmesh/dense_matrix.h"
#include "libmesh/dense_vector.h"
#include "libmesh/perf_log.h"
#include "libmesh/elem.h"
#include "libmesh/dirichlet_boundaries.h"
#include "libmesh/analytic_function.h"
#include "libmesh/string_to_enum.h"
#include "libmesh/getpot.h"
using namespace libMesh;
void assemble_poisson(EquationSystems& es,
const std::string& system_name);
Real exact_solution (const Real x,
const Real y,
const Real z = 0.);
void exact_solution_wrapper (DenseVector<Number>& output,
const Point& p,
const Real)
{
output(0) = exact_solution(p(0),
(LIBMESH_DIM>1)?p(1):0,
(LIBMESH_DIM>2)?p(2):0);
}
int main (int argc, char** argv)
{
LibMeshInit init (argc, argv);
GetPot command_line (argc, argv);
if (argc < 3)
{
if (libMesh::processor_id() == 0)
std::cerr << "Usage:\n"
<<"\t " << argv[0] << " -d 2(3)" << " -n 15"
<< std::endl;
libmesh_error();
}
else
{
std::cout << "Running " << argv[0];
for (int i=1; i<argc; i++)
std::cout << " " << argv[i];
std::cout << std::endl << std::endl;
}
int dim = 2;
if ( command_line.search(1, "-d") )
dim = command_line.next(dim);
libmesh_example_assert(dim <= LIBMESH_DIM, "2D/3D support");
int ps = 15;
if ( command_line.search(1, "-n") )
ps = command_line.next(ps);
std::string order = "SECOND";
if ( command_line.search(2, "-Order", "-o") )
order = command_line.next(order);
std::string family = "LAGRANGE";
if ( command_line.search(2, "-FEFamily", "-f") )
family = command_line.next(family);
if ((family == "MONOMIAL") || (family == "XYZ"))
{
if (libMesh::processor_id() == 0)
std::cerr << "ex4 currently requires a C^0 (or higher) FE basis." << std::endl;
libmesh_error();
}
Mesh mesh;
Real halfwidth = dim > 1 ? 1. : 0.;
Real halfheight = dim > 2 ? 1. : 0.;
if ((family == "LAGRANGE") && (order == "FIRST"))
{
MeshTools::Generation::build_cube (mesh,
ps,
(dim>1) ? ps : 0,
(dim>2) ? ps : 0,
-1., 1.,
-halfwidth, halfwidth,
-halfheight, halfheight,
(dim==1) ? EDGE2 :
((dim == 2) ? QUAD4 : HEX8));
}
else
{
MeshTools::Generation::build_cube (mesh,
ps,
(dim>1) ? ps : 0,
(dim>2) ? ps : 0,
-1., 1.,
-halfwidth, halfwidth,
-halfheight, halfheight,
(dim==1) ? EDGE3 :
((dim == 2) ? QUAD9 : HEX27));
}
mesh.print_info();
EquationSystems equation_systems (mesh);
LinearImplicitSystem& system =
equation_systems.add_system<LinearImplicitSystem> ("Poisson");
unsigned int u_var = system.add_variable("u",
Utility::string_to_enum<Order> (order),
Utility::string_to_enum<FEFamily>(family));
system.attach_assemble_function (assemble_poisson);
std::set<boundary_id_type> boundary_ids;
boundary_ids.insert(0);
boundary_ids.insert(1);
if(dim>=2)
{
boundary_ids.insert(2);
boundary_ids.insert(3);
}
if(dim==3)
{
boundary_ids.insert(4);
boundary_ids.insert(5);
}
std::vector<unsigned int> variables(1);
variables[0] = u_var;
AnalyticFunction<> exact_solution_object(exact_solution_wrapper);
DirichletBoundary dirichlet_bc(boundary_ids,
variables,
&exact_solution_object);
system.get_dof_map().add_dirichlet_boundary(dirichlet_bc);
equation_systems.init();
equation_systems.print_info();
mesh.print_info();
system.solve();
if(dim == 1)
{
GnuPlotIO plot(mesh,"Introduction Example 4, 1D",GnuPlotIO::GRID_ON);
plot.write_equation_systems("gnuplot_script",equation_systems);
}
#ifdef LIBMESH_HAVE_EXODUS_API
else
{
ExodusII_IO (mesh).write_equation_systems ((dim == 3) ?
"out_3.e" : "out_2.e",equation_systems);
}
#endif // #ifdef LIBMESH_HAVE_EXODUS_API
return 0;
}
void assemble_poisson(EquationSystems& es,
const std::string& system_name)
{
libmesh_assert_equal_to (system_name, "Poisson");
PerfLog perf_log ("Matrix Assembly");
const MeshBase& mesh = es.get_mesh();
const unsigned int dim = mesh.mesh_dimension();
LinearImplicitSystem& system = es.get_system<LinearImplicitSystem>("Poisson");
const DofMap& dof_map = system.get_dof_map();
FEType fe_type = dof_map.variable_type(0);
AutoPtr<FEBase> fe (FEBase::build(dim, fe_type));
QGauss qrule (dim, FIFTH);
fe->attach_quadrature_rule (&qrule);
AutoPtr<FEBase> fe_face (FEBase::build(dim, fe_type));
QGauss qface(dim-1, FIFTH);
fe_face->attach_quadrature_rule (&qface);
const std::vector<Real>& JxW = fe->get_JxW();
const std::vector<Point>& q_point = fe->get_xyz();
const std::vector<std::vector<Real> >& phi = fe->get_phi();
const std::vector<std::vector<RealGradient> >& dphi = fe->get_dphi();
DenseMatrix<Number> Ke;
DenseVector<Number> Fe;
std::vector<dof_id_type> dof_indices;
MeshBase::const_element_iterator el = mesh.active_local_elements_begin();
const MeshBase::const_element_iterator end_el = mesh.active_local_elements_end();
for ( ; el != end_el; ++el)
{
perf_log.push("elem init");
const Elem* elem = *el;
dof_map.dof_indices (elem, dof_indices);
fe->reinit (elem);
Ke.resize (dof_indices.size(),
dof_indices.size());
Fe.resize (dof_indices.size());
perf_log.pop("elem init");
perf_log.push ("Ke");
for (unsigned int qp=0; qp<qrule.n_points(); qp++)
for (unsigned int i=0; i<phi.size(); i++)
for (unsigned int j=0; j<phi.size(); j++)
Ke(i,j) += JxW[qp]*(dphi[i][qp]*dphi[j][qp]);
perf_log.pop ("Ke");
perf_log.push ("Fe");
for (unsigned int qp=0; qp<qrule.n_points(); qp++)
{
const Real x = q_point[qp](0);
#if LIBMESH_DIM > 1
const Real y = q_point[qp](1);
#else
const Real y = 0.;
#endif
#if LIBMESH_DIM > 2
const Real z = q_point[qp](2);
#else
const Real z = 0.;
#endif
const Real eps = 1.e-3;
const Real uxx = (exact_solution(x-eps,y,z) +
exact_solution(x+eps,y,z) +
-2.*exact_solution(x,y,z))/eps/eps;
const Real uyy = (exact_solution(x,y-eps,z) +
exact_solution(x,y+eps,z) +
-2.*exact_solution(x,y,z))/eps/eps;
const Real uzz = (exact_solution(x,y,z-eps) +
exact_solution(x,y,z+eps) +
-2.*exact_solution(x,y,z))/eps/eps;
Real fxy;
if(dim==1)
{
const Real pi = libMesh::pi;
fxy = (0.25*pi*pi)*sin(.5*pi*x);
}
else
{
fxy = - (uxx + uyy + ((dim==2) ? 0. : uzz));
}
for (unsigned int i=0; i<phi.size(); i++)
Fe(i) += JxW[qp]*fxy*phi[i][qp];
}
perf_log.pop ("Fe");
dof_map.heterogenously_constrain_element_matrix_and_vector (Ke, Fe, dof_indices);
perf_log.push ("matrix insertion");
system.matrix->add_matrix (Ke, dof_indices);
system.rhs->add_vector (Fe, dof_indices);
perf_log.pop ("matrix insertion");
}
}
The console output of the program:
***************************************************************
* Running Example introduction_ex4:
* mpirun -np 2 example-devel -d 1 -n 20 -pc_type bjacobi -sub_pc_type ilu -sub_pc_factor_levels 4 -sub_pc_factor_zeropivot 0 -ksp_right_pc -log_summary
***************************************************************
Running /workspace/libmesh/examples/introduction/introduction_ex4/.libs/lt-example-devel -d 1 -n 20 -pc_type bjacobi -sub_pc_type ilu -sub_pc_factor_levels 4 -sub_pc_factor_zeropivot 0 -ksp_right_pc -log_summary
Mesh Information:
mesh_dimension()=1
spatial_dimension()=3
n_nodes()=41
n_local_nodes()=21
n_elem()=20
n_local_elem()=10
n_active_elem()=20
n_subdomains()=1
n_partitions()=2
n_processors()=2
n_threads()=1
processor_id()=0
EquationSystems
n_systems()=1
System #0, "Poisson"
Type "LinearImplicit"
Variables="u"
Finite Element Types="LAGRANGE", "JACOBI_20_00"
Infinite Element Mapping="CARTESIAN"
Approximation Orders="SECOND", "THIRD"
n_dofs()=41
n_local_dofs()=21
n_constrained_dofs()=2
n_local_constrained_dofs()=1
n_vectors()=1
n_matrices()=1
DofMap Sparsity
Average On-Processor Bandwidth <= 3.85366
Average Off-Processor Bandwidth <= 0.097561
Maximum On-Processor Bandwidth <= 5
Maximum Off-Processor Bandwidth <= 2
DofMap Constraints
Number of DoF Constraints = 2
Average DoF Constraint Length= 0
Number of Node Constraints = 0
Mesh Information:
mesh_dimension()=1
spatial_dimension()=3
n_nodes()=41
n_local_nodes()=21
n_elem()=20
n_local_elem()=10
n_active_elem()=20
n_subdomains()=1
n_partitions()=2
n_processors()=2
n_threads()=1
processor_id()=0
----------------------------------------------------------------------------------------------------------------------
| Processor id: 0 |
| Num Processors: 2 |
| Time: Tue Feb 5 13:38:02 2013 |
| OS: Linux |
| HostName: hbar.ices.utexas.edu |
| OS Release: 2.6.32-279.1.1.el6.x86_64 |
| OS Version: #1 SMP Tue Jul 10 11:24:23 CDT 2012 |
| Machine: x86_64 |
| Username: benkirk |
| Configuration: ./configure '--enable-everything' |
| '--prefix=/workspace/libmesh/install' |
| 'CXX=mpicxx' |
| 'CC=mpicc' |
| 'F77=mpif77' |
| 'FC=mpif90' |
| 'PETSC_DIR=/opt/apps/ossw/libraries/petsc/petsc-3.3-p2' |
| 'PETSC_ARCH=intel-12.1-mkl-intel-10.3.12.361-mpich2-1.4.1p1-cxx-opt' |
| 'SLEPC_DIR=/opt/apps/ossw/libraries/slepc/slepc-3.3-p2-petsc-3.3-p2-cxx-opt' |
| 'TRILINOS_DIR=/opt/apps/ossw/libraries/trilinos/trilinos-10.12.2/sl6/intel-12.1/mpich2-1.4.1p1/mkl-intel-10.3.12.361'|
| 'VTK_DIR=/opt/apps/ossw/libraries/vtk/vtk-5.10.0/sl6/intel-12.1' |
----------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------
| Matrix Assembly Performance: Alive time=0.010906, Active time=0.000976 |
-----------------------------------------------------------------------------------------------------------
| Event nCalls Total Time Avg Time Total Time Avg Time % of Active Time |
| w/o Sub w/o Sub With Sub With Sub w/o S With S |
|-----------------------------------------------------------------------------------------------------------|
| |
| Fe 10 0.0000 0.000004 0.0000 0.000004 3.69 3.69 |
| Ke 10 0.0000 0.000004 0.0000 0.000004 3.79 3.79 |
| elem init 10 0.0008 0.000083 0.0008 0.000083 85.25 85.25 |
| matrix insertion 10 0.0001 0.000007 0.0001 0.000007 7.27 7.27 |
-----------------------------------------------------------------------------------------------------------
| Totals: 40 0.0010 100.00 |
-----------------------------------------------------------------------------------------------------------
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
/workspace/libmesh/examples/introduction/introduction_ex4/.libs/lt-example-devel on a intel-12. named hbar.ices.utexas.edu with 2 processors, by benkirk Tue Feb 5 13:38:02 2013
Using Petsc Release Version 3.3.0, Patch 2, Fri Jul 13 15:42:00 CDT 2012
Max Max/Min Avg Total
Time (sec): 3.167e-02 1.00000 3.167e-02
Objects: 4.300e+01 1.00000 4.300e+01
Flops: 2.950e+03 1.05169 2.878e+03 5.755e+03
Flops/sec: 9.316e+04 1.05169 9.087e+04 1.817e+05
MPI Messages: 1.150e+01 1.00000 1.150e+01 2.300e+01
MPI Message Lengths: 1.240e+02 1.00000 1.078e+01 2.480e+02
MPI Reductions: 5.300e+01 1.00000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flops
and VecAXPY() for complex vectors of length N --> 8N flops
Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts %Total Avg %Total counts %Total
0: Main Stage: 3.1634e-02 99.9% 5.7550e+03 100.0% 2.300e+01 100.0% 1.078e+01 100.0% 5.200e+01 98.1%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flops: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
Avg. len: average message length
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %f - percent flops in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flops --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %f %M %L %R %T %f %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
VecMDot 3 1.0 2.4796e-05 1.0 2.46e+02 1.1 0.0e+00 0.0e+00 3.0e+00 0 8 0 0 6 0 8 0 0 6 19
VecNorm 5 1.0 3.3855e-05 1.0 2.10e+02 1.1 0.0e+00 0.0e+00 5.0e+00 0 7 0 0 9 0 7 0 0 10 12
VecScale 4 1.0 1.6928e-05 1.0 8.40e+01 1.1 0.0e+00 0.0e+00 0.0e+00 0 3 0 0 0 0 3 0 0 0 10
VecCopy 2 1.0 5.0068e-06 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 11 1.0 7.3910e-06 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 2 1.0 1.4067e-05 1.1 8.40e+01 1.1 0.0e+00 0.0e+00 0.0e+00 0 3 0 0 0 0 3 0 0 0 12
VecMAXPY 4 1.0 2.1458e-06 2.2 3.78e+02 1.1 0.0e+00 0.0e+00 0.0e+00 0 13 0 0 0 0 13 0 0 0 344
VecAssemblyBegin 3 1.0 7.3910e-05 1.0 0.00e+00 0.0 2.0e+00 6.0e+00 9.0e+00 0 0 9 5 17 0 0 9 5 17 0
VecAssemblyEnd 3 1.0 2.4080e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecScatterBegin 5 1.0 4.9829e-05 1.0 0.00e+00 0.0 1.0e+01 1.4e+01 0.0e+00 0 0 43 55 0 0 0 43 55 0 0
VecScatterEnd 5 1.0 1.2398e-05 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecNormalize 4 1.0 9.0599e-05 1.0 2.52e+02 1.1 0.0e+00 0.0e+00 4.0e+00 0 9 0 0 8 0 9 0 0 8 5
MatMult 4 1.0 8.2254e-05 1.0 5.80e+02 1.1 8.0e+00 1.2e+01 0.0e+00 0 20 35 39 0 0 20 35 39 0 14
MatSolve 5 1.0 7.1526e-06 1.0 1.06e+03 1.0 0.0e+00 0.0e+00 0.0e+00 0 36 0 0 0 0 36 0 0 0 292
MatLUFactorNum 1 1.0 2.5988e-05 1.0 3.03e+02 1.1 0.0e+00 0.0e+00 0.0e+00 0 10 0 0 0 0 10 0 0 0 23
MatILUFactorSym 1 1.0 8.8930e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00 0 0 0 0 6 0 0 0 0 6 0
MatAssemblyBegin 2 1.0 7.8106e-04 5.4 0.00e+00 0.0 3.0e+00 1.7e+01 4.0e+00 1 0 13 21 8 1 0 13 21 8 0
MatAssemblyEnd 2 1.0 3.7885e-04 1.0 0.00e+00 0.0 4.0e+00 5.0e+00 8.0e+00 1 0 17 8 15 1 0 17 8 15 0
MatGetRowIJ 1 1.0 4.0531e-06 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetOrdering 1 1.0 7.2956e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 4 0 0 0 0 4 0
MatZeroEntries 3 1.0 5.9605e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPGMRESOrthog 3 1.0 5.4836e-05 1.0 4.98e+02 1.1 0.0e+00 0.0e+00 3.0e+00 0 17 0 0 6 0 17 0 0 6 18
KSPSetUp 2 1.0 9.2030e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSolve 1 1.0 1.5152e-03 1.0 2.95e+03 1.1 8.0e+00 1.2e+01 1.5e+01 5100 35 39 28 5100 35 39 29 4
PCSetUp 2 1.0 7.3218e-04 1.0 3.03e+02 1.1 0.0e+00 0.0e+00 7.0e+00 2 10 0 0 13 2 10 0 0 13 1
PCSetUpOnBlocks 1 1.0 3.3998e-04 1.0 3.03e+02 1.1 0.0e+00 0.0e+00 5.0e+00 1 10 0 0 9 1 10 0 0 10 2
PCApply 5 1.0 1.2517e-04 1.0 1.06e+03 1.0 0.0e+00 0.0e+00 0.0e+00 0 36 0 0 0 0 36 0 0 0 17
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Vector 24 24 38872 0
Vector Scatter 2 2 2072 0
Index Set 7 7 5292 0
IS L to G Mapping 1 1 564 0
Matrix 4 4 13224 0
Krylov Solver 2 2 19360 0
Preconditioner 2 2 1784 0
Viewer 1 0 0 0
========================================================================================================================
Average time to get PetscTime(): 9.53674e-08
Average time for MPI_Barrier(): 6.19888e-07
Average time for zero size MPI_Send(): 1.44243e-05
#PETSc Option Table entries:
-d 1
-ksp_right_pc
-log_summary
-n 20
-pc_type bjacobi
-sub_pc_factor_levels 4
-sub_pc_factor_zeropivot 0
-sub_pc_type ilu
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure run at: Thu Nov 8 11:21:02 2012
Configure options: --with-debugging=false --COPTFLAGS=-O3 --CXXOPTFLAGS=-O3 --FOPTFLAGS=-O3 --with-clanguage=C++ --with-shared-libraries=1 --with-mpi-dir=/opt/apps/ossw/libraries/mpich2/mpich2-1.4.1p1/sl6/intel-12.1 --with-mumps=true --download-mumps=1 --with-metis=true --download-metis=1 --with-parmetis=true --download-parmetis=1 --with-superlu=true --download-superlu=1 --with-superludir=true --download-superlu_dist=1 --with-blacs=true --download-blacs=1 --with-scalapack=true --download-scalapack=1 --with-hypre=true --download-hypre=1 --with-blas-lib="[/opt/apps/sysnet/intel/12.1/mkl/10.3.12.361/lib/intel64/libmkl_intel_lp64.so,/opt/apps/sysnet/intel/12.1/mkl/10.3.12.361/lib/intel64/libmkl_sequential.so,/opt/apps/sysnet/intel/12.1/mkl/10.3.12.361/lib/intel64/libmkl_core.so]" --with-lapack-lib="[/opt/apps/sysnet/intel/12.1/mkl/10.3.12.361/lib/intel64/libmkl_lapack95_lp64.a]"
-----------------------------------------
Libraries compiled on Thu Nov 8 11:21:02 2012 on daedalus.ices.utexas.edu
Machine characteristics: Linux-2.6.32-279.1.1.el6.x86_64-x86_64-with-redhat-6.3-Carbon
Using PETSc directory: /opt/apps/ossw/libraries/petsc/petsc-3.3-p2
Using PETSc arch: intel-12.1-mkl-intel-10.3.12.361-mpich2-1.4.1p1-cxx-opt
-----------------------------------------
Using C compiler: /opt/apps/ossw/libraries/mpich2/mpich2-1.4.1p1/sl6/intel-12.1/bin/mpicxx -wd1572 -O3 -fPIC ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: /opt/apps/ossw/libraries/mpich2/mpich2-1.4.1p1/sl6/intel-12.1/bin/mpif90 -fPIC -O3 ${FOPTFLAGS} ${FFLAGS}
-----------------------------------------
Using include paths: -I/opt/apps/ossw/libraries/petsc/petsc-3.3-p2/intel-12.1-mkl-intel-10.3.12.361-mpich2-1.4.1p1-cxx-opt/include -I/opt/apps/ossw/libraries/petsc/petsc-3.3-p2/include -I/opt/apps/ossw/libraries/petsc/petsc-3.3-p2/include -I/opt/apps/ossw/libraries/petsc/petsc-3.3-p2/intel-12.1-mkl-intel-10.3.12.361-mpich2-1.4.1p1-cxx-opt/include -I/opt/apps/ossw/libraries/mpich2/mpich2-1.4.1p1/sl6/intel-12.1/include
-----------------------------------------
Using C linker: /opt/apps/ossw/libraries/mpich2/mpich2-1.4.1p1/sl6/intel-12.1/bin/mpicxx
Using Fortran linker: /opt/apps/ossw/libraries/mpich2/mpich2-1.4.1p1/sl6/intel-12.1/bin/mpif90
Using libraries: -Wl,-rpath,/opt/apps/ossw/libraries/petsc/petsc-3.3-p2/intel-12.1-mkl-intel-10.3.12.361-mpich2-1.4.1p1-cxx-opt/lib -L/opt/apps/ossw/libraries/petsc/petsc-3.3-p2/intel-12.1-mkl-intel-10.3.12.361-mpich2-1.4.1p1-cxx-opt/lib -lpetsc -lX11 -Wl,-rpath,/opt/apps/ossw/libraries/petsc/petsc-3.3-p2/intel-12.1-mkl-intel-10.3.12.361-mpich2-1.4.1p1-cxx-opt/lib -L/opt/apps/ossw/libraries/petsc/petsc-3.3-p2/intel-12.1-mkl-intel-10.3.12.361-mpich2-1.4.1p1-cxx-opt/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lHYPRE -lpthread -lsuperlu_dist_3.0 -lparmetis -lmetis -lscalapack -lblacs -lsuperlu_4.3 -Wl,-rpath,/opt/apps/sysnet/intel/12.1/mkl/10.3.12.361/lib/intel64 -L/opt/apps/sysnet/intel/12.1/mkl/10.3.12.361/lib/intel64 -lmkl_lapack95_lp64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -Wl,-rpath,/opt/apps/ossw/libraries/mpich2/mpich2-1.4.1p1/sl6/intel-12.1/lib -L/opt/apps/ossw/libraries/mpich2/mpich2-1.4.1p1/sl6/intel-12.1/lib -Wl,-rpath,/opt/apps/sysnet/intel/12.1/composer_xe_2011_sp1.7.256/compiler/lib/intel64 -L/opt/apps/sysnet/intel/12.1/composer_xe_2011_sp1.7.256/compiler/lib/intel64 -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.4.6 -L/usr/lib/gcc/x86_64-redhat-linux/4.4.6 -lmpichf90 -lifport -lifcore -lm -lm -lmpichcxx -ldl -lmpich -lopa -lmpl -lrt -lpthread -limf -lsvml -lipgo -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lirc_s -ldl
-----------------------------------------
--------------------------------------------------------------------------------------------------------------------
| libMesh Performance: Alive time=0.040639, Active time=0.024324 |
--------------------------------------------------------------------------------------------------------------------
| Event nCalls Total Time Avg Time Total Time Avg Time % of Active Time |
| w/o Sub w/o Sub With Sub With Sub w/o S With S |
|--------------------------------------------------------------------------------------------------------------------|
| |
| |
| DofMap |
| add_neighbors_to_send_list() 1 0.0003 0.000265 0.0003 0.000313 1.09 1.29 |
| build_constraint_matrix_and_vector() 10 0.0001 0.000006 0.0001 0.000006 0.25 0.25 |
| build_sparsity() 1 0.0004 0.000370 0.0008 0.000834 1.52 3.43 |
| create_dof_constraints() 1 0.0007 0.000693 0.0011 0.001122 2.85 4.61 |
| distribute_dofs() 1 0.0005 0.000522 0.0012 0.001237 2.15 5.09 |
| dof_indices() 52 0.0014 0.000028 0.0014 0.000028 5.95 5.95 |
| hetero_cnstrn_elem_mat_vec() 10 0.0093 0.000930 0.0093 0.000930 38.22 38.22 |
| prepare_send_list() 1 0.0000 0.000008 0.0000 0.000008 0.03 0.03 |
| reinit() 1 0.0006 0.000557 0.0006 0.000557 2.29 2.29 |
| |
| EquationSystems |
| build_solution_vector() 1 0.0003 0.000269 0.0009 0.000860 1.11 3.54 |
| |
| FE |
| compute_shape_functions() 10 0.0001 0.000009 0.0001 0.000009 0.36 0.36 |
| init_shape_functions() 1 0.0001 0.000091 0.0001 0.000091 0.37 0.37 |
| |
| FEMap |
| compute_affine_map() 10 0.0001 0.000009 0.0001 0.000009 0.36 0.36 |
| init_reference_to_physical_map() 1 0.0001 0.000062 0.0001 0.000062 0.25 0.25 |
| |
| GnuPlotIO |
| write_nodal_data() 1 0.0010 0.000998 0.0010 0.000998 4.10 4.10 |
| |
| Mesh |
| find_neighbors() 1 0.0007 0.000694 0.0007 0.000748 2.85 3.08 |
| renumber_nodes_and_elem() 2 0.0001 0.000039 0.0001 0.000039 0.32 0.32 |
| |
| MeshCommunication |
| compute_hilbert_indices() 2 0.0003 0.000140 0.0003 0.000140 1.15 1.15 |
| find_global_indices() 2 0.0004 0.000187 0.0015 0.000763 1.54 6.27 |
| parallel_sort() 2 0.0005 0.000252 0.0006 0.000289 2.07 2.38 |
| |
| MeshOutput |
| write_equation_systems() 1 0.0001 0.000108 0.0020 0.002033 0.44 8.36 |
| |
| MeshTools::Generation |
| build_cube() 1 0.0003 0.000295 0.0003 0.000295 1.21 1.21 |
| |
| MetisPartitioner |
| partition() 1 0.0007 0.000697 0.0011 0.001128 2.87 4.64 |
| |
| Parallel |
| allgather() 11 0.0002 0.000016 0.0002 0.000018 0.74 0.81 |
| max(bool) 1 0.0000 0.000003 0.0000 0.000003 0.01 0.01 |
| max(scalar) 113 0.0003 0.000003 0.0003 0.000003 1.19 1.19 |
| max(vector) 26 0.0002 0.000007 0.0004 0.000014 0.70 1.45 |
| min(bool) 131 0.0003 0.000002 0.0003 0.000002 1.33 1.33 |
| min(scalar) 107 0.0005 0.000004 0.0005 0.000004 1.98 1.98 |
| min(vector) 26 0.0002 0.000008 0.0004 0.000017 0.88 1.80 |
| probe() 12 0.0001 0.000007 0.0001 0.000007 0.35 0.35 |
| receive() 12 0.0001 0.000007 0.0002 0.000015 0.37 0.72 |
| send() 12 0.0001 0.000004 0.0001 0.000004 0.21 0.21 |
| send_receive() 16 0.0002 0.000010 0.0004 0.000028 0.68 1.83 |
| sum() 20 0.0001 0.000006 0.0002 0.000010 0.49 0.79 |
| |
| Parallel::Request |
| wait() 12 0.0000 0.000004 0.0000 0.000004 0.21 0.21 |
| |
| Partitioner |
| set_node_processor_ids() 1 0.0003 0.000269 0.0003 0.000344 1.11 1.41 |
| set_parent_processor_ids() 1 0.0001 0.000056 0.0001 0.000056 0.23 0.23 |
| |
| PetscLinearSolver |
| solve() 1 0.0029 0.002871 0.0029 0.002871 11.80 11.80 |
| |
| System |
| assemble() 1 0.0011 0.001063 0.0111 0.011142 4.37 45.81 |
--------------------------------------------------------------------------------------------------------------------
| Totals: 617 0.0243 100.00 |
--------------------------------------------------------------------------------------------------------------------
***************************************************************
* Done Running Example introduction_ex4:
* mpirun -np 2 example-devel -d 1 -n 20 -pc_type bjacobi -sub_pc_type ilu -sub_pc_factor_levels 4 -sub_pc_factor_zeropivot 0 -ksp_right_pc -log_summary
***************************************************************
***************************************************************
* Running Example introduction_ex4:
* mpirun -np 2 example-devel -d 2 -n 15 -pc_type bjacobi -sub_pc_type ilu -sub_pc_factor_levels 4 -sub_pc_factor_zeropivot 0 -ksp_right_pc -log_summary
***************************************************************
Running /workspace/libmesh/examples/introduction/introduction_ex4/.libs/lt-example-devel -d 2 -n 15 -pc_type bjacobi -sub_pc_type ilu -sub_pc_factor_levels 4 -sub_pc_factor_zeropivot 0 -ksp_right_pc -log_summary
Mesh Information:
mesh_dimension()=2
spatial_dimension()=3
n_nodes()=961
n_local_nodes()=495
n_elem()=225
n_local_elem()=112
n_active_elem()=225
n_subdomains()=1
n_partitions()=2
n_processors()=2
n_threads()=1
processor_id()=0
EquationSystems
n_systems()=1
System #0, "Poisson"
Type "LinearImplicit"
Variables="u"
Finite Element Types="LAGRANGE", "JACOBI_20_00"
Infinite Element Mapping="CARTESIAN"
Approximation Orders="SECOND", "THIRD"
n_dofs()=961
n_local_dofs()=495
n_constrained_dofs()=120
n_local_constrained_dofs()=61
n_vectors()=1
n_matrices()=1
DofMap Sparsity
Average On-Processor Bandwidth <= 14.8491
Average Off-Processor Bandwidth <= 0.520291
Maximum On-Processor Bandwidth <= 26
Maximum Off-Processor Bandwidth <= 14
DofMap Constraints
Number of DoF Constraints = 120
Number of Heterogenous Constraints= 118
Average DoF Constraint Length= 0
Number of Node Constraints = 0
Mesh Information:
mesh_dimension()=2
spatial_dimension()=3
n_nodes()=961
n_local_nodes()=495
n_elem()=225
n_local_elem()=112
n_active_elem()=225
n_subdomains()=1
n_partitions()=2
n_processors()=2
n_threads()=1
processor_id()=0
----------------------------------------------------------------------------------------------------------------------
| Processor id: 0 |
| Num Processors: 2 |
| Time: Tue Feb 5 13:38:03 2013 |
| OS: Linux |
| HostName: hbar.ices.utexas.edu |
| OS Release: 2.6.32-279.1.1.el6.x86_64 |
| OS Version: #1 SMP Tue Jul 10 11:24:23 CDT 2012 |
| Machine: x86_64 |
| Username: benkirk |
| Configuration: ./configure '--enable-everything' |
| '--prefix=/workspace/libmesh/install' |
| 'CXX=mpicxx' |
| 'CC=mpicc' |
| 'F77=mpif77' |
| 'FC=mpif90' |
| 'PETSC_DIR=/opt/apps/ossw/libraries/petsc/petsc-3.3-p2' |
| 'PETSC_ARCH=intel-12.1-mkl-intel-10.3.12.361-mpich2-1.4.1p1-cxx-opt' |
| 'SLEPC_DIR=/opt/apps/ossw/libraries/slepc/slepc-3.3-p2-petsc-3.3-p2-cxx-opt' |
| 'TRILINOS_DIR=/opt/apps/ossw/libraries/trilinos/trilinos-10.12.2/sl6/intel-12.1/mpich2-1.4.1p1/mkl-intel-10.3.12.361'|
| 'VTK_DIR=/opt/apps/ossw/libraries/vtk/vtk-5.10.0/sl6/intel-12.1' |
----------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------
| Matrix Assembly Performance: Alive time=0.041035, Active time=0.027488 |
-----------------------------------------------------------------------------------------------------------
| Event nCalls Total Time Avg Time Total Time Avg Time % of Active Time |
| w/o Sub w/o Sub With Sub With Sub w/o S With S |
|-----------------------------------------------------------------------------------------------------------|
| |
| Fe 112 0.0011 0.000010 0.0011 0.000010 4.14 4.14 |
| Ke 112 0.0060 0.000054 0.0060 0.000054 21.87 21.87 |
| elem init 112 0.0196 0.000175 0.0196 0.000175 71.26 71.26 |
| matrix insertion 112 0.0008 0.000007 0.0008 0.000007 2.73 2.73 |
-----------------------------------------------------------------------------------------------------------
| Totals: 448 0.0275 100.00 |
-----------------------------------------------------------------------------------------------------------
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
/workspace/libmesh/examples/introduction/introduction_ex4/.libs/lt-example-devel on a intel-12. named hbar.ices.utexas.edu with 2 processors, by benkirk Tue Feb 5 13:38:03 2013
Using Petsc Release Version 3.3.0, Patch 2, Fri Jul 13 15:42:00 CDT 2012
Max Max/Min Avg Total
Time (sec): 2.088e-01 1.00000 2.088e-01
Objects: 6.200e+01 1.00000 6.200e+01
Flops: 4.532e+06 1.09418 4.337e+06 8.674e+06
Flops/sec: 2.170e+07 1.09418 2.077e+07 4.153e+07
MPI Messages: 4.150e+01 1.00000 4.150e+01 8.300e+01
MPI Message Lengths: 1.841e+04 1.00000 4.435e+02 3.681e+04
MPI Reductions: 1.120e+02 1.00000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flops
and VecAXPY() for complex vectors of length N --> 8N flops
Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts %Total Avg %Total counts %Total
0: Main Stage: 2.0880e-01 100.0% 8.6743e+06 100.0% 8.300e+01 100.0% 4.435e+02 100.0% 1.110e+02 99.1%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flops: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
Avg. len: average message length
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %f - percent flops in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flops --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %f %M %L %R %T %f %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
VecMDot 32 1.0 3.6168e-04 1.3 4.63e+05 1.1 0.0e+00 0.0e+00 3.2e+01 0 10 0 0 29 0 10 0 0 29 2484
VecNorm 35 1.0 1.3900e-04 1.2 3.46e+04 1.1 0.0e+00 0.0e+00 3.5e+01 0 1 0 0 31 0 1 0 0 32 484
VecScale 34 1.0 3.7670e-05 1.0 1.68e+04 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 867
VecCopy 3 1.0 4.0531e-06 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 42 1.0 2.3603e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 4 1.0 1.7405e-05 1.4 3.96e+03 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 442
VecMAXPY 34 1.0 2.6107e-04 1.1 4.95e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 11 0 0 0 0 11 0 0 0 3681
VecAssemblyBegin 3 1.0 7.4863e-05 1.0 0.00e+00 0.0 2.0e+00 2.9e+02 9.0e+00 0 0 2 2 8 0 0 2 2 8 0
VecAssemblyEnd 3 1.0 2.5272e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecScatterBegin 35 1.0 9.2268e-05 1.1 0.00e+00 0.0 7.0e+01 4.0e+02 0.0e+00 0 0 84 77 0 0 0 84 77 0 0
VecScatterEnd 35 1.0 4.1056e-0411.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecNormalize 34 1.0 2.3794e-04 1.0 5.05e+04 1.1 0.0e+00 0.0e+00 3.4e+01 0 1 0 0 30 0 1 0 0 31 412
MatMult 34 1.0 9.8443e-04 1.6 5.01e+05 1.1 6.8e+01 4.0e+02 0.0e+00 0 11 82 73 0 0 11 82 73 0 978
MatSolve 35 1.0 1.2975e-03 1.1 2.15e+06 1.1 0.0e+00 0.0e+00 0.0e+00 1 47 0 0 0 1 47 0 0 0 3175
MatLUFactorNum 1 1.0 1.1680e-03 1.2 8.68e+05 1.1 0.0e+00 0.0e+00 0.0e+00 1 19 0 0 0 1 19 0 0 0 1391
MatILUFactorSym 1 1.0 3.0892e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00 1 0 0 0 3 1 0 0 0 3 0
MatAssemblyBegin 2 1.0 1.2758e-03 8.5 0.00e+00 0.0 3.0e+00 2.3e+03 4.0e+00 0 0 4 19 4 0 0 4 19 4 0
MatAssemblyEnd 2 1.0 4.3607e-04 1.0 0.00e+00 0.0 4.0e+00 1.0e+02 8.0e+00 0 0 5 1 7 0 0 5 1 7 0
MatGetRowIJ 1 1.0 3.8147e-06 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetOrdering 1 1.0 7.7009e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 2 0 0 0 0 2 0
MatZeroEntries 3 1.0 3.8624e-05 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPGMRESOrthog 32 1.0 6.4540e-04 1.1 9.26e+05 1.1 0.0e+00 0.0e+00 3.2e+01 0 21 0 0 29 0 21 0 0 29 2786
KSPSetUp 2 1.0 1.0514e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSolve 1 1.0 8.5211e-03 1.0 4.53e+06 1.1 6.8e+01 4.0e+02 7.4e+01 4100 82 73 66 4100 82 73 67 1018
PCSetUp 2 1.0 4.8590e-03 1.1 8.68e+05 1.1 0.0e+00 0.0e+00 7.0e+00 2 19 0 0 6 2 19 0 0 6 334
PCSetUpOnBlocks 1 1.0 4.4730e-03 1.1 8.68e+05 1.1 0.0e+00 0.0e+00 5.0e+00 2 19 0 0 4 2 19 0 0 5 363
PCApply 35 1.0 1.6732e-03 1.0 2.15e+06 1.1 0.0e+00 0.0e+00 0.0e+00 1 47 0 0 0 1 47 0 0 0 2462
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Vector 43 43 215608 0
Vector Scatter 2 2 2072 0
Index Set 7 7 7700 0
IS L to G Mapping 1 1 564 0
Matrix 4 4 490648 0
Krylov Solver 2 2 19360 0
Preconditioner 2 2 1784 0
Viewer 1 0 0 0
========================================================================================================================
Average time to get PetscTime(): 9.53674e-08
Average time for MPI_Barrier(): 7.62939e-07
Average time for zero size MPI_Send(): 1.40667e-05
#PETSc Option Table entries:
-d 2
-ksp_right_pc
-log_summary
-n 15
-pc_type bjacobi
-sub_pc_factor_levels 4
-sub_pc_factor_zeropivot 0
-sub_pc_type ilu
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure run at: Thu Nov 8 11:21:02 2012
Configure options: --with-debugging=false --COPTFLAGS=-O3 --CXXOPTFLAGS=-O3 --FOPTFLAGS=-O3 --with-clanguage=C++ --with-shared-libraries=1 --with-mpi-dir=/opt/apps/ossw/libraries/mpich2/mpich2-1.4.1p1/sl6/intel-12.1 --with-mumps=true --download-mumps=1 --with-metis=true --download-metis=1 --with-parmetis=true --download-parmetis=1 --with-superlu=true --download-superlu=1 --with-superludir=true --download-superlu_dist=1 --with-blacs=true --download-blacs=1 --with-scalapack=true --download-scalapack=1 --with-hypre=true --download-hypre=1 --with-blas-lib="[/opt/apps/sysnet/intel/12.1/mkl/10.3.12.361/lib/intel64/libmkl_intel_lp64.so,/opt/apps/sysnet/intel/12.1/mkl/10.3.12.361/lib/intel64/libmkl_sequential.so,/opt/apps/sysnet/intel/12.1/mkl/10.3.12.361/lib/intel64/libmkl_core.so]" --with-lapack-lib="[/opt/apps/sysnet/intel/12.1/mkl/10.3.12.361/lib/intel64/libmkl_lapack95_lp64.a]"
-----------------------------------------
Libraries compiled on Thu Nov 8 11:21:02 2012 on daedalus.ices.utexas.edu
Machine characteristics: Linux-2.6.32-279.1.1.el6.x86_64-x86_64-with-redhat-6.3-Carbon
Using PETSc directory: /opt/apps/ossw/libraries/petsc/petsc-3.3-p2
Using PETSc arch: intel-12.1-mkl-intel-10.3.12.361-mpich2-1.4.1p1-cxx-opt
-----------------------------------------
Using C compiler: /opt/apps/ossw/libraries/mpich2/mpich2-1.4.1p1/sl6/intel-12.1/bin/mpicxx -wd1572 -O3 -fPIC ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: /opt/apps/ossw/libraries/mpich2/mpich2-1.4.1p1/sl6/intel-12.1/bin/mpif90 -fPIC -O3 ${FOPTFLAGS} ${FFLAGS}
-----------------------------------------
Using include paths: -I/opt/apps/ossw/libraries/petsc/petsc-3.3-p2/intel-12.1-mkl-intel-10.3.12.361-mpich2-1.4.1p1-cxx-opt/include -I/opt/apps/ossw/libraries/petsc/petsc-3.3-p2/include -I/opt/apps/ossw/libraries/petsc/petsc-3.3-p2/include -I/opt/apps/ossw/libraries/petsc/petsc-3.3-p2/intel-12.1-mkl-intel-10.3.12.361-mpich2-1.4.1p1-cxx-opt/include -I/opt/apps/ossw/libraries/mpich2/mpich2-1.4.1p1/sl6/intel-12.1/include
-----------------------------------------
Using C linker: /opt/apps/ossw/libraries/mpich2/mpich2-1.4.1p1/sl6/intel-12.1/bin/mpicxx
Using Fortran linker: /opt/apps/ossw/libraries/mpich2/mpich2-1.4.1p1/sl6/intel-12.1/bin/mpif90
Using libraries: -Wl,-rpath,/opt/apps/ossw/libraries/petsc/petsc-3.3-p2/intel-12.1-mkl-intel-10.3.12.361-mpich2-1.4.1p1-cxx-opt/lib -L/opt/apps/ossw/libraries/petsc/petsc-3.3-p2/intel-12.1-mkl-intel-10.3.12.361-mpich2-1.4.1p1-cxx-opt/lib -lpetsc -lX11 -Wl,-rpath,/opt/apps/ossw/libraries/petsc/petsc-3.3-p2/intel-12.1-mkl-intel-10.3.12.361-mpich2-1.4.1p1-cxx-opt/lib -L/opt/apps/ossw/libraries/petsc/petsc-3.3-p2/intel-12.1-mkl-intel-10.3.12.361-mpich2-1.4.1p1-cxx-opt/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lHYPRE -lpthread -lsuperlu_dist_3.0 -lparmetis -lmetis -lscalapack -lblacs -lsuperlu_4.3 -Wl,-rpath,/opt/apps/sysnet/intel/12.1/mkl/10.3.12.361/lib/intel64 -L/opt/apps/sysnet/intel/12.1/mkl/10.3.12.361/lib/intel64 -lmkl_lapack95_lp64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -Wl,-rpath,/opt/apps/ossw/libraries/mpich2/mpich2-1.4.1p1/sl6/intel-12.1/lib -L/opt/apps/ossw/libraries/mpich2/mpich2-1.4.1p1/sl6/intel-12.1/lib -Wl,-rpath,/opt/apps/sysnet/intel/12.1/composer_xe_2011_sp1.7.256/compiler/lib/intel64 -L/opt/apps/sysnet/intel/12.1/composer_xe_2011_sp1.7.256/compiler/lib/intel64 -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.4.6 -L/usr/lib/gcc/x86_64-redhat-linux/4.4.6 -lmpichf90 -lifport -lifcore -lm -lm -lmpichcxx -ldl -lmpich -lopa -lmpl -lrt -lpthread -limf -lsvml -lipgo -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lirc_s -ldl
-----------------------------------------
--------------------------------------------------------------------------------------------------------------------
| libMesh Performance: Alive time=0.220235, Active time=0.193837 |
--------------------------------------------------------------------------------------------------------------------
| Event nCalls Total Time Avg Time Total Time Avg Time % of Active Time |
| w/o Sub w/o Sub With Sub With Sub w/o S With S |
|--------------------------------------------------------------------------------------------------------------------|
| |
| |
| DofMap |
| add_neighbors_to_send_list() 1 0.0099 0.009928 0.0132 0.013166 5.12 6.79 |
| build_constraint_matrix_and_vector() 112 0.0013 0.000012 0.0013 0.000012 0.67 0.67 |
| build_sparsity() 1 0.0069 0.006925 0.0193 0.019285 3.57 9.95 |
| create_dof_constraints() 1 0.0078 0.007833 0.0397 0.039691 4.04 20.48 |
| distribute_dofs() 1 0.0126 0.012635 0.0302 0.030215 6.52 15.59 |
| dof_indices() 593 0.0601 0.000101 0.0601 0.000101 31.01 31.01 |
| hetero_cnstrn_elem_mat_vec() 112 0.0104 0.000093 0.0104 0.000093 5.38 5.38 |
| prepare_send_list() 1 0.0001 0.000074 0.0001 0.000074 0.04 0.04 |
| reinit() 1 0.0170 0.017032 0.0170 0.017032 8.79 8.79 |
| |
| EquationSystems |
| build_solution_vector() 1 0.0010 0.001044 0.0129 0.012891 0.54 6.65 |
| |
| ExodusII_IO |
| write_nodal_data() 1 0.0039 0.003932 0.0039 0.003932 2.03 2.03 |
| |
| FE |
| compute_shape_functions() 172 0.0057 0.000033 0.0057 0.000033 2.92 2.92 |
| init_shape_functions() 61 0.0004 0.000006 0.0004 0.000006 0.19 0.19 |
| inverse_map() 180 0.0028 0.000016 0.0028 0.000016 1.46 1.46 |
| |
| FEMap |
| compute_affine_map() 172 0.0024 0.000014 0.0024 0.000014 1.26 1.26 |
| compute_face_map() 60 0.0019 0.000032 0.0048 0.000079 0.98 2.45 |
| init_face_shape_functions() 60 0.0004 0.000006 0.0004 0.000006 0.20 0.20 |
| init_reference_to_physical_map() 61 0.0021 0.000035 0.0021 0.000035 1.11 1.11 |
| |
| Mesh |
| find_neighbors() 1 0.0049 0.004869 0.0049 0.004948 2.51 2.55 |
| renumber_nodes_and_elem() 2 0.0005 0.000238 0.0005 0.000238 0.25 0.25 |
| |
| MeshCommunication |
| compute_hilbert_indices() 2 0.0034 0.001722 0.0034 0.001722 1.78 1.78 |
| find_global_indices() 2 0.0016 0.000790 0.0064 0.003198 0.81 3.30 |
| parallel_sort() 2 0.0009 0.000438 0.0010 0.000499 0.45 0.52 |
| |
| MeshOutput |
| write_equation_systems() 1 0.0001 0.000118 0.0170 0.017011 0.06 8.78 |
| |
| MeshTools::Generation |
| build_cube() 1 0.0025 0.002506 0.0025 0.002506 1.29 1.29 |
| |
| MetisPartitioner |
| partition() 1 0.0047 0.004662 0.0075 0.007477 2.41 3.86 |
| |
| Parallel |
| allgather() 11 0.0004 0.000040 0.0005 0.000043 0.23 0.24 |
| max(bool) 1 0.0000 0.000005 0.0000 0.000005 0.00 0.00 |
| max(scalar) 113 0.0004 0.000003 0.0004 0.000003 0.20 0.20 |
| max(vector) 26 0.0002 0.000009 0.0005 0.000019 0.12 0.26 |
| min(bool) 131 0.0004 0.000003 0.0004 0.000003 0.22 0.22 |
| min(scalar) 107 0.0019 0.000018 0.0019 0.000018 0.97 0.97 |
| min(vector) 26 0.0003 0.000011 0.0006 0.000022 0.14 0.29 |
| probe() 12 0.0003 0.000022 0.0003 0.000022 0.13 0.13 |
| receive() 12 0.0001 0.000011 0.0004 0.000032 0.07 0.20 |
| send() 12 0.0001 0.000006 0.0001 0.000006 0.04 0.04 |
| send_receive() 16 0.0003 0.000017 0.0008 0.000050 0.14 0.41 |
| sum() 20 0.0002 0.000010 0.0006 0.000032 0.10 0.33 |
| |
| Parallel::Request |
| wait() 12 0.0001 0.000005 0.0001 0.000005 0.03 0.03 |
| |
| Partitioner |
| set_node_processor_ids() 1 0.0010 0.001015 0.0011 0.001117 0.52 0.58 |
| set_parent_processor_ids() 1 0.0004 0.000415 0.0004 0.000415 0.21 0.21 |
| |
| PetscLinearSolver |
| solve() 1 0.0112 0.011195 0.0112 0.011195 5.78 5.78 |
| |
| System |
| assemble() 1 0.0111 0.011080 0.0413 0.041279 5.72 21.30 |
--------------------------------------------------------------------------------------------------------------------
| Totals: 2106 0.1938 100.00 |
--------------------------------------------------------------------------------------------------------------------
***************************************************************
* Done Running Example introduction_ex4:
* mpirun -np 2 example-devel -d 2 -n 15 -pc_type bjacobi -sub_pc_type ilu -sub_pc_factor_levels 4 -sub_pc_factor_zeropivot 0 -ksp_right_pc -log_summary
***************************************************************
***************************************************************
* Running Example introduction_ex4:
* mpirun -np 2 example-devel -d 3 -n 6 -pc_type bjacobi -sub_pc_type ilu -sub_pc_factor_levels 4 -sub_pc_factor_zeropivot 0 -ksp_right_pc -log_summary
***************************************************************
Running /workspace/libmesh/examples/introduction/introduction_ex4/.libs/lt-example-devel -d 3 -n 6 -pc_type bjacobi -sub_pc_type ilu -sub_pc_factor_levels 4 -sub_pc_factor_zeropivot 0 -ksp_right_pc -log_summary
Mesh Information:
mesh_dimension()=3
spatial_dimension()=3
n_nodes()=2197
n_local_nodes()=1209
n_elem()=216
n_local_elem()=108
n_active_elem()=216
n_subdomains()=1
n_partitions()=2
n_processors()=2
n_threads()=1
processor_id()=0
EquationSystems
n_systems()=1
System #0, "Poisson"
Type "LinearImplicit"
Variables="u"
Finite Element Types="LAGRANGE", "JACOBI_20_00"
Infinite Element Mapping="CARTESIAN"
Approximation Orders="SECOND", "THIRD"
n_dofs()=2197
n_local_dofs()=1209
n_constrained_dofs()=866
n_local_constrained_dofs()=461
n_vectors()=1
n_matrices()=1
DofMap Sparsity
Average On-Processor Bandwidth <= 49.5576
Average Off-Processor Bandwidth <= 5.44197
Maximum On-Processor Bandwidth <= 130
Maximum Off-Processor Bandwidth <= 80
DofMap Constraints
Number of DoF Constraints = 866
Number of Heterogenous Constraints= 818
Average DoF Constraint Length= 0
Number of Node Constraints = 0
Mesh Information:
mesh_dimension()=3
spatial_dimension()=3
n_nodes()=2197
n_local_nodes()=1209
n_elem()=216
n_local_elem()=108
n_active_elem()=216
n_subdomains()=1
n_partitions()=2
n_processors()=2
n_threads()=1
processor_id()=0
----------------------------------------------------------------------------------------------------------------------
| Processor id: 0 |
| Num Processors: 2 |
| Time: Tue Feb 5 13:38:04 2013 |
| OS: Linux |
| HostName: hbar.ices.utexas.edu |
| OS Release: 2.6.32-279.1.1.el6.x86_64 |
| OS Version: #1 SMP Tue Jul 10 11:24:23 CDT 2012 |
| Machine: x86_64 |
| Username: benkirk |
| Configuration: ./configure '--enable-everything' |
| '--prefix=/workspace/libmesh/install' |
| 'CXX=mpicxx' |
| 'CC=mpicc' |
| 'F77=mpif77' |
| 'FC=mpif90' |
| 'PETSC_DIR=/opt/apps/ossw/libraries/petsc/petsc-3.3-p2' |
| 'PETSC_ARCH=intel-12.1-mkl-intel-10.3.12.361-mpich2-1.4.1p1-cxx-opt' |
| 'SLEPC_DIR=/opt/apps/ossw/libraries/slepc/slepc-3.3-p2-petsc-3.3-p2-cxx-opt' |
| 'TRILINOS_DIR=/opt/apps/ossw/libraries/trilinos/trilinos-10.12.2/sl6/intel-12.1/mpich2-1.4.1p1/mkl-intel-10.3.12.361'|
| 'VTK_DIR=/opt/apps/ossw/libraries/vtk/vtk-5.10.0/sl6/intel-12.1' |
----------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------
| Matrix Assembly Performance: Alive time=0.167865, Active time=0.152714 |
-----------------------------------------------------------------------------------------------------------
| Event nCalls Total Time Avg Time Total Time Avg Time % of Active Time |
| w/o Sub w/o Sub With Sub With Sub w/o S With S |
|-----------------------------------------------------------------------------------------------------------|
| |
| Fe 108 0.0033 0.000030 0.0033 0.000030 2.15 2.15 |
| Ke 108 0.0940 0.000870 0.0940 0.000870 61.54 61.54 |
| elem init 108 0.0527 0.000488 0.0527 0.000488 34.54 34.54 |
| matrix insertion 108 0.0027 0.000025 0.0027 0.000025 1.77 1.77 |
-----------------------------------------------------------------------------------------------------------
| Totals: 432 0.1527 100.00 |
-----------------------------------------------------------------------------------------------------------
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
/workspace/libmesh/examples/introduction/introduction_ex4/.libs/lt-example-devel on a intel-12. named hbar.ices.utexas.edu with 2 processors, by benkirk Tue Feb 5 13:38:04 2013
Using Petsc Release Version 3.3.0, Patch 2, Fri Jul 13 15:42:00 CDT 2012
Max Max/Min Avg Total
Time (sec): 9.346e-01 1.00002 9.346e-01
Objects: 5.300e+01 1.00000 5.300e+01
Flops: 6.867e+07 1.77224 5.371e+07 1.074e+08
Flops/sec: 7.347e+07 1.77220 5.746e+07 1.149e+08
MPI Messages: 2.950e+01 1.00000 2.950e+01 5.900e+01
MPI Message Lengths: 1.622e+05 1.00000 5.499e+03 3.244e+05
MPI Reductions: 8.900e+01 1.00000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flops
and VecAXPY() for complex vectors of length N --> 8N flops
Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts %Total Avg %Total counts %Total
0: Main Stage: 9.3457e-01 100.0% 1.0741e+08 100.0% 5.900e+01 100.0% 5.499e+03 100.0% 8.800e+01 98.9%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flops: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
Avg. len: average message length
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %f - percent flops in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flops --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %f %M %L %R %T %f %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
VecMDot 21 1.0 2.7077e-0312.8 5.58e+05 1.2 0.0e+00 0.0e+00 2.1e+01 0 1 0 0 24 0 1 0 0 24 375
VecNorm 23 1.0 1.9979e-04 2.7 5.56e+04 1.2 0.0e+00 0.0e+00 2.3e+01 0 0 0 0 26 0 0 0 0 26 506
VecScale 22 1.0 2.9325e-05 1.0 2.66e+04 1.2 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1648
VecCopy 2 1.0 5.0068e-06 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 29 1.0 2.5034e-05 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 2 1.0 1.9073e-05 1.2 4.84e+03 1.2 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 461
VecMAXPY 22 1.0 1.6260e-04 1.2 6.09e+05 1.2 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 6810
VecAssemblyBegin 3 1.0 6.3419e-05 1.1 0.00e+00 0.0 2.0e+00 2.6e+03 9.0e+00 0 0 3 2 10 0 0 3 2 10 0
VecAssemblyEnd 3 1.0 1.8835e-05 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecScatterBegin 23 1.0 1.0157e-04 1.2 0.00e+00 0.0 4.6e+01 2.7e+03 0.0e+00 0 0 78 39 0 0 0 78 39 0 0
VecScatterEnd 23 1.0 6.4562e-021265.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 3 0 0 0 0 0
VecNormalize 22 1.0 1.8382e-04 1.1 7.98e+04 1.2 0.0e+00 0.0e+00 2.2e+01 0 0 0 0 25 0 0 0 0 25 789
MatMult 22 1.0 6.5696e-0246.7 2.91e+06 1.3 4.4e+01 2.7e+03 0.0e+00 4 5 75 36 0 4 5 75 36 0 78
MatSolve 23 1.0 7.3020e-03 1.5 2.08e+07 1.5 0.0e+00 0.0e+00 0.0e+00 1 32 0 0 0 1 32 0 0 0 4734
MatLUFactorNum 1 1.0 2.4795e-02 1.9 4.37e+07 2.0 0.0e+00 0.0e+00 0.0e+00 2 61 0 0 0 2 61 0 0 0 2639
MatILUFactorSym 1 1.0 1.2258e-01 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00 10 0 0 0 3 10 0 0 0 3 0
MatAssemblyBegin 2 1.0 1.1172e-03 7.2 0.00e+00 0.0 3.0e+00 6.2e+04 4.0e+00 0 0 5 58 4 0 0 5 58 5 0
MatAssemblyEnd 2 1.0 1.2221e-03 1.1 0.00e+00 0.0 4.0e+00 6.6e+02 8.0e+00 0 0 7 1 9 0 0 7 1 9 0
MatGetRowIJ 1 1.0 9.5367e-07 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetOrdering 1 1.0 5.5075e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 2 0 0 0 0 2 0
MatZeroEntries 3 1.0 1.9026e-04 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPGMRESOrthog 21 1.0 2.8841e-03 7.1 1.12e+06 1.2 0.0e+00 0.0e+00 2.1e+01 0 2 0 0 24 0 2 0 0 24 704
KSPSetUp 2 1.0 6.7949e-05 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSolve 1 1.0 1.5778e-01 1.0 6.87e+07 1.8 4.4e+01 2.7e+03 5.1e+01 17100 75 36 57 17100 75 36 58 681
PCSetUp 2 1.0 1.4779e-01 1.8 4.37e+07 2.0 0.0e+00 0.0e+00 7.0e+00 12 61 0 0 8 12 61 0 0 8 443
PCSetUpOnBlocks 1 1.0 1.4753e-01 1.8 4.37e+07 2.0 0.0e+00 0.0e+00 5.0e+00 12 61 0 0 6 12 61 0 0 6 444
PCApply 23 1.0 7.5138e-03 1.4 2.08e+07 1.5 0.0e+00 0.0e+00 0.0e+00 1 32 0 0 0 1 32 0 0 0 4601
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Vector 34 34 338168 0
Vector Scatter 2 2 2072 0
Index Set 7 7 13564 0
IS L to G Mapping 1 1 564 0
Matrix 4 4 6329384 0
Krylov Solver 2 2 19360 0
Preconditioner 2 2 1784 0
Viewer 1 0 0 0
========================================================================================================================
Average time to get PetscTime(): 9.53674e-08
Average time for MPI_Barrier(): 1.19209e-06
Average time for zero size MPI_Send(): 1.15633e-05
#PETSc Option Table entries:
-d 3
-ksp_right_pc
-log_summary
-n 6
-pc_type bjacobi
-sub_pc_factor_levels 4
-sub_pc_factor_zeropivot 0
-sub_pc_type ilu
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure run at: Thu Nov 8 11:21:02 2012
Configure options: --with-debugging=false --COPTFLAGS=-O3 --CXXOPTFLAGS=-O3 --FOPTFLAGS=-O3 --with-clanguage=C++ --with-shared-libraries=1 --with-mpi-dir=/opt/apps/ossw/libraries/mpich2/mpich2-1.4.1p1/sl6/intel-12.1 --with-mumps=true --download-mumps=1 --with-metis=true --download-metis=1 --with-parmetis=true --download-parmetis=1 --with-superlu=true --download-superlu=1 --with-superludir=true --download-superlu_dist=1 --with-blacs=true --download-blacs=1 --with-scalapack=true --download-scalapack=1 --with-hypre=true --download-hypre=1 --with-blas-lib="[/opt/apps/sysnet/intel/12.1/mkl/10.3.12.361/lib/intel64/libmkl_intel_lp64.so,/opt/apps/sysnet/intel/12.1/mkl/10.3.12.361/lib/intel64/libmkl_sequential.so,/opt/apps/sysnet/intel/12.1/mkl/10.3.12.361/lib/intel64/libmkl_core.so]" --with-lapack-lib="[/opt/apps/sysnet/intel/12.1/mkl/10.3.12.361/lib/intel64/libmkl_lapack95_lp64.a]"
-----------------------------------------
Libraries compiled on Thu Nov 8 11:21:02 2012 on daedalus.ices.utexas.edu
Machine characteristics: Linux-2.6.32-279.1.1.el6.x86_64-x86_64-with-redhat-6.3-Carbon
Using PETSc directory: /opt/apps/ossw/libraries/petsc/petsc-3.3-p2
Using PETSc arch: intel-12.1-mkl-intel-10.3.12.361-mpich2-1.4.1p1-cxx-opt
-----------------------------------------
Using C compiler: /opt/apps/ossw/libraries/mpich2/mpich2-1.4.1p1/sl6/intel-12.1/bin/mpicxx -wd1572 -O3 -fPIC ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: /opt/apps/ossw/libraries/mpich2/mpich2-1.4.1p1/sl6/intel-12.1/bin/mpif90 -fPIC -O3 ${FOPTFLAGS} ${FFLAGS}
-----------------------------------------
Using include paths: -I/opt/apps/ossw/libraries/petsc/petsc-3.3-p2/intel-12.1-mkl-intel-10.3.12.361-mpich2-1.4.1p1-cxx-opt/include -I/opt/apps/ossw/libraries/petsc/petsc-3.3-p2/include -I/opt/apps/ossw/libraries/petsc/petsc-3.3-p2/include -I/opt/apps/ossw/libraries/petsc/petsc-3.3-p2/intel-12.1-mkl-intel-10.3.12.361-mpich2-1.4.1p1-cxx-opt/include -I/opt/apps/ossw/libraries/mpich2/mpich2-1.4.1p1/sl6/intel-12.1/include
-----------------------------------------
Using C linker: /opt/apps/ossw/libraries/mpich2/mpich2-1.4.1p1/sl6/intel-12.1/bin/mpicxx
Using Fortran linker: /opt/apps/ossw/libraries/mpich2/mpich2-1.4.1p1/sl6/intel-12.1/bin/mpif90
Using libraries: -Wl,-rpath,/opt/apps/ossw/libraries/petsc/petsc-3.3-p2/intel-12.1-mkl-intel-10.3.12.361-mpich2-1.4.1p1-cxx-opt/lib -L/opt/apps/ossw/libraries/petsc/petsc-3.3-p2/intel-12.1-mkl-intel-10.3.12.361-mpich2-1.4.1p1-cxx-opt/lib -lpetsc -lX11 -Wl,-rpath,/opt/apps/ossw/libraries/petsc/petsc-3.3-p2/intel-12.1-mkl-intel-10.3.12.361-mpich2-1.4.1p1-cxx-opt/lib -L/opt/apps/ossw/libraries/petsc/petsc-3.3-p2/intel-12.1-mkl-intel-10.3.12.361-mpich2-1.4.1p1-cxx-opt/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lHYPRE -lpthread -lsuperlu_dist_3.0 -lparmetis -lmetis -lscalapack -lblacs -lsuperlu_4.3 -Wl,-rpath,/opt/apps/sysnet/intel/12.1/mkl/10.3.12.361/lib/intel64 -L/opt/apps/sysnet/intel/12.1/mkl/10.3.12.361/lib/intel64 -lmkl_lapack95_lp64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -Wl,-rpath,/opt/apps/ossw/libraries/mpich2/mpich2-1.4.1p1/sl6/intel-12.1/lib -L/opt/apps/ossw/libraries/mpich2/mpich2-1.4.1p1/sl6/intel-12.1/lib -Wl,-rpath,/opt/apps/sysnet/intel/12.1/composer_xe_2011_sp1.7.256/compiler/lib/intel64 -L/opt/apps/sysnet/intel/12.1/composer_xe_2011_sp1.7.256/compiler/lib/intel64 -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.4.6 -L/usr/lib/gcc/x86_64-redhat-linux/4.4.6 -lmpichf90 -lifport -lifcore -lm -lm -lmpichcxx -ldl -lmpich -lopa -lmpl -lrt -lpthread -limf -lsvml -lipgo -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lirc_s -ldl
-----------------------------------------
--------------------------------------------------------------------------------------------------------------------
| libMesh Performance: Alive time=0.943603, Active time=0.918806 |
--------------------------------------------------------------------------------------------------------------------
| Event nCalls Total Time Avg Time Total Time Avg Time % of Active Time |
| w/o Sub w/o Sub With Sub With Sub w/o S With S |
|--------------------------------------------------------------------------------------------------------------------|
| |
| |
| DofMap |
| add_neighbors_to_send_list() 1 0.0170 0.017004 0.0340 0.033987 1.85 3.70 |
| build_constraint_matrix_and_vector() 108 0.0039 0.000036 0.0039 0.000036 0.43 0.43 |
| build_sparsity() 1 0.0281 0.028108 0.0503 0.050349 3.06 5.48 |
| create_dof_constraints() 1 0.0444 0.044394 0.4238 0.423842 4.83 46.13 |
| distribute_dofs() 1 0.0161 0.016061 0.0376 0.037558 1.75 4.09 |
| dof_indices() 636 0.1127 0.000177 0.1127 0.000177 12.27 12.27 |
| hetero_cnstrn_elem_mat_vec() 108 0.0098 0.000091 0.0098 0.000091 1.07 1.07 |
| prepare_send_list() 1 0.0005 0.000458 0.0005 0.000458 0.05 0.05 |
| reinit() 1 0.0210 0.021044 0.0210 0.021044 2.29 2.29 |
| |
| EquationSystems |
| build_solution_vector() 1 0.0010 0.000968 0.0199 0.019917 0.11 2.17 |
| |
| ExodusII_IO |
| write_nodal_data() 1 0.0039 0.003933 0.0039 0.003933 0.43 0.43 |
| |
| FE |
| compute_shape_functions() 1116 0.0495 0.000044 0.0495 0.000044 5.39 5.39 |
| init_shape_functions() 1009 0.0045 0.000005 0.0045 0.000005 0.49 0.49 |
| inverse_map() 2376 0.0818 0.000034 0.0818 0.000034 8.91 8.91 |
| |
| FEMap |
| compute_affine_map() 1116 0.0225 0.000020 0.0225 0.000020 2.44 2.44 |
| compute_edge_map() 792 0.0051 0.000006 0.0051 0.000006 0.55 0.55 |
| compute_face_map() 216 0.0063 0.000029 0.0063 0.000029 0.69 0.69 |
| init_edge_shape_functions() 792 0.0032 0.000004 0.0032 0.000004 0.34 0.34 |
| init_face_shape_functions() 216 0.0121 0.000056 0.0121 0.000056 1.32 1.32 |
| init_reference_to_physical_map() 1009 0.1866 0.000185 0.1866 0.000185 20.31 20.31 |
| |
| Mesh |
| find_neighbors() 1 0.0041 0.004054 0.0042 0.004216 0.44 0.46 |
| renumber_nodes_and_elem() 2 0.0007 0.000360 0.0007 0.000360 0.08 0.08 |
| |
| MeshCommunication |
| compute_hilbert_indices() 2 0.0020 0.001005 0.0020 0.001005 0.22 0.22 |
| find_global_indices() 2 0.0009 0.000451 0.0039 0.001961 0.10 0.43 |
| parallel_sort() 2 0.0006 0.000324 0.0007 0.000361 0.07 0.08 |
| |
| MeshOutput |
| write_equation_systems() 1 0.0001 0.000120 0.0240 0.024021 0.01 2.61 |
| |
| MeshTools::Generation |
| build_cube() 1 0.0033 0.003265 0.0033 0.003265 0.36 0.36 |
| |
| MetisPartitioner |
| partition() 1 0.0045 0.004485 0.0061 0.006132 0.49 0.67 |
| |
| Parallel |
| allgather() 11 0.0002 0.000018 0.0002 0.000019 0.02 0.02 |
| max(bool) 1 0.0000 0.000004 0.0000 0.000004 0.00 0.00 |
| max(scalar) 113 0.0003 0.000002 0.0003 0.000002 0.03 0.03 |
| max(vector) 26 0.0002 0.000006 0.0003 0.000013 0.02 0.04 |
| min(bool) 131 0.0003 0.000002 0.0003 0.000002 0.03 0.03 |
| min(scalar) 107 0.0066 0.000061 0.0066 0.000061 0.71 0.71 |
| min(vector) 26 0.0002 0.000008 0.0004 0.000016 0.02 0.04 |
| probe() 12 0.0005 0.000043 0.0005 0.000043 0.06 0.06 |
| receive() 12 0.0001 0.000010 0.0006 0.000053 0.01 0.07 |
| send() 12 0.0001 0.000005 0.0001 0.000005 0.01 0.01 |
| send_receive() 16 0.0003 0.000020 0.0011 0.000066 0.04 0.12 |
| sum() 20 0.0002 0.000008 0.0002 0.000011 0.02 0.02 |
| |
| Parallel::Request |
| wait() 12 0.0000 0.000003 0.0000 0.000003 0.00 0.00 |
| |
| Partitioner |
| set_node_processor_ids() 1 0.0011 0.001069 0.0012 0.001162 0.12 0.13 |
| set_parent_processor_ids() 1 0.0002 0.000190 0.0002 0.000190 0.02 0.02 |
| |
| PetscLinearSolver |
| solve() 1 0.1598 0.159785 0.1598 0.159785 17.39 17.39 |
| |
| System |
| assemble() 1 0.1026 0.102602 0.1680 0.168042 11.17 18.29 |
--------------------------------------------------------------------------------------------------------------------
| Totals: 10017 0.9188 100.00 |
--------------------------------------------------------------------------------------------------------------------
***************************************************************
* Done Running Example introduction_ex4:
* mpirun -np 2 example-devel -d 3 -n 6 -pc_type bjacobi -sub_pc_type ilu -sub_pc_factor_levels 4 -sub_pc_factor_zeropivot 0 -ksp_right_pc -log_summary
***************************************************************
This library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2.1 of the License, or (at your option) any later version.
This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public License along with this library; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
C++ Includes