Dear Vikram,
I switched to the PatchRecoveryErrorEstimator.
The AMR simulations are faster than before, but still much slower than the uniform mesh case.
Most of the time is still spent in the projections.
Let me know if you have any suggestions.
Thanks a lot for your help,
All the best,
Simone
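As a rough cross-check of the two claims above (projections dominate, and AMR is still slower than the uniform mesh), the Python sketch below recomputes the relevant ratios from numbers copied out of the logs in this thread. The dictionary layout and the function names are illustrative assumptions, not part of libMesh, and the baseline is assumed to be the NO AMR log in the quoted message further down.

```python
# Values in seconds, copied from the three PatchRecoveryErrorEstimator logs below.
# Each event entry is the "Total Time With Sub" column of that row.
RUNS = {
    "AMR 1 refinement": {
        "active": 153.727,
        "project_vector(old,new)": 112.7034,
        "estimate_error()": 61.0740,
    },
    "AMR 2 refinements": {
        "active": 56.8324,
        "project_vector(old,new)": 44.6599,
        "estimate_error()": 20.4969,
    },
    "AMR 3 refinements": {
        "active": 167.585,
        "project_vector(old,new)": 157.5095,
        "estimate_error()": 35.4803,
    },
}

# Active time of the uniform-mesh (NO AMR) run from the quoted message.
# Assumption: this is the baseline the comparison refers to.
NO_AMR_ACTIVE = 40.2976


def percent_of_active(run, event):
    """Share of the run's active time attributed to `event`, sub-events included."""
    return 100.0 * run[event] / run["active"]


def slowdown(run, baseline=NO_AMR_ACTIVE):
    """Ratio of the run's active time to the baseline active time."""
    return run["active"] / baseline


for name, run in RUNS.items():
    proj = percent_of_active(run, "project_vector(old,new)")
    est = percent_of_active(run, "estimate_error()")
    print(f"{name}: projection {proj:.2f}%, error estimate {est:.2f}%, "
          f"{slowdown(run):.2f}x the NO AMR active time")
```

Under these assumptions the projection share comes out at roughly 73%, 79%, and 94% of active time, and the three runs take roughly 3.8, 1.4, and 4.2 times the active time of the uniform run. The "With Sub" figures include time spent in nested events, so the shares of different events can overlap and need not sum to 100%.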
AMR 1 refinement
-----------------------------------------------------------------------------------------------------------------
| libMesh Performance: Alive time=230.416, Active time=153.727 |
-----------------------------------------------------------------------------------------------------------------
| Event nCalls Total Time Avg Time Total Time Avg Time % of Active Time |
| w/o Sub w/o Sub With Sub With Sub w/o S With S |
|-----------------------------------------------------------------------------------------------------------------|
| |
| |
| DefaultCoupling |
| operator() 1308971 2.0650 0.000002 2.0650 0.000002 1.34 1.34 |
| |
| DofMap |
| add_neighbors_to_send_list() 102 1.2799 0.012548 4.8744 0.047788 0.83 3.17 |
| build_sparsity() 102 6.6201 0.064903 15.3341 0.150334 4.31 9.97 |
| create_dof_constraints() 102 0.1584 0.001553 0.2895 0.002838 0.10 0.19 |
| distribute_dofs() 102 0.1486 0.001457 5.6969 0.055852 0.10 3.71 |
| dof_indices() 19634123 17.7785 0.000001 17.7785 0.000001 11.56 11.56 |
| enforce_constraints_exactly() 303 0.1809 0.000597 0.1809 0.000597 0.12 0.12 |
| old_dof_indices() 11672412 11.0857 0.000001 11.0857 0.000001 7.21 7.21 |
| prepare_send_list() 103 0.0001 0.000001 0.0001 0.000001 0.00 0.00 |
| reinit() 102 0.6735 0.006603 0.6735 0.006603 0.44 0.44 |
| |
| EquationSystems |
| build_parallel_solution_vector() 5 0.1823 0.036460 0.3190 0.063797 0.12 0.21 |
| build_solution_vector() 5 0.0001 0.000020 0.3191 0.063818 0.00 0.21 |
| |
| ExodusII_IO |
| write_nodal_data() 3 0.0096 0.003193 0.0096 0.003193 0.01 0.01 |
| |
| FE |
| compute_shape_functions() 7892622 7.8140 0.000001 7.8140 0.000001 5.08 5.08 |
| init_shape_functions() 238324 1.0645 0.000004 1.0645 0.000004 0.69 0.69 |
| inverse_map() 168619 0.2099 0.000001 0.2099 0.000001 0.14 0.14 |
| |
| FEMap |
| compute_affine_map() 7892622 8.1976 0.000001 8.1976 0.000001 5.33 5.33 |
| init_reference_to_physical_map() 238324 0.6433 0.000003 0.6433 0.000003 0.42 0.42 |
| |
| GMVIO |
| write_nodal_data() 2 0.1544 0.077185 0.1544 0.077185 0.10 0.10 |
| |
| GenericProjector |
| copy_dofs 3813348 15.6784 0.000004 59.6722 0.000016 10.20 38.82 |
| operator() 304 11.9653 0.039359 98.8752 0.325247 7.78 64.32 |
| project_edges 88377 0.0687 0.000001 0.0687 0.000001 0.04 0.04 |
| project_interior 88377 0.0678 0.000001 0.0678 0.000001 0.04 0.04 |
| project_nodes 88377 0.3843 0.000004 5.5123 0.000062 0.25 3.59 |
| project_sides 88377 0.0691 0.000001 0.0691 0.000001 0.04 0.04 |
| |
| Mesh |
| contract() 101 0.0302 0.000299 0.0582 0.000576 0.02 0.04 |
| find_neighbors() 102 1.4291 0.014010 1.4291 0.014010 0.93 0.93 |
| renumber_nodes_and_elem() 305 0.0836 0.000274 0.0836 0.000274 0.05 0.05 |
| |
| MeshOutput |
| write_equation_systems() 5 0.0001 0.000015 0.4831 0.096625 0.00 0.31 |
| |
| MeshRefinement |
| _coarsen_elements() 202 0.0808 0.000400 0.0808 0.000400 0.05 0.05 |
| _refine_elements() 202 0.1724 0.000854 0.3641 0.001803 0.11 0.24 |
| add_node() 90496 0.0793 0.000001 0.0793 0.000001 0.05 0.05 |
| make_coarsening_compatible() 270 0.3739 0.001385 0.3739 0.001385 0.24 0.24 |
| make_flags_parallel_consistent() 303 0.2229 0.000736 0.2229 0.000736 0.14 0.14 |
| make_refinement_compatible() 270 0.0388 0.000144 0.0388 0.000144 0.03 0.03 |
| |
| MeshTools::Generation |
| build_cube() 1 0.0045 0.004484 0.0045 0.004484 0.00 0.00 |
| |
| OldSolutionValue |
| Number eval_at_node() 304356 0.3435 0.000001 4.7509 0.000016 0.22 3.09 |
| check_old_context(c) 3813348 11.2585 0.000003 27.9372 0.000007 7.32 18.17 |
| check_old_context(c,p) 103944 0.2723 0.000003 0.6305 0.000006 0.18 0.41 |
| eval_at_point() 103944 1.4186 0.000014 4.2789 0.000041 0.92 2.78 |
| eval_old_dofs() 3813348 6.6561 0.000002 39.2936 0.000010 4.33 25.56 |
| |
| Parallel |
| allgather() 102 0.0001 0.000001 0.0001 0.000001 0.00 0.00 |
| |
| Partitioner |
| single_partition() 102 0.0321 0.000314 0.0321 0.000314 0.02 0.02 |
| |
| PatchRecoveryErrorEstimator |
| estimate_error() 101 25.9830 0.257257 61.0740 0.604693 16.90 39.73 |
| |
| PetscLinearSolver |
| solve() 202 1.6808 0.008321 1.6808 0.008321 1.09 1.09 |
| |
| StatisticsVector |
| maximum() 101 0.0018 0.000017 0.0018 0.000017 0.00 0.00 |
| |
| System |
| assemble() 202 11.6496 0.057671 29.2803 0.144952 7.58 19.05 |
| project_fem_vector() 1 0.0004 0.000423 0.2601 0.260093 0.00 0.17 |
| project_vector(FunctionBase) 1 0.0000 0.000009 0.2601 0.260103 0.00 0.17 |
| project_vector(old,new) 303 5.3281 0.017585 112.7034 0.371958 3.47 73.31 |
| |
| TopologyMap |
| init() 202 0.0871 0.000431 0.0871 0.000431 0.06 0.06 |
-----------------------------------------------------------------------------------------------------------------
| Totals: 6.145e+07 153.7275 100.00 |
-----------------------------------------------------------------------------------------------------------------
AMR 2 refinements
-----------------------------------------------------------------------------------------------------------------
| libMesh Performance: Alive time=84.948, Active time=56.8324 |
-----------------------------------------------------------------------------------------------------------------
| Event nCalls Total Time Avg Time Total Time Avg Time % of Active Time |
| w/o Sub w/o Sub With Sub With Sub w/o S With S |
|-----------------------------------------------------------------------------------------------------------------|
| |
| |
| DefaultCoupling |
| operator() 476364 0.7400 0.000002 0.7400 0.000002 1.30 1.30 |
| |
| DofMap |
| add_neighbors_to_send_list() 203 0.4752 0.002341 1.7849 0.008793 0.84 3.14 |
| build_sparsity() 203 2.8012 0.013799 6.0101 0.029606 4.93 10.58 |
| create_dof_constraints() 203 0.2178 0.001073 0.4982 0.002454 0.38 0.88 |
| distribute_dofs() 203 0.0647 0.000319 2.1029 0.010359 0.11 3.70 |
| dof_indices() 6846785 6.2077 0.000001 6.2077 0.000001 10.92 10.92 |
| enforce_constraints_exactly() 606 0.3776 0.000623 0.3776 0.000623 0.66 0.66 |
| old_dof_indices() 4266276 4.0893 0.000001 4.0893 0.000001 7.20 7.20 |
| prepare_send_list() 204 0.0002 0.000001 0.0002 0.000001 0.00 0.00 |
| reinit() 203 0.2524 0.001243 0.2524 0.001243 0.44 0.44 |
| |
| EquationSystems |
| build_parallel_solution_vector() 5 0.0295 0.005894 0.0509 0.010187 0.05 0.09 |
| build_solution_vector() 5 0.0001 0.000014 0.0510 0.010202 0.00 0.09 |
| |
| ExodusII_IO |
| write_nodal_data() 3 0.0025 0.000837 0.0025 0.000837 0.00 0.00 |
| |
| FE |
| compute_shape_functions() 2537219 2.4867 0.000001 2.4867 0.000001 4.38 4.38 |
| init_shape_functions() 132687 0.6792 0.000005 0.6792 0.000005 1.20 1.20 |
| inverse_map() 149560 0.1817 0.000001 0.1817 0.000001 0.32 0.32 |
| |
| FEMap |
| compute_affine_map() 2537219 2.6492 0.000001 2.6492 0.000001 4.66 4.66 |
| init_reference_to_physical_map() 132687 0.3793 0.000003 0.3793 0.000003 0.67 0.67 |
| |
| GMVIO |
| write_nodal_data() 2 0.0492 0.024577 0.0492 0.024577 0.09 0.09 |
| |
| GenericProjector |
| copy_dofs 1341762 5.4628 0.000004 20.9821 0.000016 9.61 36.92 |
| operator() 607 4.6860 0.007720 38.8120 0.063941 8.25 68.29 |
| project_edges 83040 0.0645 0.000001 0.0645 0.000001 0.11 0.11 |
| project_interior 83040 0.0644 0.000001 0.0644 0.000001 0.11 0.11 |
| project_nodes 83040 0.3899 0.000005 4.8299 0.000058 0.69 8.50 |
| project_sides 83040 0.0655 0.000001 0.0655 0.000001 0.12 0.12 |
| |
| Mesh |
| contract() 202 0.0157 0.000078 0.0286 0.000142 0.03 0.05 |
| find_neighbors() 203 0.5877 0.002895 0.5877 0.002895 1.03 1.03 |
| renumber_nodes_and_elem() 608 0.0372 0.000061 0.0372 0.000061 0.07 0.07 |
| |
| MeshOutput |
| write_equation_systems() 5 0.0001 0.000014 0.1028 0.020552 0.00 0.18 |
| |
| MeshRefinement |
| _coarsen_elements() 404 0.0362 0.000090 0.0362 0.000090 0.06 0.06 |
| _refine_elements() 404 0.1353 0.000335 0.3436 0.000851 0.24 0.60 |
| add_node() 96928 0.0864 0.000001 0.0864 0.000001 0.15 0.15 |
| make_coarsening_compatible() 494 0.2403 0.000486 0.2403 0.000486 0.42 0.42 |
| make_flags_parallel_consistent() 606 0.0939 0.000155 0.0939 0.000155 0.17 0.17 |
| make_refinement_compatible() 494 0.0151 0.000031 0.0151 0.000031 0.03 0.03 |
| |
| MeshTools::Generation |
| build_cube() 1 0.0007 0.000688 0.0007 0.000688 0.00 0.00 |
| |
| OldSolutionValue |
| Number eval_at_node() 326016 0.3474 0.000001 4.0303 0.000012 0.61 7.09 |
| check_old_context(c) 1341762 3.8804 0.000003 9.8097 0.000007 6.83 17.26 |
| check_old_context(c,p) 88194 0.2335 0.000003 0.5516 0.000006 0.41 0.97 |
| eval_at_point() 88194 1.1315 0.000013 3.5721 0.000041 1.99 6.29 |
| eval_old_dofs() 1341762 2.3692 0.000002 13.8493 0.000010 4.17 24.37 |
| |
| Parallel |
| allgather() 203 0.0002 0.000001 0.0002 0.000001 0.00 0.00 |
| |
| Partitioner |
| single_partition() 203 0.0133 0.000066 0.0133 0.000066 0.02 0.02 |
| |
| PatchRecoveryErrorEstimator |
| estimate_error() 202 8.7239 0.043187 20.4969 0.101470 15.35 36.07 |
| |
| PetscLinearSolver |
| solve() 303 0.7635 0.002520 0.7635 0.002520 1.34 1.34 |
| |
| StatisticsVector |
| maximum() 202 0.0008 0.000004 0.0008 0.000004 0.00 0.00 |
| |
| System |
| assemble() 303 3.3086 0.010919 8.1195 0.026797 5.82 14.29 |
| project_fem_vector() 1 0.0002 0.000171 0.0334 0.033433 0.00 0.06 |
| project_vector(FunctionBase) 1 0.0000 0.000010 0.0334 0.033443 0.00 0.06 |
| project_vector(old,new) 606 2.2412 0.003698 44.6599 0.073696 3.94 78.58 |
| |
| TopologyMap |
| init() 404 0.1538 0.000381 0.1538 0.000381 0.27 0.27 |
-----------------------------------------------------------------------------------------------------------------
| Totals: 2.204e+07 56.8324 100.00 |
-----------------------------------------------------------------------------------------------------------------
AMR 3 refinements
-----------------------------------------------------------------------------------------------------------------
| libMesh Performance: Alive time=238.81, Active time=167.585 |
-----------------------------------------------------------------------------------------------------------------
| Event nCalls Total Time Avg Time Total Time Avg Time % of Active Time |
| w/o Sub w/o Sub With Sub With Sub w/o S With S |
|-----------------------------------------------------------------------------------------------------------------|
| |
| |
| DefaultCoupling |
| operator() 842150 1.3950 0.000002 1.3950 0.000002 0.83 0.83 |
| |
| DofMap |
| add_neighbors_to_send_list() 304 0.8206 0.002699 3.1651 0.010411 0.49 1.89 |
| build_sparsity() 304 6.9767 0.022950 12.4466 0.040943 4.16 7.43 |
| create_dof_constraints() 304 1.7488 0.005753 4.3237 0.014223 1.04 2.58 |
| distribute_dofs() 304 0.1829 0.000602 3.9777 0.013084 0.11 2.37 |
| dof_indices() 12860957 11.3931 0.000001 11.3931 0.000001 6.80 6.80 |
| enforce_constraints_exactly() 909 3.1564 0.003472 3.1564 0.003472 1.88 1.88 |
| old_dof_indices() 7578465 7.1611 0.000001 7.1611 0.000001 4.27 4.27 |
| prepare_send_list() 305 0.0003 0.000001 0.0003 0.000001 0.00 0.00 |
| reinit() 304 0.6284 0.002067 0.6284 0.002067 0.37 0.37 |
| |
| EquationSystems |
| build_parallel_solution_vector() 5 0.0252 0.005032 0.0418 0.008360 0.02 0.02 |
| build_solution_vector() 5 0.0001 0.000015 0.0419 0.008375 0.00 0.02 |
| |
| ExodusII_IO |
| write_nodal_data() 3 0.0015 0.000506 0.0015 0.000506 0.00 0.00 |
| |
| FE |
| compute_shape_functions() 5859019 5.9806 0.000001 5.9806 0.000001 3.57 3.57 |
| init_shape_functions() 1723435 10.2193 0.000006 10.2193 0.000006 6.10 6.10 |
| inverse_map() 2650340 3.1708 0.000001 3.1708 0.000001 1.89 1.89 |
| |
| FEMap |
| compute_affine_map() 5859019 7.7749 0.000001 7.7749 0.000001 4.64 4.64 |
| init_reference_to_physical_map() 1723435 5.3966 0.000003 5.3966 0.000003 3.22 3.22 |
| |
| GMVIO |
| write_nodal_data() 2 0.0801 0.040046 0.0801 0.040046 0.05 0.05 |
| |
| GenericProjector |
| copy_dofs 1136565 4.5570 0.000004 17.2482 0.000015 2.72 10.29 |
| operator() 910 15.3913 0.016914 141.3689 0.155350 9.18 84.36 |
| project_edges 1387677 1.0267 0.000001 1.0267 0.000001 0.61 0.61 |
| project_interior 1387677 1.0240 0.000001 1.0240 0.000001 0.61 0.61 |
| project_nodes 1387677 6.4217 0.000005 87.2413 0.000063 3.83 52.06 |
| project_sides 1387677 1.0431 0.000001 1.0431 0.000001 0.62 0.62 |
| |
| Mesh |
| contract() 303 0.1038 0.000343 0.1467 0.000484 0.06 0.09 |
| find_neighbors() 304 1.8576 0.006111 1.8576 0.006111 1.11 1.11 |
| renumber_nodes_and_elem() 911 0.1198 0.000131 0.1198 0.000131 0.07 0.07 |
| |
| MeshOutput |
| write_equation_systems() 5 0.0001 0.000016 0.1236 0.024716 0.00 0.07 |
| |
| MeshRefinement |
| _coarsen_elements() 606 0.1436 0.000237 0.1436 0.000237 0.09 0.09 |
| _refine_elements() 606 1.6756 0.002765 5.0523 0.008337 1.00 3.01 |
| add_node() 1645216 1.4108 0.000001 1.4108 0.000001 0.84 0.84 |
| make_coarsening_compatible() 890 1.2224 0.001374 1.2224 0.001374 0.73 0.73 |
| make_flags_parallel_consistent() 909 0.2219 0.000244 0.2219 0.000244 0.13 0.13 |
| make_refinement_compatible() 890 0.0720 0.000081 0.0720 0.000081 0.04 0.04 |
| |
| MeshTools::Generation |
| build_cube() 1 0.0002 0.000212 0.0002 0.000212 0.00 0.00 |
| |
| OldSolutionValue |
| Number eval_at_node() 5549940 5.7961 0.000001 74.1854 0.000013 3.46 44.27 |
| check_old_context(c) 1136565 3.1938 0.000003 8.0410 0.000007 1.91 4.80 |
| check_old_context(c,p) 1654752 4.2185 0.000003 9.8454 0.000006 2.52 5.87 |
| eval_at_point() 1654752 21.8614 0.000013 66.4162 0.000040 13.04 39.63 |
| eval_old_dofs() 1136565 1.9376 0.000002 11.3337 0.000010 1.16 6.76 |
| |
| Parallel |
| allgather() 304 0.0002 0.000001 0.0002 0.000001 0.00 0.00 |
| |
| Partitioner |
| single_partition() 304 0.0375 0.000123 0.0375 0.000123 0.02 0.02 |
| |
| PatchRecoveryErrorEstimator |
| estimate_error() 303 15.3540 0.050673 35.4803 0.117097 9.16 21.17 |
| |
| PetscLinearSolver |
| solve() 404 1.3966 0.003457 1.3966 0.003457 0.83 0.83 |
| |
| StatisticsVector |
| maximum() 303 0.0015 0.000005 0.0015 0.000005 0.00 0.00 |
| |
| System |
| assemble() 404 5.2823 0.013075 12.6676 0.031355 3.15 7.56 |
| project_fem_vector() 1 0.0001 0.000102 0.0044 0.004428 0.00 0.00 |
| project_vector(FunctionBase) 1 0.0000 0.000010 0.0044 0.004438 0.00 0.00 |
| project_vector(old,new) 909 5.3573 0.005894 157.5095 0.173278 3.20 93.99 |
| |
| TopologyMap |
| init() 606 0.7440 0.001228 0.7440 0.001228 0.44 0.44 |
-----------------------------------------------------------------------------------------------------------------
| Totals: 5.857e+07 167.5848 100.00 |
-----------------------------------------------------------------------------------------------------------------
On Apr 27, 2017, at 14:29, Rossi, Simone <***@email.unc.edu> wrote:
Ok, I reran the tests with different max_h_levels, with the perflog enabled.
Let me know if you see anything here.
Thanks,
Simone
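For anyone who wants to rank the events in these logs rather than read them by eye, here is a minimal Python sketch that parses event rows of the table and sorts them by time spent in the event itself. The `Row` type and `parse_row` helper are hypothetical names introduced for illustration. The sample rows are copied from the "AMR: 1 refinement" log below, and the sketch assumes every input line is an event row (class headers such as "DofMap" and separator lines are not handled).

```python
from typing import NamedTuple


class Row(NamedTuple):
    event: str
    n_calls: int
    self_time: float    # "Total Time w/o Sub" (seconds)
    total_time: float   # "Total Time With Sub" (seconds)
    pct_self: float     # "% of Active Time w/o S"
    pct_total: float    # "% of Active Time With S"


def parse_row(line: str) -> Row:
    """Parse one event row; the event name may itself contain spaces."""
    body = line.strip().strip("|").strip()
    # The last seven whitespace-separated fields are numeric; the rest is the name.
    name, calls, t_self, _avg_self, t_total, _avg_total, p_self, p_total = body.rsplit(None, 7)
    return Row(name, int(calls), float(t_self), float(t_total),
               float(p_self), float(p_total))


SAMPLE = """\
| dof_indices() 22510266 19.3897 0.000001 19.3897 0.000001 7.41 7.41 |
| copy_dofs 3917556 15.7713 0.000004 59.2081 0.000015 6.02 22.61 |
| estimate_error() 101 73.8216 0.730907 231.1510 2.288624 28.20 88.29 |
| Number eval_at_node() 215712 0.2301 0.000001 2.9735 0.000014 0.09 1.14 |
| project_vector(old,new) 303 5.2799 0.017425 109.1696 0.360296 2.02 41.70 |
"""

rows = [parse_row(line) for line in SAMPLE.splitlines()]

# Rank by time spent in the event itself, excluding nested events.
by_self = sorted(rows, key=lambda r: r.self_time, reverse=True)
for r in by_self:
    print(f"{r.event:28s} self={r.self_time:9.4f}s  with_sub={r.total_time:9.4f}s")
```

Sorting by `self_time` shows where the time is actually spent, while sorting by `total_time` shows which top-level calls contain it; the two orderings can differ a lot, as the `project_vector(old,new)` row illustrates.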
NO AMR
-----------------------------------------------------------------------------------------------------------------
| libMesh Performance: Alive time=77.5482, Active time=40.2976 |
-----------------------------------------------------------------------------------------------------------------
| Event nCalls Total Time Avg Time Total Time Avg Time % of Active Time |
| w/o Sub w/o Sub With Sub With Sub w/o S With S |
|-----------------------------------------------------------------------------------------------------------------|
| |
| |
| DefaultCoupling |
| operator() 98306 0.1609 0.000002 0.1609 0.000002 0.40 0.40 |
| |
| DofMap |
| add_neighbors_to_send_list() 1 0.0959 0.095930 0.3744 0.374369 0.24 0.93 |
| build_sparsity() 1 0.4701 0.470055 1.1433 1.143297 1.17 2.84 |
| create_dof_constraints() 1 0.0137 0.013673 0.0137 0.013673 0.03 0.03 |
| distribute_dofs() 1 0.0126 0.012578 0.4376 0.437647 0.03 1.09 |
| dof_indices() 11010048 9.9728 0.000001 9.9728 0.000001 24.75 24.75 |
| prepare_send_list() 2 0.0000 0.000002 0.0000 0.000002 0.00 0.00 |
| reinit() 1 0.0507 0.050692 0.0507 0.050692 0.13 0.13 |
| |
| EquationSystems |
| build_parallel_solution_vector() 5 1.4241 0.284811 2.4934 0.498673 3.53 6.19 |
| build_solution_vector() 5 0.0002 0.000050 2.4936 0.498724 0.00 6.19 |
| |
| ExodusII_IO |
| write_nodal_data() 3 0.0774 0.025816 0.0774 0.025816 0.19 0.19 |
| |
| FE |
| compute_shape_functions() 10027008 11.7027 0.000001 11.7027 0.000001 29.04 29.04 |
| init_shape_functions() 102 0.0007 0.000007 0.0007 0.000007 0.00 0.00 |
| |
| FEMap |
| compute_affine_map() 10027008 9.9328 0.000001 9.9328 0.000001 24.65 24.65 |
| init_reference_to_physical_map() 102 0.0008 0.000008 0.0008 0.000008 0.00 0.00 |
| |
| GMVIO |
| write_nodal_data() 2 0.2260 0.113020 0.2260 0.113020 0.56 0.56 |
| |
| GenericProjector |
| operator() 1 0.8425 0.842529 2.0842 2.084232 2.09 5.17 |
| project_edges 98304 0.0765 0.000001 0.0765 0.000001 0.19 0.19 |
| project_interior 98304 0.0765 0.000001 0.0765 0.000001 0.19 0.19 |
| project_nodes 98304 0.0865 0.000001 0.0865 0.000001 0.21 0.21 |
| project_sides 98304 0.0763 0.000001 0.0763 0.000001 0.19 0.19 |
| |
| Mesh |
| find_neighbors() 1 0.1105 0.110532 0.1105 0.110532 0.27 0.27 |
| renumber_nodes_and_elem() 2 0.0063 0.003125 0.0063 0.003125 0.02 0.02 |
| |
| MeshOutput |
| write_equation_systems() 5 0.0001 0.000021 2.7972 0.559445 0.00 6.94 |
| |
| MeshTools::Generation |
| build_cube() 1 0.0280 0.027995 0.0280 0.027995 0.07 0.07 |
| |
| Parallel |
| allgather() 1 0.0000 0.000003 0.0000 0.000003 0.00 0.00 |
| |
| Partitioner |
| single_partition() 1 0.0028 0.002767 0.0028 0.002767 0.01 0.01 |
| |
| PetscLinearSolver |
| solve() 101 4.8469 0.047989 4.8469 0.047989 12.03 12.03 |
| |
| System |
| project_fem_vector() 1 0.0034 0.003364 2.0876 2.087598 0.01 5.18 |
| project_vector(FunctionBase) 1 0.0000 0.000011 2.0876 2.087610 0.00 5.18 |
-----------------------------------------------------------------------------------------------------------------
| Totals: 3.156e+07 40.2976 100.00 |
-----------------------------------------------------------------------------------------------------------------
AMR 1 refinement
-----------------------------------------------------------------------------------------------------------------
| libMesh Performance: Alive time=395.981, Active time=261.811 |
-----------------------------------------------------------------------------------------------------------------
| Event nCalls Total Time Avg Time Total Time Avg Time % of Active Time |
| w/o Sub w/o Sub With Sub With Sub w/o S With S |
|-----------------------------------------------------------------------------------------------------------------|
| |
| |
| DefaultCoupling |
| operator() 1336320 2.0806 0.000002 2.0806 0.000002 0.79 0.79 |
| |
| DofMap |
| add_neighbors_to_send_list() 102 1.2626 0.012378 4.8311 0.047363 0.48 1.85 |
| build_sparsity() 102 6.5962 0.064669 15.1863 0.148885 2.52 5.80 |
| create_dof_constraints() 102 0.1384 0.001356 0.2351 0.002305 0.05 0.09 |
| distribute_dofs() 102 0.1489 0.001459 5.6797 0.055684 0.06 2.17 |
| dof_indices() 22510266 19.3897 0.000001 19.3897 0.000001 7.41 7.41 |
| enforce_constraints_exactly() 303 0.1463 0.000483 0.1463 0.000483 0.06 0.06 |
| old_dof_indices() 11914452 11.0468 0.000001 11.0468 0.000001 4.22 4.22 |
| prepare_send_list() 103 0.0001 0.000001 0.0001 0.000001 0.00 0.00 |
| reinit() 102 0.6993 0.006856 0.6993 0.006856 0.27 0.27 |
| |
| EquationSystems |
| build_parallel_solution_vector() 5 0.1832 0.036644 0.3127 0.062538 0.07 0.12 |
| build_solution_vector() 5 0.0001 0.000018 0.3128 0.062557 0.00 0.12 |
| |
| ExodusII_IO |
| write_nodal_data() 3 0.0094 0.003131 0.0094 0.003131 0.00 0.00 |
| |
| FE |
| compute_shape_functions() 12975978 16.6602 0.000001 16.6602 0.000001 6.36 6.36 |
| init_shape_functions() 10329700 16.6365 0.000002 16.6365 0.000002 6.35 6.35 |
| inverse_map() 10386411 11.3644 0.000001 11.3644 0.000001 4.34 4.34 |
| |
| FEMap |
| compute_affine_map() 12975978 13.4041 0.000001 13.4041 0.000001 5.12 5.12 |
| compute_face_map() 7691859 8.9240 0.000001 8.9240 0.000001 3.41 3.41 |
| init_face_shape_functions() 101 0.0004 0.000004 0.0004 0.000004 0.00 0.00 |
| init_reference_to_physical_map() 10329700 11.4379 0.000001 11.4379 0.000001 4.37 4.37 |
| |
| GMVIO |
| write_nodal_data() 2 0.0979 0.048947 0.0979 0.048947 0.04 0.04 |
| |
| GenericProjector |
| copy_dofs 3917556 15.7713 0.000004 59.2081 0.000015 6.02 22.61 |
| operator() 304 11.6914 0.038458 95.5809 0.314411 4.47 36.51 |
| project_edges 66216 0.0489 0.000001 0.0489 0.000001 0.02 0.02 |
| project_interior 66216 0.0493 0.000001 0.0493 0.000001 0.02 0.02 |
| project_nodes 66216 0.2561 0.000004 3.4858 0.000053 0.10 1.33 |
| project_sides 66216 0.0498 0.000001 0.0498 0.000001 0.02 0.02 |
| |
| JumpErrorEstimator |
| estimate_error() 101 73.8216 0.730907 231.1510 2.288624 28.20 88.29 |
| |
| Mesh |
| contract() 101 0.0296 0.000293 0.0581 0.000575 0.01 0.02 |
| find_neighbors() 101 1.4534 0.014391 1.4534 0.014391 0.56 0.56 |
| renumber_nodes_and_elem() 303 0.0847 0.000280 0.0847 0.000280 0.03 0.03 |
| |
| MeshOutput |
| write_equation_systems() 5 0.0001 0.000017 0.4202 0.084033 0.00 0.16 |
| |
| MeshRefinement |
| _coarsen_elements() 202 0.0812 0.000402 0.0812 0.000402 0.03 0.03 |
| _refine_elements() 202 0.1485 0.000735 0.2795 0.001383 0.06 0.11 |
| add_node() 64512 0.0546 0.000001 0.0546 0.000001 0.02 0.02 |
| make_coarsening_compatible() 204 0.3018 0.001479 0.3018 0.001479 0.12 0.12 |
| make_flags_parallel_consistent() 303 0.2300 0.000759 0.2300 0.000759 0.09 0.09 |
| make_refinement_compatible() 204 0.0242 0.000119 0.0242 0.000119 0.01 0.01 |
| |
| MeshTools::Generation |
| build_cube() 1 0.0039 0.003937 0.0039 0.003937 0.00 0.00 |
| |
| OldSolutionValue |
| Number eval_at_node() 215712 0.2301 0.000001 2.9735 0.000014 0.09 1.14 |
| check_old_context(c) 3917556 10.9141 0.000003 27.5061 0.000007 4.17 10.51 |
| check_old_context(c,p) 68724 0.1726 0.000003 0.4012 0.000006 0.07 0.15 |
| eval_at_point() 68724 0.8513 0.000012 2.6627 0.000039 0.33 1.02 |
| eval_old_dofs() 3917556 6.6409 0.000002 38.7818 0.000010 2.54 14.81 |
| |
| Parallel |
| allgather() 102 0.0001 0.000001 0.0001 0.000001 0.00 0.00 |
| |
| Partitioner |
| single_partition() 101 0.0341 0.000338 0.0341 0.000338 0.01 0.01 |
| |
| PetscLinearSolver |
| solve() 202 1.6660 0.008248 1.6660 0.008248 0.64 0.64 |
| |
| StatisticsVector |
| maximum() 101 0.0018 0.000017 0.0018 0.000017 0.00 0.00 |
| |
| System |
| assemble() 202 11.5849 0.057351 28.7372 0.142263 4.42 10.98 |
| project_fem_vector() 1 0.0004 0.000417 0.2583 0.258341 0.00 0.10 |
| project_vector(FunctionBase) 1 0.0000 0.000008 0.2584 0.258351 0.00 0.10 |
| project_vector(old,new) 303 5.2799 0.017425 109.1696 0.360296 2.02 41.70 |
| |
| TopologyMap |
| init() 202 0.1071 0.000530 0.1071 0.000530 0.04 0.04 |
-----------------------------------------------------------------------------------------------------------------
| Totals: 1.129e+08 261.8108 100.00 |
-----------------------------------------------------------------------------------------------------------------
AMR 2 refinements
-----------------------------------------------------------------------------------------------------------------
| libMesh Performance: Alive time=156.79, Active time=103.985 |
-----------------------------------------------------------------------------------------------------------------
| Event nCalls Total Time Avg Time Total Time Avg Time % of Active Time |
| w/o Sub w/o Sub With Sub With Sub w/o S With S |
|-----------------------------------------------------------------------------------------------------------------|
| |
| |
| DefaultCoupling |
| operator() 487585 0.7671 0.000002 0.7671 0.000002 0.74 0.74 |
| |
| DofMap |
| add_neighbors_to_send_list() 203 0.4861 0.002394 1.8338 0.009034 0.47 1.76 |
| build_sparsity() 203 2.8815 0.014194 6.2119 0.030601 2.77 5.97 |
| create_dof_constraints() 203 0.2105 0.001037 0.4801 0.002365 0.20 0.46 |
| distribute_dofs() 203 0.0596 0.000294 2.1454 0.010569 0.06 2.06 |
| dof_indices() 8055927 7.4875 0.000001 7.4875 0.000001 7.20 7.20 |
| enforce_constraints_exactly() 606 0.3674 0.000606 0.3674 0.000606 0.35 0.35 |
| old_dof_indices() 4358601 4.2132 0.000001 4.2132 0.000001 4.05 4.05 |
| prepare_send_list() 204 0.0002 0.000001 0.0002 0.000001 0.00 0.00 |
| reinit() 203 0.2510 0.001237 0.2510 0.001237 0.24 0.24 |
| |
| EquationSystems |
| build_parallel_solution_vector() 5 0.0316 0.006312 0.0543 0.010852 0.03 0.05 |
| build_solution_vector() 5 0.0001 0.000014 0.0543 0.010868 0.00 0.05 |
| |
| ExodusII_IO |
| write_nodal_data() 3 0.0024 0.000816 0.0024 0.000816 0.00 0.00 |
| |
| FE |
| compute_shape_functions() 4507581 6.1953 0.000001 6.1953 0.000001 5.96 5.96 |
| init_shape_functions() 3783756 6.6310 0.000002 6.6310 0.000002 6.38 6.38 |
| inverse_map() 3875385 4.5491 0.000001 4.5491 0.000001 4.37 4.37 |
| |
| FEMap |
| compute_affine_map() 4507581 5.2201 0.000001 5.2201 0.000001 5.02 5.02 |
| compute_face_map() 2763882 3.5520 0.000001 3.5520 0.000001 3.42 3.42 |
| init_face_shape_functions() 202 0.0007 0.000004 0.0007 0.000004 0.00 0.00 |
| init_reference_to_physical_map() 3783756 4.6286 0.000001 4.6286 0.000001 4.45 4.45 |
| |
| GMVIO |
| write_nodal_data() 2 0.1665 0.083237 0.1665 0.083237 0.16 0.16 |
| |
| GenericProjector |
| copy_dofs 1361385 5.6580 0.000004 21.6490 0.000016 5.44 20.82 |
| operator() 607 5.0012 0.008239 40.4516 0.066642 4.81 38.90 |
| project_edges 97080 0.0766 0.000001 0.0766 0.000001 0.07 0.07 |
| project_interior 97080 0.0751 0.000001 0.0751 0.000001 0.07 0.07 |
| project_nodes 97080 0.4693 0.000005 5.0553 0.000052 0.45 4.86 |
| project_sides 97080 0.0770 0.000001 0.0770 0.000001 0.07 0.07 |
| |
| JumpErrorEstimator |
| estimate_error() 202 28.7106 0.142132 89.7093 0.444106 27.61 86.27 |
| |
| Mesh |
| contract() 202 0.0160 0.000079 0.0280 0.000139 0.02 0.03 |
| find_neighbors() 203 0.5978 0.002945 0.5978 0.002945 0.57 0.57 |
| renumber_nodes_and_elem() 608 0.0350 0.000058 0.0350 0.000058 0.03 0.03 |
| |
| MeshOutput |
| write_equation_systems() 5 0.0001 0.000013 0.2233 0.044669 0.00 0.21 |
| |
| MeshRefinement |
| _coarsen_elements() 404 0.0378 0.000094 0.0378 0.000094 0.04 0.04 |
| _refine_elements() 404 0.1563 0.000387 0.4010 0.000993 0.15 0.39 |
| add_node() 113664 0.1007 0.000001 0.1007 0.000001 0.10 0.10 |
| make_coarsening_compatible() 407 0.1988 0.000489 0.1988 0.000489 0.19 0.19 |
| make_flags_parallel_consistent() 606 0.0937 0.000155 0.0937 0.000155 0.09 0.09 |
| make_refinement_compatible() 407 0.0102 0.000025 0.0102 0.000025 0.01 0.01 |
| |
| MeshTools::Generation |
| build_cube() 1 0.0007 0.000677 0.0007 0.000677 0.00 0.00 |
| |
| OldSolutionValue |
| Number eval_at_node() 382176 0.3948 0.000001 4.1027 0.000011 0.38 3.95 |
| check_old_context(c) 1361385 4.0214 0.000003 10.1149 0.000007 3.87 9.73 |
| check_old_context(c,p) 85266 0.2420 0.000003 0.5679 0.000007 0.23 0.55 |
| eval_at_point() 85266 1.1648 0.000014 3.5999 0.000042 1.12 3.46 |
| eval_old_dofs() 1361385 2.4347 0.000002 14.2695 0.000010 2.34 13.72 |
| |
| Parallel |
| allgather() 203 0.0002 0.000001 0.0002 0.000001 0.00 0.00 |
| |
| Partitioner |
| single_partition() 203 0.0140 0.000069 0.0140 0.000069 0.01 0.01 |
| |
| PetscLinearSolver |
| solve() 303 0.7612 0.002512 0.7612 0.002512 0.73 0.73 |
| |
| StatisticsVector |
| maximum() 202 0.0008 0.000004 0.0008 0.000004 0.00 0.00 |
| |
| System |
| assemble() 303 3.4738 0.011465 8.5615 0.028256 3.34 8.23 |
| project_fem_vector() 1 0.0001 0.000142 0.0331 0.033134 0.00 0.03 |
| project_vector(FunctionBase) 1 0.0000 0.000009 0.0331 0.033144 0.00 0.03 |
| project_vector(old,new) 606 2.3051 0.003804 46.4861 0.076710 2.22 44.70 |
| |
| TopologyMap |
| init() 404 0.1561 0.000386 0.1561 0.000386 0.15 0.15 |
-----------------------------------------------------------------------------------------------------------------
| Totals: 4.127e+07 103.9851 100.00 |
-----------------------------------------------------------------------------------------------------------------
AMR 3 refinements
-----------------------------------------------------------------------------------------------------------------
| libMesh Performance: Alive time=455.466, Active time=308.123 |
-----------------------------------------------------------------------------------------------------------------
| Event nCalls Total Time Avg Time Total Time Avg Time % of Active Time |
| w/o Sub w/o Sub With Sub With Sub w/o S With S |
|-----------------------------------------------------------------------------------------------------------------|
| |
| |
| DefaultCoupling |
| operator() 1153034 1.9001 0.000002 1.9001 0.000002 0.62 0.62 |
| |
| DofMap |
| add_neighbors_to_send_list() 304 1.1509 0.003786 4.4251 0.014556 0.37 1.44 |
| build_sparsity() 304 8.8682 0.029172 16.6976 0.054926 2.88 5.42 |
| create_dof_constraints() 304 1.6469 0.005417 3.9951 0.013142 0.53 1.30 |
| distribute_dofs() 304 0.2173 0.000715 5.4146 0.017811 0.07 1.76 |
| dof_indices() 19916934 18.6126 0.000001 18.6126 0.000001 6.04 6.04 |
| enforce_constraints_exactly() 909 2.8034 0.003084 2.8034 0.003084 0.91 0.91 |
| old_dof_indices() 10268793 10.0124 0.000001 10.0124 0.000001 3.25 3.25 |
| prepare_send_list() 305 0.0003 0.000001 0.0003 0.000001 0.00 0.00 |
| reinit() 304 0.7707 0.002535 0.7707 0.002535 0.25 0.25 |
| |
| EquationSystems |
| build_parallel_solution_vector() 5 0.0498 0.009954 0.0849 0.016974 0.02 0.03 |
| build_solution_vector() 5 0.0001 0.000015 0.0850 0.016991 0.00 0.03 |
| |
| ExodusII_IO |
| write_nodal_data() 3 0.0016 0.000526 0.0016 0.000526 0.00 0.00 |
| |
| FE |
| compute_shape_functions() 12087258 16.7562 0.000001 16.7562 0.000001 5.44 5.44 |
| init_shape_functions() 10555340 23.3502 0.000002 23.3502 0.000002 7.58 7.58 |
| inverse_map() 11670851 13.6081 0.000001 13.6081 0.000001 4.42 4.42 |
| |
| FEMap |
| compute_affine_map() 12087258 15.1613 0.000001 15.1613 0.000001 4.92 4.92 |
| compute_face_map() 6822171 8.8288 0.000001 8.8288 0.000001 2.87 2.87 |
| init_face_shape_functions() 303 0.0011 0.000004 0.0011 0.000004 0.00 0.00 |
| init_reference_to_physical_map() 10555340 14.9343 0.000001 14.9343 0.000001 4.85 4.85 |
| |
| GMVIO |
| write_nodal_data() 2 0.0676 0.033816 0.0676 0.033816 0.02 0.02 |
| |
| GenericProjector |
| copy_dofs 2157561 8.8513 0.000004 33.9505 0.000016 2.87 11.02 |
| operator() 910 18.4194 0.020241 155.7304 0.171132 5.98 50.54 |
| project_edges 1299333 1.0235 0.000001 1.0235 0.000001 0.33 0.33 |
| project_interior 1299333 1.0026 0.000001 1.0026 0.000001 0.33 0.33 |
| project_nodes 1299333 6.3258 0.000005 76.0383 0.000059 2.05 24.68 |
| project_sides 1299333 1.0258 0.000001 1.0258 0.000001 0.33 0.33 |
| |
| JumpErrorEstimator |
| estimate_error() 303 71.4588 0.235838 222.8668 0.735534 23.19 72.33 |
| |
| Mesh |
| contract() 303 0.0998 0.000329 0.1462 0.000483 0.03 0.05 |
| find_neighbors() 304 2.2488 0.007397 2.2488 0.007397 0.73 0.73 |
| renumber_nodes_and_elem() 911 0.1348 0.000148 0.1348 0.000148 0.04 0.04 |
| |
| MeshOutput |
| write_equation_systems() 5 0.0001 0.000013 0.1542 0.030848 0.00 0.05 |
| |
| MeshRefinement |
| _coarsen_elements() 606 0.1621 0.000268 0.1621 0.000268 0.05 0.05 |
| _refine_elements() 606 1.6498 0.002722 4.9647 0.008193 0.54 1.61 |
| add_node() 1542432 1.3647 0.000001 1.3647 0.000001 0.44 0.44 |
| make_coarsening_compatible() 809 1.4420 0.001782 1.4420 0.001782 0.47 0.47 |
| make_flags_parallel_consistent() 909 0.2881 0.000317 0.2881 0.000317 0.09 0.09 |
| make_refinement_compatible() 809 0.0552 0.000068 0.0552 0.000068 0.02 0.02 |
| |
| MeshTools::Generation |
| build_cube() 1 0.0002 0.000230 0.0002 0.000230 0.00 0.00 |
| |
| OldSolutionValue |
| Number eval_at_node() 5196564 5.4931 0.000001 63.1554 0.000012 1.78 20.50 |
| check_old_context(c) 2157561 6.2716 0.000003 15.8724 0.000007 2.04 5.15 |
| check_old_context(c,p) 1343484 3.6784 0.000003 8.6255 0.000006 1.19 2.80 |
| eval_at_point() 1343484 18.1202 0.000013 55.9662 0.000042 5.88 18.16 |
| eval_old_dofs() 2157561 3.8284 0.000002 22.3994 0.000010 1.24 7.27 |
| |
| Parallel |
| allgather() 304 0.0003 0.000001 0.0003 0.000001 0.00 0.00 |
| |
| Partitioner |
| single_partition() 304 0.0450 0.000148 0.0450 0.000148 0.01 0.01 |
| |
| PetscLinearSolver |
| solve() 404 1.5022 0.003718 1.5022 0.003718 0.49 0.49 |
| |
| StatisticsVector |
| maximum() 303 0.0019 0.000006 0.0019 0.000006 0.00 0.00 |
| |
| System |
| assemble() 404 7.4765 0.018506 18.1484 0.044922 2.43 5.89 |
| project_fem_vector() 1 0.0001 0.000109 0.0045 0.004474 0.00 0.00 |
| project_vector(FunctionBase) 1 0.0000 0.000010 0.0045 0.004485 0.00 0.00 |
| project_vector(old,new) 909 6.4352 0.007079 174.8106 0.192311 2.09 56.73 |
| |
| TopologyMap |
| init() 606 0.9755 0.001610 0.9755 0.001610 0.32 0.32 |
-----------------------------------------------------------------------------------------------------------------
| Totals: 1.162e+08 308.1230 100.00 |
-----------------------------------------------------------------------------------------------------------------
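A log of this size is easier to rank once the rows are parsed. The following is a minimal Python sketch, assuming each data row keeps the layout shown above (an event name, possibly containing spaces, followed by seven numeric fields). The names `parse_row` and `ROWS` are illustrative and are not part of libMesh; the sample rows are copied from the table above.

```python
# Sample rows copied verbatim from the PerfLog output above.
ROWS = [
    "| GenericProjector |",
    "| copy_dofs 2157561 8.8513 0.000004 33.9505 0.000016 2.87 11.02 |",
    "| operator() 910 18.4194 0.020241 155.7304 0.171132 5.98 50.54 |",
    "| project_nodes 1299333 6.3258 0.000005 76.0383 0.000059 2.05 24.68 |",
    "| estimate_error() 303 71.4588 0.235838 222.8668 0.735534 23.19 72.33 |",
    "| Number eval_at_node() 5196564 5.4931 0.000001 63.1554 0.000012 1.78 20.50 |",
    "| eval_at_point() 1343484 18.1202 0.000013 55.9662 0.000042 5.88 18.16 |",
    "| solve() 404 1.5022 0.003718 1.5022 0.003718 0.49 0.49 |",
    "| project_vector(old,new) 909 6.4352 0.007079 174.8106 0.192311 2.09 56.73 |",
]


def parse_row(line):
    """Return (event, [7 numeric fields]) for a data row, or None otherwise."""
    tokens = line.strip().strip("|").split()
    if len(tokens) < 8:
        return None  # class headers, blank rows, the Totals row
    try:
        numbers = [float(t) for t in tokens[-7:]]
    except ValueError:
        return None  # column-header rows
    # Everything before the seven numbers is the event name.
    return " ".join(tokens[:-7]), numbers


parsed = [row for row in map(parse_row, ROWS) if row is not None]

# Field order: nCalls, total w/o sub, avg w/o sub,
#              total with sub, avg with sub, % w/o sub, % with sub
by_exclusive = sorted(parsed, key=lambda r: r[1][1], reverse=True)
by_inclusive = sorted(parsed, key=lambda r: r[1][3], reverse=True)

for event, numbers in by_inclusive[:3]:
    print(f"{event:28s} {numbers[3]:10.4f} s with sub ({numbers[6]:.2f}%)")
```

Sorting by the "With Sub" column shows which calls dominate including everything they invoke, while the "w/o Sub" column shows where the time is actually spent.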
On Apr 27, 2017, at 12:14, Vikram Garg <***@gmail.com> wrote:
Rossi, yes, compiling with --enable-perflog should give you all the details shown in the example.
On Thu, Apr 27, 2017 at 10:54 AM, Rossi, Simone <***@email.unc.edu> wrote:
Dear Vikram,
as in the examples, I am using the libMesh::KellyErrorEstimator.
I’m compiling libMesh with the --enable-perflog option. Does it automatically give all the details you have listed in the example?
For the time being, I am attaching two perfLogs I had saved with only “coarse scale” data for 2 levels of refinement.
It looks like most of the time is spent in the AMR step, probably in the call to reinit().
Thanks,
Simone
NO AMR:
------------------------------------------------------------------------------------------------------------
| perf_log Performance: Alive time=18.0494, Active time=18.0426                                            |
------------------------------------------------------------------------------------------------------------
| Event                       nCalls  Total Time    Avg Time  Total Time    Avg Time      % of Active Time |
|                                        w/o Sub     w/o Sub    With Sub    With Sub      w/o S     With S |
|----------------------------------------------------------------------------------------------------------|
| no amr matrix assembly           1      0.1545    0.154465      0.1545    0.154465       0.86       0.86 |
| no amr linear solve            101      4.8069    0.047593      4.8069    0.047593      26.64      26.64 |
| no amr rhs assembly            101     12.0348    0.119156     12.0348    0.119156      66.70      66.70 |
| time loop                        1      1.0464    1.046422     17.8884   17.888405       5.80      99.15 |
------------------------------------------------------------------------------------------------------------
| Totals:                        204     18.0426                                         100.00            |
------------------------------------------------------------------------------------------------------------
AMR:
------------------------------------------------------------------------------------------------------------
| perf_log Performance: Alive time=209.305, Active time=209.298                                            |
------------------------------------------------------------------------------------------------------------
| Event                       nCalls  Total Time    Avg Time  Total Time    Avg Time      % of Active Time |
|                                        w/o Sub     w/o Sub    With Sub    With Sub      w/o S     With S |
|----------------------------------------------------------------------------------------------------------|
|                                                                                                          |
| amr                            303    195.1102    0.643928    195.1102    0.643928      93.22      93.22 |
| amr solve                      303     13.9907    0.046174     13.9907    0.046174       6.68       6.68 |
| time loop                        1      0.1974    0.197370    209.2990  209.299042       0.09     100.00 |
------------------------------------------------------------------------------------------------------------
| Totals:                        607    209.2983                                         100.00            |
------------------------------------------------------------------------------------------------------------
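The gap between the two coarse logs above can be quantified directly. A minimal Python sketch, using only numbers copied from those two tables (the variable names are illustrative and not part of libMesh):

```python
# Values copied from the "NO AMR" and "AMR" perf_log tables above (seconds).
no_amr_active = 18.0426      # active time, uniform mesh
amr_active = 209.298         # active time, AMR run
amr_step_total = 195.1102    # "amr" event, total time
amr_step_avg = 0.643928      # "amr" event, average time per call
amr_solve_avg = 0.046174     # "amr solve" event, average time per call

slowdown = amr_active / no_amr_active          # overall AMR vs. uniform
amr_share = amr_step_total / amr_active        # fraction spent in "amr"
step_vs_solve = amr_step_avg / amr_solve_avg   # cost of one "amr" call vs. one solve

print(f"AMR run is {slowdown:.1f}x slower than the uniform run")
print(f"'amr' event accounts for {amr_share:.1%} of the AMR active time")
print(f"one 'amr' call costs about {step_vs_solve:.0f}x one 'amr solve' call")
```

With these numbers the slowdown is roughly 11.6x, and the "amr" event takes about 93% of the active time, matching the 93.22 reported in the table. The linear solves themselves cost about the same per call in both runs (0.046 s with AMR against 0.048 s without).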
On Apr 27, 2017, at 11:02, Vikram Garg <***@gmail.com> wrote:
Hello Rossi,
Two questions:
1) Which error estimator/indicator are you using to mark elements for refinement?
2) Can you send the perfLog output from libMesh? You might need to recompile libMesh with the option --enable-perflog.
It looks something like this:
-----------------------------------------------------------------------------------------------------------------
| libMesh Performance: Alive time=0.013423, Active time=0.007095 |
-----------------------------------------------------------------------------------------------------------------
| Event nCalls Total Time Avg Time Total Time Avg Time % of Active Time |
| w/o Sub w/o Sub With Sub With Sub w/o S With S |
|-----------------------------------------------------------------------------------------------------------------|
| |
| |
| DofMap |
| add_neighbors_to_send_list() 6 0.0001 0.000012 0.0001 0.000012 1.01 1.01 |
| build_sparsity() 6 0.0002 0.000033 0.0011 0.000187 2.78 15.84 |
| create_dof_constraints() 6 0.0000 0.000001 0.0000 0.000001 0.07 0.07 |
| distribute_dofs() 6 0.0001 0.000025 0.0004 0.000066 2.09 5.57 |
| dof_indices() 688 0.0010 0.000001 0.0010 0.000001 14.36 14.36 |
| old_dof_indices() 300 0.0001 0.000000 0.0001 0.000000 0.96 0.96 |
| prepare_send_list() 7 0.0000 0.000000 0.0000 0.000000 0.01 0.01 |
| reinit() 6 0.0002 0.000041 0.0002 0.000041 3.48 3.48 |
| |
| EquationSystems |
| build_solution_vector() 1 0.0001 0.000056 0.0001 0.000064 0.79 0.90 |
Thanks.
On Wed, Apr 26, 2017 at 10:09 PM, Rossi, Simone <***@email.unc.edu> wrote:
Dear Roy, dear Paul, dear all,
I am testing AMR in libMesh using simple linear elements.
My test case is a propagating front described by a reaction-diffusion equation with a cubic bistable reaction term.
I followed the adaptivity examples to create this test case.
Runs of 100 timesteps with AMR can be more than 10 times slower than the same runs on a fine uniform grid.
For example, with a 16 x 16 x 16 uniform grid, 100 iterations take about 18 seconds with a single processor.
With AMR, using a 2 x 2 x 2 grid and 3 levels of refinement, 100 iterations take about 800 seconds.
I’m attaching the code I’m using.
Without AMR, I build the matrix ( mass + dt * stiffness ) once and I update the rhs at every timestep.
Conversely, with AMR I am building the matrix and the rhs at every timestep for all the refinement levels.
Do you have any suggestions?
Thanks a lot for your help,
All the best,
Simone
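The difference between the two workflows described above can be illustrated outside libMesh. The following is a minimal 1D NumPy sketch, not the attached code. It assumes linear elements on a uniform mesh, a semi-implicit step (M + dt*D*K) u_new = M (u_old + dt*f(u_old)), and a reaction term f(u) = u(1-u)(u-a). The diffusion coefficient D, the threshold a, the initial condition, and all function names are assumptions made for illustration.

```python
import numpy as np


def assemble_1d(n_elem, length=1.0):
    """Consistent P1 mass and stiffness matrices on a uniform 1D mesh."""
    h = length / n_elem
    n = n_elem + 1
    M = np.zeros((n, n))
    K = np.zeros((n, n))
    m_loc = h / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])
    k_loc = 1.0 / h * np.array([[1.0, -1.0], [-1.0, 1.0]])
    for e in range(n_elem):
        idx = np.ix_([e, e + 1], [e, e + 1])
        M[idx] += m_loc
        K[idx] += k_loc
    return M, K


def reaction(u, a=0.1):
    # Assumed cubic bistable term; the thread does not give its exact form.
    return u * (1.0 - u) * (u - a)


def run(n_elem=64, dt=0.01, n_steps=100, diffusion=1e-3):
    M, K = assemble_1d(n_elem)
    A = M + dt * diffusion * K      # system matrix, built once
    A_inv = np.linalg.inv(A)        # stands in for a reusable factorization
    x = np.linspace(0.0, 1.0, n_elem + 1)
    u = (x < 0.2).astype(float)     # assumed initial front
    for _ in range(n_steps):
        rhs = M @ (u + dt * reaction(u))   # only the rhs changes per step
        u = A_inv @ rhs
    return u


u_final = run()
print(u_final.min(), u_final.max())
```

On a fixed mesh only the right-hand side changes between steps, so the matrix work is paid once. After each refinement or coarsening the set of unknowns changes, so M and K have to be reassembled and the old solution projected onto the new mesh.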
_______________________________________________
Libmesh-users mailing list
Libmesh-***@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/libmesh-users
--
Vikram Garg
Postdoctoral Associate
The University of Texas at Austin
http://vikramvgarg.wordpress.com/
http://www.runforindia.org/runners/vikramg