Significant runtime difference between CERES and GTSAM on same BAL problem #2299

@dvorak0

Description

I noticed a large runtime discrepancy between Ceres's bundle_adjuster binary and GTSAM's SFMExample_bal example when running both on the same BAL dataset.

Dataset:

problem-16-22106-pre.txt from https://grail.cs.washington.edu/projects/bal/dubrovnik.html
(Problem size: 16 cameras, 22106 points, 83718 observations)
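
As a sanity check on the problem size, the three numbers are the camera, point, and observation counts from the first line of the BAL file. The sketch below assumes the standard BAL layout of 9 parameters per camera, 3 per point, and 2 residuals per observation; the function name is hypothetical. It derives counts that can be compared against the Ceres solver summary in the Timing section.

```python
def bal_problem_dimensions(header: str) -> dict:
    """Derive optimization dimensions from a BAL header line.

    Assumes the BAL convention: 9 parameters per camera,
    3 per point, and one 2D reprojection residual per observation.
    """
    num_cameras, num_points, num_observations = map(int, header.split())
    return {
        "parameter_blocks": num_cameras + num_points,
        "parameters": 9 * num_cameras + 3 * num_points,
        "residual_blocks": num_observations,
        "residuals": 2 * num_observations,
    }


dims = bal_problem_dimensions("16 22106 83718")
for name, value in dims.items():
    print(f"{name}: {value}")
```

If the printed values agree with the "Parameter blocks", "Parameters", "Residual blocks", and "Residuals" rows reported by Ceres, both programs are at least being fed a problem of the same size.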

Version

96e5355

Build command:

cmake -DCMAKE_BUILD_TYPE=Release ..
make -j

Timing

ceres

root@nixos:/tinynav/output/ceres-solver/build# time ./bin/bundle_adjuster --input /tinynav/problem-16-22106-pre.txt -num_threads=1 -linear_solver=dense_schur
iter      cost      cost_change  |gradient|   |step|    tr_ratio  tr_radius  ls_iter  iter_time  total_time
   0  4.185660e+06    0.00e+00    2.16e+07   0.00e+00   0.00e+00  1.00e+04        0    4.42e-02    7.22e-02
   1  1.980525e+05    3.99e+06    5.34e+06   0.00e+00   9.60e-01  3.00e+04        1    8.01e-02    1.52e-01
   2  5.086543e+04    1.47e+05    2.11e+06   1.01e+03   8.22e-01  4.09e+04        1    7.46e-02    2.27e-01
   3  1.859667e+04    3.23e+04    2.87e+05   2.64e+02   9.85e-01  1.23e+05        1    7.48e-02    3.02e-01
   4  1.803857e+04    5.58e+02    2.69e+04   8.66e+01   9.93e-01  3.69e+05        1    7.44e-02    3.76e-01
   5  1.803391e+04    4.66e+00    3.11e+02   1.02e+01   1.00e+00  1.11e+06        1    7.31e-02    4.49e-01

Solver Summary (v 2.2.0-eigen-(3.4.0)-lapack-suitesparse-(5.10.1)-eigensparse-cuda-(12020))

                                     Original                  Reduced
Parameter blocks                        22122                    22122
Parameters                              66462                    66462
Residual blocks                         83718                    83718
Residuals                              167436                   167436

Minimizer                        TRUST_REGION

Dense linear algebra library            EIGEN 
Trust region strategy     LEVENBERG_MARQUARDT
                                        Given                     Used
Linear solver                     DENSE_SCHUR              DENSE_SCHUR
Threads                                     1                        1
Linear solver ordering               22106,16                 22106,16
Schur structure                         2,3,9                    2,3,9

Cost:
Initial                          4.185660e+06
Final                            1.803391e+04
Change                           4.167626e+06

Minimizer iterations                        6
Successful steps                            6
Unsuccessful steps                          0

Time (in seconds):
Preprocessor                         0.028064

  Residual only evaluation           0.034154 (5)
  Jacobian & residual evaluation     0.210194 (6)
  Linear solver                      0.131609 (5)
Minimizer                            0.422234

Postprocessor                        0.002126
Total                                0.452424

Termination:                   NO_CONVERGENCE (Maximum number of iterations reached. Number of iterations: 5.)


real	0m0.589s
user	0m0.527s
sys	0m0.060s

gtsam

root@nixos:/tinynav/output/gtsam/build# time ./examples/SFMExample_bal /tinynav/problem-16-22106-pre.txt
read 22106 tracks on 16 cameras
Initial error: 4.18566e+06
newError: 2.84007e+06
errorThreshold: 2.84007e+06 > 0
absoluteDecrease: 1345595.21265 >= 1e-05
relativeDecrease: 0.321477369734 >= 1e-05
newError: 428396.143992
errorThreshold: 428396.143992 > 0
absoluteDecrease: 2411669.40624 >= 1e-05
relativeDecrease: 0.84915976888 >= 1e-05
newError: 82756.334431
errorThreshold: 82756.334431 > 0
absoluteDecrease: 345639.809561 >= 1e-05
relativeDecrease: 0.806822877396 >= 1e-05
newError: 28524.7237205
errorThreshold: 28524.7237205 > 0
absoluteDecrease: 54231.6107104 >= 1e-05
relativeDecrease: 0.655316732953 >= 1e-05
newError: 21028.9020562
errorThreshold: 21028.9020562 > 0
absoluteDecrease: 7495.82166434 >= 1e-05
relativeDecrease: 0.262783322207 >= 1e-05
newError: 20482.7928745
errorThreshold: 20482.7928745 > 0
absoluteDecrease: 546.109181693 >= 1e-05
relativeDecrease: 0.0259694576651 >= 1e-05
newError: 20475.2556405
errorThreshold: 20475.2556405 > 0
absoluteDecrease: 7.53723400232 >= 1e-05
relativeDecrease: 0.000367978822444 >= 1e-05
newError: 20475.2532671
errorThreshold: 20475.2532671 > 0
absoluteDecrease: 0.00237339307205 >= 1e-05
relativeDecrease: 1.15915186297e-07 < 1e-05
converged
errorThreshold: 20475.2532671 <? 0
absoluteDecrease: 0.00237339307205 <? 1e-05
relativeDecrease: 1.15915186297e-07 <? 1e-05
iterations: 8 >? 100
final error: 20475.2532671

real	0m3.145s
user	0m13.053s
sys	0m1.102s

Expected behavior:

Since both programs appear to solve the same BAL problem and both were built in Release mode, I expected similar runtimes. Instead, SFMExample_bal takes 3.145 s of wall-clock time versus 0.589 s for bundle_adjuster.
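
To quantify the gap, the `time` figures reported above can be compared directly. This is a minimal sketch; the helper name and dictionary layout are illustrative only, and the values are copied from the two runs. It computes the wall-clock ratio, the total CPU-time ratio, and CPU time divided by wall time for each run.

```python
def parse_time(value: str) -> float:
    """Convert a `time`-style value such as '0m3.145s' to seconds."""
    minutes, seconds = value.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


# Values copied from the two runs above.
ceres = {"real": "0m0.589s", "user": "0m0.527s", "sys": "0m0.060s"}
gtsam = {"real": "0m3.145s", "user": "0m13.053s", "sys": "0m1.102s"}


def cpu_seconds(run: dict) -> float:
    """Total CPU time (user + sys) in seconds."""
    return parse_time(run["user"]) + parse_time(run["sys"])


wall_ratio = parse_time(gtsam["real"]) / parse_time(ceres["real"])
cpu_ratio = cpu_seconds(gtsam) / cpu_seconds(ceres)
gtsam_cpu_per_wall = cpu_seconds(gtsam) / parse_time(gtsam["real"])
ceres_cpu_per_wall = cpu_seconds(ceres) / parse_time(ceres["real"])

print(f"wall-clock ratio (GTSAM / Ceres): {wall_ratio:.2f}")
print(f"CPU-time ratio (GTSAM / Ceres):   {cpu_ratio:.2f}")
print(f"GTSAM CPU time / wall time:       {gtsam_cpu_per_wall:.2f}")
print(f"Ceres CPU time / wall time:       {ceres_cpu_per_wall:.2f}")
```

A CPU-to-wall ratio well above 1 for the GTSAM run suggests it was using several threads, while the Ceres run was pinned to one with -num_threads=1, so the wall-clock ratio alone may understate the difference in total work.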

perf result

[Image: perf result]
