
Performance

Sunil Anandatheertha edited this page Jul 3, 2021 · 11 revisions

Current performance


@25-06-2021: A speed-up has been achieved by pre-allocating the random flip changes into a matrix the size of the domain and moving this step out of the inner solver loop. ALG0101, derived from ALG01, has been implemented and integrated into PXO. On a 200 x 200 domain with a 1% SLSP volume fraction, ALG01 took 44.24 seconds while ALG0101 took 26.87 seconds for 1E4 time steps, a speed-up of about 1.65 times. ALG0101 is by far the fastest algorithm in PXO. Further details can be seen in this excel file.
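The pre-allocation idea can be sketched as follows. This is a minimal illustration, not PXO's actual ALG0101 code: the function names `site_energy` and `mc_step_preallocated`, the nearest-neighbour energy, the periodic boundaries and the zero-temperature acceptance rule are all assumptions made for the example. The only point shown is that the candidate states for every site are drawn into a domain-sized matrix before the site loop, instead of one at a time inside it.

```python
import random


def site_energy(lattice, i, j, state):
    """Count unlike nearest neighbours of site (i, j), periodic boundaries (assumed)."""
    rows, cols = len(lattice), len(lattice[0])
    neighbours = (
        lattice[(i - 1) % rows][j],
        lattice[(i + 1) % rows][j],
        lattice[i][(j - 1) % cols],
        lattice[i][(j + 1) % cols],
    )
    return sum(1 for s in neighbours if s != state)


def mc_step_preallocated(lattice, n_states, rng):
    """One sweep in which all candidate states are drawn before the site loop."""
    rows, cols = len(lattice), len(lattice[0])
    # Pre-allocated matrix of candidate states, same shape as the domain.
    candidates = [[rng.randrange(n_states) for _ in range(cols)] for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            new = candidates[i][j]
            old = lattice[i][j]
            # Hypothetical acceptance rule: accept if the local energy does not rise.
            if site_energy(lattice, i, j, new) <= site_energy(lattice, i, j, old):
                lattice[i][j] = new
    return lattice
```

In pure Python any gain from this restructuring is likely to be small; the benefit would be expected mainly when the whole candidate matrix can be generated in a single vectorised call.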


@09-06-2021: Referring to KERNEL__LOOP_MC_2D____ALGORITHM_01 as ALG1, KERNEL__LOOP_MC_2D____ALGORITHM_02 as ALG2 and KERNEL__LOOP_MC_2D____ALGORITHM_03 as ALG3, the following points apply to the performance of PXO.

  1. Computational time depends strongly on the choice of algorithm
  2. In general, the cost will increase with the following changes:
    • Increase in domain size
    • Consideration of transition probability
    • Increase in simulation time
    • Increase in complexity of the Kernel function
    • Decrease in number of particles
    • Increase in frequency of command line output
  3. For basic computations, use one of Algorithm 01, 02 or 03
    • ALG 01 and ALG 02 do not consider simulation temperature
    • ALG 03 derives the transition probability from the simulation temperature input
    • ALG 01 is fast in a single MC step but slow over the whole time domain. NOT ADVISED for large domains.
    • ALG 02 uses “closed state samples” at each lattice site and is slower than ALG 01 in a single MC step. In comparison, however, it takes much less total simulation time to achieve a similar grain structure. ADVISED for all domains, small and large. BUT this alters the grain growth kinetics. If you are interested in grain growth kinetics, please choose your algorithm carefully.
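How a transition probability might follow from the simulation temperature can be sketched with the standard Metropolis rule. Whether ALG 03 uses exactly this form is an assumption; the function name `transition_probability`, the parameter `k_b` and its default of 1.0 are hypothetical and chosen only for illustration.

```python
import math


def transition_probability(delta_e, temperature, k_b=1.0):
    """Metropolis acceptance probability for an energy change delta_e (assumed form)."""
    if delta_e <= 0:
        # Energy-lowering or neutral flips are always accepted.
        return 1.0
    if temperature <= 0:
        # At zero temperature, energy-raising flips are never accepted.
        return 0.0
    return math.exp(-delta_e / (k_b * temperature))
```

Under this rule, a temperature of zero reduces to accepting only flips that do not raise the energy, and higher temperatures accept energy-raising flips more often.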

Figure: Performance 01

The figure above compares the algorithms with respect to computational speed. To obtain these data, PXO was run from start to end of mcsolver over different lattice sizes using ALG1, ALG2 and ALG3, and the computer time (for mcsolver only, excluding pre-processing) was recorded in seconds. Details are in this excel file. In all simulations, a total of 128 states was used, the total volume fraction of particles was kept at 5%, the run length was 1000 Monte-Carlo steps, and the simulation temperature was held constant. The data compare run times only, not the capabilities of the algorithms. The results show that ALG01 is faster for the same number of Monte-Carlo steps. From the perspective of capability, however, as noted earlier, ALG02 reaches particle-limited grain growth far sooner because it reduces the lattice Hamiltonian at a faster rate. Note that the grain growth kinetics will be different.
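A timing measurement of this kind, covering the solver only and excluding pre-processing, could be set up roughly as below. This is a sketch under stated assumptions: `time_solver` and `placeholder_solver` are hypothetical names, and the placeholder stands in for the real mcsolver, whose interface is not shown here.

```python
import time


def time_solver(solver, lattice_sizes, n_steps):
    """Return {lattice size: solver wall-clock seconds}, excluding set-up."""
    timings = {}
    for size in lattice_sizes:
        # Pre-processing (lattice construction) happens before the clock starts.
        lattice = [[0] * size for _ in range(size)]
        start = time.perf_counter()
        solver(lattice, n_steps)
        timings[size] = time.perf_counter() - start
    return timings


def placeholder_solver(lattice, n_steps):
    """Stand-in for the real solver: touches every site once per step."""
    for _ in range(n_steps):
        for row in lattice:
            for j in range(len(row)):
                row[j] = (row[j] + 1) % 128


timings = time_solver(placeholder_solver, lattice_sizes=[10, 20], n_steps=5)
```

Passing each algorithm in turn as `solver` would give one set of run times per algorithm for the same lattice sizes and step count.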

Current performance issues

@09-06-2021: The following performance issues are open. Referring to KERNEL__LOOP_MC_2D____ALGORITHM_01 as ALG1, KERNEL__LOOP_MC_2D____ALGORITHM_02 as ALG2 and KERNEL__LOOP_MC_2D____ALGORITHM_03 as ALG3, the following points apply to the performance of PXO.

  1. Generally, ALG01 (without T, with “open state samples”), ALG02 (without T, with “closed state samples”) and ALG03 (with T) are fast, with ALG02 the fastest. However, the speed drops with increasing lattice size.
  2. Grain boundary identification wrongly marks particles embedded inside a grain as grain boundary. This needs to be addressed.
  3. Figures of grain structures are retained across the whole temporal domain, which consumes memory. Work is under way to remove this.