Parallel IB timings

git clone https://github.com/sonwell/ib.cu.git

Results

The tables below summarize the results of a simulation in a triply periodic 16 μm × 16 μm × 16 μm domain with a single red blood cell in mock shear flow with shear rate γ̇ = 1000 s⁻¹. The red blood cell was constructed from 864 data sites and 8832 sample sites.

Strong scaling

Results of strong scaling tests for IB spread and interpolate. Computation times for Lagrangian forces are also reported for comparison.
p     cells  interpolate (N = 20000)  forces (N = 10000)  spread (N = 10000)
64    1      0.01080 ± 0.00004        0.00241 ± 0.00002   0.11888 ± 0.00055
128   1      0.00586 ± 0.00002        0.00243 ± 0.00002   0.06535 ± 0.00023
256   1      0.00341 ± 0.00002        0.00241 ± 0.00002   0.03912 ± 0.00012
512   1      0.00179 ± 0.00001        0.00243 ± 0.00002   0.02649 ± 0.00014
1024  1      0.00093 ± 0.00001        0.00241 ± 0.00002   0.01924 ± 0.00011
2048  1      0.00094 ± 0.00001        0.00243 ± 0.00002   0.01627 ± 0.00012
4096  1      0.00093 ± 0.00001        0.00241 ± 0.00002   0.01455 ± 0.00009
Results of strong scaling for IB interpolate and spread with one red blood cell.
p     cells  interpolate (N = 20000)  forces (N = 10000)  spread (N = 10000)
64    4      0.04150 ± 0.00013        0.00289 ± 0.00010   0.40148 ± 0.00146
128   4      0.02251 ± 0.00003        0.00286 ± 0.00003   0.20628 ± 0.00060
256   4      0.01172 ± 0.00002        0.00286 ± 0.00003   0.11208 ± 0.00031
512   4      0.00603 ± 0.00002        0.00286 ± 0.00003   0.06313 ± 0.00014
1024  4      0.00349 ± 0.00002        0.00286 ± 0.00003   0.03931 ± 0.00011
2048  4      0.00181 ± 0.00001        0.00286 ± 0.00003   0.02785 ± 0.00011
4096  4      0.00103 ± 0.00001        0.00288 ± 0.00002   0.02168 ± 0.00009
Results of strong scaling for IB interpolate and spread with four red blood cells.
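
The strong scaling results can be summarized as speedup and parallel efficiency relative to the p = 64 row. The Python sketch below does this for the four-cell interpolate and spread times. It assumes that p counts the parallel resources in use (this section does not define p), and the helper name `scaling` is illustrative, not part of ib.cu.

```python
# Mean times for four red blood cells, copied from the strong scaling table.
# Assumption: p measures the amount of parallel resources (not defined in the text).
p = [64, 128, 256, 512, 1024, 2048, 4096]
interpolate = [0.04150, 0.02251, 0.01172, 0.00603, 0.00349, 0.00181, 0.00103]
spread = [0.40148, 0.20628, 0.11208, 0.06313, 0.03931, 0.02785, 0.02168]


def scaling(p, times):
    """Return (speedup, efficiency) lists relative to the first entry."""
    p0, t0 = p[0], times[0]
    speedup = [t0 / t for t in times]
    # Ideal speedup at q is q / p0, so efficiency is speedup divided by that.
    efficiency = [s * p0 / q for s, q in zip(speedup, p)]
    return speedup, efficiency


for name, times in (("interpolate", interpolate), ("spread", spread)):
    speedup, efficiency = scaling(p, times)
    for q, s, e in zip(p, speedup, efficiency):
        print(f"{name:12s} p={q:5d} speedup={s:6.2f} efficiency={e:5.2f}")
```

Under that assumption, interpolate reaches a speedup of about 40 at p = 4096 (efficiency about 0.63), while spread reaches about 18.5 (efficiency about 0.29).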

Weak scaling

Results of weak scaling tests for IB spread and interpolate.
p     cells  interpolate (N = 20000)  forces (N = 10000)  spread (N = 10000)
64    1      0.01079 ± 0.00004        0.00242 ± 0.00002   0.11881 ± 0.00055
128   2      0.01165 ± 0.00003        0.00249 ± 0.00002   0.11219 ± 0.00051
256   4      0.01171 ± 0.00003        0.00287 ± 0.00003   0.11214 ± 0.00036
512   8      0.01199 ± 0.00003        0.00532 ± 0.00008   0.11354 ± 0.00047
Weak scaling results.
Results for n randomly placed IB points in a 16 μm × 16 μm × 16 μm periodic domain.
p     n     interpolate (N = 20000)  forces (N = 10000)  spread (N = 10000)
128   2^16  0.43930 ± 0.00019        0.00026 ± 0.00031   0.47632 ± 0.00142
256   2^17  0.44918 ± 0.00056        0.00024 ± 0.00002   0.46503 ± 0.00026
512   2^18  0.45072 ± 0.00061        0.00029 ± 0.00001   0.44533 ± 0.00028
1024  2^19  0.45442 ± 0.00049        0.00055 ± 0.00000   0.43561 ± 0.00024

Convergence study

Results of a convergence study for a single RBC in mock periodic flow, in the ℓ2 and max norms. Errors are reported at t = 0.4 ms against u_f, the result of a simulation with grid refinement of 128, 2210 data sites, and 8832 sample sites.
refinement  h        n_d   n_s   ‖u_n - u_2n‖_2  order_2   ‖u_n - u_2n‖_max  order_max
16          1.00 μm  31    132   0.000789                  0.006200
32          0.50 μm  132   546   0.000331        1.253194  0.004700          0.399607
64          0.25 μm  546   2210  0.000111        1.576272  0.002486          0.918834
Convergence study, against a simulation with grid refinement of 128, n_d = 2210, and n_s = 8832, for the ℓ2 and max norms. The short-dashed line represents first-order convergence and the long-dashed line represents second-order convergence. The error converges at approximately order 1.4 in the ℓ2 norm and approximately order 0.7 in the max norm.
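
The order columns are consistent with the usual estimate from errors at successive refinements: when h halves, the observed order is log2(e_h / e_(h/2)). The short Python check below uses the rounded errors from the table, so the last digit can differ from the tabulated orders. The name `observed_orders` is illustrative.

```python
import math

# Errors from the convergence table, ordered by grid refinement 16, 32, 64.
l2_errors = [0.000789, 0.000331, 0.000111]
max_errors = [0.006200, 0.004700, 0.002486]


def observed_orders(errors):
    """Observed order between successive refinements, assuming h halves each time."""
    return [math.log2(coarse / fine) for coarse, fine in zip(errors, errors[1:])]


l2_orders = observed_orders(l2_errors)
max_orders = observed_orders(max_errors)
print("l2 orders: ", [round(q, 4) for q in l2_orders])
print("max orders:", [round(q, 4) for q in max_orders])
print("mean l2 order: ", round(sum(l2_orders) / len(l2_orders), 2))
print("mean max order:", round(sum(max_orders) / len(max_orders), 2))
```

Averaging the two pairwise orders gives about 1.41 in the ℓ2 norm and about 0.66 in the max norm, roughly in line with the approximate orders quoted in the caption.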

Dependence on background grid

Dependence of the algorithm on background grid
n    h (μm)  interpolate (N = 500(n/16)^2)  forces (N = 250(n/16)^2)  spread (N = 250(n/16)^2)
16   1.000   0.00100 ± 0.00001              0.00252 ± 0.00014         0.00694 ± 0.00057
32   0.500   0.00108 ± 0.00001              0.00247 ± 0.00006         0.00677 ± 0.00036
64   0.250   0.00111 ± 0.00001              0.00282 ± 0.00002         0.01501 ± 0.00007
128  0.125   0.00113 ± 0.00002              0.00282 ± 0.00003         0.05514 ± 0.00019

Comparison to serial code

Comparison to serial code.
p          n     interpolate (N = 20000)  spread (N = 10000)
1 (CPU)    2^16  1.40735 ± 0.06325        1.65417 ± 0.05096
12 (CPU)   2^16  0.12494 ± 0.00218        0.18054 ± 0.00369
128 (GPU)  2^16  0.43930 ± 0.00019        0.47632 ± 0.00142
* (GPU)    2^16  0.01738 ± 0.00009        0.02369 ± 0.00018
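
To read the comparison as speedups over the single-CPU run, each mean time can be divided into the "1 (CPU)" baseline. The Python sketch below does this with the tabulated means only (uncertainties are ignored). The labels are copied from the p column, and the helper name `speedups` is illustrative.

```python
# Mean (interpolate, spread) times from the comparison table, keyed by the p column.
times = {
    "1 (CPU)": (1.40735, 1.65417),
    "12 (CPU)": (0.12494, 0.18054),
    "128 (GPU)": (0.43930, 0.47632),
    "* (GPU)": (0.01738, 0.02369),
}


def speedups(times, baseline="1 (CPU)"):
    """Return {label: (interpolate speedup, spread speedup)} relative to baseline."""
    base_interp, base_spread = times[baseline]
    return {
        label: (base_interp / t_interp, base_spread / t_spread)
        for label, (t_interp, t_spread) in times.items()
    }


for label, (s_interp, s_spread) in speedups(times).items():
    print(f"{label:10s} interpolate x{s_interp:6.2f}  spread x{s_spread:6.2f}")
```

With these numbers, the "* (GPU)" row is about 81 times faster than the single-CPU run for interpolate and about 70 times faster for spread.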