Compress phase space via regrid/merge operations
A simple mechanism to reduce the size of phase space data is to merge identical particles while adding their weights. In practice, this doesn’t tend to be that useful, because even in a large data set, there are quite few truly identical particles. What there are however, are many particles which are close together.
This tutorial demonstrates how to explot this fact by first regridding phase space data (which in effect ‘nudges’ particles which are close together in 7-D space into the same bin) followed by merging the phase space. This algorithm is based on work originally developed by Leo Esnault for the p2sat code.
[1]:
from pathlib import Path
import sys
sys.path.append('../') # not necessary when the library is installed
from ParticlePhaseSpace import DataLoaders
from ParticlePhaseSpace import PhaseSpace
import numpy as np
from matplotlib import pyplot as plt
test_data_loc = Path(r'../tests/test_data/coll_PhaseSpace_xAng_0.00_yAng_0.00_angular_error_0.0.phsp').absolute()
ps_data = DataLoaders.Load_TopasData(test_data_loc)
PS = PhaseSpace(ps_data)
For this data set, the vast majority of the data is close to x=0, y=0, so I’m going to discard some of the scattered particles outside this range. (We will come back to this discarded data later).
[2]:
keep_ind = np.logical_and(np.abs(PS.ps_data['x [mm]']) < 5, np.abs(PS.ps_data['y [mm]']) < 5)
keep_ind2 = np.logical_and(np.abs(PS.ps_data['px [MeV/c]']) < 1, np.abs(PS.ps_data['py [MeV/c]']) < 1)
keep_ind3 = np.logical_and(keep_ind2, keep_ind)
PS, PS_discard = PS.filter_by_boolean_index(keep_ind3, split=True)
data where boolean_index=True accounts for 89.6 % of the original data
Ok, so this operation removed about 10% of the input data. First, let’s try merging any identical particles:
[3]:
PS.merge(in_place=True)
merge operation removed 0 of 279094 particles ( 0.0 % removed)
merge operation took 0.9 s
We can see that without regridding, there are not identical particles in the data and hence no particles can be merged.
Let’s try the same thing after performing a re-grid operation:
[4]:
new_PS = PS.transform.regrid(n_bins=100)
not regriding z as it is already single valued
not regriding time as it is already single valued
regrid operation took 10.7 s
[5]:
new_PS.merge(in_place=True)
merge operation removed 157174 of 279094 particles ( 56.3 % removed)
merge operation took 0.3 s
What we have done here is regridded the default quantitiues [‘x’, ‘y’, ‘z’, ‘px’, ‘py’, ‘pz’, ‘time’] into 100 bins. The values for each bin are calculated based on the range of the input data. The resultant phase space is roughly 33% the size of the original data. Now, we can add back in the data we extracted previously:
[6]:
new_PS = new_PS + PS_discard
PS = PS + PS_discard
The obvious question is: how have these operations effected the data set, so let’s check:
Particle Positions
[7]:
PS.plot.particle_positions_hist_2D(xlim=[-5,5], ylim=[-5,5])
new_PS.plot.particle_positions_hist_2D(xlim=[-5,5], ylim=[-5,5])
[8]:
PS.plot.position_hist_1D()
new_PS.plot.position_hist_1D()
Energy
[9]:
PS.plot.energy_hist_1D()
new_PS.plot.energy_hist_1D()
[10]:
PS.print_energy_stats()
new_PS.print_energy_stats()
===================================================
ENERGY STATS
===================================================
total number of particles in phase space: 311489
number of unique particle species: 3
308280 gammas
mean energy: 1.91 MeV
median energy: 1.20 MeV
Energy spread IQR: 2.03 MeV
min energy 0.01 MeV
max energy 10.35 MeV
2853 electrons
mean energy: 3.50 MeV
median energy: 3.82 MeV
Energy spread IQR: 3.07 MeV
min energy 0.02 MeV
max energy 9.93 MeV
356 positrons
mean energy: 2.93 MeV
median energy: 2.80 MeV
Energy spread IQR: 2.50 MeV
min energy 0.10 MeV
max energy 8.95 MeV
===================================================
ENERGY STATS
===================================================
total number of particles in phase space: 154315
number of unique particle species: 3
151106 gammas
mean energy: 1.91 MeV
median energy: 1.18 MeV
Energy spread IQR: 2.09 MeV
min energy 0.01 MeV
max energy 10.35 MeV
2853 electrons
mean energy: 3.50 MeV
median energy: 3.83 MeV
Energy spread IQR: 3.00 MeV
min energy 0.01 MeV
max energy 9.93 MeV
356 positrons
mean energy: 2.92 MeV
median energy: 2.80 MeV
Energy spread IQR: 2.49 MeV
min energy 0.10 MeV
max energy 8.95 MeV
Momentum
[11]:
PS.plot.momentum_hist_1D()
new_PS.plot.momentum_hist_1D()
Trace Space
[12]:
PS.plot.transverse_trace_space_hist_2D(ylim=[-1,1], xlim=[-2,2])
new_PS.plot.transverse_trace_space_hist_2D(ylim=[-1,1], xlim=[-2,2])
[13]:
PS.print_twiss_parameters()
new_PS.print_twiss_parameters()
===================================================
TWISS PARAMETERS
===================================================
gammas:
x y
epsilon 2.525901 2.673361
alpha 0.619585 0.525727
beta 8.976938 9.488084
gamma 0.154160 0.134526
electrons:
x y
epsilon 2.208670 2.357445
alpha 1.159331 1.009165
beta 4.971102 4.634599
gamma 0.471535 0.435510
positrons:
x y
epsilon 1.881200 1.178092
alpha 1.141023 2.283976
beta 3.888975 5.672388
gamma 0.591913 1.095931
===================================================
TWISS PARAMETERS
===================================================
gammas:
x y
epsilon 3.690392 4.014616
alpha 0.443078 0.366593
beta 6.144493 6.318489
gamma 0.194698 0.179535
electrons:
x y
epsilon 2.211982 2.360174
alpha 1.157147 1.008641
beta 4.963498 4.630658
gamma 0.471238 0.435652
positrons:
x y
epsilon 1.885795 1.179988
alpha 1.139382 2.280596
beta 3.882121 5.668230
gamma 0.591994 1.094013
Notes/ Conclusions
If you want to not include some data in the regridding process, you need to do it before regridding as we did in this example. The regrid bins are always based on the min/max range of the input data, since not doing this could result in many particles being ‘stacked’ around the edge bins
The most strongly affected quantity in this case appeared to be the twiss parameters/ trace space = however, even here, the RMS ellipse is quite close.
Grid/Merge operations can be a highly effective approach to reducing the total size of a phase space without losing too much information - however, as always, the extent to which this is appropriate is very much application specific.