Getting started with WFSim¶
Hello friend. Welcome to the basic tutorial on how to simulate waveforms with the latest wfsim version in strax. Here we’ll just demonstrate the basic functionality. For more in depth analysis stuff, checkout the straxen tutorials for more detailed thing.
[1]:
import numpy as np
import strax
import straxen
import wfsim
import matplotlib.pyplot as plt
from matplotlib.colors import LogNorm
from multihist import Histdd, Hist1d
from scipy import stats
import json
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
<ipython-input-1-28690857fe6c> in <module>
1 import numpy as np
2 import strax
----> 3 import straxen
4 import wfsim
5
~/Desktop/python/straxen/straxen/__init__.py in <module>
1 __version__ = '0.15.8'
2
----> 3 from utilix import uconfig
4 from .common import *
5 # contexts.py below
~/Desktop/python/utilix/utilix/__init__.py in <module>
2
3 from utilix.config import Config
----> 4 uconfig = Config()
5
6 from utilix.rundb import DB
~/Desktop/python/utilix/utilix/config.py in __init__(self, path)
19 def __init__(self, path=None):
20 if not Config.instance:
---> 21 Config.instance = Config.__Config()
22
23 def __getattr__(self, name):
~/Desktop/python/utilix/utilix/config.py in __init__(self)
50
51 else:
---> 52 raise FileNotFoundError(f"Could not load a configuration file. "
53 f"You can create one at {home_config}, or set a custom path using\n\n"
54 f"export XENON_CONFIG=path/to/your/config\n")
FileNotFoundError: Could not load a configuration file. You can create one at /Users/petergaemers/.xenon_config, or set a custom path using
export XENON_CONFIG=path/to/your/config
Different simulators¶
Because we’d like to make simulations we have multiple different ways of making data. They are: * #### RawRecordsFromFaxNT(1T) * #### RawRecordsFromFaxEpix * #### RawRecordsFromFaxOptical * #### RawRecordsFromFaxNveto
The main difference is the way you specify instructions. The NT(1T) simulator depends on either a csv file or a defaul/user specified function to generate instructions. Epix takes input from G4 processed with epix. Optical takes optical input from G4 for photon channels and timings. And Nveto is a small modification to optical to provide nveto data
Setting everything up¶
First we need to define the right context. For this we made two contexts to help you out. One for 1T and one for nT. By default in your context RawRecordsFromFaxNT will be registered. If you want another one be sure to register it
For 1T do:
[ ]:
st = straxen.contexts.xenon1t_simulation()
Or nT do:
[2]:
st = straxen.contexts.xenonnt_simulation()
Now we need to define a run id. What you give it doesn’t really matter, since strax will look for data and make new if it doesn’t find anything. And this is what you want. Strax will use the run id to get the electron lifetime and pmt gains from a database, and returns placeholders if the run doesn’t exist.
[3]:
run_id = '1'
Defining instructions¶
For the instructions there are multiple different ways to do it. The simulator has this option called “fax_file”. If it has a value (None by default) the simulator will either read it as a csv or root file. If not it will use some predefined functions to make your events. The number of event you’ll simulate based on the product of the config values “nchunk”,”event_rate” and “chunk_size”. Which you can set as follows:
[4]:
st.set_config(dict(nchunk=1, event_rate=5, chunk_size=10,))
Strax groups data together in chunks based on time (for low level data). nchunk is the number of chunks you want to simulate event_rate is the number of events per seconds, so this effects the amount of spacing between events. Finally chunk_size is the length in seconds of your chunk For the DAQ this is 5 seconds. For simulations you can do whatever you want. It is important to note that Strax will write out data per chunk So when your chunks are small you’ll, among other things, call Strax’ IO functions a lot, giving a substantial overhead. On the other hand, to large chunks will hog all your memory and your kernel might crash. Based on my experimentation setting chunk_size to 500 gives best performance
These are ways you can give instructions * #### Random This is the default where simulator will generate some random instructions for you. * #### Custom For this you will need to overwrite the instruction generator function * #### CSV You can provide a csv file with the instruction (Like the output of nSort) * #### Geant4 For this you’ll need to use epix to do some clustering for you.
Random¶
I guess this is pretty self explanatory. The simulator has this function called “rand_instructions” which will make something up for you.
[6]:
wfsim.rand_instructions??
Custom¶
This is some more fun. To do this we’ll write a new function which returns a structured numpy array with the correct dtype.
[7]:
wfsim.instruction_dtype
[7]:
[('event_number', numpy.int32),
('type', numpy.int8),
('time', numpy.int64),
('x', numpy.float32),
('y', numpy.float32),
('z', numpy.float32),
('amp', numpy.int32),
('recoil', '<U2')]
Event number is just a lable which peaks are together. type is either 1(S1) or 2 (S2). In the truth higher numbers are also used to refer to different types of afterpulses. time,x,y,z are the time and positions of the signal. Amp is the number of photons or electrons generated, and recoil can be used for different types of recoil (but only Electronic recoil is supported). Recoil is a number according to the NEST convention which will be used to indicate which interaction. For the key look here: https://github.com/XENONnT/WFSim/blob/2c614b0f7b0d7c7adc516f6188e281857e8d7e22/wfsim/core.py#L22
Now, lets say we want some krypton peaks. For this we’ll need to change the default instruction function to include this double decay and use Nestpy to convert energy deposits into a number of photons and electrons.
In this case I’ll use 1 “event” per full decay, that where all the 4’s are coming from.
[8]:
def super_awesome_custom_instruction(c):
import nestpy
half_life = 156.94e-9 #Kr intermediate state half-life in ns
decay_energies = [32.2,9.4] # Decay energies in kev
n = c['nevents'] = c['event_rate'] * c['chunk_size'] * c['nchunk']
c['total_time'] = c['chunk_size'] * c['nchunk']
instructions = np.zeros(4 * n, dtype=wfsim.instruction_dtype)
instructions['event_number'] = np.digitize(instructions['time'],
1e9 * np.arange(c['nchunk']) * c['chunk_size']) - 1
instructions['type'] = np.tile([1, 2], 2 * n)
instructions['recoil'] = [7 for i in range(4 * n)]
r = np.sqrt(np.random.uniform(0, 2500, n))
t = np.random.uniform(-np.pi, np.pi, n)
instructions['x'] = np.repeat(r * np.cos(t), 4)
instructions['y'] = np.repeat(r * np.sin(t), 4)
instructions['z'] = np.repeat(np.random.uniform(-100, 0, n), 4)
#To get the correct times we'll need to include the 156.94 ns half life of the intermediate state.
uniform_times = c['total_time'] * (np.arange(n) + 0.5) / n
delayed_times = uniform_times + np.random.exponential(half_life/np.log(2),len(uniform_times))
instructions['time'] = np.repeat(list(zip(uniform_times,delayed_times)),2) * 1e9
# Here we'll define our XENON-like detector
nc = nestpy.NESTcalc(nestpy.VDetector())
A = 131.293
Z = 54.
density = 2.862 # g/cm^3 #SR1 Value
drift_field = 82 # V/cm #SR1 Value
interaction = nestpy.INTERACTION_TYPE(7)
energy = np.tile(decay_energies,n)
quanta = []
for en in energy:
y = nc.GetYields(interaction,
en,
density,
drift_field,
A,
Z,
(1, 1))
quanta.append(nc.GetQuanta(y, density).photons)
quanta.append(nc.GetQuanta(y, density).electrons)
instructions['amp'] = quanta
return instructions
Now here comes the magic line:
[9]:
wfsim.strax_interface.rand_instruction = super_awesome_custom_instruction
This changes the default rand_instruction function in our own super awesome function. So when the simulator will call rand_instruction the code from super_awesome_custom_instruction will be executed
CSV¶
The format for csv files is the same as the instructions dtype. So on every line specify event_number,type,time ,x,y,z, amp and recoil in that order. Then tell the simulator it exists:
[10]:
st.set_config(dict(fax_file='instructions.csv'))
Ofcourse if you do not have this file it will not work
Geant4¶
For starters you’ll need to register the Epix plugin. This requires an epix configuration (it’s on private_nt_aux_files), and if you want a start and stop event.
[11]:
st.register(wfsim.strax_interface.RawRecordsFromFaxEpix)
st.set_config(dict(epix_config= path_to_config))
Configuration customization¶
The simulator using a larger large amount of configuration settings to do it’s magic. Some of them are best left along, like pmt_circuit_load_resistor. Others on the other hand are things you might want change a bit to see how the data will change. Unfortunately currently the full list is spread out over two different places. One is the fax config json which is on github. The other is the option list in strax. Besides those things like pattern maps are hardcoded in load_resource.py. These files are overrideable with the option “fax_config_override”
The strax config is viewable like this and can be changed by st.set_config(dict(option you want=value you want))
[12]:
st.show_config('raw_records')
/Users/petergaemers/Desktop/python/strax/strax/context.py:369: UserWarning: Option channel_map not taken by any registered plugin
warnings.warn(f"Option {k} not taken by any registered plugin")
[12]:
| option | default | current | applies_to | help | |
|---|---|---|---|---|---|
| 0 | fax_file | None | /dali/lgrandi/pgaemers/fax_files/Xenon1T_Whole... | (raw_records, truth) | Directory with fax instructions |
| 1 | nSort_path | None | <OMITTED> | (raw_records, truth) | If nSort is needed to convert GEANT4 to instru... |
| 2 | fax_config_override | None | <OMITTED> | (raw_records, truth) | Dictionary with configuration option overrides |
| 3 | event_rate | 5 | 5 | (raw_records, truth) | Average number of events per second |
| 4 | chunk_size | 5 | 10 | (raw_records, truth) | Duration of each chunk in seconds |
| 5 | nchunk | 4 | 1 | (raw_records, truth) | Number of chunks to simulate |
| 6 | fax_config | https://raw.githubusercontent.com/XENONnT/stra... | https://raw.githubusercontent.com/XENONnT/stra... | (raw_records, truth) | |
| 7 | to_pe_file | https://raw.githubusercontent.com/XENONnT/stra... | https://raw.githubusercontent.com/XENONnT/stra... | (raw_records, truth) | |
| 8 | gain_model | (to_pe_per_run, https://raw.githubusercontent.... | (to_pe_constant, 0.005) | (raw_records, truth) | PMT gain model. Specify as (model_type, model_... |
| 9 | right_raw_extension | 50000 | <OMITTED> | (raw_records, truth) | |
| 10 | zle_threshold | 0 | <OMITTED> | (raw_records, truth) | |
| 11 | detector | XENON1T | XENONnT | (raw_records, truth) | |
| 12 | timeout | 1800 | <OMITTED> | (raw_records, truth) | Terminate processing if any one mailbox receiv... |
The config from github can be loaded as:
[13]:
straxen.get_resource('https://raw.githubusercontent.com/XENONnT/'
'strax_auxiliary_files/master/fax_files/fax_config_nt.json',fmt='json').keys()
[13]:
dict_keys(['pmt_pulse_time_rounding', 'gauss_noise_sigmas', 'real_noise_file', 'real_noise_sample_mode', 'maximum_recombination_time', 's1_detection_efficiency', 'singlet_lifetime_liquid', 'triplet_lifetime_liquid', 's1_ER_recombination_fraction', 's1_ER_primary_singlet_fraction', 's1_ER_secondary_singlet_fraction', 's1_NR_singlet_fraction', 's1_ER_alpha_singlet_fraction', 'electron_trapping_time', 'gas_drift_velocity_slope', 'lxe_dielectric_constant', 'singlet_lifetime_gas', 'triplet_lifetime_gas', 'singlet_fraction_gas', 'each_pmt_afterpulse_types', 'pmt_afterpulse_types', 'p_double_pe_emision', 's1_model_type', 's1_decay_time', 'pmt_transit_time_spread', 'real_noise_sample_size', 'pressure', 'temperature', 'anode_voltage', 'pmt_rise_time', 'pmt_fall_time', 'photon_area_distribution', 'pe_pulse_ts', 'pe_pulse_ys', 'samples_before_pulse_center', 'samples_after_pulse_center', 'pmt_transit_time_mean', 'led_pulse_length', 'drift_field', 'liquid_density', 'diffusion_constant_liquid', 'electron_extraction_yield', 'gate_to_anode_distance', 'elr_gas_gap_length', 's2_secondary_sc_gain', 'anode_field_domination_distance', 'anode_wire_radius', 's2_fitted_patterns_file', 's2_mean_area_fraction_top', 'rz_position_distortion_map', 'tpc_name', 'run_number', 'channels_top', 'channels_bottom', 'sample_duration', 'digitizer_voltage_range', 'nominal_gain', 'digitizer_bits', 'pmt_circuit_load_resistor', 'external_amplification', 'digitizer_reference_baseline', 's1_width_threshold', 's1_risetime_threshold', 'tight_coincidence_window_left', 'tight_coincidence_window_right', 'tight_coincidence_threshold', 'tpc_length', 'tpc_radius', 'electron_lifetime_liquid', 'drift_velocity_liquid', 'drift_time_gate', 'shrink_data_threshold', 'xy_posrec_preference', 'channels_in_detector', 'n_channels', 'gains', 'gain_sigmas', 'quantum_efficiencies', 'relative_qe_error', 'relative_gain_error', 'pmts', 'noise_file_index', 'noise_file_folder', 'max_intervals', 'samples_to_store_before', 'samples_to_store_after', 'zle_threshold', 'special_thresholds', 's1_ER_recombination_time', 'pmt_ap_modifier', 'photoionization_modifier', 'pmt_ap_t_modifier', 'trigger_window', 'photoelectric_p', 'photoelectric_modifier', 'photoelectric_t_center', 'photoelectric_t_spread'])
Changing things in this guy goes slightly different. In the strax option list there is the option called “fax_config_override”. This takes a dict which will be used to override any values in the json config. So changing the ‘s2_secondary_sc_gain’ is done as:
[14]:
st.set_config(dict(fax_config_override = dict(s2_secondary_sc_gain=23)))
What actually happens?¶
What happens behind the scenes is that the instructions are first grouped together in chunks. Then we loop over the instructions and the full chunk is returned before starting with the next one.
We use a S1 and S2 class to calculate the arrival times of the photons and the channels which have been hit. Then we’ll hand them over to the Pulse class to calculate the currents in the channels. Finally the currents go to a RawData class where we fake the digitizer response.
S1¶
For S1s we start with calculating the light yield based on the position of the interaction, and draw the number of photons seen from a Poisson distribution.
Second we calculate the arrival times of the photons. This is based on the scintillation of the xenon atoms. It is dependend on the recombination time, the singlet and triplet fractions.
Finally the channels are calculated. Based on the pattern map we use a interpolation map to get a probability distribution for channels to be hit for a S1 signal based on the position of the interaction, and then we draw from this distribution for every photon.
S2¶
S2s are slightly more complicated. First we need to drift the electrons up, and while doing so we’ll lose some of them. To get the photon timings, we first need to get the arrival times of the electrons at the gas interface based on a diffusion model. Then we can calculate the photon timings based on a luminescence model for every individual electron. And for the channels we do the same trick with the interpolating map.
Pulse¶
When we have our lists of channels and timing we can generate actual pulses. First we add a pmt transition time. Then we loop over all channels, calculate the double pe emission probabilities, and add a current in the pmt channel based on the arrival time. This is all stored in a big dictionary. Afterwards this is passed to our fake digitizers which then returns you with your very own pretty data
Getting down to bussiness¶
Now we have access to all the normal strax data types, and another one called ‘truth’ which holds the simulation instructions. Calling it follows the normal strax convention.
[15]:
st.set_config(dict(fax_file=None))
[17]:
# Remove any previously simulated data, if such exists
# !rm -r strax_data
records = st.get_array(run_id,'records')
# peaks = st.get_array(run_id, ['peaks','peak_basics'])
# data = st.get_df(run_id, 'event_info')
truth = st.get_df(run_id, 'truth')
---------------------------------------------------------------------------
JSONDecodeError Traceback (most recent call last)
<ipython-input-17-98001a2a1c1a> in <module>
2 # !rm -r strax_data
3
----> 4 records = st.get_array(run_id,'records')
5 # peaks = st.get_array(run_id, ['peaks','peak_basics'])
6 # data = st.get_df(run_id, 'event_info')
~/Desktop/python/strax/strax/context.py in get_array(self, run_id, targets, save, max_workers, **kwargs)
1106 max_workers=max_workers,
1107 **kwargs)
-> 1108 results = [x.data for x in source]
1109
1110 results = np.concatenate(results)
~/Desktop/python/strax/strax/context.py in <listcomp>(.0)
1106 max_workers=max_workers,
1107 **kwargs)
-> 1108 results = [x.data for x in source]
1109
1110 results = np.concatenate(results)
~/Desktop/python/strax/strax/context.py in get_iter(self, run_id, targets, save, max_workers, time_range, seconds_range, time_within, time_selection, selection_str, keep_columns, _chunk_number, progress_bar, **kwargs)
900 save=save,
901 time_range=time_range,
--> 902 chunk_number=_chunk_number)
903
904 # Cleanup the temp plugins
~/Desktop/python/strax/strax/context.py in get_components(self, run_id, targets, save, time_range, chunk_number)
768 for d in plugins.values():
769 self._set_plugin_config(d, run_id, tolerant=False)
--> 770 d.setup()
771 return strax.ProcessorComponents(
772 plugins=plugins,
~/Desktop/python/wfsim/wfsim/strax_interface.py in setup(self)
541
542 def setup(self):
--> 543 self.set_config()
544
545 c=self.config
~/Desktop/python/wfsim/wfsim/strax_interface.py in set_config(self)
564 def set_config(self,):
565 c = self.config
--> 566 c.update(get_resource(c['fax_config'], fmt='json'))
567 overrides = self.config['fax_config_override']
568 if overrides is not None:
~/Desktop/python/straxen/straxen/common.py in get_resource(x, fmt)
155 # 2. load from file
156 elif os.path.exists(x):
--> 157 return open_resource(x, fmt=fmt)
158 # 3. load from database
159 elif straxen.uconfig is not None:
~/Desktop/python/straxen/straxen/common.py in open_resource(file_name, fmt)
113 elif fmt == 'json':
114 with open(file_name, mode='r') as f:
--> 115 result = json.load(f)
116 elif fmt == 'binary':
117 with open(file_name, mode='rb') as f:
~/anaconda3/envs/strax/lib/python3.7/json/__init__.py in load(fp, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
294 cls=cls, object_hook=object_hook,
295 parse_float=parse_float, parse_int=parse_int,
--> 296 parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
297
298
~/anaconda3/envs/strax/lib/python3.7/json/__init__.py in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
346 parse_int is None and parse_float is None and
347 parse_constant is None and object_pairs_hook is None and not kw):
--> 348 return _default_decoder.decode(s)
349 if cls is None:
350 cls = JSONDecoder
~/anaconda3/envs/strax/lib/python3.7/json/decoder.py in decode(self, s, _w)
335
336 """
--> 337 obj, end = self.raw_decode(s, idx=_w(s, 0).end())
338 end = _w(s, end).end()
339 if end != len(s):
~/anaconda3/envs/strax/lib/python3.7/json/decoder.py in raw_decode(self, s, idx)
353 obj, end = self.scan_once(s, idx)
354 except StopIteration as err:
--> 355 raise JSONDecodeError("Expecting value", s, err.value) from None
356 return obj, end
JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Now it is time to make pretty plots and see if what we makes actually makes any sense
[17]:
peak_basics = st.get_df(run_id,'peak_basics')
[18]:
peak_basics[:10]
[18]:
| time | endtime | center_time | area | n_channels | max_pmt | max_pmt_area | range_50p_area | range_90p_area | area_fraction_top | length | dt | rise_time | tight_coincidence | type | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 99999510 | 100000190 | 100000098 | 1511.265015 | 438 | 425 | 11 | 45.498920 | 108.933258 | 0.090103 | 68 | 10 | 32.073212 | 278 | 1 |
| 1 | 100000190 | 100000460 | 100000224 | 160.055038 | 436 | 425 | 2 | 41.434231 | 93.053680 | 0.088126 | 27 | 10 | 27.741791 | 19 | 1 |
| 2 | 100000470 | 100002090 | 100001584 | 1.275000 | 1 | 477 | 1 | 19.543358 | 126.469147 | 0.000000 | 162 | 10 | 12.299422 | 1 | 0 |
| 3 | 100002730 | 100003030 | 100002842 | 2.070000 | 3 | 36 | 0 | 211.971954 | 235.813843 | 1.000000 | 30 | 10 | 17.707243 | 2 | 0 |
| 4 | 100164950 | 100166240 | 100165555 | 18.500000 | 17 | 128 | 3 | 202.121216 | 416.111206 | 0.173243 | 129 | 10 | 187.104172 | 3 | 2 |
| 5 | 100514580 | 100523780 | 100518825 | 11563.562500 | 491 | 206 | 1228 | 1771.331787 | 4564.572266 | 0.087043 | 184 | 50 | 1627.153198 | 59 | 2 |
| 6 | 100870810 | 100871920 | 100871312 | 18.850000 | 19 | 34 | 1 | 144.104645 | 346.100159 | 0.701591 | 111 | 10 | 124.979836 | 8 | 2 |
| 7 | 100938950 | 100940020 | 100939763 | 20.995001 | 19 | 97 | 5 | 183.766678 | 469.409851 | 0.568231 | 107 | 10 | 316.340454 | 4 | 2 |
| 8 | 299999510 | 300000190 | 300000098 | 1476.585693 | 458 | 332 | 9 | 44.220722 | 108.634956 | 0.156120 | 68 | 10 | 30.482187 | 313 | 1 |
| 9 | 300000190 | 300000770 | 300000223 | 158.209991 | 455 | 179 | 1 | 40.539444 | 93.234810 | 0.161873 | 58 | 10 | 26.819931 | 32 | 1 |
Matching¶
To do matching with the truth the easiest way is to write a new strax plugin where you loop over peaks and get the truth arrays where the mean arrival time of the photons are within the time window of the peak So that will look something like this:
[20]:
class MatchedPeaks(strax.LoopPlugin):
depends_on = ('peak_basics','truth')
provides = 'matched_peaks'
__version__ = '0.0.2'
dtype = [('time',np.int),
('endtime',np.int),
('area',np.int),
('n_photon',np.int)]
def compute(self, peaks, truth):
result = np.zeros(len(peaks), self.dtype)
for ix, p in enumerate(peaks):
t = truth[(p['time']<truth['t_mean_photon'])&
(p['endtime']>truth['t_mean_photon'])]
r = result[ix]
r['time'] = p['time']
r['endtime'] = p['endtime']
r['area'] = p['area']
if len(t)==0:
r['n_photon'] = 0
elif len(t)>1:
r['n_photon'] = np.sum(t['n_photon'])
else:
r['n_photon'] = t['n_photon']
return result
Of course this doesn’t actually work. An electron afterpulse can be very spread out leading it to be interpreted as multiple peaks while coming from 1 instruction falling outside of the specified range. It would be very much appreciated if someone wants make a more sturdy selection criteria :)
For example, checkout out peak matching algorithm for WFSim: https://github.com/XENONnT/pema
[21]:
st.register(MatchedPeaks)
[21]:
__main__.MatchedPeaks
[22]:
st.get_array(run_id,'matched_peaks')
[ ]: