Welcome to wfsim’s documentation!
Github page: https://github.com/XENONnT/WFSim
WFSim is the wavefrom simulation framework for XENONnT, built on top of the straxen framework.
Installation
Installing wfsim
To get started for simulating XENONnT please follow the straxen setup guide. Simulating XENON1T works out of the box.
On top of
straxen
, installwfsim
:
pip install wfsim
Developer setup
To get started for simulating XENONnT please follow the straxen setup guide.
Install wfsim in editable mode from source:
git clone https://github.com/XENONnT/WFSim wfsim
pip install -e wfsim
Getting started with WFSim
Hello friend. Welcome to the basic tutorial on how to simulate waveforms with the latest wfsim version in strax. Here we’ll just demonstrate the basic functionality. For more in depth analysis stuff, checkout the straxen tutorials for more detailed thing.
[1]:
import numpy as np
import strax
import straxen
import wfsim
import matplotlib.pyplot as plt
from matplotlib.colors import LogNorm
from multihist import Histdd, Hist1d
from scipy import stats
import json
*** Detector definition message ***
You are currently using the default XENON10 template detector.
[79]:
straxen.print_versions(modules=('strax', 'straxen', 'cutax', 'wfsim', 'ntauxfiles'))
ntauxfiles is not installed
[79]:
module | version | path | git | |
---|---|---|---|---|
0 | python | 3.9.17 | /opt/XENONnT/anaconda/envs/XENONnT_2023.07.1/b... | None |
1 | strax | 1.5.2 | /opt/XENONnT/anaconda/envs/XENONnT_2023.07.1/l... | None |
2 | straxen | 2.1.1 | /opt/XENONnT/anaconda/envs/XENONnT_2023.07.1/l... | None |
3 | cutax | 1.15.1 | /home/shenyangshi/cutax/cutax | branch:bdt_ms | 277f174 |
4 | wfsim | 1.0.2 | /opt/XENONnT/anaconda/envs/XENONnT_2023.07.1/l... | None |
Different simulators
Because we’d like to make simulations we have multiple different ways of making data. They are: * RawRecordsFromFaxNT(1T)
* RawRecordsFromFaxEpix
* RawRecordsFromFaxOptical
* RawRecordsFromFaxNveto
The main difference is the way you specify instructions. The NT(1T) simulator depends on either a csv file or a default/user specified function to generate instructions. Epix takes input from G4 processed with epix. Optical takes optical input from G4 for photon channels and timings. And Nveto is a small modification to optical to provide nveto data
Setting everything up
First we need to define the right context. For this we made two contexts to help you out. One for 1T and one for nT. By default in your context RawRecordsFromFaxNT will be registered. If you want another one be sure to register it
For 1T do:
[2]:
st = straxen.contexts.xenon1t_simulation()
Or nT do:
[5]:
st = straxen.contexts.xenonnt_simulation(cmt_run_id_sim='00001')
/opt/XENONnT/anaconda/envs/XENONnT_2023.07.1/lib/python3.9/site-packages/straxen/url_config.py:711: UserWarning: From straxen version 2.1.0 onward, URLConfig parameterswill be sorted alphabetically before being passed to the plugins, this will change the lineage hash for non-sorted URLs. To load data processed with non-sorted URLs, you will need to use an older version.
warnings.warn("From straxen version 2.1.0 onward, URLConfig parameters"
Now we need to define a run id. What you give it doesn’t really matter, since strax will look for data and make new if it doesn’t find anything. And this is what you want. Strax will use the run id to get the electron lifetime and pmt gains from a database, and returns placeholders if the run doesn’t exist.
[6]:
run_id = '1'
Defining instructions
For the instructions there are multiple different ways to do it. The simulator has this option called “fax_file”. If it has a value (None by default) the simulator will either read it as a csv or root file. If not it will use some predefined functions to make your events. The number of event you’ll simulate based on the product of the config values “nchunk”,”event_rate” and “chunk_size”. Which you can set as follows:
[7]:
st.set_config(dict(nchunk=1, event_rate=5, chunk_size=10,))
Strax groups data together in chunks based on time (for low level data). nchunk is the number of chunks you want to simulate event_rate is the number of events per seconds, so this effects the amount of spacing between events. Finally chunk_size is the length in seconds of your chunk For the DAQ this is 5 seconds. For simulations you can do whatever you want. It is important to note that Strax will write out data per chunk So when your chunks are small you’ll, among other things, call Strax’ IO functions a lot, giving a substantial overhead. On the other hand, to large chunks will hog all your memory and your kernel might crash. Based on my experimentation setting chunk_size to 500 gives best performance
These are ways you can give instructions ### Random This is the default where simulator will generate some random instructions for you. ### Custom For this you will need to overwrite the instruction generator function ### CSV You can provide a csv file with the instruction (Like the output of nSort) ### Geant4 For this you’ll need to use epix to do some clustering for you.
Random
I guess this is pretty self explanatory. The simulator has this function called “rand_instructions” which will make something up for you.
[8]:
wfsim.strax_interface.rand_instructions??
Signature: wfsim.strax_interface.rand_instructions(c)
Source:
def rand_instructions(c):
"""Random instruction generator function. This will be called by wfsim if you do not specify
specific instructions.
:params c: wfsim configuration dict"""
log.warning('rand_instructions is deprecated, please use wfsim.random_instructions')
if 'drift_field' not in c:
log.warning('drift field not specified!')
# Do the conversion for now, don't break everything all at once
return _rand_instructions(event_rate=c.get('event_rate', 10),
chunk_size=c.get('chunk_size', 5),
n_chunk=c.get('n_chunk', 2),
energy_range=[1, 100], # keV
drift_field=c.get('drift_field', 100), # V/cm
tpc_radius=c.get('tpc_radius', straxen.tpc_r),
tpc_length=c.get('tpc_length', straxen.tpc_z),
nest_inst_types=[7]
)
File: /opt/XENONnT/anaconda/envs/XENONnT_2023.07.1/lib/python3.9/site-packages/wfsim/strax_interface.py
Type: function
Custom
This is some more fun. To do this we’ll write a new function which returns a structured numpy array with the correct dtype.
[9]:
wfsim.instruction_dtype
[9]:
[(('Waveform simulator event number.', 'event_number'), numpy.int32),
(('Quanta type (S1 photons or S2 electrons)', 'type'), numpy.int8),
(('Time of the interaction [ns]', 'time'), numpy.int64),
(('X position of the cluster [cm]', 'x'), numpy.float32),
(('Y position of the cluster [cm]', 'y'), numpy.float32),
(('Z position of the cluster [cm]', 'z'), numpy.float32),
(('Number of quanta', 'amp'), numpy.int32),
(('Recoil type of interaction.', 'recoil'), numpy.int8),
(('Energy deposit of interaction', 'e_dep'), numpy.float32),
(('Eventid like in geant4 output rootfile', 'g4id'), numpy.int32),
(('Volume id giving the detector subvolume', 'vol_id'), numpy.int32),
(('Local field [ V / cm ]', 'local_field'), numpy.float64),
(('Number of excitons', 'n_excitons'), numpy.int32),
(('X position of the primary particle [cm]', 'x_pri'), numpy.float32),
(('Y position of the primary particle [cm]', 'y_pri'), numpy.float32),
(('Z position of the primary particle [cm]', 'z_pri'), numpy.float32)]
Event number is just a lable which peaks are together. type is either 1(S1) or 2 (S2). In the truth higher numbers are also used to refer to different types of afterpulses. time,x,y,z are the time and positions of the signal. Amp is the number of photons or electrons generated, and recoil can be used for different types of recoil (but only Electronic recoil is supported). Recoil is a number according to the NEST convention which will be used to indicate which interaction. For the key look here: https://github.com/XENONnT/WFSim/blob/2c614b0f7b0d7c7adc516f6188e281857e8d7e22/wfsim/core.py#L22
Now, lets say we want some krypton peaks. For this we’ll need to change the default instruction function to include this double decay and use Nestpy to convert energy deposits into a number of photons and electrons.
In this case I’ll use 1 “event” per full decay, that where all the 4’s are coming from.
[80]:
def super_awesome_custom_instruction(c):
import nestpy
half_life = 156.94e-9 #Kr intermediate state half-life in ns
decay_energies = [32.2,9.4] # Decay energies in kev
n = c['nevents'] = c['event_rate'] * c['chunk_size']
c['total_time'] = c['chunk_size']
instructions = np.zeros(4 * n, dtype=wfsim.instruction_dtype)
instructions['event_number'] = np.repeat(np.arange(n), 4)
instructions['type'] = np.tile([1, 2], 2 * n)
instructions['recoil'] = [7 for i in range(4 * n)]
r = np.sqrt(np.random.uniform(0, 2500, n))
t = np.random.uniform(-np.pi, np.pi, n)
instructions['x'] = np.repeat(r * np.cos(t), 4)
instructions['y'] = np.repeat(r * np.sin(t), 4)
instructions['z'] = np.repeat(np.random.uniform(-100, 0, n), 4)
#To get the correct times we'll need to include the 156.94 ns half life of the intermediate state.
uniform_times = c['total_time'] * (np.arange(n) + 0.5) / n
delayed_times = uniform_times + np.random.exponential(half_life/np.log(2),len(uniform_times))
instructions['time'] = np.repeat(list(zip(uniform_times,delayed_times)),2) * 1e9
# Here we'll define our XENON-like detector
nc = nestpy.NESTcalc(nestpy.VDetector())
A = 131.293
Z = 54.
density = 2.862 # g/cm^3 #SR1 Value
drift_field = 82 # V/cm #SR1 Value
interaction = nestpy.INTERACTION_TYPE(7)
energy = np.tile(decay_energies,n)
quanta = []
exciton = []
for en in energy:
y = nc.GetYields(interaction,
en,
density,
drift_field,
A,
Z,
(1, 1))
q = nc.GetQuanta(y, density)
quanta.append(q.photons)
quanta.append(q.electrons)
exciton.append(q.excitons)
exciton.append(0)
instructions['amp'] = quanta
instructions['local_field'] = drift_field
instructions['n_excitons'] = exciton
return instructions
Now here comes the magic line:
[81]:
wfsim.strax_interface.rand_instructions = super_awesome_custom_instruction
This changes the default rand_instruction function in our own super awesome function. So when the simulator will call rand_instruction the code from super_awesome_custom_instruction will be executed
CSV
The format for csv files is the same as the instructions dtype. So on every line specify event_number,type,time ,x,y,z, amp and recoil in that order. Then tell the simulator it exists:
[82]:
st.set_config(dict(fax_file='instructions.csv'))
Ofcourse if you do not have this file it will not work
Geant4
For starters you’ll need to register the Epix plugin. This requires an epix configuration (it’s on private_nt_aux_files), and if you want a start and stop event.
[83]:
st.register(wfsim.strax_interface.RawRecordsFromMcChain)
st.set_config(dict(fax_file = 'path_to_g4_file',
epix_config = 'path_to_config'))
Optical
Similar to above, register correct plugin and file:
[84]:
st.register(wfsim.strax_interface.RawRecordsFromMcChain)
st.set_config(dict(fax_file = 'path_to_g4_file',))
nVeto
Because the nveto is a different system it has a slightly different configuration. This is not entirely finished. Once it is you can run like so:
[85]:
st.register(wfsim.strax_interface.RawRecordsFromFaxnVeto)
st.set_config(dict(fax_config_nveto = 'path_to_nveto_config_file',
fax_file = 'path_to_g4_file',))
Configuration customization
The simulator using a larger large amount of configuration settings to do it’s magic. Some of them are best left along, like pmt_circuit_load_resistor. Others on the other hand are things you might want change a bit to see how the data will change. Unfortunately currently the full list is spread out over two different places. One is the fax config json which is on github. The other is the option list in strax. Besides those things like pattern maps are hardcoded in load_resource.py. These files are overrideable with the option “fax_config_override”
The strax config is viewable like this and can be changed by st.set_config(dict(option you want=value you want))
[86]:
st.show_config('raw_records')
[86]:
option | default | current | applies_to | help | |
---|---|---|---|---|---|
0 | gain_model_mc | cmt://to_pe_model?version=ONLINE&run_id=plugin... | cmt://to_pe_model?run_id=00001&version=ONLINE | (raw_records, raw_records_he, raw_records_aqmo... | PMT gain model. Specify as (model_type, model_... |
1 | detector | XENONnT | XENONnT | (raw_records, raw_records_he, raw_records_aqmo... | |
2 | event_rate | 1000 | 5 | (raw_records, raw_records_he, raw_records_aqmo... | Average number of events per second |
3 | chunk_size | 100 | 1 | (raw_records, raw_records_he, raw_records_aqmo... | Duration of each chunk in seconds |
4 | n_chunk | 10 | <OMITTED> | (raw_records, raw_records_he, raw_records_aqmo... | Number of chunks to simulate |
5 | per_pmt_truth | False | <OMITTED> | (raw_records, raw_records_he, raw_records_aqmo... | Store the info per channel in the truth file |
6 | fax_file | None | path_to_g4_file | (raw_records, raw_records_he, raw_records_aqmo... | Directory with fax instructions |
7 | fax_config | fax_config_nt_design.json | fax_config_nt_design.json | (raw_records, raw_records_he, raw_records_aqmo... | |
8 | fax_config_override | None | <OMITTED> | (raw_records, raw_records_he, raw_records_aqmo... | Dictionary with configuration option overrides |
9 | fax_config_override_from_cmt | None | {'drift_time_gate': ('cmt_run_id', '00001', 'e... | (raw_records, raw_records_he, raw_records_aqmo... | Dictionary of fax parameter names (key) mapped... |
10 | channel_map | <OMITTED> | (tpc, he, aqmon, aqmon_nv, tpc_blank, mv, aux_... | (raw_records, raw_records_he, raw_records_aqmo... | immutabledict mapping subdetector to (min, max... |
11 | n_tpc_pmts | <OMITTED> | 494 | (raw_records, raw_records_he, raw_records_aqmo... | Number of pmts in tpc. Provided by context |
12 | n_top_pmts | <OMITTED> | 253 | (raw_records, raw_records_he, raw_records_aqmo... | Number of pmts in top array. Provided by context |
13 | right_raw_extension | 100000 | <OMITTED> | (raw_records, raw_records_he, raw_records_aqmo... | |
14 | seed | False | <OMITTED> | (raw_records, raw_records_he, raw_records_aqmo... | Option for setting the seed of the random numb... |
15 | gain_model_nv | <OMITTED> | cmt://to_pe_model_nv?run_id=00001&version=ONLINE | (raw_records, raw_records_he, raw_records_aqmo... | nveto gain model, provided by context |
16 | epix_config | {} | path_to_config | (raw_records, raw_records_he, raw_records_aqmo... | Dict with epix configuration |
17 | entry_start | 0 | <OMITTED> | (raw_records, raw_records_he, raw_records_aqmo... | |
18 | entry_stop | None | <OMITTED> | (raw_records, raw_records_he, raw_records_aqmo... | G4 id event number to stop at. If -1 process t... |
19 | fax_config_nveto | None | path_to_nveto_config_file | (raw_records, raw_records_he, raw_records_aqmo... | |
20 | fax_config_override_nveto | None | <OMITTED> | (raw_records, raw_records_he, raw_records_aqmo... | Dictionary with configuration option overrides |
21 | targets | (tpc,) | <OMITTED> | (raw_records, raw_records_he, raw_records_aqmo... | tuple with what data to simulate (tpc, nveto o... |
With the following block you can visualized which fax config you are using. Remeber that the fax config files are stored in private_nt_aux_files github repository, in the sim_files folder.
[87]:
st.config['fax_config']
[87]:
'fax_config_nt_design.json'
The config from github can be loaded as:
[88]:
straxen.get_resource('fax_config_nt_design.json',fmt='json').keys()
[88]:
dict_keys(['s1_model_type', 's2_time_model', 's2_luminescence_model', 's2_garfield_confine_position', 'se_gain_from_map', 'ext_eff_from_map', 'enable_gas_gap_warping', 'enable_pmt_afterpulses', 'enable_electron_afterpulses', 'enable_noise', 'field_distortion_on', 'field_distortion_model', 'enable_field_dependencies', 'url_base', 'photon_area_distribution', 's1_pattern_map', 's2_pattern_map', 's1_lce_correction_map', 's2_correction_map', 'se_gain_map', 's1_time_spline', 's2_time_spline', 'field_distortion_comsol_map', 'photon_ap_cdfs', 'noise_file', 's2_luminescence_gg', 's2_luminescence', 'gas_gap_map', 'garfield_gas_gap_map', 'field_dependencies_map', 'temperature', 'pressure', 'lxe_dielectric_constant', 'tpc_length', 'tpc_radius', 'anode_wire_radius', 'anode_field_domination_distance', 'elr_gas_gap_length', 'gate_to_anode_distance', 'drift_field', 'anode_voltage', 'diffusion_constant_longitudinal', 'diffusion_constant_transverse', 'drift_time_gate', 'drift_velocity_liquid', 'singlet_fraction_gas', 'singlet_lifetime_gas', 'singlet_lifetime_liquid', 'triplet_lifetime_gas', 'triplet_lifetime_liquid', 's1_ER_alpha_singlet_fraction', 's1_ER_primary_singlet_fraction', 's1_ER_recombination_fraction', 's1_ER_recombination_time', 's1_ER_secondary_singlet_fraction', 's1_NR_singlet_fraction', 'maximum_recombination_time', 's1_decay_spread', 's1_decay_time', 's1_detection_efficiency', 's2_mean_area_fraction_top', 's2_aft_sigma', 's2_aft_skewness', 's2_secondary_sc_gain', 's2_time_spread', 'electron_extraction_yield', 'electron_lifetime_liquid', 'electron_trapping_time', 'gas_drift_velocity_slope', 'g2_mean', 'p_double_pe_emision', 'pe_pulse_ts', 'pe_pulse_ys', 'pmt_ap_modifier', 'pmt_ap_t_modifier', 'pmt_pulse_time_rounding', 'pmt_transit_time_mean', 'pmt_transit_time_spread', 'photoelectric_modifier', 'photoelectric_p', 'photoelectric_t_center', 'photoelectric_t_spread', 'photoionization_modifier', 'sample_duration', 'samples_after_pulse_center', 'samples_before_pulse_center', 'samples_to_store_after', 'samples_to_store_before', 'pmt_circuit_load_resistor', 'external_amplification', 'high_energy_deamplification_factor', 'trigger_window', 'digitizer_bits', 'digitizer_reference_baseline', 'digitizer_voltage_range', 'special_thresholds', 'zle_threshold'])
Changing things in this guy goes slightly different. In the strax option list there is the option called “fax_config_override”. This takes a dict which will be used to override any values in the json config. So changing the ‘s2_secondary_sc_gain’ is done as:
[89]:
st.set_config(dict(fax_config_override = dict(s2_secondary_sc_gain=23)))
Things you might want to change
There are 4 main things you might be interested in changing. They are: * Noise (True/False) * PMTAfterpulses (True/False) * Electron Afterpulses (True/False) * S2 luminescence (Simple or garfield)
They are controled by config settings you can change as so:
[90]:
st.set_config(dict(fax_config_override=dict(
s2_luminescence_model='simple',
enable_noise=False,
enable_pmt_afterpulses=False,
enable_electron_afterpulses=False,)))
What actually happens?
What happens behind the scenes is that the instructions are first grouped together in chunks. Then we loop over the instructions and the full chunk is returned before starting with the next one.
We use a S1 and S2 class to calculate the arrival times of the photons and the channels which have been hit. Then we’ll hand them over to the Pulse class to calculate the currents in the channels. Finally the currents go to a RawData class where we fake the digitizer response.
S1
For S1s we start with calculating the light yield based on the position of the interaction, and draw the number of photons seen from a Poisson distribution.
Second we calculate the arrival times of the photons. This is based on the scintillation of the xenon atoms. It is dependend on the recombination time, the singlet and triplet fractions.
Finally the channels are calculated. Based on the pattern map we use a interpolation map to get a probability distribution for channels to be hit for a S1 signal based on the position of the interaction, and then we draw from this distribution for every photon.
S2
S2s are slightly more complicated. First we need to drift the electrons up, and while doing so we’ll lose some of them. To get the photon timings, we first need to get the arrival times of the electrons at the gas interface based on a diffusion model. Then we can calculate the photon timings based on a luminescence model for every individual electron. And for the channels we do the same trick with the interpolating map.
Pulse
When we have our lists of channels and timing we can generate actual pulses. First we add a pmt transition time. Then we loop over all channels, calculate the double pe emission probabilities, and add a current in the pmt channel based on the arrival time. This is all stored in a big dictionary. Afterwards this is passed to our fake digitizers which then returns you with your very own pretty data
Getting down to bussiness
Now we have access to all the normal strax data types, and another one called ‘truth’ which holds the simulation instructions. Calling it follows the normal strax convention.
[91]:
st = straxen.contexts.xenonnt_simulation(cmt_run_id_sim='00001')
st.set_config(dict(event_rate=5, chunk_size=1, fax_file=None))
[92]:
# Remove any previously simulated data, if such exists
# !rm -r strax_data
records = st.get_array(run_id,'records')
# peaks = st.get_array(run_id, ['peaks','peak_basics'])
# data = st.get_df(run_id, 'event_info')
truth = st.get_df(run_id, 'truth')
Now it is time to make pretty plots and see if what we makes actually makes any sense
[93]:
peak_basics = st.get_df(run_id, 'peak_basics')
[94]:
peak_basics.head(10)
[94]:
time | endtime | center_time | area | n_hits | n_channels | max_pmt | max_pmt_area | n_saturated_channels | range_50p_area | range_90p_area | area_fraction_top | length | dt | rise_time | tight_coincidence | type | max_diff | min_diff | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 100000030 | 100000670 | 100000205 | 45.194511 | 14 | 13 | 345 | 5.892519 | 0 | 198.728180 | 327.420563 | 0.138998 | 64 | 10 | 94.678757 | 7 | 2 | 80 | 10 |
1 | 100622470 | 100626210 | 100624258 | 143.898376 | 41 | 17 | 50 | 45.115543 | 0 | 1164.394287 | 2533.276855 | 0.875330 | 187 | 20 | 930.115356 | 7 | 2 | 500 | 0 |
2 | 300000020 | 300000410 | 300000128 | 22.863621 | 7 | 6 | 483 | 5.560007 | 0 | 73.276711 | 153.896225 | 0.276379 | 39 | 10 | 53.755302 | 5 | 1 | 40 | 10 |
3 | 300724040 | 300724470 | 300724142 | 7.752567 | 3 | 3 | 461 | 3.555671 | 0 | 92.052193 | 296.905457 | 0.000000 | 43 | 10 | 25.530293 | 2 | 1 | 180 | 10 |
4 | 500000050 | 500000560 | 500000248 | 51.385868 | 11 | 11 | 408 | 20.320269 | 0 | 166.012741 | 310.153564 | 0.129670 | 51 | 10 | 142.900955 | 2 | 2 | 90 | 0 |
5 | 500589980 | 500593580 | 500591982 | 37.373817 | 15 | 10 | 241 | 8.749174 | 0 | 1940.112183 | 3275.072021 | 0.688202 | 180 | 20 | 2096.490967 | 3 | 2 | -1 | -1 |
6 | 700000040 | 700000610 | 700000139 | 15.011408 | 4 | 4 | 324 | 4.820670 | 0 | 57.526924 | 345.084930 | 0.000000 | 57 | 10 | 49.048229 | 3 | 1 | 270 | 20 |
7 | 700699840 | 700706240 | 700703462 | 102.122833 | 25 | 14 | 129 | 33.482544 | 0 | 3729.012451 | 5596.686035 | 0.880986 | 160 | 40 | 2050.706543 | 4 | 2 | -1 | -1 |
8 | 900000050 | 900000720 | 900000210 | 28.657455 | 8 | 8 | 346 | 7.821901 | 0 | 151.675720 | 443.599609 | 0.229542 | 67 | 10 | 58.658257 | 5 | 1 | 290 | 0 |
9 | 900703540 | 900707830 | 900705461 | 99.501129 | 24 | 16 | 64 | 27.903099 | 0 | 1144.565796 | 3354.698730 | 0.942021 | 143 | 30 | 838.284546 | 2 | 2 | 610 | 10 |
Matching
To do matching with the truth the easiest way is to write a new strax plugin where you loop over peaks and get the truth arrays where the mean arrival time of the photons are within the time window of the peak So that will look something like this:
[95]:
class MatchedPeaks(strax.LoopPlugin):
depends_on = ('peak_basics', 'truth')
provides = 'matched_peaks'
__version__ = '0.0.2'
dtype = [('time', np.int),
('endtime', np.int),
('area', np.int),
('n_photon', np.int)]
def compute(self, peaks, truth):
result = np.zeros(len(truth), self.dtype)
temp = strax.touching_windows(peaks, truth)
idx_truth = []
idx_peak = []
for ix, (l, r) in enumerate(temp):
res = result[ix]
res['time'] = truth['time'][ix]
res['endtime'] = strax.endtime(truth)[ix]
res['n_photon'] = truth['n_photon'][ix]
if r > l:
i_peak = l+np.argmax(peaks['area'][l:r])
res['area'] = peaks['area'][i_peak]
return result
/tmp/jobs/29284557/ipykernel_277/3800134942.py:6: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
dtype = [('time', np.int),
/tmp/jobs/29284557/ipykernel_277/3800134942.py:7: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
('endtime', np.int),
/tmp/jobs/29284557/ipykernel_277/3800134942.py:8: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
('area', np.int),
/tmp/jobs/29284557/ipykernel_277/3800134942.py:9: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
('n_photon', np.int)]
Of course this doesn’t actually work. An electron afterpulse can be very spread out leading it to be interpreted as multiple peaks while coming from 1 instruction falling outside of the specified range. It would be very much appreciated if someone wants make a more sturdy selection criteria :)
For example, checkout out peak matching algorithm for WFSim: https://github.com/XENONnT/pema
[96]:
st.register(MatchedPeaks)
[96]:
__main__.MatchedPeaks
[97]:
st.get_df(run_id, 'matched_peaks').head(10)
[97]:
time | endtime | area | n_photon | |
---|---|---|---|---|
0 | 100000029 | 100000630 | 45 | 251 |
1 | 100000333 | 100000791 | 45 | 69 |
2 | 100620431 | 100627474 | 143 | 856 |
3 | 100622911 | 100627768 | 143 | 211 |
4 | 300000006 | 300000565 | 22 | 255 |
5 | 300000067 | 300000556 | 22 | 69 |
6 | 300723015 | 300728784 | 7 | 186 |
7 | 300723660 | 300727606 | 7 | 128 |
8 | 500000003 | 500000619 | 51 | 235 |
9 | 500000111 | 500000642 | 51 | 49 |
Externally call functions
The s1/s2 functions used for generating photon timings and channels are now doocumented and callable from the outside. This you can use if you want to do something like compare photon arrival timings between wfsims results and g4
[98]:
st = straxen.contexts.xenonnt_simulation(cmt_run_id_sim='00001')
Object `wfsim.core.S1.get_n_photons` not found.
[119]:
config = straxen.get_resource('fax_config_nt_sr0_v4.json', fmt='json')
config.update({'detector':'XENONnT', 'right_raw_extension':50000})
st.config['gains'] = [1 for _ in range(straxen.n_tpc_pmts)]
config.update(st.config)
[122]:
s2 = wfsim.core.s2.S2(config)
WARNING:wfsim.resource:Loading XENONnT_SR0_spe_distributions_20210713_no_noise_scaled.csv from mongo downloader to /dali/lgrandi/strax/resource_cache/99c2cbc580cfa8eeebe831456076e136
WARNING:wfsim.resource:Loading XENONnT_s1_xyz_patterns_LCE_MCvf051911_wires.pkl from mongo downloader to /dali/lgrandi/strax/resource_cache/02ed2634596793f6e7aa13783745089a
WARNING:wfsim.resource:A file has value False, assuming this is intentional.
WARNING:wfsim.resource:Loading XENONnT_s2_xy_patterns_GXe_LCE_corrected_qes_MCv4.3.0_wires.pkl from mongo downloader to /dali/lgrandi/strax/resource_cache/6639b01b73d4fdbb703174ad424801f5
WARNING:wfsim.resource:Loading XENONnT_s2_xy_map_v4_210503_mlp_3_in_1_iterated.json from mongo downloader to /dali/lgrandi/strax/resource_cache/ca348171778ae5cdb5de6c1f5814f156
WARNING:wfsim.resource:Loading XENONnT_se_xy_map_v1_mlp.json from mongo downloader to /dali/lgrandi/strax/resource_cache/4ed37f1d06853503899ac37a0a2237ba
WARNING:wfsim.resource:Loading XENONnT_pmt_afterpulse_config_018435.json.gz from mongo downloader to /dali/lgrandi/strax/resource_cache/a38b18cd61ca67a065a3a32d9b57c2b2
WARNING:wfsim.resource:Loading XENONnT_GARFIELD_SR0_B2d75n_C2d75n_G0d3p_A4d9p_T0d9n_PMTs1d3n_FSR0d65p.npz from mongo downloader to /dali/lgrandi/strax/resource_cache/42d87bc750ceb2f3c0491a586e016341
WARNING:wfsim.resource:Loading garfield_timing_map_gas_gap_sr0.npy from mongo downloader to /dali/lgrandi/strax/resource_cache/803c5f08cb89063a9671ac8da5cb7e75
WARNING:wfsim.resource:Loading gas_gap_warping_map_January_2021.pkl from mongo downloader to /dali/lgrandi/strax/resource_cache/d37b34dda3e865ef607b0bee2057772c
WARNING:wfsim.resource:Loading garfield_gas_gap_map_sr0.json from mongo downloader to /dali/lgrandi/strax/resource_cache/d39d7eda03373635c60e209067470dc8
WARNING:wfsim.resource:Loading x1t_se_afterpulse_delaytime.pkl.gz from mongo downloader to /dali/lgrandi/strax/resource_cache/014aa048758a51f154cdc3b32dac2773
WARNING:wfsim.resource:Loading XENONnT_noise_tpc_only_2ms_25118.npz from mongo downloader to /dali/lgrandi/strax/resource_cache/e560759e28f74d5c049425c25e5ae0d5
WARNING:wfsim.resource:Loading XnT_3D_FDC_xyt_dummy_all_zeros_v0.1.json.gz from mongo downloader to /dali/lgrandi/strax/resource_cache/55b30b2d26d151ae8e2aa55ffe6693ad
WARNING:wfsim.resource:Loading field_dependent_radius_depth_maps_B2d75n_C2d75n_G0d3p_A4d9p_T0d9n_PMTs1d3n_FSR0d65p_QPTFE_0d5n_0d4p.json.gz from mongo downloader to /dali/lgrandi/strax/resource_cache/487272e33b4c832ae715350b6275ba0b
WARNING:wfsim.resource:Loading data_driven_diffusion_map_XENONnTSR0V2.json.gz from mongo downloader to /dali/lgrandi/strax/resource_cache/8a6c6dc1bfe5380f48e96462161a3313
WARNING:wfsim.resource:Loading XENONnT_s1_proponly_pc_reflection_optPhot_perPMT_S1_local_20220510.json.gz from mongo downloader to /dali/lgrandi/strax/resource_cache/91f4a6162d5e335c4416165349222061
WARNING:wfsim.resource:Loading XENONnT_s2_opticalprop_time_v0.json.gz from mongo downloader to /dali/lgrandi/strax/resource_cache/1d257e215e00873e37ae49f16a04d717
WARNING:wfsim.resource:Using json for unspecified XENONnT_se_xy_map_v1_mlp.json
/opt/XENONnT/anaconda/envs/XENONnT_2023.07.1/lib/python3.9/site-packages/wfsim/load_resource.py:262: RuntimeWarning: invalid value encountered in true_divide
orig_aft_=np.mean((s2map_topeff_/s2map_toteff_)[s2map_toteff_>0.0])
WARNING:wfsim.resource:Using json.gz for unspecified XENONnT_s2_opticalprop_time_v0.json.gz
WARNING:wfsim.resource:Using noise data XENONnT_noise_tpc_only_2ms_25118.npz with 494 channels for XENONnT
[124]:
s2.electron_timings?
Signature:
s2.electron_timings(
t,
n_electron,
drift_time_mean,
drift_time_spread,
sc_gain,
timings,
gains,
electron_trapping_time,
)
Call signature: s2.electron_timings(*args, **kwargs)
Type: CPUDispatcher
String form: CPUDispatcher(<function S2.electron_timings at 0x7f8d3dae25e0>)
File: /opt/XENONnT/anaconda/envs/XENONnT_2023.07.1/lib/python3.9/site-packages/wfsim/core/s2.py
Docstring:
Calculate arrival times of the electrons. Data is written to the timings and gains arrays
:param t: 1d array of ints
:param n_electron:1 d array of ints
:param drift_time_mean: 1d array of floats
:param drift_time_spread: 1d array of floats
:param sc_gain: secondary scintillation gain
:param timings: empty array with length sum(n_electron)
:param gains: empty array with length sum(n_electron)
:param electron_trapping_time: configuration values
Class docstring:
Implementation of user-facing dispatcher objects (i.e. created using
the @jit decorator).
This is an abstract base class. Subclasses should define the targetdescr
class attribute.
Init docstring:
Parameters
----------
py_func: function object to be compiled
locals: dict, optional
Mapping of local variable names to Numba types. Used to override
the types deduced by the type inference engine.
targetoptions: dict, optional
Target-specific config options.
impl_kind: str
Select the compiler mode for `@jit` and `@generated_jit`
pipeline_class: type numba.compiler.CompilerBase
The compiler pipeline type.
[ ]:
Understanding Waveform Simulation for XENONnT
Nov 30
[1]:
import strax, straxen, wfsim
import numpy as np
import pandas as pd
*** Detector definition message ***
You are currently using the default XENON10 template detector.
[2]:
straxen.print_versions(modules=('strax', 'straxen', 'cutax', 'wfsim', 'ntauxfiles'))
[2]:
module | version | path | git | |
---|---|---|---|---|
0 | python | 3.9.17 | /opt/XENONnT/anaconda/envs/XENONnT_2023.07.1/b... | None |
1 | strax | 1.5.2 | /opt/XENONnT/anaconda/envs/XENONnT_2023.07.1/l... | None |
2 | straxen | 2.1.1 | /opt/XENONnT/anaconda/envs/XENONnT_2023.07.1/l... | None |
3 | cutax | 1.15.1 | /home/shenyangshi/cutax/cutax | branch:bdt_ms | 277f174 |
4 | wfsim | 1.0.2 | /opt/XENONnT/anaconda/envs/XENONnT_2023.07.1/l... | None |
5 | ntauxfiles | 0.3.0 | /project2/lgrandi/shenyangshi/home/private_nt_... | branch:AmBe_ms_fit | 3555246 |
[4]:
config = straxen.get_resource('fax_config_nt_sr0_dev.json', fmt='json')
config.update({'detector':'XENONnT', 'right_raw_extension':50000})
Part 1, Simulation flow
Conceptual Overview / Between Interface and Core / Pulse Classes / Workflow Walkthrough
Conceptual Overview
What it’s for?
Imagine a monoenergetic source like Kr83m, producing photons and electrons uniformally in the detector, what would the events look like, can the processing software correctly reconstruct them? Or Would you like some fake events in the blinded regions?
The signals from XENON experiments have been extensively studied, so that we can mostly model them from the bottom up approach. The WFSim use those models do construct waveforms as realistic as possible while keeping track of the inputs.
Let’s break down the WFSim bit by bit.
How it works
The WFSim from the outside works like a python iterator, and is composed of four levels of iterators, where the deepest are Pulse
classes (those are not exactly iterator) taking instruction groups and return a list of pseudo pulses. The RawData
take the pseudo pulses and yield
digitized pulses, similar to what physical digitizers would do. The ChunkRawRecords
and RawRecordsFromFaxNT
takes care of chunking and interface with strax respectively.
Instruction groups
However, it is not exactly iterating over instructions, the instructions are just one of the arguments for __init__
. It is designed to turn instruction like the one below into the lowest input structure of the processing software, chunk
containing raw_records
(event
containing pulses
).
[ ]:
[5]:
instructions = inst_array = wfsim.strax_interface.rand_instructions(c={'event_rate':3, 'chunk_size':1, 'nchunk':1, **config})
inst_df = pd.DataFrame(inst_array)
inst_df
WARNING:wfsim.interface:rand_instructions is deprecated, please use wfsim.random_instructions
WARNING:wfsim.interface:g4id is not (fully) filled
WARNING:wfsim.interface:vol_id is not (fully) filled
[5]:
event_number | type | time | x | y | z | amp | recoil | e_dep | g4id | vol_id | local_field | n_excitons | x_pri | y_pri | z_pri | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0 | 1 | 166666666 | 32.122543 | -22.621212 | -43.662350 | 818 | 7 | 13.101433 | -1 | -1 | 22.92 | 113 | 32.122543 | -22.621212 | -43.662350 |
1 | 0 | 2 | 166666666 | 32.122543 | -22.621212 | -43.662350 | 145 | 7 | 13.101433 | -1 | -1 | 22.92 | 0 | 32.122543 | -22.621212 | -43.662350 |
2 | 0 | 1 | 500000000 | 7.289836 | -28.672121 | -58.539639 | 3675 | 7 | 57.908840 | -1 | -1 | 22.92 | 670 | 7.289836 | -28.672121 | -58.539639 |
3 | 0 | 2 | 500000000 | 7.289836 | -28.672121 | -58.539639 | 609 | 7 | 57.908840 | -1 | -1 | 22.92 | 0 | 7.289836 | -28.672121 | -58.539639 |
4 | 0 | 1 | 833333333 | 36.476028 | -32.952290 | -136.419952 | 4234 | 7 | 65.917046 | -1 | -1 | 22.92 | 755 | 36.476028 | -32.952290 | -136.419952 |
5 | 0 | 2 | 833333333 | 36.476028 | -32.952290 | -136.419952 | 625 | 7 | 65.917046 | -1 | -1 | 22.92 | 0 | 36.476028 | -32.952290 | -136.419952 |
6 | 1 | 1 | 1166666666 | -13.274544 | -16.667040 | -7.988991 | 1040 | 7 | 16.401899 | -1 | -1 | 22.92 | 159 | -13.274544 | -16.667040 | -7.988991 |
7 | 1 | 2 | 1166666666 | -13.274544 | -16.667040 | -7.988991 | 164 | 7 | 16.401899 | -1 | -1 | 22.92 | 0 | -13.274544 | -16.667040 | -7.988991 |
8 | 1 | 1 | 1500000000 | -6.285732 | 65.843475 | -28.266441 | 2797 | 7 | 43.466827 | -1 | -1 | 22.92 | 486 | -6.285732 | 65.843475 | -28.266441 |
9 | 1 | 2 | 1500000000 | -6.285732 | 65.843475 | -28.266441 | 383 | 7 | 43.466827 | -1 | -1 | 22.92 | 0 | -6.285732 | 65.843475 | -28.266441 |
10 | 1 | 1 | 1833333333 | -45.993202 | -14.296285 | -122.466698 | 346 | 7 | 5.972553 | -1 | -1 | 22.92 | 24 | -45.993202 | -14.296285 | -122.466698 |
11 | 1 | 2 | 1833333333 | -45.993202 | -14.296285 | -122.466698 | 102 | 7 | 5.972553 | -1 | -1 | 22.92 | 0 | -45.993202 | -14.296285 | -122.466698 |
The event_number
are all 0, but don’t worry about it. It is used as event index when using with pax, while having no significant meaning when using with straxen.
The instruction is sorted by the physical time of the signal, that is S2-esque signal are delayed by drift time. And clustered into instruction groups, similar to gap size clustering that split when the gap is larger than right_raw_extension
.
[6]:
# Pre-load some constents from config
v = config['drift_velocity_liquid']
rext = config['right_raw_extension']
# Primary instructions must be sorted by signal time
# int(type) by design S1-esque being odd, S2-esque being even
# thus type%2-1 is 0:S1-esque; -1:S2-esque
# Make a list of clusters of instructions, with gap smaller then rext
inst_time = instructions['time'] + instructions['z'] / v * (instructions['type'] % 2 - 1)
inst_queue = np.argsort(inst_time)
inst_queue = np.split(inst_queue, np.where(np.diff(inst_time[inst_queue]) > rext)[0]+1)
Between Interface and Core
Let’s now examine what’s been passing between the interface and core, specifically between ChunkRawRecord
and ChunkRawRecord.rawdata (rawdata)
. The most important things passed into rawdata
are the instructions. But we also pass the truth buffer (empty array of truth dtype with 10000 slots) into rawdata
. In return, we have (channel, left, right, data)
valuse of digitized pulses, needed to build raw records.
At the same time three properties are interally used as chunking indicators rawdata.left, rawdata.right, radata.source_finished
. Whereas the left and right above returned by calling __next__
on rawdata
are of individual pulses, the rawdata.left, rawdata.right
are of all the pulses originated from an instruction group. So that when we stop and chunk, we can be sure we have finished an instruction group, and the next pulse will come at least right_raw_extension
away.
Sim data is the function where Pulse
class are called.
Convert to raw records
The pulses returned are in the form of (channel, left, right, data)
, not exactly the form of raw records. So one of the two main functions of ChunkRawRecord
is to covert them.
[7]:
# !!! Do not run this cell
pulse_length = right - left + 1
records_needed = int(np.ceil(pulse_length / samples_per_record))
# WARNING baseline and area fields are zeros before finish_results
s = slice(blevel, blevel + records_needed)
record_buffer[s]['channel'] = channel
record_buffer[s]['dt'] = dt
record_buffer[s]['time'] = dt * (left + samples_per_record * np.arange(records_needed))
record_buffer[s]['length'] = [min(pulse_length, samples_per_record * (i+1))
- samples_per_record * i for i in range(records_needed)]
record_buffer[s]['pulse_length'] = pulse_length
record_buffer[s]['record_i'] = np.arange(records_needed)
record_buffer[s]['data'] = np.pad(data,
(0, records_needed * samples_per_record - pulse_length), 'constant').reshape((-1, samples_per_record))
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[7], line 3
1 # !!! Do not run this cell
----> 3 pulse_length = right - left + 1
4 records_needed = int(np.ceil(pulse_length / samples_per_record))
6 # WARNING baseline and area fields are zeros before finish_results
NameError: name 'right' is not defined
Digitize pulse cache
The psuedo pulses are cached in _pulses_cache
, once an instruction group is over, we digitize all the psuedo pulses. This is done by summing up all the pulse current in each channel and convert unit from PE to acd count thus becoming adc_wave
. Three functions add_noise
, add_baseline
and digitizer_saturation
are done afterwards.
[8]:
# !!! Do not run this cell
current_2_adc = config['pmt_circuit_load_resistor'] \
* config['external_amplification'] \
/ (config['digitizer_voltage_range'] / 2 ** (config['digitizer_bits']))
left = np.min([p['left'] for p in _pulses_cache]) - config['trigger_window']
right = np.max([p['right'] for p in _pulses_cache]) + config['trigger_window']
assert right - left < 200000, "Pulse cache too long"
if left % 2 != 0: left -= 1 # Seems like a digizier effect
_raw_data = np.zeros((801,
right - left + 1), dtype=('<i8'))
for ix, _pulse in enumerate(_pulses_cache):
ch = _pulse['channel']
adc_wave = - np.trunc(_pulse['current'] * current_2_adc).astype(int)
_slice = slice(_pulse['left'] - left, _pulse['right'] - left + 1)
_raw_data[ch, _slice] += adc_wave
if config['detector'] == 'XENONnT':
adc_wave_he = adc_wave * int(config['high_energy_deamplification_factor'])
if ch <= config['channels_top'][-1]:
ch_he = config['channels_top_high_energy'][ch]
_raw_data[ch_he, _slice] += adc_wave_he
elif ch <= config['channels_bottom'][-1]:
sum_signal(adc_wave_he,
_pulse['left'] - left,
_pulse['right'] - left + 1,
_raw_data[config['channels_in_detector']['sum_signal']])
# Adding noise, baseline and digitizer saturation
add_noise(data=_raw_data,
channel_mask=_channel_mask,
noise_data=resource.noise_data,
noise_data_length=len(resource.noise_data))
add_baseline(_raw_data, _channel_mask,
config['digitizer_reference_baseline'],)
digitizer_saturation(_raw_data, _channel_mask)
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[8], line 8
1 # !!! Do not run this cell
4 current_2_adc = config['pmt_circuit_load_resistor'] \
5 * config['external_amplification'] \
6 / (config['digitizer_voltage_range'] / 2 ** (config['digitizer_bits']))
----> 8 left = np.min([p['left'] for p in _pulses_cache]) - config['trigger_window']
9 right = np.max([p['right'] for p in _pulses_cache]) + config['trigger_window']
10 assert right - left < 200000, "Pulse cache too long"
NameError: name '_pulses_cache' is not defined
Z(ero) L(ength) E(ncoding)
Right after digitize pulse cache, we run ZLE, which uses find_intervals_below_threshold
. For each interval, this yields a pulse, similar to what physical digitizers are doing
Pulse Classes
Pulse classes are another monster we will go into with more details in other notebooks. But in general, there’s the parent class Pulse
while different types of signal are children of it. And S2-esque after pulsing all inherite from S2
.
Workflow Walkthrough
[ ]:
Understanding Waveform Simulation for XENONnT
Nov 30
[1]:
import strax, straxen, wfsim
import numpy as np
from scipy.interpolate import interp1d
import matplotlib.pyplot as plt
[2]:
config = straxen.get_resource('https://raw.githubusercontent.com/XENONnT/'
'strax_auxiliary_files/master/fax_files/fax_config_nt.json', fmt='json')
config.update({'detector':'XENONnT'})
Part 2, Pulse (parent class)
The simulated pulses are formed using single-photon hits. Each hit has assigned time, channel, and gain using a common single photo-electron (SPE) shape.
SPE Shape is extracted from data and saved into a single template, with two entries in the config pe_pulse_ts
and pe_pulse_ys
.
[3]:
def init_pmt_current_templates():
"""
Create spe templates, for 10ns sample duration and 1ns rounding we have:
_pmt_current_templates[i] : photon timing fall between [10*m+i, 10*m+i+1)
(i, m are integers)
"""
global pe_pulse_function, _pmt_current_templates
# Interpolate on cdf ensures that each spe pulse would sum up to 1 pe*sample duration^-1
pe_pulse_function = interp1d(
config.get('pe_pulse_ts'),
np.cumsum(config.get('pe_pulse_ys')),
bounds_error=False, fill_value=(0, 1))
# Samples are always multiples of sample_duration
sample_duration = config.get('sample_duration', 10)
samples_before = config.get('samples_before_pulse_center', 2)
samples_after = config.get('samples_after_pulse_center', 20)
pmt_pulse_time_rounding = config.get('pmt_pulse_time_rounding', 1.0)
# Let's fix this, so everything can be turned into int
assert pmt_pulse_time_rounding == 1
samples = np.linspace(-samples_before * sample_duration,
+ samples_after * sample_duration,
1 + samples_before + samples_after)
_template_length = np.int(len(samples) - 1)
templates = []
for r in np.arange(0, sample_duration, pmt_pulse_time_rounding):
pmt_current = np.diff(pe_pulse_function(samples - r)) / sample_duration # pe / 10 ns
# Normalize here to counter tiny rounding error from interpolation
pmt_current *= (1 / sample_duration) / np.sum(pmt_current) # pe / 10 ns
templates.append(pmt_current)
_pmt_current_templates = np.array(templates)
init_pmt_current_templates()
[4]:
fig, axes = plt.subplots(2, 5, figsize=(20, 10))
for i, temp in enumerate(_pmt_current_templates):
plt.sca(axes[i//5, i%5])
# Pluse shape taken from config with 1 ns resolution
plt.plot(np.array(config.get('pe_pulse_ts')) + i, config.get('pe_pulse_ys'), alpha=0.5, color='b', lw=3.0, label='model')
# Caching 10 pulse shape templates with 10 ns resolution
plt.step(np.arange(22) * 10 - 10, temp, alpha=1, lw=3.0, color='k', where='mid', label='pulse')
plt.legend()
plt.title('peak time +%d ns'%(i))
plt.xlabel('ns')
plt.ylabel('PE/10ns')
plt.tight_layout()
plt.show()

SPE gain mean and spread
While pax save gains in the config, strax uses the function get_to_pe
. We are a bit careless here and hard code the conversion here to reintroduce gains
into config
[5]:
to_pe = straxen.get_to_pe(1, ('to_pe_per_run', 'https://raw.githubusercontent.com/XENONnT/'
'strax_auxiliary_files/master/fax_files/to_pe_nt.npy'),
len(config['channels_in_detector']['tpc']))
config['gains'] = 1 / to_pe * (1e-8 * 2.25 / 2**14) / (1.6e-19 * 10 * 50)
config['gains'][to_pe==0] = 0
/opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.6/site-packages/ipykernel_launcher.py:3: DeprecationWarning: to_pe_per_run will be replaced by CorrectionsManagementSevices
This is separate from the ipykernel package so we can avoid doing imports until
/opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.6/site-packages/ipykernel_launcher.py:4: RuntimeWarning: divide by zero encountered in true_divide
after removing the cwd from sys.path.
[6]:
photon_area_distribution = straxen.get_resource('https://raw.githubusercontent.com/XENONnT/'
'strax_auxiliary_files/master/fax_files/XENONnT_spe_distributions.csv', fmt='csv')
def init_spe_scaling_factor_distributions():
global grid_cdf, spe_shapes, __uniform_to_pe_arr
# Extract the spe pdf from a csv file into a pandas dataframe
spe_shapes = photon_area_distribution
# Create a converter array from uniform random numbers to SPE gains (one interpolator per channel)
# Scale the distributions so that they have an SPE mean of 1 and then calculate the cdf
uniform_to_pe_arr = []
for ch in spe_shapes.columns[1:]: # skip the first element which is the 'charge' header
if spe_shapes[ch].sum() > 0:
mean_spe = (spe_shapes['charge'].values * spe_shapes[ch]).sum() / spe_shapes[ch].sum()
scaled_bins = spe_shapes['charge'].values / mean_spe
cdf = np.cumsum(spe_shapes[ch]) / np.sum(spe_shapes[ch])
else:
# if sum is 0, just make some dummy axes to pass to interpolator
cdf = np.linspace(0, 1, 10)
scaled_bins = np.zeros_like(cdf)
grid_cdf = np.linspace(0, 1, 2001)
grid_scale = interp1d(cdf, scaled_bins,
bounds_error=False,
fill_value=(scaled_bins[0], scaled_bins[-1]))(grid_cdf)
uniform_to_pe_arr.append(grid_scale)
if len(uniform_to_pe_arr):
__uniform_to_pe_arr = np.stack(uniform_to_pe_arr)
def uniform_to_pe_arr(p, channel=0):
indices = (p * 2000).astype(int)
return __uniform_to_pe_arr[channel, indices]
init_spe_scaling_factor_distributions()
[7]:
fig, axes = plt.subplots(1, 2, figsize=(8, 5))
plt.sca(axes[0])
ch = '0'
mean_spe = (spe_shapes['charge'].values * spe_shapes[ch]).sum() / spe_shapes[ch].sum()
scaled_bins = spe_shapes['charge'].values / mean_spe
cdf = np.cumsum(spe_shapes[ch]) / np.sum(spe_shapes[ch])
channel = int(ch)
l1, = plt.plot(scaled_bins, spe_shapes[ch], alpha=0.5, color='b', lw=3.0, label=r'pdf $f(x)$')
plt.xlabel('Gain scaling factor')
plt.ylabel('pdf')
axt = plt.twinx()
l2, = plt.plot(scaled_bins, cdf, alpha=1.0, color='k', lw=3.0, label=r'cdf $F(x)=\int_{-\infty}^{x}f(t)dt$')
#plt.plot(__uniform_to_pe_arr[channel], grid_cdf, alpha=0.5, color='k', lw=3.0, ls='--')
plt.ylabel('cdf')
plt.title('input spe spectrum')
plt.legend(handles=[l1, l2], loc='center right')
plt.sca(axes[1])
plt.plot(cdf, scaled_bins, alpha=1.0, color='k', lw=3.0, label=r'ppf $F^{-1}(x)$')
#plt.plot(grid_cdf, __uniform_to_pe_arr[channel], alpha=0.5, color='k', lw=3.0, ls='--', )
plt.xlabel('cdf')
plt.ylabel('Gain scaling factor')
plt.title('inverse transform sampling')
plt.legend(loc='center left')
plt.tight_layout()
plt.show()

The spe (gain scaling factor) pdf are saved on github, show in blue and coverted into ppf for the inverse transform sampling. So that we can map randomly genarated number between 0 and 1 to the scaling factor.
Add Current
[ ]:
def add_current(photon_timings,
photon_gains,
pulse_left,
dt,
pmt_current_templates,
pulse_current):
# """
# Simulate single channel waveform given the photon timings
# photon_timing - dim-1 integer array of photon timings in unit of ns
# photon_gain - dim-1 float array of ph. 2 el. gain individual photons
# pulse_left - left of the pulse in unit of 10 ns
# dt - mostly it is 10 ns
# pmt_current_templates - list of spe templates of different reminders
# pulse_current - waveform
# """
if not len(photon_timings):
return
template_length = len(pmt_current_templates[0])
i_photons = np.argsort(photon_timings)
# Convert photon_timings to int outside this function
# photon_timings = photon_timings // 1
gain_total = 0
tmp_photon_timing = photon_timings[i_photons[0]]
for i in i_photons:
if photon_timings[i] > tmp_photon_timing:
start = int(tmp_photon_timing // dt) - pulse_left
reminder = int(tmp_photon_timing % dt)
pulse_current[start:start + template_length] += \
pmt_current_templates[reminder] * gain_total
gain_total = photon_gains[i]
tmp_photon_timing = photon_timings[i]
else:
gain_total += photon_gains[i]
else:
start = int(tmp_photon_timing // dt) - pulse_left
reminder = int(tmp_photon_timing % dt)
pulse_current[start:start + template_length] += \
pmt_current_templates[reminder] * gain_total
Pulse Class Call Workflow
[ ]:
def clear_pulse_cache():
global _pulses
_pulses = []
__dict__ = {}
def call():
"""
PMTs' response to incident photons
Use _photon_timings, _photon_channels to build pulses
"""
global _photon_timings, _photon_channels, turned_off_pmts
# The pulse cache should be immediately transfered after call this function
clear_pulse_cache()
# Correct for PMT Transition Time Spread (skip for pmt afterpulses)
_photon_timings += np.random.normal(config['pmt_transit_time_mean'],
config['pmt_transit_time_spread'],
len(_photon_timings))
dt = config.get('sample_duration', 10) # Getting dt from the lib just once
_n_double_pe = _n_double_pe_bot = 0 # For truth aft output
counts_start = 0 # Secondary loop index for assigning channel
for channel, counts in zip(*np.unique(_photon_channels, return_counts=True)):
#TODO: This is temporary continue to avoid out-of-range error.
# It should be added a proper method for nVeto PMTs also.
if channel >= 2000:
continue
# Use 'counts' amount of photon for this channel
_channel_photon_timings = _photon_timings[counts_start:counts_start+counts]
counts_start += counts
if channel in turned_off_pmts: continue
# If gain of each photon is not specifically assigned
# Sample from spe scaling factor distribution and to individual gain
# In contrast to pmt afterpulse that should have gain determined before this step
if '_photon_gains' not in __dict__:
if config['detector'] == 'XENON1T':
_channel_photon_gains = config['gains'][channel] \
* uniform_to_pe_arr(np.random.random(len(_channel_photon_timings)), channel)
else:
_channel_photon_gains = config['gains'][channel] \
* uniform_to_pe_arr(np.random.random(len(_channel_photon_timings)))
# Add some double photoelectron emission by adding another sampled gain
n_double_pe = np.random.binomial(len(_channel_photon_timings),
p=config['p_double_pe_emision'])
_n_double_pe += n_double_pe
if channel in config['channels_bottom']:
_n_double_pe_bot += n_double_pe
#_dpe_index = np.random.randint(len(_channel_photon_timings),
# size=n_double_pe)
if config['detector'] == 'XENON1T':
_channel_photon_gains[:n_double_pe] += config['gains'][channel] \
* uniform_to_pe_arr(np.random.random(n_double_pe), channel)
else:
_channel_photon_gains[:n_double_pe] += config['gains'][channel] \
* uniform_to_pe_arr(np.random.random(n_double_pe))
else:
_channel_photon_gains = np.array(_photon_gains[_photon_channels == channel])
# Build a simulated waveform, length depends on min and max of photon timings
min_timing, max_timing = np.min(
_channel_photon_timings), np.max(_channel_photon_timings)
pulse_left = int(min_timing // dt) - int(config['samples_to_store_before'])
pulse_right = int(max_timing // dt) + int(config['samples_to_store_after'])
pulse_current = np.zeros(pulse_right - pulse_left + 1)
add_current(_channel_photon_timings.astype(int),
_channel_photon_gains,
pulse_left,
dt,
_pmt_current_templates,
pulse_current)
# For single event, data of pulse level is small enough to store in dataframe
_pulses.append(dict(
photons = len(_channel_photon_timings),
channel = channel,
left = pulse_left,
right = pulse_right,
duration = pulse_right - pulse_left + 1,
current = pulse_current,))
[ ]:
Advanced Tricks
So you’ve tried making normal waveforms and now you need to spice up your life by making some way more weird waveforms letting the detector be whatever you want it to be? You have come to the right place!
By default fax uses some configuration file which is a huge pain to modify. So we made fax such that if you add a parameter in the instruction which corresponds to a parameter in the config it will overwrite what the value was in the config and let you deside what it should be!
This example shows how to modify the electron lifetime and the anode voltage
[1]:
import numpy as np
import strax
import straxen
import wfsim
import matplotlib.pyplot as plt
from matplotlib.colors import LogNorm
from multihist import Histdd, Hist1d
from scipy import stats
[2]:
st = strax.Context(
config=dict(
detector='XENON1T',
fax_config='https://raw.githubusercontent.com/XENONnT/'
'strax_auxiliary_files/master/sim_files/fax_config_1t.json',
fax_config_override={'field_distortion_on':True, 's2_luminescence_model':'simple'},
**straxen.contexts.xnt_common_config),
**straxen.contexts.common_opts)
st.register(wfsim.RawRecordsFromFax1T)
[2]:
wfsim.strax_interface.RawRecordsFromFax1T
[3]:
# Just some id from post-SR1, so the corrections work
run_id = '000001'
[4]:
strax.Mailbox.DEFAULT_TIMEOUT=10000
[5]:
dtype = wfsim.strax_interface.instruction_dtype
for new_dtype in [('electron_lifetime_liquid', np.int32),
('anode_voltage', np.int32)]:
if new_dtype not in dtype:
dtype.append(new_dtype)
def rand_instructions(c):
n = c['nevents'] = c['event_rate'] * c['chunk_size'] * c['nchunk']
c['total_time'] = c['chunk_size'] * c['nchunk']
instructions = np.zeros(2 * n, dtype=dtype)
uniform_times = c['total_time'] * (np.arange(n) + 0.5) / n
instructions['time'] = np.repeat(uniform_times, 2) * int(1e9)
instructions['event_number'] = np.digitize(instructions['time'],
1e9 * np.arange(c['nchunk']) * c['chunk_size']) - 1
instructions['type'] = np.tile([1, 2], n)
instructions['recoil'] = ['er' for i in range(n * 2)]
r = np.sqrt(np.random.uniform(0, 2500, n))
t = np.random.uniform(-np.pi, np.pi, n)
instructions['x'] = np.repeat(r * np.cos(t), 2)
instructions['y'] = np.repeat(r * np.sin(t), 2)
instructions['z'] = np.repeat(np.random.uniform(-100, 0, n), 2)
nphotons = np.random.uniform(2000, 2050, n)
nelectrons = 10 ** (np.random.uniform(1, 4, n))
instructions['amp'] = np.vstack([nphotons, nelectrons]).T.flatten().astype(int)
instructions['electron_lifetime_liquid'] = np.repeat(600e10,len(instructions))
instructions['anode_voltage'] = np.repeat(1e10,len(instructions))
return instructions
wfsim.strax_interface.rand_instructions = rand_instructions
wfsim.strax_interface.instruction_dtype = dtype
[6]:
st.set_config(dict(fax_file=None))
st.set_config(dict(nchunk=1, event_rate=1, chunk_size=100))
[10]:
# Remove any previously simulated data, if such exists
# !rm -r strax_data
records = st.get_array(run_id,'raw_records', progress_bar=False)
peaks = st.get_array(run_id, ['peak_basics'], progress_bar=False)
data = st.get_df(run_id, 'event_info', progress_bar=False)
truth = st.get_df(run_id, 'truth', progress_bar=False)
[11]:
truth.head()
[11]:
event_number | type | time | x | y | z | amp | recoil | electron_lifetime_liquid | anode_voltage | ... | n_photon_bottom | t_first_photon | t_last_photon | t_mean_photon | t_sigma_photon | t_first_electron | t_last_electron | t_mean_electron | t_sigma_electron | endtime | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0 | 1 | 500000026 | -27.127699 | -36.640770 | -12.655296 | 2009 | er | -2147483648 | -2147483648 | ... | 143.0 | 5.000000e+08 | 5.000004e+08 | 5.000001e+08 | 48.421194 | NaN | NaN | NaN | NaN | 500000353 |
1 | 0 | 2 | 500097027 | -27.127699 | -36.640770 | -12.655296 | 39 | er | -2147483648 | -2147483648 | ... | 344.0 | 5.000970e+08 | 5.000997e+08 | 5.000984e+08 | 490.645247 | 5.000970e+08 | 5.000993e+08 | 5.000982e+08 | 479.974221 | 500099672 |
2 | 0 | 4 | 500768320 | 20.171877 | 4.092766 | -89.123016 | 1 | er | -2147483648 | -2147483648 | ... | 10.0 | 5.007683e+08 | 5.007690e+08 | 5.007685e+08 | 167.279008 | 5.007683e+08 | 5.007683e+08 | 5.007683e+08 | 0.000000 | 500769026 |
3 | 0 | 1 | 1500000019 | -19.476303 | 10.349952 | -35.648590 | 2010 | er | -2147483648 | -2147483648 | ... | 192.0 | 1.500000e+09 | 1.500000e+09 | 1.500000e+09 | 41.286365 | NaN | NaN | NaN | NaN | 1500000227 |
4 | 0 | 2 | 1500265504 | -19.476303 | 10.349952 | -35.648590 | 2817 | er | -2147483648 | -2147483648 | ... | 26229.0 | 1.500266e+09 | 1.500273e+09 | 1.500269e+09 | 961.919333 | 1.500265e+09 | 1.500272e+09 | 1.500269e+09 | 955.008653 | 1500272834 |
5 rows × 22 columns
[ ]:
wfsim package
Submodules
wfsim.load_resource module
- class wfsim.load_resource.DummyMap(const, shape=())[source]
Bases:
object
Return constant results the length match the length of input but from the second dimensions the shape is user defined input
- class wfsim.load_resource.Resource(config=None)[source]
Bases:
object
- Get the configs needed for running WFSim. Configs can be obtained in
two ways: 1. Get it directly from the mongo database. This only needs the
name of the file.
- Load it with straxen get_resource, this can either:
Download from a public repository Read from local cache Download from a private repository if credentials are properly setup
- static config_to_file(config)[source]
Find and complete all file paths
returns dictionary mapping item name to path
- static get_file_path(base, fname)[source]
Find the full path to the resource file Try 4 methods in the following order 1. The base is not url, return base + name 2. If ntauxfiles (straxauxfiles) is installed, return will be package dir + name
pip install won’t work, try python setup.py in the packages
Download the latest version using straxen mongo downloader from database, return the cached file path + md5
Download using straxen get_resource from the url (github raw) simply return base + name
Be careful with the forth options, straxen creates cache files that might not be updated with the latest github commit.
wfsim.pax_interface module
- class wfsim.pax_interface.PaxEventSimulator(config=None)[source]
Bases:
object
Simulate wf from instruction and stored in wfsim.pax_datastructure.datastructure.Event mimicing pax.datastructure.Event Then pickled, compressed and saved mimicing pax raw data zips.
Call compute to start the simulation process.
- class WriteZipped(config)[source]
Bases:
object
- close_current_file()[source]
Closes the currently open file, if there is one. Also handles temporary file renaming.
- file_extension = 'zip'
wfsim.strax_interface module
- class wfsim.strax_interface.ChunkRawRecords(config, rawdata_generator=<class 'wfsim.core.rawdata.RawData'>, **kwargs)[source]
Bases:
object
- class wfsim.strax_interface.RawRecordsFromFax1T[source]
Bases:
RawRecordsFromFaxNT
- class wfsim.strax_interface.RawRecordsFromFaxNT[source]
Bases:
SimulatorPlugin
- class wfsim.strax_interface.RawRecordsFromFaxOpticalNT[source]
Bases:
RawRecordsFromFaxNT
- class wfsim.strax_interface.RawRecordsFromFaxnVeto[source]
Bases:
RawRecordsFromMcChain
- class wfsim.strax_interface.RawRecordsFromMcChain[source]
Bases:
SimulatorPlugin
- data_kind: Union[str, immutabledict, dict] = immutabledict({'raw_records': 'raw_records', 'raw_records_he': 'raw_records_he', 'raw_records_aqmon': 'raw_records_aqmon', 'raw_records_nv': 'raw_records_nv', 'truth': 'truth', 'truth_nv': 'truth_nv'})
- gain_model_nv
Dispatch on URL protocol.
unrecognized protocol returns identity inspired by dasks Dispatch and fsspec fs protocols.
- get_instructions()[source]
Run epix and save instructions as self.instructions_epix epix_config with all the epix run arguments is passed as a dictionary, where source_rate must be set to 0 (epix default), since time handling is done outside epix.
epix needs to be imported in here to avoid circle imports
- provides: tuple = ('raw_records', 'raw_records_he', 'raw_records_aqmon', 'raw_records_nv', 'truth', 'truth_nv')
- set_timing()[source]
Set timing information in such a way to synchronize instructions for the TPC and nVeto
- takes_config = immutabledict({'gain_model_mc': <straxen.url_config.URLConfig object>, 'detector': <strax.config.Option object>, 'event_rate': <strax.config.Option object>, 'chunk_size': <strax.config.Option object>, 'n_chunk': <strax.config.Option object>, 'per_pmt_truth': <strax.config.Option object>, 'fax_file': <strax.config.Option object>, 'fax_config': <strax.config.Option object>, 'fax_config_override': <strax.config.Option object>, 'fax_config_override_from_cmt': <strax.config.Option object>, 'channel_map': <strax.config.Option object>, 'n_tpc_pmts': <strax.config.Option object>, 'n_top_pmts': <strax.config.Option object>, 'right_raw_extension': <strax.config.Option object>, 'seed': <strax.config.Option object>, 'gain_model_nv': <straxen.url_config.URLConfig object>, 'epix_config': <strax.config.Option object>, 'entry_start': <strax.config.Option object>, 'entry_stop': <strax.config.Option object>, 'fax_config_nveto': <strax.config.Option object>, 'fax_config_override_nveto': <strax.config.Option object>, 'targets': <strax.config.Option object>})
- class wfsim.strax_interface.RawRecordsFromMcChain1T[source]
Bases:
RawRecordsFromMcChain
- wfsim.strax_interface.extra_truth_dtype_per_pmt(n_pmt: bool | int) List[tuple] [source]
Generate the extra - truth dtype output
Return total/bottom seperation from truth_extra_dtype if n_pmt is False
- Parameters:
n_pmt – Number of PMTs, when false (no PMTs, use total/bottom separation)
- Returns:
dtype list
wfsim.units module
Define unit system for pax (i.e., seconds, etc.) This sets up variables for the various unit abbreviations, ensuring we always have a ‘consistent’ unit system. There are almost no cases that you should change this without talking with a maintainer.
wfsim.utils module
- wfsim.utils.find_intervals_below_threshold(w, threshold, holdoff, result_buffer)[source]
Fills result_buffer with l, r bounds of intervals in w < threshold. :param w: Waveform to do hitfinding in :param threshold: Threshold for including an interval :param holdoff: Holdoff number of samples after the pulse return back down to threshold :param result_buffer: numpy N*2 array of ints, will be filled by function.
if more than N intervals are found, none past the first N will be processed.
:returns : number of intervals processed Boundary indices are inclusive, i.e. the right boundary is the last index which was < threshold
- wfsim.utils.optical_adjustment(instructions, timings, channels)[source]
Helper function to process optical instructions so that for each entry 1) Move the instruction timing to the first photon in the entry and move photon timings 2) Split photon timing into maximum interval of PULSE_MAX_DURATION default 1us
The split photon are put into new instruction append at the end of the instructions