Welcome to wfsim’s documentation!

Github page: https://github.com/XENONnT/WFSim

WFSim is the wavefrom simulation framework for XENONnT, built on top of the straxen framework.

Installation

See the installation guide

Installing wfsim

  1. To get started for simulating XENONnT please follow the straxen setup guide. Simulating XENON1T works out of the box.

  2. On top of straxen, install wfsim:

pip install wfsim

Developer setup

  1. To get started for simulating XENONnT please follow the straxen setup guide.

  2. Install wfsim in editable mode from source:

git clone https://github.com/XENONnT/WFSim wfsim
pip install -e wfsim

Getting started with WFSim

Hello friend. Welcome to the basic tutorial on how to simulate waveforms with the latest wfsim version in strax. Here we’ll just demonstrate the basic functionality. For more in depth analysis stuff, checkout the straxen tutorials for more detailed thing.

[1]:
import numpy as np
import strax
import straxen
import wfsim

import matplotlib.pyplot as plt
from matplotlib.colors import LogNorm
from multihist import Histdd, Hist1d
from scipy import stats

import json
*** Detector definition message ***
You are currently using the default XENON10 template detector.

[79]:
straxen.print_versions(modules=('strax', 'straxen', 'cutax', 'wfsim', 'ntauxfiles'))
ntauxfiles is not installed
[79]:
module version path git
0 python 3.9.17 /opt/XENONnT/anaconda/envs/XENONnT_2023.07.1/b... None
1 strax 1.5.2 /opt/XENONnT/anaconda/envs/XENONnT_2023.07.1/l... None
2 straxen 2.1.1 /opt/XENONnT/anaconda/envs/XENONnT_2023.07.1/l... None
3 cutax 1.15.1 /home/shenyangshi/cutax/cutax branch:bdt_ms | 277f174
4 wfsim 1.0.2 /opt/XENONnT/anaconda/envs/XENONnT_2023.07.1/l... None

Different simulators

Because we’d like to make simulations we have multiple different ways of making data. They are: * RawRecordsFromFaxNT(1T) * RawRecordsFromFaxEpix * RawRecordsFromFaxOptical * RawRecordsFromFaxNveto

The main difference is the way you specify instructions. The NT(1T) simulator depends on either a csv file or a default/user specified function to generate instructions. Epix takes input from G4 processed with epix. Optical takes optical input from G4 for photon channels and timings. And Nveto is a small modification to optical to provide nveto data

Setting everything up

First we need to define the right context. For this we made two contexts to help you out. One for 1T and one for nT. By default in your context RawRecordsFromFaxNT will be registered. If you want another one be sure to register it

For 1T do:

[2]:
st = straxen.contexts.xenon1t_simulation()

Or nT do:

[5]:
st = straxen.contexts.xenonnt_simulation(cmt_run_id_sim='00001')
/opt/XENONnT/anaconda/envs/XENONnT_2023.07.1/lib/python3.9/site-packages/straxen/url_config.py:711: UserWarning: From straxen version 2.1.0 onward, URLConfig parameterswill be sorted alphabetically before being passed to the plugins, this will change the lineage hash for non-sorted URLs. To load data processed with non-sorted URLs, you will need to use an older version.
  warnings.warn("From straxen version 2.1.0 onward, URLConfig parameters"

Now we need to define a run id. What you give it doesn’t really matter, since strax will look for data and make new if it doesn’t find anything. And this is what you want. Strax will use the run id to get the electron lifetime and pmt gains from a database, and returns placeholders if the run doesn’t exist.

[6]:
run_id = '1'

Defining instructions

For the instructions there are multiple different ways to do it. The simulator has this option called “fax_file”. If it has a value (None by default) the simulator will either read it as a csv or root file. If not it will use some predefined functions to make your events. The number of event you’ll simulate based on the product of the config values “nchunk”,”event_rate” and “chunk_size”. Which you can set as follows:

[7]:
st.set_config(dict(nchunk=1, event_rate=5, chunk_size=10,))

Strax groups data together in chunks based on time (for low level data). nchunk is the number of chunks you want to simulate event_rate is the number of events per seconds, so this effects the amount of spacing between events. Finally chunk_size is the length in seconds of your chunk For the DAQ this is 5 seconds. For simulations you can do whatever you want. It is important to note that Strax will write out data per chunk So when your chunks are small you’ll, among other things, call Strax’ IO functions a lot, giving a substantial overhead. On the other hand, to large chunks will hog all your memory and your kernel might crash. Based on my experimentation setting chunk_size to 500 gives best performance

These are ways you can give instructions ### Random This is the default where simulator will generate some random instructions for you. ### Custom For this you will need to overwrite the instruction generator function ### CSV You can provide a csv file with the instruction (Like the output of nSort) ### Geant4 For this you’ll need to use epix to do some clustering for you.

Random

I guess this is pretty self explanatory. The simulator has this function called “rand_instructions” which will make something up for you.

[8]:
wfsim.strax_interface.rand_instructions??
Signature: wfsim.strax_interface.rand_instructions(c)
Source:
def rand_instructions(c):
    """Random instruction generator function. This will be called by wfsim if you do not specify 
    specific instructions.
    :params c: wfsim configuration dict"""
    log.warning('rand_instructions is deprecated, please use wfsim.random_instructions')
    if 'drift_field' not in c:
        log.warning('drift field not specified!')
    # Do the conversion for now, don't break everything all at once
    return _rand_instructions(event_rate=c.get('event_rate', 10),
                              chunk_size=c.get('chunk_size', 5),
                              n_chunk=c.get('n_chunk', 2),
                              energy_range=[1, 100],  # keV
                              drift_field=c.get('drift_field', 100),  # V/cm
                              tpc_radius=c.get('tpc_radius', straxen.tpc_r),
                              tpc_length=c.get('tpc_length', straxen.tpc_z),
                              nest_inst_types=[7]
                              )
File:      /opt/XENONnT/anaconda/envs/XENONnT_2023.07.1/lib/python3.9/site-packages/wfsim/strax_interface.py
Type:      function
Custom

This is some more fun. To do this we’ll write a new function which returns a structured numpy array with the correct dtype.

[9]:
wfsim.instruction_dtype
[9]:
[(('Waveform simulator event number.', 'event_number'), numpy.int32),
 (('Quanta type (S1 photons or S2 electrons)', 'type'), numpy.int8),
 (('Time of the interaction [ns]', 'time'), numpy.int64),
 (('X position of the cluster [cm]', 'x'), numpy.float32),
 (('Y position of the cluster [cm]', 'y'), numpy.float32),
 (('Z position of the cluster [cm]', 'z'), numpy.float32),
 (('Number of quanta', 'amp'), numpy.int32),
 (('Recoil type of interaction.', 'recoil'), numpy.int8),
 (('Energy deposit of interaction', 'e_dep'), numpy.float32),
 (('Eventid like in geant4 output rootfile', 'g4id'), numpy.int32),
 (('Volume id giving the detector subvolume', 'vol_id'), numpy.int32),
 (('Local field [ V / cm ]', 'local_field'), numpy.float64),
 (('Number of excitons', 'n_excitons'), numpy.int32),
 (('X position of the primary particle [cm]', 'x_pri'), numpy.float32),
 (('Y position of the primary particle [cm]', 'y_pri'), numpy.float32),
 (('Z position of the primary particle [cm]', 'z_pri'), numpy.float32)]

Event number is just a lable which peaks are together. type is either 1(S1) or 2 (S2). In the truth higher numbers are also used to refer to different types of afterpulses. time,x,y,z are the time and positions of the signal. Amp is the number of photons or electrons generated, and recoil can be used for different types of recoil (but only Electronic recoil is supported). Recoil is a number according to the NEST convention which will be used to indicate which interaction. For the key look here: https://github.com/XENONnT/WFSim/blob/2c614b0f7b0d7c7adc516f6188e281857e8d7e22/wfsim/core.py#L22

Now, lets say we want some krypton peaks. For this we’ll need to change the default instruction function to include this double decay and use Nestpy to convert energy deposits into a number of photons and electrons.

In this case I’ll use 1 “event” per full decay, that where all the 4’s are coming from.

[80]:
def super_awesome_custom_instruction(c):
    import nestpy
    half_life = 156.94e-9 #Kr intermediate state half-life in ns
    decay_energies = [32.2,9.4] # Decay energies in kev

    n = c['nevents'] = c['event_rate'] * c['chunk_size']
    c['total_time'] = c['chunk_size']

    instructions = np.zeros(4 * n, dtype=wfsim.instruction_dtype)
    instructions['event_number'] = np.repeat(np.arange(n), 4)

    instructions['type'] = np.tile([1, 2], 2 * n)
    instructions['recoil'] = [7 for i in range(4 * n)]

    r = np.sqrt(np.random.uniform(0, 2500, n))
    t = np.random.uniform(-np.pi, np.pi, n)
    instructions['x'] = np.repeat(r * np.cos(t), 4)
    instructions['y'] = np.repeat(r * np.sin(t), 4)
    instructions['z'] = np.repeat(np.random.uniform(-100, 0, n), 4)

    #To get the correct times we'll need to include the 156.94 ns half life of the intermediate state.

    uniform_times = c['total_time'] * (np.arange(n) + 0.5) / n
    delayed_times = uniform_times + np.random.exponential(half_life/np.log(2),len(uniform_times))
    instructions['time'] = np.repeat(list(zip(uniform_times,delayed_times)),2) * 1e9


    # Here we'll define our XENON-like detector
    nc = nestpy.NESTcalc(nestpy.VDetector())
    A = 131.293
    Z = 54.
    density = 2.862  # g/cm^3   #SR1 Value
    drift_field = 82  # V/cm    #SR1 Value
    interaction = nestpy.INTERACTION_TYPE(7)

    energy = np.tile(decay_energies,n)
    quanta = []
    exciton = []
    for en in energy:
        y = nc.GetYields(interaction,
                         en,
                         density,
                         drift_field,
                         A,
                         Z,
                         (1, 1))
        q = nc.GetQuanta(y, density)
        quanta.append(q.photons)
        quanta.append(q.electrons)
        exciton.append(q.excitons)
        exciton.append(0)

    instructions['amp'] = quanta
    instructions['local_field'] = drift_field
    instructions['n_excitons'] = exciton

    return instructions

Now here comes the magic line:

[81]:
wfsim.strax_interface.rand_instructions = super_awesome_custom_instruction

This changes the default rand_instruction function in our own super awesome function. So when the simulator will call rand_instruction the code from super_awesome_custom_instruction will be executed

CSV

The format for csv files is the same as the instructions dtype. So on every line specify event_number,type,time ,x,y,z, amp and recoil in that order. Then tell the simulator it exists:

[82]:
st.set_config(dict(fax_file='instructions.csv'))

Ofcourse if you do not have this file it will not work

Geant4

For starters you’ll need to register the Epix plugin. This requires an epix configuration (it’s on private_nt_aux_files), and if you want a start and stop event.

[83]:
st.register(wfsim.strax_interface.RawRecordsFromMcChain)
st.set_config(dict(fax_file    = 'path_to_g4_file',
                   epix_config = 'path_to_config'))
Optical

Similar to above, register correct plugin and file:

[84]:
st.register(wfsim.strax_interface.RawRecordsFromMcChain)
st.set_config(dict(fax_file = 'path_to_g4_file',))
nVeto

Because the nveto is a different system it has a slightly different configuration. This is not entirely finished. Once it is you can run like so:

[85]:
st.register(wfsim.strax_interface.RawRecordsFromFaxnVeto)
st.set_config(dict(fax_config_nveto = 'path_to_nveto_config_file',
                  fax_file          = 'path_to_g4_file',))

Configuration customization

The simulator using a larger large amount of configuration settings to do it’s magic. Some of them are best left along, like pmt_circuit_load_resistor. Others on the other hand are things you might want change a bit to see how the data will change. Unfortunately currently the full list is spread out over two different places. One is the fax config json which is on github. The other is the option list in strax. Besides those things like pattern maps are hardcoded in load_resource.py. These files are overrideable with the option “fax_config_override”

The strax config is viewable like this and can be changed by st.set_config(dict(option you want=value you want))

[86]:
st.show_config('raw_records')
[86]:
option default current applies_to help
0 gain_model_mc cmt://to_pe_model?version=ONLINE&run_id=plugin... cmt://to_pe_model?run_id=00001&version=ONLINE (raw_records, raw_records_he, raw_records_aqmo... PMT gain model. Specify as (model_type, model_...
1 detector XENONnT XENONnT (raw_records, raw_records_he, raw_records_aqmo...
2 event_rate 1000 5 (raw_records, raw_records_he, raw_records_aqmo... Average number of events per second
3 chunk_size 100 1 (raw_records, raw_records_he, raw_records_aqmo... Duration of each chunk in seconds
4 n_chunk 10 <OMITTED> (raw_records, raw_records_he, raw_records_aqmo... Number of chunks to simulate
5 per_pmt_truth False <OMITTED> (raw_records, raw_records_he, raw_records_aqmo... Store the info per channel in the truth file
6 fax_file None path_to_g4_file (raw_records, raw_records_he, raw_records_aqmo... Directory with fax instructions
7 fax_config fax_config_nt_design.json fax_config_nt_design.json (raw_records, raw_records_he, raw_records_aqmo...
8 fax_config_override None <OMITTED> (raw_records, raw_records_he, raw_records_aqmo... Dictionary with configuration option overrides
9 fax_config_override_from_cmt None {'drift_time_gate': ('cmt_run_id', '00001', 'e... (raw_records, raw_records_he, raw_records_aqmo... Dictionary of fax parameter names (key) mapped...
10 channel_map <OMITTED> (tpc, he, aqmon, aqmon_nv, tpc_blank, mv, aux_... (raw_records, raw_records_he, raw_records_aqmo... immutabledict mapping subdetector to (min, max...
11 n_tpc_pmts <OMITTED> 494 (raw_records, raw_records_he, raw_records_aqmo... Number of pmts in tpc. Provided by context
12 n_top_pmts <OMITTED> 253 (raw_records, raw_records_he, raw_records_aqmo... Number of pmts in top array. Provided by context
13 right_raw_extension 100000 <OMITTED> (raw_records, raw_records_he, raw_records_aqmo...
14 seed False <OMITTED> (raw_records, raw_records_he, raw_records_aqmo... Option for setting the seed of the random numb...
15 gain_model_nv <OMITTED> cmt://to_pe_model_nv?run_id=00001&version=ONLINE (raw_records, raw_records_he, raw_records_aqmo... nveto gain model, provided by context
16 epix_config {} path_to_config (raw_records, raw_records_he, raw_records_aqmo... Dict with epix configuration
17 entry_start 0 <OMITTED> (raw_records, raw_records_he, raw_records_aqmo...
18 entry_stop None <OMITTED> (raw_records, raw_records_he, raw_records_aqmo... G4 id event number to stop at. If -1 process t...
19 fax_config_nveto None path_to_nveto_config_file (raw_records, raw_records_he, raw_records_aqmo...
20 fax_config_override_nveto None <OMITTED> (raw_records, raw_records_he, raw_records_aqmo... Dictionary with configuration option overrides
21 targets (tpc,) <OMITTED> (raw_records, raw_records_he, raw_records_aqmo... tuple with what data to simulate (tpc, nveto o...

With the following block you can visualized which fax config you are using. Remeber that the fax config files are stored in private_nt_aux_files github repository, in the sim_files folder.

[87]:
st.config['fax_config']
[87]:
'fax_config_nt_design.json'

The config from github can be loaded as:

[88]:
straxen.get_resource('fax_config_nt_design.json',fmt='json').keys()
[88]:
dict_keys(['s1_model_type', 's2_time_model', 's2_luminescence_model', 's2_garfield_confine_position', 'se_gain_from_map', 'ext_eff_from_map', 'enable_gas_gap_warping', 'enable_pmt_afterpulses', 'enable_electron_afterpulses', 'enable_noise', 'field_distortion_on', 'field_distortion_model', 'enable_field_dependencies', 'url_base', 'photon_area_distribution', 's1_pattern_map', 's2_pattern_map', 's1_lce_correction_map', 's2_correction_map', 'se_gain_map', 's1_time_spline', 's2_time_spline', 'field_distortion_comsol_map', 'photon_ap_cdfs', 'noise_file', 's2_luminescence_gg', 's2_luminescence', 'gas_gap_map', 'garfield_gas_gap_map', 'field_dependencies_map', 'temperature', 'pressure', 'lxe_dielectric_constant', 'tpc_length', 'tpc_radius', 'anode_wire_radius', 'anode_field_domination_distance', 'elr_gas_gap_length', 'gate_to_anode_distance', 'drift_field', 'anode_voltage', 'diffusion_constant_longitudinal', 'diffusion_constant_transverse', 'drift_time_gate', 'drift_velocity_liquid', 'singlet_fraction_gas', 'singlet_lifetime_gas', 'singlet_lifetime_liquid', 'triplet_lifetime_gas', 'triplet_lifetime_liquid', 's1_ER_alpha_singlet_fraction', 's1_ER_primary_singlet_fraction', 's1_ER_recombination_fraction', 's1_ER_recombination_time', 's1_ER_secondary_singlet_fraction', 's1_NR_singlet_fraction', 'maximum_recombination_time', 's1_decay_spread', 's1_decay_time', 's1_detection_efficiency', 's2_mean_area_fraction_top', 's2_aft_sigma', 's2_aft_skewness', 's2_secondary_sc_gain', 's2_time_spread', 'electron_extraction_yield', 'electron_lifetime_liquid', 'electron_trapping_time', 'gas_drift_velocity_slope', 'g2_mean', 'p_double_pe_emision', 'pe_pulse_ts', 'pe_pulse_ys', 'pmt_ap_modifier', 'pmt_ap_t_modifier', 'pmt_pulse_time_rounding', 'pmt_transit_time_mean', 'pmt_transit_time_spread', 'photoelectric_modifier', 'photoelectric_p', 'photoelectric_t_center', 'photoelectric_t_spread', 'photoionization_modifier', 'sample_duration', 'samples_after_pulse_center', 'samples_before_pulse_center', 'samples_to_store_after', 'samples_to_store_before', 'pmt_circuit_load_resistor', 'external_amplification', 'high_energy_deamplification_factor', 'trigger_window', 'digitizer_bits', 'digitizer_reference_baseline', 'digitizer_voltage_range', 'special_thresholds', 'zle_threshold'])

Changing things in this guy goes slightly different. In the strax option list there is the option called “fax_config_override”. This takes a dict which will be used to override any values in the json config. So changing the ‘s2_secondary_sc_gain’ is done as:

[89]:
st.set_config(dict(fax_config_override = dict(s2_secondary_sc_gain=23)))

Things you might want to change

There are 4 main things you might be interested in changing. They are: * Noise (True/False) * PMTAfterpulses (True/False) * Electron Afterpulses (True/False) * S2 luminescence (Simple or garfield)

They are controled by config settings you can change as so:

[90]:
st.set_config(dict(fax_config_override=dict(
            s2_luminescence_model='simple',
            enable_noise=False,
            enable_pmt_afterpulses=False,
            enable_electron_afterpulses=False,)))

What actually happens?

What happens behind the scenes is that the instructions are first grouped together in chunks. Then we loop over the instructions and the full chunk is returned before starting with the next one.

We use a S1 and S2 class to calculate the arrival times of the photons and the channels which have been hit. Then we’ll hand them over to the Pulse class to calculate the currents in the channels. Finally the currents go to a RawData class where we fake the digitizer response.

S1

For S1s we start with calculating the light yield based on the position of the interaction, and draw the number of photons seen from a Poisson distribution.

Second we calculate the arrival times of the photons. This is based on the scintillation of the xenon atoms. It is dependend on the recombination time, the singlet and triplet fractions.

Finally the channels are calculated. Based on the pattern map we use a interpolation map to get a probability distribution for channels to be hit for a S1 signal based on the position of the interaction, and then we draw from this distribution for every photon.

S2

S2s are slightly more complicated. First we need to drift the electrons up, and while doing so we’ll lose some of them. To get the photon timings, we first need to get the arrival times of the electrons at the gas interface based on a diffusion model. Then we can calculate the photon timings based on a luminescence model for every individual electron. And for the channels we do the same trick with the interpolating map.

Pulse

When we have our lists of channels and timing we can generate actual pulses. First we add a pmt transition time. Then we loop over all channels, calculate the double pe emission probabilities, and add a current in the pmt channel based on the arrival time. This is all stored in a big dictionary. Afterwards this is passed to our fake digitizers which then returns you with your very own pretty data

Getting down to bussiness

Now we have access to all the normal strax data types, and another one called ‘truth’ which holds the simulation instructions. Calling it follows the normal strax convention.

[91]:
st = straxen.contexts.xenonnt_simulation(cmt_run_id_sim='00001')
st.set_config(dict(event_rate=5, chunk_size=1, fax_file=None))
[92]:
# Remove any previously simulated data, if such exists
# !rm -r strax_data

records = st.get_array(run_id,'records')
# peaks = st.get_array(run_id, ['peaks','peak_basics'])
# data = st.get_df(run_id, 'event_info')

truth = st.get_df(run_id, 'truth')

Now it is time to make pretty plots and see if what we makes actually makes any sense

[93]:
peak_basics = st.get_df(run_id, 'peak_basics')
[94]:
peak_basics.head(10)
[94]:
time endtime center_time area n_hits n_channels max_pmt max_pmt_area n_saturated_channels range_50p_area range_90p_area area_fraction_top length dt rise_time tight_coincidence type max_diff min_diff
0 100000030 100000670 100000205 45.194511 14 13 345 5.892519 0 198.728180 327.420563 0.138998 64 10 94.678757 7 2 80 10
1 100622470 100626210 100624258 143.898376 41 17 50 45.115543 0 1164.394287 2533.276855 0.875330 187 20 930.115356 7 2 500 0
2 300000020 300000410 300000128 22.863621 7 6 483 5.560007 0 73.276711 153.896225 0.276379 39 10 53.755302 5 1 40 10
3 300724040 300724470 300724142 7.752567 3 3 461 3.555671 0 92.052193 296.905457 0.000000 43 10 25.530293 2 1 180 10
4 500000050 500000560 500000248 51.385868 11 11 408 20.320269 0 166.012741 310.153564 0.129670 51 10 142.900955 2 2 90 0
5 500589980 500593580 500591982 37.373817 15 10 241 8.749174 0 1940.112183 3275.072021 0.688202 180 20 2096.490967 3 2 -1 -1
6 700000040 700000610 700000139 15.011408 4 4 324 4.820670 0 57.526924 345.084930 0.000000 57 10 49.048229 3 1 270 20
7 700699840 700706240 700703462 102.122833 25 14 129 33.482544 0 3729.012451 5596.686035 0.880986 160 40 2050.706543 4 2 -1 -1
8 900000050 900000720 900000210 28.657455 8 8 346 7.821901 0 151.675720 443.599609 0.229542 67 10 58.658257 5 1 290 0
9 900703540 900707830 900705461 99.501129 24 16 64 27.903099 0 1144.565796 3354.698730 0.942021 143 30 838.284546 2 2 610 10

Matching

To do matching with the truth the easiest way is to write a new strax plugin where you loop over peaks and get the truth arrays where the mean arrival time of the photons are within the time window of the peak So that will look something like this:

[95]:
class MatchedPeaks(strax.LoopPlugin):
    depends_on = ('peak_basics', 'truth')
    provides = 'matched_peaks'
    __version__ = '0.0.2'

    dtype = [('time', np.int),
             ('endtime', np.int),
             ('area', np.int),
             ('n_photon', np.int)]

    def compute(self, peaks, truth):
        result = np.zeros(len(truth), self.dtype)

        temp = strax.touching_windows(peaks, truth)
        idx_truth = []
        idx_peak = []
        for ix, (l, r) in enumerate(temp):
            res = result[ix]
            res['time'] = truth['time'][ix]
            res['endtime'] = strax.endtime(truth)[ix]
            res['n_photon'] = truth['n_photon'][ix]

            if r > l:
                i_peak = l+np.argmax(peaks['area'][l:r])
                res['area'] = peaks['area'][i_peak]

        return result
/tmp/jobs/29284557/ipykernel_277/3800134942.py:6: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = [('time', np.int),
/tmp/jobs/29284557/ipykernel_277/3800134942.py:7: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  ('endtime', np.int),
/tmp/jobs/29284557/ipykernel_277/3800134942.py:8: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  ('area', np.int),
/tmp/jobs/29284557/ipykernel_277/3800134942.py:9: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  ('n_photon', np.int)]

Of course this doesn’t actually work. An electron afterpulse can be very spread out leading it to be interpreted as multiple peaks while coming from 1 instruction falling outside of the specified range. It would be very much appreciated if someone wants make a more sturdy selection criteria :)

For example, checkout out peak matching algorithm for WFSim: https://github.com/XENONnT/pema

[96]:
st.register(MatchedPeaks)
[96]:
__main__.MatchedPeaks
[97]:
st.get_df(run_id, 'matched_peaks').head(10)
[97]:
time endtime area n_photon
0 100000029 100000630 45 251
1 100000333 100000791 45 69
2 100620431 100627474 143 856
3 100622911 100627768 143 211
4 300000006 300000565 22 255
5 300000067 300000556 22 69
6 300723015 300728784 7 186
7 300723660 300727606 7 128
8 500000003 500000619 51 235
9 500000111 500000642 51 49

Externally call functions

The s1/s2 functions used for generating photon timings and channels are now doocumented and callable from the outside. This you can use if you want to do something like compare photon arrival timings between wfsims results and g4

[98]:
st = straxen.contexts.xenonnt_simulation(cmt_run_id_sim='00001')
Object `wfsim.core.S1.get_n_photons` not found.
[119]:
config = straxen.get_resource('fax_config_nt_sr0_v4.json', fmt='json')
config.update({'detector':'XENONnT', 'right_raw_extension':50000})
st.config['gains'] = [1 for _ in range(straxen.n_tpc_pmts)]
config.update(st.config)
[122]:
s2 = wfsim.core.s2.S2(config)
WARNING:wfsim.resource:Loading XENONnT_SR0_spe_distributions_20210713_no_noise_scaled.csv from mongo downloader to /dali/lgrandi/strax/resource_cache/99c2cbc580cfa8eeebe831456076e136
WARNING:wfsim.resource:Loading XENONnT_s1_xyz_patterns_LCE_MCvf051911_wires.pkl from mongo downloader to /dali/lgrandi/strax/resource_cache/02ed2634596793f6e7aa13783745089a
WARNING:wfsim.resource:A file has value False, assuming this is intentional.
WARNING:wfsim.resource:Loading XENONnT_s2_xy_patterns_GXe_LCE_corrected_qes_MCv4.3.0_wires.pkl from mongo downloader to /dali/lgrandi/strax/resource_cache/6639b01b73d4fdbb703174ad424801f5
WARNING:wfsim.resource:Loading XENONnT_s2_xy_map_v4_210503_mlp_3_in_1_iterated.json from mongo downloader to /dali/lgrandi/strax/resource_cache/ca348171778ae5cdb5de6c1f5814f156
WARNING:wfsim.resource:Loading XENONnT_se_xy_map_v1_mlp.json from mongo downloader to /dali/lgrandi/strax/resource_cache/4ed37f1d06853503899ac37a0a2237ba
WARNING:wfsim.resource:Loading XENONnT_pmt_afterpulse_config_018435.json.gz from mongo downloader to /dali/lgrandi/strax/resource_cache/a38b18cd61ca67a065a3a32d9b57c2b2
WARNING:wfsim.resource:Loading XENONnT_GARFIELD_SR0_B2d75n_C2d75n_G0d3p_A4d9p_T0d9n_PMTs1d3n_FSR0d65p.npz from mongo downloader to /dali/lgrandi/strax/resource_cache/42d87bc750ceb2f3c0491a586e016341
WARNING:wfsim.resource:Loading garfield_timing_map_gas_gap_sr0.npy from mongo downloader to /dali/lgrandi/strax/resource_cache/803c5f08cb89063a9671ac8da5cb7e75
WARNING:wfsim.resource:Loading gas_gap_warping_map_January_2021.pkl from mongo downloader to /dali/lgrandi/strax/resource_cache/d37b34dda3e865ef607b0bee2057772c
WARNING:wfsim.resource:Loading garfield_gas_gap_map_sr0.json from mongo downloader to /dali/lgrandi/strax/resource_cache/d39d7eda03373635c60e209067470dc8
WARNING:wfsim.resource:Loading x1t_se_afterpulse_delaytime.pkl.gz from mongo downloader to /dali/lgrandi/strax/resource_cache/014aa048758a51f154cdc3b32dac2773
WARNING:wfsim.resource:Loading XENONnT_noise_tpc_only_2ms_25118.npz from mongo downloader to /dali/lgrandi/strax/resource_cache/e560759e28f74d5c049425c25e5ae0d5
WARNING:wfsim.resource:Loading XnT_3D_FDC_xyt_dummy_all_zeros_v0.1.json.gz from mongo downloader to /dali/lgrandi/strax/resource_cache/55b30b2d26d151ae8e2aa55ffe6693ad
WARNING:wfsim.resource:Loading field_dependent_radius_depth_maps_B2d75n_C2d75n_G0d3p_A4d9p_T0d9n_PMTs1d3n_FSR0d65p_QPTFE_0d5n_0d4p.json.gz from mongo downloader to /dali/lgrandi/strax/resource_cache/487272e33b4c832ae715350b6275ba0b
WARNING:wfsim.resource:Loading data_driven_diffusion_map_XENONnTSR0V2.json.gz from mongo downloader to /dali/lgrandi/strax/resource_cache/8a6c6dc1bfe5380f48e96462161a3313
WARNING:wfsim.resource:Loading XENONnT_s1_proponly_pc_reflection_optPhot_perPMT_S1_local_20220510.json.gz from mongo downloader to /dali/lgrandi/strax/resource_cache/91f4a6162d5e335c4416165349222061
WARNING:wfsim.resource:Loading XENONnT_s2_opticalprop_time_v0.json.gz from mongo downloader to /dali/lgrandi/strax/resource_cache/1d257e215e00873e37ae49f16a04d717
WARNING:wfsim.resource:Using json for unspecified XENONnT_se_xy_map_v1_mlp.json
/opt/XENONnT/anaconda/envs/XENONnT_2023.07.1/lib/python3.9/site-packages/wfsim/load_resource.py:262: RuntimeWarning: invalid value encountered in true_divide
  orig_aft_=np.mean((s2map_topeff_/s2map_toteff_)[s2map_toteff_>0.0])
WARNING:wfsim.resource:Using json.gz for unspecified XENONnT_s2_opticalprop_time_v0.json.gz
WARNING:wfsim.resource:Using noise data XENONnT_noise_tpc_only_2ms_25118.npz with 494 channels for XENONnT
[124]:
s2.electron_timings?
Signature:
s2.electron_timings(
    t,
    n_electron,
    drift_time_mean,
    drift_time_spread,
    sc_gain,
    timings,
    gains,
    electron_trapping_time,
)
Call signature:  s2.electron_timings(*args, **kwargs)
Type:            CPUDispatcher
String form:     CPUDispatcher(<function S2.electron_timings at 0x7f8d3dae25e0>)
File:            /opt/XENONnT/anaconda/envs/XENONnT_2023.07.1/lib/python3.9/site-packages/wfsim/core/s2.py
Docstring:
Calculate arrival times of the electrons. Data is written to the timings and gains arrays
:param t: 1d array of ints
:param n_electron:1 d array of ints
:param drift_time_mean: 1d array of floats
:param drift_time_spread: 1d array of floats
:param sc_gain: secondary scintillation gain
:param timings: empty array with length sum(n_electron)
:param gains: empty array with length sum(n_electron)
:param electron_trapping_time: configuration values
Class docstring:
Implementation of user-facing dispatcher objects (i.e. created using
the @jit decorator).
This is an abstract base class. Subclasses should define the targetdescr
class attribute.
Init docstring:
Parameters
----------
py_func: function object to be compiled
locals: dict, optional
    Mapping of local variable names to Numba types.  Used to override
    the types deduced by the type inference engine.
targetoptions: dict, optional
    Target-specific config options.
impl_kind: str
    Select the compiler mode for `@jit` and `@generated_jit`
pipeline_class: type numba.compiler.CompilerBase
    The compiler pipeline type.
[ ]:

Understanding Waveform Simulation for XENONnT

Nov 30

[1]:
import strax, straxen, wfsim
import numpy as np
import pandas as pd
*** Detector definition message ***
You are currently using the default XENON10 template detector.

[2]:
straxen.print_versions(modules=('strax', 'straxen', 'cutax', 'wfsim', 'ntauxfiles'))
[2]:
module version path git
0 python 3.9.17 /opt/XENONnT/anaconda/envs/XENONnT_2023.07.1/b... None
1 strax 1.5.2 /opt/XENONnT/anaconda/envs/XENONnT_2023.07.1/l... None
2 straxen 2.1.1 /opt/XENONnT/anaconda/envs/XENONnT_2023.07.1/l... None
3 cutax 1.15.1 /home/shenyangshi/cutax/cutax branch:bdt_ms | 277f174
4 wfsim 1.0.2 /opt/XENONnT/anaconda/envs/XENONnT_2023.07.1/l... None
5 ntauxfiles 0.3.0 /project2/lgrandi/shenyangshi/home/private_nt_... branch:AmBe_ms_fit | 3555246
[4]:
config = straxen.get_resource('fax_config_nt_sr0_dev.json', fmt='json')
config.update({'detector':'XENONnT', 'right_raw_extension':50000})

Part 1, Simulation flow

Conceptual Overview / Between Interface and Core / Pulse Classes / Workflow Walkthrough

Conceptual Overview

What it’s for?

Imagine a monoenergetic source like Kr83m, producing photons and electrons uniformally in the detector, what would the events look like, can the processing software correctly reconstruct them? Or Would you like some fake events in the blinded regions?

The signals from XENON experiments have been extensively studied, so that we can mostly model them from the bottom up approach. The WFSim use those models do construct waveforms as realistic as possible while keeping track of the inputs.

Let’s break down the WFSim bit by bit.

How it works

The WFSim from the outside works like a python iterator, and is composed of four levels of iterators, where the deepest are Pulse classes (those are not exactly iterator) taking instruction groups and return a list of pseudo pulses. The RawData take the pseudo pulses and yield digitized pulses, similar to what physical digitizers would do. The ChunkRawRecords and RawRecordsFromFaxNT takes care of chunking and interface with strax respectively.

a443cec3599f489bb32e99374350e4da

Instruction groups

However, it is not exactly iterating over instructions, the instructions are just one of the arguments for __init__. It is designed to turn instruction like the one below into the lowest input structure of the processing software, chunk containing raw_records(event containing pulses).

[ ]:

[5]:
instructions = inst_array = wfsim.strax_interface.rand_instructions(c={'event_rate':3, 'chunk_size':1, 'nchunk':1, **config})
inst_df = pd.DataFrame(inst_array)
inst_df
WARNING:wfsim.interface:rand_instructions is deprecated, please use wfsim.random_instructions
WARNING:wfsim.interface:g4id is not (fully) filled
WARNING:wfsim.interface:vol_id is not (fully) filled
[5]:
event_number type time x y z amp recoil e_dep g4id vol_id local_field n_excitons x_pri y_pri z_pri
0 0 1 166666666 32.122543 -22.621212 -43.662350 818 7 13.101433 -1 -1 22.92 113 32.122543 -22.621212 -43.662350
1 0 2 166666666 32.122543 -22.621212 -43.662350 145 7 13.101433 -1 -1 22.92 0 32.122543 -22.621212 -43.662350
2 0 1 500000000 7.289836 -28.672121 -58.539639 3675 7 57.908840 -1 -1 22.92 670 7.289836 -28.672121 -58.539639
3 0 2 500000000 7.289836 -28.672121 -58.539639 609 7 57.908840 -1 -1 22.92 0 7.289836 -28.672121 -58.539639
4 0 1 833333333 36.476028 -32.952290 -136.419952 4234 7 65.917046 -1 -1 22.92 755 36.476028 -32.952290 -136.419952
5 0 2 833333333 36.476028 -32.952290 -136.419952 625 7 65.917046 -1 -1 22.92 0 36.476028 -32.952290 -136.419952
6 1 1 1166666666 -13.274544 -16.667040 -7.988991 1040 7 16.401899 -1 -1 22.92 159 -13.274544 -16.667040 -7.988991
7 1 2 1166666666 -13.274544 -16.667040 -7.988991 164 7 16.401899 -1 -1 22.92 0 -13.274544 -16.667040 -7.988991
8 1 1 1500000000 -6.285732 65.843475 -28.266441 2797 7 43.466827 -1 -1 22.92 486 -6.285732 65.843475 -28.266441
9 1 2 1500000000 -6.285732 65.843475 -28.266441 383 7 43.466827 -1 -1 22.92 0 -6.285732 65.843475 -28.266441
10 1 1 1833333333 -45.993202 -14.296285 -122.466698 346 7 5.972553 -1 -1 22.92 24 -45.993202 -14.296285 -122.466698
11 1 2 1833333333 -45.993202 -14.296285 -122.466698 102 7 5.972553 -1 -1 22.92 0 -45.993202 -14.296285 -122.466698

The event_number are all 0, but don’t worry about it. It is used as event index when using with pax, while having no significant meaning when using with straxen.

The instruction is sorted by the physical time of the signal, that is S2-esque signal are delayed by drift time. And clustered into instruction groups, similar to gap size clustering that split when the gap is larger than right_raw_extension.

[6]:
# Pre-load some constents from config
v = config['drift_velocity_liquid']
rext = config['right_raw_extension']

# Primary instructions must be sorted by signal time
# int(type) by design S1-esque being odd, S2-esque being even
# thus type%2-1 is 0:S1-esque;  -1:S2-esque
# Make a list of clusters of instructions, with gap smaller then rext
inst_time = instructions['time'] + instructions['z']  / v * (instructions['type'] % 2 - 1)
inst_queue = np.argsort(inst_time)
inst_queue = np.split(inst_queue, np.where(np.diff(inst_time[inst_queue]) > rext)[0]+1)

Between Interface and Core

Let’s now examine what’s been passing between the interface and core, specifically between ChunkRawRecord and ChunkRawRecord.rawdata (rawdata). The most important things passed into rawdata are the instructions. But we also pass the truth buffer (empty array of truth dtype with 10000 slots) into rawdata. In return, we have (channel, left, right, data) valuse of digitized pulses, needed to build raw records.

At the same time three properties are interally used as chunking indicators rawdata.left, rawdata.right, radata.source_finished. Whereas the left and right above returned by calling __next__ on rawdata are of individual pulses, the rawdata.left, rawdata.right are of all the pulses originated from an instruction group. So that when we stop and chunk, we can be sure we have finished an instruction group, and the next pulse will come at least right_raw_extension away.

9cf6e478a9a2462a8b4afc3900baf825

Sim data is the function where Pulse class are called.

Convert to raw records

The pulses returned are in the form of (channel, left, right, data), not exactly the form of raw records. So one of the two main functions of ChunkRawRecord is to covert them.

[7]:
# !!! Do not run this cell

pulse_length = right - left + 1
records_needed = int(np.ceil(pulse_length / samples_per_record))

# WARNING baseline and area fields are zeros before finish_results
s = slice(blevel, blevel + records_needed)
record_buffer[s]['channel'] = channel
record_buffer[s]['dt'] = dt
record_buffer[s]['time'] = dt * (left + samples_per_record * np.arange(records_needed))
record_buffer[s]['length'] = [min(pulse_length, samples_per_record * (i+1))
    - samples_per_record * i for i in range(records_needed)]
record_buffer[s]['pulse_length'] = pulse_length
record_buffer[s]['record_i'] = np.arange(records_needed)
record_buffer[s]['data'] = np.pad(data,
    (0, records_needed * samples_per_record - pulse_length), 'constant').reshape((-1, samples_per_record))

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[7], line 3
      1 # !!! Do not run this cell
----> 3 pulse_length = right - left + 1
      4 records_needed = int(np.ceil(pulse_length / samples_per_record))
      6 # WARNING baseline and area fields are zeros before finish_results

NameError: name 'right' is not defined
Digitize pulse cache

The psuedo pulses are cached in _pulses_cache, once an instruction group is over, we digitize all the psuedo pulses. This is done by summing up all the pulse current in each channel and convert unit from PE to acd count thus becoming adc_wave. Three functions add_noise, add_baseline and digitizer_saturation are done afterwards.

[8]:
# !!! Do not run this cell


current_2_adc = config['pmt_circuit_load_resistor'] \
    * config['external_amplification'] \
    / (config['digitizer_voltage_range'] / 2 ** (config['digitizer_bits']))

left = np.min([p['left'] for p in _pulses_cache]) - config['trigger_window']
right = np.max([p['right'] for p in _pulses_cache]) + config['trigger_window']
assert right - left < 200000, "Pulse cache too long"

if left % 2 != 0: left -= 1 # Seems like a digizier effect

_raw_data = np.zeros((801,
    right - left + 1), dtype=('<i8'))

for ix, _pulse in enumerate(_pulses_cache):
    ch = _pulse['channel']
    adc_wave = - np.trunc(_pulse['current'] * current_2_adc).astype(int)
    _slice = slice(_pulse['left'] - left, _pulse['right'] - left + 1)

    _raw_data[ch, _slice] += adc_wave

    if config['detector'] == 'XENONnT':
        adc_wave_he = adc_wave * int(config['high_energy_deamplification_factor'])
        if ch <= config['channels_top'][-1]:
            ch_he = config['channels_top_high_energy'][ch]
            _raw_data[ch_he, _slice] += adc_wave_he
        elif ch <= config['channels_bottom'][-1]:
            sum_signal(adc_wave_he,
                _pulse['left'] - left,
                _pulse['right'] - left + 1,
                _raw_data[config['channels_in_detector']['sum_signal']])

# Adding noise, baseline and digitizer saturation
add_noise(data=_raw_data,
               channel_mask=_channel_mask,
               noise_data=resource.noise_data,
               noise_data_length=len(resource.noise_data))
add_baseline(_raw_data, _channel_mask,
    config['digitizer_reference_baseline'],)
digitizer_saturation(_raw_data, _channel_mask)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[8], line 8
      1 # !!! Do not run this cell
      4 current_2_adc = config['pmt_circuit_load_resistor'] \
      5     * config['external_amplification'] \
      6     / (config['digitizer_voltage_range'] / 2 ** (config['digitizer_bits']))
----> 8 left = np.min([p['left'] for p in _pulses_cache]) - config['trigger_window']
      9 right = np.max([p['right'] for p in _pulses_cache]) + config['trigger_window']
     10 assert right - left < 200000, "Pulse cache too long"

NameError: name '_pulses_cache' is not defined
Z(ero) L(ength) E(ncoding)

Right after digitize pulse cache, we run ZLE, which uses find_intervals_below_threshold. For each interval, this yields a pulse, similar to what physical digitizers are doing

Pulse Classes

Pulse classes are another monster we will go into with more details in other notebooks. But in general, there’s the parent class Pulse while different types of signal are children of it. And S2-esque after pulsing all inherite from S2.

270d8ff80375411a82e56d0ec303f9c7

Workflow Walkthrough

[ ]:

Understanding Waveform Simulation for XENONnT

Nov 30

[1]:
import strax, straxen, wfsim
import numpy as np
from scipy.interpolate import interp1d
import matplotlib.pyplot as plt
[2]:
config = straxen.get_resource('https://raw.githubusercontent.com/XENONnT/'
                 'strax_auxiliary_files/master/fax_files/fax_config_nt.json', fmt='json')
config.update({'detector':'XENONnT'})

Part 2, Pulse (parent class)

The simulated pulses are formed using single-photon hits. Each hit has assigned time, channel, and gain using a common single photo-electron (SPE) shape.

SPE Shape is extracted from data and saved into a single template, with two entries in the config pe_pulse_ts and pe_pulse_ys.

[3]:
def init_pmt_current_templates():
    """
    Create spe templates, for 10ns sample duration and 1ns rounding we have:
    _pmt_current_templates[i] : photon timing fall between [10*m+i, 10*m+i+1)
    (i, m are integers)
    """
    global pe_pulse_function, _pmt_current_templates

    # Interpolate on cdf ensures that each spe pulse would sum up to 1 pe*sample duration^-1
    pe_pulse_function = interp1d(
        config.get('pe_pulse_ts'),
        np.cumsum(config.get('pe_pulse_ys')),
        bounds_error=False, fill_value=(0, 1))

    # Samples are always multiples of sample_duration
    sample_duration = config.get('sample_duration', 10)
    samples_before = config.get('samples_before_pulse_center', 2)
    samples_after = config.get('samples_after_pulse_center', 20)
    pmt_pulse_time_rounding = config.get('pmt_pulse_time_rounding', 1.0)

    # Let's fix this, so everything can be turned into int
    assert pmt_pulse_time_rounding == 1

    samples = np.linspace(-samples_before * sample_duration,
                          + samples_after * sample_duration,
                          1 + samples_before + samples_after)
    _template_length = np.int(len(samples) - 1)

    templates = []
    for r in np.arange(0, sample_duration, pmt_pulse_time_rounding):
        pmt_current = np.diff(pe_pulse_function(samples - r)) / sample_duration  # pe / 10 ns
        # Normalize here to counter tiny rounding error from interpolation
        pmt_current *= (1 / sample_duration) / np.sum(pmt_current)  # pe / 10 ns
        templates.append(pmt_current)
    _pmt_current_templates = np.array(templates)

init_pmt_current_templates()
[4]:
fig, axes = plt.subplots(2, 5, figsize=(20, 10))
for i, temp in enumerate(_pmt_current_templates):
    plt.sca(axes[i//5, i%5])

    # Pluse shape taken from config with 1 ns resolution
    plt.plot(np.array(config.get('pe_pulse_ts')) + i, config.get('pe_pulse_ys'), alpha=0.5, color='b', lw=3.0, label='model')
    # Caching 10 pulse shape templates with 10 ns resolution
    plt.step(np.arange(22) * 10 - 10, temp, alpha=1, lw=3.0, color='k', where='mid', label='pulse')

    plt.legend()
    plt.title('peak time +%d ns'%(i))
    plt.xlabel('ns')
    plt.ylabel('PE/10ns')

plt.tight_layout()
plt.show()
_images/tutorials_Understanding_WFSim_P2_6_0.png
SPE gain mean and spread

While pax save gains in the config, strax uses the function get_to_pe. We are a bit careless here and hard code the conversion here to reintroduce gains into config

[5]:
to_pe = straxen.get_to_pe(1, ('to_pe_per_run', 'https://raw.githubusercontent.com/XENONnT/'
                 'strax_auxiliary_files/master/fax_files/to_pe_nt.npy'),
    len(config['channels_in_detector']['tpc']))
config['gains'] = 1 / to_pe * (1e-8 * 2.25 / 2**14) / (1.6e-19 * 10 * 50)
config['gains'][to_pe==0] = 0
/opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.6/site-packages/ipykernel_launcher.py:3: DeprecationWarning: to_pe_per_run will be replaced by CorrectionsManagementSevices
  This is separate from the ipykernel package so we can avoid doing imports until
/opt/XENONnT/anaconda/envs/XENONnT_development/lib/python3.6/site-packages/ipykernel_launcher.py:4: RuntimeWarning: divide by zero encountered in true_divide
  after removing the cwd from sys.path.
[6]:
photon_area_distribution = straxen.get_resource('https://raw.githubusercontent.com/XENONnT/'
                 'strax_auxiliary_files/master/fax_files/XENONnT_spe_distributions.csv', fmt='csv')

def init_spe_scaling_factor_distributions():
    global grid_cdf, spe_shapes, __uniform_to_pe_arr
    # Extract the spe pdf from a csv file into a pandas dataframe
    spe_shapes = photon_area_distribution

    # Create a converter array from uniform random numbers to SPE gains (one interpolator per channel)
    # Scale the distributions so that they have an SPE mean of 1 and then calculate the cdf
    uniform_to_pe_arr = []
    for ch in spe_shapes.columns[1:]:  # skip the first element which is the 'charge' header
        if spe_shapes[ch].sum() > 0:
            mean_spe = (spe_shapes['charge'].values * spe_shapes[ch]).sum() / spe_shapes[ch].sum()
            scaled_bins = spe_shapes['charge'].values / mean_spe
            cdf = np.cumsum(spe_shapes[ch]) / np.sum(spe_shapes[ch])
        else:
            # if sum is 0, just make some dummy axes to pass to interpolator
            cdf = np.linspace(0, 1, 10)
            scaled_bins = np.zeros_like(cdf)

        grid_cdf = np.linspace(0, 1, 2001)
        grid_scale = interp1d(cdf, scaled_bins,
            bounds_error=False,
            fill_value=(scaled_bins[0], scaled_bins[-1]))(grid_cdf)

        uniform_to_pe_arr.append(grid_scale)

    if len(uniform_to_pe_arr):
        __uniform_to_pe_arr = np.stack(uniform_to_pe_arr)

def uniform_to_pe_arr(p, channel=0):
    indices = (p * 2000).astype(int)
    return __uniform_to_pe_arr[channel, indices]

init_spe_scaling_factor_distributions()
[7]:
fig, axes = plt.subplots(1, 2, figsize=(8, 5))


plt.sca(axes[0])
ch = '0'
mean_spe = (spe_shapes['charge'].values * spe_shapes[ch]).sum() / spe_shapes[ch].sum()
scaled_bins = spe_shapes['charge'].values / mean_spe
cdf = np.cumsum(spe_shapes[ch]) / np.sum(spe_shapes[ch])

channel = int(ch)
l1, = plt.plot(scaled_bins, spe_shapes[ch], alpha=0.5, color='b', lw=3.0, label=r'pdf $f(x)$')
plt.xlabel('Gain scaling factor')
plt.ylabel('pdf')

axt = plt.twinx()
l2, = plt.plot(scaled_bins, cdf, alpha=1.0, color='k', lw=3.0, label=r'cdf $F(x)=\int_{-\infty}^{x}f(t)dt$')
#plt.plot(__uniform_to_pe_arr[channel], grid_cdf, alpha=0.5, color='k', lw=3.0, ls='--')
plt.ylabel('cdf')
plt.title('input spe spectrum')
plt.legend(handles=[l1, l2], loc='center right')


plt.sca(axes[1])
plt.plot(cdf, scaled_bins, alpha=1.0, color='k', lw=3.0, label=r'ppf $F^{-1}(x)$')
#plt.plot(grid_cdf, __uniform_to_pe_arr[channel], alpha=0.5, color='k', lw=3.0, ls='--', )
plt.xlabel('cdf')
plt.ylabel('Gain scaling factor')
plt.title('inverse transform sampling')
plt.legend(loc='center left')

plt.tight_layout()
plt.show()
_images/tutorials_Understanding_WFSim_P2_10_0.png

The spe (gain scaling factor) pdf are saved on github, show in blue and coverted into ppf for the inverse transform sampling. So that we can map randomly genarated number between 0 and 1 to the scaling factor.

Add Current
[ ]:
def add_current(photon_timings,
                photon_gains,
                pulse_left,
                dt,
                pmt_current_templates,
                pulse_current):
    #         """
    #         Simulate single channel waveform given the photon timings
    #         photon_timing         - dim-1 integer array of photon timings in unit of ns
    #         photon_gain           - dim-1 float array of ph. 2 el. gain individual photons
    #         pulse_left            - left of the pulse in unit of 10 ns
    #         dt                    - mostly it is 10 ns
    #         pmt_current_templates - list of spe templates of different reminders
    #         pulse_current         - waveform
    #         """
    if not len(photon_timings):
        return

    template_length = len(pmt_current_templates[0])
    i_photons = np.argsort(photon_timings)
    # Convert photon_timings to int outside this function
    # photon_timings = photon_timings // 1

    gain_total = 0
    tmp_photon_timing = photon_timings[i_photons[0]]
    for i in i_photons:
        if photon_timings[i] > tmp_photon_timing:
            start = int(tmp_photon_timing // dt) - pulse_left
            reminder = int(tmp_photon_timing % dt)
            pulse_current[start:start + template_length] += \
                pmt_current_templates[reminder] * gain_total

            gain_total = photon_gains[i]
            tmp_photon_timing = photon_timings[i]
        else:
            gain_total += photon_gains[i]
    else:
        start = int(tmp_photon_timing // dt) - pulse_left
        reminder = int(tmp_photon_timing % dt)
        pulse_current[start:start + template_length] += \
            pmt_current_templates[reminder] * gain_total

Pulse Class Call Workflow

ba7e4438b02c49aa85bdef44096a19c1

[ ]:
def clear_pulse_cache():
    global _pulses
    _pulses = []

__dict__ = {}

def call():
    """
    PMTs' response to incident photons
    Use _photon_timings, _photon_channels to build pulses
    """
    global _photon_timings, _photon_channels, turned_off_pmts

    # The pulse cache should be immediately transfered after call this function
    clear_pulse_cache()

    # Correct for PMT Transition Time Spread (skip for pmt afterpulses)
    _photon_timings += np.random.normal(config['pmt_transit_time_mean'],
                                        config['pmt_transit_time_spread'],
                                        len(_photon_timings))

    dt = config.get('sample_duration', 10) # Getting dt from the lib just once
    _n_double_pe = _n_double_pe_bot = 0 # For truth aft output

    counts_start = 0 # Secondary loop index for assigning channel
    for channel, counts in zip(*np.unique(_photon_channels, return_counts=True)):

        #TODO: This is temporary continue to avoid out-of-range error.
        # It should be added a proper method for nVeto PMTs also.
        if channel >= 2000:
            continue
        # Use 'counts' amount of photon for this channel
        _channel_photon_timings = _photon_timings[counts_start:counts_start+counts]
        counts_start += counts
        if channel in turned_off_pmts: continue

        # If gain of each photon is not specifically assigned
        # Sample from spe scaling factor distribution and to individual gain
        # In contrast to pmt afterpulse that should have gain determined before this step
        if '_photon_gains' not in __dict__:
            if config['detector'] == 'XENON1T':
                _channel_photon_gains = config['gains'][channel] \
                * uniform_to_pe_arr(np.random.random(len(_channel_photon_timings)), channel)

            else:
                _channel_photon_gains = config['gains'][channel] \
                * uniform_to_pe_arr(np.random.random(len(_channel_photon_timings)))

            # Add some double photoelectron emission by adding another sampled gain
            n_double_pe = np.random.binomial(len(_channel_photon_timings),
                                             p=config['p_double_pe_emision'])
            _n_double_pe += n_double_pe
            if channel in config['channels_bottom']:
                _n_double_pe_bot += n_double_pe

            #_dpe_index = np.random.randint(len(_channel_photon_timings),
            #                               size=n_double_pe)
            if config['detector'] == 'XENON1T':
                _channel_photon_gains[:n_double_pe] += config['gains'][channel] \
                * uniform_to_pe_arr(np.random.random(n_double_pe), channel)
            else:
                _channel_photon_gains[:n_double_pe] += config['gains'][channel] \
                * uniform_to_pe_arr(np.random.random(n_double_pe))
        else:
            _channel_photon_gains = np.array(_photon_gains[_photon_channels == channel])

        # Build a simulated waveform, length depends on min and max of photon timings
        min_timing, max_timing = np.min(
            _channel_photon_timings), np.max(_channel_photon_timings)
        pulse_left = int(min_timing // dt) - int(config['samples_to_store_before'])
        pulse_right = int(max_timing // dt) + int(config['samples_to_store_after'])
        pulse_current = np.zeros(pulse_right - pulse_left + 1)

        add_current(_channel_photon_timings.astype(int),
                    _channel_photon_gains,
                    pulse_left,
                    dt,
                    _pmt_current_templates,
                    pulse_current)

        # For single event, data of pulse level is small enough to store in dataframe
        _pulses.append(dict(
            photons  = len(_channel_photon_timings),
            channel  = channel,
            left     = pulse_left,
            right    = pulse_right,
            duration = pulse_right - pulse_left + 1,
            current  = pulse_current,))
[ ]:

Advanced Tricks

So you’ve tried making normal waveforms and now you need to spice up your life by making some way more weird waveforms letting the detector be whatever you want it to be? You have come to the right place!

By default fax uses some configuration file which is a huge pain to modify. So we made fax such that if you add a parameter in the instruction which corresponds to a parameter in the config it will overwrite what the value was in the config and let you deside what it should be!

This example shows how to modify the electron lifetime and the anode voltage

[1]:
import numpy as np
import strax
import straxen
import wfsim

import matplotlib.pyplot as plt
from matplotlib.colors import LogNorm
from multihist import Histdd, Hist1d
from scipy import stats
[2]:
st = strax.Context(
    config=dict(
        detector='XENON1T',
        fax_config='https://raw.githubusercontent.com/XENONnT/'
                   'strax_auxiliary_files/master/sim_files/fax_config_1t.json',
        fax_config_override={'field_distortion_on':True, 's2_luminescence_model':'simple'},
        **straxen.contexts.xnt_common_config),
    **straxen.contexts.common_opts)
st.register(wfsim.RawRecordsFromFax1T)
[2]:
wfsim.strax_interface.RawRecordsFromFax1T
[3]:
# Just some id from post-SR1, so the corrections work
run_id = '000001'
[4]:
strax.Mailbox.DEFAULT_TIMEOUT=10000
[5]:
dtype = wfsim.strax_interface.instruction_dtype

for new_dtype in [('electron_lifetime_liquid', np.int32),
                  ('anode_voltage', np.int32)]:
    if new_dtype not in dtype:
        dtype.append(new_dtype)

def rand_instructions(c):
    n = c['nevents'] = c['event_rate'] * c['chunk_size'] * c['nchunk']
    c['total_time'] = c['chunk_size'] * c['nchunk']

    instructions = np.zeros(2 * n, dtype=dtype)
    uniform_times = c['total_time'] * (np.arange(n) + 0.5) / n
    instructions['time'] = np.repeat(uniform_times, 2) * int(1e9)
    instructions['event_number'] = np.digitize(instructions['time'],
         1e9 * np.arange(c['nchunk']) * c['chunk_size']) - 1
    instructions['type'] = np.tile([1, 2], n)
    instructions['recoil'] = ['er' for i in range(n * 2)]

    r = np.sqrt(np.random.uniform(0, 2500, n))
    t = np.random.uniform(-np.pi, np.pi, n)
    instructions['x'] = np.repeat(r * np.cos(t), 2)
    instructions['y'] = np.repeat(r * np.sin(t), 2)
    instructions['z'] = np.repeat(np.random.uniform(-100, 0, n), 2)

    nphotons = np.random.uniform(2000, 2050, n)
    nelectrons = 10 ** (np.random.uniform(1, 4, n))
    instructions['amp'] = np.vstack([nphotons, nelectrons]).T.flatten().astype(int)
    instructions['electron_lifetime_liquid'] = np.repeat(600e10,len(instructions))
    instructions['anode_voltage'] = np.repeat(1e10,len(instructions))
    return instructions

wfsim.strax_interface.rand_instructions = rand_instructions
wfsim.strax_interface.instruction_dtype = dtype
[6]:
st.set_config(dict(fax_file=None))
st.set_config(dict(nchunk=1, event_rate=1, chunk_size=100))
[10]:
# Remove any previously simulated data, if such exists
# !rm -r strax_data

records = st.get_array(run_id,'raw_records', progress_bar=False)
peaks = st.get_array(run_id, ['peak_basics'], progress_bar=False)
data = st.get_df(run_id, 'event_info', progress_bar=False)
truth = st.get_df(run_id, 'truth', progress_bar=False)
[11]:
truth.head()
[11]:
event_number type time x y z amp recoil electron_lifetime_liquid anode_voltage ... n_photon_bottom t_first_photon t_last_photon t_mean_photon t_sigma_photon t_first_electron t_last_electron t_mean_electron t_sigma_electron endtime
0 0 1 500000026 -27.127699 -36.640770 -12.655296 2009 er -2147483648 -2147483648 ... 143.0 5.000000e+08 5.000004e+08 5.000001e+08 48.421194 NaN NaN NaN NaN 500000353
1 0 2 500097027 -27.127699 -36.640770 -12.655296 39 er -2147483648 -2147483648 ... 344.0 5.000970e+08 5.000997e+08 5.000984e+08 490.645247 5.000970e+08 5.000993e+08 5.000982e+08 479.974221 500099672
2 0 4 500768320 20.171877 4.092766 -89.123016 1 er -2147483648 -2147483648 ... 10.0 5.007683e+08 5.007690e+08 5.007685e+08 167.279008 5.007683e+08 5.007683e+08 5.007683e+08 0.000000 500769026
3 0 1 1500000019 -19.476303 10.349952 -35.648590 2010 er -2147483648 -2147483648 ... 192.0 1.500000e+09 1.500000e+09 1.500000e+09 41.286365 NaN NaN NaN NaN 1500000227
4 0 2 1500265504 -19.476303 10.349952 -35.648590 2817 er -2147483648 -2147483648 ... 26229.0 1.500266e+09 1.500273e+09 1.500269e+09 961.919333 1.500265e+09 1.500272e+09 1.500269e+09 955.008653 1500272834

5 rows × 22 columns

[ ]:

wfsim package

Submodules

wfsim.load_resource module

class wfsim.load_resource.DummyMap(const, shape=())[source]

Bases: object

Return constant results the length match the length of input but from the second dimensions the shape is user defined input

reduce_last_dim()[source]
class wfsim.load_resource.Resource(config=None)[source]

Bases: object

Get the configs needed for running WFSim. Configs can be obtained in

two ways: 1. Get it directly from the mongo database. This only needs the

name of the file.

  1. Load it with straxen get_resource, this can either:

    Download from a public repository Read from local cache Download from a private repository if credentials are properly setup

static config_to_file(config)[source]

Find and complete all file paths

returns dictionary mapping item name to path

static get_file_path(base, fname)[source]

Find the full path to the resource file Try 4 methods in the following order 1. The base is not url, return base + name 2. If ntauxfiles (straxauxfiles) is installed, return will be package dir + name

pip install won’t work, try python setup.py in the packages

  1. Download the latest version using straxen mongo downloader from database, return the cached file path + md5

  2. Download using straxen get_resource from the url (github raw) simply return base + name

    Be careful with the forth options, straxen creates cache files that might not be updated with the latest github commit.

wfsim.load_resource.load_config(config)[source]

Create a Resource instance from the configuration

Uses a cache to avoid re-creating instances from the same config

wfsim.load_resource.make_patternmap(map_file, fmt=None, method='WeightedNearestNeighbors', pmt_mask=None)[source]

This is special interpretation of the of previous make_map(), but designed for pattern map loading with provided PMT mask. This way simplifies both S1 and S2 cases

wfsim.pax_interface module

class wfsim.pax_interface.PaxEventSimulator(config=None)[source]

Bases: object

Simulate wf from instruction and stored in wfsim.pax_datastructure.datastructure.Event mimicing pax.datastructure.Event Then pickled, compressed and saved mimicing pax raw data zips.

Call compute to start the simulation process.

class WriteZipped(config)[source]

Bases: object

close_current_file()[source]

Closes the currently open file, if there is one. Also handles temporary file renaming.

file_extension = 'zip'
open_new_file(first_event_number)[source]

Opens a new file, closing any old open ones

write_event(event_proxy)[source]

Write one more event to the folder, opening/closing files as needed

class WriteZippedEncoder(config)[source]

Bases: object

static make_event_proxy(event, data, block_id=None)[source]
transfer_event(event)[source]
compute()[source]
class wfsim.pax_interface.PaxEvents(config)[source]

Bases: object

wfsim.strax_interface module

class wfsim.strax_interface.ChunkRawRecords(config, rawdata_generator=<class 'wfsim.core.rawdata.RawData'>, **kwargs)[source]

Bases: object

final_results()[source]
source_finished()[source]
class wfsim.strax_interface.RawRecordsFromFax1T[source]

Bases: RawRecordsFromFaxNT

provides: tuple = ('raw_records', 'truth')
class wfsim.strax_interface.RawRecordsFromFaxNT[source]

Bases: SimulatorPlugin

check_instructions()[source]
compute()[source]
data_kind: Union[str, immutabledict, dict] = immutabledict({'raw_records': 'raw_records', 'raw_records_he': 'raw_records_he', 'raw_records_aqmon': 'raw_records_aqmon', 'truth': 'truth'})
get_instructions()[source]
infer_dtype()[source]

Return dtype of computed data; used only if no dtype attribute defined.

provides: tuple = ('raw_records', 'raw_records_he', 'raw_records_aqmon', 'truth')
class wfsim.strax_interface.RawRecordsFromFaxOpticalNT[source]

Bases: RawRecordsFromFaxNT

get_instructions()[source]
class wfsim.strax_interface.RawRecordsFromFaxnVeto[source]

Bases: RawRecordsFromMcChain

data_kind: Union[str, immutabledict, dict] = immutabledict({'raw_records_nv': 'raw_records_nv', 'truth_nv': 'truth_nv'})
provides: tuple = ('raw_records_nv', 'truth_nv')
class wfsim.strax_interface.RawRecordsFromMcChain[source]

Bases: SimulatorPlugin

check_instructions()[source]
compute()[source]
data_kind: Union[str, immutabledict, dict] = immutabledict({'raw_records': 'raw_records', 'raw_records_he': 'raw_records_he', 'raw_records_aqmon': 'raw_records_aqmon', 'raw_records_nv': 'raw_records_nv', 'truth': 'truth', 'truth_nv': 'truth_nv'})
gain_model_nv

Dispatch on URL protocol.

unrecognized protocol returns identity inspired by dasks Dispatch and fsspec fs protocols.

get_instructions()[source]

Run epix and save instructions as self.instructions_epix epix_config with all the epix run arguments is passed as a dictionary, where source_rate must be set to 0 (epix default), since time handling is done outside epix.

epix needs to be imported in here to avoid circle imports

infer_dtype()[source]

Return dtype of computed data; used only if no dtype attribute defined.

provides: tuple = ('raw_records', 'raw_records_he', 'raw_records_aqmon', 'raw_records_nv', 'truth', 'truth_nv')
set_config()[source]
set_timing()[source]

Set timing information in such a way to synchronize instructions for the TPC and nVeto

source_finished()[source]

Return whether all instructions has been used.

takes_config = immutabledict({'gain_model_mc': <straxen.url_config.URLConfig object>, 'detector': <strax.config.Option object>, 'event_rate': <strax.config.Option object>, 'chunk_size': <strax.config.Option object>, 'n_chunk': <strax.config.Option object>, 'per_pmt_truth': <strax.config.Option object>, 'fax_file': <strax.config.Option object>, 'fax_config': <strax.config.Option object>, 'fax_config_override': <strax.config.Option object>, 'fax_config_override_from_cmt': <strax.config.Option object>, 'channel_map': <strax.config.Option object>, 'n_tpc_pmts': <strax.config.Option object>, 'n_top_pmts': <strax.config.Option object>, 'right_raw_extension': <strax.config.Option object>, 'seed': <strax.config.Option object>, 'gain_model_nv': <straxen.url_config.URLConfig object>, 'epix_config': <strax.config.Option object>, 'entry_start': <strax.config.Option object>, 'entry_stop': <strax.config.Option object>, 'fax_config_nveto': <strax.config.Option object>, 'fax_config_override_nveto': <strax.config.Option object>, 'targets': <strax.config.Option object>})
class wfsim.strax_interface.RawRecordsFromMcChain1T[source]

Bases: RawRecordsFromMcChain

data_kind: Union[str, immutabledict, dict] = immutabledict({'raw_records': 'raw_records', 'truth': 'truth'})
provides: tuple = ('raw_records', 'truth')
wfsim.strax_interface.extra_truth_dtype_per_pmt(n_pmt: bool | int) List[tuple][source]

Generate the extra - truth dtype output

Return total/bottom seperation from truth_extra_dtype if n_pmt is False

Parameters:

n_pmt – Number of PMTs, when false (no PMTs, use total/bottom separation)

Returns:

dtype list

wfsim.strax_interface.read_optical(config)[source]

Function will be executed when wfsim in run in optical mode. This function expects c[‘fax_file’] to be a root file from optical mc :params config: wfsim configuration dict

wfsim.units module

Define unit system for pax (i.e., seconds, etc.) This sets up variables for the various unit abbreviations, ensuring we always have a ‘consistent’ unit system. There are almost no cases that you should change this without talking with a maintainer.

wfsim.utils module

wfsim.utils.find_intervals_below_threshold(w, threshold, holdoff, result_buffer)[source]

Fills result_buffer with l, r bounds of intervals in w < threshold. :param w: Waveform to do hitfinding in :param threshold: Threshold for including an interval :param holdoff: Holdoff number of samples after the pulse return back down to threshold :param result_buffer: numpy N*2 array of ints, will be filled by function.

if more than N intervals are found, none past the first N will be processed.

:returns : number of intervals processed Boundary indices are inclusive, i.e. the right boundary is the last index which was < threshold

wfsim.utils.optical_adjustment(instructions, timings, channels)[source]

Helper function to process optical instructions so that for each entry 1) Move the instruction timing to the first photon in the entry and move photon timings 2) Split photon timing into maximum interval of PULSE_MAX_DURATION default 1us

The split photon are put into new instruction append at the end of the instructions

Module contents

Indices and tables