Getting started with WFSim¶

Hello friend. Welcome to the basic tutorial on how to simulate waveforms with the latest wfsim version in strax. Here we’ll just demonstrate the basic functionality. For more in depth analysis stuff, checkout the straxen tutorials for more detailed thing.

[1]:

import numpy as np
import strax
import straxen
import wfsim

import matplotlib.pyplot as plt
from matplotlib.colors import LogNorm
from multihist import Histdd, Hist1d
from scipy import stats

import json

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
<ipython-input-1-28690857fe6c> in <module>
      1 import numpy as np
      2 import strax
----> 3 import straxen
      4 import wfsim
      5

~/Desktop/python/straxen/straxen/__init__.py in <module>
      1 __version__ = '0.15.8'
      2
----> 3 from utilix import uconfig
      4 from .common import *
      5 # contexts.py below

~/Desktop/python/utilix/utilix/__init__.py in <module>
      2
      3 from utilix.config import Config
----> 4 uconfig = Config()
      5
      6 from utilix.rundb import DB

~/Desktop/python/utilix/utilix/config.py in __init__(self, path)
     19     def __init__(self, path=None):
     20         if not Config.instance:
---> 21             Config.instance = Config.__Config()
     22
     23     def __getattr__(self, name):

~/Desktop/python/utilix/utilix/config.py in __init__(self)
     50
     51             else:
---> 52                 raise FileNotFoundError(f"Could not load a configuration file. "
     53                                         f"You can create one at {home_config}, or set a custom path using\n\n"
     54                                         f"export XENON_CONFIG=path/to/your/config\n")

FileNotFoundError: Could not load a configuration file. You can create one at /Users/petergaemers/.xenon_config, or set a custom path using

export XENON_CONFIG=path/to/your/config

Different simulators¶

Because we’d like to make simulations we have multiple different ways of making data. They are: * #### RawRecordsFromFaxNT(1T) * #### RawRecordsFromFaxEpix * #### RawRecordsFromFaxOptical * #### RawRecordsFromFaxNveto

The main difference is the way you specify instructions. The NT(1T) simulator depends on either a csv file or a defaul/user specified function to generate instructions. Epix takes input from G4 processed with epix. Optical takes optical input from G4 for photon channels and timings. And Nveto is a small modification to optical to provide nveto data

Setting everything up¶

First we need to define the right context. For this we made two contexts to help you out. One for 1T and one for nT. By default in your context RawRecordsFromFaxNT will be registered. If you want another one be sure to register it

For 1T do:

[ ]:

st = straxen.contexts.xenon1t_simulation()

Or nT do:

[2]:

st = straxen.contexts.xenonnt_simulation()

Now we need to define a run id. What you give it doesn’t really matter, since strax will look for data and make new if it doesn’t find anything. And this is what you want. Strax will use the run id to get the electron lifetime and pmt gains from a database, and returns placeholders if the run doesn’t exist.

[3]:

run_id = '1'

Defining instructions¶

For the instructions there are multiple different ways to do it. The simulator has this option called “fax_file”. If it has a value (None by default) the simulator will either read it as a csv or root file. If not it will use some predefined functions to make your events. The number of event you’ll simulate based on the product of the config values “nchunk”,”event_rate” and “chunk_size”. Which you can set as follows:

[4]:

st.set_config(dict(nchunk=1, event_rate=5, chunk_size=10,))

Strax groups data together in chunks based on time (for low level data). nchunk is the number of chunks you want to simulate event_rate is the number of events per seconds, so this effects the amount of spacing between events. Finally chunk_size is the length in seconds of your chunk For the DAQ this is 5 seconds. For simulations you can do whatever you want. It is important to note that Strax will write out data per chunk So when your chunks are small you’ll, among other things, call Strax’ IO functions a lot, giving a substantial overhead. On the other hand, to large chunks will hog all your memory and your kernel might crash. Based on my experimentation setting chunk_size to 500 gives best performance

These are ways you can give instructions * #### Random This is the default where simulator will generate some random instructions for you. * #### Custom For this you will need to overwrite the instruction generator function * #### CSV You can provide a csv file with the instruction (Like the output of nSort) * #### Geant4 For this you’ll need to use epix to do some clustering for you.

Random¶

I guess this is pretty self explanatory. The simulator has this function called “rand_instructions” which will make something up for you.

[6]:

wfsim.rand_instructions??

Custom¶

This is some more fun. To do this we’ll write a new function which returns a structured numpy array with the correct dtype.

[7]:

wfsim.instruction_dtype

[7]:

[('event_number', numpy.int32),
 ('type', numpy.int8),
 ('time', numpy.int64),
 ('x', numpy.float32),
 ('y', numpy.float32),
 ('z', numpy.float32),
 ('amp', numpy.int32),
 ('recoil', '<U2')]

Event number is just a lable which peaks are together. type is either 1(S1) or 2 (S2). In the truth higher numbers are also used to refer to different types of afterpulses. time,x,y,z are the time and positions of the signal. Amp is the number of photons or electrons generated, and recoil can be used for different types of recoil (but only Electronic recoil is supported). Recoil is a number according to the NEST convention which will be used to indicate which interaction. For the key look here: https://github.com/XENONnT/WFSim/blob/2c614b0f7b0d7c7adc516f6188e281857e8d7e22/wfsim/core.py#L22

Now, lets say we want some krypton peaks. For this we’ll need to change the default instruction function to include this double decay and use Nestpy to convert energy deposits into a number of photons and electrons.

In this case I’ll use 1 “event” per full decay, that where all the 4’s are coming from.

[8]:

def super_awesome_custom_instruction(c):
    import nestpy
    half_life = 156.94e-9 #Kr intermediate state half-life in ns
    decay_energies = [32.2,9.4] # Decay energies in kev

    n = c['nevents'] = c['event_rate'] * c['chunk_size'] * c['nchunk']
    c['total_time'] = c['chunk_size'] * c['nchunk']

    instructions = np.zeros(4 * n, dtype=wfsim.instruction_dtype)
    instructions['event_number'] = np.digitize(instructions['time'],
         1e9 * np.arange(c['nchunk']) * c['chunk_size']) - 1

    instructions['type'] = np.tile([1, 2], 2 * n)
    instructions['recoil'] = [7 for i in range(4 * n)]

    r = np.sqrt(np.random.uniform(0, 2500, n))
    t = np.random.uniform(-np.pi, np.pi, n)
    instructions['x'] = np.repeat(r * np.cos(t), 4)
    instructions['y'] = np.repeat(r * np.sin(t), 4)
    instructions['z'] = np.repeat(np.random.uniform(-100, 0, n), 4)

    #To get the correct times we'll need to include the 156.94 ns half life of the intermediate state.

    uniform_times = c['total_time'] * (np.arange(n) + 0.5) / n
    delayed_times = uniform_times + np.random.exponential(half_life/np.log(2),len(uniform_times))
    instructions['time'] = np.repeat(list(zip(uniform_times,delayed_times)),2) * 1e9


    # Here we'll define our XENON-like detector
    nc = nestpy.NESTcalc(nestpy.VDetector())
    A = 131.293
    Z = 54.
    density = 2.862  # g/cm^3   #SR1 Value
    drift_field = 82  # V/cm    #SR1 Value
    interaction = nestpy.INTERACTION_TYPE(7)

    energy = np.tile(decay_energies,n)
    quanta = []
    for en in energy:
        y = nc.GetYields(interaction,
                         en,
                         density,
                         drift_field,
                         A,
                         Z,
                         (1, 1))
        quanta.append(nc.GetQuanta(y, density).photons)
        quanta.append(nc.GetQuanta(y, density).electrons)

    instructions['amp'] = quanta

    return instructions

Now here comes the magic line:

[9]:

wfsim.strax_interface.rand_instruction = super_awesome_custom_instruction

This changes the default rand_instruction function in our own super awesome function. So when the simulator will call rand_instruction the code from super_awesome_custom_instruction will be executed

CSV¶

The format for csv files is the same as the instructions dtype. So on every line specify event_number,type,time ,x,y,z, amp and recoil in that order. Then tell the simulator it exists:

[10]:

st.set_config(dict(fax_file='instructions.csv'))

Ofcourse if you do not have this file it will not work

Geant4¶

For starters you’ll need to register the Epix plugin. This requires an epix configuration (it’s on private_nt_aux_files), and if you want a start and stop event.

[11]:

st.register(wfsim.strax_interface.RawRecordsFromFaxEpix)
st.set_config(dict(epix_config= path_to_config))

Configuration customization¶

The simulator using a larger large amount of configuration settings to do it’s magic. Some of them are best left along, like pmt_circuit_load_resistor. Others on the other hand are things you might want change a bit to see how the data will change. Unfortunately currently the full list is spread out over two different places. One is the fax config json which is on github. The other is the option list in strax. Besides those things like pattern maps are hardcoded in load_resource.py. These files are overrideable with the option “fax_config_override”

The strax config is viewable like this and can be changed by st.set_config(dict(option you want=value you want))

[12]:

st.show_config('raw_records')

/Users/petergaemers/Desktop/python/strax/strax/context.py:369: UserWarning: Option channel_map not taken by any registered plugin
  warnings.warn(f"Option {k} not taken by any registered plugin")

[12]:

	option	default	current	applies_to	help
0	fax_file	None	/dali/lgrandi/pgaemers/fax_files/Xenon1T_Whole...	(raw_records, truth)	Directory with fax instructions
1	nSort_path	None	<OMITTED>	(raw_records, truth)	If nSort is needed to convert GEANT4 to instru...
2	fax_config_override	None	<OMITTED>	(raw_records, truth)	Dictionary with configuration option overrides
3	event_rate	5	5	(raw_records, truth)	Average number of events per second
4	chunk_size	5	10	(raw_records, truth)	Duration of each chunk in seconds
5	nchunk	4	1	(raw_records, truth)	Number of chunks to simulate
6	fax_config	https://raw.githubusercontent.com/XENONnT/stra...	https://raw.githubusercontent.com/XENONnT/stra...	(raw_records, truth)
7	to_pe_file	https://raw.githubusercontent.com/XENONnT/stra...	https://raw.githubusercontent.com/XENONnT/stra...	(raw_records, truth)
8	gain_model	(to_pe_per_run, https://raw.githubusercontent....	(to_pe_constant, 0.005)	(raw_records, truth)	PMT gain model. Specify as (model_type, model_...
9	right_raw_extension	50000	<OMITTED>	(raw_records, truth)
10	zle_threshold	0	<OMITTED>	(raw_records, truth)
11	detector	XENON1T	XENONnT	(raw_records, truth)
12	timeout	1800	<OMITTED>	(raw_records, truth)	Terminate processing if any one mailbox receiv...

The config from github can be loaded as:

[13]:

straxen.get_resource('https://raw.githubusercontent.com/XENONnT/'
                               'strax_auxiliary_files/master/fax_files/fax_config_nt.json',fmt='json').keys()

[13]:

dict_keys(['pmt_pulse_time_rounding', 'gauss_noise_sigmas', 'real_noise_file', 'real_noise_sample_mode', 'maximum_recombination_time', 's1_detection_efficiency', 'singlet_lifetime_liquid', 'triplet_lifetime_liquid', 's1_ER_recombination_fraction', 's1_ER_primary_singlet_fraction', 's1_ER_secondary_singlet_fraction', 's1_NR_singlet_fraction', 's1_ER_alpha_singlet_fraction', 'electron_trapping_time', 'gas_drift_velocity_slope', 'lxe_dielectric_constant', 'singlet_lifetime_gas', 'triplet_lifetime_gas', 'singlet_fraction_gas', 'each_pmt_afterpulse_types', 'pmt_afterpulse_types', 'p_double_pe_emision', 's1_model_type', 's1_decay_time', 'pmt_transit_time_spread', 'real_noise_sample_size', 'pressure', 'temperature', 'anode_voltage', 'pmt_rise_time', 'pmt_fall_time', 'photon_area_distribution', 'pe_pulse_ts', 'pe_pulse_ys', 'samples_before_pulse_center', 'samples_after_pulse_center', 'pmt_transit_time_mean', 'led_pulse_length', 'drift_field', 'liquid_density', 'diffusion_constant_liquid', 'electron_extraction_yield', 'gate_to_anode_distance', 'elr_gas_gap_length', 's2_secondary_sc_gain', 'anode_field_domination_distance', 'anode_wire_radius', 's2_fitted_patterns_file', 's2_mean_area_fraction_top', 'rz_position_distortion_map', 'tpc_name', 'run_number', 'channels_top', 'channels_bottom', 'sample_duration', 'digitizer_voltage_range', 'nominal_gain', 'digitizer_bits', 'pmt_circuit_load_resistor', 'external_amplification', 'digitizer_reference_baseline', 's1_width_threshold', 's1_risetime_threshold', 'tight_coincidence_window_left', 'tight_coincidence_window_right', 'tight_coincidence_threshold', 'tpc_length', 'tpc_radius', 'electron_lifetime_liquid', 'drift_velocity_liquid', 'drift_time_gate', 'shrink_data_threshold', 'xy_posrec_preference', 'channels_in_detector', 'n_channels', 'gains', 'gain_sigmas', 'quantum_efficiencies', 'relative_qe_error', 'relative_gain_error', 'pmts', 'noise_file_index', 'noise_file_folder', 'max_intervals', 'samples_to_store_before', 'samples_to_store_after', 'zle_threshold', 'special_thresholds', 's1_ER_recombination_time', 'pmt_ap_modifier', 'photoionization_modifier', 'pmt_ap_t_modifier', 'trigger_window', 'photoelectric_p', 'photoelectric_modifier', 'photoelectric_t_center', 'photoelectric_t_spread'])

Changing things in this guy goes slightly different. In the strax option list there is the option called “fax_config_override”. This takes a dict which will be used to override any values in the json config. So changing the ‘s2_secondary_sc_gain’ is done as:

[14]:

st.set_config(dict(fax_config_override = dict(s2_secondary_sc_gain=23)))

What actually happens?¶

What happens behind the scenes is that the instructions are first grouped together in chunks. Then we loop over the instructions and the full chunk is returned before starting with the next one.

We use a S1 and S2 class to calculate the arrival times of the photons and the channels which have been hit. Then we’ll hand them over to the Pulse class to calculate the currents in the channels. Finally the currents go to a RawData class where we fake the digitizer response.

S1¶

For S1s we start with calculating the light yield based on the position of the interaction, and draw the number of photons seen from a Poisson distribution.

Second we calculate the arrival times of the photons. This is based on the scintillation of the xenon atoms. It is dependend on the recombination time, the singlet and triplet fractions.

Finally the channels are calculated. Based on the pattern map we use a interpolation map to get a probability distribution for channels to be hit for a S1 signal based on the position of the interaction, and then we draw from this distribution for every photon.

S2¶

S2s are slightly more complicated. First we need to drift the electrons up, and while doing so we’ll lose some of them. To get the photon timings, we first need to get the arrival times of the electrons at the gas interface based on a diffusion model. Then we can calculate the photon timings based on a luminescence model for every individual electron. And for the channels we do the same trick with the interpolating map.

Pulse¶

When we have our lists of channels and timing we can generate actual pulses. First we add a pmt transition time. Then we loop over all channels, calculate the double pe emission probabilities, and add a current in the pmt channel based on the arrival time. This is all stored in a big dictionary. Afterwards this is passed to our fake digitizers which then returns you with your very own pretty data

Getting down to bussiness¶

Now we have access to all the normal strax data types, and another one called ‘truth’ which holds the simulation instructions. Calling it follows the normal strax convention.

[15]:

st.set_config(dict(fax_file=None))

[17]:

# Remove any previously simulated data, if such exists
# !rm -r strax_data

records = st.get_array(run_id,'records')
# peaks = st.get_array(run_id, ['peaks','peak_basics'])
# data = st.get_df(run_id, 'event_info')

truth = st.get_df(run_id, 'truth')

---------------------------------------------------------------------------
JSONDecodeError                           Traceback (most recent call last)
<ipython-input-17-98001a2a1c1a> in <module>
      2 # !rm -r strax_data
      3
----> 4 records = st.get_array(run_id,'records')
      5 # peaks = st.get_array(run_id, ['peaks','peak_basics'])
      6 # data = st.get_df(run_id, 'event_info')

~/Desktop/python/strax/strax/context.py in get_array(self, run_id, targets, save, max_workers, **kwargs)
   1106                 max_workers=max_workers,
   1107                 **kwargs)
-> 1108             results = [x.data for x in source]
   1109
   1110         results = np.concatenate(results)

~/Desktop/python/strax/strax/context.py in <listcomp>(.0)
   1106                 max_workers=max_workers,
   1107                 **kwargs)
-> 1108             results = [x.data for x in source]
   1109
   1110         results = np.concatenate(results)

~/Desktop/python/strax/strax/context.py in get_iter(self, run_id, targets, save, max_workers, time_range, seconds_range, time_within, time_selection, selection_str, keep_columns, _chunk_number, progress_bar, **kwargs)
    900                                          save=save,
    901                                          time_range=time_range,
--> 902                                          chunk_number=_chunk_number)
    903
    904         # Cleanup the temp plugins

~/Desktop/python/strax/strax/context.py in get_components(self, run_id, targets, save, time_range, chunk_number)
    768         for d in plugins.values():
    769             self._set_plugin_config(d, run_id, tolerant=False)
--> 770             d.setup()
    771         return strax.ProcessorComponents(
    772             plugins=plugins,

~/Desktop/python/wfsim/wfsim/strax_interface.py in setup(self)
    541
    542     def setup(self):
--> 543         self.set_config()
    544
    545         c=self.config

~/Desktop/python/wfsim/wfsim/strax_interface.py in set_config(self)
    564     def set_config(self,):
    565         c = self.config
--> 566         c.update(get_resource(c['fax_config'], fmt='json'))
    567         overrides = self.config['fax_config_override']
    568         if overrides is not None:

~/Desktop/python/straxen/straxen/common.py in get_resource(x, fmt)
    155     # 2. load from file
    156     elif os.path.exists(x):
--> 157         return open_resource(x, fmt=fmt)
    158     # 3. load from database
    159     elif straxen.uconfig is not None:

~/Desktop/python/straxen/straxen/common.py in open_resource(file_name, fmt)
    113     elif fmt == 'json':
    114         with open(file_name, mode='r') as f:
--> 115             result = json.load(f)
    116     elif fmt == 'binary':
    117         with open(file_name, mode='rb') as f:

~/anaconda3/envs/strax/lib/python3.7/json/__init__.py in load(fp, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    294         cls=cls, object_hook=object_hook,
    295         parse_float=parse_float, parse_int=parse_int,
--> 296         parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
    297
    298

~/anaconda3/envs/strax/lib/python3.7/json/__init__.py in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    346             parse_int is None and parse_float is None and
    347             parse_constant is None and object_pairs_hook is None and not kw):
--> 348         return _default_decoder.decode(s)
    349     if cls is None:
    350         cls = JSONDecoder

~/anaconda3/envs/strax/lib/python3.7/json/decoder.py in decode(self, s, _w)
    335
    336         """
--> 337         obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    338         end = _w(s, end).end()
    339         if end != len(s):

~/anaconda3/envs/strax/lib/python3.7/json/decoder.py in raw_decode(self, s, idx)
    353             obj, end = self.scan_once(s, idx)
    354         except StopIteration as err:
--> 355             raise JSONDecodeError("Expecting value", s, err.value) from None
    356         return obj, end

JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Now it is time to make pretty plots and see if what we makes actually makes any sense

[17]:

peak_basics = st.get_df(run_id,'peak_basics')

[18]:

peak_basics[:10]

[18]:

	time	endtime	center_time	area	n_channels	max_pmt	max_pmt_area	range_50p_area	range_90p_area	area_fraction_top	length	dt	rise_time	tight_coincidence	type
0	99999510	100000190	100000098	1511.265015	438	425	11	45.498920	108.933258	0.090103	68	10	32.073212	278	1
1	100000190	100000460	100000224	160.055038	436	425	2	41.434231	93.053680	0.088126	27	10	27.741791	19	1
2	100000470	100002090	100001584	1.275000	1	477	1	19.543358	126.469147	0.000000	162	10	12.299422	1	0
3	100002730	100003030	100002842	2.070000	3	36	0	211.971954	235.813843	1.000000	30	10	17.707243	2	0
4	100164950	100166240	100165555	18.500000	17	128	3	202.121216	416.111206	0.173243	129	10	187.104172	3	2
5	100514580	100523780	100518825	11563.562500	491	206	1228	1771.331787	4564.572266	0.087043	184	50	1627.153198	59	2
6	100870810	100871920	100871312	18.850000	19	34	1	144.104645	346.100159	0.701591	111	10	124.979836	8	2
7	100938950	100940020	100939763	20.995001	19	97	5	183.766678	469.409851	0.568231	107	10	316.340454	4	2
8	299999510	300000190	300000098	1476.585693	458	332	9	44.220722	108.634956	0.156120	68	10	30.482187	313	1
9	300000190	300000770	300000223	158.209991	455	179	1	40.539444	93.234810	0.161873	58	10	26.819931	32	1

Matching¶

To do matching with the truth the easiest way is to write a new strax plugin where you loop over peaks and get the truth arrays where the mean arrival time of the photons are within the time window of the peak So that will look something like this:

[20]:

class MatchedPeaks(strax.LoopPlugin):
    depends_on = ('peak_basics','truth')
    provides = 'matched_peaks'
    __version__ = '0.0.2'
    dtype = [('time',np.int),
             ('endtime',np.int),
             ('area',np.int),
             ('n_photon',np.int)]

    def compute(self, peaks, truth):
        result = np.zeros(len(peaks), self.dtype)

        for ix, p in enumerate(peaks):
            t = truth[(p['time']<truth['t_mean_photon'])&
                      (p['endtime']>truth['t_mean_photon'])]
            r = result[ix]
            r['time'] = p['time']
            r['endtime'] = p['endtime']
            r['area'] = p['area']
            if len(t)==0:
                r['n_photon'] = 0
            elif len(t)>1:
                r['n_photon'] = np.sum(t['n_photon'])
            else:
                r['n_photon'] = t['n_photon']

        return result

Of course this doesn’t actually work. An electron afterpulse can be very spread out leading it to be interpreted as multiple peaks while coming from 1 instruction falling outside of the specified range. It would be very much appreciated if someone wants make a more sturdy selection criteria :)

For example, checkout out peak matching algorithm for WFSim: https://github.com/XENONnT/pema

[21]:

st.register(MatchedPeaks)

[21]:

__main__.MatchedPeaks

[22]:

st.get_array(run_id,'matched_peaks')

[ ]: