πŸ” INTRODUCTION#

This is a python library to process and analyze raw data from the ProtoDUNEs, which can be used in the future for the DUNE Far Detector modules. The design objectives are:

  • Unify the tools and efforts into a common framework for the PDS working group.

  • Avoid redundant computation as much as possible.

🧐 OBJECTIVES

  1. Characterize the detector’s response

    • SPE Calibration (Gain, persistence)

    • …

  2. Physics studies

    • Signal deconvolution

    • Physics fits

    • …

  3. Electronics studies

  4. Fast analysis

    • Look at the data during ProtoDUNE data taking

    • Plot various quantities such as full waveforms, average waveform, FFT

  5. Share your codes with the PDS group

Contributing

The idea of the waffles framework is to unify efforts and to develop a robust analysis tool. Users should become developers! If you need something that is not available, or you have an idea on how to improve or optimize a part of the framework, try to implement it and commit your changes. Join the waffles Slack channel. At the moment the channel is private: send a Slack message to Francesca Alemanno or Andrea Roche to be added. You can always ask in the Slack channel if you have any questions. Your analysis can become part of the framework, and you can promote the tools you developed if they can be useful to other people.

Nevertheless, it is very important to follow some common rules so as not to harm others' work.

Tip

Good coding practices

  • Create your own branch for developing code and make small, focused commits with your changes. Once you want to share them with the world, open a pull request and wait for two reviewers to approve the merge.

  • When adding functions or methods, follow the existing analysis structure and coding conventions.

  • Update this documentation with your latest additions and modifications to the main branch (see instructions below)

Writing Code and Documentation

To keep the documentation always up to date, please follow these rules when adding new code:

  • Always write a docstring for every new function, class, and module.

  • Use either Google style or NumPy style docstrings (both are supported by Sphinx but NumPy is preferred).

  • The documentation is built automatically using Sphinx autodoc, so once you push your code, the docstrings will appear in the API Reference section without extra work.
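One way to check the first rule before pushing is a small script that lists everything in a source file that lacks a docstring. The sketch below is not part of waffles; the helper name find_missing_docstrings is hypothetical and it only relies on the standard ast module.

```python
import ast


def find_missing_docstrings(source):
    """Return the names of modules, classes and functions without a docstring."""
    tree = ast.parse(source)
    missing = []
    # The module itself should start with a docstring.
    if ast.get_docstring(tree) is None:
        missing.append("<module>")
    # Visit every function and class definition, including nested ones.
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            if ast.get_docstring(node) is None:
                missing.append(node.name)
    return missing


# Toy source used only for illustration.
example = '''
"""Module docstring."""

def documented():
    """Has a docstring."""

def undocumented():
    return 1
'''

print(find_missing_docstrings(example))  # prints ['undocumented']
```

To use it on a real file, read the file contents and pass the text to the function.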

Example of good practice for code writing (NumPy style)

def function(param1, param2=None):
    """
    Short description of the function.

    Parameters
    ----------
    param1 : type
        Description of param1.
    param2 : type, optional
        Description of param2 (default is ...).

    Returns
    -------
    type
        Description of return value.

    Raises
    ------
    ExceptionName
        Condition when this exception is raised.

    Examples
    --------
    >>> function(42)
    'result'
    """
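As an illustration of the template filled in, here is a minimal sketch of a documented function. The function estimate_baseline is hypothetical (it is not a waffles function) and simply averages the first samples of a list of ADC values; the doctest in the Examples section can be run with the standard doctest module.

```python
from statistics import fmean


def estimate_baseline(adcs, n_samples=4):
    """
    Estimate a waveform baseline as the mean of its first samples.

    Parameters
    ----------
    adcs : sequence of float
        ADC values of one waveform.
    n_samples : int, optional
        Number of leading samples to average (default is 4).

    Returns
    -------
    float
        Mean of the first ``n_samples`` ADC values.

    Raises
    ------
    ValueError
        If ``n_samples`` is not between 1 and ``len(adcs)``.

    Examples
    --------
    >>> estimate_baseline([100, 102, 98, 100, 250, 400], n_samples=4)
    100.0
    """
    if not 1 <= n_samples <= len(adcs):
        raise ValueError("n_samples must be between 1 and len(adcs)")
    return fmean(adcs[:n_samples])


if __name__ == "__main__":
    import doctest

    # Runs the example embedded in the docstring above.
    doctest.testmod()
```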

Current Workflow: brief introduction

  1. Files location: the rucio paths for the ProtoDUNE runs are stored in /eos/experiment/neutplatform/protodune/experiments/ProtoDUNE-VD/ruciopaths/ for NP02 and /eos/experiment/neutplatform/protodune/experiments/ProtoDUNE-II/PDS_Commissioning/waffles/1_rucio_paths for NP04. If the paths for a run are not available there, you can generate them: move to the waffles scripts directory, source setup_rucio_a9.sh, and run the scripts/fetch_rucio_replicas.py script with the options --runs run_number --max-files number_of_files.

  2. Data extraction: the raw data is stored in .hdf5 files. The optimal way of extracting the data is as follows:

    • Move to the waffles scripts directory

    • Modify the config.json file, according to your needs

    • Run python3 07_save_structured_from_config.py --config config.json

In this way, you can store the WaveformSet in a lighter .hdf5 file and load it whenever you want to work with it.

  3. Analysis and Visualization: the WaveformSet object can be visualized using the waffles.plotting.plot module.
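The two command lines from steps 1 and 2 can be assembled programmatically, for example when looping over several runs. The sketch below only builds the argument lists from the script names and options quoted above; the helper names and the run number 12345 are placeholders, and it assumes you launch the commands yourself from the directory each step mentions (for example by passing the list to subprocess.run).

```python
import shlex


def fetch_replicas_command(run_number, max_files):
    """Build the argument list for the rucio-path fetching script (step 1)."""
    return [
        "python3", "scripts/fetch_rucio_replicas.py",
        "--runs", str(run_number),
        "--max-files", str(max_files),
    ]


def save_structured_command(config_path="config.json"):
    """Build the argument list for the structured-data extraction script (step 2)."""
    return ["python3", "07_save_structured_from_config.py", "--config", config_path]


# Print the commands as they would be typed in a shell (placeholder values).
print(shlex.join(fetch_replicas_command(12345, 5)))
print(shlex.join(save_structured_command()))
```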

Getting Started - SETUP ⚙️

We recommend installing VSCode as your editor. Some useful extensions are: Remote-SSH, Jupyter, vscode-numpy-viewer, Python Environment Manager.

If it is your first time here, you need to create an environment to be able to use all the tools. Depending on the scope of your work, you can create a daq_env (to run the hdf5 file processing) or an ana_env (for general analysis).

DAQ ENVIRONMENT [NEEDED TO READ THE HDF5!!]

In this case all the dependencies from the DAQ needed to read the information from the .hdf5 files are included. You don’t need this environment unless you plan to work on the decoding side.

source /cvmfs/dunedaq.opensciencegrid.org/setup_dunedaq.sh

setup_dbt latest
dbt-create -l 
dbt-create fddaq-v5.2.1-a9 <my_dir>

ANA ENVIRONMENT [OPTIONAL]

This general environment is used to run the waffles library and all the tools needed to analyze the data. To create it, run the following in your terminal (from any lxplus machine or locally):

python3 -m venv /path/to/new/virtual/environment
source /path/to/new/virtual/environment/bin/activate

Or use the Python Environment Manager VSCode extension to manage your environments.

In order to have access to the ROOT library you need to have it sourced in your environment. Add these lines at the end of the bin/activate file:

source /cvmfs/sft.cern.ch/lcg/app/releases/ROOT/6.32.02/x86_64-almalinux9.4-gcc114-opt/bin/thisroot.sh
export JUPYTER_CONFIG_DIR=$VIRTUAL_ENV

To deactivate the environment just run deactivate in your terminal.
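To confirm from Python that the environment is set up as described, a quick check like the following can help. It is only a sketch: environment_report is a hypothetical helper, and it assumes that thisroot.sh exports the ROOTSYS variable, as standard ROOT installations do.

```python
import os
import sys


def environment_report(environ=None):
    """Summarize whether a venv is active and which related variables are set."""
    environ = os.environ if environ is None else environ
    return {
        # Inside a venv, sys.prefix points to the venv and differs from base_prefix.
        "venv_active": sys.prefix != sys.base_prefix,
        "VIRTUAL_ENV": environ.get("VIRTUAL_ENV"),
        # Assumed to be set by thisroot.sh once ROOT has been sourced.
        "ROOTSYS": environ.get("ROOTSYS"),
        "JUPYTER_CONFIG_DIR": environ.get("JUPYTER_CONFIG_DIR"),
    }


for key, value in environment_report().items():
    print(f"{key}: {value}")
```

A value of None for ROOTSYS after activating the environment suggests that the lines added to bin/activate were not executed.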

If you are using Jupyter inside VSCode, you may want the virtual environment to be recognized by the kernel selector. To do so, run:

#source /path/to/your/venv/bin/activate # Activate the virtual environment

# Install ipykernel in the virtual environment
pip install ipykernel 
# Add the virtual environment as a Jupyter kernel
python -m ipykernel install --user --name=your_env_name --display-name "Python_WAFFLES"
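After the installation you can verify that the kernel was registered. The sketch below assumes the jupyter_client package is available (it is installed as a dependency of ipykernel); kernel_is_registered is a hypothetical helper name, and your_env_name stands for whatever you passed to --name.

```python
def kernel_is_registered(name):
    """Return True/False if the kernel name is known, or None without jupyter_client."""
    try:
        from jupyter_client.kernelspec import KernelSpecManager
    except ImportError:
        return None
    # find_kernel_specs() maps kernel names (stored in lower case) to their folders.
    return name.lower() in KernelSpecManager().find_kernel_specs()


print(kernel_is_registered("your_env_name"))
```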

0. Download the library by cloning it from GitHub

Move to the DAQ folder.

source env.sh # Activate your DAQ environment
# Ensure your SSH keys are properly set up, then:
git clone git@github.com:DUNE/waffles.git # Clone the repository
cd waffles
git checkout -b <your_branch_name>            # Create a branch to develop

Here is a summary of the folder structure of the repository:

Warning

This folder structure is not up to date, but it gives a good idea of the general layout.

├── 'docs'/ # FOLDER WITH THE REPO DOCUMENTATION (THIS TEXT CAN BE IMPROVED BY YOU!)
    ├── 'examples'/
        └── '4_Examples.rst'
    ├── '1_Intro.md'
    ├── '2_Scripts.md'
    ├── '3_Libraries.rst'
    ├── 'conf.py'
    ├── 'data_classes.rst'
    ├── 'input.rst'
    ├── 'np04_data.rst'
    ├── 'np04_data_classes.rst'
    ├── 'np04_utils.rst'
    ├── 'output.rst'
    ├── 'plotting.rst'
    └── 'utils.rst'

└── 'scripts'/ # FOLDER WITH THE SCRIPTS
    ├── 'cpp_utils'/ # C++ raw functions and scripts (can be used in standalone mode) [Thanks Jairo!]
        ├── 'functions'/
            ├── 'channelmap.txt'
            ├── 'hdf5torootclass.h'
            └── 'wffunctions.h'
        ├── 'CMakeLists.txt'
        ├── 'compile_decoder.sh' # Script to compile C++ scripts (just 1st time) and be able to use them
        ├── 'HDF5LIBS_duplications.cpp' # C++ script to check for duplications in the hdf5 files
        ├── 'HDF5toROOT_decoder.cpp'    # C++ script to decode hdf5 files to root files
        ├── 'plotsAPA.C' # ROOT script to plot the APA map
        └── 'README.md'  # Instructions to compile and run the C++ scripts
    ├── '00_HDF5toROOT.py' # Python decoder (hdf5 to root) with multithreading
    ├── '00_HDF5toROOT.sh' # Bash script for managing CPP macros. If you already compiled them (cpp_utils) you can run this one.
    ├── 'get_protodunehd_files.sh' # Script to get rucio_paths from the hdf5 daq files
    ├── 'get_rucio.py' # RUN to synchronize rucio_paths with the /eos/ folder. You will save time and make others save time too!
    ├── 'README.md'
    └── 'setup_rucio.sh' # Standalone commands for setting up rucio once you are inside a SL7

└── 'src'/  # MAIN CODE CORE WITH ALL THE CLASSES DEFINITIONS HERE#
    ├── 'waffles'/
        ├── 'data_classes'/ # FOLDER WITH THE DATA CLASSES DEFINITIONS
            ├── 'BasicWfAna.py'
            ├── 'CalibrationHistogram.py'
            ├── 'ChannelWs.py'
            ├── 'ChannelWsGrid.py'
            ├── 'IODict.py'
            ├── 'IPDict.py'
            ├── 'Map.py'
            ├── 'ORDict.py'
            ├── 'PeakFindingWfAna.py'
            ├── 'TrackedHistogram.py'
            ├── 'Waveform.py'
            ├── 'WaveformAdcs.py'
            ├── 'WaveformSet.py'
            ├── 'WfAna.py'
            ├── 'WfAnaResult.py'
            └── 'WfPeak.py'
        ├── 'input'/ # FOLDER WITH THE INPUT UTILS
            ├── 'input_utils.py'
            ├── 'pickle_file_to_WaveformSet.py'
            ├── 'raw_hdf5_reader.py'
            └── 'raw_root_reader.py'
        ├── 'np04_analysis'/ # FOLDER WITH THE ANALYSIS UTILS
            ├── 'LED_calibration'
            └── 'np04_ana.py'
        ├── 'np04_data'/ # FOLDER WITH THE DATA UTILS
            └── 'ProtoDUNE_HD_APA_maps.py'
        ├── 'np04_data_classes'/ # FOLDER WITH THE DATA CLASSES 
            └── 'APAmap.py'
        ├── 'np04_utils'/ # FOLDER WITH NP04 UTILS
            └── 'utils.py'
        ├── 'plotting'/ # FOLDER WITH THE PLOTTING UTILS
            └── 'display'
            ├── 'plot_utils.py'
            └── 'plot.py'
        └── 'utils'/ # FOLDER WITH THE GENERAL UTILS
            ├── 'deconvolution'/ # FOLDER WITH THE DECONVOLUTION METHODS
            ├── 'fit_peaks'/     # FOLDER WITH THE FITTING PEAKS METHODS 
            ├── 'check_utils.py' 
            ├── 'filtering_utils.py' 
            ├── 'numerical_utils.py'
            ├── 'wf_maps_utils.py'
            └── 'Exceptions.py'
└── 'test' # FOLDER WITH FILES UNDER TEST (temporary)
├── '.gitattributes'
├── '.gitignore'
├── '.readthedocs.yaml' # Configuration file for the documentation
├── 'environment.yaml'  # Environment file for the conda environment
├── '.README.md'
├── 'requirements.md' # Requirements for the environment
└── 'setup.py'        # Setup file for the library

1. Install packages needed for the library to run

After activating the env with source env.sh or source /path/to/new/virtual/environment/bin/activate you can install all the requirements to run waffles by navigating to the repository main folder and running:

which python3 # Should show the .venv Python
python3 -m pip install -r requirements.txt .

If you later change the source code and want waffles to pick up your changes, re-run the second command (python3 -m pip install -r requirements.txt .).
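Because the command above installs a copy of the package, it can be useful to see which file Python actually imports. The following sketch uses only the standard importlib machinery; module_origin is a hypothetical helper, and the only waffles-specific assumption is that the import name is waffles (as in waffles.plotting.plot).

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None  # The module is not installed in this environment.
    return spec.origin


# A path inside the environment's site-packages indicates the installed copy.
print(module_origin("waffles"))
```

If the printed path points into site-packages rather than your working copy, edits to the source tree take effect only after reinstalling.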

2. Make sure you have access to data to analyze

  • Make sure you know how to connect to and work from the @lxplus.cern.ch machines. It is better to work in your personal folder in /afs/cern.ch/work/, because of the larger disk quota available there.

  • To access raw data locations you need to be able to generate a FNAL.GOV ticket. This is already configured in the scripts/fetch_rucio_replicas.py script which is used to generate txt files with the data paths and store them in your local folder.

  • Request access to the np04-t0comp-users and np04-daq-dev egroups on the CERN egroups page. This also adds you to the np-comp Linux group.
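Once your access has been granted, a short check can confirm that the rucio path folders quoted earlier on this page are visible from your session. This is a sketch with a hypothetical helper name (check_readable); the two paths are the ones given in the workflow section, and the check only makes sense on a machine where /eos is mounted, such as lxplus.

```python
import os

# Locations quoted in the workflow section of this page.
RUCIO_PATH_DIRS = {
    "NP02": "/eos/experiment/neutplatform/protodune/experiments/ProtoDUNE-VD/ruciopaths/",
    "NP04": "/eos/experiment/neutplatform/protodune/experiments/ProtoDUNE-II/PDS_Commissioning/waffles/1_rucio_paths",
}


def check_readable(paths):
    """Map each label to True if its directory exists and can be listed."""
    return {
        label: os.path.isdir(path) and os.access(path, os.R_OK | os.X_OK)
        for label, path in paths.items()
    }


for label, ok in check_readable(RUCIO_PATH_DIRS).items():
    print(f"{label}: {'accessible' if ok else 'not accessible'}")
```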

3. Have a look at the examples and enjoy!