List of Modules and Sub-Packages

Note

Once the API documentation is fixed (by cleaning up the import * statements), we can add links to each module here.

core

Provides the core components of pyspecdata. Currently, this is a very large file that we will slowly break down into separate modules or packages.

The classes nddata, nddata_hdf, ndshape, the function plot(), and the class fitdata are the core components of the N-Dimensional processing routines. Start by familiarizing yourself with those.

The figlist is the base class for “Figure lists.” Figure lists allow you to organize plots and text and to refer to plots by name, rather than by number. They are designed so that the same code can be used seamlessly from within ipython, jupyter, a python script, or a python environment within latex (JMF can also distribute latex code for this – a nice python-based installer is planned). The user does not initialize the figlist class directly, but rather initializes figlist_var. At the end of this file, a snippet of code sets figlist_var to the choice that’s appropriate for the working environment (i.e., python, latex environment, etc.)

There are many helper and utility functions that need to be sorted and documented by JMF, and can be ignored for now. These are somewhat wide-ranging in nature. For example, box_muller() is a helper function (based on Numerical Recipes) used by nddata.add_noise(), while the h5 functions are helpers for using pytables in a fashion that will hopefully be intuitive to those familiar with SQL, etc.
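As an illustration of one such helper, the Box–Muller transform (which box_muller() is based on) turns pairs of uniform random deviates into normally distributed ones. The sketch below is a generic implementation of the transform itself, not pyspecdata’s actual code:

```python
import numpy as np

def box_muller_sketch(n, rng=None):
    """Generate n (approximately) standard-normal samples from
    uniform deviates using the Box-Muller transform."""
    rng = np.random.default_rng(0) if rng is None else rng
    # draw pairs of uniform deviates; u1 is kept in (0, 1] so the log is finite
    u1 = 1.0 - rng.random((n + 1) // 2)
    u2 = rng.random((n + 1) // 2)
    r = np.sqrt(-2.0 * np.log(u1))  # radius determined by the first deviate
    # the angle from the second deviate yields two independent normals
    z = np.concatenate([r * np.cos(2 * np.pi * u2),
                        r * np.sin(2 * np.pi * u2)])
    return z[:n]
```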

figlist

Contains the figure list class

The figure list gives us several things:

  • Automatically handle the display and scaling of nddata units.

  • Refer to plots by name, rather than number (matplotlib has a mechanism for this, which we ignore)

  • A “basename” allowing us to generate multiple sets of plots for different datasets – e.g. 5 plots with 5 names plotted for 3 different datasets and labeled by 3 different basenames to give 15 plots total

  • Ability to run the same code from the command line or from within a python environment inside latex. This is achieved by choosing figlist (the default gui) or figlistl (which inherits from figlist and renders to latex – the figlist.show() method is changed).

    • potential planned future ability to handle html

  • Ability to handle mayavi plots and matplotlib plots (switch to glumpy, etc.?)

    • potential planned future ability to handle gnuplot
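The basename mechanism can be illustrated with plain Python (this sketch only demonstrates the naming arithmetic – 3 basenames × 5 plot names → 15 figures – not the actual figlist implementation; the dataset and plot names are made up):

```python
# Illustration of how a basename multiplies a fixed set of plot
# names across datasets: 3 basenames x 5 plot names -> 15 figures.
plot_names = ["raw", "ft", "phased", "fit", "residual"]
datasets = ["sample_A", "sample_B", "sample_C"]

figures = []
for basename in datasets:
    # in a real figure list, one would set fl.basename = basename,
    # then generate the same 5 named plots for each dataset
    for name in plot_names:
        figures.append(f"{basename} {name}")

print(len(figures))  # 15 distinct figure names
```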

Todo

Currently the “items” that the list tracks correspond to either plot formatting directives (see figlist.setprops()), text, or figures.

We should scrap most elements of the current implementation of figlist and rebuild it

  • Currently, the figlist is set up to use a context block. We will not only keep this, but also make it so that individual axes use context blocks. The syntax (following fl = figlist_var()) should look like this: with fl['my plot name'] as p: – the contents of the block would then be p.plot(...), etc.

  • Define an “organization” function of the figlist block. This allows us to set up and organize the axes using standard matplotlib commands (twinx, subplot, etc.)

  • figlist will still have a “next” function, but its purpose will simply be to:

    • grab the current axis using matplotlib gca() (assuming the id of the axis isn’t yet assigned to an existing figlist_axis – see below)

    • otherwise, if the name argument to “next” has not yet been called, call matplotlib’s figure(), followed by subplot(111), then do the previous bullet point

    • the next function is only intended to be called explicitly from within the organization function

  • figlist will consist simply of a list of figlist_axis objects (a new object type), which have the following attributes:

    • the type – indicating the type of object:

      • axis (default)

      • text (raw latex (or html))

      • H1 (first level header – translates to latex section)

      • H2 (second level…)

    • the name of the plot

    • a matplotlib or mayavi axes object

    • the units associated with the axes

    • a collections.OrderedDict giving the nddata that are associated with the plot, by name.

      • If these do not have a name, they will be automatically assigned one.

      • The name should be used by the new “plot” method to generate the “label” for the legend, and can be subsequently used to quickly replace data – e.g. in a Qt application.

    • a dictionary giving any arguments to the pyspecdata.core.plot (or contour, waterfall, etc.) function

    • the title – by default the name of the plot – can be a setter

    • the result of the id(…) function, called on the axes object –> this can be used to determine if the axes has been used yet

    • do not use check_units – the plot method (or contour, waterfall, etc.) will only add the nddata objects to the OrderedDict, add the arguments to the argument dictionary, then exit

      • In the event that more than one plot method is called, the name of the underlying nddata should be changed

    • a boolean legend_suppress attribute

    • a boolean legend_internal attribute (to place the legend internally, rather than outside the axis)

    • a show method that is called by the figlistl show method. This will determine the appropriate units and scale of the axes, and then go through and call pyspecdata.core.plot on each dataset (in matplotlib, this should be done with a formatting statement rather than by manipulating the axes themselves), and finally call autolegend, unless the legend is suppressed

  • The “plottype” (currently an argument to the plot function) should be an attribute of the axis object
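The proposed with fl['my plot name'] as p: syntax could be prototyped roughly as follows. This is a minimal mock-up of the interface being proposed above, not working pyspecdata code – figlist_sketch and its internals are stand-ins:

```python
from contextlib import contextmanager

class figlist_sketch:
    """Minimal mock-up of the proposed interface: the figure list keeps
    named axis records, and indexing by name opens a context block
    for plotting into that axis."""
    def __init__(self):
        self.axes = {}  # name -> list of recorded plot calls

    @contextmanager
    def __getitem__(self, name):
        ax = self.axes.setdefault(name, [])
        # a real implementation would create/activate a matplotlib
        # (or mayavi) axis here and record its id()
        yield ax
        # on exit, a real implementation could finalize units/legend

fl = figlist_sketch()
with fl['my plot name'] as p:
    p.append(('plot', 'some data'))  # stands in for p.plot(...)
```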

general_functions

These are general functions that need to be accessible to everything inside pyspecdata.core. I can’t just put these inside pyspecdata.core, because that would lead to cyclic imports, and e.g. submodules of pyspecdata can’t find them.

datadir

Allows the user to run the same code on different machines, even though the location of the raw spectral data might change.

This is controlled by the ~/.pyspecdata or ~/_pyspecdata config file.
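For illustration, such a config file is INI-style; the section and key names below are hypothetical placeholders – check the file that pyspecdata generates on your own machine for the actual layout:

```ini
; hypothetical example of a ~/.pyspecdata (or ~/_pyspecdata) file
[General]
data_directory = /home/me/spectral_data

[ExpTypes]
; maps an exp_type name to the directory holding that data
esr = /home/me/spectral_data/ESR
nmr = /home/me/spectral_data/NMR
```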

load_files

This subpackage holds all the routines for reading raw data in proprietary formats. It’s intended to be accessed entirely through the function find_file(), which uses the datadir module to search for the filename, then automatically identifies the file type and calls the appropriate module to load the data into an nddata.

Currently, Bruker file formats (both ESR and NMR) are supported, as well as (at least some earlier iterations of) Magritek file formats.

Users/developers are very strongly encouraged to add support for new file types.

pyspecdata.load_files.find_file(searchstring, exp_type=None, postproc=None, print_result=True, verbose=False, prefilter=None, expno=None, dimname='', return_acq=False, add_sizes=[], add_dims=[], use_sweep=None, indirect_dimlabels=None, lookup={}, return_list=False, **kwargs)

Find the file given by the regular expression searchstring inside the directory identified by exp_type, load the nddata object, and postprocess with the function postproc.

It looks at the top level of the directory first, and if that fails, starts to look recursively. Whenever it finds a file in the current directory, it will not return data from files in the directories underneath. (For a more thorough description, see getDATADIR()).

Note that all loaded files will be logged in the data_files.log file in the directory that you run your python scripts from (so that you can make sure they are properly synced to the cloud, etc.).

It calls load_indiv_file(), which finds the specific routine from inside one of the modules (sub-packages) associated with a particular file-type.

If it can’t find any files matching the criterion, it logs the missing file and throws an exception.

Parameters:
  • searchstring (str) – Most commonly, this is just a fragment of the file name, with any literal *, ., or ? characters preceded by a backslash. More generally, it is a regular expression, where .*searchstring.* matches a filename inside the directory appropriate for exp_type.

  • exp_type (str) – Gives the name of a directory known to pyspecdata that contains the file of interest. For a directory to be known to pyspecdata, it must be registered with the (terminal/shell/command prompt) command pyspecdata_register_dir, or be contained inside (underneath) such a directory.

  • expno (int) – For Bruker NMR and Prospa files, where the files are stored in numbered subdirectories, give the number of the subdirectory that you want. Currently, this parameter is needed to load Bruker and Kea files. If it finds multiple files that match the regular expression, it will try to load this experiment number from all the directories.

  • postproc (function, str, or None) –

    This function is fed the nddata data and the remaining keyword arguments (kwargs) as arguments. It’s assumed that each module for each different file type provides a dictionary called postproc_lookup (some are already available in pySpecData, but also, see the lookup argument, below).

    If postproc is a string, it looks up the string inside the postproc_lookup dictionary that’s appropriate for the file type.

    If postproc is None, it checks whether any of the loading functions that were called set the postproc_type property – i.e. it checks the value of data.get_prop('postproc_type') – if this is set, it uses it as a key to pull the corresponding value from postproc_lookup. For example, if this is a Bruker file, it sets postproc to the name of the pulse sequence.

    For instance, when the acert module loads an ACERT HDF5 file, it sets postproc_type to the value of (h5 root).experiment.description['class']. This, in turn, is used to choose the type of post-processing.

  • dimname – passed to load_indiv_file()

  • return_acq – passed to load_indiv_file()

  • add_sizes – passed to load_indiv_file()

  • add_dims – passed to load_indiv_file()

  • use_sweep – passed to load_indiv_file()

  • indirect_dimlabels – passed to load_indiv_file()

  • lookup (dictionary with str:function pairs) – types of postprocessing to add to the postproc_lookup dictionary
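The searchstring matching described above can be illustrated with the re module alone. This sketch only demonstrates the .*searchstring.* pattern semantics from the parameter description; the filenames are hypothetical, and this is not find_file’s actual directory-walking code:

```python
import re

def matches_searchstring(searchstring, filenames):
    """Return the filenames matched by .*searchstring.* -- the rule
    described for find_file's searchstring parameter."""
    pattern = re.compile('.*' + searchstring + '.*')
    return [f for f in filenames if pattern.match(f)]

# a literal '.' must be preceded by a backslash, as the description notes
files = ['sample_run1.DSC', 'sample_run2.DSC', 'notes.txt']
print(matches_searchstring(r'run1\.DSC', files))  # ['sample_run1.DSC']
```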

fornotebook

This provides figlistl, the Latex figure list. Any other functions here are helper functions for the class. figlistl is generally not chosen manually; rather, figlist_var will be assigned to figlistl when python code is embedded in a python environment inside latex.

latexscripts

Provides the pdflatex_notebook_wrapper shell/dos command, which you run instead of your normal Latex command to build a lab notebook. The results of python environments are cached and only re-run if the code changes, even if the python environments are moved around. This makes the compilation of a Latex lab notebook extremely efficient.
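The cache-by-code-content idea can be sketched generically. This is not the actual latexscripts implementation – just the underlying technique of keying cached results on a hash of the code text, so that moving an environment around in the document does not invalidate its cache entry:

```python
import hashlib

cache = {}  # hash of code text -> cached output

def run_cached(code, execute):
    """Re-run execute(code) only when the code text has changed;
    the position of the environment is irrelevant to the key."""
    key = hashlib.sha256(code.encode()).hexdigest()
    if key not in cache:
        cache[key] = execute(code)
    return cache[key]

calls = []
def fake_run(code):
    # stands in for actually executing a python environment
    calls.append(code)
    return f"output of: {code}"

run_cached("print(1+1)", fake_run)
run_cached("print(1+1)", fake_run)  # cache hit -- not re-executed
```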

ipy

Provides the jupyter extension:

%load_ext pyspecdata.ipy

This allows for fancy representation of nddata instances – i.e. you can type the name of an instance and hit shift-Enter, and a plot will appear rather than a text representation.

Also overrides plain text representation of numpy arrays with latex representation that we build ourselves or pull from sympy.
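Building a latex representation for an array works roughly like this (a generic self-built sketch of the idea, not the actual pyspecdata.ipy code):

```python
import numpy as np

def latex_matrix_sketch(a):
    """Render a 2-D array as a latex pmatrix string."""
    # format each row as 'x & y & ...', then join rows with latex row breaks
    rows = [' & '.join(f'{x:.3g}' for x in row) for row in np.atleast_2d(a)]
    body = ' \\\\\n'.join(rows)
    return '\\begin{pmatrix}\n' + body + '\n\\end{pmatrix}'

print(latex_matrix_sketch(np.array([[1, 2], [3, 4]])))
```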

Also known as “generalized jupyter awesomeness” in only ~150 lines of code!

See [O’Reilly Book](https://www.safaribooksonline.com/blog/2014/02/11/altering-display-existing-classes-ipython/) for minimal guidance if you’re interested.

ndshape

The ndshape class allows you to allocate arrays and determine the shape of existing arrays.
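The idea can be illustrated with a minimal analogue built on numpy. pyspecdata’s actual ndshape has a richer interface; the class below is only a sketch of the allocate-by-named-shape concept:

```python
from collections import OrderedDict
import numpy as np

class ndshape_sketch:
    """Pair dimension names with sizes, and allocate arrays to match."""
    def __init__(self, sizes, dimnames):
        self.shape = OrderedDict(zip(dimnames, sizes))

    def alloc(self, dtype=np.float64):
        # allocate a zeroed array whose shape matches the named sizes
        return np.zeros(tuple(self.shape.values()), dtype=dtype)

    def __repr__(self):
        return ' x '.join(f"{k}:{v}" for k, v in self.shape.items())

a = ndshape_sketch([3, 5], ['t2', 'vd']).alloc()
print(a.shape)  # (3, 5)
```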

units

Not yet implemented – a preliminary idea for how to handle actual unit conversion. (Currently, we only convert s to Hz during FT and apply order-of-magnitude prefixes when plotting.)