List of Modules and Sub-Packages¶
Note
Once the API documentation is fixed (by cleaning up the import * statements), we can add links to each module here.
core¶
Provides the core components of pyspecdata. Currently, this is a very large file that we will slowly break down into separate modules or packages.
The classes nddata, nddata_hdf, and ndshape, the function plot(), and the class fitdata are the core components of the N-Dimensional processing routines.
Start by familiarizing yourself with those.
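For example, a minimal sketch of the core objects in action (the one-dimensional nddata(data, dimension-name) constructor form and the top-level imports are assumptions based on recent versions of the package – check the API docs for exact signatures):

    import numpy as np
    from pyspecdata import nddata, ndshape, plot

    # build a 1D dataset with a named, labeled time axis
    t = np.r_[0:1:500j]
    d = nddata(np.exp(-t / 0.1), 't').labels('t', t)
    d.set_units('t', 's')  # units travel with the data
    print(ndshape(d))      # the shape, together with dimension names
    plot(d)                # axis labels and units come from the nddata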
The figlist is the base class for “Figure lists.”
Figure lists allow you to organize plots and text and to refer to plots by name, rather than number.
They are designed so that the same code can be used seamlessly from within ipython, jupyter, a python script, or a python environment within latex (JMF can also distribute latex code for this – a nice python-based installer is planned).
The user does not initialize the figlist class directly, but rather initializes figlist_var.
At the end of this file, there is a snippet of code that sets figlist_var to a choice that’s appropriate for the working environment (i.e., python, latex environment, etc.).
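In practice, a minimal sketch looks something like this (the one-dimensional nddata constructor form is an assumption; see the core classes above):

    import numpy as np
    from pyspecdata import figlist_var, nddata

    mydata = nddata(np.exp(-np.r_[0:1:100j] / 0.1), 't')
    fl = figlist_var()      # resolves to the class appropriate for the environment
    fl.next('decay curve')  # refer to the figure by name, not number
    fl.plot(mydata)
    fl.show()               # renders to the gui, latex, etc., as appropriate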
There are many helper and utility functions that need to be sorted and documented by JMF, and can be ignored for now.
These are somewhat wide-ranging in nature.
For example, box_muller() is a helper function (based on Numerical Recipes) used by nddata.add_noise(), while the h5 functions are helper functions for using pytables in a fashion that will hopefully be intuitive to those familiar with SQL, etc.
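For example, a minimal sketch of add_noise() (the single scale argument shown here is an assumption – check the API for the exact signature):

    import numpy as np
    from pyspecdata import nddata

    d = nddata(np.ones(100), 't')
    d.add_noise(0.1)  # Gaussian noise, generated internally via box_muller()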
figlist¶
Contains the figure list class
The figure list gives us several things:
- Automatic handling of the display and scaling of nddata units.
- The ability to refer to plots by name, rather than number (matplotlib has a mechanism for this, which we ignore).
- A “basename” allowing us to generate multiple sets of plots for different datasets – e.g. 5 plots with 5 names plotted for 3 different datasets and labeled by 3 different basenames to give 15 plots total (see the sketch after this list).
- The ability to run the same code from the command line or from within a python environment inside latex.
  - This is achieved by choosing figlist (default gui) and figlistl (inherits from figlist – renders to latex – the figlist.show() method is changed).
  - Potential planned future ability to handle html.
- The ability to handle mayavi plots and matplotlib plots (switch to glumpy, etc.?).
  - Potential planned future ability to handle gnuplot.
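A sketch of the basename mechanism mentioned above (load_and_process is a hypothetical stand-in for whatever routine produces your nddata):

    from pyspecdata import figlist_var

    fl = figlist_var()
    for sample in ['sample_A', 'sample_B', 'sample_C']:
        fl.basename = sample          # prefixes every plot name that follows
        d = load_and_process(sample)  # hypothetical loading routine
        fl.next('raw data')           # yields "sample_A raw data", etc.
        fl.plot(d)
    fl.show()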
Todo
Currently the “items” that the list tracks correspond to either plot formatting directives (see figlist.setprops()), text, or figures.
We should scrap most elements of the current implementation of figlist and rebuild it:
- Currently the figlist is set up to use a context block. We will not only keep this, but also make it so that the individual axes use context blocks. The syntax (following fl = figlist_var()) should look like this: with fl['my plot name'] as p: – and the contents of the block would then be p.plot(...), etc.
- Define an “organization” function of the figlist block. This allows us to use standard matplotlib commands (twinx, subplot, etc.) to set up and organize the axes.
- figlist will still have a “next” function, but its purpose will be simply to:
  - grab the current axis using matplotlib gca() (assuming the id of the axis isn’t yet assigned to an existing figlist_axis – see below)
  - otherwise, if the name argument to “next” has not yet been called, call matplotlib’s figure(), followed by subplot(111), then do the previous bullet point
  - (the next function is only intended to be called explicitly from within the organization function)
- figlist will consist simply of a list of figlist_axis objects (a new object type), which have the following attributes:
  - type – indicating the type of object:
    - axis (default)
    - text (raw latex (or html))
    - H1 (first-level header – translates to latex section)
    - H2 (second-level…)
  - the name of the plot
  - a matplotlib or mayavi axes object
  - the units associated with the axes
  - a collections.OrderedDict giving the nddata that are associated with the plot, by name.
    - If these do not have a name, they will be automatically assigned one.
    - The name should be used by the new “plot” method to generate the “label” for the legend, and can be subsequently used to quickly replace data – e.g. in a Qt application.
  - a dictionary giving any arguments to the pyspecdata.core.plot (or contour, waterfall, etc.) function
  - the title – by default the name of the plot – can be a setter
  - the result of the id(…) function, called on the axes object –> this can be used to determine if the axes has been used yet
  - do not use check_units – the plot method (or contour, waterfall, etc.) will only add the nddata objects to the OrderedDict, add the arguments to the argument dictionary, then exit.
    - In the event that more than one plot method is called, the name of the underlying nddata should be changed.
  - a boolean legend_suppress attribute
  - a boolean legend_internal attribute (to place the legend internally, rather than outside the axis)
  - a show method that is called by the figlistl show method. This will determine the appropriate units and use them to set the units and scale of the axes, and then go through and call pyspecdata.core.plot on each dataset (in matplotlib, this should be done with a formatting statement rather than by manipulating the axes themselves) and finally call autolegend, unless the legend is suppressed.
- The “plottype” (currently an argument to the plot function) should be an attribute of the axis object.
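To make the proposal concrete, here is a sketch of the planned (not yet implemented) context-block syntax described above – every call on p is hypothetical:

    import numpy as np
    from pyspecdata import figlist_var, nddata

    mydata = nddata(np.ones(10), 't')  # any nddata instance
    fl = figlist_var()
    with fl['my plot name'] as p:  # planned: a named axis as a context block
        p.plot(mydata)             # planned: registers the data + plot arguments only
    fl.show()                      # planned: units, scaling, legends resolved here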
general_functions¶
These are general functions that need to be accessible to everything inside pyspecdata.core. I can’t just put these inside pyspecdata.core, because that would lead to cyclic imports – and then, e.g., submodules of pyspecdata couldn’t find them.
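For example, a sketch of the layering (strm is one such general-purpose helper; check the module for the full list):

    # inside pyspecdata/core.py (or any submodule of pyspecdata):
    from .general_functions import strm  # general_functions imports nothing from
                                         # the rest of the package, so no cycle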
datadir¶
Allows the user to run the same code on different machines, even though the location of the raw spectral data might change.
This is controlled by the ~/.pyspecdata or ~/_pyspecdata config file.
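For illustration, such a config file might look roughly like this (the paths are invented, and the exact section/key names may differ – the file is normally written for you, e.g. by pyspecdata_register_dir or by the automatic rclone search described under find_file(), rather than by hand):

    [General]
    data_directory = /home/me/exp_data

    [ExpTypes]
    nmr_data = /home/me/exp_data/NMR_Data

    [RcloneRemotes]
    ; entries added automatically when a remote search succeeds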
load_files¶
This subpackage holds all the routines for reading raw data in proprietary
formats.
It’s intended to be accessed entirely through the function find_file(), which uses datadir to search for the filename, then automatically identifies the file type and calls the appropriate module to load the data into an nddata.
Currently, Bruker file formats (both ESR and NMR) are supported, as well as (at least some earlier iterations of) Magritek file formats.
Users/developers are very strongly encouraged to add support for new file types.
- pyspecdata.load_files.find_file(searchstring, exp_type=None, postproc=None, print_result=True, verbose=False, prefilter=None, expno=None, dimname='', return_acq=False, add_sizes=[], add_dims=[], use_sweep=None, indirect_dimlabels=None, lookup={}, return_list=False, **kwargs)¶
Find the file given by the regular expression searchstring inside the directory identified by exp_type, load the nddata object, and postprocess with the function postproc.
Used to find data in a way that works seamlessly across different computers (and operating systems). The basic scheme we assume is that:
- Laboratory data is stored on the cloud (on something like Microsoft Teams or Google Drive, etc.).
- The user wants to seamlessly access the data on their laptop.
The .pyspecdata config file stores all the info about where the data lives + is stored locally. You have basically two options:
- Point the source directories for the different data folders (exp_type) to a synced folder on your laptop.
- Recommended: Point the source directories to a local directory on your computer, where local copies of files are stored, and then also set up one or more remotes using rclone (which is an open source cloud access tool).
  - pyspecdata can automatically search all your rclone remotes when you try to load a file. This is obviously slow.
  - After the auto-search, it adds a line to .pyspecdata so that it knows how to find that directory in the future.
  - It will tell you when it’s searching the remotes. If you know what you’re doing, we highly recommend pressing ctrl-C and then manually adding the appropriate line to RcloneRemotes. (Once you allow it to auto-search and add a line once, the format should be obvious.)
Supports the case where data is processed both on a laboratory computer and (e.g. after transferring via ssh or a syncing client) on a user’s laptop. While it will return a default directory without any arguments, it is typically used with the keyword argument exp_type, described below.
It looks at the top level of the directory first, and if that fails, starts to look recursively. Whenever it finds a file in the current directory, it will not return data from files in the directories underneath. (For a more thorough description, see getDATADIR().) Note that all loaded files will be logged in the data_files.log file in the directory that you run your python scripts from (so that you can make sure they are properly synced to the cloud, etc.).
It calls load_indiv_file(), which finds the specific routine from inside one of the modules (sub-packages) associated with a particular file-type. If it can’t find any files matching the criterion, it logs the missing file and throws an exception.
- Parameters:
searchstring (str) –
If you don’t know what a regular expression is, you probably want to wrap your filename with re.escape, like this: re.escape(filename), and use that for your searchstring. (You will have to import the re module.)
If you know what a regular expression is, pass one here, and it will find any filenames that match.
exp_type (str) – Gives the name of a directory, known to be pyspecdata, that contains the file of interest. For a directory to be known to pyspecdata, it must be registered with the (terminal/shell/command prompt) command pyspecdata_register_dir or in a directory contained inside (underneath) such a directory.
expno (int) – For Bruker NMR and Prospa files, where the files are stored in numbered subdirectories, give the number of the subdirectory that you want. Currently, this parameter is needed to load Bruker and Kea files. If it finds multiple files that match the regular expression, it will try to load this experiment number from all the directories.
postproc (function, str, or None) –
This function is fed the nddata data and the remaining keyword arguments (kwargs) as arguments. It’s assumed that each module for each different file type provides a dictionary called postproc_lookup (some are already available in pySpecData; see also the lookup argument, below).
If postproc is a string, it looks up the string inside the postproc_lookup dictionary that’s appropriate for the file type.
If postproc is None, it checks to see if any of the loading functions that were called set the postproc_type property – i.e. it checks the value of data.get_prop('postproc_type') – and if this is set, it uses it as a key to pull the corresponding value from postproc_lookup. For example, if this is a bruker file, it sets postproc to the name of the pulse sequence. For instance, when the acert module loads an ACERT HDF5 file, it sets postproc_type to the value of (h5 root).experiment.description['class']. This, in turn, is used to choose the type of post-processing.
dimname, return_acq, add_sizes, add_dims, use_sweep, indirect_dimlabels – each of these is passed through to load_indiv_file().
lookup (dict of str: function pairs) – types of postprocessing to add to the postproc_lookup dictionary.
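As a usage sketch (the filename and exp_type are hypothetical – exp_type must name a directory registered via pyspecdata_register_dir – and the top-level import assumes find_file is re-exported by the package, per the import * note above):

    import re
    from pyspecdata import find_file

    # search for a literal filename: escape it so it's treated as plain
    # text rather than as a regular expression
    d = find_file(re.escape('240110_my_sample'), exp_type='NMR_Data', expno=1)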
fornotebook¶
This provides figlistl, the Latex figure list.
Any other functions here are helper functions for the class.
figlistl is generally not chosen manually; rather, figlist_var will be assigned to figlistl when python code is embedded in a python environment inside latex.
latexscripts¶
Provides the pdflatex_notebook_wrapper shell/dos command, which you run instead of your normal Latex command to build a lab notebook.
The results of python environments are cached and only re-run if the code changes,
even if the python environments are moved around.
This makes the compilation of a Latex lab notebook extremely efficient.
ipy¶
Provides the jupyter extension:
%load_ext pyspecdata.ipy
which allows for fancy representation of nddata instances – i.e. you can type the name of an instance and hit shift-Enter, and a plot will appear rather than some text representation.
Also overrides plain text representation of numpy arrays with latex representation that we build ourselves or pull from sympy.
Also known as “generalized jupyter awesomeness” in only ~150 lines of code!
See the O’Reilly blog post (https://www.safaribooksonline.com/blog/2014/02/11/altering-display-existing-classes-ipython/) for minimal guidance if you’re interested.
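In a jupyter notebook, usage looks something like this (a sketch – the one-dimensional nddata constructor form is an assumption):

    %load_ext pyspecdata.ipy
    import numpy as np
    from pyspecdata import nddata

    d = nddata(np.sin(np.r_[0:2 * np.pi:100j]), 't')
    d  # evaluating the instance displays a plot, not a text repr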
ndshape¶
The ndshape class allows you to allocate arrays and determine the shape of existing arrays.
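A quick sketch of both uses (the constructor forms shown here are assumptions – check the API docs for exact signatures):

    import numpy as np
    from pyspecdata import nddata, ndshape

    d = nddata(np.zeros((3, 4)), ['x', 'y'])
    print(ndshape(d))  # the shape of an existing array, with dimension names

    new_data = ndshape([3, 4], ['x', 'y']).alloc()  # allocate a fresh nddata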
units¶
Not yet implemented – a preliminary idea for how to handle actual unit conversion. (Currently, we only convert s to Hz during FT and apply order-of-magnitude prefixes when plotting.)