This page holds stubs for functions that will be included in the API documentation.

load_file

pyspecdata.load_files.find_file(searchstring, exp_type=None, postproc=None, print_result=True, verbose=False, prefilter=None, expno=None, dimname='', return_acq=False, add_sizes=[], add_dims=[], use_sweep=None, indirect_dimlabels=None, lookup={}, return_list=False, zenodo=None, **kwargs)

Find the file given by the regular expression searchstring inside the directory identified by exp_type, load the nddata object, and postprocess with the function postproc.

Used to find data in a way that works seamlessly across different computers (and operating systems). The basic scheme we assume is that:

Laboratory data is stored on the cloud (on something like Microsoft Teams or Google Drive, etc.)
The user wants to seamlessly access the data on their laptop.

The .pyspecdata config file stores the information about where the data lives and where it is stored locally. You have two basic options:

Point the source directories for the different data folders (exp_type) to a synced folder on your laptop.
Recommended: Point the source directories to a local directory on your computer, where local copies of files are stored, and then also set up one or more remotes using rclone (which is an open source cloud access tool).
- pyspecdata can automatically search all your rclone remotes when you try to load a file. This is obviously slow.
- After the auto-search, it adds a line to .pyspecdata so that it knows how to find that directory in the future.
- It will tell you when it’s searching the remotes. If you know what you’re doing, we highly recommend pressing ctrl-C and then manually adding the appropriate line to RcloneRemotes. (Once you allow it to auto-search and add a line once, the format should be obvious.)

Supports the case where data is processed both on a laboratory computer and (e.g. after transferring via ssh or a syncing client) on a user’s laptop. While it will return a default directory without any arguments, it is typically used with the keyword argument exp_type, described below.

It looks at the top level of the directory first, and if that fails, starts to look recursively. Whenever it finds a file in the current directory, it will not return data from files in the directories underneath. (For a more thorough description, see getDATADIR()).
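
The search order described above can be sketched with the standard library alone. This is a minimal illustration, not pyspecdata's implementation: `find_matches` is a hypothetical helper, and the use of `os.walk` and `re.search` is an assumption made only to show the "stop at the shallowest directory that matches" behavior.

```python
import os
import re


def find_matches(top_dir, searchstring):
    """Return the matching files from the first directory that has any.

    Hypothetical helper (not part of pyspecdata): directories are visited
    top-down, and as soon as one directory yields a match, nothing
    underneath it is considered.
    """
    pattern = re.compile(searchstring)
    # os.walk is top-down by default, so top_dir itself is checked first
    for dirpath, dirnames, filenames in os.walk(top_dir):
        dirnames.sort()  # make the traversal order deterministic
        matches = [
            os.path.join(dirpath, name)
            for name in sorted(filenames)
            if pattern.search(name)
        ]
        if matches:
            return matches
    return []
```

With a file `run1.h5` in both the top-level directory and a subdirectory, this sketch returns only the top-level copy. If only the subdirectory holds a match, the recursive step finds it there.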

Note that all loaded files will be logged in the data_files.log file in the directory that you run your python scripts from (so that you can make sure they are properly synced to the cloud, etc.).

It calls load_indiv_file(), which finds the specific routine from inside one of the modules (sub-packages) associated with a particular file-type.

If it can’t find any files matching the criterion, it logs the missing file and throws an exception.
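
Because searchstring is treated as a regular expression, a plain filename can match more than intended. The sketch below uses only the standard `re` module on a made-up list of filenames to show the difference between an unescaped name, an escaped name, and a deliberate pattern. Using `re.search` here is an assumption for illustration; it is not a statement about how find_file applies the pattern internally.

```python
import re

# Made-up filenames, for illustration only
filenames = ["run1.h5", "run1_h5_backup", "run10.h5"]

# Unescaped: "." matches any character, so the backup name matches too
loose = [f for f in filenames if re.search("run1.h5", f)]

# Escaped: "." must be a literal dot
literal = [f for f in filenames if re.search(re.escape("run1.h5"), f)]

# A deliberate regular expression: "run", one or more digits, then ".h5"
numbered = [f for f in filenames if re.search(r"run\d+\.h5", f)]

print(loose)     # ['run1.h5', 'run1_h5_backup']
print(literal)   # ['run1.h5']
print(numbered)  # ['run1.h5', 'run10.h5']
```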

Parameters:

searchstring (str) –
If you don’t know what a regular expression is, you probably want to wrap your filename with re.escape, like this: re.escape(filename), and use that for your searchstring. (You have to import the re module first.)

If you know what a regular expression is, pass one here, and it will find any filenames that match.
exp_type (str) – Gives the name of a directory, known to pyspecdata, that contains the file of interest. For a directory to be known to pyspecdata, it must be registered with the (terminal/shell/command prompt) command pyspecdata_register_dir or be contained inside (underneath) such a directory.
expno (int) – For Bruker NMR and Prospa files, where the files are stored in numbered subdirectories, give the number of the subdirectory that you want. Currently, this parameter is needed to load Bruker and Kea files. If it finds multiple files that match the regular expression, it will try to load this experiment number from all the directories.
postproc (function, str, or None) –
This function is fed the nddata data and the remaining keyword arguments (kwargs) as arguments. It’s assumed that each module for each different file type provides a dictionary called postproc_lookup (some are already available in pySpecData, but also, see the lookup argument, below).

Note that we call this “postprocessing” here because it follows the data organization, etc., performed by the rest of the file; in other contexts, however, we might call this “preprocessing”.

If postproc is a string, it looks up the string inside the postproc_lookup dictionary that’s appropriate for the file type.

If postproc is “none”, then explicitly do not apply any type of postprocessing.

If postproc is None, it checks to see if any of the loading functions that were called set the postproc_type property – i.e. it checks the value of data.get_prop('postproc_type') – if this is set, it uses this as a key to pull the corresponding value from postproc_lookup. For example, if this is a Bruker file, it sets postproc to the name of the pulse sequence.

For instance, when the acert module loads an ACERT HDF5 file, it sets postproc_type to the value of (h5 root).experiment.description['class']. This, in turn, is used to choose the type of post-processing.

dimname – passed to load_indiv_file()
return_acq – passed to load_indiv_file()
add_sizes – passed to load_indiv_file()
add_dims – passed to load_indiv_file()
use_sweep – passed to load_indiv_file()
indirect_dimlabels – passed to load_indiv_file()
lookup (dictionary with str:function pairs) – types of postprocessing to add to the postproc_lookup dictionary
zenodo (str, optional) – Deposition number on Zenodo. When the requested file is not found locally, a file matching searchstring will be downloaded from this deposition instead of searching rclone remotes.
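
The rules for the postproc argument in the parameter list above can be summarized as a small decision function. This is a sketch under stated assumptions, not pyspecdata's code: `resolve_postproc` is a hypothetical name, `postproc_type` stands in for the value of data.get_prop('postproc_type'), and returning None when no type was recorded is an assumption, since the text does not say what happens in that case.

```python
def resolve_postproc(postproc, postproc_lookup, postproc_type=None):
    """Pick the postprocessing function according to the documented rules.

    Hypothetical helper (not part of pyspecdata). Returns a callable, or
    None when no postprocessing should be applied.
    """
    if callable(postproc):
        # A function is used directly
        return postproc
    if postproc == "none":
        # The string "none" explicitly disables postprocessing
        return None
    if postproc is None:
        # Fall back on the type recorded by the loading function, if any
        if postproc_type is None:
            return None  # assumption: nothing recorded, nothing applied
        return postproc_lookup[postproc_type]
    # Any other string is a key into the lookup dictionary
    return postproc_lookup[postproc]


def double(data):
    """Stand-in postprocessing function, for illustration only."""
    return 2 * data


# "spin_echo" is a made-up key, not a name defined by pyspecdata
table = {"spin_echo": double}
chosen = resolve_postproc(None, table, postproc_type="spin_echo")
print(chosen(21))  # 42
```

In this sketch, entries supplied through the lookup argument would simply be merged into `postproc_lookup` before the decision is made.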

pyspecdata.load_files.load_indiv_file(filename, dimname='', return_acq=False, add_sizes=[], add_dims=[], use_sweep=None, indirect_dimlabels=None, expno=None, exp_type=None, return_list=False)

Open the file given by filename, use file signature magic and/or filename extension(s) to identify the file type, and call the appropriate function to open it.

Parameters:

dimname (str) – When there is a single indirect dimension composed of several scans, call the indirect dimension dimname.
return_acq (DEPRECATED) –
add_sizes (list) – the sizes associated with the dimensions in add_dims
add_dims (list) – Can only be used with dimname. Break the dimension dimname into several dimensions, with the names given by the list add_dims and sizes given by add_sizes. If the product of the sizes is not the same as the original dimension given by dimname, retain it as the “outermost” (leftmost) dimension. pyspecdata.core.chunkoff() is used to do this, like so: data.chunkoff(dimname,add_dims,add_sizes)
indirect_dimlabels (str or None) – passed through to acert.load_pulse (names an indirect dimension when dimlabels isn’t provided)

Returns:

the nddata containing the data, or None, indicating that this is part of a pair of files that should be skipped

Return type:

nddata or None

safe modules

Here, include modules that are safe – that is, modules that don’t import a lot of junk, and for which we can just call autosummary.

`datadir`	Allows the user to run the same code on different machines, even though the location of the raw spectral data might change.
`latexscripts`	Provides the `pdflatex_notebook_wrapper` shell/dos command, which you run instead of your normal Latex command to build a lab notebook.
