scanometrics.processing package

Processing modules for ScanOMetrics

Submodules

scanometrics.processing.dldirect module

Wrapper for DL+DiReCT scripts to be run on <subjects_dir>/<subjid>. Includes pipeline template parameters such as proc_pipeline_name, and methods such as run() and proc2metric().

class scanometrics.processing.dldirect.proc_pipeline(bids_database, load_from_ID=True, ses_delimiter='_', acq_delimiter='_', atlas='DesikanKilliany')

Bases: object

calc_area_gauscurv(subject, atlas, hemi)
generate_location_plots(output_folder)
get_atlas(atlas_name)
get_lobe_roi()
get_subjSesAcq_T1s(subjects)
get_subjSesAcq_array(subjects)

Loops through a subjects dictionary and returns an array of all combinations of subject ID, session ID, and acquisition label. Usually used to retrieve the linear IDs used for processing individual scans and to recover the corresponding metrics in the same order as the covariate_values arrays.

Parameters

subjects (dictionary) – dictionary with subject IDs, session IDs, and acquisition labels.
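
The nested loop described above can be sketched as follows. This is a minimal illustration, not the package's implementation: flatten_subjects is a hypothetical helper, the example dictionary is invented, and joining the three labels with ses_delimiter and acq_delimiter is an assumption based on the constructor defaults and the <sub-label_ses-label_acq-label> folder naming.

```python
def flatten_subjects(subjects, ses_delimiter="_", acq_delimiter="_"):
    """Return one (subject, session, acquisition) triplet and one flat ID per scan."""
    triplets = []
    ids = []
    for subject_id, sessions in subjects.items():
        for session_id, acq_labels in sessions.items():
            for acq_label in acq_labels:
                triplets.append((subject_id, session_id, acq_label))
                # Assumed ID layout: subject, session and acquisition joined by delimiters.
                ids.append(f"{subject_id}{ses_delimiter}{session_id}{acq_delimiter}{acq_label}")
    return triplets, ids


# Hypothetical subjects dictionary; the innermost level may hold per-scan data.
subjects = {
    "sub-001": {"ses-01": {"acq-mprage": {}}, "ses-02": {"acq-mprage": {}}},
    "sub-002": {"ses-01": {"acq-mprage": {}}},
}
triplets, ids = flatten_subjects(subjects)

# Row lookup in the same order as the flattened array.
row = ids.index("sub-001_ses-02_acq-mprage")
```

Because the rows follow the iteration order of the dictionary, the same index can be used to address the matching row of a covariate or metric matrix built in that order.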

get_subjSesAcq_id(subject_id, session_id, acq_label)
get_subjSesAcq_row(subjects, subject_id, session_id, acq_label)

Quick and dirty way of getting row index for measured_metrics and covariate_values

load_subject_metrics_from_ID(ID, metric_include=None, metric_exclude=None)

Load metrics from stats2table folder in <subjects_dir>/ID/stats2table. Here, ID is the combination of subject ID, session ID, and acquisition label, usually obtained when looping through SOM.subjects dictionary. All metrics are loaded into a single row which is returned along with metric names.

Parameters

ID (string) – subject ID corresponding to the folder in <subjects_dir> that contains all metric files. The metric files are supposed to contain a single line corresponding to ID, which should in turn correspond to the combination of subject ID, session ID and acquisition label of the scan used to compute the metrics.

load_subject_metrics_from_stats2tableFolder(stats2table_folder, subjSesAcq_list, metric_include=None, metric_exclude=None)

Loads tables with multiple subjects. Requires SOM.proc_pipeline.load_from_ID to be set to False for this to be called in proc2metric(). The function loops through the files defined in seg_metrics, parc35_metrics and parc75_metrics, loads all subjects found in those files, and reorders them according to subjSesAcq_list. The metric_include and metric_exclude lists can be used to filter metrics by name.

Parameters
  • stats2table_folder (string) – path to folder containing all the stats2metric files, with all subjects in each file.

  • subjSesAcq_list (list) – list of IDs, used to reorder loaded metrics according to the desired order. Should also contain all subjects in the metric files, as loading a subject that is not in the list will probably trigger a ValueError when trying to find the index in this list.

  • metric_include (list) – list of metric names to keep

  • metric_exclude (list) – list of metric names to remove
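
The filtering and reordering described for this method could look roughly like the sketch below. Both helpers (select_metrics, reorder_rows) and the metric and scan names are hypothetical, and applying metric_include before metric_exclude is an assumption, since the order is not stated above.

```python
def select_metrics(metric_names, metric_include=None, metric_exclude=None):
    """Return the names kept after applying include, then exclude (assumed order)."""
    kept = list(metric_names)
    if metric_include is not None:
        kept = [name for name in kept if name in metric_include]
    if metric_exclude is not None:
        kept = [name for name in kept if name not in metric_exclude]
    return kept


def reorder_rows(file_ids, file_rows, subjSesAcq_list):
    """Place each loaded row at the position its ID has in subjSesAcq_list."""
    ordered = [None] * len(subjSesAcq_list)
    for scan_id, row in zip(file_ids, file_rows):
        # list.index raises ValueError when scan_id is absent from the list.
        ordered[subjSesAcq_list.index(scan_id)] = row
    return ordered


kept = select_metrics(
    ["metric_a", "metric_b", "metric_c"],
    metric_include=["metric_a", "metric_c"],
    metric_exclude=["metric_c"],
)
ordered = reorder_rows(["scan-2", "scan-1"], [[3.0], [1.0]], ["scan-1", "scan-2"])
```

The call to list.index is where an ID found in a metric file but missing from subjSesAcq_list would raise the ValueError mentioned in the parameter description.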

pial2outer(subjects_dir, subject, atlas, hemi)

Labels outer surface with closest pial labels.

Parameters
  • subjects_dir (string) – path to Freesurfer subject directory

  • subject (string) – subject ID (folder name in subjects_dir)

  • atlas (string) – atlas to use (DesikanKilliany or Destrieux)

  • hemi (string) – hemisphere to process (‘lh’ or ‘rh’)

proc2metric(subjects, covariate_values, covariate_names, stats2table_folder=None, ref_rows=None, ref_metric_values=None, ref_metric_names=None, ref_covariate_values=None, ref_covariate_names=None, metric_include=None, metric_exclude=None)

Loads all measurements from stats2table files. When the load_from_ID option is set to True, metrics are loaded from individual files in the bids structure. When it is set to False, stats2table_folder must be specified so that SOM loads all files in that folder, which should contain all metrics for all subjects. The different ref_* parameters allow specifying the metric_values, metric_names, covariate_values and covariate_names of the reference dataset to use (typically a copy of the SOM.normativeModel.* variables).

Parameters
  • subjects (dictionary) – dictionary with subject IDs, session IDs, and acquisition labels.

  • covariate_values (numpy matrix of size len(subjSesAcq_array) x n_covariates) – covariate values for subjects in the subjects variable.

  • covariate_names (list) – list of covariate names corresponding to columns in covariate_values.

  • stats2table_folder (string) – path to folder containing files with all subjects' measurements, to be loaded if the load_from_ID parameter in SOM.proc_pipeline is set to False.

  • ref_rows (1D array) – array of row indexes in ref_metric_values and ref_covariate_values to use as reference when normalizing metrics.

  • ref_metric_values (numpy matrix) – numpy matrix with normative values to be used as reference when normalizing

  • ref_metric_names (list of strings) – list of metric names corresponding to columns in ref_metric_values

  • ref_covariate_values (numpy matrix) – numpy matrix with covariate_values used to select matching normative subjects in terms of age, sequence, scanner, etc… when normalizing

  • ref_covariate_names (list of strings) – list of covariate names corresponding to columns in ref_covariate_values

proc2table(subjects, n_threads)

n_threads is kept for compatibility with core.py functions, but the DL+DiReCT stats2table script currently gathers all subject statistics into grouped files, so n_threads is ignored.

Parameters
  • subjects (dictionary) – dictionary with all subject IDs, session IDs, and acquisition labels used to identify scans.

  • n_threads (int) – ignored

run_pipeline(subjects, n_threads, subject_id=None)

Pipeline template method overridden here. Calls DL+DiReCT processing scripts. Single or multi-subject processing is controlled through n_threads (the default of -1 allocates all available CPUs through a call to multiprocessing.cpu_count()). Scans are processed with batch-dl+direct after renaming all scans to T1w and saving them in a flattened folder structure in <bids_database>/derivatives/tmp_dldirectRawdata. The temporary directory is deleted afterwards, and the outputs of DL+DiReCT are saved into individual folders in <bids_database>/derivatives/dldirect/<sub-label_ses-label_acq-label>.

Parameters
  • subjects (dictionary) – multi-layered dictionary, with first level being subject IDs, then session IDs, followed by the acq labels. Used to identify T1w scans to process.

  • n_threads (int) – number of threads to use (default calls set it to -1 to use all available CPUs). GPU runs use all available resources, as reported by torch.cuda.device_count().

  • subject_id (str) – id of the chosen subject
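
The n_threads convention described above can be illustrated with a small helper. resolve_n_threads is a hypothetical function written for this example; rejecting other non-positive values with a ValueError is an assumption, not documented behaviour of the package.

```python
import multiprocessing


def resolve_n_threads(n_threads=-1):
    """Map the -1 convention to the number of CPUs reported by the system."""
    if n_threads == -1:
        return multiprocessing.cpu_count()
    if n_threads < 1:
        # Assumed behaviour: any other non-positive value is treated as an error.
        raise ValueError("n_threads must be -1 or a positive integer")
    return n_threads


all_cpus = resolve_n_threads(-1)
four_threads = resolve_n_threads(4)
```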

run_proc2table(subjSesAcq_list)

Wrapper for DL+DiReCT stats2table. Assumes subjects were processed with DL+DiReCT. Creates new directories <subjects_dir>/<subjSesAcq_id>/stats2table (deletes old ones if present).

Parameters

subjSesAcq_list (list of strings) – list of subjects and sessions to include (lines with subjects and sessions not in the list are ignored).
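
The "create the stats2table directory, deleting an old one if present" step can be sketched with the standard library as below. reset_stats2table_dir is a hypothetical helper, and the scan ID and file name are invented for the example; the actual wrapper may organise this differently.

```python
import os
import shutil
import tempfile


def reset_stats2table_dir(subjects_dir, subjSesAcq_id):
    """Delete <subjects_dir>/<subjSesAcq_id>/stats2table if present, then recreate it."""
    out_dir = os.path.join(subjects_dir, subjSesAcq_id, "stats2table")
    if os.path.isdir(out_dir):
        shutil.rmtree(out_dir)
    os.makedirs(out_dir)
    return out_dir


# Usage on a throwaway directory with a hypothetical scan ID.
with tempfile.TemporaryDirectory() as subjects_dir:
    out_dir = reset_stats2table_dir(subjects_dir, "sub-001_ses-01_acq-mprage")
    stale = os.path.join(out_dir, "old_table.txt")
    open(stale, "w").close()
    # A second call removes the stale file along with the old directory.
    reset_stats2table_dir(subjects_dir, "sub-001_ses-01_acq-mprage")
    stale_removed = not os.path.exists(stale)
    dir_exists = os.path.isdir(out_dir)
```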

set(setting_key, setting_value)
tivproxy_rois()
update_version(version, atlas)

Updates the version. Used mainly when loading a model that was trained with another version, to update software-specific variables such as the subjects dir for dldirect.

scanometrics.processing.freesurfer module

Wrapper for freesurfer scripts to be run on <subjects_dir>/<subjid>. Includes pipeline template parameters such as proc_pipeline_name, and methods such as run() and proc2metric().

class scanometrics.processing.freesurfer.proc_pipeline(bids_database, compute_lgi=False, load_from_ID=True, ses_delimiter='_', acq_delimiter='_', atlas='DesikanKilliany')

Bases: object

calc_area_gauscurv(subject, atlas, hemi)
generate_location_plots(output_folder)
get_subjSesAcq_T1s(subjects)
get_subjSesAcq_array(subjects)
get_subjSesAcq_id(subject_id, session_id, acq_id)
get_subjSesAcq_row(subjects, subject_id, session_id, acq_id)

Quick and dirty way of getting row index for measured_metrics and covariate_values

load_subject_metrics_from_ID(ID, metric_include=None, metric_exclude=None)
load_subject_metrics_from_stats2tableFolder(stats2table_folder, subjSesAcq_list, metric_include=None, metric_exclude=None)

Loads tables with multiple subjects. Numpy arrays have to be hstacked from table to table. Test: allocate a numpy array of NaNs of shape N_subj (known from subjSesAcq_list) x N_metrics (known from the dictreader) and fill it while reading the metric table line by line, using subjSesAcq_list.index(ID) or something similar.

Parameters
  • stats2table_folder (string) – path to folder containing files with all subject metrics

  • subjSesAcq_list (list of strings) – list of <subj_id>/<ses_id>/<acq_label> combinations to load

  • metric_include (list) – list of metric names to keep

  • metric_exclude (list) – list of metric names to remove
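
The NaN-preallocation idea mentioned in the description can be sketched as follows. This is only an illustration under stated assumptions: load_table is a hypothetical helper, the tables are tab-separated with an ID column named "ID", and the scan IDs and metric names are invented. It uses csv.DictReader and numpy.

```python
import csv
import io

import numpy as np


def load_table(table_text, subjSesAcq_list, id_column="ID", sep="\t"):
    """Fill a NaN matrix with one row per entry of subjSesAcq_list."""
    reader = csv.DictReader(io.StringIO(table_text), delimiter=sep)
    metric_names = [name for name in reader.fieldnames if name != id_column]
    values = np.full((len(subjSesAcq_list), len(metric_names)), np.nan)
    for line in reader:
        row = subjSesAcq_list.index(line[id_column])  # ValueError if ID is unknown
        values[row, :] = [float(line[name]) for name in metric_names]
    return metric_names, values


# Two hypothetical tables; scan-2 is missing from the second one.
table_a = "ID\tmetric_a\tmetric_b\nscan-2\t3.0\t4.0\nscan-1\t1.0\t2.0\n"
table_b = "ID\tmetric_c\nscan-1\t5.0\n"
scan_ids = ["scan-1", "scan-2"]

names_a, values_a = load_table(table_a, scan_ids)
names_b, values_b = load_table(table_b, scan_ids)
metric_names = names_a + names_b
metric_values = np.hstack([values_a, values_b])
```

Preallocating with NaNs means a scan absent from one table keeps NaN in the corresponding columns, while rows are placed according to subjSesAcq_list regardless of their order in the file.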

Returns

pial2outer(subjects_dir, subject, atlas, hemi)

Labels outer surface with closest pial labels.

Parameters
  • subjects_dir (string) – path to Freesurfer subject directory

  • subject (string) – subject ID (folder name in subjects_dir)

  • atlas (string) – atlas to use (DesikanKilliany or Destrieux)

  • hemi (string) – hemisphere to process (‘lh’ or ‘rh’)

proc2metric(subjects, covariate_values, covariate_names, stats2table_folder=None, ref_rows=None, ref_metric_values=None, ref_metric_names=None, ref_covariate_values=None, ref_covariate_names=None, metric_include=None, metric_exclude=None)

Function defined for each preprocessing pipeline to gather all variables from the preprocessing into a list of metrics, usually a table saved as a text file that can then be read by load_proc_metrics(). Some variables might not exist for all subjects; scans with missing values get assigned np.nan. Scan duplicates should be checked in the future. Computes lobe metrics and asymmetry indexes, and appends the results to the end of the measured_metrics matrix. Normalized values are computed with respect to averages in the normative dataset.

Parameters
  • subjects (dictionary) – multilevel dictionary of subjects (e.g.: SOM.subjects) starting with subject IDs, then session IDs, and finally acq labels. Used to generate the subjSesAcq_array to analyse.

  • covariate_values (numpy array) – covariate matrix for subjects being processed (e.g.: SOM.covariate_values.copy())

  • covariate_names (list of strings) – names of covariates for the scans being processed (e.g.: SOM.covariate_names.copy())

  • stats2table_folder (string) – path to folder containing stat files with all subjects inside. Used when the flag self.load_from_ID is set to False.

  • ref_rows (numpy array) – indexes of rows in ref_metric_values and ref_covariate_values to keep before finding matches.

  • ref_metric_values (numpy matrix) – matrix of metric_values to use as reference when normalizing (e.g.: SOM.measured_metrics.copy())

  • ref_metric_names (list of strings) – list of metric names in ref_metric_values

  • ref_covariate_values (numpy matrix) – array with covariates of reference dataset, used to find matches

  • ref_covariate_names (list of strings) – name of covariates in ref_covariate_values
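
How the ref_* arguments might fit together is sketched below. Everything specific here is an assumption made for illustration: normalize_to_reference is a hypothetical helper, matching is done on equality of a single covariate column, and normalization is a division by the mean of the matching reference rows. The package's actual matching and normalization rules are not described above and may differ.

```python
import numpy as np


def normalize_to_reference(metric_values, covariate_values, ref_metric_values,
                           ref_covariate_values, ref_rows, match_column):
    """Divide each scan's metrics by the mean over matching reference rows."""
    # Keep only the requested reference rows before looking for matches.
    ref_metrics = ref_metric_values[ref_rows]
    ref_covariates = ref_covariate_values[ref_rows]
    normalized = np.full(metric_values.shape, np.nan)
    for i in range(metric_values.shape[0]):
        # Assumed matching rule: identical value in one covariate column.
        matches = ref_covariates[:, match_column] == covariate_values[i, match_column]
        if matches.any():
            normalized[i] = metric_values[i] / np.nanmean(ref_metrics[matches], axis=0)
    return normalized


# Hypothetical data: covariate columns are (age, scanner code).
ref_metric_values = np.array([[10.0, 2.0], [30.0, 4.0], [100.0, 8.0]])
ref_covariate_values = np.array([[60.0, 0.0], [70.0, 0.0], [65.0, 1.0]])
ref_rows = np.array([0, 1, 2])
metric_values = np.array([[20.0, 6.0]])
covariate_values = np.array([[66.0, 0.0]])

normalized = normalize_to_reference(metric_values, covariate_values, ref_metric_values,
                                    ref_covariate_values, ref_rows, match_column=1)
```

Scans without any matching reference row keep NaN in this sketch, in line with the np.nan convention for missing values.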

proc2table(subjects, n_threads)

Calls run_proc2table to convert Freesurfer segmentation and parcellation statistics into text files

Parameters
  • subjects (dictionary) – dictionary with subjects considered in the study

  • n_threads (integer) – number of threads to use

run_pipeline(subjects, n_threads, subject_id=None)

Pipeline template method overridden here. Calls freesurfer processing scripts. Single or multi-subject processing is controlled through n_threads (the default of -1 allocates all available CPUs through a call to multiprocessing.cpu_count()). If T1_file is specified, checks that a .mgz file is present in <subject_dir>/<subj_id>/mri/orig, and stops with an error otherwise. The subject directory is intended to be the actual subject folder, and the subject id is taken as the session id to follow the bids structure.

Parameters

subj_id (string) – participant code for subject to be analysed.

run_proc2table(subjSesAcq_id)

Wrapper for asegstats2table and aparcstats2table. Assumes subject was processed with recon-all -parcstats2. Creates new directory <subjects_dir>/<subjSesAcq_id>/stats2table (deletes old one if present).

run_recon_all(subjSesAcq_id, T1_file)

Wrapper for freesurfer recon-all. Takes subj_id, ses_id and acq_label as input to create a flattened folder structure in self.subjects_dir (i.e. bids/derivatives/freesurfer) and overwrites previous recon-all outputs. Computation of LGI requires matlab to be installed.

Parameters
  • subj_id (string) – subject ID (usually taken from participants.tsv).

  • ses_id (string) – session ID (usually taken from <subj_id>_session.tsv).

  • acq_label (string) – label of scan to process (usually taken from glob() output on acq_pattern matches)

set_settings(compute_lgi)
update_version(version, atlas)

Updates the version. Used mainly when loading a model that was trained with another version, to update software-specific variables such as the subjects dir for dldirect.

scanometrics.processing.fsl module

Module to run FSL processing scripts.

scanometrics.processing.fsl.fsl_pve_vols(fsl_dir, subj_id, T1_file)

Runs FSL’s BET and FAST scripts on subject subj_id, and extracts Partial Volume Estimates (PVEs) for CSF, GM and WM. PVEs are returned as floats, as opposed to the integer results of the original octave implementation.

Parameters
  • fsl_dir (string) – path to FSL derivatives folder (e.g. bids/derivatives/fsl)

  • subj_id (string) – ID of subject to process

  • T1_file (string) – path for T1.nii.gz file used as input

scanometrics.processing.pipeline_template module

Template for processing pipelines. A pipeline should contain a generic run() function to run the processing, as well as a proc2metric function to gather the generated outputs and combine them into a matrix. The template provides the metrics as a table to bypass processing. Remember to add new processing modules to the __init__.py for import and recognition as a submodule by scanometrics.

scanometrics.processing.pipeline_template.proc2metric(subjects_dir, subj_ids, metric_tables=['metrics.txt'], sep=',')

Template function to convert tables with processing outputs to a list of variable names and a corresponding numpy array, intended to be stacked with other subjects in a single matrix for a scanometrics study. Functions built from this template should be callable through proc2metric(subjects_dir, subj_ids).

Parameters
  • subjects_dir – path to folder containing the different subjects

  • subj_ids – IDs of subjects to process in parallel

  • metric_tables – list of table files. Each should have a header with variable names and a single row with values. Paths should be specified relative to <subjects_dir>/<subj_id>

  • sep – character used as separator in metric_tables

Returns

[metric_names, metric_values] as a list of variable names and a numpy array
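
A function following this template might look like the sketch below. It is a hedged illustration rather than the template's actual body: proc2metric_sketch is a hypothetical name, every subject is assumed to share the same header, and the subject folders and metric names are invented for the example.

```python
import csv
import os
import tempfile

import numpy as np


def proc2metric_sketch(subjects_dir, subj_ids, metric_tables=("metrics.txt",), sep=","):
    """Read one header line and one value line per table, stacking subjects by row."""
    metric_names = []
    rows = []
    for subj_id in subj_ids:
        names, values = [], []
        for table in metric_tables:
            # Table paths are relative to <subjects_dir>/<subj_id>.
            with open(os.path.join(subjects_dir, subj_id, table), newline="") as f:
                header, data = list(csv.reader(f, delimiter=sep))[:2]
            names.extend(header)
            values.extend(float(v) for v in data)
        metric_names = names  # assumes every subject shares the same header
        rows.append(values)
    return [metric_names, np.array(rows)]


# Usage on temporary folders with two hypothetical subjects.
with tempfile.TemporaryDirectory() as subjects_dir:
    for subj_id, line in [("subj-a", "1.5,2.5"), ("subj-b", "3.5,4.5")]:
        os.makedirs(os.path.join(subjects_dir, subj_id))
        with open(os.path.join(subjects_dir, subj_id, "metrics.txt"), "w") as f:
            f.write("metric_a,metric_b\n" + line + "\n")
    metric_names, metric_values = proc2metric_sketch(subjects_dir, ["subj-a", "subj-b"])
```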

scanometrics.processing.quantifyBrainStructures module

Methods to gather results from different subjects

scanometrics.processing.quantifyBrainStructures.quantifybrainstem(subjects_dir, subj_id, output_filename='brainstem_volume.txt')

Collects relevant brainstem information from morphometry output

Parameters
  • subjects_dir (string) – absolute or relative path to directory with all freesurfer outputs, by subject

  • subj_id (string) – name of subject being processed

  • output_filename (string) – filename of output (defaults to brainstem_volume.txt, written to <subjects_dir>/<subj_id>/stats2table/<output_filename>)

Returns

scanometrics.processing.quantifyBrainStructures.quantifyhippocampalsubfields(subjects_dir, subj_id, hemi, output_filename='hippoSf_volume.txt', suffix='T1')

Collects relevant information for hippocampal subfields from morphometry output

Parameters
  • subjects_dir (string) – relative or absolute path to freesurfer subjects output

  • subj_id (string) – subject name

  • hemi (string) – hemisphere to recover (left=lh, right=rh)

  • output_filename (string) – output name, without hemisphere prefix to avoid call errors.

Returns

no return value