pynibs.regression package¶

This holds methods for the TMS-based cortical localization approach as published in [1]

References

[1]

Numssen, O., Zier, A. L., Thielscher, A., Hartwigsen, G., Knösche, T. R., & Weise, K. (2021). Efficient high-resolution TMS mapping of the human motor cortex by nonlinear regression. NeuroImage, 245, 118654.

Submodules¶

pynibs.regression.regression module¶

class pynibs.regression.regression.Element(x, y, ele_id, fun=<function sigmoid4>, score_type='R2', select_signed_data=False, constants=None, **kwargs)¶

Bases: object

Fit Element object class

calc_score()¶: Determine goodness-of-fit score.

run_fit(max_nfev=1000)¶: Perform data fit with lmfit.

run_select_signed_data()¶: Selects positive or negative data by performing an initial linear fit by comparing the resulting p-values, slopes and R2 values. Either positive or negative data (w.r.t. x-axis) yielding a fit with a p-value < 0.05, a positive slope and the higher R2 value is used and the remaining data with the other sign is omitted from the analysis

set_constants(value)¶: Sets constants in self.constants and gmodel instance

set_init_vals(value)¶: Sets initial values in self.init_vals and gmodel instance

set_limits(value)¶

Sets limits in self.limits and gmodel instance

Parameters:: value (dict) – Parameters (keys) to set in self as limits.

set_random_init_vals()¶: Set random initial values

setup_model()¶: Setup model parameters (limits, initial values, etc. …)

pynibs.regression.regression.fit_elms(elm_idx_list, e_matrix, mep, zap_idx=None, fun=<function sigmoid4>, init_vals=None, limits=None, log_scale=False, constants=None, max_nfev=None, bad_elm_idx=None, score_type='R2', verbose=False)¶

Workhorse for Mass-univariate nonlinear regressions on raw MEP_{AMP} ~ E. That is, for each element in elm_idx_list, it’s E (mag | norm | tan) for each zap regressed on the raw MEP amplitude. An element wise r2 score is returned.

Parameters:

elm_idx_list (list of int or np.ndarray) – List containing the element indices the fit is performed for
e_matrix (np.ndarray of float [n_zaps x n_ele]) – Electric field matrix
mep (np.ndarray of float [n_zaps]) – Motor evoked potentials for every stimulation
zap_idx (np.ndarray [n_zaps], default: None) – Indices of zaps the congruence factor is calculated with (default: all)
fun (str) – A function name of pynibs.exp.Mep (exp0, sigmoid)
init_vals (np.ndarray of dict) – Dictionary containing the initial values for each element as np.ndarray [len(elm_idx_list)]. The keys are the free parameters of fun, e.g. “x0”, “amp”, etc
limits (pd.DataFrame) – Dictionary containing the limits of each parameter for each element e.g.: limits[“x0”][elm_idx] = [min, max]
log_scale (bool, default: False) – Log-transform data before fit (necessary for functions defined in the log domain)
constants (dict of <string>:<num>, default: None) – key:value pair of model parameters not to optimize.
max_nfev (int, default: None) – Max fits, passed to model.fit() as max_nfev=max_nfev*len(x).
bad_elm_idx (np.ndarray) – Indices of elements not to fit, with indices corresponding to indices (not values) of elm_idx_list
score_type (str, default: "R2") – Goodness of fit measure; Choose SR for nonlinear fits and R2 or SR for linear fits: * “R2”: R2 score (Model variance / Total variance) [0, 1] for linear models; 0: bad fit; 1: perfect fit * “SR”: Relative standard error of regression (1 - Error 2-norm / Data 2-norm) [-inf, 1]; 1: perfect fit
verbose (bool, default: False) – Print verbosity information

Returns:

r2 (np.ndarray of float [n_roi, 1]) – R2 for each element in elm_idx_list
best_values (np.ndarray of object) – Fit parameters returned from the optimizer

pynibs.regression.regression.get_bad_elms(x, y, method='lstsq', verbose=False)¶

This does an element-wise fast linear regression fit to identify bad elements. Bad is defined here as a negative slope.

xnp.ndarray of float [n_zaps x n_ele]: Electric field matrix
ynp.ndarray of float [n_zaps]: Motor evoked potentials for every stimulation
methodstr, default: “lstsq”: Which method to use. (numpy.linalg.)lstsq, (scipy.stats.)linregress, or pinv
verbosebool, default: False: Print verbosity information

Returns:: idx – Indices of bad elements
Return type:: np.ndarray

pynibs.regression.regression.get_model_init_values(fun, elm_idx_list, e_matrix, mep, mask_e_field=None, rem_empty_hints=True)¶

Calc appropriate init, limit, and max values for models fits depending on the data. If negative and positive x-data is present in case of e.g. normal component values are set according to the side (positive or negative) where more values are present. When more positive x-axis values are present, negative x-axis values will be ignored. When more negative x-axis values are present, the absolute values will be taken and the positive values are ignored.

Only parameters for sigmoid* are optimized.

Parameters:

fun (pyfempp.exp.Mep) – IO curve function object
elm_idx_list (np.ndarray of int) – Array containing the element indices the fit is performed for
e_matrix (np.ndarray of float [n_zaps x n_ele]) – Electric field matrix
mep (np.ndarray of float [n_zaps]) – Motor evoked potentials for every stimulation
mask_e_field (np.ndarray of bool [n_zaps x n_ele], default: None) – Mask indicating for which e-field (and mep) values the fit is performed for. Changes for normal component in each element because of the sign and p-values. If None, all data is used in each element.
rem_empty_hints (bool, default: True) – Remove any non-filled param hints from limits dict.

Returns:

log_scale (bool) – Log scale
limits (dict of list [n elm_index_list]) – Element-wise limit values for function fitting.
init_vals (dict of list [n elm_index_list]) – Element-wise init values for function fitting.
max_vals_refit (dict of list [n elm_index_list]) – Element-wise perturbation range for refitting function.

pynibs.regression.regression.init(l, zap_lists, res_fn)¶

Pool init function to use with regression_nl_hdf5_single_core_write

Parameters:

l (multiprocessing.Lock()) –
zap_lists (list of list of int) – Which zaps to compute.
res_fn (str) – .hdf5 fn

pynibs.regression.regression.logistic_regression()¶

Some ideas on how to improve regression approach

De-log data

Data range has to be transformed to a reasonable range. For a full sigmoid, -10:10 looks ok

sig <- function(z) {
  return( 1 / (1 + exp(-z)))
}

desig <- function(x) {
  return(- log((1/x) - 1))
}

This might be a reasonable fast approach, but the parameter range has to be estimated. Maybe remove some outliters?

2. fit logistic regression to raw data scipy.optimize provides fit_curve(), which does OLS-ish fitting to a given function https://stackoverflow.com/questions/54376900/fit-sigmoid-curve-in-python

I expect this to be rather slow.

3. Use the sklearn logistic_regression classifyer and access raw fit data The logistic_regression is implemented as a classifyer, maybe it’s possible to use its regression fit results. Implementation should be pretty fast.

pynibs.regression.regression.nl_hdf5(elm_idx_list=None, fn_reg_hdf5=None, qoi_path_hdf5=None, e_matrix=None, mep=None, fun=<function sigmoid4>, zap_idx=None, n_cpu=4, con=None, n_refit=50, return_fits=False, score_type='R2', verbose=False, pool=None, refit_discontinuities=True)¶

Mass-univariate nonlinear regressions on raw MEP_{AMP} ~ E. That is, for each element in elm_idx_list, it’s E (mag | norm | tan) for each zap regressed on the raw MEP amplitude. An element wise r2 score is returned. The function reads the precomputed array of E-MEP data from an .hdf5 file.

Parameters:

elm_idx_list (np.ndarray of int) – List containing the element indices the fit is performed for
fn_reg_hdf5 (str) – Filename (incl. path) containing the precomputed E-MEP dataframes
qoi_path_hdf5 (Union[str, list[str]]) – Path in .hdf5 file to dataset of electric field qoi e.g.: [“E”, “E_norm”, “E_tan”]
e_matrix (np.ndarray of float [n_zaps x n_ele]) – Electric field matrix
mep (np.ndarray of float [n_zaps]) – Motor evoked potentials for every stimulation
zap_idx (np.array [n_zaps], default: None) – Indices of zaps the congruence factor is calculated with (default: all)
fun (pynibs.exp.Mep function) – A function of pynibs.exp.Mep (exp0, sigmoid)
n_cpu (int) – Number of threads
con (np.ndarray of float [n_roi, 3 or 4], default: None) – Connectivity matrix of ROI (needed in case of refit because of discontinuity check)
n_refit (int, default: 50) – Maximum number of refits of zero elements. No refit is applied in case of n_refit = 0.
return_fits (bool, default: False) – Return fit objects containing the parameter estimates
score_type (str, default: "R2") – Error measure of fit: * “R2”: R2 score (Model variance / Total variance); linear fits: [0, 1], 1 … perfect fit * “SR”: Relative standard error of regression (1 - Error 2-norm / Data 2-norm); [-Inf, 1], 1 … perfect fit
verbose (bool, default: False) – Plot output messages
pool (multiprocessing.Pool()) – pool instance to use.
refit_discontinuities (bool, default: True) – Run refit for discontinuous elements at the end

Returns:

r2 – R2 for each element in elm_idx_list

Return type:

np.ndarray of float [n_roi, n_qoi]

pynibs.regression.regression.nl_hdf5_single_core(zap_idx, elm_idx_list, fn_reg_hdf5=None, qoi_path_hdf5=None, e_matrix=None, mep=None, fun=<function sigmoid4>, con=None, n_refit=50, return_fits=False, constants=None, verbose=False, seed=None, rem_bad_elms=True, return_e_field_stats=True)¶

Parameters:

elm_idx_list (np.ndarray of int) – List containing the element indices the fit is performed for
fn_reg_hdf5 (str) – Filename (incl. path) containing the precomputed E-MEP dataframes
qoi_path_hdf5 (str or list of str) – Path in .hdf5 file to dataset of electric field qoi e.g.: [“E”, “E_norm”, “E_tan”]
e_matrix (np.ndarray of float [n_zaps x n_ele]) – Electric field matrix
mep (np.ndarray of float [n_zaps]) – Motor evoked potentials for every stimulation
zap_idx (np.array [n_zaps], default: None) – Indices of zaps the congruence factor is calculated with (default: all)
fun (function object) – A function of pynibs.exp.Mep (exp0, sigmoid)
con (np.ndarray of float [n_roi, 3 or 4], default: None) – Connectivity matrix of ROI (needed in case of refit because of discontinuity check)
n_refit (int, default: 50) – Maximum number of refits of zero elements. No refit is applied in case of n_refit = 0.
return_fits (bool, default: False) – Return fit objects containing the parameter estimates
constants (dict of <string>:<num>, default: None) – key:value pair of model parameters not to optimize.
verbose (bool, default: False) – Plot output messages
seed (int, default: None) – Seed to use.
rem_bad_elms (bool, default: True) – Remove elements based on a fast linear regression slope estimation
return_e_field_stats (bool, default: True) – Return some stats on the efield variance

Returns:

r2 – R2 for each element in elm_idx_list

Return type:

np.ndarray of float [n_roi, n_qoi]

pynibs.regression.regression.nl_hdf5_single_core_write(i, elm_idx_list, fn_reg_hdf5=None, qoi_path_hdf5=None, e_matrix=None, mep=None, fun=<function sigmoid4>, con=None, n_refit=50, return_fits=False, constants=None, verbose=False, seed=None, stepdown=False, score_type='R2', return_progress=False, geo=None)¶

pynibs.regression.regression.regress_data(e_matrix, mep, elm_idx_list=None, element_list=None, fun=<function sigmoid4>, n_cpu=4, con=None, n_refit=50, zap_idx=None, return_fits=False, score_type='R2', verbose=False, pool=None, refit_discontinuities=True, select_signed_data=False, mp_context='fork', **kwargs)¶

Mass-univariate nonlinear regressions on raw MEP_{AMP} ~ E. That is, for each element in elm_idx_list, it’s E (mag | norm | tan) for each zap regressed on the raw MEP amplitude. An element wise R2 score is returned. The function reads the precomputed array of E-MEP data from an .hdf5 file.

Parameters:

e_matrix (np.ndarray of float) – (n_zaps, n_ele) Electric field matrix
mep (np.ndarray of float) – (n_zaps,) Motor evoked potential for each stimulation
elm_idx_list (np.ndarray of int or list of int) – (n_zaps,) List containing the element indices the fit is performed for
element_list (list of Element object instances, optional) – [n_ele] pynibs.Element objects ot skip initialization here.
fun (pynibs.exp.Mep, default: pynibs.sigmoid4) – A pynibs.exp.Mep function (exp0, sigmoid, sigmoid4, …)
n_cpu (int) – Number of threads, if n_cpu=1 no parallel pool will be opened and all calculations are done in serial
con (np.ndarray of float, optional) – (n_ele, 3 or 4) Connectivity matrix of ROI. Needed in case of refit for discontinuity checks
n_refit (int, default: 50) – Maximum number of refits of zero elements. No refit is applied in case of n_refit = 0.
zap_idx (np.ndarray of int or list of int, optional) – Which e/mep pairs to use.
return_fits (bool, optional) – Return fit objects containing the parameter estimates
score_type (str, default: "R2") – Error measure of fit: * “R2”: R2 score (Model variance / Total variance); linear fits: [0, 1], 1 … perfect fit * “SR”: Relative standard error of regression (1 - Error 2-norm / Data 2-norm); [-Inf, 1], 1 … perfect fit * “rho”: Spearman correlation coefficient [-1, 1]; finds any monotonous correlation (0 means no correlation)
verbose (bool, default: False) – Plot output messages
pool (multiprocessing.Pool()) – pool instance to use
refit_discontinuities (bool, default: True) – Refit discontinuous elements. If True, provide _con_
mp_context (str, default: "fork") –
Controls the method the sub-processes of the multiprocessing pool (in case of n_cpu > 1) are launched. * fork: (only supported by Unix) mp processes diverge from the main process,

the entire stack, variables and other resources are copied over. From the docs: “The child process, when it begins, is effectively identical to the parent process. All resources of the parent are inherited by the child process. Note that safely forking a multithreaded process is problematic.”
- spawn: (supported by Window and Unix) mp processes are launched in an entirely new Python interpreter as separate processes. Variables are copied other resources are freshly instantiated. From the docs: “In particular, unnecessary file descriptors and handles from the parent process will not be inherited. Starting a process using this method is rather slow compared to using fork or forkserver.”
**kwargs – Passed on to pynibs.Element() to set fit parameters.

Returns:

score (np.ndarray of float (n_roi, n_qoi)) – Score for each element.
best_values (list of dict (n_ele)) – List of parameter fits, only return if return_fits=True.

pynibs.regression.regression.ridge_from_hdf5(elm_idx_list, fn_reg_hdf5, qoi_path_hdf5, zap_idx=None)¶

Mass-univariate ridge regressions on raw MEP_{AMP} ~ E. That is, for each element in elm_idx_list, it’s E (mag | norm | tan) for each zap regressed on the raw MEP amplitude. An element wise sklearn.metrics.regression.r2_score is returned. The function reads the precomputed array of E-MEP data from an .hdf5 file. Always uses all cores of a machine!

elm_idx_listlist of int: List containing the element indices the fit is performed for
fn_hdf5str: Filename (incl. path) containing the precomputed E-MEP dataframes
qoi_path_hdf5str: Path in .hdf5 file to dataset of electric field qoi
zap_idxnp.ndarray [n_zaps], default: None: Indices of zaps the congruence factor is calculated with (default: all)

Returns:: r2 – R^2 for each element in elm_idx_list
Return type:: np.ndarray of float [n_roi, n_datasets]

pynibs.regression.regression.sing_elm_fitted(elm_idx_list, mep_lst, mep_params, e, alpha=1000, n_samples=100)¶

Mass-univariate ridge regressions on fitted MEP_{AMP} ~ E. That is, for each element in elm_idx_list, it’s E (mag | norm | tan) for each zap regressed on the raw MEP amplitude. An element wise sklearn.metrics.regression.r2_score is returned.

elm_idx_list: np.ndarray [chunksize]: List of element indices, the congruence factor is computed for
mep_lst: list of Mep object instances [n_cond]: List of fitted Mep object instances for all conditions (see exp.py for more information of Mep class)
mep_params: np.ndarray of float [n_mep_params_total]: List of all mep parameters of curve fits used to calculate the MEP (accumulated into one array) (e.g.: [mep_#1_para_#1, mep_#1_para_#2, mep_#1_para_#3, mep_#2_para_#1, mep_#2_para_#1, …])
e: np.ndarray of float with e.shape == (n_elm, n_cond, n_qoi): array of the electric field to compute the r2 factor for, e.g. (e_mag, e_norm, e_tan)
n_samplesint, default=100: Number of data points to generate discrete mep and e curves

Returns:: r2 – R^2 for each element in elm_idx_list
Return type:: np.ndarray of float [n_roi, n_datasets]

pynibs.regression.regression.sing_elm_raw(elm_idx_list, mep_lst, mep_params, e, alpha=1000)¶

elm_idx_list: np.ndarray [chunksize]: List of element indices, the congruence factor is computed for
mep: list of Mep object instances [n_cond]: List of fitted Mep object instances for all conditions (see exp.py for more information of Mep class)
mep_params: np.ndarray of float [n_mep_params_total]: List of all mep parameters of curve fits used to calculate the MEP (accumulated into one array) (e.g.: [mep_#1_para_#1, mep_#1_para_#2, mep_#1_para_#3, mep_#2_para_#1, mep_#2_para_#1, …])
e: np.ndarray of float with e.shape == (n_elm, n_cond, n_qoi): array of the electric field to compute the r2 factor for, e.g. (e_mag, e_norm, e_tan)

Returns:: r2 – R^2 for each element in elm_idx_list
Return type:: np.ndarray of float [n_roi, n_datasets]

pynibs.regression.regression.single_fit(x, y, fun)¶

Performs a single fit and returns fit object.

Parameters:

x (ndarray of float) – x-values.
y (ndarray of float) – y-values.
fun (function) – Function for fitting.

Returns:

fit – Fit object.

Return type:

gmodel fit object

pynibs.regression.regression.stepdown_approach(zap_idx, elm_idx_list, fn_reg_hdf5=None, qoi_path_hdf5=None, e_matrix=None, mep=None, fun=<function sigmoid4>, con=None, n_refit=50, return_fits=False, constants=None, verbose=False, seed=None, rem_bad_elms=True, return_e_field_stats=True, score_type='R2', return_progress=False, smooth_data=True, geo=None)¶

Mass-univariate nonlinear regressions on raw MEP_{AMP} ~ E in a stepdown manner to speed up computation.

Initially, one set of fits is done for the complete dataset. Afterwards, the best 1% of the elements are used as initial fitting parameters for their neighboring elements. Then, neighboring elements are fitted accordingly and iterativley. Finally, discontinuous elements are refitted until a smooth map is found or n_refit is hit.

Can be sped up with rem_bad_elms that computes a fast linear fit to identify elements with a negative slope.

The function reads the precomputed array of E-MEP data from an .hdf5 file.

Parameters:

elm_idx_list (np.ndarray of int) – List containing the element indices the fit is performed for
fn_reg_hdf5 (str) – Filename (incl. path) containing the precomputed E-MEP dataframes
qoi_path_hdf5 (str or list of str) – Path in .hdf5 file to dataset of electric field qoi e.g.: [“E”, “E_norm”, “E_tan”]
e_matrix (np.ndarray of float [n_zaps x n_ele] | [n_zapidx x n_ele]) – Electric field matrix
mep (np.ndarray of float [n_zaps]) – Motor evoked potentials for every stimulation
zap_idx (np.array [n_zaps], default: None) – Indices of zaps the congruence factor is calculated with (default: all)
fun (function object) – A function of pynibs.exp.Mep (exp0, sigmoid)
con (np.ndarray of float [n_elm_roi, 3 or 4], default: None) – Connectivity matrix of ROI (needed in case of refit because of discontinuity check)
n_refit (int, default: 50) – Maximum number of refits of zero elements. No refit is applied in case of n_refit = 0.
return_fits (bool, default: False) – Return fit objects containing the parameter estimates
constants (dict of <string>:<num>, default: None) – key:value pair of model parameters not to optimize.
verbose (bool, default: False) – Plot output messages
seed (int, default: None) – Seed to use.
rem_bad_elms (bool, default: True) – Remove elements based on a fast linear regression slope estimation
return_e_field_stats (bool, default: True) – Return some stats on the efield variance
score_type (str, default: "R2") – Error measure of fit: * “R2”: R2 score (Model variance / Total variance); linear fits: [0, 1], 1 … perfect fit * “SR”: Relative standard error of regression (1 - Error 2-norm / Data 2-norm); [-Inf, 1], 1 … perfect fit * “rho”: Spearman correlation coefficient [-1, 1]; finds any monotonous correlation (0 means no correlation)
return_progress (bool, default: False) – Return c maps for all steps to allow visualization over e-fitting over timesteps.
smooth_data (bool, default: False) – Smooth c-map as final step.
geo (object, optional) – Geometriy data.

Returns:

r2np.ndarray of float [n_roi, n_qoi]

R2 for each element in elm_idx_list

best_values: list of dict, optional

fit information

statsdict, optional

’mc’: float, optional: Mutual coherence for e fields
’sv_rat’float, optional: SVD singular value ratio

progress : cmaps for each step

Return type:

dict

pynibs.regression.regression.workhorse_element_init(ele_id, e_matrix, mep, fun, score_type, select_signed_data, constants, **kwargs)¶: Workhorse to initialize Elements.

pynibs.regression.regression.workhorse_element_run_fit(element, max_nfev=10)¶: Workhorse to run single Element fit. If status is False, the element will not be fitted.

pynibs.regression.regression.write_regression_hdf5(fn_exp_hdf5, fn_reg_hdf5, qoi_path_hdf5, qoi_phys, e_results_folder, qoi_e, roi_idx, conds_exclude)¶

Reads the stimulation intensities from the experiment.hdf5 file. Reads the qoi from the experiment.hdf5 file. Reads the electric fields from the electric field folder. Weights the electric field voxel wise with the respective intensities and writes an .hdf5 file containing the preprocessed data (pandas dataframe).

fn_exp_hdf5: str: Filename of the experiment.hdf5 file
fn_reg_hdf5: str: Filename of output regression.hdf5 file
qoi_path_hdf5: str: Path in experiment.hdf5 file pointing to the pandas dataframe containing the qoi (e.g.: “phys_data/postproc/EMG”)
qoi: str: Name of QOI the congruence factor is calculated with (e.g.: “p2p”)
fn_e_results: str: Folder containing the electric fields (e.g.: “/data/pt_01756/probands/13061.30/results/electric_field/1”)
qoi_e: str or list of str: Quantities of the electric field used to calculate the congruence factor (e.g. [“E”, “E_norm”, “E_tan”] Has to be included in e.hdf5 -> e.g.: “data/midlayer/roi_surface/1/E”
roi_idx: int: ROI index
conds_exclude: str or list of str: Conditions to exclude

Returns:: <File> – File containing the intensity (current) scaled E-fields of the conditions in the ROI. Saved in datasets with the same name as qoi_e [“E”, “E_norm”, “E_tan”]
Return type:: .hdf5 file

pynibs.regression package¶

Submodules¶

pynibs.regression.regression module¶

pynibs.regression.score_types module¶

Table of Contents

Previous topic

Next topic

This Page