models.strategy

models.strategy#

Module: `models.strategy`#

Inheritance diagram for ISLP.models.strategy: `MinMaxCandidates` -> `Stepwise` (Stepwise derives from MinMaxCandidates); `Strategy` has no parent within the module.

Model selection strategies#

This module defines search strategies to be used in generic stepwise model selection.

Classes#

`MinMaxCandidates`#

class ISLP.models.strategy.MinMaxCandidates(model_spec, min_terms=0, max_terms=0, lower_terms=None, upper_terms=None, validator=None)#

Bases: object

Methods

`candidate_states`(state)	Produce candidates for fitting.
`check_finished`(results, path, best, ...)	Check if we should continue or not.

__init__(model_spec, min_terms=0, max_terms=0, lower_terms=None, upper_terms=None, validator=None)#

Parameters:

model_spec: ModelSpec: ModelSpec describing the terms in the model.
min_terms: int (default: 0): Minimum number of terms to select
max_terms: int (default: 0): Maximum number of terms to select
lower_terms: [Feature]: Subset of terms to keep: smallest model.
upper_terms: [Feature]: Largest possible model.
validator: callable: Callable taking a single argument: state, returning whether this is a valid state.

candidate_states(state)#

Produce candidates for fitting.

Parameters:

state: ignored

Returns:

candidates: iterator: A generator of (indices, label) pairs, where indices are columns of X and label is a name for the given model. The iterator cycles through all combinations of the nfeature available columns whose size lies between min_terms and max_terms. If appropriate, combinations are restricted to those that include a set of fixed terms. Models are labeled with a tuple of the feature names. The names of the columns default to strings of integers from range(nterms).
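The enumeration described above can be sketched in plain Python. This is a minimal illustration rather than the ISLP implementation: `exhaustive_candidates` and its arguments are hypothetical names, terms are plain strings, and the fixed terms are passed as `lower_terms`.

```python
from itertools import combinations


def exhaustive_candidates(terms, min_terms=0, max_terms=0, lower_terms=()):
    """Yield (indices, label) for every subset with size in [min_terms, max_terms].

    Hypothetical helper, not part of ISLP. `terms` is a list of names and
    `lower_terms` holds names that every candidate must contain.
    """
    fixed = set(lower_terms)
    for size in range(min_terms, max_terms + 1):
        for indices in combinations(range(len(terms)), size):
            label = tuple(terms[i] for i in indices)
            # Keep only combinations that include all fixed terms.
            if fixed.issubset(label):
                yield indices, label


candidates = list(
    exhaustive_candidates(["a", "b", "c"], min_terms=1, max_terms=2, lower_terms=["a"])
)
```

With three terms, sizes 1 to 2, and `"a"` fixed, the sketch yields the three subsets that contain `"a"`.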

check_finished(results, path, best, batch_results)#: Check if we should continue or not. For exhaustive search we stop because all models are fit in a single batch.

`Stepwise`#

class ISLP.models.strategy.Stepwise(model_spec, direction='forward', min_terms=1, max_terms=1, lower_terms=None, upper_terms=None, validator=None)#

Bases: MinMaxCandidates

Parameters:

model_spec: ModelSpec: ModelSpec describing the terms in the model.
direction: str: One of [‘forward’, ‘backward’, ‘both’]
min_terms: int (default: 1): Minimum number of terms to select
max_terms: int (default: 1): Maximum number of terms to select
lower_terms: [Feature]: Subset of terms to keep: smallest model.
upper_terms: [Feature]: Largest possible model.
constraints: {array-like} (optional), shape [n_terms, n_terms]: Boolean matrix describing a DAG, with [i,j] nonzero implying that j is a child of i (i.e. there is an edge i->j). All search candidates are checked for validity: every parent of each term in a candidate must be included in the set of terms.
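The validity rule for `constraints` can be illustrated with a small check. This is a sketch under stated assumptions, not the library's code: `is_valid_candidate` is a hypothetical name, candidates are tuples of term indices, and the matrix is a nested list rather than an array.

```python
def is_valid_candidate(candidate, constraints):
    """Return True if every parent of each term in `candidate` is also present.

    constraints[i][j] nonzero means there is an edge i -> j (j is a child of i).
    Hypothetical helper for illustration; not the ISLP implementation.
    """
    chosen = set(candidate)
    n_terms = len(constraints)
    for j in chosen:
        for i in range(n_terms):
            # Term i is a parent of term j but is missing from the candidate.
            if constraints[i][j] and i not in chosen:
                return False
    return True


# Terms 0 and 1 are main effects; term 2 is their interaction,
# so both 0 -> 2 and 1 -> 2 are edges.
constraints = [
    [0, 0, 1],
    [0, 0, 1],
    [0, 0, 0],
]
valid = is_valid_candidate((0, 1, 2), constraints)
invalid = is_valid_candidate((0, 2), constraints)
```

Here the candidate `(0, 2)` is rejected because term 2 has parent 1, which is missing.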

Methods

`candidate_states`(state)	Produce candidates for fitting.
`check_finished`(results, path, best, ...)	Check if we should continue or not.
`first_peak`(model_spec[, direction, ...])
`fixed_steps`(model_spec, n_steps[, ...])	Strategy that stops first time a given model size is reached.

__init__(model_spec, direction='forward', min_terms=1, max_terms=1, lower_terms=None, upper_terms=None, validator=None)#

Parameters:

model_spec: ModelSpec: ModelSpec describing the terms in the model.
direction: str: One of [‘forward’, ‘backward’, ‘both’]
min_terms: int (default: 1): Minimum number of terms to select
max_terms: int (default: 1): Maximum number of terms to select
lower_terms: [Feature]: Subset of terms to keep: smallest model.
upper_terms: [Feature]: Largest possible model.
validator: callable: Callable taking a single argument: state, returning whether this is a valid state.

candidate_states(state)#

Produce candidates for fitting. For stepwise search this depends on the direction.

If ‘forward’, each candidate adds one column that is not in the current state (maintaining an upper limit on the number of columns at self.max_terms).

If ‘backward’, each candidate drops one column that is in the current state (maintaining a lower limit on the number of columns at self.min_terms).

All candidates include self.lower_terms if any.

Parameters:

state: ignored

Returns:

candidates: iterator: A generator of (indices, label) pairs, where indices are columns of X and label is a name for the given model. The iterator cycles through all combinations of the nfeature available columns whose size lies between min_terms and max_terms. If appropriate, combinations are restricted to those that include a set of fixed terms. Models are labeled with a tuple of the feature names. The names of the columns default to strings of integers from range(nterms).
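A direction-dependent candidate generator for stepwise search might look like the following. It is a simplified sketch, not the ISLP code: `stepwise_candidates` is a hypothetical name, states are tuples of term names, and treating ‘both’ as the union of the forward and backward moves is an assumption.

```python
def stepwise_candidates(state, all_terms, direction="forward",
                        min_terms=1, max_terms=1, lower_terms=()):
    """Yield the states that differ from `state` by exactly one term.

    Hypothetical helper for illustration; not the ISLP implementation.
    """
    current = tuple(state)
    fixed = set(lower_terms)
    # Forward moves: add one missing term, respecting the upper size limit.
    if direction in ("forward", "both") and len(current) < max_terms:
        for term in all_terms:
            if term not in current:
                yield tuple(sorted(current + (term,)))
    # Backward moves: drop one present term, never a fixed one,
    # respecting the lower size limit.
    if direction in ("backward", "both") and len(current) > min_terms:
        for term in current:
            if term not in fixed:
                yield tuple(t for t in current if t != term)


forward = list(stepwise_candidates(("a",), ["a", "b", "c"], "forward", max_terms=3))
backward = list(
    stepwise_candidates(("a", "b", "c"), ["a", "b", "c"], "backward",
                        min_terms=1, lower_terms=["a"])
)
```

In the backward example, `"a"` is never dropped because it is listed in `lower_terms`.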

check_finished(results, path, best, batch_results)#: Check if we should continue or not. For exhaustive search we stop because all models are fit in a single batch.

static first_peak(model_spec, direction='forward', min_terms=1, max_terms=1, random_state=0, lower_terms=[], upper_terms=[], initial_terms=[], validator=None, parsimonious=False)#

Parameters:

model_spec: ModelSpec: ModelSpec describing the terms in the model.
direction: str: One of [‘forward’, ‘backward’, ‘both’]
min_terms: int (default: 1): Minimum number of terms to select
max_terms: int (default: 1): Maximum number of terms to select
lower_terms: [Feature]: Subset of terms to keep: smallest model.
upper_terms: [Feature]: Largest possible model.
initial_terms: column identifiers, default=[]: Subset of terms used to initialize the search when direction is ‘both’. If None, defaults to the behavior of ‘forward’. Column identifiers correspond to column names if X is a pd.DataFrame, or to integer indices if X is an np.ndarray.
validator: callable: Callable taking a single argument: state, returning whether this is a valid state.
parsimonious: bool: If True, use the 1sd rule: among the shortest models within one standard deviation of the best score pick the one with the best average score.

Returns:

initial_state: tuple: (column_names, feature_idx)
state_generator: callable: Object that proposes candidates based on current state. Takes a single argument state
build_submodel: callable: Candidate generator that enumerates all valid subsets of columns.
check_finished: callable: Check whether to stop. Takes two arguments: best_result, a dict with keys of scores, and state.
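The `parsimonious` option refers to the 1sd rule. A minimal sketch of that rule follows, with several assumptions that the source does not confirm: higher scores are better, each state has a list of per-fold scores, and the standard deviation is taken over the folds of the best-scoring state. The name `one_sd_choice` is hypothetical.

```python
from statistics import mean, stdev


def one_sd_choice(cv_scores):
    """Pick a state using the 1sd rule.

    `cv_scores` maps a state (tuple of terms) to a list of per-fold scores,
    higher being better. Hypothetical helper; not the ISLP implementation.
    """
    averages = {state: mean(scores) for state, scores in cv_scores.items()}
    best_state = max(averages, key=averages.get)
    # Assumption: the spread is measured on the best state's fold scores.
    threshold = averages[best_state] - stdev(cv_scores[best_state])
    close = [state for state in averages if averages[state] >= threshold]
    shortest = min(len(state) for state in close)
    # Among the shortest qualifying states, take the best average score.
    return max((s for s in close if len(s) == shortest), key=averages.get)


cv_scores = {
    ("a",): [0.70, 0.72, 0.74],
    ("b",): [0.60, 0.62, 0.64],
    ("a", "b"): [0.71, 0.73, 0.78],
}
chosen = one_sd_choice(cv_scores)
```

In this example the two-term model has the best average, but the one-term model `("a",)` lies within one standard deviation of it and is shorter, so it is chosen.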

static fixed_steps(model_spec, n_steps, direction='forward', lower_terms=[], upper_terms=[], initial_terms=[], validator=None)#

Strategy that stops first time a given model size is reached.

Parameters:

model_spec: ModelSpec: ModelSpec describing the terms in the model.
n_steps: int: How many steps to take in the search?
direction: str: One of [‘forward’, ‘backward’, ‘both’]
min_terms: int (default: 0): Minimum number of terms to select
max_terms: int (default: None): Maximum number of terms to select. If None defaults to number of terms in model_spec.
lower_terms: [Feature]: Subset of terms to keep: smallest model.
upper_terms: [Feature]: Largest possible model.
initial_terms: column identifiers, default=[]: Subset of terms to be used to initialize.

Returns:

strategy: NamedTuple

`Strategy`#

class ISLP.models.strategy.Strategy(initial_state: Any, candidate_states: Callable, build_submodel: Callable, check_finished: Callable, postprocess: Callable)#

Bases: NamedTuple

initial_state: object: Initial state of feature selector.
candidate_states: callable: Callable taking single argument state and returning candidates for next batch of scores to be calculated.
build_submodel: callable: Callable taking two arguments (X, state) that returns model matrix represented by state.
check_finished: callable: Callable taking three arguments (results, best_state, batch_results) which determines if the state generator should step. Often will just check if there is a better score than that at current best state but can use entire set of results if desired.
postprocess: callable: Callable to postprocess the results after selection procedure terminates.

Methods

`count`(value, /)	Return number of occurrences of value.
`index`(value[, start, stop])	Return first index of value.

__init__(*args, **kwargs)#

build_submodel: Callable#: Alias for field number 2

candidate_states: Callable#: Alias for field number 1

check_finished: Callable#: Alias for field number 3

count(value, /)#: Return number of occurrences of value.

index(value, start=0, stop=sys.maxsize, /)#

Return first index of value.

Raises ValueError if the value is not present.

initial_state: Any#: Alias for field number 0

postprocess: Callable#: Alias for field number 4
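The field aliases and the inherited `count` and `index` methods are standard `NamedTuple` behavior. The sketch below declares a stand-in class with the same five fields to show how they map to positions; `StrategySketch` and the placeholder callables are hypothetical and do not reproduce what ISLP passes in practice.

```python
from typing import Any, Callable, NamedTuple


class StrategySketch(NamedTuple):
    """Stand-in with the same field order as the documented Strategy."""
    initial_state: Any
    candidate_states: Callable
    build_submodel: Callable
    check_finished: Callable
    postprocess: Callable


# Placeholder callables, used only to populate the tuple.
sketch = StrategySketch(
    initial_state=(),
    candidate_states=lambda state: iter([]),
    build_submodel=lambda X, state: X,
    check_finished=lambda *args: True,
    postprocess=lambda results: results,
)

fields = StrategySketch._fields
first_field = sketch[0]          # same object as sketch.initial_state
position = sketch.index(())      # position of the first element equal to ()
```

Each named field is also reachable by position, which is what "Alias for field number N" refers to.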

Functions#

ISLP.models.strategy.first_peak(results, path, best, batch_results)#

Check if we should continue or not.

For first_peak search we stop if we cannot improve over our current best score.
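The stopping rule can be sketched as follows. This is an illustration only: the function name `first_peak_sketch`, the reduced argument list, the `(state, score, finished)` return shape, and the assumption that each result is a dict with `'state'` and `'score'` keys are not taken from the source.

```python
def first_peak_sketch(best, batch_results):
    """Return (state, score, finished) after inspecting one batch.

    Hypothetical helper; assumes results are dicts with 'state' and 'score'
    keys and that higher scores are better.
    """
    batch_best = max(batch_results, key=lambda r: r["score"], default=best)
    if batch_best["score"] > best["score"]:
        # The batch improved on the current best, so keep searching.
        return batch_best["state"], batch_best["score"], False
    # No improvement: stay at the current best and stop.
    return best["state"], best["score"], True


best = {"state": ("a",), "score": 0.70}
improving = first_peak_sketch(best, [
    {"state": ("a", "b"), "score": 0.75},
    {"state": ("a", "c"), "score": 0.60},
])
stalled = first_peak_sketch({"state": ("a", "b"), "score": 0.75}, [
    {"state": ("a", "b", "c"), "score": 0.74},
])
```

The first call moves to `("a", "b")` and continues; the second finds no better score and stops.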

ISLP.models.strategy.fixed_steps(n_steps, results, path, best, batch_results)#

Check if we should continue or not.

For fixed_steps search we stop the first time the model size given by n_steps is reached.
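One way to read the `n_steps` argument is as a target model size, matching the description of `Stepwise.fixed_steps` as a strategy that stops the first time a given model size is reached. The sketch below follows that reading. It assumes that the search moves to the best state of each batch regardless of whether the score improved, that size is the number of terms in the state, and that results are dicts with `'state'` and `'score'` keys; `fixed_steps_sketch` is a hypothetical name.

```python
def fixed_steps_sketch(n_steps, best, batch_results):
    """Return (state, score, finished), stopping once the state has n_steps terms.

    Hypothetical helper; not the ISLP implementation.
    """
    if not batch_results:
        # Nothing left to try: stop at the current best.
        return best["state"], best["score"], True
    batch_best = max(batch_results, key=lambda r: r["score"])
    finished = len(batch_best["state"]) >= n_steps
    return batch_best["state"], batch_best["score"], finished


best = {"state": (), "score": 0.0}
step_one = fixed_steps_sketch(2, best, [
    {"state": ("a",), "score": 0.70},
    {"state": ("b",), "score": 0.62},
])
step_two = fixed_steps_sketch(2, {"state": ("a",), "score": 0.70}, [
    {"state": ("a", "b"), "score": 0.68},
])
```

Unlike the first-peak rule, the second call accepts `("a", "b")` even though its score is lower, because the target size has been reached.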

ISLP.models.strategy.min_max(model_spec, min_terms=1, max_terms=1, lower_terms=None, upper_terms=None, validator=None, parsimonious=False)#

Parameters:

model_spec: ModelSpec: ModelSpec describing the terms in the model.
min_terms: int (default: 1): Minimum number of terms to select
max_terms: int (default: 1): Maximum number of terms to select
lower_terms: [Feature]: Subset of terms to keep: smallest model.
upper_terms: [Feature]: Largest possible model.
validator: callable: Callable taking a single argument: state, returning whether this is a valid state.
parsimonious: bool: If True, use the 1sd rule: among the shortest models within one standard deviation of the best score pick the one with the best average score.

Returns:

initial_state: tuple: (column_names, feature_idx)
state_generator: callable: Object that proposes candidates based on current state. Takes a single argument state
build_submodel: callable: Candidate generator that enumerates all valid subsets of columns.
check_finished: callable: Check whether to stop. Takes two arguments: best_result, a dict with keys of scores, and state.

ISLP.models.strategy.validator_from_constraints(model_spec, constraints)#
