PCA Statistics for CBGTC spike times

class analyseur.cbgtc.stats.pca.PCA[source]

Bases: object

Computes principal component analysis (PCA) of population spike activity.

Use Cases

1. Pre-requisites

1.1. Import Modules

from analyseur.cbgtc.loader import LoadSpikeTimes
from analyseur.cbgtc.stats.pca import PCA

1.2. Load file and get spike times

loadST = LoadSpikeTimes("spikes_GPi.csv")
spiketimes_superset = loadST.get_spiketimes_superset()

2. Cases

2.1. Compute PCA with default values

pca, pca_trajectory, activity_matrix, time_bins = PCA.compute(spiketimes_superset)

2.2. Compute PCA with desired parameters

pca, pca_trajectory, activity_matrix, time_bins = PCA.compute(
    spiketimes_superset,
    binsz=0.01,
    window=(0, 10),
    n_comp=0.95,
    sigma_bins=2,
)

2.3. Get participation ratio

pr = PCA.participation_ratio(pca)

2.4. Get eigen spectrum

eig_spec, cum_spec = PCA.eigenspectrum(pca)

classmethod compute(spiketimes_set, binsz=None, window=None, n_comp=None, sigma_bins=None)[source]

Computes PCA for a given set of spike times.

Parameters:
  • spiketimes_set – Dictionary of spike times, for example the one returned by get_spiketimes_superset() in the use cases above

  • binsz – bin width, integer or float; 0.01 [default]

  • window – Tuple in the form (start_time, end_time); (0, 10) [default]

  • n_comp – integer (fixed number of components) or float (fraction of variance to retain); 0.95 [default]

  • sigma_bins – integer or float; 2 [default]

Returns:

  • pca : PCA model

  • pca_trajectory : numpy array, Projection matrix

  • activity_matrix : numpy array, Population activity

  • time_bins : numpy array, Time bins

Given,

\[A \in \mathbb{R}^{n_\text{nuc} \times n_\text{bins}}\]

where \(a(i,t)\) is the number of spikes of neuron \(i\) in bin \(t\). \(A\) is a noisy matrix because the variance of each spike count is comparable to its mean, \(\operatorname{Var}(a(i,t)) \approx \mathbb{E}[a(i,t)]\).

Step-1: Temporal smoothing

\[\widetilde{a}(i,t) = \sum_\tau \left[a(i,\tau) \cdot K(t-\tau)\right]\]

where the smoothing kernel \(K\) is the Gaussian

\[K(\Delta t) = \frac{1}{\sqrt{2\pi}\,\sigma} \cdot e^{-\frac{\Delta t^2}{2\sigma^2}}\]
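The smoothing step can be illustrated with a minimal NumPy sketch. It is not the library's implementation: the helper names `gaussian_kernel` and `smooth_rows` and the `truncate` cut-off are assumptions made for illustration, and the discrete kernel is normalised to sum to 1 so that the total spike count is preserved away from the edges.

```python
import numpy as np

def gaussian_kernel(sigma_bins, truncate=4.0):
    """Discrete Gaussian kernel K(dt), normalised so that it sums to 1."""
    radius = int(np.ceil(truncate * sigma_bins))
    dt = np.arange(-radius, radius + 1)          # offsets in units of bins
    kernel = np.exp(-(dt ** 2) / (2.0 * sigma_bins ** 2))
    return kernel / kernel.sum()

def smooth_rows(activity, sigma_bins):
    """Convolve each row (one neuron) of `activity` with the Gaussian kernel."""
    kernel = gaussian_kernel(sigma_bins)
    return np.array([np.convolve(row, kernel, mode="same") for row in activity])

# Toy example (hypothetical data): one neuron, one spike in the middle of 21 bins.
counts = np.zeros((1, 21))
counts[0, 10] = 1.0
smoothed = smooth_rows(counts, sigma_bins=2)
```

The single spike is spread into a bump centred on the same bin. Note that `mode="same"` assumes the row is at least as long as the kernel.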

Step-2: Mean subtraction

\[\widetilde{a}(i,t) \leftarrow \widetilde{a}(i,t) - \frac{1}{n_\text{nuc}} \sum_{j=1}^{n_\text{nuc}} \widetilde{a}(j,t)\]

removes the global population activity in each time bin.
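As a small worked example of this mean subtraction (a sketch on made-up numbers, with rows as neurons and columns as time bins), the average over neurons is removed from every time bin:

```python
import numpy as np

# Hypothetical smoothed activity: 3 neurons (rows) x 3 time bins (columns).
smoothed = np.array([[1.0, 2.0, 3.0],
                     [3.0, 2.0, 1.0],
                     [2.0, 5.0, 2.0]])

# Mean over neurons for each time bin, kept as a row for broadcasting.
population_mean = smoothed.mean(axis=0, keepdims=True)

# Subtract the population mean from every neuron in that bin.
centred = smoothed - population_mean
```

After this step each column of `centred` sums to zero, so what remains is each neuron's deviation from the population average.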

Step-3: Transpose activity matrix

\[ \begin{align}\begin{aligned}\tilde{A}^T \in \mathbb{R}^{n_\text{bins} \times n_\text{nuc}} \\X = \tilde{A}^T\end{aligned}\end{align} \]

for analyzing population activity trajectories over time, where each row \(x(t,\cdot)\) of \(X\) is the population activity vector at time \(t\).

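Step-4: Number of components

The smallest \(P\) is chosen such that

\[\sum_{p=1}^{P}\lambda_p \ge q \sum_{i=1}^{n_\text{nuc}}\lambda_i\]

where \(\lambda_1 \ge \lambda_2 \ge \dots\) are the eigenvalues of the covariance matrix \(C\) defined in Step-5, and \(q\) is the parameter n_comp when it is given as a fraction (0.95 by default).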
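The selection rule for \(P\) can be sketched as follows. The function name `n_components_for_variance` is hypothetical and the eigenvalues are made up; the sketch only illustrates the inequality above.

```python
import numpy as np

def n_components_for_variance(eigenvalues, q):
    """Smallest P whose leading eigenvalues explain at least a fraction q."""
    lam = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]   # descending
    cumulative = np.cumsum(lam) / lam.sum()
    # First index whose cumulative fraction reaches q, converted to a count.
    P = int(np.searchsorted(cumulative, q)) + 1
    return min(P, lam.size)          # guard against round-off when q is 1.0

# Hypothetical spectrum: cumulative fractions are 0.5, 0.8, 0.9, 1.0.
lam = [5.0, 3.0, 1.0, 1.0]
P_95 = n_components_for_variance(lam, 0.95)
P_75 = n_components_for_variance(lam, 0.75)
```

With this spectrum, three components explain only 90% of the variance, so a 95% threshold needs all four, while a 75% threshold is met by the first two.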

Step-5: PCA model

\[ \begin{align}\begin{aligned}C_{n_\text{nuc} \times n_\text{nuc}} = \frac{1}{(n_\text{bins} - 1)} X^T X \\C \vec{w_p} = \lambda_p \vec{w_p} \\W_{n_\text{nuc} \times P} = [\vec{w_1}, ..., \vec{w_P}]\end{aligned}\end{align} \]

where \(\vec{w}_p \in \mathbb{R}^{n_\text{nuc}}\) is a principal component direction across neurons and \(W\) is the basis matrix.

Step-6: PCA trajectory

\[Z_{n_\text{bins} \times P} = X W \]

where \(z(t,p) = \sum_{i=1}^{n_\text{nuc}} x(t,i)\,w(i,p)\) and \(\vec{z}(t)\) is the population state in the low-dimensional space.
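Steps 5 and 6 can be reproduced with a plain eigendecomposition. This is a sketch on random data, not the library's code. It assumes the columns of \(X\) are centred over time so that \(C\) is a covariance matrix, and the variable names are chosen to mirror the symbols above.

```python
import numpy as np

rng = np.random.default_rng(0)
n_bins, n_nuc, P = 200, 5, 2

# Hypothetical data matrix X (time bins x neurons), centred per column.
X = rng.normal(size=(n_bins, n_nuc))
X = X - X.mean(axis=0)

# Step-5: covariance matrix and its eigendecomposition.
C = X.T @ X / (n_bins - 1)
eigvals, eigvecs = np.linalg.eigh(C)      # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]         # re-order to descending
lam = eigvals[order]
W = eigvecs[:, order[:P]]                 # basis matrix, n_nuc x P

# Step-6: project the population activity onto the first P components.
Z = X @ W                                 # trajectory, n_bins x P
```

Each column of `Z` then has variance equal to the corresponding eigenvalue, which is a quick consistency check on the decomposition.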


static eigenspectrum(pca)[source]

Given the computed PCA model, the fraction of explained variance as a function of the component index \(p\) is

\[\tilde{\lambda}_p = \frac{\lambda_p}{\sum_{i}\lambda_i}\]

The cumulative spectrum is

\[S_p = \sum_{i=1}^p \tilde{\lambda}_i\]

COMMENT:

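A plot of the eigenspectrum shows whether the activity is low-dimensional (a few components carry most of the variance, so \(S_p\) saturates quickly) or high-dimensional (the variance is spread over many components).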
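The two spectra can be computed directly from a list of eigenvalues. The sketch below uses made-up eigenvalues and does not depend on how the PCA model stores them.

```python
import numpy as np

# Hypothetical eigenvalues, already sorted in descending order.
lam = np.array([4.0, 3.0, 2.0, 1.0])

eig_spec = lam / lam.sum()        # normalised eigenspectrum
cum_spec = np.cumsum(eig_spec)    # cumulative spectrum S_p
```

Here the first two components already account for 70% of the variance, and the cumulative spectrum ends at 1.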


static get_spike_activity_matrix(spiketimes_set, window, binsz)[source]

\[ \begin{align}\begin{aligned}t_0, t_1, t_2, ..., t_{n_\text{bins}} \in \mathbb{R} \\A \in \mathbb{R}^{n_\text{nuc} \times n_\text{bins}}\end{aligned}\end{align} \]

where \(n_\text{nuc}\) is the number of neurons, \(n_\text{bins}\) is the number of time bins, \(t_j = t_0 + j \cdot \Delta t\) for \(j = 0,1, ..., n_\text{bins}\), \(\Delta t\) is bin width, and \(a(i,t)\) is the number of spikes of neuron \(i\) in bin \(t\).

Returns the activity matrix and time bins.
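The binning can be sketched with `numpy.histogram`. This is an illustration, not the method's source: the function name `spike_activity_matrix` and the input layout (a dict mapping a neuron label to its spike times) are assumptions.

```python
import numpy as np

def spike_activity_matrix(spiketimes, window, binsz):
    """Count spikes per neuron and time bin.

    `spiketimes` is a hypothetical dict {neuron_label: sequence of spike times}.
    Returns the count matrix (neurons x bins) and the bin edges t_0 ... t_n.
    """
    t_start, t_end = window
    n_bins = int(round((t_end - t_start) / binsz))
    edges = t_start + binsz * np.arange(n_bins + 1)     # t_j = t_0 + j * dt
    rows = [np.histogram(times, bins=edges)[0] for times in spiketimes.values()]
    return np.vstack(rows), edges

# Toy data: two neurons, four bins of width 0.1.
toy = {"n0": [0.05, 0.15, 0.17], "n1": [0.31]}
A, edges = spike_activity_matrix(toy, window=(0.0, 0.4), binsz=0.1)
```

In the toy data, neuron `n0` has one spike in the first bin and two in the second, while `n1` has a single spike in the last bin.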

Note that given the activity matrix \(A\)

  • PCA(\(A\)) measures neuron correlation structure

  • PCA(\(A^T\)) measures population activity trajectories over time.


static participation_ratio(pca)[source]

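Given the computed PCA model, the participation ratio

\[D_\text{PR} = \frac{\left(\sum_{i=1}^P \lambda_i\right)^2}{\sum_{i=1}^P \lambda_i^2}\]

measures how evenly the variance is spread across components: it ranges from 1 (all variance in a single component) to \(P\) (variance shared equally). Consequently, the complexity is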

\[\text{complexity} = \frac{D_\text{PR}}{n_\text{nuc}}\]
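A short numerical check of the participation ratio, using the standard definition with the squared sum of eigenvalues in the numerator. The function below is a standalone sketch on made-up eigenvalues, not the library's static method.

```python
import numpy as np

def participation_ratio(eigenvalues):
    """(sum of eigenvalues)^2 divided by the sum of squared eigenvalues."""
    lam = np.asarray(eigenvalues, dtype=float)
    return lam.sum() ** 2 / np.sum(lam ** 2)

# Variance shared equally across 4 components: the ratio equals 4.
pr_uniform = participation_ratio([1.0, 1.0, 1.0, 1.0])

# All variance in one component: the ratio equals 1.
pr_single = participation_ratio([1.0, 0.0, 0.0, 0.0])

# Complexity for a hypothetical population of 4 neurons.
complexity = pr_uniform / 4
```

The two extreme cases bracket the possible values, which makes the ratio easy to read as an effective number of dimensions.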

NOTE:

The n_comp in compute() can be either a fixed number of components, \(dim=\) n_comp, or a variance threshold, \(\sum_{p=1}^{P}\lambda_p \ge\) n_comp \(\cdot \sum_{i}\lambda_i\). In the latter case the estimate of dimensionality depends on an arbitrary threshold.