🎯

statistical-analysis

🎯Skill

from ovachiever/droid-tings

VibeIndex|
What it does

Performs comprehensive statistical analysis with hypothesis testing, regression, correlation, and Bayesian methods, generating APA-compliant research reports.

πŸ“¦

Part of

ovachiever/droid-tings(370 items)

statistical-analysis

Installation

git cloneClone repository
git clone https://github.com/ovachiever/droid-tings.git
πŸ“– Extracted from docs: ovachiever/droid-tings
14Installs
-
AddedFeb 4, 2026

Skill Details

SKILL.md

"Statistical analysis toolkit. Hypothesis tests (t-test, ANOVA, chi-square), regression, correlation, Bayesian stats, power analysis, assumption checks, APA reporting, for academic research."

Overview

# Statistical Analysis

Overview

Statistical analysis is a systematic process for testing hypotheses and quantifying relationships. Conduct hypothesis tests (t-test, ANOVA, chi-square), regression, correlation, and Bayesian analyses with assumption checks and APA reporting. Apply this skill for academic research.

When to Use This Skill

This skill should be used when:

  • Conducting statistical hypothesis tests (t-tests, ANOVA, chi-square)
  • Performing regression or correlation analyses
  • Running Bayesian statistical analyses
  • Checking statistical assumptions and diagnostics
  • Calculating effect sizes and conducting power analyses
  • Reporting statistical results in APA format
  • Analyzing experimental or observational data for research

---

Core Capabilities

1. Test Selection and Planning

  • Choose appropriate statistical tests based on research questions and data characteristics
  • Conduct a priori power analyses to determine required sample sizes
  • Plan analysis strategies including multiple comparison corrections

2. Assumption Checking

  • Automatically verify all relevant assumptions before running tests
  • Provide diagnostic visualizations (Q-Q plots, residual plots, box plots)
  • Recommend remedial actions when assumptions are violated

3. Statistical Testing

  • Hypothesis testing: t-tests, ANOVA, chi-square, non-parametric alternatives
  • Regression: linear, multiple, logistic, with diagnostics
  • Correlations: Pearson, Spearman, with confidence intervals
  • Bayesian alternatives: Bayesian t-tests, ANOVA, regression with Bayes Factors

4. Effect Sizes and Interpretation

  • Calculate and interpret appropriate effect sizes for all analyses
  • Provide confidence intervals for effect estimates
  • Distinguish statistical from practical significance

5. Professional Reporting

  • Generate APA-style statistical reports
  • Create publication-ready figures and tables
  • Provide complete interpretation with all required statistics

---

Workflow Decision Tree

Use this decision tree to determine your analysis path:

```

START

β”‚

β”œβ”€ Need to SELECT a statistical test?

β”‚ └─ YES β†’ See "Test Selection Guide"

β”‚ └─ NO β†’ Continue

β”‚

β”œβ”€ Ready to check ASSUMPTIONS?

β”‚ └─ YES β†’ See "Assumption Checking"

β”‚ └─ NO β†’ Continue

β”‚

β”œβ”€ Ready to run ANALYSIS?

β”‚ └─ YES β†’ See "Running Statistical Tests"

β”‚ └─ NO β†’ Continue

β”‚

└─ Need to REPORT results?

└─ YES β†’ See "Reporting Results"

```

---

Test Selection Guide

Quick Reference: Choosing the Right Test

Use references/test_selection_guide.md for comprehensive guidance. Quick reference:

Comparing Two Groups:

  • Independent, continuous, normal β†’ Independent t-test
  • Independent, continuous, non-normal β†’ Mann-Whitney U test
  • Paired, continuous, normal β†’ Paired t-test
  • Paired, continuous, non-normal β†’ Wilcoxon signed-rank test
  • Binary outcome β†’ Chi-square or Fisher's exact test

Comparing 3+ Groups:

  • Independent, continuous, normal β†’ One-way ANOVA
  • Independent, continuous, non-normal β†’ Kruskal-Wallis test
  • Paired, continuous, normal β†’ Repeated measures ANOVA
  • Paired, continuous, non-normal β†’ Friedman test

Relationships:

  • Two continuous variables β†’ Pearson (normal) or Spearman correlation (non-normal)
  • Continuous outcome with predictor(s) β†’ Linear regression
  • Binary outcome with predictor(s) β†’ Logistic regression

Bayesian Alternatives:

All tests have Bayesian versions that provide:

  • Direct probability statements about hypotheses
  • Bayes Factors quantifying evidence
  • Ability to support null hypothesis
  • See references/bayesian_statistics.md

---

Assumption Checking

Systematic Assumption Verification

ALWAYS check assumptions before interpreting test results.

Use the provided scripts/assumption_checks.py module for automated checking:

```python

from scripts.assumption_checks import comprehensive_assumption_check

# Comprehensive check with visualizations

results = comprehensive_assumption_check(

data=df,

value_col='score',

group_col='group', # Optional: for group comparisons

alpha=0.05

)

```

This performs:

  1. Outlier detection (IQR and z-score methods)
  2. Normality testing (Shapiro-Wilk test + Q-Q plots)
  3. Homogeneity of variance (Levene's test + box plots)
  4. Interpretation and recommendations

Individual Assumption Checks

For targeted checks, use individual functions:

```python

from scripts.assumption_checks import (

check_normality,

check_normality_per_group,

check_homogeneity_of_variance,

check_linearity,

detect_outliers

)

# Example: Check normality with visualization

result = check_normality(

data=df['score'],

name='Test Score',

alpha=0.05,

plot=True

)

print(result['interpretation'])

print(result['recommendation'])

```

What to Do When Assumptions Are Violated

Normality violated:

  • Mild violation + n > 30 per group β†’ Proceed with parametric test (robust)
  • Moderate violation β†’ Use non-parametric alternative
  • Severe violation β†’ Transform data or use non-parametric test

Homogeneity of variance violated:

  • For t-test β†’ Use Welch's t-test
  • For ANOVA β†’ Use Welch's ANOVA or Brown-Forsythe ANOVA
  • For regression β†’ Use robust standard errors or weighted least squares

Linearity violated (regression):

  • Add polynomial terms
  • Transform variables
  • Use non-linear models or GAM

See references/assumptions_and_diagnostics.md for comprehensive guidance.

---

Running Statistical Tests

Python Libraries

Primary libraries for statistical analysis:

  • scipy.stats: Core statistical tests
  • statsmodels: Advanced regression and diagnostics
  • pingouin: User-friendly statistical testing with effect sizes
  • pymc: Bayesian statistical modeling
  • arviz: Bayesian visualization and diagnostics

Example Analyses

#### T-Test with Complete Reporting

```python

import pingouin as pg

import numpy as np

# Run independent t-test

result = pg.ttest(group_a, group_b, correction='auto')

# Extract results

t_stat = result['T'].values[0]

df = result['dof'].values[0]

p_value = result['p-val'].values[0]

cohens_d = result['cohen-d'].values[0]

ci_lower = result['CI95%'].values[0][0]

ci_upper = result['CI95%'].values[0][1]

# Report

print(f"t({df:.0f}) = {t_stat:.2f}, p = {p_value:.3f}")

print(f"Cohen's d = {cohens_d:.2f}, 95% CI [{ci_lower:.2f}, {ci_upper:.2f}]")

```

#### ANOVA with Post-Hoc Tests

```python

import pingouin as pg

# One-way ANOVA

aov = pg.anova(dv='score', between='group', data=df, detailed=True)

print(aov)

# If significant, conduct post-hoc tests

if aov['p-unc'].values[0] < 0.05:

posthoc = pg.pairwise_tukey(dv='score', between='group', data=df)

print(posthoc)

# Effect size

eta_squared = aov['np2'].values[0] # Partial eta-squared

print(f"Partial Ξ·Β² = {eta_squared:.3f}")

```

#### Linear Regression with Diagnostics

```python

import statsmodels.api as sm

from statsmodels.stats.outliers_influence import variance_inflation_factor

# Fit model

X = sm.add_constant(X_predictors) # Add intercept

model = sm.OLS(y, X).fit()

# Summary

print(model.summary())

# Check multicollinearity (VIF)

vif_data = pd.DataFrame()

vif_data["Variable"] = X.columns

vif_data["VIF"] = [variance_inflation_factor(X.values, i) for i in range(X.shape[1])]

print(vif_data)

# Check assumptions

residuals = model.resid

fitted = model.fittedvalues

# Residual plots

import matplotlib.pyplot as plt

fig, axes = plt.subplots(2, 2, figsize=(12, 10))

# Residuals vs fitted

axes[0, 0].scatter(fitted, residuals, alpha=0.6)

axes[0, 0].axhline(y=0, color='r', linestyle='--')

axes[0, 0].set_xlabel('Fitted values')

axes[0, 0].set_ylabel('Residuals')

axes[0, 0].set_title('Residuals vs Fitted')

# Q-Q plot

from scipy import stats

stats.probplot(residuals, dist="norm", plot=axes[0, 1])

axes[0, 1].set_title('Normal Q-Q')

# Scale-Location

axes[1, 0].scatter(fitted, np.sqrt(np.abs(residuals / residuals.std())), alpha=0.6)

axes[1, 0].set_xlabel('Fitted values')

axes[1, 0].set_ylabel('√|Standardized residuals|')

axes[1, 0].set_title('Scale-Location')

# Residuals histogram

axes[1, 1].hist(residuals, bins=20, edgecolor='black', alpha=0.7)

axes[1, 1].set_xlabel('Residuals')

axes[1, 1].set_ylabel('Frequency')

axes[1, 1].set_title('Histogram of Residuals')

plt.tight_layout()

plt.show()

```

#### Bayesian T-Test

```python

import pymc as pm

import arviz as az

import numpy as np

with pm.Model() as model:

# Priors

mu1 = pm.Normal('mu_group1', mu=0, sigma=10)

mu2 = pm.Normal('mu_group2', mu=0, sigma=10)

sigma = pm.HalfNormal('sigma', sigma=10)

# Likelihood

y1 = pm.Normal('y1', mu=mu1, sigma=sigma, observed=group_a)

y2 = pm.Normal('y2', mu=mu2, sigma=sigma, observed=group_b)

# Derived quantity

diff = pm.Deterministic('difference', mu1 - mu2)

# Sample

trace = pm.sample(2000, tune=1000, return_inferencedata=True)

# Summarize

print(az.summary(trace, var_names=['difference']))

# Probability that group1 > group2

prob_greater = np.mean(trace.posterior['difference'].values > 0)

print(f"P(μ₁ > ΞΌβ‚‚ | data) = {prob_greater:.3f}")

# Plot posterior

az.plot_posterior(trace, var_names=['difference'], ref_val=0)

```

---

Effect Sizes

Always Calculate Effect Sizes

Effect sizes quantify magnitude, while p-values only indicate existence of an effect.

See references/effect_sizes_and_power.md for comprehensive guidance.

Quick Reference: Common Effect Sizes

| Test | Effect Size | Small | Medium | Large |

|------|-------------|-------|--------|-------|

| T-test | Cohen's d | 0.20 | 0.50 | 0.80 |

| ANOVA | Ξ·Β²_p | 0.01 | 0.06 | 0.14 |

| Correlation | r | 0.10 | 0.30 | 0.50 |

| Regression | RΒ² | 0.02 | 0.13 | 0.26 |

| Chi-square | CramΓ©r's V | 0.07 | 0.21 | 0.35 |

Important: Benchmarks are guidelines. Context matters!

Calculating Effect Sizes

Most effect sizes are automatically calculated by pingouin:

```python

# T-test returns Cohen's d

result = pg.ttest(x, y)

d = result['cohen-d'].values[0]

# ANOVA returns partial eta-squared

aov = pg.anova(dv='score', between='group', data=df)

eta_p2 = aov['np2'].values[0]

# Correlation: r is already an effect size

corr = pg.corr(x, y)

r = corr['r'].values[0]

```

Confidence Intervals for Effect Sizes

Always report CIs to show precision:

```python

from pingouin import compute_effsize_from_t

# For t-test

d, ci = compute_effsize_from_t(

t_statistic,

nx=len(group1),

ny=len(group2),

eftype='cohen'

)

print(f"d = {d:.2f}, 95% CI [{ci[0]:.2f}, {ci[1]:.2f}]")

```

---

Power Analysis

A Priori Power Analysis (Study Planning)

Determine required sample size before data collection:

```python

from statsmodels.stats.power import (

tt_ind_solve_power,

FTestAnovaPower

)

# T-test: What n is needed to detect d = 0.5?

n_required = tt_ind_solve_power(

effect_size=0.5,

alpha=0.05,

power=0.80,

ratio=1.0,

alternative='two-sided'

)

print(f"Required n per group: {n_required:.0f}")

# ANOVA: What n is needed to detect f = 0.25?

anova_power = FTestAnovaPower()

n_per_group = anova_power.solve_power(

effect_size=0.25,

ngroups=3,

alpha=0.05,

power=0.80

)

print(f"Required n per group: {n_per_group:.0f}")

```

Sensitivity Analysis (Post-Study)

Determine what effect size you could detect:

```python

# With n=50 per group, what effect could we detect?

detectable_d = tt_ind_solve_power(

effect_size=None, # Solve for this

nobs1=50,

alpha=0.05,

power=0.80,

ratio=1.0,

alternative='two-sided'

)

print(f"Study could detect d β‰₯ {detectable_d:.2f}")

```

Note: Post-hoc power analysis (calculating power after study) is generally not recommended. Use sensitivity analysis instead.

See references/effect_sizes_and_power.md for detailed guidance.

---

Reporting Results

APA Style Statistical Reporting

Follow guidelines in references/reporting_standards.md.

Essential Reporting Elements

  1. Descriptive statistics: M, SD, n for all groups/variables
  2. Test statistics: Test name, statistic, df, exact p-value
  3. Effect sizes: With confidence intervals
  4. Assumption checks: Which tests were done, results, actions taken
  5. All planned analyses: Including non-significant findings

Example Report Templates

#### Independent T-Test

```

Group A (n = 48, M = 75.2, SD = 8.5) scored significantly higher than

Group B (n = 52, M = 68.3, SD = 9.2), t(98) = 3.82, p < .001, d = 0.77,

95% CI [0.36, 1.18], two-tailed. Assumptions of normality (Shapiro-Wilk:

Group A W = 0.97, p = .18; Group B W = 0.96, p = .12) and homogeneity

of variance (Levene's F(1, 98) = 1.23, p = .27) were satisfied.

```

#### One-Way ANOVA

```

A one-way ANOVA revealed a significant main effect of treatment condition

on test scores, F(2, 147) = 8.45, p < .001, Ξ·Β²_p = .10. Post hoc

comparisons using Tukey's HSD indicated that Condition A (M = 78.2,

SD = 7.3) scored significantly higher than Condition B (M = 71.5,

SD = 8.1, p = .002, d = 0.87) and Condition C (M = 70.1, SD = 7.9,

p < .001, d = 1.07). Conditions B and C did not differ significantly

(p = .52, d = 0.18).

```

#### Multiple Regression

```

Multiple linear regression was conducted to predict exam scores from

study hours, prior GPA, and attendance. The overall model was significant,

F(3, 146) = 45.2, p < .001, RΒ² = .48, adjusted RΒ² = .47. Study hours

(B = 1.80, SE = 0.31, Ξ² = .35, t = 5.78, p < .001, 95% CI [1.18, 2.42])

and prior GPA (B = 8.52, SE = 1.95, Ξ² = .28, t = 4.37, p < .001,

95% CI [4.66, 12.38]) were significant predictors, while attendance was

not (B = 0.15, SE = 0.12, Ξ² = .08, t = 1.25, p = .21, 95% CI [-0.09, 0.39]).

Multicollinearity was not a concern (all VIF < 1.5).

```

#### Bayesian Analysis

```

A Bayesian independent samples t-test was conducted using weakly

informative priors (Normal(0, 1) for mean difference). The posterior

distribution indicated that Group A scored higher than Group B

(M_diff = 6.8, 95% credible interval [3.2, 10.4]). The Bayes Factor

BF₁₀ = 45.3 provided very strong evidence for a difference between

groups, with a 99.8% posterior probability that Group A's mean exceeded

Group B's mean. Convergence diagnostics were satisfactory (all RΜ‚ < 1.01,

ESS > 1000).

```

---

Bayesian Statistics

When to Use Bayesian Methods

Consider Bayesian approaches when:

  • You have prior information to incorporate
  • You want direct probability statements about hypotheses
  • Sample size is small or planning sequential data collection
  • You need to quantify evidence for the null hypothesis
  • The model is complex (hierarchical, missing data)

See references/bayesian_statistics.md for comprehensive guidance on:

  • Bayes' theorem and interpretation
  • Prior specification (informative, weakly informative, non-informative)
  • Bayesian hypothesis testing with Bayes Factors
  • Credible intervals vs. confidence intervals
  • Bayesian t-tests, ANOVA, regression, and hierarchical models
  • Model convergence checking and posterior predictive checks

Key Advantages

  1. Intuitive interpretation: "Given the data, there is a 95% probability the parameter is in this interval"
  2. Evidence for null: Can quantify support for no effect
  3. Flexible: No p-hacking concerns; can analyze data as it arrives
  4. Uncertainty quantification: Full posterior distribution

---

Resources

This skill includes comprehensive reference materials:

References Directory

  • test_selection_guide.md: Decision tree for choosing appropriate statistical tests
  • assumptions_and_diagnostics.md: Detailed guidance on checking and handling assumption violations
  • effect_sizes_and_power.md: Calculating, interpreting, and reporting effect sizes; conducting power analyses
  • bayesian_statistics.md: Complete guide to Bayesian analysis methods
  • reporting_standards.md: APA-style reporting guidelines with examples

Scripts Directory

  • assumption_checks.py: Automated assumption checking with visualizations

- comprehensive_assumption_check(): Complete workflow

- check_normality(): Normality testing with Q-Q plots

- check_homogeneity_of_variance(): Levene's test with box plots

- check_linearity(): Regression linearity checks

- detect_outliers(): IQR and z-score outlier detection

---

Best Practices

  1. Pre-register analyses when possible to distinguish confirmatory from exploratory
  2. Always check assumptions before interpreting results
  3. Report effect sizes with confidence intervals
  4. Report all planned analyses including non-significant results
  5. Distinguish statistical from practical significance
  6. Visualize data before and after analysis
  7. Check diagnostics for regression/ANOVA (residual plots, VIF, etc.)
  8. Conduct sensitivity analyses to assess robustness
  9. Share data and code for reproducibility
  10. Be transparent about violations, transformations, and decisions

---

Common Pitfalls to Avoid

  1. P-hacking: Don't test multiple ways until something is significant
  2. HARKing: Don't present exploratory findings as confirmatory
  3. Ignoring assumptions: Check them and report violations
  4. Confusing significance with importance: p < .05 β‰  meaningful effect
  5. Not reporting effect sizes: Essential for interpretation
  6. Cherry-picking results: Report all planned analyses
  7. Misinterpreting p-values: They're NOT probability that hypothesis is true
  8. Multiple comparisons: Correct for family-wise error when appropriate
  9. Ignoring missing data: Understand mechanism (MCAR, MAR, MNAR)
  10. Overinterpreting non-significant results: Absence of evidence β‰  evidence of absence

---

Getting Started Checklist

When beginning a statistical analysis:

  • [ ] Define research question and hypotheses
  • [ ] Determine appropriate statistical test (use test_selection_guide.md)
  • [ ] Conduct power analysis to determine sample size
  • [ ] Load and inspect data
  • [ ] Check for missing data and outliers
  • [ ] Verify assumptions using assumption_checks.py
  • [ ] Run primary analysis
  • [ ] Calculate effect sizes with confidence intervals
  • [ ] Conduct post-hoc tests if needed (with corrections)
  • [ ] Create visualizations
  • [ ] Write results following reporting_standards.md
  • [ ] Conduct sensitivity analyses
  • [ ] Share data and code

---

Support and Further Reading

For questions about:

  • Test selection: See references/test_selection_guide.md
  • Assumptions: See references/assumptions_and_diagnostics.md
  • Effect sizes: See references/effect_sizes_and_power.md
  • Bayesian methods: See references/bayesian_statistics.md
  • Reporting: See references/reporting_standards.md

Key textbooks:

  • Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences
  • Field, A. (2013). Discovering Statistics Using IBM SPSS Statistics
  • Gelman, A., & Hill, J. (2006). Data Analysis Using Regression and Multilevel/Hierarchical Models
  • Kruschke, J. K. (2014). Doing Bayesian Data Analysis

Online resources:

  • APA Style Guide: https://apastyle.apa.org/
  • Statistical Consulting: Cross Validated (stats.stackexchange.com)