Research methods

Overview

This guide summarizes core principles, histories, methods, and practical advice for designing, conducting, analyzing, and reporting research across disciplines. It targets graduate students, early-career researchers, and others seeking a rigorous orientation to research methods.

Purpose of research

Exploratory: generate hypotheses or map phenomena.
Descriptive: characterize distributions or states.
Explanatory/Analytic: test hypotheses and infer causes.
Evaluative: assess interventions or programs.
Predictive: forecast future observations.

Historical roots (brief)

Origins in natural philosophy; formal scientific method (17th–18th c.); rise of statistics in the 19th century; formal inference and experimental design in the early 20th century; diversification of qualitative methods mid-20th century; late-20th/21st-century growth in computational methods, causal frameworks (Rubin, DAGs) and open-science reforms.

Key concepts & terminology

Population vs. sample; variable types (categorical, continuous).
Internal/external/construct/statistical validity; reliability and bias.
Confounding, mediation, moderation; precision, power, and sample size.

Major research paradigms

Positivism/Post-positivism: quantitative, hypothesis-driven.
Interpretivism/Constructivism: qualitative, meaning-focused.
Critical theory: power and change-oriented.
Pragmatism: method choice driven by research question; supports mixed methods.

Research designs & strategies

Design selection should follow the research question, feasibility, ethics, and causal requirements.
Quantitative: experimental (RCTs, factorial, cluster), quasi-experimental (DiD, RDD, IV), observational (cohort, case-control, cross-sectional).
Qualitative: ethnography, phenomenology, grounded theory, case studies, narrative research.
Mixed methods: convergent, explanatory sequential, exploratory sequential, embedded designs.

Sampling & study populations

Probability sampling: simple random, stratified, cluster; enables unbiased inference and known sampling error.
Non-probability sampling: convenience, purposive, snowball; common in qualitative or hard-to-reach contexts.
Key tasks: define target population, set inclusion/exclusion criteria, address frame coverage and selection bias, compute sample size (account for design effects).

Measurement, reliability & validity

Scale types: nominal, ordinal, interval, ratio; common instruments include Likert scales.
Reliability: test–retest, inter-rater (kappa, ICC), internal consistency (Cronbach's alpha/omega).
Validity: content, construct (convergent/discriminant), criterion, face validity.
Psychometrics: IRT vs classical test theory; factor analysis (EFA, CFA) for scale development.

Data collection methods

Quantitative: surveys, experiments (lab/field/online), administrative/secondary data, sensors and digital traces.
Qualitative: interviews, focus groups, observation, document analysis.
Practical needs: piloting, training, SOPs, quality monitoring.

Data analysis approaches

Descriptive: summary statistics and visualization.
Inferential: hypothesis tests, regression (linear, logistic, Poisson), multilevel models, time series, survival analysis, nonparametric methods.
Qualitative analysis: coding, thematic analysis, grounded theory procedures.
Computational methods: machine learning (supervised/unsupervised), NLP, network analysis.
Model assessment: diagnostics, model selection (AIC/BIC/cross-validation), effect sizes and CIs.

Causal inference & advanced methods

Design-based preference: randomization (RCTs), natural/quasi-experiments (DiD, RDD, IV, synthetic controls).
Tools: Rubin Causal Model (potential outcomes), DAGs to represent assumptions and identify confounders.
Methods: propensity scores, mediation analysis, sensitivity analyses (E-values), doubly robust estimators, TMLE, causal forests for HTE.

Reproducibility, transparency & open science

Practices: pre-registration, registered reports, open data/code, FAIR data principles, comprehensive reporting (CONSORT, STROBE, PRISMA, COREQ).
Tools: version control (git), literate programming (R Markdown, Jupyter), containers (Docker), workflow automation (Snakemake).

Ethics & regulation

IRBs/ethics committees, informed consent, privacy and data security, protections for vulnerable populations.
Responsible conduct: authorship, conflicts of interest, avoiding fabrication/falsification; special attention to digital/AI research and dual-use concerns.

Practical workflow: idea to publication

Formulate question and theory; review literature.
Choose design, plan sampling/measures, power analysis.
Obtain ethics approval, pilot, collect and document data.
Pre-register where appropriate, analyze with diagnostics and sensitivity checks.
Report with appropriate guidelines, share data/code, and disseminate findings.

Tools & resources (high-level)

Statistical: R, Python, Stata, SAS, SPSS; SEM: Mplus, AMOS.
Qualitative: NVivo, ATLAS.ti, MAXQDA; transcription services.
Reproducibility: Git/GitHub, Jupyter, RStudio, Docker, Binder.
Causal: dagitty, DoWhy, EconML, causalml.

Contemporary debates

Reproducibility crisis, p-hacking, and incentives for significant results.
Tension between predictive machine-learning models and interpretable causal inference.
Big data strengths vs measurement/selection biases; ethical concerns around AI and automated research.
Challenges and opportunities in integrating interdisciplinary and mixed-methods approaches.

Future directions

Broader adoption of open science, automated/AI-assisted workflows, and causal discovery tools.
Personalized/adaptive experiments, living syntheses (living systematic reviews), and stronger ethical frameworks for AI and synthetic data.
Increased citizen and participatory research.

Examples & quick checklists

Examples: RCT in medicine, DiD for policy evaluation, grounded theory for qualitative inquiry, causal forests for HTE.
Design checklist: clear question, justified design, sample size, validated measures, SOPs, ethics approval.
Analysis checklist: pre-registration, reproducible scripts, diagnostics, sensitivity analyses, effect sizes with CIs.
Reporting checklist: use relevant reporting guideline, disclose limitations, share data/code where possible.

Further reading & concluding remarks

Recommended textbooks and resources include Creswell on mixed methods, Shadish/Cook/Campbell on experimental design, Imbens & Rubin on causal inference, Pearl for DAG intuition, and the Cochrane Handbook for systematic reviews.

In sum: research methods combine practical tools and epistemic assumptions. Mastery requires rigorous design thinking, careful measurement, appropriate analysis, transparent reporting, and ethical conduct. These principles guide credible, useful research across evolving data and computational landscapes.

Resources

Qualitative vs Quantitative vs Mixed Methods Research: How To Choose Research Methodology (Grad Coach, 10:11, 785.2K views)
Qualitative research and Quantitative research || types of research() (ntaugcnet, 0:05, 748.3K views)
Fundamentals of Qualitative Research Methods: What is Qualitative Research (Module 1) (Yale University, 13:52, 694.6K views)

Research Methods — A Comprehensive Guide

Research methods are the systematic approaches, tools, and techniques used to ask and answer questions, test hypotheses, build theory, and generate knowledge across disciplines. This article is a deep dive into the history, theoretical foundations, key concepts, practical applications, and future directions of research methods. It is intended for graduate students, early-career researchers, and anyone who wants a rigorous orientation to designing, conducting, analyzing, and reporting research.

Table of contents

Introduction and purpose
Historical overview and intellectual roots
Key concepts and terminology
Major research paradigms
Research designs and strategies
Sampling and study populations
Measurement, reliability, and validity
Data collection methods (qualitative & quantitative)
Data analysis approaches
Causal inference and advanced analytic methods
Reproducibility, transparency, and open science
Ethics and regulatory considerations
Practical workflow: from idea to publication
Tools, software, and resources
Current state and contemporary debates
Future directions and emerging trends
Examples and case studies
Quick checklists and templates
Further reading

1. Introduction and purpose

Research methods provide the rules and procedures that guide the collection, analysis, interpretation, and presentation of evidence. Good methods increase the likelihood that study conclusions are accurate, replicable, and useful. Selecting appropriate methods requires matching research questions and theory to design choices, measurements, and analytic techniques.

Research is often classified by purpose:

Exploratory: generate hypotheses or map phenomena.
Descriptive: describe characteristics or distributions.
Explanatory (analytic): test hypotheses and infer causes.
Evaluative: assess interventions or programs.
Predictive: forecast future observations.

2. Historical overview and intellectual roots

Ancient roots: systematic observation in natural philosophy; early methods in medicine and astronomy.
17th–18th century: The scientific method formalized — emphasis on observation, experiment, and skepticism (Bacon, Galileo, Newton).
19th century: Statistical methods and social statistics emerge; Quetelet, Galton; development of correlation.
Early 20th century: Formalization of statistical inference (Fisher), hypothesis testing (Neyman-Pearson), experimental design, randomized experiments.
Mid 20th century: Behavioral and social sciences diversify methods; qualitative traditions such as ethnography and phenomenology become established.
Late 20th–early 21st century: Expansion of computational methods, machine learning, and causal inference frameworks (Rubin Causal Model, DAGs); the “replication crisis” prompts open science reforms.

Landmark figures and ideas:

Ronald A. Fisher: experimental design, ANOVA, maximum likelihood.
Jerzy Neyman & Egon Pearson: hypothesis testing framework.
Donald Rubin: potential outcomes and causal inference.
Karl Popper: falsifiability.
Lee Cronbach: reliability and validity in measurement.

3. Key concepts and terminology

Population vs. sample
Variable types: categorical (nominal, ordinal), continuous (interval, ratio)
Independent (explanatory) vs dependent (outcome) variables
Confounding, mediation, moderation
Internal validity: the degree to which an observed association reflects a causal effect rather than bias or confounding
External validity (generalizability)
Construct validity: whether measures capture intended constructs
Statistical conclusion validity: appropriateness of statistical inferences
Reliability: stability and consistency of measurement
Bias: systematic error (selection bias, information bias, publication bias)
Precision: how little random variability an estimate has (quantified by standard errors, confidence intervals)
Power and sample size: power is the probability of detecting a true effect of a given size; it rises with sample size
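
Confounding and its effect on internal validity can be illustrated with a small simulation. This is a minimal sketch under assumed conditions: the data-generating process, the coefficients, and the helper `ols_coefs` are hypothetical and chosen only to make the bias visible.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000

# Hypothetical data-generating process: z confounds the x -> y relationship.
z = rng.normal(size=n)                      # confounder
x = 0.8 * z + rng.normal(size=n)            # exposure depends on z
y = 0.5 * x + 1.0 * z + rng.normal(size=n)  # true effect of x on y is 0.5

def ols_coefs(predictors, outcome):
    """Least-squares coefficients, intercept first."""
    design = np.column_stack([np.ones(len(outcome))] + list(predictors))
    coefs, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return coefs

naive_slope = ols_coefs([x], y)[1]        # omits z, so the slope absorbs its influence
adjusted_slope = ols_coefs([x, z], y)[1]  # conditions on z

print(f"naive: {naive_slope:.2f}, adjusted: {adjusted_slope:.2f}")
```

In this setup the naive slope is pulled well above 0.5 because z drives both variables, while the adjusted slope should land close to the true value.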

4. Major research paradigms

Positivism / Post-positivism: Emphasizes objective measurement, hypothesis testing, quantitative methods.
Interpretivism / Constructivism: Emphasizes subjective meaning, context, qualitative methods.
Critical theory: Examines power structures and seeks social change.
Pragmatism: Prioritizes methods that best address the research question — often supports mixed methods.

The choice of paradigm shapes a study's epistemological assumptions, design, and selection of methods.

5. Research designs and strategies

Broad typology:

Quantitative designs

Experimental designs
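Randomized Controlled Trials (RCTs): widely regarded as the gold standard for causal inference, because randomization balances measured and unmeasured confounders in expectation.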
Factorial designs, crossover, cluster randomized trials.
Quasi-experimental designs
Interrupted Time Series, Difference-in-Differences (DiD), Regression Discontinuity, Instrumental Variables.
Observational designs
Cohort (prospective or retrospective), Case-control, Cross-sectional, Nested case-control, Ecological studies.
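
As an illustration of the quasi-experimental logic listed above, the basic two-group, two-period Difference-in-Differences estimate is simple arithmetic on four group means. The sketch below uses hypothetical numbers and a hypothetical helper `did_estimate`; it assumes the treated and control groups would have followed parallel trends without the intervention.

```python
# Hypothetical group means of an outcome before and after a policy change.
means = {
    ("treated", "pre"): 10.0,
    ("treated", "post"): 14.0,
    ("control", "pre"): 9.0,
    ("control", "post"): 11.0,
}

def did_estimate(m):
    """Change in the treated group minus change in the control group."""
    treated_change = m[("treated", "post")] - m[("treated", "pre")]
    control_change = m[("control", "post")] - m[("control", "pre")]
    return treated_change - control_change

effect = did_estimate(means)
print(effect)  # (14 - 10) - (11 - 9) = 2.0
```

The control group's change stands in for what would have happened to the treated group without the policy, which is why the parallel-trends assumption carries the causal interpretation.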

Qualitative designs

Ethnography, Participant observation
Phenomenology
Grounded theory
Case studies (single or multiple)
Narrative research

Mixed methods

Convergent (parallel) design
Explanatory sequential (quant → qual)
Exploratory sequential (qual → quant)
Embedded/multiphase designs

Design selection should be driven by the research question, feasibility, ethics, and the causal strength required.

6. Sampling and study populations

Sampling strategies:

Probability sampling: simple random, stratified, cluster, systematic, multistage — supports unbiased estimators and known sampling error.
Non-probability sampling: convenience, purposive, snowball, quota — commonly used in qualitative research or hard-to-reach populations.
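
One of the probability designs above, proportionate stratified sampling, might be sketched as follows. The function `stratified_sample`, the sampling frame, and the 10% fraction are illustrative assumptions, not a prescribed procedure.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, fraction, seed=0):
    """Draw the same fraction at random, without replacement, from every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        k = max(1, round(fraction * len(members)))  # at least one unit per stratum
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical frame: 80 urban and 20 rural units.
frame = [("urban", i) for i in range(80)] + [("rural", i) for i in range(20)]
sample = stratified_sample(frame, stratum_of=lambda u: u[0], fraction=0.1)
print(len(sample), sum(1 for u in sample if u[0] == "rural"))
```

Sampling within each stratum guarantees that small groups (here, rural units) appear in the sample in proportion to their share of the frame, which simple random sampling only achieves on average.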

Key considerations:

Define target population clearly.
Use inclusion/exclusion criteria appropriately.
Address sampling frame coverage and selection bias.
Calculate sample size based on effect size, alpha, power, and design effects (clustered designs require larger n).

Example: power calculation for a two-sample t-test (Python using statsmodels)

```python
import math

from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
effect_size = 0.5  # Cohen's d (small 0.2, medium 0.5, large 0.8)
alpha = 0.05
power = 0.8

# Solve for the per-group sample size that reaches the target power
n_per_group = analysis.solve_power(effect_size=effect_size, alpha=alpha,
                                   power=power, alternative='two-sided')
print(f"n per group: {math.ceil(n_per_group)}")  # round up to a whole participant
```
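
The note that clustered designs require larger n can be made concrete with the standard design effect for equal-sized clusters, 1 + (m - 1) × ICC, where m is the cluster size and ICC is the intraclass correlation. The values below (64 per group, clusters of 20, ICC of 0.05) are hypothetical inputs for illustration.

```python
import math

def design_effect(cluster_size, icc):
    """Variance inflation from cluster sampling with equal-sized clusters."""
    return 1 + (cluster_size - 1) * icc

n_srs = 64         # hypothetical per-group n from a simple-random-sample calculation
cluster_size = 20  # hypothetical participants per cluster
icc = 0.05         # hypothetical intraclass correlation

deff = design_effect(cluster_size, icc)
n_clustered = math.ceil(n_srs * deff)                    # inflated per-group n
clusters_needed = math.ceil(n_clustered / cluster_size)  # clusters per group

print(f"design effect: {deff:.2f}, n per group: {n_clustered}, clusters: {clusters_needed}")
```

Even a small ICC nearly doubles the required sample here, because each additional participant in the same cluster contributes less independent information.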

7. Measurement, reliability, and validity

Measurement scales:

Nominal (labels), Ordinal (ranked), Interval (equal intervals), Ratio (true zero).
Likert scales are common for attitudes (treated as ordinal or interval depending on analysis).

Reliability types:

Test–retest reliability
Inter-rater reliability (kappa, ICC)
Internal consistency (Cronbach's alpha, omega)
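
For inter-rater reliability, Cohen's kappa compares observed agreement between two raters with the agreement expected by chance. The following is a minimal sketch using only the standard library; the function name `cohens_kappa` and the ratings are hypothetical.

```python
from collections import Counter

def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who labelled the same items."""
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    # Chance agreement: product of the raters' marginal proportions, summed over categories
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n**2
    return (observed - expected) / (1 - expected)

# Hypothetical ratings of ten items by two coders.
rater_a = ["yes", "yes", "no", "no", "yes", "no", "yes", "no", "yes", "yes"]
rater_b = ["yes", "no", "no", "no", "yes", "no", "yes", "yes", "yes", "yes"]

kappa = cohens_kappa(rater_a, rater_b)
print(f"kappa: {kappa:.2f}")
```

The raters agree on 8 of 10 items, but with chance agreement of 0.52 the kappa is noticeably lower than the raw 80% agreement. The sketch does not handle the degenerate case where chance agreement equals 1.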

Validity types:

Content validity: coverage of construct domain.
Construct validity: convergent and discriminant validity.
Criterion validity: correlation with a gold standard (concurrent, predictive).
Face validity: subjective judgment.

Measurement error:

Random error reduces precision.
Systematic error introduces bias.
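
The difference between the two error types can be seen in a short simulation: random error scatters measurements around the true value, while systematic error shifts them all in one direction. The true value, error sizes, and sample size below are hypothetical.

```python
import random
import statistics

rng = random.Random(1)
true_value = 70.0  # hypothetical true weight in kg
n = 10_000

# Random error only: noise with mean zero.
random_only = [true_value + rng.gauss(0, 2.0) for _ in range(n)]
# Systematic error: a miscalibrated scale adds 1.5 kg to every reading.
with_bias = [true_value + 1.5 + rng.gauss(0, 2.0) for _ in range(n)]

mean_random = statistics.mean(random_only)
mean_biased = statistics.mean(with_bias)
spread_random = statistics.stdev(random_only)

print(f"random only: mean {mean_random:.2f}, sd {spread_random:.2f}")
print(f"with bias:   mean {mean_biased:.2f}")
```

Averaging more readings shrinks the influence of random error, but the 1.5 kg offset persists at any sample size, which is why bias has to be addressed by design or calibration rather than by collecting more data.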

Psychometrics:

Item Response Theory (IRT) vs Classical Test Theory
Factor analysis (EFA, CFA) for scale development and validation

Example: computing Cronbach's alpha in Python (pandas + numpy)

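```python
import numpy as np
import pandas as pd

def cronbach_alpha(df):
    """Cronbach's alpha for a DataFrame with items as columns and respondents as rows."""
    item_vars = df.var(axis=0, ddof=1)      # variance of each item
    total_var = df.sum(axis=1).var(ddof=1)  # variance of the summed score
    n_items = df.shape[1]
    return n_items / (n_items - 1) * (1 - item_vars.sum() / total_var)

# usage with hypothetical responses (4 respondents, 3 items)
df = pd.DataFrame(np.array([[3, 4, 3], [2, 2, 3], [5, 4, 5], [1, 2, 1]]),
                  columns=['item1', 'item2', 'item3'])
alpha = cronbach_alpha(df)
print(f"alpha: {alpha:.2f}")
```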

8. Data collection methods

Quantitative methods

Surveys and questionnaires: online, phone, face-to-face, paper. Consider questionnaire design, piloting, response rates, measurement invariance.
Experiments: lab, field, online (e.g., A/B testing).
Administrative and secondary datasets: registries, EHRs, government data.
Sensors and digital trace data: wearables, smartphone logs, clickstreams.

Qualitative methods

Interviews: structured, semi-structured, unstructured.
Focus groups
Observation and ethnography
Document and content analysis

Mixed methods integrate both types to capitalize on strengths and offset weaknesses.

Practical considerations:

Piloting instruments
Training data collectors
Standard operating procedures
Data quality monitoring and auditing

9. Data analysis approaches

Descriptive statistics

Central tendency, dispersion, frequencies, cross-tabs, visualization.

Inferential statistics

Hypothesis testing: t-tests, chi-square, ANOVA
Regression: linear, logistic, Poisson, Cox proportional hazards
Multilevel (hierarchical) modeling for nested data
Time series analysis (ARIMA, state-space models)
Survival analysis and competing risks
Nonparametric methods when distributional assumptions fail
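
As one example of a nonparametric approach, a permutation test compares two group means without assuming a particular distribution. This is a minimal sketch: the function `permutation_test`, the number of permutations, and the data are hypothetical.

```python
import random
import statistics

def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test for a difference in means."""
    rng = random.Random(seed)
    observed = statistics.mean(group_a) - statistics.mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign group labels at random
        diff = statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return observed, (extreme + 1) / (n_permutations + 1)

# Hypothetical outcome scores.
treatment = [12.1, 14.3, 13.8, 15.2, 14.9, 13.5, 16.0, 14.4]
control = [11.0, 12.2, 10.9, 12.8, 11.5, 12.0, 11.7, 12.4]

observed_diff, p_value = permutation_test(treatment, control)
print(f"difference in means: {observed_diff:.4f}, p = {p_value:.4f}")
```

The p-value is the share of random relabelings that produce a difference at least as large as the observed one, so it relies only on the assumption that labels are exchangeable under the null hypothesis.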

Model assessment

Residual diagnostics, goodness-of-fit
Multicollinearity, heteroskedasticity
Model selection: AIC, BIC, cross-validation
Effect sizes and confidence intervals
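
Model selection with AIC and BIC might look like the sketch below, which compares polynomial fits of different degree. It assumes Gaussian errors, drops additive constants from the log-likelihood, and uses simulated data; `gaussian_ic` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 100)
# Hypothetical data: the true relationship is quadratic.
y = 1 + 2 * x + 3 * x**2 + rng.normal(scale=1.0, size=x.size)

def gaussian_ic(x, y, degree):
    """AIC and BIC (up to an additive constant) for a least-squares polynomial fit."""
    coefs = np.polyfit(x, y, degree)
    rss = float(np.sum((y - np.polyval(coefs, x)) ** 2))
    n = y.size
    k = degree + 2  # polynomial coefficients plus the error variance
    aic = n * np.log(rss / n) + 2 * k
    bic = n * np.log(rss / n) + k * np.log(n)
    return aic, bic

scores = {d: gaussian_ic(x, y, d) for d in (1, 2, 5)}
for degree, (aic, bic) in scores.items():
    print(f"degree {degree}: AIC {aic:.1f}, BIC {bic:.1f}")
```

Lower values indicate a better trade-off between fit and complexity. The linear model should score far worse than the quadratic one here, and BIC penalizes the extra terms of the degree-5 model more heavily than AIC does.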

Qualitative analysis

Thematic analysis
Coding and codebooks
Grounded theory: open, axial, selective coding
Content and discourse analysis
Trustworthiness in qualitative research (the counterpart to validity): credibility, transferability, dependability, confirmability

Computational and data-intensive methods

Machine learning: supervised (classification/regression), unsupervised (clustering, dimensionality reduction), reinforcement learning
Natural language processing (topic modeling, sentiment analysis)
Network analysis and graph methods

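Example: multiple linear regression in Python (statsmodels)

```python
import statsmodels.api as sm

# df is assumed to be a pandas DataFrame with columns 'age', 'income', and 'outcome'
X = sm.add_constant(df[['age', 'income']])  # add an intercept column
y = df['outcome']
model = sm.OLS(y, X).fit()  # ordinary least squares
print(model.summary())
```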

10. Causal inference and advanced analytic methods...
