causalml Contributions
PR #912
Make meta-learners scikit-learn compliant via BaseEstimator
Merged Jun 2026
  • Made BaseLearner inherit sklearn.base.BaseEstimator, giving every subclass get_params/set_params for free and enabling Pipeline and GridSearchCV compatibility out of the box
  • Refactored all five learner families (S/T/X/R/DR) to store constructor arguments verbatim in __init__ with no logic or deepcopy; all model construction deferred to fit()
  • Replaced the bespoke _unfitted_clone/_model_*_template machinery introduced in #910 with a direct clone(self) call in the bootstrap path, eliminating the regression where clone(self, safe=False) deepcopied fitted models on every bootstrap iteration
  • Fixed XGBRRegressor to use an explicit named-parameter signature with xgb_kwargs=None instead of *args/**kwargs, deferring all XGBRegressor construction to fit() so get_params()/clone() work correctly
  • Moved learner-presence validation out of __init__ into fit() across all learners, since __init__-time assertions break clone()
  • Added self.propensity = {} sentinel to BaseXLearner and BaseDRLearner so estimate_ate(pretrain=True) before fit() raises a clean ValueError instead of AttributeError
  • Fixed BaseTClassifier.predict fail-fast ordering to match BaseTLearner.predict, checking mutually exclusive flags at the top before any computation
  • Made fit() return self across all learners for Pipeline/GridSearchCV method chaining; nested params now visible via get_params (e.g. learner__max_depth)
  • Added 31 sklearn compliance tests to test_meta_learners.py covering clone()/get_params() round-trips for all 8 learner variants, fit() returns self, XGBRRegressor bootstrap CI path, bit-identical equivalence guards, and propensity sentinel consistency
  • Addressed all blocking and non-blocking maintainer review comments across 6+ review rounds, including merge conflict resolution, Cython extension troubleshooting on Windows, architecture consistency, and statistical correctness
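A minimal sketch of the estimator convention these bullets describe: constructor arguments stored verbatim, validation and model construction deferred to fit(), and fit() returning self. ToyTLearner and its control_name argument are hypothetical names used only for illustration; this is not causalml's BaseTLearner.

```python
import numpy as np
from sklearn.base import BaseEstimator, clone
from sklearn.tree import DecisionTreeRegressor


class ToyTLearner(BaseEstimator):
    """Hypothetical two-model learner used only to illustrate the convention."""

    def __init__(self, learner=None, control_name=0):
        # Store arguments verbatim: no validation, no copying, no logic.
        self.learner = learner
        self.control_name = control_name

    def fit(self, X, treatment, y):
        # Validation lives in fit(), so clone() of an unconfigured instance works.
        if self.learner is None:
            raise ValueError("learner must be provided")
        is_control = treatment == self.control_name
        # Fitted models are built from fresh clones of the template learner.
        self.model_c_ = clone(self.learner).fit(X[is_control], y[is_control])
        self.model_t_ = clone(self.learner).fit(X[~is_control], y[~is_control])
        return self

    def predict(self, X):
        # Treatment effect estimate: treated prediction minus control prediction.
        return self.model_t_.predict(X) - self.model_c_.predict(X)


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
treatment = rng.integers(0, 2, size=200)
y = X[:, 0] + 2.0 * treatment + rng.normal(scale=0.1, size=200)

model = ToyTLearner(learner=DecisionTreeRegressor(max_depth=3))
fitted = model.fit(X, treatment, y)   # fit() returns the estimator itself
params = model.get_params()           # includes the nested key learner__max_depth
copy = clone(model)                   # unfitted copy with the same parameters
```

Because __init__ does nothing but assign, clone() can rebuild the estimator from get_params() alone, which is what lets GridSearchCV address a nested parameter such as learner__max_depth.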
View Pull Request ↗
PR #886
Add Post-Fit Confidence Intervals to BaseTLearner via store_bootstraps and return_ci
Merged May 2026
  • Added store_bootstraps=False to BaseTLearner.fit(), enabling storage of a bootstrap ensemble after training for train-once, score-many workflows
  • Added return_ci=False to BaseTLearner.predict(), allowing confidence intervals to be generated on new unseen datasets without retraining
  • Introduced a reusable bootstrap ensemble framework through BaseLearner.fit_bootstrap_ensemble(), making the implementation extensible to additional causal inference meta-learners
  • Refactored bootstrap training into module-level helper functions to eliminate joblib parallelization and pickling issues caused by nested functions
  • Replaced deepcopy() with sklearn.base.clone() following EconML-style design patterns for efficient model replication and reproducibility
  • Added support for reproducible bootstrap inference through random_state handling and parallel execution via joblib
  • Extended confidence interval support to BaseTClassifier.predict(), enabling uncertainty estimation for classification-based treatment effect models
  • Added comprehensive test coverage for reproducibility, parallel execution (n_jobs > 1), random seed behavior, BaseTLearner confidence intervals, and BaseTClassifier confidence intervals
  • Addressed all blocking and non-blocking maintainer review comments across multiple review rounds, including architecture refactoring, API consistency, parallelization safety, and statistical correctness
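The train-once, score-many idea above can be sketched roughly as follows. The helper names (_fit_one_bootstrap, fit_bootstrap_ensemble, predict_with_ci) and the percentile interval are assumptions made for this illustration and are not taken from the causalml source; the point is only the shape of the pattern: a module-level worker, sklearn.base.clone() for fresh models, and one seed per replicate derived from a single random_state.

```python
import numpy as np
from joblib import Parallel, delayed
from sklearn.base import clone
from sklearn.linear_model import LinearRegression


def _fit_one_bootstrap(learner, X, treatment, y, seed):
    """Fit control and treatment models on one resample (module-level helper)."""
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, len(y), size=len(y))  # resample rows with replacement
    Xb, wb, yb = X[idx], treatment[idx], y[idx]
    model_c = clone(learner).fit(Xb[wb == 0], yb[wb == 0])
    model_t = clone(learner).fit(Xb[wb == 1], yb[wb == 1])
    return model_c, model_t


def fit_bootstrap_ensemble(learner, X, treatment, y, n_bootstraps=30,
                           random_state=0, n_jobs=1):
    # One independent seed per replicate, all derived from random_state.
    seeds = np.random.SeedSequence(random_state).generate_state(n_bootstraps)
    return Parallel(n_jobs=n_jobs)(
        delayed(_fit_one_bootstrap)(learner, X, treatment, y, int(seed))
        for seed in seeds
    )


def predict_with_ci(ensemble, X_new, alpha=0.05):
    # Each row holds one replicate's effect estimates for X_new.
    effects = np.stack(
        [m_t.predict(X_new) - m_c.predict(X_new) for m_c, m_t in ensemble]
    )
    lower = np.percentile(effects, 100 * alpha / 2, axis=0)
    upper = np.percentile(effects, 100 * (1 - alpha / 2), axis=0)
    return effects.mean(axis=0), lower, upper


rng = np.random.default_rng(1)
X = rng.normal(size=(300, 2))
treatment = rng.integers(0, 2, size=300)
y = X[:, 0] + treatment * (1.0 + X[:, 1]) + rng.normal(scale=0.1, size=300)

ensemble = fit_bootstrap_ensemble(LinearRegression(), X, treatment, y, random_state=42)
X_new = rng.normal(size=(5, 2))  # unseen data scored without retraining
cate, lower, upper = predict_with_ci(ensemble, X_new)
```

The stored ensemble is fitted once and can then score any number of new datasets; rerunning with the same random_state reproduces the same intervals.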
View Pull Request ↗
PR #890
Add Bootstrap Confidence Intervals and P-values to rate_score()
Merged Apr 2026
  • Extended rate_score() in causalml/metrics/rate.py with return_ci=False, n_bootstrap=200, alpha=0.05, and random_state=None parameters following sklearn conventions
  • When return_ci=True, uses a half-sample bootstrap (m = n // 2, drawn without replacement) per the Yadlowsky et al. (2021) functional CLT, returning the standard error, CI bounds, and a two-sided p-value testing H0: RATE = 0
  • Refactored integration logic into a module-level _compute_rate_from_toc() helper to eliminate code duplication and avoid joblib pickle issues with nested functions
  • Added 4 new tests to tests/test_rate.py using existing synthetic_df and rct_df fixtures and RANDOM_SEED from tests/const.py; addressed all blocking and non-blocking review comments across two review rounds; passed black and CI checks
  • Bootstrap inference verified correct against the Yadlowsky et al. (2021) paper by the maintainer across two review rounds
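As a rough illustration of half-sample bootstrap inference, the sketch below applies it to a generic statistic (a sample mean) rather than to RATE itself. The function name half_sample_bootstrap, the normal-approximation interval, and the z-test p-value are assumptions made for this example; the exact formulas used in rate_score() may differ.

```python
from statistics import NormalDist

import numpy as np


def half_sample_bootstrap(statistic, data, n_bootstrap=200, alpha=0.05,
                          random_state=None):
    """Hypothetical helper: SE, CI and p-value from half-sample replicates."""
    rng = np.random.default_rng(random_state)
    n = len(data)
    m = n // 2
    point = float(statistic(data))
    replicates = np.empty(n_bootstrap)
    for b in range(n_bootstrap):
        # Draw half of the rows without replacement and recompute the statistic.
        idx = rng.choice(n, size=m, replace=False)
        replicates[b] = statistic(data[idx])
    # For m = n/2 without replacement, the replicate spread approximates the
    # standard error of the full-sample estimate.
    se = float(replicates.std(ddof=1))
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Two-sided test of H0: statistic = 0 under a normal approximation.
    p_value = 2 * (1 - NormalDist().cdf(abs(point) / se))
    return {
        "estimate": point,
        "se": se,
        "ci": (point - z * se, point + z * se),
        "p_value": p_value,
    }


rng = np.random.default_rng(0)
sample = rng.normal(loc=0.5, scale=1.0, size=400)
result = half_sample_bootstrap(np.mean, sample, random_state=123)
```

For the sample mean, the half-sample replicate variance is (S²/m)(1 − m/n) = S²/n when m = n/2, which is why the replicate standard deviation can be used as the standard error without rescaling.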
View Pull Request ↗
PR #887
Add Rank-weighted Average Treatment Effect (RATE) Metric
Merged Mar 2026
  • Added causalml/metrics/rate.py with three public functions, get_toc(), rate_score(), and plot_toc(), mirroring the API conventions of get_qini / qini_score / plot_qini
  • get_toc() computes the Targeting Operator Characteristic (TOC) curve using O(n) cumulative sums over the score-ranked data; rate_score() computes the RATE scalar with AUTOC (1/q) or Qini (q) weighting; plot_toc() visualizes the TOC curve
  • Supported both oracle mode (simulated tau) and observed RCT mode (y + w); fixed normalize division-by-zero by using max(|TOC|) instead of TOC(1); added logger.warning for observed-outcome fallback
  • Added 20 unit tests in tests/test_rate.py; addressed all blocking and non-blocking review comments across two review rounds; passed black formatting and pre-commit checks
  • Implementation verified correct against the Yadlowsky et al. (2021) paper and the grf R package reference by the maintainer
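A small sketch of the oracle-mode TOC curve, assuming the usual definition TOC(q) = (mean effect among the top q fraction ranked by score) minus (overall mean effect). The function names toc_curve and normalize_toc are hypothetical and are not the get_toc() signature. The example also shows why normalizing by TOC(1) is unsafe: under this definition TOC(1) is always zero.

```python
import numpy as np


def toc_curve(tau, score):
    """TOC at q = 1/n, 2/n, ..., 1 from known effects tau ranked by score."""
    tau = np.asarray(tau, dtype=float)
    score = np.asarray(score, dtype=float)
    n = len(tau)
    order = np.argsort(-score, kind="stable")  # highest priority first
    # Running mean of tau over the top-k units, via one cumulative sum.
    top_k_mean = np.cumsum(tau[order]) / np.arange(1, n + 1)
    q = np.arange(1, n + 1) / n
    return q, top_k_mean - tau.mean()


def normalize_toc(toc):
    # Scale by the largest absolute value; TOC(1) is zero, so dividing by it fails.
    scale = np.max(np.abs(toc))
    return toc if scale == 0 else toc / scale


tau = np.array([1.0, 2.0, 3.0, 4.0])
q, toc = toc_curve(tau, score=tau)  # a perfect ranking of the true effects
print(toc)                          # [1.5 1.  0.5 0. ]
normalized = normalize_toc(toc)     # first entry becomes 1.0, last stays 0.0
```

The cumulative sum itself is linear in n; the ranking step before it is a sort.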
View Pull Request ↗
PR #860
Add Native NaN Support for UpliftTree and UpliftRandomForest
Merged Mar 2026
  • Added native NaN routing to each candidate split: both the left and right directions are evaluated and the better routing is learned per node, consistent with scikit-learn's decision tree behavior
  • Stored the learned NaN routing in each DecisionTree node and applied it consistently during training, pruning, filling, and prediction
  • Guarded all np.isnan() calls with np.issubdtype(..., np.number) to prevent TypeError on string/categorical columns
  • Added NaN-aware percentile calculation by filtering out NaN values before computing split thresholds
  • Added two targeted tests: one for NaN values in numeric columns, one for None values in object-dtype columns
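The routing idea can be sketched on a toy regression split. This is an illustration under stated assumptions, not the UpliftTree code: the split criterion here is a plain sum of squared errors rather than an uplift criterion, None is assumed to mark missing values in object columns, and the helper names are hypothetical.

```python
import numpy as np


def is_missing(column):
    """Boolean mask of missing entries that is safe for non-numeric columns."""
    if np.issubdtype(column.dtype, np.number):
        return np.isnan(column)
    # Assumption for this sketch: None marks a missing value in object columns.
    return np.array([value is None for value in column], dtype=bool)


def sum_squared_error(values):
    return 0.0 if values.size == 0 else float(((values - values.mean()) ** 2).sum())


def best_nan_direction(x, y, threshold):
    """Try sending missing rows left, then right, and keep the better option."""
    missing = is_missing(x)
    goes_left = np.zeros(len(x), dtype=bool)
    goes_left[~missing] = x[~missing] < threshold
    scores = {}
    for direction in ("left", "right"):
        left = (goes_left | missing) if direction == "left" else goes_left
        scores[direction] = sum_squared_error(y[left]) + sum_squared_error(y[~left])
    # Lower total error is better; the winner is what a node would store.
    return min(scores, key=scores.get), scores


x = np.array([1.0, 2.0, np.nan, 8.0, 9.0, np.nan])
y = np.array([0.0, 0.0, 0.0, 5.0, 5.0, 0.1])

# NaN-aware threshold: drop missing values before taking the percentile.
threshold = np.percentile(x[~is_missing(x)], 50)
direction, scores = best_nan_direction(x, y, threshold)

labels = np.array(["a", None, "b"], dtype=object)
label_mask = is_missing(labels)  # no TypeError on a string column
```

In this toy data the rows with missing x have outcomes close to the low-x group, so sending them left gives the lower error, and that learned direction would be reused at prediction time.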
View Pull Request ↗