[SPARK-26173] Prior regularization for Logistic Regression - ASF JIRA

Details

Type: New Feature
Status: Resolved
Priority: Minor
Resolution: Won't Fix
Affects Version/s: 3.0.0
Fix Version/s: None
Component/s: MLlib
Labels: None

Description

This feature enables Maximum A Posteriori (MAP) optimization for Logistic Regression based on a Gaussian prior. In practice, it implements a more general form of L2 regularization, parameterized by a vector of prior means and a vector of prior precisions (the inverse of the variances).

The prior regularization term is calculated with the formula in the attached image (Prior regularization.png), where:

λ: regularization parameter (regParam)
K: number of coefficients (weights vector length)
w_i: i-th coefficient, with prior Normal(μ_i, β_i²)

Reference: Bishop, Christopher M. (2006). Pattern Recognition and Machine Learning (section 4.5). Berlin, Heidelberg: Springer-Verlag.

Existing implementations

Python: bayes_logistic

Implementation

Two new parameters are added to LogisticRegression: priorMean and priorPrecisions.
One new class (PriorRegularization) implements the calculation of the value and gradient of the prior regularization term.
Prior regularization is enabled when both vectors are provided, regParam > 0, and elasticNetParam < 1.
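
The value and gradient of such a term can be sketched in a few lines. This is a minimal illustration, not the code from the pull request: it assumes the penalty has the form (λ/2) · Σ p_i · (w_i − μ_i)², where p_i is the i-th entry of the precisions vector, and the function names `prior_penalty` and `l2_penalty` are hypothetical. The effect of elasticNetParam is not modeled.

```python
from typing import List, Sequence, Tuple


def prior_penalty(
    weights: Sequence[float],
    prior_mean: Sequence[float],
    prior_precisions: Sequence[float],
    reg_param: float,
) -> Tuple[float, List[float]]:
    """Return (value, gradient) of an assumed Gaussian-prior penalty."""
    if not (len(weights) == len(prior_mean) == len(prior_precisions)):
        raise ValueError("weights, prior_mean and prior_precisions must have equal length")
    # Assumed form: (reg_param / 2) * sum_i precision_i * (w_i - mean_i)^2
    value = 0.5 * reg_param * sum(
        p * (w - m) ** 2 for w, m, p in zip(weights, prior_mean, prior_precisions)
    )
    # Partial derivative with respect to w_i: reg_param * precision_i * (w_i - mean_i)
    gradient = [
        reg_param * p * (w - m) for w, m, p in zip(weights, prior_mean, prior_precisions)
    ]
    return value, gradient


def l2_penalty(weights: Sequence[float], reg_param: float) -> Tuple[float, List[float]]:
    """Plain L2 penalty (reg_param / 2) * sum_i w_i^2, for comparison."""
    value = 0.5 * reg_param * sum(w * w for w in weights)
    gradient = [reg_param * w for w in weights]
    return value, gradient
```

Under this assumed form, a zero mean vector and unit precisions reduce the penalty to plain L2, which is the kind of equivalence the "equivalent to L2" tests below refer to.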

Tests

DifferentiableRegularizationSuite
- Prior regularization
LogisticRegressionSuite
- prior precisions should be required when prior mean is set
- prior mean should be required when prior precisions is set
- `regParam` should be positive when using prior regularization
- `elasticNetParam` should be less than 1.0 when using prior regularization
- prior mean and precisions should have equal length
- priors' length should match number of features
- binary logistic regression with prior regularization equivalent to L2
- binary logistic regression with prior regularization equivalent to L2 (bis)
- binary logistic regression with prior regularization
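
The parameter checks named in the LogisticRegressionSuite tests can be summarized as a single validation routine. The sketch below is a hypothetical helper written for illustration (`prior_regularization_enabled` is not a Spark API, and the error messages are invented); it only mirrors the conditions stated in this issue.

```python
from typing import Optional, Sequence


def prior_regularization_enabled(
    prior_mean: Optional[Sequence[float]],
    prior_precisions: Optional[Sequence[float]],
    reg_param: float,
    elastic_net_param: float,
    num_features: int,
) -> bool:
    """Return True if prior regularization applies; raise on inconsistent settings."""
    # Neither vector set: prior regularization is simply not in use.
    if prior_mean is None and prior_precisions is None:
        return False
    # Each vector requires the other.
    if prior_precisions is None:
        raise ValueError("prior precisions are required when prior mean is set")
    if prior_mean is None:
        raise ValueError("prior mean is required when prior precisions is set")
    # Conditions on the existing regularization parameters.
    if reg_param <= 0.0:
        raise ValueError("regParam should be positive when using prior regularization")
    if elastic_net_param >= 1.0:
        raise ValueError("elasticNetParam should be less than 1.0 when using prior regularization")
    # Length checks.
    if len(prior_mean) != len(prior_precisions):
        raise ValueError("prior mean and precisions should have equal length")
    if len(prior_mean) != num_features:
        raise ValueError("priors' length should match number of features")
    return True
```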

Attachments

Prior regularization.png, 29/Nov/18 12:25, 4 kB, Facundo Bellosi

Issue Links

links to

[Github] Pull Request #23146 (elfausto)

People

Assignee: Unassigned
Reporter: Facundo Bellosi
Votes: 0
Watchers: 3

Dates

Created: 26/Nov/18 13:03
Updated: 03/Jan/20 00:17
Resolved: 03/Jan/20 00:17