\ell_p-\ell_q penalty for sparse linear and sparse multiple kernel multi-task learning

Metadata

Date

2011

License

info:eu-repo/semantics/OpenAccess


Keywords

multi-task multiple kernel mixed-norm sparsity

Cite this document

Alain Rakotomamonjy et al., « \ell_p-\ell_q penalty for sparse linear and sparse multiple kernel multi-task learning », Hyper Article en Ligne - Sciences de l'Homme et de la Société, ID : 10670/1.azc6gj


Abstract

Recently, there has been considerable interest in the multi-task learning (MTL) problem under the constraint that tasks should share a common sparsity profile. Such a problem can be addressed through a regularization framework in which the regularizer induces a joint-sparsity pattern across the task decision functions. We follow this principled framework and focus on $\ell_p-\ell_q$ mixed norms (with $0 \leq p \leq 1$ and $1 \leq q \leq 2$) as sparsity-inducing penalties. Our motivation for addressing this larger class of penalties is to adapt the penalty to the problem at hand, leading to better performance and a better sparsity pattern. To solve the problem in the general multiple kernel case, we first derive a variational formulation of the $\ell_1-\ell_q$ penalty, which helps us propose an alternating optimization algorithm. Although very simple, this algorithm provably converges to the global minimum of the $\ell_1-\ell_q$ penalized problem. For the linear case, we extend existing work on accelerated proximal gradient methods to this penalty. Our contribution in this context is an efficient scheme for computing the $\ell_1-\ell_q$ proximal operator. Then, for the more general case where $0 < p < 1$, we solve the resulting non-convex problem through a majorization-minimization approach. The resulting algorithm is an iterative scheme that, at each iteration, solves a weighted $\ell_1-\ell_q$ sparse MTL problem. Empirical evidence from a toy dataset and from real-world datasets dealing with BCI single-trial EEG classification and protein subcellular localization shows the benefit of the proposed approaches and algorithms.
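
To make the ingredients named in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the common linear MTL convention of a weight matrix `W` with one row per feature and one column per task, so that the mixed-norm penalty is $\sum_k \|W_{k,\cdot}\|_q^p$ with $p > 0$. The function names (`mixed_norm_penalty`, `prox_l1_l2`, `mm_weights`) are hypothetical. The proximal operator shown covers only the special case $q = 2$, where it reduces to the well-known row-wise group soft-thresholding; the efficient scheme for general $q$ described in the abstract is not reproduced here. The reweighting step uses the standard linear majorizer of the concave map $t \mapsto t^p$, with a small `eps` added as an assumption to keep the weights finite on zero rows.

```python
import numpy as np


def mixed_norm_penalty(W, p=1.0, q=2.0):
    """Sum over rows of (l_q norm of the row) ** p.

    Assumes W has shape (n_features, n_tasks) and p > 0.
    """
    row_norms = np.linalg.norm(W, ord=q, axis=1)
    return float(np.sum(row_norms ** p))


def prox_l1_l2(W, threshold):
    """Row-wise group soft-thresholding (special case q = 2 only).

    `threshold` may be a scalar or one value per row, which allows a
    weighted l_1-l_2 penalty.
    """
    row_norms = np.linalg.norm(W, axis=1, keepdims=True)
    thr = np.reshape(np.broadcast_to(threshold, (W.shape[0],)), (-1, 1))
    # Shrink each row toward zero; rows with norm below thr become exactly zero.
    scale = np.maximum(0.0, 1.0 - thr / np.maximum(row_norms, 1e-12))
    return scale * W


def mm_weights(W, p, q=2.0, eps=1e-8):
    """Row weights from linearizing t -> t**p at the current row norms.

    eps is an assumed smoothing constant, not taken from the source.
    """
    row_norms = np.linalg.norm(W, ord=q, axis=1)
    return p * (row_norms + eps) ** (p - 1.0)


# Toy matrix: one strong row, one weak row, one zero row.
W = np.array([[3.0, 4.0], [0.3, 0.4], [0.0, 0.0]])

penalty = mixed_norm_penalty(W, p=1.0, q=2.0)
W_shrunk = prox_l1_l2(W, 1.0)

# One reweighting step for p = 0.5: small rows receive large weights,
# so the next weighted l_1-l_2 step penalizes them more strongly.
weights = mm_weights(W, p=0.5)
W_reweighted = prox_l1_l2(W, 0.1 * weights)
```

In this sketch, a full majorization-minimization loop would alternate between `mm_weights` and a solver for the weighted $\ell_1-\ell_2$ problem; the single call to `prox_l1_l2` above stands in for that solver only for illustration.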
