
Guy Blanc (Stanford), Jane Lange (MIT), Mingda Qiao (Stanford), Li-Yang Tan (Stanford)

Open Problem: Properly learning decision trees in polynomial time?

Abstract

The authors recently gave an $n^{O(\log\log n)}$ time membership query algorithm for properly learning decision trees under the uniform distribution (BLQT21). The previous fastest algorithm for this problem ran in $n^{O(\log n)}$ time, a consequence of EH89's classic algorithm for the distribution-free setting. In this article we highlight the natural open problem of obtaining a polynomial-time algorithm, discuss possible avenues towards obtaining it, and state intermediate milestones that we believe are of independent interest.

Keywords: decision trees, proper learning, analysis of Boolean functions

1 Introduction

Decision trees are one of the most intensively studied concept classes in learning theory. In this article we focus on the problem of properly learning decision trees, where the learning algorithm is expected to return a hypothesis that is itself a decision tree. Decision tree hypotheses are of particular interest because of their simple structure, and they are the canonical example of a highly interpretable model. Indeed, it is natural to seek decision tree hypotheses even when learning concepts that are not themselves decision trees. Algorithms such as ID3, CART, and C4.5 that construct decision tree representations of datasets are among the most popular and empirically successful algorithms in everyday machine learning practice.
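To make the objects in question concrete, here is a minimal Python sketch of the decision tree representation we have in mind (the class and function names are our own, purely illustrative): each internal node queries a single Boolean variable, each leaf carries a 0/1 label, and the size of a tree is its number of leaves.

# A minimal, purely illustrative representation of decision trees over
# {0,1}^n: internal nodes query one variable, leaves carry 0/1 labels,
# and size is the number of leaves. Proper learning means the output
# hypothesis has this form.

class Leaf:
    def __init__(self, label):
        self.label = label  # 0 or 1

class Node:
    def __init__(self, var, left, right):
        self.var = var      # index of the queried variable x_var
        self.left = left    # subtree taken when x_var = 0
        self.right = right  # subtree taken when x_var = 1

def evaluate(tree, x):
    """Evaluate a decision tree on an input x in {0,1}^n."""
    while isinstance(tree, Node):
        tree = tree.right if x[tree.var] == 1 else tree.left
    return tree.label

def size(tree):
    """Size of a tree = its number of leaves."""
    if isinstance(tree, Leaf):
        return 1
    return size(tree.left) + size(tree.right)

# Example: a size-3 tree computing AND(x0, x1).
t = Node(1, Leaf(0), Node(0, Leaf(0), Leaf(1)))
assert evaluate(t, (1, 1)) == 1 and evaluate(t, (0, 1)) == 0
assert size(t) == 3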

We focus on the setting of PAC learning under the uniform distribution with membership queries, where the learner is given query access to an unknown size-$s$ decision tree target $f:\{0,1\}^{n}\to\{0,1\}$ and is expected to construct a decision tree hypothesis $h:\{0,1\}^{n}\to\{0,1\}$ satisfying $\Pr_{\boldsymbol{x}\sim\{0,1\}^{n}}[f(\boldsymbol{x})\neq h(\boldsymbol{x})]\leq\varepsilon$, where $\boldsymbol{x}\sim\{0,1\}^{n}$ is uniformly random.
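As a hedged illustration of this setup (every name below is our own, purely for exposition), the following Python sketch estimates the uniform-distribution error of a hypothesis by drawing uniform inputs and spending one membership query on each:

import random

# A sketch of the learning setup: the target f is available only
# through membership queries, and error is measured under the uniform
# distribution on {0,1}^n.

def estimate_error(f, h, n, num_samples=10000, seed=0):
    """Estimate Pr_{x ~ {0,1}^n}[f(x) != h(x)] by uniform sampling.

    f: membership-query oracle for the unknown size-s decision tree.
    h: the learner's hypothesis, treated here as a Boolean function.
    """
    rng = random.Random(seed)
    disagreements = 0
    for _ in range(num_samples):
        x = tuple(rng.randint(0, 1) for _ in range(n))
        if f(x) != h(x):  # one membership query to the target
            disagreements += 1
    return disagreements / num_samples

# Example: the target reads x0 when x1 = 1 and x2 otherwise; the
# hypothesis always reads x0, so they disagree with probability 1/4.
f = lambda x: x[0] if x[1] else x[2]
h = lambda x: x[0]
print(estimate_error(f, h, n=3))  # ~0.25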

The main open problem of this article is the following:

Open Problem.

Design a $\mathrm{poly}(n,s,1/\varepsilon)$-time membership query algorithm for properly learning size-$s$ decision trees over $n$ variables to error $\varepsilon$ under the uniform distribution.

Regarding the use of membership queries, it would of course be preferable if the algorithm did not require them and instead used only uniform random labeled examples. However, there are two significant barriers to such an algorithm. First, no such algorithm exists in the statistical query (SQ) model: any SQ algorithm for learning size-$s$ decision trees has to make at least $n^{\Omega(\log s)}$ SQs.
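To clarify the model being ruled out, here is a hedged Python sketch of an SQ oracle (a simulation under our own naming, not from the paper or any standard library): an SQ algorithm never inspects individual labeled examples and may only ask for expectations $\mathrm{E}_{\boldsymbol{x}\sim\{0,1\}^{n}}[\phi(\boldsymbol{x},f(\boldsymbol{x}))]$, which the oracle answers up to an additive tolerance $\tau$.

import random

# A simulation of a statistical query (SQ) oracle: the algorithm
# submits a bounded query phi, and any answer within additive tolerance
# tau of the true expectation E_{x ~ {0,1}^n}[phi(x, f(x))] is valid.

def stat_query(f, n, phi, tau, num_samples=20000, seed=0):
    """Answer the query phi within tolerance tau (here: sampling + noise)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        x = tuple(rng.randint(0, 1) for _ in range(n))
        total += phi(x, f(x))
    # Any value within tau of the true expectation is a valid response.
    return total / num_samples + rng.uniform(-tau, tau)

# Example query: the correlation of x0 with the labels of f(x) = x0 XOR x1,
# which is 0, so the answer is forced to lie near [-tau, tau] (up to
# sampling error in this simulation).
f = lambda x: x[0] ^ x[1]
print(stat_query(f, n=2, phi=lambda x, y: (2*x[0] - 1) * (2*y - 1), tau=0.01))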