We have added a new hyper-parameter optimization algorithm called Randomized Search for our tree-based models within Deep QI. This enhancement makes hyper-parameter tuning easier, faster, and more efficient.

Introduction:

Randomized Search CV is a hyper-parameter optimization technique used in machine learning to find the best combination of hyper-parameters for a given model. Instead of exhaustively searching through a predefined grid of hyper-parameter values (as in Grid Search), Randomized Search randomly samples a specified number of hyper-parameter combinations from a given distribution. This allows it to explore a larger hyper-parameter space with a fixed computational budget.
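To make the idea concrete, here is a minimal sketch of the general technique using scikit-learn's RandomizedSearchCV with a random-forest classifier. This is not the Deep QI interface itself; the estimator, parameter ranges, and synthetic data below are illustrative placeholders.

```python
# Illustrative sketch of Randomized Search with scikit-learn; the parameter
# ranges and data are placeholders, not Deep QI's actual configuration.
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Distributions to sample from, instead of an exhaustive grid of fixed values.
param_distributions = {
    "n_estimators": randint(50, 500),
    "max_depth": randint(3, 20),
    "min_samples_split": randint(2, 20),
    "max_features": ["sqrt", "log2", None],
}

search = RandomizedSearchCV(
    estimator=RandomForestClassifier(random_state=0),
    param_distributions=param_distributions,
    n_iter=25,           # fixed computational budget: 25 sampled combinations
    cv=5,                # 5-fold cross-validation for each sampled combination
    scoring="accuracy",
    random_state=0,
)
search.fit(X, y)

print(search.best_params_)
print(search.best_score_)
```

The key point is `n_iter`: the search cost is capped at a chosen number of sampled configurations, regardless of how large the underlying parameter space is.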

Advantages of Randomized Search CV over Grid Search:

  • Computational efficiency: Randomized Search evaluates only a fixed number of sampled configurations rather than every point on the grid, so it is generally faster for a comparable quality of result.
  • Broader exploration: Randomized Search samples across the whole search space rather than a fixed set of values, so it is less likely to miss promising regions than Grid Search, which can be too narrowly focused.
  • Better performance: Randomized Search has been shown to find better hyper-parameter configurations than Grid Search, especially when the search space is large or only a few hyper-parameters strongly influence performance (Bergstra & Bengio, 2012).
  • Fewer evaluations: Randomized Search often needs fewer model fits to reach a good configuration than Grid Search, which saves time and computational resources; the sketch after this list makes the budget difference concrete.
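
To see the budget difference in numbers, the following back-of-the-envelope sketch counts the candidates each approach would evaluate. The parameter ranges are made up for this example and are not taken from Deep QI.

```python
# Illustrative budget comparison between an exhaustive grid and random sampling.
from sklearn.model_selection import ParameterGrid, ParameterSampler

grid = {
    "n_estimators": [100, 200, 300, 400, 500],
    "max_depth": [3, 5, 7, 10, 15],
    "min_samples_split": [2, 5, 10, 20],
    "max_features": ["sqrt", "log2", None],
}

# Grid Search must evaluate every combination: 5 * 5 * 4 * 3 = 300 fits
# (multiplied again by the number of cross-validation folds).
print(len(ParameterGrid(grid)))            # 300

# Randomized Search evaluates only a fixed number of sampled combinations.
sampled = list(ParameterSampler(grid, n_iter=25, random_state=0))
print(len(sampled))                        # 25
```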

Figure 1 compares Grid Search (a) and Randomized Search (b) for hyper-parameter tuning. Grid Search evaluates every combination of parameters systematically, which can be inefficient because the entire predefined grid must be exhausted. In contrast, Randomized Search samples parameter combinations at random, allowing a broader and often more efficient exploration of the hyper-parameter space and potentially discovering better parameters more quickly.

Figure 1: Comparison of Grid Search (a) and Randomized Search (b) for hyper-parameter tuning.

Source: Bergstra, J., & Bengio, Y. (2012). Random Search for Hyper-Parameter Optimization. Journal of Machine Learning Research, 13, 281–305.

Conclusion:

Randomized Search is generally preferred over Grid Search because it is more efficient, provides better coverage of the hyper-parameter space, and offers greater flexibility, since parameter values can be sampled from continuous distributions rather than restricted to a fixed grid. This algorithm can be used for both regression and classification tasks across all the tree-based algorithms within Deep QI.
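
For completeness, here is a regression-flavoured counterpart of the earlier sketch. It again uses scikit-learn rather than the Deep QI interface, with a gradient-boosted regressor and placeholder ranges, to show how continuous distributions can be sampled directly.

```python
# Illustrative regression example; estimator choice, ranges, and data are
# placeholders and do not reflect Deep QI's internal implementation.
from scipy.stats import randint, uniform
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import RandomizedSearchCV

X, y = make_regression(n_samples=500, n_features=20, noise=0.1, random_state=0)

# Continuous distributions (learning rate, subsample fraction) cannot be covered
# exhaustively by a fixed grid, which is where random sampling pays off.
param_distributions = {
    "n_estimators": randint(100, 600),
    "learning_rate": uniform(0.01, 0.29),   # samples from [0.01, 0.30)
    "max_depth": randint(2, 8),
    "subsample": uniform(0.6, 0.4),         # samples from [0.6, 1.0)
}

search = RandomizedSearchCV(
    GradientBoostingRegressor(random_state=0),
    param_distributions=param_distributions,
    n_iter=25,
    cv=5,
    scoring="neg_mean_squared_error",
    random_state=0,
)
search.fit(X, y)
print(search.best_params_)
```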

Post by Venkatesh Anantharamu
Jun 5, 2024 4:05:29 PM