2. Model Types

A summary of the different machine learning models Boosted.ai offers for stock universe ranking

Boosted Insights offers several model training types, each suited to different use cases. Note that models only create rankings for in-universe assets; portfolio construction from model rankings is done separately by a portfolio optimizer.

Linear Regression

Linear Regression is one of the simplest machine learning algorithms and is very fast to train. The model fits a linear equation through the data points such that the sum of squared errors between the predicted values (the blue line) and the actual values (the black data points) is minimized.

 


However, the strict linearity of the model means that it is quite inflexible and typically generates higher prediction errors than more flexible and complex models. Despite its simplicity, variants of linear models are widely used in applications where model interpretability is a must.
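As a rough illustration of the least-squares fit described above, the following sketch fits a line to a single feature in plain Python. The function names and the toy data are made up for this example and are not part of the Boosted Insights platform.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form solution that minimizes the sum of squared errors.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sum_squared_errors(xs, ys, slope, intercept):
    """Sum of squared differences between predicted and actual values."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


# Toy data (hypothetical), roughly following y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
fitted_error = sum_squared_errors(xs, ys, slope, intercept)
```

Any other line through the same points, for example one with a slightly different slope, yields a larger sum of squared errors than the fitted one.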

Decision Trees

While not available as a model on the Boosted Insights platform, decision tree learning is a fundamental component of Random Forest, Gradient Boosting, and our proprietary Battle Royale models.

A decision tree relies on successive splits of variables to sort the dataset such that data points are more similar within a group. Optimal splits are typically chosen based on the maximum increase in post-split similarity. The specific metric used to gauge post-split data similarity will differ depending on the task and is beyond the scope of this article.

While decision trees are more flexible than linear models, they are also more fragile, since small changes in variable values can lead to large changes in model predictions.
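To make the idea of an optimal split concrete, here is a minimal sketch that searches for the best threshold on a single variable. It assumes one common similarity metric for regression tasks, the sum of squared deviations from each group's mean; the article does not specify which metric the platform uses, and the names and data below are illustrative only.

```python
def sse(values):
    """Sum of squared deviations from the mean (lower means more similar)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(xs, ys):
    """Return (threshold, score) for the split that leaves the two groups most similar."""
    pairs = sorted(zip(xs, ys))
    best_threshold, best_score = None, sse(ys)  # start from "no split"
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # identical values cannot be separated
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for x, y in pairs if x <= threshold]
        right = [y for x, y in pairs if x > threshold]
        score = sse(left) + sse(right)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


# Toy data (hypothetical): two clearly separated clusters.
xs = [1, 2, 3, 10, 11, 12]
ys = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]

threshold, score = best_split(xs, ys)
```

A full decision tree would apply the same search recursively to each resulting group, and across all available variables rather than just one.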

 

Random Forest

To combat the fragility of individual decision trees, a random forest uses an ensemble of many trees that are each trained on a subset of the data. Each split of each tree uses only a subset of the available features. The final prediction is determined by aggregating the individual trees' outputs: a majority vote for classification tasks, or an average for regression tasks.

The randomness introduced through sampling the dataset and splitting on subsets of variables makes the model more robust to outliers, decorrelates individual trees from each other, and overall contributes to better model generalization and lower variance of predictions.
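The two sources of randomness can be sketched in a few lines of plain Python. This is a simplified illustration, not the platform's implementation: each "tree" is a single-split stump, rows are drawn with replacement (a bootstrap sample), the feature subset is drawn once per stump (equivalent to per-split sampling here, because a stump has only one split), and predictions are averaged as in a regression forest. All names and data are hypothetical.

```python
import random


def sse(values):
    """Sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def fit_stump(rows, targets, feature_ids):
    """Best single split over the allowed features.

    Returns (feature, threshold, left_mean, right_mean); feature is None
    when the sample offers no usable split.
    """
    overall = sum(targets) / len(targets)
    best, best_score = (None, None, overall, overall), sse(targets)
    for f in feature_ids:
        values = sorted(set(row[f] for row in rows))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for row, y in zip(rows, targets) if row[f] <= t]
            right = [y for row, y in zip(rows, targets) if row[f] > t]
            score = sse(left) + sse(right)
            if score < best_score:
                best_score = score
                best = (f, t, sum(left) / len(left), sum(right) / len(right))
    return best


def predict_stump(stump, row):
    f, t, left_mean, right_mean = stump
    if f is None:
        return left_mean
    return left_mean if row[f] <= t else right_mean


def fit_forest(rows, targets, n_trees=25, n_features=1, seed=0):
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample of rows
        feature_ids = rng.sample(range(len(rows[0])), n_features)  # feature subset
        forest.append(
            fit_stump([rows[i] for i in idx], [targets[i] for i in idx], feature_ids)
        )
    return forest


def predict_forest(forest, row):
    """Average the predictions of all stumps (regression-style aggregation)."""
    return sum(predict_stump(s, row) for s in forest) / len(forest)


# Toy data (hypothetical): two features per row.
rows = [[1, 5], [2, 3], [3, 8], [10, 4], [11, 7], [12, 2]]
targets = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]

forest = fit_forest(rows, targets)
prediction = predict_forest(forest, [2, 4])
```

Because every stump sees a different sample and a different feature subset, no single data point or variable dominates the averaged prediction.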

 

Gradient Boosting

Gradient boosting also uses an ensemble of many models (typically decision trees) to inform its predictions. However, unlike a random forest, each successive model is trained to correct the errors of the previous models. The initial prediction is refined by each subsequent model in turn, so that the remaining error shrinks as models are added.

This should be considered a broad model class with many different implementations, the details of which are beyond the scope of this article.
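As one simple member of this model class, the sketch below boosts single-split stumps under a squared-error objective, where "correcting the errors of previous models" means fitting each new stump to the current residuals. The learning rate, number of rounds, function names, and data are assumptions chosen for illustration and do not describe the platform's implementation.

```python
def fit_stump(xs, residuals):
    """Best single split on one feature: returns (threshold, left_mean, right_mean).

    Assumes xs contains at least two distinct values.
    """
    values = sorted(set(xs))
    best, best_score = None, float("inf")
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        score = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if score < best_score:
            best, best_score = (threshold, left_mean, right_mean), score
    return best


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_boosted(xs, ys, n_rounds=20, learning_rate=0.5):
    """Start from the mean, then repeatedly fit a stump to the remaining errors."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * predict_stump(stump, x)
                       for p, x in zip(predictions, xs)]
    return base, learning_rate, stumps


def predict_boosted(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def total_squared_error(xs, ys, predict):
    return sum((y - predict(x)) ** 2 for x, y in zip(xs, ys))


# Toy data (hypothetical).
xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.5, 2.0, 6.0, 6.5, 7.0]

model = fit_boosted(xs, ys)
mean_y = sum(ys) / len(ys)
baseline_error = total_squared_error(xs, ys, lambda x: mean_y)
boosted_error = total_squared_error(xs, ys, lambda x: predict_boosted(model, x))
```

On the training data, the boosted ensemble ends with a lower squared error than the constant starting prediction, since each round removes part of what the earlier rounds got wrong.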

 

Battle Royale Noise Reduction (BRNR)

Battle Royale is our internally developed, proprietary machine learning algorithm, custom-built for financial applications. It is robust to noisy data and works well with a large number of input features. Like other tree-based models, it is capable of filtering out uninformative features and finding non-linear relationships between inputs. In BRNR, every company is compared to every other company in a “battle royale” relative to a defined goal, such as maximizing alpha. Companies are ranked by the number of individual battles they win relative to the defined goal.

This approach allows a richer understanding of relationships between variables and an emphasis on picking relative outperformers.
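The ranking-by-wins step can be illustrated with a short sketch. This is not the BRNR algorithm itself, which is proprietary and not described in this article: here the outcome of each battle is decided by a placeholder per-company score, and the tickers and values are invented for the example.

```python
from itertools import combinations


def rank_by_battles(scores):
    """Compare every company with every other one and rank by number of wins.

    `scores` is a hypothetical stand-in for whatever decides a battle;
    a higher score wins, and ties award no win to either side.
    """
    wins = {name: 0 for name in scores}
    for a, b in combinations(scores, 2):
        if scores[a] > scores[b]:
            wins[a] += 1
        elif scores[b] > scores[a]:
            wins[b] += 1
    ranking = sorted(wins, key=lambda name: wins[name], reverse=True)
    return ranking, wins


# Hypothetical tickers and placeholder scores relative to the goal.
scores = {"AAA": 0.02, "BBB": -0.01, "CCC": 0.05, "DDD": 0.00}

ranking, wins = rank_by_battles(scores)
```

With four companies there are six battles in total, and the company that wins all three of its battles is ranked first.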

MODEL TYPE COMPARISON FACTORS

A relative score for each of the following factors is provided to help you make the choice that is best for your use case.

Model Factor    Description
Cost            The number of credits this model type costs relative to other model types. Cost reflects the amount of cloud computing power required.
Performance     How well this model type typically meets goal criteria.
Speed           How quickly this model type can be trained and its backtest results made available.
Accuracy        How accurate this model type is at picking individual performers that meet goals.
Holistic        How good this model type is at building an overall portfolio that works together holistically to meet goals.