A: A random forest is a machine-learning method that makes predictions by combining the decisions of many simpler models called decision trees. A decision tree works like an upside-down tree: it starts at the root at the top and works downwards. At each node, it asks a question about the data, e.g. “is this person’s age greater than 30?”. Then, depending on the answer, it moves down one branch or the other until it reaches a final decision at a ‘leaf’.
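
To make that concrete, here is a minimal sketch of a single tree’s decision process written as nested if/else checks. The features (`age`, `income`) and thresholds are made up purely for illustration:

```python
# A single decision tree written out as nested questions.
# Features and thresholds here are hypothetical, for illustration only.
def tree_predict(person):
    if person["age"] > 30:               # root node question
        if person["income"] > 50_000:    # follow-up question at an internal node
            return "approve"             # leaf: final decision
        return "review"                  # leaf
    return "deny"                        # leaf

print(tree_predict({"age": 42, "income": 60_000}))  # -> approve
```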

While single trees are easier to understand, they can also overfit the data, i.e. they may learn small quirks of the training set that don’t generalise to new data. A random forest reduces this risk by building a large number of trees, each trained on a slightly different random sample of the data.
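
This idea is known as bagging (bootstrap aggregating): each tree is trained on a bootstrap sample, i.e. rows drawn from the training set with replacement. A minimal sketch, assuming scikit-learn is available and using toy data:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))              # toy data: 100 rows, 4 features
y = (X[:, 0] + X[:, 1] > 0).astype(int)    # toy binary labels

trees = []
for _ in range(25):
    # Bootstrap sample: draw 100 row indices *with replacement*,
    # so each tree sees a slightly different version of the data.
    idx = rng.integers(0, len(X), size=len(X))
    tree = DecisionTreeClassifier(max_features="sqrt")
    trees.append(tree.fit(X[idx], y[idx]))
```

On top of row sampling, random forests typically add a second source of randomness: at each split, only a random subset of the features is considered (the `max_features="sqrt"` setting above), which makes the trees less similar to one another.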

When asked to make a prediction, every tree gives an answer. For classification problems, the random forest picks the class that receives the most votes across the trees; for regression, it averages the trees’ numeric predictions.
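
In practice the whole procedure is available off the shelf. A minimal sketch with scikit-learn’s RandomForestClassifier on toy data (note that scikit-learn aggregates by averaging the trees’ class probabilities, which for fully grown trees amounts to the majority vote described above):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Toy dataset: 200 samples, 4 features, binary labels.
X, y = make_classification(n_samples=200, n_features=4, random_state=0)

forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X, y)

# Each of the 100 trees contributes a vote; predict() returns the winning class.
print(forest.predict(X[:3]))
```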
