
Introduction to Decision Tree Algorithm

In its simplest form, a decision tree is a flowchart that shows a clear pathway to a decision. In data analytics, it is an algorithm that uses conditional ‘control’ statements to classify data.

A decision tree is a flowchart-like tree structure in which each internal node denotes a test on an attribute, each branch represents an outcome of that test, and each leaf node (terminal node) holds a class label. Decision trees have many analogies in real life and have influenced a wide area of machine learning, covering both classification and regression.
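
The structure described above can be sketched as a small data structure. This is only an illustration, not a library implementation: the names `Leaf`, `Node`, and `classify` are hypothetical, the toy attributes and labels are made up, and the tree is built by hand rather than learned from data.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Union


@dataclass
class Leaf:
    """Terminal node: holds a class label."""
    label: str


@dataclass
class Node:
    """Internal node: tests one attribute and branches on the outcome."""
    attribute: str
    test: Callable[[object], str]              # maps the attribute value to an outcome
    branches: Dict[str, Union["Node", Leaf]]   # one subtree per outcome


def classify(tree: Union[Node, Leaf], sample: dict) -> str:
    """Follow the branches from the root until a leaf is reached."""
    while isinstance(tree, Node):
        outcome = tree.test(sample[tree.attribute])
        tree = tree.branches[outcome]
    return tree.label


# Hand-built toy tree (illustrative values only).
toy_tree = Node(
    attribute="outlook",
    test=lambda value: value,  # categorical test: the value itself is the outcome
    branches={
        "sunny": Leaf("play"),
        "rainy": Node(
            attribute="wind_kmh",
            test=lambda v: "high" if v > 30 else "low",  # numeric threshold test
            branches={"high": Leaf("stay in"), "low": Leaf("play")},
        ),
    },
)
```

Calling `classify(toy_tree, {"outlook": "rainy", "wind_kmh": 45})` walks two tests and ends at the "stay in" leaf.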

Creating a Decision Tree
Let us consider a scenario where a group of astronomers discovers a new planet. The question now is whether it could be ‘the next Earth’. The answer to this question will revolutionize the way people live. Well, literally!

There are a number of deciding factors that need to be thoroughly researched to make an informed decision. These factors include whether water is present on the planet, what the temperature is, whether the surface is prone to continuous storms, and whether flora and fauna can survive the climate.

Let us create a decision tree to find out whether we have discovered a new habitat.
First, does the temperature fall within the habitable range of 0 to 100 degrees Celsius?

[Figure: Decision Tree Example 1 (source: Edureka)]

Is water present?

[Figure: Decision Tree Example 2 (source: Edureka)]

Do flora and fauna flourish?

[Figure: Decision Tree Example 3 (source: Edureka)]

Does the planet have a stormy surface?

[Figure: Decision Tree Example 4 (source: Edureka)]

Thus, we have a decision tree.
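
The four tests above can be written as nested conditionals. The original figures are not reproduced here, so the order of the tests and the outcome at each branch are assumptions; the function name `is_habitable` and its parameters are hypothetical.

```python
def is_habitable(temperature_c: float, has_water: bool,
                 life_flourishes: bool, stormy_surface: bool) -> bool:
    """Apply the four tests in the order they appear in the text.

    Assumption: any failed test leads straight to a 'not habitable' leaf.
    """
    if not (0 <= temperature_c <= 100):  # habitable range given in the text
        return False
    if not has_water:
        return False
    if not life_flourishes:
        return False
    if stormy_surface:
        return False
    return True
```

For example, `is_habitable(22.0, True, True, False)` returns `True`, while a planet at 150 degrees Celsius is rejected at the first test regardless of the other answers.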

Advantages of decision trees

  • Good for handling a combination of numerical and non-numerical data.
  • Easy to define rules, e.g. ‘yes, no, if, then, else…’
  • Requires minimal preparation or data cleaning before use.

Disadvantages of decision trees

  • They are not well suited to continuous variables (i.e. variables that can take any value within a range), because each split has to reduce such a variable to a threshold test.
  • In predictive analysis, calculations can quickly grow cumbersome, especially when a decision path includes many chance variables.
  • When using an imbalanced dataset (i.e. where one class of data dominates over another) it is easy for outcomes to be biased in favor of the dominant class.
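
The imbalance problem in the last point can be seen with plain arithmetic. The sketch below uses made-up labels (95 of one class, 5 of the other) and a hypothetical `majority_class_accuracy` helper to show that always predicting the dominant class already scores high accuracy, which is why a tree trained on such data can lean toward that class.

```python
from collections import Counter


def majority_class_accuracy(labels):
    """Accuracy obtained by always predicting the most common label."""
    counts = Counter(labels)
    _, majority_count = counts.most_common(1)[0]
    return majority_count / len(labels)


# Made-up, imbalanced labels for illustration only.
labels = ["not habitable"] * 95 + ["habitable"] * 5
print(majority_class_accuracy(labels))  # 0.95
```

A score of 0.95 here says nothing about the minority class, so per-class measures are usually more informative than overall accuracy on data like this.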
