Page 173 - AI Computer 10
P. 173
Knowledge Botwledge Bot
Kno
RGB stands for red, green, and blue, and is a colour model that uses these three colours to create many
other colours. It is an additive model and is used in many electronic devices, including computer monitors,
televisions, and smartphones.
Data Feature
A data feature is an individual measurable property or characteristic of a dataset that is used as an input in a
machine learning model. Features can be thought of as the columns in a data table, each containing specific
information.
For example, In the Colour dataset example, features may be colour, name, colour type, and RGB code, etc.
Features
Colour Colour Type RGB Code
Red Primary (255, 0, 0)
Green Secondary (0, 255, 0)
Red-Purple Tertiary (80, 0, 255)
Data features are the collected data points that provide information about the observations. For example, in a
dataset describing houses, features might include plot, number of rooms, location, etc.
Data Labelling
Data Labelling is the process of adding meaningful tags or labels to various features within a dataset, so that
machine learning models can learn from it.
In general, labels depend on the context of the problem we are trying to solve. For example, if we are trying to
predict what colour it is based on the RGB code of the colour, then colour is the feature, and RGB code is the
label. The primary goal of data labelling is to provide clear information about the data, allowing algorithms to
make predictions or categorisations based on training datasets.
In machine learning models, data can be of two types: Labelled and Unlabelled.
u Labelled data: Labelled data consists of input data that is paired with corresponding output data (labels).
Each data point is illustrated with information that reflects the desired outcome.
u Unlabelled data: Unlabelled data consists of input data without any associated output information. It is raw
data that has not been annotated.
Labelled Data (Based on Taste) Labelled Data (Based on Colour) Unlabelled Data
Apple Orange Banana Red Orange Yellow
Training Dataset
A Training Dataset is a subset of data used to train a machine learning model. Each entry in the dataset contains
features and labels. It consists of input-output pairs where the model learns to map the input features to the
corresponding output labels. It typically includes a variety of examples that reflect the problem being solved.
39
39