Page 173 - AI Computer 10

          Knowledge Bot
          RGB stands for red, green, and blue, and is a colour model that uses these three colours to create many
          other colours. It is an additive model and is used in many electronic devices, including computer monitors,
          televisions, and smartphones.
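
          The additive idea can be sketched in a few lines of Python. This is only an illustration: the helper
          name mix_additive is made up for this example, and it assumes each channel is a whole number from 0 to 255.

```python
def mix_additive(*colours):
    """Add RGB colours channel by channel, capping each channel at 255."""
    return tuple(min(255, sum(channel)) for channel in zip(*colours))


RED = (255, 0, 0)
GREEN = (0, 255, 0)
BLUE = (0, 0, 255)

# Adding light of different colours gives brighter colours.
yellow = mix_additive(RED, GREEN)       # (255, 255, 0)
white = mix_additive(RED, GREEN, BLUE)  # (255, 255, 255)

print(yellow)
print(white)
```

          Adding red and green light gives yellow, and adding all three colours at full strength gives white.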


        Data Feature

        A data feature is an individual measurable property or characteristic of a dataset that is used as an input in a
        machine learning model. Features can be thought of as the columns in a data table, each containing specific
        information.
        For example, in a Colour dataset, the features may include the colour name, colour type, and RGB code.

                                   Features

        Colour             Colour Type           RGB Code
        Red                Primary               (255, 0, 0)
        Green              Secondary             (0, 255, 0)
        Red-Purple         Tertiary              (80, 0, 255)

        Data features are the recorded attributes that describe each observation. For example, in a
        dataset describing houses, features might include plot size, number of rooms, and location.
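
        The idea that features are the columns of a data table can be sketched in plain Python. The values
        come from the Colour table, while the key names (colour, colour_type, rgb_code) and the helper
        column are chosen only for this illustration.

```python
# Each dictionary is one row (observation); each key is one feature (column).
colours = [
    {"colour": "Red", "colour_type": "Primary", "rgb_code": (255, 0, 0)},
    {"colour": "Green", "colour_type": "Secondary", "rgb_code": (0, 255, 0)},
    {"colour": "Red-Purple", "colour_type": "Tertiary", "rgb_code": (80, 0, 255)},
]

# The feature names are the column headings of the table.
feature_names = list(colours[0].keys())


def column(rows, feature):
    """Collect the values of one feature across all rows."""
    return [row[feature] for row in rows]


colour_types = column(colours, "colour_type")

print(feature_names)  # ['colour', 'colour_type', 'rgb_code']
print(colour_types)   # ['Primary', 'Secondary', 'Tertiary']
```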

        Data Labelling

        Data labelling is the process of adding meaningful tags or labels to the data points in a dataset, so that
        machine learning models can learn from them.
        In general, labels depend on the problem we are trying to solve. For example, if we are trying to
        predict the name of a colour from its RGB code, then the RGB code is the feature and the colour name is the
        label. The primary goal of data labelling is to provide clear information about the data, allowing algorithms to
        make predictions or categorisations based on training datasets.
        In machine learning, data can be of two types: labelled and unlabelled.
         • Labelled data: Labelled data consists of input data that is paired with corresponding output data (labels).
             Each data point is annotated with information that reflects the desired outcome.
         • Unlabelled data: Unlabelled data consists of input data without any associated output information. It is raw
             data that has not been annotated.

        [Figure: three panels. Labelled Data (Based on Taste), with the labels Apple, Orange, Banana;
        Labelled Data (Based on Colour), with the labels Red, Orange, Yellow; Unlabelled Data, with no labels.]
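
        The difference between the two types of data can be sketched in Python. This is a small illustration
        that reuses the RGB codes from the Colour table. The helper name strip_labels is made up for this example.

```python
# Labelled data: every input (an RGB code) is paired with an output label.
labelled = [
    ((255, 0, 0), "Red"),
    ((0, 255, 0), "Green"),
    ((80, 0, 255), "Red-Purple"),
]


def strip_labels(pairs):
    """Drop the output labels, leaving only the raw inputs."""
    return [inputs for inputs, _label in pairs]


# Unlabelled data: the same inputs with no output information attached.
unlabelled = strip_labels(labelled)

print(unlabelled)  # [(255, 0, 0), (0, 255, 0), (80, 0, 255)]
```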

        Training Dataset

        A training dataset is the subset of data used to train a machine learning model. Each entry pairs input
        features with an output label, and the model learns to map the input features to the
        corresponding output labels. It typically includes a variety of examples that reflect the problem being solved.
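
        How a model uses input-output pairs can be sketched with a very simple stand-in. The code below is
        not a full machine learning model: it assumes a nearest-neighbour lookup that returns the label of the
        training example whose RGB code is closest to the new input. The names TRAINING_DATA,
        squared_distance, and predict are chosen only for this illustration.

```python
# Training dataset: input features (RGB code) paired with output labels (colour name).
TRAINING_DATA = [
    ((255, 0, 0), "Red"),
    ((0, 255, 0), "Green"),
    ((80, 0, 255), "Red-Purple"),
]


def squared_distance(a, b):
    """Sum of squared differences between two RGB codes."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(rgb, training_data=TRAINING_DATA):
    """Return the label of the training example closest to the given RGB code."""
    _features, label = min(
        training_data, key=lambda pair: squared_distance(pair[0], rgb)
    )
    return label


print(predict((250, 10, 5)))   # Red
print(predict((10, 240, 20)))  # Green
```

        With only three training examples, the predictions are rough. Adding more varied examples to the
        training dataset gives the lookup more reference points to compare against.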


