Page 258 - AI Computer 10
TYPES OF DATA
You have learnt that the term ‘data’ refers to a collection of raw facts and figures. This data is processed into
information using machines. An AI model makes its predictions on the basis of the data fed to it by the
programmer, and this data can come in different formats. Some of the commonly used data formats are as follows:
• Spreadsheet: Data is commonly stored in tabular form. On a computer, this is usually done with
a spreadsheet application program such as Excel, Calc, etc. The term ‘Spreadsheet’ refers to a computer
program used for accounting and recording data using rows and columns into which information can be
entered.
• CSV: CSV stands for Comma Separated Values, a format which allows data to be saved in a tabular form.
A CSV file looks like an ordinary spreadsheet but has a .csv extension. CSV files can be used with most
spreadsheet programs, such as Microsoft Excel or Google Sheets. They differ from other spreadsheet
file types because a file can hold only a single sheet. A CSV file is plain text: each line is one row and commas
separate the values, so it has no formatted cells, rows, or columns of its own.
• SQL: SQL stands for Structured Query Language. It is a database language designed for the
retrieval and management of data in a relational database.
• ZIP: ZIP is an archive file format. An archive file is a single file that contains multiple files
along with their metadata. It is used to collect multiple data files together into one file.
The files are usually compressed as well, so that they use less storage space.
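The CSV and ZIP formats described above can be tried out with Python's standard library. The following is only a minimal sketch: the sample rows and the file name marks.csv are made up for illustration, and everything is kept in memory so that no file is written to disk.

```python
import csv
import io
import zipfile

# Hypothetical sample data: a header row followed by two records.
rows = [["name", "marks"], ["Asha", "91"], ["Ravi", "84"]]

# Write the rows as CSV text (in memory, so no file is created on disk).
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
csv_text = buffer.getvalue()

# Read the CSV text back; every value comes back as a string.
parsed = list(csv.reader(io.StringIO(csv_text)))

# Store the CSV text inside a ZIP archive under a hypothetical file name.
archive = io.BytesIO()
with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("marks.csv", csv_text)

# Reopen the archive, list its contents, and extract the CSV text again.
with zipfile.ZipFile(archive) as zf:
    names = zf.namelist()
    restored = zf.read("marks.csv").decode("utf-8")

print(parsed)
print(names)
```

Note that the CSV reader returns every value as text, so the marks would need to be converted to numbers before any calculation.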
DATA ACCESS IN PYTHON
You have learnt that Python is the most commonly used programming language in the field of data science. Now,
you will learn how Python packages help us to access structured data within Python code. A brief
description of these Python packages is as follows:
NumPy
NumPy, short for Numerical Python, is the fundamental package for scientific
computing with Python. The important features of NumPy are as follows:
• Creation of a powerful N-dimensional array object (ndarray).
• Sophisticated broadcasting functions that allow arithmetic operations between arrays
of different sizes.
• Useful linear algebra, Fourier transform, and random number capabilities.
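The features listed above can be illustrated with a short sketch. It assumes that NumPy is installed and imported under the usual alias np; the array values are arbitrary examples.

```python
import numpy as np

# N-dimensional array: a 2-D ndarray with 2 rows and 3 columns.
matrix = np.array([[1, 2, 3], [4, 5, 6]])

# Broadcasting: the 1-D array of length 3 is added to every row of the matrix.
offsets = np.array([10, 20, 30])
shifted = matrix + offsets

# Linear algebra: determinant of a 2x2 diagonal matrix.
square = np.array([[2.0, 0.0], [0.0, 3.0]])
determinant = np.linalg.det(square)

# Fourier transform of a constant signal: only the first term is non-zero.
spectrum = np.fft.fft(np.ones(4))

# Random numbers: four values in [0, 1) from a seeded generator.
rng = np.random.default_rng(seed=0)
samples = rng.random(4)

print(matrix.shape)
print(shifted)
```

Here the shapes (2, 3) and (3,) are compatible for broadcasting because the last dimensions match.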
An array is a set of multiple values of the same data type. The values can be numbers, characters, booleans, etc. You
should always remember that an array can hold values of only one data type. The differences between
NumPy arrays and lists are summarised in the table below:
NumPy Arrays | Lists
Arrays are created by using a specific function from either the array module or the NumPy package. | Lists are created by simply enclosing a sequence of elements within square brackets.
The data in arrays is homogeneous in nature. | The data in lists can be heterogeneous in nature.
Arrays support element-wise numerical operations directly. | Lists do not support element-wise numerical operations directly.
Arrays offer more efficient data storage. | Lists utilise more memory space.
NumPy arrays have a fixed size, so concatenating or appending creates a new array (for example, with np.concatenate) instead of changing the original. | Lists can be concatenated, appended to, and extended in place.
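The comparison in the table can be checked with a small sketch, assuming NumPy is imported as np. The variable names are illustrative only. Note that NumPy does provide np.concatenate, but it returns a new array rather than changing the original array in place.

```python
import numpy as np

numbers_list = [1, 2, 3]
numbers_array = np.array([1, 2, 3])

# Multiplying a list repeats it; multiplying an array scales each element.
list_times_two = numbers_list * 2
array_times_two = numbers_array * 2

# A list can mix data types; an array converts its values to one common type.
mixed_list = [1, "two", 3.0]
mixed_array = np.array([1, 2.5])  # both values are stored as floats

# A list grows in place; np.concatenate builds a new array instead.
numbers_list.append(4)
joined = np.concatenate([numbers_array, np.array([4])])

print(list_times_two)
print(array_times_two)
print(mixed_array.dtype)
print(len(numbers_array), len(joined))
```

After running this, numbers_array still holds three elements, while joined is a separate array with four.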