Data, Data, Data and Data!!!!
In every aspect of human life, be it your day-to-day chores or a big business issue, it is necessary to handle a problem with its future consequences in mind. In a similar manner, when data grows at an unprecedented rate, it becomes necessary to manage it.
The amount of information is increasing day by day, so there is always a need to handle huge amounts of data. Data comes from various sources: webpages, applications, software, blogging services, APIs and many more. The data that is fetched is not always structured; instead it is often raw, dirty and full of irregularities.
This data therefore needs to be processed and manipulated before any conclusions can be drawn from it, which can be done through classification, clustering, prediction and other data analysis techniques.
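To make the idea of "raw, dirty" data a little more concrete, here is a minimal Python sketch of a cleaning step. The records, the field names (`name`, `age`) and the cleaning rules are all made up for illustration; real data will need its own rules.

```python
# Hypothetical raw records, as they might arrive from different sources.
raw_records = [
    {"name": " Alice ", "age": "34"},
    {"name": "bob", "age": ""},        # missing age
    {"name": "ALICE", "age": "34"},    # duplicate once the name is normalized
    {"name": "Carol", "age": "n/a"},   # age is not a number
]


def clean(records):
    """Normalize names, drop rows with unusable ages, remove duplicates."""
    seen = set()
    cleaned = []
    for rec in records:
        name = rec["name"].strip().title()
        age_text = rec["age"].strip()
        if not age_text.isdigit():
            continue  # skip rows whose age cannot be parsed
        row = (name, int(age_text))
        if row in seen:
            continue  # skip exact duplicates of an earlier row
        seen.add(row)
        cleaned.append({"name": name, "age": row[1]})
    return cleaned


cleaned_records = clean(raw_records)
print(cleaned_records)
```

Only after a step like this does it make sense to apply classification, clustering or prediction to the data.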
The growing number of webpages, the growing use of IT services, and the expansion of mass media and communication have led to remarkable growth in the amount of data.
Data can be described in terms of 3 simple properties:
- Volume of the data: How much data is present at a particular instant?
- Variety of the data: What different types of data are present in a given set at a particular instant?
- Velocity of the data: At what speed is the data acquired, so that it can be manipulated and made useful in various ways?
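The three properties above can be estimated for a small sample of data. The Python sketch below is only an illustration: the `events` list, its timestamps and the `describe` function are hypothetical, and "variety" is approximated here simply by the Python type of each payload.

```python
from collections import Counter

# Hypothetical events: (arrival time in seconds, payload).
events = [
    (0.0, "page view"),
    (0.5, {"user": 1}),
    (1.0, 42),
    (2.0, "click"),
]


def describe(events):
    """Return rough volume, variety and velocity figures for the events."""
    # Volume: how many records are present.
    volume = len(events)
    # Variety: how many records there are of each payload type.
    variety = Counter(type(payload).__name__ for _, payload in events)
    # Velocity: records per second over the observed time span.
    times = [t for t, _ in events]
    span = max(times) - min(times)
    velocity = volume / span if span > 0 else float("nan")
    return volume, dict(variety), velocity


volume, variety, velocity = describe(events)
print(volume, variety, velocity)
```

In practice volume is usually measured in bytes rather than records, and velocity over a sliding window, but the idea is the same.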
Such huge data sets, and the growing number of trends hidden in the information they contain, have led to the emergence of Data Science and Data Analysis.
The logic is simple: the more data there is, the more analysis is needed to extract the knowledge it contains, in the form of patterns associated with the data in the given set.
Data Analysis makes use of Databases, Data Warehouses and Data Mining. The cycle is known as Knowledge Discovery in Databases, i.e. the KDD process. It can be shown as:
The KDD process helps mine the important patterns and knowledge from databases, which are finally presented to the end user in the form of reports and graphs.
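As a rough illustration of the flow from database to report, here is a toy Python sketch. It assumes the commonly described KDD stages of cleaning, transformation, mining and presentation; the `database` contents, the function names and the `min_support` threshold are all hypothetical, and the "mining" step is just a simple frequency count, not a real data mining algorithm.

```python
from collections import Counter

# Hypothetical "database": purchase records, one of them incomplete.
database = [
    {"customer": "c1", "items": ["bread", "milk"]},
    {"customer": "c2", "items": ["bread", "butter"]},
    {"customer": "c3", "items": None},
    {"customer": "c4", "items": ["milk", "bread"]},
]


def clean(rows):
    """Cleaning: drop records that have no items."""
    return [r for r in rows if r["items"]]


def transform(rows):
    """Transformation: keep only the set of items in each purchase."""
    return [frozenset(r["items"]) for r in rows]


def mine(baskets, min_support=2):
    """Mining: keep items that occur in at least min_support purchases."""
    counts = Counter(item for basket in baskets for item in basket)
    return {item: n for item, n in counts.items() if n >= min_support}


def present(patterns):
    """Presentation: turn the patterns into a small text report."""
    return "\n".join(f"{item}: {n}" for item, n in sorted(patterns.items()))


patterns = mine(transform(clean(database)))
report = present(patterns)
print(report)
```

Each function stands in for one stage of the cycle, so a real system could swap in a data warehouse query or a proper mining algorithm without changing the overall shape.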
Data Analysis also makes use of various artificial intelligence techniques, such as Machine Learning, Artificial Neural Networks, Support Vector Machines and many more. The aim of Data Analysis is to study the data, process it, and then make use of the important knowledge present in it.
Thus, we can say that as data keeps increasing, so will the need for data analysis and data science.
1) Banner: https://www.slideshare.com
2) Figure1 and Figure2: Data Mining: Concepts and Techniques, Han and Kamber
-Article submitted and written by: Akshay Rakesh Toshniwal