In the past,
data was generated mainly by company staff entering it by hand. Today users
generate data simply by browsing the internet, emailing, and sharing photos,
videos and so on. Services such as Google are a good example, and a widely cited
estimate puts the total at about 2.5 quintillion bytes of data generated every
day. As data accumulated at this scale, big data became important.
Big data is
a huge set of data, both structured and unstructured, that is so complex and
voluminous that traditional data processing software cannot make use of it. In
the past, people used Relational Database Management Systems (RDBMS), and the
data was brought to the software for processing. Since then the volume of data
has grown enormously. Today we need parallel software running on thousands of
servers, where the software is taken to the data. The main challenges for big
data are capturing, storing, analysing, searching, sharing, transferring,
visualising, querying and updating data, along with information privacy.
But why should we use big data?
Growth in storage capacity, processing power and the availability of data is
driving the rapid growth of big data. Big data has three main characteristics,
known as the 3 Vs: Volume, Variety and Velocity. Volume refers to the
enormous amount of space that generated and processed data occupies. Variety
refers to the different types of data, both structured and unstructured.
Compared with the past, data today comes in forms such as photos, videos, PDFs
and email. The variety of unstructured data makes storage and processing more
difficult. Velocity is another dimension of big data and refers to the speed at
which data is generated, processed and stored. Beyond these three dimensions,
two further characteristics are often added: Variability and Veracity.
Big data is
a common term used by many organisations, large and small. Banks are a good
example: they use data analysis to reduce risk and to improve customer
satisfaction. It can also be used to make effective decisions, to increase the
quality and productivity of goods, to revise student curricula, and so on.
Organisations and governments rely on big data to uncover problems within an
organisation, to spot fraud and crime, to make accurate decisions, to earn good
public reviews, and to cut working time while increasing the efficiency of
work. "Data is money" is an old saying. For that, we need a lot of tools,
techniques and algorithms for extracting value from the raw data. The
technologies that handle this data fall into two groups: operational and
analytical big data. Operational big data provides operational capabilities for
real-time workloads, where data is first collected and stored. Analytical big
data uses Massively Parallel Processing (MPP) systems and MapReduce to analyse
complex data sets, and an analysis may cover the whole data set. Hadoop is a
tool of this kind that uses MapReduce to mine the data. MapReduce is a
programming model built on two steps, mapping and reducing: the map step turns
each input record into key-value pairs, and the reduce step combines the values
that share a key into a refined result.
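
The mapping and reducing steps of MapReduce can be illustrated with a word count. The sketch below is only a single-machine illustration in plain Python, not Hadoop's API: the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical, and a real cluster would run the map and reduce steps in parallel on many servers.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(document: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key, as a framework does between the two steps."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce step: combine all values that share a key into one result."""
    return key, sum(values)


def word_count(documents: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence (hypothetical helper)."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    docs = ["big data is big", "data is money"]
    print(word_count(docs))
```

For the two sample documents, the script prints {'big': 2, 'data': 2, 'is': 2, 'money': 1}. Because each document is mapped independently and each key is reduced independently, both steps can be spread across many machines, which is what makes the model suitable for very large data sets.
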
Big data is a phrase that describes the massive amount of data that must be
collected, processed and stored in ways that traditional databases and software
cannot manage. The market for big data technology is
expected to reach 57 billion by 2020. It will surely help and make life easier,
and even without our noticing, big data is making a massive impact on our daily