In the past, data was generated mainly by the employees of companies entering it by hand. Today users generate data themselves by browsing the internet, sending emails, sharing photos and videos, and so on. A frequently cited example is Google, where 2.5 quintillion bytes of data are reportedly generated in a single day. As the accumulation of data scaled up, big data became important.
Big data is a huge set of data, both structured and unstructured, that is so complex and voluminous that traditional data processing software cannot make use of it. In the past people used Relational Database Management Systems (RDBMS), and the data was brought to the software for processing. Since then the volume of data has grown enormously. Today we need parallel software running on thousands of servers, where the software is taken to the data.
The main challenges for big data are capturing, storing, analysing, searching, sharing, transferring, visualising, querying and updating data, along with information privacy. But why should we use big data? Increases in storage capacity and processing power, together with the availability of data, are driving the rapid growth of big data. Big data has three main characteristics, also known as the 3 V's: Volume, Variety and Velocity.
Volume refers to the enormous amount of space that the generated and processed data occupies. Variety, on the other hand, refers to the different types of data, both structured and unstructured. Compared with the past, data today comes in the form of photos, videos, PDFs, emails and so on. The variety of unstructured data makes storage and processing a difficult task. Velocity is another dimension of big data and refers to the speed at which data is generated, processed and stored.
Besides these three dimensions, two additional characteristics are often added: Variability and Veracity. Big data as a common term is used by many organisations, large and small. A very good example is banks, which use data analysis to reduce risk and to improve customer satisfaction. It can also be used to make effective decisions, to increase the quality and productivity of goods, to implement changes and modifications in student curricula, and so on. The importance of big data shows when organisations and governments use it to find issues or problems within an organisation, to spot fraud and crime, to make accurate decisions, to earn good public reviews, to reduce working time and to increase the efficiency of work.
"Data is money" is an old saying. To realise that value we need many tools, techniques and algorithms for extracting information from the raw data. The technologies that handle the data fall into two groups: operational and analytical big data. Operational big data provides operational capabilities for real-time workloads, where data is first collected and stored. Analytical big data uses Massively Parallel Processing (MPP) systems and MapReduce to analyse complex data sets, and the analysis may cover the whole data set.
Hadoop is one such tool; it uses the MapReduce algorithm for mining data. MapReduce is an algorithm that uses two techniques, mapping and reducing, to obtain refined data. In general, big data is a phrase describing the massive amount of data that needs to be collected, processed and stored, and that traditional databases and software are incapable of handling. The market for big data technology is expected to reach 57 billion by 2020. It will surely help make life easier, and even without our noticing, big data is already making a massive impact on our daily lives.
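To make the mapping and reducing steps mentioned above a little more concrete, here is a minimal word-count sketch in plain Python. It is only an illustration of the idea and not Hadoop's actual API: the function names (map_phase, shuffle, reduce_phase, word_count) and the sample documents are assumptions made up for this example, and everything runs in a single process rather than across many servers.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would do between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine all values collected for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence over a list of documents."""
    pairs = []
    for document in documents:
        # In a real cluster each document (or block) could be mapped on a different server.
        pairs.extend(map_phase(document))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical sample input, chosen only for illustration.
documents = ["big data is big", "data is money"]
print(word_count(documents))
```

The map step looks at each document independently, which is what allows the work to be spread over many machines. The reduce step only needs the values that share a key, so it can also be run in parallel, one key at a time.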