“Life is the application of noble and profound ideas to life.”
— Matthew Arnold
A recent Forbes magazine article at http://www.forbes.com/sites/davefeinleib/2012/07/24/big-data-trends/ addresses the urgency of Big Data and its arrival on the agenda of corporate leaders in general, not just the IT and Business leaders who are working together or in isolation to get the genie of Big Data OUT!
Data is the cornerstone of any Application, whether it is unstructured content and media or structured data held in relational, columnar and other emerging types of databases. Relational SQL databases and Content Management Systems have dominated this space for Web applications, from basic Web sites to complex applications. Is that era coming to an end with all the HADOOPla?
What is Big in Big Data? Is it the Volume, Variability, Velocity or Variety? It is a combination of all of these, with a high level of Complexity. The initial demand comes from immense spatial data sets and from the Social CRM and Analytics that the Business would like to use for PREDICTIVE Analytics, not just old-fashioned FACT-based BI, which is a post-mortem of collected data. The data arrives in REAL time from various sources, and Business leaders would like it distilled into Social Sentiment to fine-tune advertisements and product/service marketing to consumers.
Management gurus from McKinsey opine that Big Data is the Next Frontier for Innovation, Competition and Productivity: http://www.mckinsey.com/insights/mgi/research/technology_and_innovation/big_data_the_next_frontier_for_innovation
BCG, Gartner, Forrester and a whole lot of Business Technology, IT Strategy and Architecture leaders are all taking Big Data to the peak of the HYPE Cycle. Business and Technology TRANSFORMATION discussions that do not take Big Data into account are rare these days! Have you looked at the Big Data strategy for your organization of late?
Which Industries are now caught up in Big Data analysis? All the industries that thrive on market intelligence, which they used to gather the old-fashioned way from marketing, consumer advocacy media and other research sources. The target data sets run to PETA scale, and hence talk of High Performance Computing (HPC) and Parallel processing dominates Big Data discussions.
Did we not address parallel processing of data even with the relational and non-relational databases of yesteryear? The answer is YES, but the multi-core CPUs and cheap hardware that Moore’s law has since delivered were not available in that era. So there is a change in the hardware and software ecosystem in the Cloudy, Service-oriented world, and the slew of PROGRAMMING styles coupled together offers, at best, “Nightmares on the Information Highway of the Internet”.
Can one product, or the adoption of HADOOP to process the PETA bytes of data collected by Enterprises, solve Business problems? The answer is an ABSOLUTE NO from every vendor. The Big Data acquisition spree is not yet over: Aster Data by Teradata, Vivisimo by IBM, Vertica by HP, alongside HANA from SAP and the HADOOP offerings from all the vendors taking advantage of the OPEN SOURCE Apache project. Does this remind you of how every vendor swapped its own HTTP server for Apache HTTP, and subsequently folded all those COMMONS and the multitude of FRAMEWORKS into development fabrics to create today’s world of PaaS (Platform as a Service) offerings?
So Big Data PaaS and SaaS models are in the making, as IaaS is combined with HADOOP, scavenged Open Source assets and proprietary IP-based products.
Where can you find the terabytes of information on Big Data?
Try everything from Wikipedia – http://en.wikipedia.org/wiki/Big_data to all the major IT vendors and the Statistical Kings of yesteryear such as SAS – http://www.sas.com/big-data/ , IBM – http://www-01.ibm.com/software/data/bigdata/ (which talks more of solutions these days, as we all know its Services revenue stream is on the rise as opposed to its product/platform offerings), Oracle – http://www.oracle.com/us/technologies/big-data/index.html , EMC – http://www.emc.com/microsites/bigdata/index.htm (they are happy, as STORAGE requirements rise exponentially regardless of where the DATA resides, Cloud or on-premise), Google – https://cloud.google.com/products/big-query.html (they need to call it something else), Teradata – http://sites.teradata.com/MapReduce/Whitepaper/.ashx?src=AdWords&kw=big%20data (it has to have Data Scientists and other new roles, as this space has new ROLES!), SAP – http://blogs.sap.com/innovation/big-data (look at the URL composition, with in-memory data etc., that can take you for a spin!) and of course Pentaho, Tableau and QlikView: the BI engines have Tools too!!
And of course, to get a whole catalog of product offerings, one can browse http://www.infoworld.com/t/big-data
But what are the opportunities and the Solution potential for any given Industry? That is tough to answer even after extensive reading and research into Big Data products, as none of them single-handedly provides a Solution per se. So one needs to do the homework of understanding the Business Capabilities needed before jumping in to analyze and select Big Data products. This is the era of Smart Device + Cloud + Big Data = Infinite solutions, and that is reflected in the thousands of small, niche, new Cloud Service, Analytics and Big Data vendors on the landscape.
Big Data is now a corporate cauldron that holds all the collected Information along with a multitude of NOISE, as DATA silos pour out enormous amounts of DATA. Someone has to figure out which Information is relevant, based on Business priorities and capabilities that are subject to change at any time, in REAL time. Separating the real-time SIGNAL from this immense noise and chatter of mixed data is a challenge for everyone, and so the DATA Scientist role is all the more relevant!