Shots Across the Data Lake
Big Data Analytics Range War

Range Wars
The settling of the American West brought many battles between ranchers and farmers over access to water. The farmers claimed land near the water and fenced it to protect their crops. But the farmers' fences blocked the ranchers' cattle from reaching the water. Fences were cut; shots were fired; it got ugly.

About a century later, with the first tech land rush of the late 1980s and early '90s - before the Web - came battles between those who wanted software and data to be centrally controlled on corporate servers and those who wanted them to be distributed to workers' desktops. Oracle and IBM versus Microsoft and Lotus. Database versus Spreadsheet.

Now, with the advent of SoMoClo (Social, Mobile, Cloud) technologies and the Big Data they create, have come battles between groups on different sides of the "Data Lake" over how it should be controlled, managed, used, and paid for. Operations versus Strategy. BI versus Data Science. Governance versus Discovery.  Oversight versus Insight.

The range wars of the Old West were not a fight over property ownership, but rather over access to natural resources. The farmers and their fences won that one, for the most part.

Those tech battles in the enterprise are fights over access to the "natural" resource of data and to the tools for managing and analyzing it.

In the '90s and most of the following decade, the farmers won again. Data was harvested from corporate systems and piled high in warehouses, with controlled access by selected users for milling it into Business Intelligence.

But now, in the era of Big Data Analytics, it is not looking so good for the farmers. The public cloud, open source databases, and mobile tablets are all chipping away at the centralized command-and-control infrastructure down by the riverside. And new cloud-based Big Data analytics solution providers like BigML, Yottamine (my company) and others are putting unprecedented analytical power in the hands of the data ranchers.

A Rainstorm, Not a River
Corporate data is like a river - fed by transaction tributaries and dammed into databases for controlled use in business irrigation.

Big Data is more like a relentless rainstorm - falling heavily from the cloud and flowing freely over and around corporate boundaries, with small amounts channeled into analytics and most draining to the digital deep.

Many large companies are failing to master this new data ecology because they are trying to do Big Data analytics the same way, and with the same tools, as they did BI, and that will never work. There is a lot more data, of course, but it is different data - tweets, posts, pictures, clicks, GPS, etc., not RDBMS records - and it calls for different analytics - discovery and prediction, not reporting and evaluation.

Successfully gleaning business value from the Big Data rainstorm requires new tools and maybe new rules.

Embracing Shadows
These days, readers of tech industry content frequently see the term "Shadow IT" used for the way business people adopt new technologies to process and analyze information without the help of "real IT". It is SoMoClo by another, more sinister name. Traditionalists see it as a threat to corporate security and stability, and modernists see it as a boon to cost control and competitiveness.

But it really doesn't matter which view is right. Advanced analytics on Big Data takes more computing horsepower than most companies can afford to own. Jobs like machine learning from the Twitter Firehose will take hundreds or even thousands of processor cores and terabytes of memory (not disk!) to build accurate and timely predictive models.

Most companies will have no choice but to embrace the shadow and use AWS or some other elastic cloud computing service, along with new, more scalable software tools, to do effective large-scale advanced analytics.

Time for New Rules?
Advanced Big Data analytics projects, the ones of a scale that only the cloud can handle, are being held back by reservations over privacy, security and liability that in most cases turn out to be needless concerns.

If the data to be analyzed were actual business records for customers and transactions, as they are in the BI world, those concerns would be reasonable. But more often than not, advanced analytics does not work that way. Machine learning and other advanced algorithms do not look at business data. They look at statistical information derived from business data, usually in the form of an inscrutable mass of binary truth values that is only actionable to the algorithm. That is what gets sent to the cloud, not the customer file.
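To make the idea of "binary truth values" concrete, here is a minimal sketch in Python. It assumes one common encoding, feature hashing, which the article does not name; the actual derivation used by any particular product may be quite different. The record, its field names, and the helper functions `hash_bucket` and `encode_record` are all hypothetical and invented for illustration.

```python
import hashlib


def hash_bucket(token: str, n_buckets: int) -> int:
    """Map a token to a bucket index using a stable hash."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_buckets


def encode_record(record: dict, n_buckets: int = 32) -> list:
    """Turn a record into a fixed-length vector of 0/1 values.

    Each "field=value" pair switches on one position in the vector.
    Assumption: feature hashing stands in for whatever derivation
    a real analytics pipeline would apply.
    """
    bits = [0] * n_buckets
    for field, value in record.items():
        bits[hash_bucket(f"{field}={value}", n_buckets)] = 1
    return bits


# Hypothetical customer record, invented for illustration only.
record = {"region": "west", "plan": "premium", "tenure_band": "3-5y"}

vector = encode_record(record)
print("".join(str(b) for b in vector))
```

The printed string of zeros and ones is the kind of object a learning algorithm consumes. It carries no field names or readable values, although how much it protects the underlying record depends on the encoding chosen and on what else travels with it.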

If you want to do advanced cloud-scale Big Data analytics and somebody is telling you it is against the rules, you should look at the rules. They probably don't even apply to what you are trying to do.

First User Advantage
Advanced Big Data analytics is sufficiently new and difficult that not many companies are doing much of it yet.  But where BI helps you run a tighter ship, Big Data analytics helps you sink your enemy's fleet.

Some day, technologies like high performance statistical machine learning will be ubiquitous and the business winners will be the ones who use the software best. But right now, solutions are still scarce and the business winners are the ones willing to use the software at all.

About Tim Negris
Tim Negris is SVP, Marketing & Sales at Yottamine Analytics, a pioneering Big Data machine learning software company. He occasionally authors software industry news analysis and insights on Ulitzer.com. He is a 25-year technology industry veteran with expertise in software development, database, networking, social media, cloud computing, mobile apps, analytics, and other enabling technologies.

He is recognized for his ability to rapidly translate complex technical information and concepts into compelling, actionable knowledge. He is also widely credited with coining the term and co-developing the concept of the “Thin Client” computing model while working for Larry Ellison in the early days of Oracle.

Tim has also held a variety of executive and consulting roles in numerous start-ups and several established companies, including Sybase, Oracle, HP, Dell, and IBM. He is a frequent contributor to a number of publications and sites, focusing on technologies and their applications, and has written a number of advanced software applications for social media, video streaming, and music education.


