Happiness Is… a Handhold on Hadoop
For a Hadoop solution do we look inside or outside?

This post is sponsored by The Business Value Exchange and HP Enterprise Services

The subject of Big Data, and the ‘space race’ to build software that lets us extract insight (and therefore value) from the Big Data mountain, remains one of the most discussed issues in information technology today.

Increasingly prevalent and popular in this arena, if not quite as ‘predominant’ as some would have us believe, is Apache Hadoop. This software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models.
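As a rough illustration of the "simple programming models" mentioned above, here is a minimal single-process sketch of the map, shuffle and reduce pattern that Hadoop MapReduce is known for. It does not use Hadoop itself. The function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical and chosen only for this example. A real cluster would run the map and reduce steps in parallel across many machines.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one line of input."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Group values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values seen for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence on an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data big insight", "data mountain"]
    print(word_count(sample))
```

The point of the pattern is that map_phase and reduce_phase each look at one record or one key at a time, which is what makes the work easy to spread across a cluster.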

But there's a problem, because Hadoop is drastically underutilized in two respects:

  • Full-blown implementations of Hadoop are argued to be extremely technically difficult to pull off.
  • Implementations that do exist are argued to exploit only a fraction of the data management and sheer number-crunching power that a complete deployment could offer.

What's the answer?

Do we look inside (@ logs) or outside (@ architecture)?
For a Hadoop solution do we look inside or outside? That is to say, do we look inside at logs and logfiles as we tinker to perfect our Hadoop installation? Or do we step back to a higher level and examine the architectural considerations that should govern any individual instance of Hadoop, to gain greater insight into what should be working?

Looking inside means looking at logs and logfiles: files that record "events" occurring throughout an operating system, a software application or a data management environment such as Apache Hadoop.

If we analyze what our logs and logfiles contain, we can surface hidden errors, anomalies, problems and patterns. These are the sorts of reports that can help guide DevOps (development and operations) pros as they attempt to bring a Hadoop project online.
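To make the idea of mining logfiles for hidden errors and patterns concrete, here is a minimal sketch that tallies log levels and shows which components produce the most warnings and errors. It assumes each entry follows a common Log4j-style layout of date, time, level, logger name and message. Actual Hadoop log layouts depend on configuration, so the pattern may need adjusting. The summarize function and the sample lines are hypothetical, not taken from any real cluster.

```python
import re
from collections import Counter

# Assumed layout (an assumption, adjust to your own log configuration):
#   "<YYYY-MM-DD> <HH:MM:SS,mmm> <LEVEL> <logger>: <message>"
LINE_RE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) (?P<logger>[\w.$]+): (?P<message>.*)$"
)


def summarize(lines, levels=("WARN", "ERROR", "FATAL")):
    """Count entries per level, and per logger for the levels of interest."""
    by_level = Counter()
    by_logger = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip lines that do not fit the layout, e.g. stack traces
        by_level[match["level"]] += 1
        if match["level"] in levels:
            by_logger[match["logger"]] += 1
    return by_level, by_logger


if __name__ == "__main__":
    # Hypothetical sample entries, for illustration only.
    sample = [
        "2013-05-01 12:00:00,123 INFO example.ComponentA: started",
        "2013-05-01 12:00:01,456 ERROR example.ComponentB: disk check failed",
        "    at example.ComponentB.check(ComponentB.java:42)",
        "2013-05-01 12:00:02,789 WARN example.ComponentB: retrying",
    ]
    by_level, by_logger = summarize(sample)
    print(dict(by_level))
    print(by_logger.most_common(3))
```

Even a crude tally like this can point to the component that deserves attention first. Dedicated log analysis products go much further.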

The HP System Management Homepage (SMH) software function provides this kind of information to users working directly with the firm's own dedicated software for particular hardware. Elsewhere there are products such as XpoLog Augmented Search 5.0, which brings XpoLog's troubleshooting capabilities to the Hadoop platform. Put simply, it's a big, expanding market.

... and then outside (@ architecture)?
The converse approach (actually it should be a corollary and complementary one) is to focus more closely on the outside, i.e., the architecture within which an instance of Hadoop is created. HP provides its own Reference Architectures for Hadoop, and these are available for each of the three leading distributions (Cloudera, Hortonworks and MapR).

This sponsored HP commentary has highlighted the firm's own product initially, but thankfully HP is big and bold enough not to shy away from letting us mention other vendors in this space (most of which will be key partners anyway) - so yes indeed, competing products do exist from Cisco, Dell, IBM and others.

Ways to Improve the RDBMS with Hadoop
In a piece entitled Ten Ways To Improve the RDBMS with Hadoop, to be found on the Business Process Management (BPM) website http://www.ebizq.net/, you can read the following opinion on why a good Hadoop installation can help improve the scalability of applications:

"Very low cost commodity hardware can be used to power Hadoop clusters since redundancy and fault resistance is built into the software instead of using expensive enterprise hardware or software alternatives with proprietary solutions. This makes adding more capacity (and therefore scale) easier to achieve and Hadoop is an affordable and very granular way to scale out instead of up. While there can be cost in converting existing applications to Hadoop, for new applications it should be a standard option in the software selection decision tree."

There is much to gain from intelligent implementation of Hadoop, but it's not easy and we need to look both inside and out (and back to front) in terms of where we can get guidance on best practice and efficiency in our implementation.

About Adrian Bridgwater
Adrian Bridgwater is a freelance journalist and corporate content creation specialist focusing on cross-platform software application development as well as all related aspects of software engineering, project management and technology as a whole.
