Digital Edition

SYS-CON.TV
Happiness Is… a Handhold on Hadoop
For a Hadoop solution do we look inside or outside?

This post is sponsored by The Business Value Exchange and HP Enterprise Services

As we know, the subject of Big Data and the ‘space race' to produce software application development functions that will enable us to extract insight and (therefore) value from the Big Data mountain remains one of the most discussed issues in information technology today.

Increasingly prevalent and popular, if not quite as ‘predominant' as some would have us believe, in this arena is Apache Hadoop. This software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models.

But there's a problem, because Hadoop is drastically underutilized in two respects:

  • Full-blown implementations of Hadoop are argued to be extremely technically difficult to pull off.
  • Implementations that do exist are argued to only take advantage of a fraction of what might be represented in a complete deployment in terms of data management and sheer number crunching power.

What's the answer?

Do we look inside (@ logs) or outside (@ architecture)?
For a Hadoop solution do we look inside or outside? That is to say, do we look inside at logs and logfiles as we tinker around to perfect our Hadoop installation? Or do we look at higher level and look at the architectural considerations that should be governing any individual instance of Hadoop to get some greater insight into what should be working?

Looking inside at logs and logfiles - these are files that record "events" occurring throughout an operating system or software application or data management environment such as Apache Hadoop.

If we look at how our logs and logfiles are performing, then we can get information on hidden: errors, anomalies, problems and patterns... and these are the sorts of reports that can help guide DevOps (developer-operations) pros as they attempt to being a Hadoop project online.

The HP System Management Homepage (SMH) software function provides this kind of information to users working directly with the firm's own dedicated software for particular hardware. Elsewhere there are products such as XpoLog Augmented Search 5.0, which brings XpoLog's troubleshooting capabilities to the Hadoop platform. Put simply, it's a big expanding market.

... and then outside (@ architecture)?
The converse approach (actually it should be corollary and complementary one) here is to focus more closely on the outside, i.e., the architecture inside which an instance of Hadoop is created. HP provides its own Reference Architectures for Hadoop and this is available for each of the three leading distributions (Cloudera, Hortonworks and MapR).

This sponsored HP commentary has highlighted the firm's own product initially, but thankfully HP is big and bold enough not to shirk away from us being able to mention other vendors in this space (most of which will be key partners anyway) - so yes indeed competing products do exist from Cisco, Dell, IBM and others.

Ways to Improve the RDBMS with Hadoop
In a comprehensive sub-headed piece entitled Ten Ways To Improve the RDBMS with Hadoop to be found on Business Process Management (BPM) website http://www.ebizq.net/ you can read the following opinion why a good Hadoop installation can help improve the scalability of applications:

"Very low cost commodity hardware can be used to power Hadoop clusters since redundancy and fault resistance is built into the software instead of using expensive enterprise hardware or software alternatives with proprietary solutions. This makes adding more capacity (and therefore scale) easier to achieve and Hadoop is an affordable and very granular way to scale out instead of up. While there can be cost in converting existing applications to Hadoop, for new applications it should be a standard option in the software selection decision tree."

There is much to gain from intelligent implementation of Hadoop, but it's not easy and we need to look both inside and out (and back to front) in terms of where we can get guidance on best practice and efficiency in our implementation.

About Adrian Bridgwater
Adrian Bridgwater is a freelance journalist and corporate content creation specialist focusing on cross platform software application development as well as all related aspects software engineering, project management and technology as a whole.

In order to post a comment you need to be registered and logged in.

Register | Sign-in

Reader Feedback: Page 1 of 1



ADS BY GOOGLE
Subscribe to the World's Most Powerful Newsletters

ADS BY GOOGLE

Discussions of cloud computing have evolved in recent years from a focus on specific types of cloud,...
In an era of historic innovation fueled by unprecedented access to data and technology, the low cost...
In his session at 20th Cloud Expo, Scott Davis, CTO of Embotics, discussed how automation can provid...
Modern software design has fundamentally changed how we manage applications, causing many to turn to...
We are seeing a major migration of enterprises applications to the cloud. As cloud and business use ...
In this presentation, you will learn first hand what works and what doesn't while architecting and d...
Everyone wants the rainbow - reduced IT costs, scalability, continuity, flexibility, manageability, ...
DXWorldEXPO LLC announced today that "IoT Now" was named media sponsor of CloudEXPO | DXWorldEXPO 20...
Founded in 2000, Chetu Inc. is a global provider of customized software development solutions and IT...
SYS-CON Events announced today that DatacenterDynamics has been named “Media Sponsor” of SYS-CON's 1...
Andi Mann, Chief Technology Advocate at Splunk, is an accomplished digital business executive with e...
Most DevOps journeys involve several phases of maturity. Research shows that the inflection point wh...
Dynatrace is an application performance management software company with products for the informatio...
Today, we have more data to manage than ever. We also have better algorithms that help us access our...
Bill Schmarzo, author of "Big Data: Understanding How Data Powers Big Business" and "Big Data MBA: D...
DevOpsSummit New York 2018, colocated with CloudEXPO | DXWorldEXPO New York 2018 will be held Novemb...
@DevOpsSummit at Cloud Expo, taking place November 12-13 in New York City, NY, is co-located with 22...
DXWordEXPO New York 2018, colocated with CloudEXPO New York 2018 will be held November 11-13, 2018, ...
CloudEXPO New York 2018, colocated with DXWorldEXPO New York 2018 will be held November 11-13, 2018,...
SYS-CON Events announced today that IoT Global Network has been named “Media Sponsor” of SYS-CON's @...