The Hadoop wars, HP cloud(s) and IBM’s big win


If you are confused about Hewlett-Packard’s cloud plan of action, this week’s guest will lay it out for you. Bill Hilf, the SVP of Helion product management, makes his second appearance on the show (does that make him our Alec Baldwin?) to talk about HP’s many-clouds-one-management layer strategy.

The lineup? Helion Eucalyptus for private clouds that need Amazon Web Services API compatibility; Helion OpenStack for the rest; and HP Cloud Development Platform (aka Cloud Foundry) for platform as a service. Oh, and there’s HP Public Cloud, which I will let him tell you about himself.

But first Derrick Harris and I are all over IBM’s purchase of AlchemyAPI, the cool deep learning startup that does stuff like identifying celebs and wanna-be celebs from their photos. It’s a win for IBM because all that coolness will be sucked into Watson and expand the API set Watson can parlay for more useful work. (I mean, winning Jeopardy is not really a business model, as IBM Watson exec Mike Rhodin himself has pointed out.)

At first glance, a system that can tell the difference between Will Ferrell and Chad Smith might seem similarly narrow, but on reflection you can see how that fine-grained, self-teaching technology could find broader uses.

AlchemyAPI CEO Elliot Turner and IBM Watson sales chief Stephen Gold shared the stage at Structure Data last year. Who knows what deals might be spawned at this year’s event?

Also, we’re happy to follow the escalating smack talk in the Hadoop arena as Cloudera CEO Tom Reilly this week declared victory over the new Hortonworks-IBM-Pivotal-backed Open Data Platform effort, which we’re now fondly referring to as the ABC or “Anyone But Cloudera” alliance.

It’s a lively show so have a listen and (hopefully) enjoy.


Datawatch: The Leader of Big Data Stocks

This is the first segment of a two-part article about DataWatch. Click here to view Part 2, titled DataWatch is Cheap at $28.

I love investing in a small company undergoing a transition. The reason is simple: investor perception can be incorrect, and that can create a huge opportunity for value investors.

Big Data Stocks: Datawatch

One company that is ripe for profits is DataWatch (NASDAQ: DWCH). The company is transitioning to become a leader in “big data.” Thanks to this transformation and the company’s small market capitalization of just $185 million, investors haven’t yet recognized the big opportunity with the data stock.

So what’s happening at DataWatch? The company has a new management team from IBM at the helm and an acquisition that will revamp its product portfolio.

Revenue growth will be ramping up next quarter. And the valuations of competitors indicate that this data stock could double in price.

Two of the most successful recent IPOs were Splunk (NASDAQ: SPLK) and Tableau (NASDAQ: DATA). Both are software companies competing in the space of enterprise analytics, providing easy access to big data in real time. Because of a major paradigm shift toward the use of visual tools for analyzing data, these stocks have received exceptionally high valuations from Wall Street.

Splunk had its IPO in early 2012. It trades at 27 times trailing sales and is valued at $6.5 billion. Meanwhile, Tableau went public in 2013, trades at 23 times trailing sales and is worth $3.9 billion. Neither company is profitable.

Why is the Street giving these companies such lofty valuations? There are three main reasons:

  1. High gross margins of more than 85%
  2. Recurring revenues business model
  3. Early innings for “Big Data” industry

Demand for the products and solutions from big data companies like Splunk and Tableau is rapidly growing, and has resulted in incredible growth over the past year. Splunk is on pace to grow revenue 41% in 2013, and Tableau almost 100%. These are great companies with bright futures, but unfortunately it appears as if Wall Street has already priced in the bulk of their success.

Enter DataWatch (NASDAQ: DWCH). This is a little company that went public in 1992 and is valued at just about $250 million. Here’s the catch: since 2011 DataWatch has been undergoing a major transformation to compete directly with offerings from Splunk and Tableau.

This transition has been led by an all-star management team, with a track record of numerous buyouts in the big data analytics space. In 2011, DataWatch hired Michael Morrison as its new CEO. Morrison came with years of executive experience at Applix, Cognos and IBM.

Morrison was first the COO of Applix, which was bought out by Cognos in 2007 for $339 million. Just a couple months after the acquisition, IBM bought out Cognos for $4.9 billion. These were all enterprise software companies specializing in business analytics, exactly where DataWatch now competes. Not a bad track record, especially given DataWatch was worth well under $100 million when Morrison took his position as CEO.

It may seem hard to imagine why five industry-leading executives would all leave their current positions to work for a tiny company like DataWatch. I’m impressed by what they’ve accomplished. In just two years, DataWatch has transformed from an outdated niche software vendor into a dynamic real-time analytics player.

A large part of that shift is due to the recent acquisition of Panopticon Software, a Swedish company specializing in the visualization of real-time big data analytics. DataWatch announced the acquisition in June, and it was completed in late August. On the conference call, Panopticon’s management specifically pointed to DataWatch’s low valuation as the rationale for selling the company in an all-stock deal.

Industry experts were quick to rave about the acquisition. Philip Howard of Bloor Research sums it up nicely in a blog post: “DataWatch … isn’t very sexy … Panopticon, on the other hand, is sexy.”

Panopticon doesn’t just sound like a trendy software company; its numbers are impressive too. Although concrete financial data will be tough to find on the company until DataWatch announces its third-quarter results, we do have a couple of hints to work from. On the original conference call it was noted that Panopticon produced $5 million in revenue in 2012, an increase of 112%.

Although it’s very small, Panopticon grew faster than both Splunk and Tableau in 2012. Ironically, DataWatch got this higher growth, for just a fraction of the cost. When it was announced, the deal was worth $33 million, meaning DataWatch paid just 6.6 times sales. That translates into about 25% of the valuation multiple of these better-known competitors.
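A quick back-of-the-envelope check of the multiples cited above. All inputs are the article’s own figures, not live data:

```python
# Back-of-the-envelope check of the price-to-sales multiples cited above.
# All inputs are the article's figures (in millions of USD), not live data.

deal_value = 33.0        # announced value of the Panopticon acquisition
panopticon_sales = 5.0   # Panopticon's reported 2012 revenue

panopticon_multiple = deal_value / panopticon_sales   # 6.6x sales

splunk_multiple = 27.0   # trailing price/sales per the article
tableau_multiple = 23.0  # trailing price/sales per the article

avg_peer_multiple = (splunk_multiple + tableau_multiple) / 2
relative = panopticon_multiple / avg_peer_multiple

print(f"Panopticon paid multiple: {panopticon_multiple:.1f}x sales")
print(f"Relative to peer average: {relative:.0%}")  # roughly a quarter
```

The 6.6x paid for Panopticon comes out to about 26% of the average of the Splunk and Tableau multiples, which squares with the article’s "about 25%" claim.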

But what really makes me excited is the potential of combined product offerings from DataWatch and Panopticon. DataWatch has over 40,000 customers, including 99 of the Fortune 100 companies. Meanwhile, Panopticon has just 75 customers. It’s safe to assume that Panopticon’s customers are signing much bigger contracts, similar in structure to what Tableau and Splunk typically receive.

If DataWatch can successfully integrate its new offerings and sell Panopticon’s software to its 40,000 customers, there is huge potential for growth. Even if DataWatch can only sell 1% of its customers the Panopticon services, the business could grow by a factor of four.

Learn more about DataWatch, and why I think the data stock could rise by 50%. To read Part 2 of this article, click here: DataWatch is Cheap at $28.



5 steps for transforming your business using data

Not every organization was born digital, but most can still transform their business into a digitalized business if they follow some of the recipes used by successful digital companies.

Organizations that were born digital are built around their IT platform, and all their business processes are IT-driven and data-powered. Every action, every decision, is based on the processing of data sets about users and customers, about usage patterns, external conditions, etc.


But not every organization was born digital. If you run a traditional organization, with various degrees of sophistication in its IT, how can you still transform this business into a digitalized business? How can you create new business opportunities, based on your digital assets?

Here are five steps that digital leaders take, and that you should take too.

Drive your business intelligence toward real-timeliness

Chances are, you are already collecting transactional data in a data warehouse or data mart, and you analyze it somehow. Maybe it’s to decide which sales rep to promote or fire at the end of each quarter. Or it’s to establish a list of customers to send the new catalog to. The first step toward a digital business is to obtain this insight on a real-time basis. Now, “real time” does not always mean sub-second! In fact, I have argued before that “right time” is a better term: get insight when that insight is relevant enough to impact the business.

Instead of just using your sales reps’ performance rankings for HR reasons, use them to send your best reps the best leads, and to provide a gamified challenge to those who usually do great but are having a (hopefully temporary) hard time. And use customer segmentation for in-app promotions and text-message coupons.

Of course, all of this requires insight to be obtained in right time. Agility and instantaneity are among the foundations of digitalization.
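As a minimal sketch of "right time" rather than sub-second processing, the snippet below keeps a rolling window of events sized to the decision it feeds; the class and field names are invented for illustration:

```python
from collections import deque
from datetime import datetime, timedelta

class RightTimeAggregator:
    """Keeps a rolling window of (timestamp, amount) events and answers
    'how is this metric doing right now?' queries. The window size is
    chosen per decision: minutes for lead routing, days for a catalog run."""

    def __init__(self, window: timedelta):
        self.window = window
        self.events = deque()  # (timestamp, amount), oldest first

    def record(self, ts: datetime, amount: float) -> None:
        self.events.append((ts, amount))

    def total(self, now: datetime) -> float:
        # Evict events older than the window, then sum what remains.
        while self.events and self.events[0][0] < now - self.window:
            self.events.popleft()
        return sum(amount for _, amount in self.events)

# Usage: a per-rep sales tracker with a one-hour window for lead routing.
agg = RightTimeAggregator(window=timedelta(hours=1))
now = datetime(2015, 3, 6, 12, 0)
agg.record(now - timedelta(minutes=90), 500.0)   # too old, will be evicted
agg.record(now - timedelta(minutes=30), 200.0)
agg.record(now - timedelta(minutes=5), 100.0)
print(agg.total(now))  # 300.0
```

The same shape works whether "right time" is an hour or a week; only the window changes.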

Inject analytics into all business processes

Obtaining insight from real-time analytics is only the beginning. Business processes need to be retooled to accept this insight as an input. When business intelligence was all the rage, this used to be referred to as operational business intelligence — the concept has not changed, only the technology. Real-time analytics, rules engines, intelligent business process management tools and push-style communication protocols make it possible to inject and modify rules, and to trigger and modify actions based on insight.
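One hedged sketch of what injecting insight into a process can look like: a tiny rules table, modifiable at runtime, that a business process consults before acting. The rule names and thresholds here are invented for illustration, not any particular rules-engine product:

```python
# A minimal, runtime-modifiable rules table consulted by a business process.
# Rules map a name to (predicate, action); both operate on a context dict.

rules = {}

def add_rule(name, predicate, action):
    rules[name] = (predicate, action)  # new or changed rules apply immediately

def run_rules(context):
    triggered = []
    for name, (predicate, action) in rules.items():
        if predicate(context):
            action(context)
            triggered.append(name)
    return triggered

# Illustrative rule: route high-value leads to the current top rep.
add_rule(
    "route_hot_lead",
    predicate=lambda ctx: ctx["lead_score"] > 80,
    action=lambda ctx: ctx.update(assigned_to=ctx["top_rep"]),
)

lead = {"lead_score": 92, "top_rep": "rep-17"}
print(run_rules(lead))      # ['route_hot_lead']
print(lead["assigned_to"])  # rep-17
```

The point is the shape, not the fifteen lines: insight (the lead score, the top-rep ranking) flows in as data, and the process reacts without a human rereading a quarterly report.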

Always look for more data

Storage is cheap, and the newest breed of data platforms makes it really easy to amass data even if the purpose is not clear. You may not be ready to move your transactional systems, or your data warehousing infrastructure, away from the tried and proven mainframes or RDBMS they are running on. That’s fine. But consider complementing them with a data lake based on Hadoop, and to dump into this lake data you would not have considered worth keeping in the past: access logs, GPS records, abandoned carts, call data records, customer complaint messages, etc.

All of this so-called “dark data” can be used for new insight, for new actions. You may not know which ones yet — but you won’t know until you have had a chance to look.
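A minimal sketch of the "keep everything" pattern: append raw events, untouched, into date-partitioned files alongside the systems of record. The paths and field names are hypothetical, and a real lake would sit on HDFS or object storage rather than a local directory:

```python
import json
import os
from datetime import datetime, timezone

def dump_raw_event(base_dir: str, source: str, event: dict) -> str:
    """Append one raw event, as-is, to a date-partitioned JSON-lines file.
    No schema is imposed at write time; structure is decided later, at
    read time, which is what makes 'dark data' cheap to keep around."""
    day = datetime.now(timezone.utc).strftime("%Y-%m-%d")
    path = os.path.join(base_dir, source, f"dt={day}")
    os.makedirs(path, exist_ok=True)
    file_path = os.path.join(path, "events.jsonl")
    with open(file_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(event) + "\n")
    return file_path

# Usage: an abandoned-cart event nobody has a query for yet.
p = dump_raw_event("/tmp/lake", "abandoned_carts",
                   {"cart_id": "c-123", "items": 3, "value": 57.40})
print(p)
```

The `dt=YYYY-MM-DD` partition convention mirrors how Hadoop-family tools typically expect date-partitioned data to be laid out, so the files are queryable later without reshaping.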

Explore new ways to use your data

With reason, give your analysts unrestricted access to all data. I say “with reason” because some industries have specific regulations that apply, and all industries need to be careful about data privacy. But, assuming you can trust them, and the proper governance exists to curb any abuse, let your experts explore the data. They will find ideas — some that will work and some that won’t.

It’s always difficult to measure the return on investment of innovation. But innovation happens when people are free to pursue ideas. Google got this right with its “20 percent time” program.

You may not have official data scientists. But a good business analyst, equipped with modern tools for data preparation/exploration, can achieve amazing results.

Release often, test all the time and fail fast

This last point is probably the most critical one. Digital organizations are agile. They always test innovations in real-life conditions. Once a feature is complete, go live with it and measure its effectiveness. If it does not yield the expected results, be ready to pull back — revert to the previous version, remove the option, try something else. And whenever possible, test different alternatives in parallel to see which one works best.

The key is not to always get it right — nobody does. The key to success is to fail fast, and change course before it’s too late.
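The test-and-measure loop above can be sketched as a deterministic A/B split plus a simple comparison of outcomes. The bucketing scheme and the conversion metric are illustrative, and a real test would also check statistical significance before declaring a winner:

```python
import hashlib

def assign_variant(user_id: str, variants=("control", "new_feature")) -> str:
    # Deterministic bucketing: the same user always sees the same variant,
    # so results stay comparable across sessions.
    digest = hashlib.sha256(user_id.encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]

def decide(conversions: dict, exposures: dict) -> str:
    # Keep the better-performing arm; 'fail fast' means reverting as soon
    # as the new variant underperforms rather than defending it.
    rates = {v: conversions[v] / exposures[v] for v in exposures}
    return max(rates, key=rates.get)

# Usage: simulated outcome counts for each arm.
print(assign_variant("user-42"))
print(decide({"control": 50, "new_feature": 65},
             {"control": 1000, "new_feature": 1000}))  # new_feature
```

Running several variants in parallel is just a longer `variants` tuple; the revert path is simply shipping `control` again.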


Hortonworks chief: Why it’s time for some tough Hadoop decisions

With last month’s launch of the Open Data Platform initiative, the time is fast approaching when larger vendors involved in Hadoop will have to pick an alliance – because the option of going it alone is too risky, according to Hortonworks president Herb Cunitz.

Cunitz, whose firm jointly founded the Open Data Platform with EMC and VMware spinoff Pivotal, believes market pressures will oblige Hadoop firms to decide on their allegiances if they want to exert any influence on the direction of the big-data technology.

“The larger vendors have to proclaim. This market is moving very quickly. They really only have two choices of how to play the market,” Cunitz said.

“[They can either] work with what we’re driving around Hortonworks Data Platform and the Open Data Platform and the alliance; or they can say, ‘Hey, I don’t want to align with anybody. I’m just going to take whatever comes out of Apache Software Foundation’, and then they’re beholden to package it, support it, distribute it themselves.”

However, a number of businesses have already tried that second approach, only to fail, Cunitz said.

“We’ve seen in the market company after company do that and then exit the space because they realise it’s very hard to do. Unless you have the committers, you have no influence on what’s being driven into Apache. You’re just a downstream packager and then it’s a very difficult business to monetise. Unless you have influence, it’s not a monetisable business,” he said.

“It’s very hard for anyone else to influence [Apache Hadoop] because all those people either work for us, or the next company would be Cloudera, or somebody else in the space. There are only a handful of those people around.”

Aimed at defining a core set of Apache technologies to speed adoption of Hadoop, the Open Data Platform’s founding members (GE, Hortonworks, IBM, Infosys, Pivotal, SAS and AltiScale), plus Capgemini, CenturyLink, EMC, Telstra, Teradata, Splunk, Verizon and VMware, will test and certify a number of primary Apache components, which will then form the basis of their Hadoop platforms.

“What everyone agreed on is a Hadoop kernel. Think of it like the Linux kernel, so this doesn’t impact upstream software development in the Apache Software Foundation. That goes on exactly as it’s always done and we’ll continue to be the leader in doing that,” Cunitz said.

“This is downstream from that. These vendors just agree on what the kernel is that they’re going to package and build on top of. That drives alignment across the industry to say if we’ve agreed on a common kernel, it’s a lot easier for us to build the APIs, the interfaces and focus on the things on top of the kernel than to argue what the parts are in the kernel.”

Cunitz said some big companies, such as Microsoft and HP, are notably absent from the Open Data Platform.

“The kernel of YARN, Ambari and Hadoop are the foundation of Hortonworks Data Platform. It’s the same bits [as the Open Data Platform]. Many of the other vendors we’ve already partnered with said, ‘I don’t need to go pay to join that because I’m already getting that by being partnered with Hortonworks’,” he said.

“[Open Data Platform] was many of the others, who are not partnered today, saying, ‘Let’s adopt a standardised kernel’. What I do expect is more market uptake, [and] less arguing across vendors.”

Whether any of those already in a relationship with Hortonworks will ultimately end up joining the Open Data Platform remains to be seen, but Cunitz expects more companies to join.

“If it stays in the format it is now, it’s fine. I would think naturally over time we would see others who would either join in one of two ways. They would either join the Open Data Platform and the kernel or join like Microsoft and HP have with us on the broader, full Hortonworks Data Platform,” he said.

“My expectation would be you will see others in the market this year proclaim and join into one of those paths – or candidly join an alternative path. Some would argue there is a competing initiative. It’s called Cloudera-Intel. It’s never been positioned that way, but obviously they’re aligned and they’re not part of this and in some ways you could argue that’s already a competing initiative.”

Microsoft’s existing involvement with Hortonworks would not preclude its joining the Open Data Platform one day.

“They may. Don’t read anything into what I’m saying here. They can make their own choice. They are already getting what they would want in terms of driving standardisation through our partnership with Microsoft today because they’ve not only adopted the kernel, they’ve adopted all of Hortonworks Data Platform as their standard, their standard inside Azure, their standard inside HDInsight [the Azure Hadoop cloud service].

“They’re already building what I would call a bigger kernel jointly with us. So for them to say we’ve also agreed on the smaller kernel [of the Open Data Platform], well, they already did it on the bigger one. It doesn’t really buy them anything.”

Last week Hortonworks reported its first quarterly results since its initial public offering in December raised $100m and prompted its stock to rise 65 percent.

“In the last quarter the number that’s most impressive is the 99 new subscription customers. That means 99 new companies have become customers and are paying us to go support them and work with them on Hortonworks as a platform. The quarter prior we had 232 total customers. So to go from 232 to add 99 in one quarter, you can do the math on the slope of that curve,” Cunitz said.

“Those 99 new customers are by default using the exact same kernel that’s in the Open Data Platform. So, as an example, if they choose to say, ‘I want to run the Pivotal HAWQ SQL engine’, that can now plug right into Hortonworks Data Platform and the kernel and they can take advantage of that. It’s certified and it runs out of the box.”

“We’re in the early stages of it and this is far from over for anybody, but we’re very comfortable with the direction and the strategy and how this is playing out.”


NIH-led effort launches Big Data portal for Alzheimer’s drug discovery

A National Institutes of Health-led public-private partnership to transform and accelerate drug development achieved a significant milestone today with the launch of a new Alzheimer’s Big Data portal — including delivery of the first wave of data — for use by the research community. The new data sharing and analysis resource is part of the Accelerating Medicines Partnership (AMP), an unprecedented venture bringing together NIH, the U.S. Food and Drug Administration, industry and academic scientists from a variety of disciplines to translate knowledge faster and more successfully into new therapies.

The opening of the AMP-AD Knowledge Portal and release of the first wave of data will enable sharing and analyses of large and complex biomedical datasets. Researchers believe this approach will ramp up the development of predictive models of Alzheimer’s disease and enable the selection of novel targets that drive the changes in molecular networks leading to the clinical signs and symptoms of the disease.

“We are determined to reduce the cost and time it takes to discover viable therapeutic targets and bring new diagnostics and effective therapies to people with Alzheimer’s. That demands a new way of doing business,” said NIH Director Francis S. Collins, M.D., Ph.D. “The AD initiative of AMP is one way we can revolutionize Alzheimer’s research and drug development by applying the principles of open science to the use and analysis of large and complex human data sets.”

Developed by Sage Bionetworks, a Seattle-based non-profit organization promoting open science, the portal will house several waves of Big Data to be generated over the five years of the AMP-AD Target Discovery and Preclinical Validation Project by multidisciplinary academic groups. The academic teams, in collaboration with Sage Bionetworks data scientists and industry bioinformatics and drug discovery experts, will work collectively to apply cutting-edge analytical approaches to integrate molecular and clinical data from over 2,000 postmortem brain samples.

The National Institute on Aging (NIA) at NIH supports and coordinates the multidisciplinary groups contributing data to the portal. The AMP Steering Committee for the Alzheimer’s Disease Project is composed of NIA and the National Institute of Neurological Disorders and Stroke, both of NIH, the U.S. Food and Drug Administration, four pharmaceutical companies (AbbVie, Biogen Idec, GlaxoSmithKline and Lilly) and four non-profit groups (Alzheimer’s Association, Alzheimer’s Drug Discovery Foundation, Geoffrey Beene Foundation and UsAgainstAlzheimer’s), and is managed through the Foundation for the NIH.

“The enormous complexity of the human brain and the processes involved in development and progression of Alzheimer’s disease have been major barriers to drug development,” said NIA Director Richard J. Hodes, M.D. “Now that we are gathering the data and developing the tools needed to tackle this complexity, it is key to make them widely accessible to the research community so we can speed up the development of critically needed therapies.”

The consortium of academic teams contributing the data are led by researchers at the following institutions: Eric Schadt, Ph.D., Icahn School of Medicine at Mount Sinai, New York; Philip De Jager, M.D., Ph.D., Eli and Edythe L. Broad Institute of MIT and Harvard, Boston; Todd Golde, M.D., Ph.D., University of Florida, Gainesville; and Alan Levey, M.D., Ph.D., Emory University, Atlanta. Researchers from Rush University, Chicago; Mayo Clinic, Jacksonville, Fla.; Institute for Systems Biology, Seattle; the University of California, Los Angeles and a number of other academic centers are also participating.

Because no publication embargo is imposed on the use of the data once they are posted to the AMP-AD Knowledge Portal, it increases the transparency, reproducibility and translatability of basic research discoveries, according to Suzana Petanceska, Ph.D., NIA’s program director leading the AMP-AD Target Discovery Project.

“The era of Big Data and open science can be a game-changer in our ability to choose therapeutic targets for Alzheimer’s that may lead to effective therapies tailored to diverse patients,” Petanceska said. “Simply stated, we can work more effectively together than separately.”

About AMP-AD: The Alzheimer’s disease initiative is a project of the Accelerating Medicines Partnership, a joint venture among the National Institutes of Health, the Food and Drug Administration, 10 biopharmaceutical companies and multiple non-profits, managed by the Foundation for the NIH, to identify and validate promising biological targets of disease. AMP-AD is one of the three initiatives under the AMP umbrella; the other two are focused on type 2 diabetes and the autoimmune disorders rheumatoid arthritis and systemic lupus erythematosus. To learn more about the AMP-AD projects please visit:

About the National Institute on Aging: The NIA leads the federal government effort conducting and supporting research on aging and the health and well-being of older people. It provides information on age-related cognitive change and neurodegenerative disease specifically at its Alzheimer’s Disease Education and Referral (ADEAR) Center at

About the National Institutes of Health (NIH): NIH, the nation’s medical research agency, includes 27 Institutes and Centers and is a component of the U.S. Department of Health and Human Services. NIH is the primary federal agency conducting and supporting basic, clinical, and translational medical research, and is investigating the causes, treatments, and cures for both common and rare diseases. For more information about NIH and its programs, visit


Microsoft announces quarterly earnings release date

Microsoft to host earnings conference call webcast.

REDMOND, Wash. — December 10, 2014 — Microsoft Corp. will publish fiscal year 2015 second-quarter financial results after the close of the market on Monday, January 26, 2015 on the Microsoft Investor Relations website at  A live webcast of the earnings conference call will be made available at 2:30 p.m. Pacific Time on the Microsoft Investor Relations website at

Founded in 1975, Microsoft (Nasdaq “MSFT”) is the worldwide leader in software, services, devices and solutions that help people and businesses realize their full potential.

For more information, financial analysts and investors only:

Investor Relations, Microsoft, (425) 706-4400

Note to editors: For more information, news and perspectives from Microsoft, please visit the Microsoft News Center at Web links, telephone numbers and titles were correct at time of publication, but may since have changed. Shareholder and financial information is available at