To discover the shape and structure of the big data market, the San Francisco-based startup Relato took a unique approach to market research and created the first fully data-driven market report. Company CEO Russell Jurney and his team collected and analyzed raw data from a variety of sources to reveal a boatload of business insights about the big data space. This exceptional report is now available for free download.

Using data analytic techniques such as social network analysis (SNA), Relato exposed the vast and complex partnership network that exists among tens of thousands of unique big data vendors. The dataset Relato collected is centered around Cloudera, Hortonworks, and MapR, the major platform vendors of Hadoop, the primary force behind this market. From this snowball sample, a 2-hop network, the Relato team was able to answer several questions, including:

- Who are the major players in the big data market?
- Which is the leading Hadoop vendor?
- What sectors are included in this market and how do they relate?
- Which among the thousands of partnerships are most important?
- Who’s doing business with whom?

Metrics used in this report are also visible in Relato’s interactive web application, via a link in the report, which walks you through the insights step-by-step.
My editors at O'Reilly and I thought this work a significant milestone: the first FULLY data-driven market report. I'm not sure anyone else marked it the way we did, but the claim is probably true. Future historians will utter my name for all etern... no. It was cool, though. Ben Lorica pitched me on the idea after I gathered all the partnerships of the four major Hadoop vendors at the time, which came to hundreds of companies. This report is based on one more hop out from that: the partnerships of all those hundreds of companies.
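For readers curious what a 2-hop snowball sample looks like in practice, here is a minimal Python sketch. It is not the pipeline used for the report. The partnership listings, the `VendorA` through `VendorF` names, and the `snowball_sample` and `degree` helpers are all hypothetical; only the three seed vendor names come from the text above.

```python
from collections import defaultdict

# Hypothetical partnership listings: company -> partners it names.
# Only the three seed vendors are real; every other entry is made up.
PARTNERS = {
    "Cloudera": ["VendorA", "VendorB"],
    "Hortonworks": ["VendorB", "VendorC"],
    "MapR": ["VendorC"],
    "VendorA": ["VendorD"],
    "VendorB": ["VendorD", "VendorE"],
    "VendorC": ["Cloudera"],
    "VendorD": ["VendorF"],  # VendorF sits 3 hops out, so it is never reached
}


def snowball_sample(seeds, partners, hops=2):
    """Collect companies and undirected edges within `hops` of the seeds."""
    seen = set(seeds)
    frontier = set(seeds)
    edges = set()
    for _ in range(hops):
        next_frontier = set()
        for company in frontier:
            for partner in partners.get(company, []):
                # frozenset makes the edge undirected and deduplicated
                edges.add(frozenset((company, partner)))
                if partner not in seen:
                    seen.add(partner)
                    next_frontier.add(partner)
        frontier = next_frontier
    return seen, edges


def degree(edges):
    """Count partnerships per company, a simple SNA centrality measure."""
    counts = defaultdict(int)
    for edge in edges:
        for company in edge:
            counts[company] += 1
    return dict(counts)


companies, edges = snowball_sample(["Cloudera", "Hortonworks", "MapR"], PARTNERS)
ranked = sorted(degree(edges).items(), key=lambda item: (-item[1], item[0]))
for name, count in ranked:
    print(name, count)
```

The first pass over the frontier collects the seed vendors' partners, and the second pass collects the partners of those partners, which is the "one more hop out" described above. Degree is only the simplest possible ranking; a real analysis would likely use richer centrality measures from a graph library.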
There were two interesting conclusions to the report. First, Hortonworks had rapidly caught up with Cloudera in the marketplace, something that was not obvious to many at the time. Second, there was a cool chart showing how economic clusters interacted: new data platforms were more data-driven (connected to the analytics market) than old data platforms.
I will say that after this report came out, a second and greatly improved fully data-driven market report followed from Spiderbook and O'Reilly... and that one was great. It delivered on what I had hoped to do with this report, since they had spent much longer collecting data. Their report is called The New Artificial Intelligence Market, 2016.