Science creates knowledge via controlled experiments, so a data query isn’t an experiment. An experiment suggests controlled conditions; data scientists stare at data that someone else collected, which includes any and all sample biases.
Now, before you drag out the pitchforks: I’m not a query hater. You won’t see me standing outside the Oracle Open World conference with a sign that says “NO SQL” on it. Queries are fine. Smart people don’t always have the right answer, but they need to ask the right questions. Yes, building a query is like “forming a hypothesis,” but at that point we enter the realm of observational or “soft” science. Yes, by this standard, Astronomy and Social Sciences are also not sciences. I have no idea what Computer Science is, but no, it’s not a science either.
Oh what’s that? Your kind of “Data Science” includes things such as A/B testing, and your “experiments” actually involve executing designs that affect the world? Allow me to retort: that’s not Data Science, that’s actually doing a job. You might have a job title like Product Management or Marketing. But if your job title is “Data Scientist,” you are effectively removing yourself from the actual creation of data.
I do sympathize. I appreciate that it’s no longer sexy to be a Database Administrator, and I guess the term “Business Analyst” is a bit too 1980s. Slapping “Data Warehousing” on a resume is probably not going to land you a job, and it’s way down there with “Systems Analyst” on the cool-factor scale. If you’re going to make up a cool-sounding job title for yourself, “Data Scientist” seems to fit the bill. You can go buy a lab coat from a medical-supply surplus store and maybe some thick glasses from a costume shop. And it works! When you put “Data Scientist” on your LinkedIn profile, recruiters perk up, don’t they? Go to the Strata conference and look on the jobs board: every company wants to hire Data Scientists.
OK, so we want to be “Data Scientists” when we grow up, right? Wrong. Not only is Data Science not a science, it’s not even a good job prospect. In the immortal words of Admiral Ackbar: “It’s a trap.”
These companies expect data scientists to (from a real job posting): “develop and investigate hypotheses, structure experiments, and build mathematical models to identify… optimization points.” Those scientists will help build “a unique technology platform dedicated to… operation and real-time optimization.”
Well, that sounds like a reasonable—albeit buzzword-filled—job description, no? There is going to be a ton of data in the future, certainly. And interpreting that data will determine the fate of many a business empire. And those empires will need people who can formulate key questions, in order to help surface the insights needed to manage the daily chaos. Unfortunately, the winners who will be doing this kind of work will have job titles like CEO or CMO or Founder, not “Data Scientist.” Mark my words, after the “Big Data” buzz cools a bit it will be clear to everyone that “Data Science” is dead and the job function of “Data Scientist” will have jumped the shark.
Yes, more and more companies are hoarding every single piece of data that flows through their infrastructure. As Google Chairman Eric Schmidt pointed out, we now create as much data every two days as humanity did from the dawn of civilization through 2003.
Unfortunately, unless this is structured data, you will be subjected to the data equivalent of dumpster diving, and surfacing insight from a rotting pile of enterprise data is a ghastly process at best. Sure, you might find the data equivalent of a flat-screen television, but you’ll need to clean off the rotting banana peels. If you’re lucky you can take it home, and oh man, it works! Despite that unappetizing prospect, companies continue to burn millions of dollars to collect and gamely pick through the data under their respective roofs. What’s the time-to-value of the average “Big Data” project? How about “Never”?
If the data does happen to be structured data, you will probably be given a job title like Database Administrator, or Data Warehouse Analyst.
When it comes to sorting data, true salvation may lie in automation and other next-generation processes, such as machine learning and evolutionary algorithms; converging transactional and analytic systems also looks promising, because those methods deliver real-time analytic insight while it’s still actionable (the longer data sits in your store, the less interesting it becomes). These systems will require a lot of new architecture, but they will eventually produce actionable results, which you can’t say of “data dumpster diving.” That doesn’t give “Data Scientists” a lot of job security: as in many industries, you will be replaced by a placid and friendly automaton.
So go ahead: put “Data Scientist” on your resume. It may get you additional calls from recruiters, and maybe even a spiffy new job, where you’ll be the King or Queen of a rotting whale-carcass of data. And when you talk to Master Data Management and Data Integration vendors about ways to, er, dispose of that corpse, you’ll realize that the “Big Data” vendors have filled your executives’ heads with sky-high expectations (and filled their inboxes with invoices worth significant amounts of money). Don’t be the data scientist tasked with the crime-scene cleanup of most companies’ “Big Data”—be the developer, programmer, or entrepreneur who can think, code, and create the future.
Miko Matsumura is a Vice President at Hazelcast, an open source in-memory data grid company. He is a 20-year veteran of Silicon Valley.
The leading vendors in the MBaaS space ordered by viability/stability just got shuffled.
My previous post detailed a whole slew of Mobile-Backend-as-a-Service (MBaaS) vendors who had all raised less than 1M in venture capital, and predicted that those small MBaaS vendors would be caught in the “Series A Crunch.” So far that prediction has held: none of these vendors has been able to raise any more money.
The vendors called out in the “Safe Zone” were Kii (profitable), Stackmob (7.5M), Kinvey (7.02M) and Parse (7M). My direct source in Silicon Valley reports that Parse was raising a significant round of Venture Capital when it decided, instead, to take the acquisition offer from Facebook.
Given the burn rate, organizational size and growth of this space (as well as the validation of the exit potential of this space), the other “Safe Zone” vendors should have no trouble raising additional capital, and we should expect to see several announcements of this sort coming soon. We expect that those who raised around 7M have burnt through most of it by now. LinkedIn shows a Kinvey of about 20 people, and a Stackmob of about 29 people.
So the list now puts Parse at the top of the vendor viability equation in MBaaS… However, with the Facebook acquisition comes a lot of questions.
Here are the data points we have collected from reputable sources:
* The rumored amount of the transaction was 85M
* Facebook rep was quoted as saying the sum was “Not Material” (they won’t file an 8-K with the SEC)
* Total amount of VC raised by Parse was about 7M
* 85M exit is consistent with the kind of returns VC would expect
* I spoke to a VC who was planning to invest in their large B round
* Obviously they didn’t raise a round but chose to go with FB
* There must have been other bidders, hence the VC Exit level pricing
* Ilya said that they liked FB the best for executing their plans
* 85M equals 0.085 Instagrams for Facebook
* This price puts them into mixed territory: it’s not exactly an acqui-hire but also not an Instagram either
* Both parties claim that the service will continue uninterrupted
* We’ll see about this
* This is a very different business model from what FB is used to
* LinkedIn shows the company to be about 12 people, maybe more
So what does this mean?
1) Validation for the MBaaS space: this will grease the skids for Stackmob and Kinvey to raise big rounds or to exit at a neighborhood multiple, say about 40-60M each
2) Further investment for smaller players: I think not. I think the smaller players will continue to scratch in the desert; however, smaller players *WILL* experience acqui-hire by bigger players
3) Facebook is probably not sure what’s going to happen, but it seems sure that MOBILE is a huge deal, DEVELOPERS are also a huge deal and they are aggressively trying to flip the company in that direction.
4) If Facebook doesn’t squash these entrepreneurs (at 85M in probably mostly stock, it’s not above company politics) then it puts them in direct competition with Amazon, Heroku, Google and Apple. Add in Facebook Home and also Instagram and it does put them in Google and
In terms of consolidation: I expect that this will drive some of the big players to make acquisitions, but this was an unusual deal. I would expect some of the smaller vendors to be acqui-hired over the next six months by the likes of Google, Apple, Amazon, Heroku and other big cloud providers. An orthogonal app/platform provider with momentum might also do an all stock acqui-hire like maybe a Yahoo, Box.com or an Evernote.
The short version is that the huge influx of Angel and SuperAngel investors has created a glut of seed-funded companies, VERY FEW of which will ever get Series A financing. We’re essentially in what Marc Andreessen called a nuclear winter for Series A financing (back in 2008).
In this article, Sarah Lacy of PandoDaily (who knows everyone!) writes:
Companies we never really got to know are simply starting to fade away. Multiply that by literally a couple thousand, and that’s what 2013 is going to look like in Silicon Valley, and to a lesser degree some other startup ecosystems. “The numbers just don’t add up,” says Jon Callaghan of True Ventures. “There are a minimum of 2,000 companies per year getting funded and coming out of incubators, and there are only 750 VCs that call themselves ‘active.’ But when you look at who is doing at least two deals a quarter, the numbers fall to just 200 firms. Those firms are only going to do a few Series A deals a year.” When you look at the number of firms who invest at least $1 million a quarter for at least four straight quarters, the number drops further: To a paltry 97 firms.
Small MBaaS Players Won’t Be Able to Raise VC Money
Let’s take a look at the Mobile Backend-as-a-Service (MBaaS) space from this perspective. These are companies that provide mobile developers with APIs and SDKs to quickly drop cloud services into apps. This means that developers won’t need to write their own cloud backends, thus saving time, risk, and cost and enabling them to focus on front-end and end-user concerns.
Unlike Starbucks drinks, MBaaS vendors fall into three groups: big, medium and small.
The bigger ones: Several vendors that are counted among the MBaaS crowd have raised serious bucks. I count among those companies like Appcelerator, which has raised $50.2M, and Apigee at a cosmological $72.1M. I see Appcelerator as a full Enterprise Mobility player (who got into MBaaS via acquisition of Cocoafish), and you can see from their CrunchBase entry that competitors include Adobe Systems, Sencha and Xamarin. I see Apigee as an Enterprise API Management company, which only lists Mashery as a competitor (although 3scale also competes with them and, to a lesser extent, WSO2), who got into MBaaS through acquisition of Usergrid. So these bigger players won’t have liquidity problems, but they aren’t so much pure-play MBaaS players either.
The midsized ones: In the next group you see players like Kii (profitable), Stackmob (7.5M), Kinvey (7.02M) and Parse (7M). I see these companies in the “Safe” zone for the Series A Crunch, given that they each raised about 7M, give or take. I’m including Kii in this group because Kii does not have to worry about the “Series A Crunch”: it is a profitable company which itself has its own Venture Capital fund. These companies fit into the “pure play” MBaaS model as well, not having offerings in API Management or Front End Client tooling like Sencha or Appcelerator.
The small ones: It’s in this last group where you see some serious concerns. Companies like Cloudmine, FatFractal, Applicasa (500k), Cloudyrec, iKnode, ScottyApp, QuickBlox and others haven’t raised Series A and may be caught up in this crunch. The lucky ones may be acqui-hired or merged into bigger companies, as Cocoafish was by Appcelerator or Trestle was by Flurry. Obviously these are privately held companies, so they may have a huge cash hoard that we don’t know about. But probably not. If you look at it from the Venture Capitalist perspective, the entry price for Series A for MBaaS is $7M given the competition. There really aren’t any firms left who are burning to do a deal in this space who can pull together a Series A like that. I personally know a small MBaaS player who has been out raising money for the past six months or more with no luck.
Small MBaaS Players Run Out of Cash, Try to Sell Themselves
In this kind of environment, it’s hard to see there being enough early revenue in this market to sustain the large number of players. Unfortunately, the buyers may not be there. When polled, some of the top analysts in the space indicate that the consolidation is not going to happen very quickly. Ray Wang, Analyst from Constellation seems to think MBaaS consolidation will take 3 years.
There’s a detailed conversation about this topic among top MBaaS vendors here. What does this mean reading between the lines? This kind of pricing doesn’t reap more revenue from hypergrowth apps, rather it grabs revenue from everyone else. Looking at it that way it reads as a serious pivot towards the Enterprise. This becomes clear especially when you go to https://www.stackmob.com/pricing/.
I’m not sure what their plan is for controlling cost. Will all apps be free forever? When does a successful app become an “Enterprise” that has to be charged under a custom plan? They do seem to be able to derive *some* revenue from “free” apps by charging for things like “custom code”. Questions abound. Still, it’s kind of an exciting, if mysterious, move.
Some MBaaS Players Will Pivot to Enterprise
I think these companies will have to try to eke out survival through revenue, which means going after revenue share with hypergrowth apps becomes out of reach. You end up pivoting towards the enterprise for more short term, bigger chunks of revenue, but less hypergrowth opportunity. We are already seeing a pivot of a number of companies towards the Enterprise. Parse is featuring on its front page this image:
Stackmob has the phrase “Trusted by leading businesses” on their front page, again lending credence to the Enterprisey feeling about their approach.
Vendor Viability Will Become a Key Criterion For Evaluating MBaaS
What happens when your MBaaS provider goes out of business? What happens if it is acquired and substantially changed by the acquirer? These are serious considerations if you are building apps.
If the industry at large is reasonably self-regulating, savvy developers will realize these concerns and know to stay away from vendors who might not even last the next six months.
The smaller players may try to pivot to Enterprise revenue to try to stabilize their cash flow, but smart Enterprise buyers should exercise caution in selecting vendors who are so unstable. These vendor viability concerns will impact the revenue of the smaller players, leading to a shortened runway.
It’s that time of year where pundits prognosticate about the upcoming year. I’ll bite: MMX (that’s the Roman numeral for 2010) is shaping up to be a doozy of a year (although I prefer 7DA, which is 2010 in hex). Last night I decided to re-watch 2010: “The Year We Make Contact”. It’s still an incredible movie and a fascinating way to examine people’s assumptions and predictions about the future. The book 2010 was published by Arthur C. Clarke in January of 1982. Some of the striking differences between today and the 2010 imagined by Arthur C. Clarke in his book include:
The radical advancement of Artificial Intelligence (AI) in the form of the HAL 9000 computer
Substantial investment in Space Exploration including a second manned trip to Jupiter 9 years after 2001, the first trip
Nuclear conflict between the United States and the Soviet Union, a nation that no longer exists
Contact between a super powerful alien race and humankind (this might yet happen but time is short)
Computer User Interfaces are barely better than a dumb terminal
It’s interesting how the predictions often say more about the time of publication and about the author than about the future. In the 1980s the threat of nuclear war with the Soviet Union loomed large, as portrayed in the movie WarGames, which came out in 1983. Of course the advancement in the space program was a fond hope of Arthur C. Clarke, who is certainly a childhood hero of mine in terms of his message of technology transcendentalism and his pioneering science-based fiction.
So with this backdrop, I will venture to make my own technology predictions for 2010, focused on Enterprise Software, Cloud Computing and related topics.
Prediction 1: nothing will happen in 2010
A bold prediction. I’ve already read my share of predictions for 2010 including those from:
I’ve listed (in parentheses) some of the predictions made, above. First of all, the predictions I highlighted were the ones I found the most interesting. Aside from the unlikely (Steve Ballmer will leave Microsoft) and the just-plain-crazy (Supercomputers will achieve the same raw processing power as human brains), I can’t say that any of these predictions gets my blood moving. All due respect to those pundits and prognosticators, many of whom I consider my friends and colleagues.
So why won’t anything happen in 2010?
The short version is that big changes that you’d notice take a long time. It also happens that such changes take a very short time.
If you find the previous statements irritating or conflicting, you are not alone. Big changes in technology and society are frequently driven by exponential functions, and Albert A. Bartlett, Professor Emeritus of Physics at the University of Colorado at Boulder (many thanks to my friend @avh for tweeting this video), makes a solid case that “The Greatest Shortcoming of the Human Race is the Inability to Understand the Exponential Function”. If you feel challenged by my previous statements, please take the time to have a look at this video:
As you can see, the exponential function is just a fixed percentage of growth that compounds. Albert Einstein never said “Compound Interest is the most powerful force in the Universe”, but he should have. The exponential function is the fundamental driver of many driving forces and the resulting human impact. This includes:
human population growth (overcrowding)
energy consumption (oil prices)
pollution emissions (global warming)
transistor density on a chip (computer industry)
DNA sequencing rate (Human Genome project)
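To make the "long time, then a very short time" point concrete, here is a minimal sketch in Python. The 7% growth rate and the 100-year horizon are assumptions chosen purely for illustration (they are not figures from the video), and the function name `compound` is hypothetical.

```python
def compound(start, rate, years):
    """Return the value at the end of each year under fixed-percentage growth."""
    values = [start]
    for _ in range(years):
        # Each year grows by the same percentage of the *current* value.
        values.append(values[-1] * (1 + rate))
    return values

# Assumed numbers for illustration only: 7% annual growth over 100 years.
series = compound(1.0, 0.07, 100)
final = series[-1]

# First year in which the quantity reaches half of its final size.
halfway_year = next(year for year, value in enumerate(series) if value >= final / 2)

print(f"final size after 100 years: {final:.0f}x the starting value")
print(f"half of that size is first reached in year {halfway_year}")
```

Under these assumed numbers, the quantity spends roughly the first nine tenths of the period getting to the halfway mark and then covers the second half in about a decade, which is why steady compounding looks like nothing is happening right up until it looks like everything is happening at once.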
Almost all of the hugely transformational items on any technologist’s list for the Enterprise, including Service Oriented Architecture (SOA), Business Process Management (BPM), Cloud Computing and others, are going to be growing slowly next year. According to IDC Chief Analyst Frank Gens (@fgens), “2010 will be a year of modest recovery for the IT and telecommunications industries. But the recovery will not mean a return to the pre-recession status quo. Rather, we’ll see a radically transforming marketplace — driven by surging demand in emerging markets, growing impact from the cloud services model, an explosion of mobile devices and applications, and the continuing rollout of higher-speed networks. These transformational forces will drive key players to redefine themselves and their offerings and will spark lots of M&A activity.”
But many of the core transformational topics in Enterprise Computing will be growing at single and double (but not triple) digit rates.
Ok, nothing is going to happen, now what?
We’ve established some very Twitter-friendly names for 2010 such as MMX (the Roman) and 7DA (the Geek). But to peer farther into the future we should take a look at the upcoming decade. Every decade has a bit of a “theme” that emerges, one that you can use when you have nostalgia parties in future years. Here are some examples:
The Psychedelic Sixties 1960-1969
The Disco Seventies 1970-1979
The Yuppie Eighties 1980-1989
The Internet Nineties 1990-1999
The Miserable Naughties 2000-2009
Yes, we are good and ready for the Naughties to be over. Bad Naughties, no Krispy Kreme donut (NYSE:KKD)! Let’s look back to January 2000.
Just a short (almost) ten years later and Time-Warner is divesting AOL, the Dow Jones Industrials are lower, Unemployment is at 10%, and we’re fighting global warming, economic collapse and Al Qaeda. Can I just say that we are all SICK and TIRED of the Naughties, the nothings, the zero decade, the lost decade, the decade from hell?
Predictions for the Teenies
Technically, if the 2020s will be called the “twenties”, perhaps the next decade should be called the “tens”. I’m not keen to focus on the early part of the decade, so I am going to point to 2013 and beyond, which we can refer to as the “Teenies” (2013-2019). If we absolutely must have a name for the interim period, let’s call them the “Tweenies” (kids aged 10-12 are referred to as such). A few other reasons why I like Teenies as a name for this upcoming decade are:
We are an adolescent species, mere teenagers–more on this later
Growth in this decade will come from “teeny” things
As many forecasters will tell you, it will take a good long time to build our way out of what’s now called The Great Recession, and though we are seeing “green shoots” now, it will take a long time, well into the decade, to start to see the significant effects. So to be fair, the prediction made earlier that “nothing will happen in 2010” can be recast as a prediction about the decade as a whole, and in this spirit, let’s carry on making some predictions about the Teenies.
Prediction 2: Very Teeny Things Become Very Big
Here’s the short list of very small things that will become very big in this upcoming decade:
The Higgs Boson
The Carbon Dioxide Molecule
Genomics & Personal Medicine
Chlorophyll and Artificial Photosynthesis
Dehalococcoides ethenogenes and other pollution eating microbes
Among many others. This prediction is of course very general, but it is intended to provide an impressionistic view on some of the leading advances approaching the boundary of industrial exploitation. In computing in particular, quantum computing is beginning to show promise, as is nanoassembly, which is the more bottom-up approach to extremely small circuit design. At the 45nm chip design scale, the fabrication costs are already growing prohibitive. Nanotechnology is also showing tremendous promise in transforming the storage industry.
Prediction 3: Storage and Persistence are transformed
Naturally storage experiences a doubling interval similar to Moore’s law. But we are reaching a significant inflection point, both for the application requirements of persistence as well as the persistence technologies. Companies like Steve Wozniak’s Fusion-io are pioneering solid state technologies, and distributed caching technologies are radically improving performance across traditional APIs, according to researchers like Forrester’s @JohnRRymer and @MGualtieri. Companies like Terracotta, RNA Networks and others are leading the charge. The exciting thing about these technologies is that they are completely disruptive but also backwards compatible with today’s technology APIs, so they can be inserted into everyday applications. Unlike the radical wave of “Complex Event Processing” (CEP) vendors such as StreamBase, which require completely rewritten applications (even as they use familiar SQL-like query languages), these solutions provide a theoretical performance improvement of up to 6 orders of magnitude (millisecond disk access vs nanosecond RAM access) over interfaces such as filesystem mount points.
Beyond these advances in software, we see hardware advances such as bottom-up storage using nanoscale self-assembly. Ting Xu, a UC Berkeley assistant professor with joint appointments in the Department of Materials Science and Engineering and the Department of Chemistry, says in the February issue of the journal Science: “The density achievable with the technology we’ve developed could potentially enable the contents of 250 DVDs to fit onto a surface the size of a quarter.” “The challenge with photolithography is that it is rapidly approaching the resolution limits of light,” added Xu. “In our approach, we shifted away from this ‘top down’ method of producing smaller features and instead utilized advantages of a ‘bottom up’ approach. The beauty of the method we developed is that it takes from processes already in use in industry, so it will be very easy to incorporate into the production line with little cost.”
Prediction 4: Just like teenagers, we have trouble getting over ourselves
Despite utopian visions like Star Trek, the “Enterprise” struggles with its scale. The Star Trek universe is based on the concept of “Federation”. Daryl Plummer, VP and Research Fellow at Gartner, defined Federation as “what autonomy you have to give up in order to be part of something bigger.” This is a great definition as it speaks to organizational silos as well as down to individuals in the Enterprise. I wrote about this challenge both in my blog post “There is no “I” in IT–oh yes there is” and a rational response at the Enterprise IT level in the InfoQ article “SOA Governance Revitalized” (thanks @FloydMarinescu and Ryan Slobojan @straxus).
The Shift Index 2009 (download the abstract here), published by @JHagel, shows how poor we are at scaling organization. Since 1965, Return on Assets has declined 75% across US Public Corporations.
We’re not good at federation and scaling organization.
Even Order-To-Cash is going to require collaboration across Enterprise technology silos and Organizational tribes. The problem of Great-Idea-In-The-Shower-To-Cash requires Enterprise collaboration and continuous measurement, alignment and accountability across organizational boundaries. The problem is, the Enterprise may not be the best place for this kind of innovation. Recall that Enterprises are defined (yeah defined by me in this blog post: Top 5 Definitions of Enterprise) as organizations that require size and longevity in order to pursue their mission. The problem with size and longevity is the production of organizational and technology silos.
This results in a complex IT supply…
What remains to be seen is if organizations of size and longevity (read: fat and old) can collaborate at a rate competitive with smaller (perhaps Dunbar-number-sized) organizations. Christopher Allen (@ChristopherA) has an excellent blog post on organizational size as it relates to Dunbar’s number (commonly approximated as 150 people). These smaller organizations can have simpler IT (such as Cloud Powered) while being able to integrate and meet complex business requirements and form complex value chains. Large organizations will not be able to retain talent during the economic upturn and growth of the Teenies, nor will they be able to acquire and consolidate innovators due to the reopening of the IPO markets and the expansion of Market Capitalization proportional to the growth potential of these innovators.
Prediction 5: Trust will take time to heal
One of the reasons for Prediction 1 is the speed at which trust can be restored to the economy. The principle of exponential growth can be seen as a simple reiteration of the financial principle of Compound Annual Growth Rate (CAGR). However, exponential growth can also be a hiding place for charlatanism and multibillion dollar fraud schemes such as those perpetrated by Former NASDAQ Chairman Bernard Madoff.
The ripple effect is both cause and effect: the collapse of the pillars of the economy produces large-scale job losses, which also puts fear and mistrust into the economy. Let’s take a look at an animation that graphically depicts this blow to our economy regionally in the United States:
Thanks to Super VC David Hornik (@DavidHornik) for tweeting this video.
Speaking of Venture Capital–these are the people who are investing in exponential growth. Trust is returning to those markets as well with Benchmark’s amazing day thanks to Peter Fenton (@PeterFenton) and RedPoint’s successful IPO of Fortinet. Since the greatest failing of humankind is the inability to understand the exponential function, it is hard to understand how to combine trust with transformation and the unique chemistry that is Silicon Valley.
But at a Compound Annual Growth Rate of only 14% we have a doubling interval of about 5 years. And interestingly enough, we are experiencing a much shorter cycle time for technology adoption. It took 38 years for radio to attract the first 50 million users. It took 13 years for Television to hit a similar number of users. 4 years for the Internet, 3 years for the iPod and less than 2 years for Facebook. So we are very bad at understanding the exponential function and also terrible at federation and scaling organization. But the good news is that we have a tailwind.
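The doubling interval quoted above can be checked with a few lines of Python. This is only a sketch: the function names `doubling_time` and `rule_of_72` are made up for illustration, and the only input taken from the text is the 14% growth rate.

```python
import math

def doubling_time(cagr):
    """Years needed to double at a fixed compound annual growth rate.

    Solves (1 + cagr) ** years == 2 for years.
    """
    return math.log(2) / math.log(1 + cagr)

def rule_of_72(cagr):
    """Common mental shortcut: 72 divided by the growth rate in percent."""
    return 72 / (cagr * 100)

exact = doubling_time(0.14)
shortcut = rule_of_72(0.14)

print(f"exact doubling time at 14% CAGR: {exact:.2f} years")
print(f"rule-of-72 estimate:             {shortcut:.2f} years")
```

Both figures land a little above five years, so the round number in the text is a fair approximation.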