Last week, I spoke at IBM’s Information on Demand conference about big data and Watson. I also moderated a panel discussion on the initial applications being developed using the technology developed for Watson. Below is the annotated and properly hyperlinked text of my presentation.
Data is emerging as an extremely valuable asset; some even characterize it as being as valuable as a natural resource. Last year The Economist wrote that data is becoming the new raw material of business: an economic input almost on par with capital and labor.
As we continue to instrument our world, we are able to collect data of amazing quantity and extraordinary detail. Just think, for example, of the data collected by Nike+, a relatively simple sensor that provides a detailed runner’s log, or Zeo, one of our portfolio companies, which uses a more complex sensor to collect data on sleep patterns. Data is coming from the roads we drive on, the trucks that move our goods around the country, the grid that transmits our energy, and countless other sources. In a recent speech Eric Schmidt, Google’s executive chairman, reported that we are generating 5 exabytes of information every 2 days, and the pace is increasing. An exabyte is 10^18 bytes, a 1 followed by 18 zeroes. It is 1 billion gigabytes, or the capacity of approximately 15.6M 64GB iPads.
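For readers who want to check the exabyte arithmetic, here is a minimal sketch in Python. It assumes decimal (SI) units, where a gigabyte is 10^9 bytes and an exabyte is 10^18 bytes; the variable names are illustrative only, and the daily rate is simply the quoted 5 exabytes divided by 2 days.

```python
# Assumption: decimal (SI) units, 1 GB = 10**9 bytes and 1 EB = 10**18 bytes.
GIGABYTE = 10**9
EXABYTE = 10**18

IPAD_CAPACITY_GB = 64  # the 64GB iPad used as the yardstick in the text

# How many gigabytes fit in one exabyte.
gigabytes_per_exabyte = EXABYTE // GIGABYTE

# How many 64GB iPads it takes to hold one exabyte.
ipads_per_exabyte = gigabytes_per_exabyte / IPAD_CAPACITY_GB

# The quoted rate of 5 exabytes every 2 days, expressed per day.
exabytes_per_day = 5 / 2

print(f"{gigabytes_per_exabyte:,} GB per exabyte")
print(f"{ipads_per_exabyte / 1e6:.1f}M 64GB iPads per exabyte")
print(f"{exabytes_per_day} exabytes generated per day")
```

With binary units (a gigabyte as 2^30 bytes) the iPad count would come out somewhat different, so the figure is best read as an order-of-magnitude illustration.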
Data comes in all forms and levels of complexity. Some is clean and well-structured, like the data received from the smart meters that record electricity consumption in our homes. Other data is less structured, messier and harder to understand, like the data a technician records while performing a repair. Some of this data, such as a consumer’s purchasing activity, can be stored for later use and reuse. Other data, such as traffic data, must be used immediately because its utility is only short-term.
Today we refer to Big Data as data that is:
- Large in size.
- Semi-structured or unstructured in form.
- Real-time or near real-time in the way it is generated and consumed.
Advances that dramatically decreased storage, computing and bandwidth costs and increased network access have transformed our ability to collect, store and manage Big Data, and that ability has kept up with the generated volumes. Unfortunately, our ability to analyze Big Data has not kept pace. Today we find ourselves in a position similar to knowing that we have large oil deposits but lacking the right extraction equipment to bring the oil to the surface, the refineries to convert it to fuel, the machinery to put it to productive use, and, more important, the petroleum engineers who tell us where to drill, how deep to drill, and what process to use in order to make the best use of the oil we find.
Realizing our shortcomings in analyzing Big Data and in operationalizing, or actionalizing if you will, the analysis results, many organizations have embarked on efforts to develop new tools and processes. These efforts are resulting in a data analytics renaissance. As venture investors in high-tech companies, we have been actively funding this renaissance, and today close to 20% of our active portfolio consists of companies that collect, manage, analyze or operationalize Big Data. Since the beginning of this renaissance, close to 100 data analytics companies have been funded by venture investors alone.
Recognizing the potential of data exploitation in today’s business environment, my firm, Trident Capital, continues to look for investments in three areas that address Big Data’s critical needs:
- People. Data scientists and business analysts have emerged as the critical personnel for the analysis and utilization of data. We are looking for services companies with large teams of such people and scalable analytic processes.
- Tools. Today’s Big Data analysis tools remain too low-level, requiring analysts to perform many tasks manually. For this reason we are searching for new technologies to help with data ingestion, manipulation and exploitation.
- Applications. Businesses are still not able to capitalize on the results of their analyses in a timely manner, often missing important opportunities. We are searching for companies developing analytic applications that enable business users to actionalize the Big Data sets their organizations collect. Example applications include customer experience data analysis that enables organizations to offer the right level of customer support, Internet of Things data analysis to optimize supply chains, and applications that use analysis results to assist professionals with complex tasks, such as a doctor during the diagnosis process.
As an investor in Big Data analytics companies, I find Watson’s approach important for several reasons:
- First, it uses a question-and-answer interaction, which business users find more natural because it enables them to incrementally improve their understanding of a problem.
- Second, it effectively combines structured with unstructured data, some of which is curated, such as published articles of special or general interest, while the rest is dynamically collected from the open internet.
- Third, Watson’s data analysis speed, which is the result of its underlying architecture, makes the system suitable for several application areas, particularly those where data remains useful for only a short period, such as medical analysis, financial analysis, and consumer sentiment analysis.
- And finally, Watson’s concurrent use of many analysis and prediction techniques not only provides a unique approach to machine learning and fact-prediction but, more importantly, enables the analytic application to explore more alternatives to a possible solution, thus increasing the probability of successfully addressing a problem.
We are just starting to comprehend some of the many benefits of the innovations made by the Watson research team, and I believe that we will continue profiting from their advances for years to come.