Machine Learning Accelerates the 'R' of R&D

Sept. 6, 2019
The front-end of the R&D process in manufacturing industries relies on effective literature search, which after decades of stagnation is now, finally, getting significantly better.

R&D is the perpetual pursuit of “new” – not for the sake of “new”, but for the sake of “better”. So it’s only fitting that the tools which enable R&D also relentlessly tread the path to “better”.

In 1994, when the first full-text web search engines were deployed, a user entered search terms in a search box, hit the search button, got back a list of documents, manually reviewed the list for interesting documents, downloaded one of them, scanned it for insights, and then returned to the list. The user kept downloading and scanning documents until they ran out of time or got too frustrated with the process. Because users actually examine only a tiny fraction of the documents in the search results, this dominant search strategy is best characterized as “hope for amazing good luck.”

Now, for the first time since 1994, search is different, and that change is the product of several converging factors.

Things started to change when smartphones and tablets running iOS and Android became ubiquitous. These mobile devices are hostile to keyboard-based processes like search. Typing a complex query with one finger is not much fun on a phone, nor is working through the search results to find and consume the insightful documents.

Additionally, the growing presence of Millennials in the workforce has accelerated the trend away from traditional search as an information-gathering strategy. Millennials are biased toward social media and speed in information gathering: they grab and move on rather than slowing down to focus. This is the year when Millennials will pass the 50% mark of the workforce, so paying attention to their cognitive style is no longer optional.

Enter machine learning. Applying machine learning to enterprise applications became feasible around 2016, after Google open-sourced its TensorFlow machine learning library in late 2015. Today, machine learning is beginning to play a role in business research within the enterprise, including in R&D, as some tasks historically performed by information specialists can now be performed by computers.

There are several compelling use cases.

For starters, machine learning enables the automated generation of document summaries based on an analysis of the important ideas in a document. A researcher can glean important insights without having to download and read a document, resulting in much faster business research – and a corresponding gain in productivity.
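To make the idea of automated summaries concrete, here is a minimal sketch of one simple approach, extractive summarization by word frequency. It is an illustration only, not the method any particular product uses: the function names, the tiny stop-word list, and the scoring rule are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real system would use a much larger one.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for",
              "on", "that", "with", "it", "as", "are"}

def tokenize(text):
    """Lowercase the text and return its tokens, minus stop words."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower())
            if w not in STOP_WORDS]

def summarize(document, max_sentences=2):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document)
                 if s.strip()]
    freq = Counter(tokenize(document))

    def score(sentence):
        # Average document-wide frequency of the words in this sentence.
        words = tokenize(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in keep)
```

Sentences built from the words a document repeats most often are treated as carrying its important ideas. Production systems use far richer language models, but the workflow is the same: the machine reads the document, and the researcher reads a few sentences.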

A second (and very significant) business research application of machine learning is the “More Like This” button, similar to what we’re used to seeing on consumer shopping websites. Here’s why: In business and technology research, users typically search on vague general terms; they under-specify queries by not expressing what they are actually looking for. For example, a user interested in the impact of the Internet of Things (commonly called “IoT”) on manufacturing might just search on “IoT.” By leaving “manufacturing” out of the query, the user gets a search result that has a few interesting hits scattered in a sea of less interesting hits. Users compensate for this by manually scanning up and down the search results looking for items of interest. They resist rewriting the query to make it more specific because they dislike writing queries in the first place.

Machine learning-based “More Like This” functionality changes the game. The user scans the initial (much-too-general) search result and, once they have found an on-point hit, invokes the machine’s intelligence with a single mouse click. The machine rewrites the query based on an in-depth semantic analysis of the on-point document and returns an entire search result of highly relevant material. That result is more on point than users would imagine possible, given their experience of search using just human-selected keywords.
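One common way to approximate “More Like This” is to treat the on-point document itself as the query and rank everything else by similarity to it. The sketch below does this with TF-IDF weights and cosine similarity. It is a simplified stand-in for the “in-depth semantic analysis” described above, and every function name in it is hypothetical.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (a dict) per document."""
    token_lists = [tokenize(d) for d in docs]
    doc_freq = Counter(t for tokens in token_lists for t in set(tokens))
    n = len(docs)
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        # Smoothed IDF: terms found in every document still get weight 1.
        vectors.append({t: c * (math.log((1 + n) / (1 + doc_freq[t])) + 1.0)
                        for t, c in counts.items()})
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm = (math.sqrt(sum(w * w for w in a.values()))
            * math.sqrt(sum(w * w for w in b.values())))
    return dot / norm if norm else 0.0

def more_like_this(docs, seed_index, top_k=3):
    """Rank the other documents by similarity to the on-point document."""
    vectors = tfidf_vectors(docs)
    seed = vectors[seed_index]
    scored = [(cosine(seed, vec), i)
              for i, vec in enumerate(vectors) if i != seed_index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:top_k]]
```

In the “IoT” example, clicking “More Like This” on a hit about IoT sensors in manufacturing plants would push other manufacturing documents to the top, because the seed document supplies the words the user left out of the original query.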

An extension of “More Like This” is the automatically generated “Recommended Reading List” specific to the interests of each user. The computer can watch the user in the background, build a topic model for that user (which considers all the factors described in the discussion of “More Like This”) based on his or her behavior, and make recommendations based on what the user downloads.
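A recommended reading list can be sketched along the same lines: accumulate the vocabulary of everything a user downloads into a profile, then score unread documents against that profile. The class below is a toy illustration under that assumption. `ReaderProfile` and its methods are invented for this example, and a real topic model would be considerably more sophisticated than raw word counts.

```python
import math
import re
from collections import Counter

def term_counts(text):
    """Count the alphanumeric tokens in a piece of text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

class ReaderProfile:
    """Hypothetical per-user interest model built from download behavior."""

    def __init__(self):
        self.interests = Counter()
        self.seen = set()

    def record_download(self, doc_id, text):
        # Each download adds that document's vocabulary to the profile.
        self.interests.update(term_counts(text))
        self.seen.add(doc_id)

    def recommend(self, catalog, top_k=3):
        """Return IDs of unseen documents that best match the profile."""
        candidates = [(cosine(self.interests, term_counts(text)), doc_id)
                      for doc_id, text in catalog.items()
                      if doc_id not in self.seen]
        candidates.sort(key=lambda pair: pair[0], reverse=True)
        return [doc_id for score, doc_id in candidates[:top_k] if score > 0]
```

The profile updates quietly in the background as the user works, so the recommendations track the user's actual behavior rather than a list of interests they had to type in.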

This is a crucially important function in today’s scientific publishing world. For example, an oncology researcher would have to read 50 peer-reviewed articles every day, weekends included, to stay current on the state of knowledge in the field. Having the machine monitor the content flow for a researcher based on an in-depth knowledge of what that individual researcher cares about moves the task of staying current from “not doable” to “doable.”

Finally, perhaps the most transformational and impactful machine learning-enabled capability is the advent of the automated search report, or “Insights Report”. This represents the first fundamental change in the search experience since 1994. No more slogging through search results, downloading and scanning documents, then going back to the search results to look for more. The user can let the machine read the documents and then just read the Insights Report, in which the machine reports what it found that is pithy, relevant, and important.
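As a rough illustration of the idea, the sketch below builds a bare-bones report by pulling one key sentence from every document in a search result. The scoring rule (query-term matches first, then words shared across the result set) and the name `insights_report` are assumptions made for this example, not a description of how any commercial Insights Report is generated.

```python
import re
from collections import Counter

def words(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def split_sentences(text):
    """Naive sentence splitter on ., ! and ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def insights_report(hits, query):
    """hits is a list of (title, text) pairs.

    Returns one (title, key sentence) pair per hit that has any text.
    """
    query_terms = set(words(query))
    # Words that recur across many hits hint at the result set's themes.
    theme_counts = Counter(w for _, text in hits for w in set(words(text)))
    report = []
    for title, text in hits:
        def score(sentence):
            tokens = set(words(sentence))
            # Rank by query matches first, then by overlap with other hits.
            return (len(tokens & query_terms),
                    sum(theme_counts[t] - 1 for t in tokens))
        sentences = split_sentences(text)
        if sentences:
            report.append((title, max(sentences, key=score)))
    return report
```

Even this crude version reads every document in the result, which is the point: the user sees something from each hit rather than only from the one or two they had time to download.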

For a quick overview of a topic, the Insights Report transfers more knowledge to the user than traditional search. Experience shows that occasional business users conducting a traditional search download, on average, one document per search, thereby missing the valuable insights in the rest of the documents in the search results. (Users do this because the traditional search process is so tedious and time-consuming.) With the Insights Report, the knowledge in the other documents in the search results also contributes to the user’s knowledge.

For power users, the Insights Report saves time and improves knowledge transfer as well. Power users often perform a two-step search process: once to get the overview by scanning the entire search result and maybe inspecting some promising documents; then a second time to drill in once they have grasped an overview of the important topics. With the Insights Report, the overview is instant and comprehensive, allowing power users to progress to the second step immediately and with a better overview.

The business and technical content that can be mined automatically in this way for the benefit of R&D is vast. Conference abstracts, PubMed and other databases, IEEE, and scientific journals all are rich sources of human knowledge that machine learning algorithms are now capable of “reading” on behalf of the human. I believe this is a moment of radical discontinuity in the power of human beings to master complex subjects.

Ultimately, such machine learning-enabled capabilities will make search as we have known it unnecessary. The goal is to have the machine do the search and then tell the user what it finds and what the user should know. The time and effort this literature research workflow can save on the front-end of an R&D project is incalculable. The net effect is to accelerate the literature research phase of a project and get good ideas into development sooner. Time-to-completion, time-to-publish, time-to-market are all cut – and the research result is better informed and more likely to produce breakthroughs.

Now that’s way better.