Surviving and thriving with data science and machine learning means not only having the right platforms, tools and skills, but identifying use cases and implementing processes that can deliver repeatable, scalable business value.
However, the challenges are numerous, from selecting data sets and data platforms, to architecting and optimizing data pipelines, to training and deploying models.
In response, new solutions have emerged to deliver key capabilities in areas including visualization, self-service, and real-time analytics. Along with the rise of DataOps, greater collaboration and automation have been identified as key success factors.
DBTA recently hosted a special roundtable webinar featuring Alyssa Simpson Rochwerger, VP of AI and data, Appen; Doug Freud, VP of data science, SAP Platform and Technology Global Center of Excellence; and Robert Stanley, senior director, special projects, Melissa Informatics, who discussed new technologies and strategies for expanding data science and machine learning capabilities.
According to a Gartner 2020 CIO survey, “only 20% of AI projects deploy,” Rochwerger said. The top challenges are staff skills, understanding the benefits and uses of AI, and data scope and quality.
She said businesses need to start by clarifying a goal so they know where the data will come from. Once organizations know where the data is coming from, they can find and fill in the gaps. A diverse team also makes it easier to sift and combine data.
According to “Data 2020: State of Big Data Study” (Regina Corso Consulting, 2017), 86% of companies aren’t getting the most out of their data, limited by data complexity and sprawl, Freud explained.
SAP Data Intelligence can meet companies in the middle, Freud said. The platform combines enterprise AI with intelligent information management.
The platform features benefits that include:
- End-to-end tooling to set up and manage the machine learning lifecycle
- The ability to accelerate and scale machine learning projects
- Automated retraining, maintenance, and retirement of machine learning artifacts
- Agile, data-driven applications with enterprise visibility and governance
Stanley took another approach by introducing the concept of data quality (DQ) fundamentals with AI. AI can be useful for DQ, particularly with unstructured or more complex data, bringing competitive advantage.
Using AI (machine reasoning and machine learning), more efficient methods for identification, extraction, and normalization have been developed. AI applied to clean data enables pattern recognition, discovery, and intelligent action.
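As an illustration of the identification-and-normalization step, the following sketch flags likely duplicate records that exact string comparison would miss. It uses simple rule-based normalization plus fuzzy similarity from Python's standard library; the customer names and the 0.85 threshold are purely illustrative, not taken from any vendor's product.

```python
# Hypothetical sketch of AI-assisted data-quality matching: normalize
# records with simple rules, then score pairs with fuzzy similarity.
from difflib import SequenceMatcher

def normalize(name: str) -> str:
    """Rule-based cleanup applied before matching: lowercase,
    strip commas, collapse whitespace."""
    return " ".join(name.lower().replace(",", " ").split())

def likely_duplicates(records, threshold=0.85):
    """Return (record_a, record_b, score) for pairs whose normalized
    similarity meets the threshold."""
    cleaned = [normalize(r) for r in records]
    pairs = []
    for i in range(len(cleaned)):
        for j in range(i + 1, len(cleaned)):
            score = SequenceMatcher(None, cleaned[i], cleaned[j]).ratio()
            if score >= threshold:
                pairs.append((records[i], records[j], round(score, 2)))
    return pairs

customers = ["Melissa Informatics, Inc.", "melissa informatics inc", "Appen Ltd."]
print(likely_duplicates(customers))
```

Exact comparison would treat the first two records as distinct customers; normalization plus a similarity score surfaces them as one entity, which is the kind of data-quality win Stanley describes.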
Machine reasoning (MR) relies on knowledge captured and applied within “ontologies” using graph database technologies, most formally using SDBs, he explained.
Machine reasoning can make sense of incomplete or noisy data, making it possible to answer difficult questions. MR delivers highly confident decision making by applying existing knowledge and ontology-enabled logic to data, Stanley noted.
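The idea that an ontology lets a system answer questions the raw data never states directly can be sketched in a few lines. This is a toy model, not SAP's or Melissa's implementation: a transitive "is_a" hierarchy (the kind of relationship an ontology encodes) and a walker that infers every superclass of a term. The drug-class names are illustrative.

```python
# Minimal sketch of machine reasoning over a toy ontology: each entry
# asserts one "is_a" fact, and inference chains them transitively.
is_a = {
    "aspirin": "nsaid",                    # asserted fact
    "nsaid": "anti_inflammatory",          # asserted fact
    "anti_inflammatory": "drug",           # asserted fact
}

def infer_ancestors(term):
    """Walk the is_a chain to derive every superclass of a term."""
    ancestors = []
    while term in is_a:
        term = is_a[term]
        ancestors.append(term)
    return ancestors

# No record ever says "aspirin is a drug", yet reasoning derives it:
print(infer_ancestors("aspirin"))  # ['nsaid', 'anti_inflammatory', 'drug']
```

Production systems express these hierarchies in ontology languages such as OWL and run far richer logic over graph stores, but the principle is the same: existing knowledge, applied to data, yields answers the data alone cannot.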
An archived on-demand replay of this webinar is available here.
First published at Database Trends and Applications