Analytics is generally applied to unprocessed data (visual or numeric, structured or unstructured) to uncover insights. Graph analytics (or network analysis), however, specifically covers the analysis of relationships between graph database entries via an abstraction called a graph model. These entries, or entities, can be customers, products, operations, or devices. The approach is used mostly in social network analysis, fraud detection, supply chain management, and search engine optimization. According to a recent market report, the graph analytics market was valued at US$600 million in 2019 and is forecast to grow to US$2.5 billion by 2024, at a CAGR of 34 percent during the forecast period. The main reason behind its increasing popularity is its ability to incorporate new data sources and mine new relationships among those data sets with ease.
Anatomy of a Graph
A graph is made up of nodes (also known as vertices) and edges. Nodes denote points or entities in the graph data, while edges represent relationships or lines of communication between nodes. An edge can have a direction (one-way, bidirectional, or undirected) and a weight that represents the strength of the relationship. In real-world scenarios, nodes can be people such as customers and employees, places such as retail stores and airports, or things such as assets, grids, bank accounts, and URLs. Edges can stand for likes and dislikes, emails, payments, phone calls, and much more. The ancillary information or attributes that describe a node are called properties.
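To make the terms above concrete, here is a minimal sketch of a property graph in plain Python. The class names (Node, Edge, Graph), the field names, and the sample data are all hypothetical and chosen only for illustration; real graph databases expose their own models and query languages.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    # Properties: the ancillary attributes that describe a node.
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    source: str
    target: str
    label: str              # kind of relationship, e.g. "payment"
    weight: float = 1.0     # strength of the relationship
    directed: bool = True   # False means the edge can be followed both ways


class Graph:
    def __init__(self):
        self.nodes = {}     # node_id -> Node
        self.edges = []     # list of Edge

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = Node(node_id, properties)

    def add_edge(self, source, target, label, weight=1.0, directed=True):
        self.edges.append(Edge(source, target, label, weight, directed))

    def neighbors(self, node_id):
        """Return (neighbor, label, weight) for every edge leaving node_id."""
        result = []
        for e in self.edges:
            if e.source == node_id:
                result.append((e.target, e.label, e.weight))
            elif not e.directed and e.target == node_id:
                result.append((e.source, e.label, e.weight))
        return result


# Made-up sample data: two customers and one bank account.
g = Graph()
g.add_node("alice", kind="customer", city="Pune")
g.add_node("bob", kind="customer")
g.add_node("acct-1", kind="bank account")
g.add_edge("alice", "acct-1", "owns")
g.add_edge("alice", "bob", "emails", weight=12, directed=False)
g.add_edge("bob", "acct-1", "payment", weight=250.0)

print(g.neighbors("bob"))
```

Storing edges as a flat list keeps the sketch short; a production system would index edges by node so that neighbor lookups do not scan every edge.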
Types of Graph Analytics
Graph analytics falls into four broad types:
1. Path Analysis: Examines the relationships between nodes in a graph, for example by determining the shortest path between two nodes.
2. Connectivity Analysis: Outlines how strongly or weakly two nodes are connected, which makes it possible to compare connectivity across networks. It also shows how many edges flow into a node and how many flow out of it.
3. Centrality Analysis: Estimates how important a node is to the connectivity of the network. It can rank the most influential people in a social network or identify the most frequently accessed web pages.
4. Community Analysis: A distance- and density-based analysis of relationships, used to find groups of people who frequently interact with each other in a social network. It can also identify whether individuals are transient and help predict whether the network will grow.
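The four types above can be sketched on a toy graph using only the Python standard library. Everything here is an illustrative assumption: the "follows" data is made up, path analysis is shown as a breadth-first search for a fewest-hops path, centrality as simple degree centrality, and community analysis is replaced by connected components, which is a much cruder grouping than the distance- and density-based methods real tools use.

```python
from collections import deque

# Hypothetical directed graph: each key "follows" the names in its list.
follows = {
    "ana": ["ben", "cy"],
    "ben": ["cy"],
    "cy": ["dee"],
    "dee": [],
    "eli": ["fay"],
    "fay": ["eli"],
}


def shortest_path(graph, start, goal):
    """Path analysis: breadth-first search for a fewest-hops path."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in graph[node]:
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None  # goal is not reachable from start


def degrees(graph):
    """Connectivity analysis: edges flowing into and out of each node."""
    out_deg = {node: len(targets) for node, targets in graph.items()}
    in_deg = {node: 0 for node in graph}
    for targets in graph.values():
        for target in targets:
            in_deg[target] += 1
    return in_deg, out_deg


def degree_centrality(graph):
    """Centrality analysis: (in-degree + out-degree) / (n - 1)."""
    in_deg, out_deg = degrees(graph)
    scale = max(len(graph) - 1, 1)
    return {node: (in_deg[node] + out_deg[node]) / scale for node in graph}


def components(graph):
    """Stand-in for community analysis: groups linked when direction is ignored."""
    undirected = {node: set() for node in graph}
    for node, targets in graph.items():
        for target in targets:
            undirected[node].add(target)
            undirected[target].add(node)
    seen, groups = set(), []
    for start in graph:
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node not in group:
                group.add(node)
                stack.extend(undirected[node] - group)
        seen |= group
        groups.append(group)
    return groups


in_deg, out_deg = degrees(follows)
centrality = degree_centrality(follows)
print(shortest_path(follows, "ana", "dee"))
print(in_deg["cy"], out_deg["cy"])
print(max(centrality, key=centrality.get))
print(components(follows))
```

In practice a graph library or graph database would supply weighted shortest paths, richer centrality measures, and proper community detection; the sketch only shows what question each type of analysis answers.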
How Is It Different from Relational Analytics?
In general, relational analytics investigates relationships by comparing one-to-one or, at most, one-to-many relations, whereas graph analytics can also compare many-to-many relations naturally. This helps identify paths that relational analytics would miss. Moreover, relational analytics works best on structured, unchanging data stored in tables and columns, while graph analytics is well suited to interpreting unstructured, fluctuating, and variable data sets. It provides information and context about relationships in a network, along with deeper insights that can improve the accuracy of predictions and decision-making.
Graph analytics saves time because it requires less effort to organize data and to merge additional data sources or data points, which also makes it easier to work with. It supports modeling, storing, and retrieving the data it analyzes. Further, graphs are visually appealing and easier to understand than most other data analytics tools and models. Graph analytics can also find indirect relations and express large, complex data sets cohesively.
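As a rough illustration of many-to-many and indirect relations, the sketch below links customers through the products they have in common and then follows those links for more than one hop. The purchase records, the helper name related_customers, and the max_hops parameter are all hypothetical.

```python
from collections import defaultdict

# Hypothetical many-to-many purchase records: (customer, product).
purchases = [
    ("c1", "laptop"), ("c1", "mouse"),
    ("c2", "mouse"), ("c2", "desk"),
    ("c3", "desk"),
    ("c4", "lamp"),
]

# Index the relation in both directions so it can be traversed either way.
customers_of = defaultdict(set)   # product -> customers who bought it
products_of = defaultdict(set)    # customer -> products they bought
for customer, product in purchases:
    customers_of[product].add(customer)
    products_of[customer].add(product)


def related_customers(customer, max_hops=1):
    """Customers reachable through shared products in at most max_hops steps."""
    found = {customer}
    frontier = {customer}
    for _ in range(max_hops):
        next_frontier = set()
        for c in frontier:
            for p in products_of[c]:
                next_frontier |= customers_of[p]
        next_frontier -= found        # keep only newly discovered customers
        found |= next_frontier
        frontier = next_frontier
    return found - {customer}


print(related_customers("c1"))              # direct: shares a product
print(related_customers("c1", max_hops=2))  # indirect: via another customer
```

Here c1 and c3 never bought the same product, yet the two-hop traversal connects them through c2. In a table-based model, each extra hop typically means another self-join, which is one reason such indirect paths are easy to overlook.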
Given its vast possibilities, graph analytics is used in several areas.
• Journalists can use it to navigate thousands of documents and sources and extract information in a structured format. One example is the International Consortium of Investigative Journalists (ICIJ) research on the Panama Papers, which identified how leaders, influencers, celebrities, and politicians used complex sets of shell companies to hide their wealth from the public.
• Search engines like Google use a Knowledge Graph to improve their results and recommendations by studying users' search history patterns.
• In sales, it helps brands in social network analysis, by identifying influencers, decision-makers, and dissuaders who can be instrumental in driving more leads via collaborations or endorsement schemes.
• It helps identify illegal or fraudulent behavior and criminal activity such as money laundering and cybercrime. This helps banks determine whether to approve a loan for an applicant, helps organizations strengthen enterprise or institutional security, and so on.
• In healthcare and pharmaceuticals, graph analytics can help discover new, effective treatments by analyzing relationships among proteins, chemical pathways, DNA, cells, and organs, and by finding how combinations of lifestyle choices and medications influence them.
• It can also optimize logistics for the manufacturing and transportation industries by finding the fastest and safest routes and accounting for weather conditions and other factors that might affect them.
• Further, it can help optimize the configuration of system resources to balance loads and maximize system utilization. This can be done by identifying overloaded and strained resources, modeling the reallocation of traffic to reduce risk, and reconfiguring the topology to improve operations.
Some of the key players in graph analytics are Oracle’s Spatial and Graph, Apache TinkerPop’s Gremlin, Microsoft’s Graph Engine, Amazon’s Neptune, SAP’s HANA, and IBM’s Compose for JanusGraph. Other notable tools include Neo4j, FlockDB, Cayley, TigerGraph, and GraphFrames.
First published in Analytics Insight