Your Life Is A Graph, Look At It That Way

With online learning becoming the norm, many teachers in the U.S. use an instructional content platform called Newsela to assign reading to their students. The platform delivers news articles, historical speeches and scientific papers rewritten for various reading levels together with related material. Teachers can filter for content that meets particular state or national educational standards. 

Newsela has been a boon to teachers forced to teach online, and it’s a great example of how technology can improve remote learning. But it’s also an example of a new way that data is organized and delivered to users.

Newsela uses a “graph database” to speed the delivery of content while making it easier for the company’s developers to create new features. 

“We decided that we should be running this on a graph database because it is a graph of content,” said Nelson Pecora, lead developer on Newsela’s content management system team. 

For centuries, people have organized data in columns and rows, from clay tablets to green-paper ledger books and eventually to spreadsheet programs such as Microsoft Excel. That led to the development of relational databases, in which data is stored in tables. Think of troops marching in formation.

But real life is rarely so regimented, and as the Internet age took hold, there was a surge in the kinds of data stored and used by applications.

Developers began using graphs to model the way different kinds of data are related to each other. Consider a soldier marching in formation; let’s say he has friends several rows ahead and several rows behind, he has brothers and sisters in other regiments and parents in the viewing stands. His battalion’s formation of columns and rows captures none of that. But a series of points with lines between individuals in the formation and beyond will. That is a graph.
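For readers who think in code, here is a minimal sketch of that idea in Python. The names and relationships are invented for illustration; the point is only that each person is a point and each relationship is a labeled line between two points.

```python
from collections import defaultdict

# Hypothetical people and relationships, made up for this example.
# Each entry is one "line" in the graph: (person, kind of relationship, other person).
edges = [
    ("soldier", "friend", "Adams"),
    ("soldier", "friend", "Baker"),
    ("soldier", "sibling", "Clara"),
    ("soldier", "parent", "Dora"),
]

# Store every line under both of its endpoints so it can be followed either way.
graph = defaultdict(list)
for source, relation, target in edges:
    graph[source].append((relation, target))
    graph[target].append((relation, source))


def related(person, relation):
    """Return everyone connected to `person` by the given kind of line."""
    return sorted(name for rel, name in graph[person] if rel == relation)


print(related("soldier", "friend"))  # ['Adams', 'Baker']
```

Nothing here depends on where anyone stands in the formation. The relationships are stored directly, which is what the rows and columns leave out.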

In the early days of computerization, applications lived inside organizations and were primarily about blocks of financial data or records of transactions. But thanks to the Internet and smartphones, applications are now everywhere and touch everything from people to products. Because apps have become more complex, and the interactions and relationships among data have become so important, developers began looking at graphs as a better way to model data.

Researchers built prototypes of graph databases in the 1970s but they remained primarily in the halls of academia until the rise of social media, which is built around the relationships between users. Retrieving data in graph form is complicated, so Facebook created an open-source data-query language, GraphQL, specifically for that purpose. 

For years, developers have used GraphQL to build a layer on top of relational databases that fetches data and puts it into a graph format that is used to populate applications. For online publishers, that could be articles with related images, videos, infographics and audio files as well as recommendations to other content. 

Think of GraphQL as a commander who can look up where a particular soldier is in the formation, then look up where each of his friends and siblings is in that formation and where his parents are in the viewing stands, and send orders for each of them to report to him at headquarters.

Newsela, too, began using GraphQL on top of a relational database. But like most organizations that take that route, it ran into performance and production issues because relational systems are not designed for graphs. It takes many commands – lines of code – to call all of the related data.
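A rough sketch of why the lookups pile up, using Python's built-in sqlite3 module and a made-up two-table layout (this is not Newsela's schema). One table holds the people, another holds the relationships, and assembling one soldier with his connections means going back to the database repeatedly.

```python
import sqlite3

# An in-memory relational database with hypothetical tables and rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, position TEXT);
CREATE TABLE relationship (person_id INTEGER, related_id INTEGER, kind TEXT);
INSERT INTO person VALUES
    (1, 'Soldier', 'row 5'),
    (2, 'Adams', 'row 2'),
    (3, 'Clara', 'other regiment'),
    (4, 'Dora', 'viewing stands');
INSERT INTO relationship VALUES
    (1, 2, 'friend'), (1, 3, 'sibling'), (1, 4, 'parent');
""")


def fetch_with_relations(name):
    """Collect a person and everyone linked to them, counting the queries used."""
    queries = 0

    # Query 1: find the person in the table.
    person_id, position = conn.execute(
        "SELECT id, position FROM person WHERE name = ?", (name,)
    ).fetchone()
    queries += 1

    # Query 2: find which rows they are related to.
    links = conn.execute(
        "SELECT related_id, kind FROM relationship "
        "WHERE person_id = ? ORDER BY related_id",
        (person_id,),
    ).fetchall()
    queries += 1

    # One more query per related person to look up where they are.
    relations = []
    for related_id, kind in links:
        related_name, related_position = conn.execute(
            "SELECT name, position FROM person WHERE id = ?", (related_id,)
        ).fetchone()
        queries += 1
        relations.append((kind, related_name, related_position))

    return position, relations, queries


position, relations, queries = fetch_with_relations("Soldier")
print(queries)  # 5
```

A SQL join can fold some of these lookups into a single statement, but every additional level of relationship (friends of friends, for instance) means another join or another round trip.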

That’s when Newsela discovered Dgraph, a graph database that stores information in a graph structure so that there is no need for multiple database queries or to reconfigure data. Facebook, Airbnb, Dropbox and Pinterest, to name a few companies, use graph databases.  

“It made our internal workflow much faster because we eliminated most of the code on our server, and queries go directly to and from the database,” said Mr. Pecora. “It also made it a lot more flexible.” 

Dgraph, founded by a former Google employee, Manish Jain, is the only native graph database on the market built from scratch around GraphQL.

“Instead of having to build a relational database and then write some layer that translates between the relational database and the graph front end, you just get a graph to start with,” said Mr. Jain. 

Now imagine that, instead of marching in formation, all those soldiers were linked by colored strings: red to their friends and family, blue to their regimental comrades. To call up an individual soldier with his family and friends, all the GraphQL commander needs to do is pull out that soldier and everyone connected to him by a red string. He only has to look up where the soldier is and send one command.

Mooncamp, a small startup in Europe that builds goal-setting applications for companies to define and track objectives, was able to cut its code by 60% to 80% while getting the performance it wanted, Mr. Jain said. At the other end of the scale, he added, many Fortune 500 companies are using Dgraph as their primary database.

“You can easily add new content to the graph, like articles, tags, videos, images, infographics, and for us, quizzes and educational standards,” said Mr. Pecora.

Each state and some regions have different educational standards for what they want kids to learn for each subject at each grade level. Students in third-grade social studies, for example, should learn about the Emancipation Proclamation according to most standards.

“We connect the content to different standards and we connect standards to other standards,” said Mr. Pecora. 
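To make that concrete, here is a small Python sketch of content linked to standards and standards linked to other standards. The identifiers, and the assumption that a state standard points to a national one, are hypothetical and are not taken from Newsela's system.

```python
from collections import deque

# Hypothetical links from one standard to the standards it maps to.
standard_links = {
    "state-A.ss.3.1": ["national.ss.3.4"],
    "state-B.ss.3.2": ["national.ss.3.4"],
    "national.ss.3.4": [],
}

# Hypothetical links from each piece of content to the standards it meets.
content_links = {
    "article-emancipation": ["national.ss.3.4"],
    "quiz-emancipation": ["state-A.ss.3.1"],
    "article-volcanoes": ["national.sci.4.1"],
}


def reachable_standards(start):
    """Follow standard-to-standard links outward from `start` (breadth-first)."""
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for linked in standard_links.get(current, []):
            if linked not in seen:
                seen.add(linked)
                queue.append(linked)
    return seen


def content_for(standard):
    """Return content tied to `standard` or to any standard it links to."""
    wanted = reachable_standards(standard)
    return sorted(
        item for item, standards in content_links.items()
        if wanted.intersection(standards)
    )


print(content_for("state-A.ss.3.1"))
# ['article-emancipation', 'quiz-emancipation']
```

A teacher filtering by the state standard also gets the article tagged only with the national one, because the two standards are connected in the graph.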

Teachers can quickly filter content by the standard they are following. They can change reading levels, group things together, make assignments and create quizzes to help evaluate how students are doing.

“Basically, if your data is a graph,” Mr. Pecora said, “it’s useful to model it in your database as a graph.”

First published on Forbes
