Here come the robot reporters. This week the AP announced it will use software to automatically generate news stories about college sports that it didn’t previously cover. Specifically, it’s turning to a content generation tool called WordSmith, created by a Raleigh, North Carolina-based company called Automated Insights.
It’s the latest case of a big news organization turning to algorithms to create content. The AP, which is an investor in Automated Insights, already uses WordSmith to generate stories on corporate quarterly earnings reports. Meanwhile, automated content competitor Narrative Science provides similar services to publications such as Fortune and Big Ten Network. And a Los Angeles Times journalist used custom software to auto-generate a story minutes after an earthquake hit Los Angeles last year.
But is anyone actually reading any of this machine-generated content? Automated Insights CEO Robbie Allen says that’s the wrong question to ask. Although the company generated over one billion pieces of content in 2014 alone, most of this verbiage isn’t meant for a mass audience. Rather, WordSmith is acting as a sort of personal data scientist, sifting through reams of data that might otherwise go unanalyzed and creating custom reports that often have an audience of one.
For example, the company generates Fantasy Football game summaries for millions of Yahoo users each day during the Fantasy Football season, and it helps companies turn confusing spreadsheets into short, human-readable reports. One day you might even have your own personal robot journalist, filing daily stories just for you on your fitness tracking data and your personal finances.
“We sort of flip the traditional content creation model on its head,” Allen says. “Instead of one story with a million page views, we’ll have a million stories with one page view each.”
WordSmith essentially does two things. First, it ingests a bunch of structured data and analyzes it to find the interesting points, such as which players didn’t do as well as expected in a particular game. Then it weaves those insights into a human-readable chunk of text. You can think of it as a highly complex form of Mad Libs, one that takes an understanding of both data and writing to create.
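To make the Mad Libs comparison concrete, here is a toy sketch of that two-step idea in Python. It is not WordSmith’s actual code or API, which the company has not described here; the `PlayerLine` record, the `find_insights` and `write_recap` functions, the five-point threshold, and the sample roster are all hypothetical, invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class PlayerLine:
    """One row of structured input (hypothetical schema)."""
    name: str
    projected: float  # points the player was expected to score
    actual: float     # points the player actually scored


def find_insights(lines, threshold=5.0):
    """Step 1: pick out players who missed their projection badly.

    Returns players whose shortfall is at least `threshold` points,
    sorted with the largest shortfall first. The threshold is an
    arbitrary illustrative value.
    """
    misses = [p for p in lines if p.projected - p.actual >= threshold]
    return sorted(misses, key=lambda p: p.projected - p.actual, reverse=True)


def write_recap(team, lines):
    """Step 2: slot the findings into sentence templates."""
    total = sum(p.actual for p in lines)
    sentences = [f"{team} scored {total:.1f} points this week."]
    misses = find_insights(lines)
    if misses:
        worst = misses[0]
        gap = worst.projected - worst.actual
        sentences.append(
            f"{worst.name} was the biggest disappointment, finishing "
            f"{gap:.1f} points below a projection of {worst.projected:.1f}."
        )
    else:
        sentences.append("Every starter finished close to or above projection.")
    return " ".join(sentences)


# Made-up sample data, not real statistics.
roster = [
    PlayerLine("Player A", 18.0, 7.5),
    PlayerLine("Player B", 12.0, 14.0),
    PlayerLine("Player C", 9.0, 3.0),
]
print(write_recap("Team Example", roster))
```

Even at this scale the split is visible: the analysis step decides what is worth saying, and the template step decides how to say it. A production system would presumably need far more templates and far richer analysis to avoid sounding repetitive across a million recaps.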
Allen came up with the idea eight years ago, back when he was working as an engineer for Cisco. Allen, who has written ten books, wanted to create something new, so he decided to combine his passion for computer science, writing, and sports analysis into a company called StatSheet.
“The traditional approach of hiring a lot of writers wasn’t attractive to me,” he says. “What’s exciting about sports recaps is that 90 percent of what you do is write about the numbers.”
Soon, however, Allen realized that the idea could be applied to any quantitative data, not just sports. So the company changed its name to Automated Insights to bring its technology to a wide range of industries, including finance, health care and, of course, journalism.
Today WordSmith can only work with structured, quantitative data: the sort of thing you find in well-formatted spreadsheets and databases. Allen says there’s certainly potential for other companies to create software that can go further in automating research or writing by summarizing lengthy texts, rewriting press releases, or sifting through unstructured documents for insights. But he doubts that Automated Insights will stray from its roots in quantitative data in the foreseeable future.
Last month the company was acquired by private equity firm Vista Equity Partners, which also owns the sports data company STATS and business intelligence company TIBCO. By partnering with Vista’s other companies, Allen says, Automated Insights will have more than enough work to keep it busy. “It’s kind of a no-brainer for us,” he says. “We have so much opportunity ahead of us in structured data, why take on a space that people have struggled with for years?”
In the meantime, expect to see more stories written for a very particular audience: you, and you alone.