Gamers obsess over it. Game developers' pay is based on it. But is the review aggregator site still relevant in the changing world of video games?
Marc Doyle was headed off for a weekend getaway in Santa Barbara with some of his old college buddies when he received the news that the Washington Post's critic did not enjoy the video game Uncharted 4: A Thief's End. The critic thought that the graphics were garish, the gameplay was a "laborious trudge," and the overall experience was an "inconclusive wreck." Doyle knew in an instant that his weekend was shot. This critical pan was sure to have an effect on the game's Metascore.
Doyle is the co-founder and editor-in-chief of the review aggregator Metacritic, and he personally oversees the video game section of the CBS Interactive-owned site. He selects the game outlets that Metacritic tracks, collates their reviews, converts their review scores to Metacritic's 0 to 100 scale, and uses them to calculate a single Metascore for each game.
The Washington Post doesn't print numerical scores alongside its reviews. But the publication considers Metacritic to be so important that the reviewers directly relay what their score would have been to Doyle. (Other big outlets like the New York Times have made similar arrangements in the past.) The WaPo critic awarded Uncharted 4 a score of 40. "As soon as I saw that, I knew exactly what was gonna happen," says Doyle. "I knew what the audience reaction would be."
According to Metacritic, 111 different outlets gave Uncharted 4 a positive review. Its overall Metascore was 94 out of 100, the highest score any game had received all year. But that one negative review from the Washington Post was enough to pull the game's Metascore average down from 94 out of 100 to 93 out of 100.
In the grand scheme of things, a single point may not seem all that important. But fans of the game were outraged. So were some people who had never even played it. Uncharted 4 is exclusive to Sony's PlayStation 4 console, and many who sided with Sony in the never-ending console wars see the success of a flagship game as a sort of victory for the home team, a chance to lord it over their friends who opted for the rival Xbox One console. Uncharted 4 losing a point was like their NFL team dropping in the playoff rankings.
Doyle spent the weekend monitoring the uproar on his phone, firing off emails, and missing out on fun with his college buddies. He knew that his audience could be passionate – the game section of Metacritic drives far more traffic than the sections providing Metascores for movies or TV shows or albums. But complaints weren't just coming from a few very angry but isolated online voices. A Change.org petition was created demanding that Metacritic remove the Washington Post review from Uncharted 4's Metascore. The petition racked up almost 10,000 signatures.
Doyle held firm that the review score would stand. "The folks from the Washington Post were apologizing to me for the trouble this was causing, and I'm like, 'Guys, don't apologize!' This is what we do," says Doyle.
The hue and cry around that score is the best evidence that Metacritic deeply matters to many people. And not just fans – the bonus payments that game makers receive from their publishing companies are often tied to the Metascore, and those same publishers spend a great deal of time and effort trying to predict the number, as it could affect everything from retail orders to returns.
But the games business, and games themselves, are changing. In many ways, a snapshot of what the critical consensus is at the time of launch does not reflect the ultimate nature of a game. Is Metacritic still relevant in this new climate?
MAKING THE BEST USE OF YOUR TIME AND MONEY
Metacritic was originally formed in 1999 by three classmates at USC. Jason Dietz had the initial idea, and Marc Doyle and his sister Julie Doyle Roberts joined him in creating the site, selecting outlets for inclusion, and tabulating review scores. The mission was straightforward. "Metacritic tries to educate the consumer about how to make best use of their time and money," says Doyle.
When critics didn't include numerical scores with a review, Metacritic staffers would guesstimate one based on the tone and content, as well as their knowledge of that reviewer's sensibilities. This method was generally effective for movies and albums, though the critics would occasionally write in to quibble. "Joe Morgenstern from the Wall Street Journal told us, 'You have me giving Pearl Harbor a 40 out of 100; it actually should be more like a 10 out of 100,'" says Doyle.
Guesstimating was fine for movies and albums, but Metacritic staff didn't do that when it came to games. Game reviews have traditionally been more likely to include some sort of numerical quantification. Gamers seemed to demand more precision.
In many ways, that's understandable. A game is a work of art, but it's also a piece of software. You don't have to download a patch to access the last cut on a music CD. You don't need to consult a strategy guide to make it through the difficult middle section of a movie. You don't lose all of your progress in a TV series due to a faulty save system. "Games are expensive, and they're a heavy time commitment," says Doyle. It's not uncommon to purchase a game for $60, then give up on it after an hour or two because it's uninteresting or unplayable. All of which is why it's reasonable for game reviews to try to carefully quantify the general experience that players are likely to have.
"People are like, 'Hey, should I get this Walking Dead game?'" says Doyle. "Well, if it's Walking Dead: A Telltale Game Series, that has a score in the 90s on Metacritic. But if it's Walking Dead: Survival Instinct, that one has a score in the 30s."
In addition to boiling sometimes hundreds of critical opinions down to a single number, Metacritic employs a handy color-coded stoplight system. A green light is assigned to any game with a Metascore of 75 or higher. A yellow light is given to games that score 50 to 74. And a red light is given to anything lower than 50. (This is far more exacting than the Metascores for film, TV, and music, which get a green light if they're 60 or higher and a yellow light on down to 40.) A site that offers a quick way to gauge critical consensus can certainly be useful. But Metacritic has increasingly come to be seen as an absolute meter of quality. And it's not just gamers who treat it that way.
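The stoplight cut-offs described above can be written down as a small lookup. This is only an illustrative sketch: the function name `stoplight` and its `is_game` flag are made up for the example, and the thresholds are simply the ones quoted in this article, not anything taken from Metacritic's own code.

```python
def stoplight(metascore: int, is_game: bool = True) -> str:
    """Return 'green', 'yellow', or 'red' for a 0-100 Metascore.

    Hypothetical helper; thresholds are the ones quoted in the article.
    """
    if not 0 <= metascore <= 100:
        raise ValueError("Metascore must be between 0 and 100")

    # Games are graded more strictly than film, TV, and music.
    green_floor, yellow_floor = (75, 50) if is_game else (60, 40)

    if metascore >= green_floor:
        return "green"
    if metascore >= yellow_floor:
        return "yellow"
    return "red"


if __name__ == "__main__":
    # A 70 is a yellow light for a game but a green light for a film.
    print(stoplight(70), stoplight(70, is_game=False))
```

Under these cut-offs the same number can land in different bands depending on the medium, which is the stricter treatment of games that the paragraph above describes.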
Game industry analysts often make note of Metacritic scores in their financial reports, and some claim that scores can even move the stock price of companies like Activision and Take-Two. And then there are the bonus payments that those publishers give to game developers. Chris Avellone of Obsidian Entertainment claims that his team did not receive a bonus (rumored to be a million dollars) for completing the 2010 game Fallout: New Vegas because the game failed to clear the required review score threshold on Metacritic. He claims it missed the cut-off by a single Metascore point. "That's a real thing, and it still happens all over," says Kevin Dent, CEO at games business management consultant firm Tiswaz Entertainment and a longtime executive and investor in the industry. "You get bonuses based on sales, and you get bonuses based on Metacritic scores. An executive producer might get a bonus of around $100K, and a regular programmer might get something like $15K, enough for a car."
"I meet people who work for game companies, and they explain to me how important a Metascore is to their lives, their jobs," says Doyle. "If the general idea is that publishers are concerned with quality and not just sales, then I guess that's a good thing. But I have no idea how they're implementing that. I'm not involved in that at all."
"Bonuses based on sales I understand," says Dent. "But I never understood the ones based on Metacritic. People in the industry view Metacritic with fear, because it absolutely affects their bottom line. Simple as that."
THE SECRET FORMULA IS CREDIBILITY AND REPUTATION
Doyle doesn't consider himself an industry insider, but he monitors coverage closely. "I subscribe to 30 mags from America and the UK and around the world," he says. "At any given time, I'm tracking reviews in 140 to 150 outlets."
Some of those outlets seem obscure to your average gamer. "I've been in the industry for a long time, and even I had never heard of some of them," says Dent. "I looked at the reviews for Overwatch, and I had to look up all of these really obscure outlets on there to find out what they are."
If you look at the 62 reviews Metacritic collects for the PC version of Overwatch, you see a lot of popular general outlets, and some that are more focused, PC-specific outlets. But there's also the Greek site Ragequit.gr. There's IGN, but also IGN Sweden. There's Eurogamer Poland. There's Level from the Czech Republic. There's Game World Navigator magazine, or Навигатор игрового мира, as it's known in its home country of Russia.
Doyle doesn't believe that American or English-speaking outlets have a monopoly on insightful reviews. Every year, he uses the lull after the big holiday releases to evaluate new outlets for possible inclusion. He says he distributes a lengthy questionnaire to those seeking to be tracked by Metacritic asking about their operation, their editor's background, the composition of their writing staff, statistics about their reviews, their scoring scale and scoring philosophy. He asks foreign language publications for translations of their reviews, and consults with the network of international critics he's built up over the years to assess the reputations of each new outlet. "I'm looking for a reputation for credibility and independent thinking, as well as internal scoring integrity," says Doyle.
Credibility and reputation are key to Metacritic's secret formula, the proprietary system that prevents it from being a simple, straightforward average of scores. In addition to vetting outlets for inclusion, Doyle and his colleagues determine a weighting for each outlet, which means that a 70 from one deemed to be particularly trustworthy is worth more than a 70 from another. The company is up front about every aspect of its scoring system except this one. "The weighting is the one element of our system that isn't transparent," Doyle says. "That's our secret sauce."
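To see why weighting matters, it helps to compare a plain average with a weighted one. The sketch below is purely illustrative: Metacritic's real weights are not public, so the weights here are invented, and the function names `simple_average` and `weighted_average` are hypothetical.

```python
def simple_average(scores):
    """Plain arithmetic mean of a list of 0-100 review scores."""
    return sum(scores) / len(scores)


def weighted_average(scored_reviews):
    """Weighted mean of (score, weight) pairs.

    The weights are made-up illustration values; the real ones are secret.
    """
    total_weight = sum(weight for _, weight in scored_reviews)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(score * weight for score, weight in scored_reviews) / total_weight


if __name__ == "__main__":
    # One heavily weighted outlet gives a 70; two others give a 90.
    reviews = [(70, 1.5), (90, 1.0), (90, 1.0)]  # (score, assumed weight)
    plain = simple_average([score for score, _ in reviews])
    weighted = weighted_average(reviews)
    print(round(plain), round(weighted))
```

With these invented numbers, the lower score from the more heavily weighted outlet pulls the weighted result about two points below the plain average. The gap shrinks as more reviews are added, since any single outlet's share of the total weight gets smaller.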
Doyle downplays the effect of the weighting, noting that Metacritic's sister site Game Rankings, which calculates a straightforward average of scores from a variety of English-language outlets, often produces results that are very similar to the Metascore. "It's especially true for the biggest games which tend to have many more review scores calculated into their averages," he says.
Still, game publishers care deeply about it. They are desperate to determine which outlets have the biggest effect on Metascores. Rumor has it that some feel that they have successfully reverse engineered the scoring.
"I happen to know that we did have significant weighting, and we did come under significant pressure," says Oli Welsh, editor of the UK-based Eurogamer. "Eurogamer often scored games a bit lower than other outlets, and because of weighting, we could bring the Metascore down a point or two."
He's using the past tense because in late 2014, Eurogamer decided to drop review scores. (Meanwhile, Eurogamer Poland and Eurogamer Spain and Eurogamer Italy, which have separate editorial teams, still appear on Metacritic. Confusing, I know.)
Welsh says that part of the reason that his outlet opted out was precisely the impact that Metascores were having. "The people behind Metacritic are good guys, and they set up their website to be useful," says Welsh. "But it's having a negative effect on the industry, on publishers as well as players. I don't think it's Metacritic's fault that it's become an unhealthy influence, but... it's become an unhealthy influence."
Other outlets are also moving away from scored reviews. "For a while we used a Yes/No system, but we even got rid of that last year," says Patricia Hernandez, deputy editor of the site Kotaku. "I understand the appeal of review scores – numbers are easily digestible. But that itself says everything you need to know about how review scores can flatten criticism."
CAPTURING CONSENSUS AT THE MOMENT OF RELEASE
Welsh says that the outsized influence of Metacritic was just one factor. "We felt games had become too complex, and too mutable," he says. "Our reviews used to be concentrated in a single article, but now it's more likely to be spread out across several articles over the course of several weeks."
There are other challenges to Metacritic's relevance. The review aggregator site is all about capturing the critical consensus at the moment of release, when people are poised to make a purchasing decision. But several high-profile game companies no longer send out review codes in advance of a game's release. Since it can take scores of hours to play through games, that means that reviews aren't available for days or even weeks after release.
And as online features become more common and more important to the play experience, it becomes increasingly difficult to judge a game before it has launched. "Look at Destiny," says Welsh, referring to Activision's $500 million sci-fi epic. "A reviewer can't make sense of it until it makes contact with a large number of players. Many outlets came a cropper on it because they approached it thinking of it as a new FPS by the makers of Halo, and they bounced off of it because the story's rubbish. But there's so much more to it when you get into the online experience."
"The industry has gone from being static – here's your game, come back for sequel – to here's your game, we're going to support it forever," says Dent. "Back in the day, Sony and Microsoft would charge you something like $85K to release a patch for your game. But now, look at Minecraft. Look at Overwatch. They get constant updates. Look at H1Z1 – it's a prototype, it's still broken as hell, but it's already really good, it's got this growing community, it's edging closer to being finished every month."
Welsh says that he talked to Doyle about Eurogamer's decision to drop scores, and Doyle sought to accommodate the outlet so that it could stay on Metacritic. "He offered to interpret our reviews and attribute a score himself," says Welsh. "But that process seemed fraught with difficulty. It was a good conversation, and I was really impressed by the extent to which he took his job seriously. But I respectfully asked him to delist us."
YOUTUBERS AND THE POWER OF REVIEWS
The power and authority of game reviews themselves may have begun to wane. At many outlets, reviews have become more personal and essayistic, less about striving for an unachievable objectivity. Welsh recalls the days he used to write for the esteemed British video game magazine Edge, whose un-bylined reviews were so influential that a rare perfect 10/10 review score could send shockwaves through the industry. "Much as I enjoyed being part of that tradition at Edge, I definitely prefer critics with their own taste and their own foibles," says Welsh. "Traditional text-based reviews may not be that voice from on high they used to be, but that's not necessarily a bad thing."
"Readers increasingly care about the opinions of specific personalities and voices," says Hernandez. In fact, many of the most popular arbiters of taste in gaming these days are streamers, who give their critical judgments in the form of a running color commentary that plays out over video footage of them playing through a game. It's hard for Metacritic to capture these sorts of critical judgments and increasingly hard for it to compete.
"YouTubers and streamers have become far more relevant," says Dent. "My son doesn't read reviews, he watches them. He goes and watches videos by TotalBiscuit or Boogie. I talked to an executive with a major game platform, and he said that he could care less about what written reviews say. He says that the top 10 or 20 YouTubers and streamers give his games far more exposure than all of the text reviews. These days, text reviews are primarily useful for the blurbs that go out in marketing materials and on the box of Game of the Year editions of their titles."
Though the authority of traditional game reviews may appear to be in flux, they still shape the industry at a fundamental – though largely unseen – level. Most big publishers have literally internalized the game review process. While a game is still in development, they hire outside consultants, many of whom are former game reviewers, to look at their games and assess what score the game would garner in its current state. "Remember when Gamergate thought that publishers were buying off reviewers?" asks Dent. "Every PR person I talked to was like, 'I fucking wish.' But in a funny way, they are buying reviewers. They were paying them two or three times what they'd make from a journalism outlet to write mock reviews for internal use. The reviewers will never be able to make their opinions public because it's under NDA, but they'll be able to, y'know, pay down their college loans."
Sources who preferred not to be quoted by name told me that this is now a commonplace practice across the industry. For many AAA titles, it happens continuously throughout development, with many different consultants appraising a title at various stages of completion. One source told me that it's not unheard of for a company to kill a game that they've already sunk tens of millions of dollars into if the consultants predict that scores will be low – just to save themselves the cost of marketing the game and getting it onto store shelves. "That's very, very common," says Dent. "Lots and lots of games get cancelled because of mock reviews."
GAMERS ARE MORE INVESTED
But if the power of scored reviews is waning, why the outcry that still greets outlier reviews, like the Washington Post's 40 out of 100 for Uncharted 4? Doyle is mystified by it. "Music fans don't tend to care as much about reviews. Someone gave 'Purple Rain' a bad score? Who cares?" says Doyle. "Gamers are much more concerned, more invested. It's literally like they own stock in it or something."
We probably shouldn't be surprised that gamers would take reviews so seriously. Game hardware and software are expensive, and as anyone who's ever bought a car or any big ticket purchase knows, there's a tiny dopamine rush that comes with having your purchasing decisions validated, and your brand loyalties reinforced. But beyond that, a lifetime of joystick jockeying has conditioned gamers to care passionately about scores and rankings and hierarchies of achievement. Tracking Metacritic scores has become a game in itself for many, and not just in the sense of idle exercises like ranking the best titles of all time. (According to Metacritic, that would be the 1998 The Legend of Zelda: Ocarina of Time, with a Metascore of 99 out of 100.) For many gamers, Metacritic has become a sort of metagame grafted on top of the actual experience of playing games, akin to a fantasy football league in which you can root for your picks and root against your friends' picks.
You can see this play out in Metacritic's user reviews, which run alongside the critic reviews but have no effect on the overall Metascore. Fans will try to pump up the user score average for the games they support, and torpedo the user score average for rival games. "The fanboy wars got to be a big enough issue that four or five years into our existence, we had to restrict user voting to day of release and onward," says Doyle. "A PlayStation exclusive like LittleBigPlanet would get owners of the other console coming in and bombarding it with low user scores before it was even out. There's no way they could've played it."
Doyle is okay with people taking reviews – and his site – less seriously. He says he sees Metacritic as a way to keep consumers informed, but he's mystified by the idea that people would use it to dictate their every purchase. "Would I have caught the film Moonlight if it scored a 70 or a 50? Maybe not. But then, I also love all Pauly Shore; one of my favorite movies is Bio-Dome," he says. (That celluloid atrocity has a barrel-scraping score of 1 out of 100 on Metacritic.)
"When I see that some silly comedy has a low score, that's simply not important to me. I'm gonna go, I'm gonna laugh, I'm gonna enjoy it."