Switch Scores is a hobby project I created back in April 2017 (albeit under a different name). Its purpose is to catalogue and rank the Nintendo Switch games library.
It’s no small task, with over 12,000 titles on the hybrid console, and over 33,000 review scores in the 7 years since I created the site. With this volume of data, automation is key.
This post is about how I get the review scores onto the site. It’s a long post, but stick with me!
How Switch Scores started
I bought the Switch on the day of release. In the first month there weren’t many games to look at (and I was busy playing Zelda anyway!). When I finished Zelda in early April 2017, I had a proper look at the Switch eShop. I thought it might be interesting to keep a list of all the games, something that felt feasible if I started when the numbers were low. So I started a simple list of games as part of a blog.
As new games were announced as “coming soon” or released on the eShop, I had no way to tell if they were worth buying. I went looking for sites with reviews, which I kept in my bookmarks. Reviews are very subjective, so unless it was a must-buy like Zelda or Mario, I had to read several reviews to decide if a game was for me. I created a simple database where I saved the scores, calculated an average for each game, and used this to show the best games.
Even in the first year of the Switch, there were some stellar games, as you can see here: Top 100 Switch games released in 2017.
As the Switch library started to grow, it became clear that the simple layout of the eShop did not make it easy to find games worth purchasing. Unfortunately, that hasn’t changed in the 7 years since the Switch was released. With Switch Scores, I’ve tried to provide an alternative to the eShop for finding games you might enjoy.
Goals of Switch Scores
Goal 1: I want to add review data to Switch Scores, so I can generate average scores for each game and rank the entire Switch library.
Goal 2: I want to use the game ranks to aid the discoverability of Switch games, by showing the top rated games by release year and category, and also via a search function.
Problem: Getting the review data onto the site
Adding reviews meant going through the following steps for each site:
- Go to the site
- Look for any reviews I hadn’t already added
- Find the score
- Manually add the review data to Switch Scores
Early on, this manual process was feasible, as there weren’t many reviews to add: the Switch had around 300 games (see the 2017 graph here), and not many reviewers were on board yet. As the library and the number of review sites grew, it needed automating.
Step 1: Getting the reviews
A good solution to this problem is the use of RSS feeds, as these are designed to show when new content is added to a site.
For each site, I had to find a suitable RSS feed. Sites that only published Switch reviews were easy, as I could import all of their content. Other sites also published reviews for other consoles, reviews of accessories, and more general news and editorial content. That’s workable if the site uses categories and/or tags to consistently mark which posts are Switch reviews, so I can ignore the rest – and if it has a feed that includes that data.
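To make the category idea concrete, here is a minimal sketch (not the actual Switch Scores importer) that keeps only the items tagged as Switch reviews. The feed content, the category names and the `filterByCategories` function are all assumptions made up for illustration; real feeds label their posts in many different ways.

```php
<?php
// Hypothetical RSS content; real feeds vary from site to site.
$xml = <<<XML
<rss version="2.0">
  <channel>
    <title>Example Site</title>
    <item>
      <title>Super Mario Odyssey Review</title>
      <link>https://example.com/reviews/super-mario-odyssey</link>
      <category>Reviews</category>
      <category>Nintendo Switch</category>
    </item>
    <item>
      <title>New controller announced</title>
      <link>https://example.com/news/new-controller</link>
      <category>News</category>
    </item>
  </channel>
</rss>
XML;

/**
 * Return the items that carry every one of the required categories.
 * Category names are compared in lower case.
 */
function filterByCategories(SimpleXMLElement $feed, array $required): array
{
    $matches = [];
    foreach ($feed->channel->item as $item) {
        $categories = [];
        foreach ($item->category as $category) {
            $categories[] = strtolower(trim((string) $category));
        }
        // array_diff is empty when every required category is present.
        if (array_diff($required, $categories) === []) {
            $matches[] = [
                'title' => (string) $item->title,
                'url'   => (string) $item->link,
            ];
        }
    }
    return $matches;
}

$feed = simplexml_load_string($xml);
$switchReviews = filterByCategories($feed, ['reviews', 'nintendo switch']);
print_r($switchReviews);
```

This only works when a site tags its posts consistently, which is why inconsistent tagging makes a feed hard to use.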
About half of the sites provided a usable feed. For the rest, I contacted site owners to ask if they had a feed I wasn’t aware of, or if they’d be open to adjusting their content tagging to help with the import. Most were more than happy to help, although a few were unsure of how to do it. I helped where I could.
Side note: I feel it’s worth shouting out 3 sites who were far and away the best with their feeds. Nintendo Life already had a clean feed I could use with no issues; Cubed3 created a custom feed that worked perfectly for my needs; and Video Chums also made a custom feed that worked perfectly.
Nobody refused to help, but a few didn’t respond to emails, tweets and DMs, so it was difficult to find a way forward with those sites. I also had to exclude sites that don’t include scores in their reviews, as the score is integral to calculating the game rank.
Step 2: Building the importer
With a list of RSS feed links, I now needed to write a script to import content from those feeds, and store the data in the database.
Part of the challenge was dealing with the weird and wonderful formats that sites use for their feeds. I looked at every feed and split them into a few categories:
- RSS feeds using a channel > item structure – the vast majority, and common with WordPress sites
- Atom feeds using a feed > entry structure – more common on Blogger and Wix, and a few custom sites. This format was parsed as an object rather than an array, so required additional parsing before I could do anything with it
- XML using a simple structure with no channel tag – one site. Not a big deal as the format was easy to parse
- A very weird feed that used Atom but with a post tag – thankfully not too difficult to parse, but one more format I had to include
- FeedBurner – don’t get me started on this. FeedBurner normally has a link to view the raw feed, but I couldn’t get this to work for the one site that used FB; it kept directing back to the HTML version. They also did not respond to my messages. Thankfully they moved away from FB and ended up on the same RSS format as the majority of other sites!
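One way to cope with several feed formats is to convert each one into the same flat shape before doing anything else. The sketch below is an assumption about how that could look using PHP's built-in SimpleXML, not the site's real code; `normaliseFeed` and the sample feeds are hypothetical, and it only covers plain RSS 2.0 and Atom.

```php
<?php
/**
 * Convert an RSS 2.0 or Atom document into a flat list of
 * ['title' => ..., 'url' => ..., 'date' => ...] arrays.
 */
function normaliseFeed(string $xml): array
{
    $feed = simplexml_load_string($xml);
    if ($feed === false) {
        throw new InvalidArgumentException('Feed is not valid XML');
    }

    $items = [];
    if (isset($feed->channel->item)) {
        // RSS 2.0: rss > channel > item, link is element text.
        foreach ($feed->channel->item as $item) {
            $items[] = [
                'title' => trim((string) $item->title),
                'url'   => trim((string) $item->link),
                'date'  => trim((string) $item->pubDate),
            ];
        }
    } elseif (isset($feed->entry)) {
        // Atom: feed > entry, link is in the href attribute.
        foreach ($feed->entry as $entry) {
            $items[] = [
                'title' => trim((string) $entry->title),
                'url'   => trim((string) $entry->link['href']),
                'date'  => trim((string) $entry->updated),
            ];
        }
    }
    return $items;
}

// Hypothetical sample feeds for illustration only.
$rss = '<rss version="2.0"><channel><item>'
     . '<title>RiME Review</title>'
     . '<link>https://example.com/rime</link>'
     . '<pubDate>Mon, 05 Aug 2024 10:00:00 +0000</pubDate>'
     . '</item></channel></rss>';

$atom = '<feed xmlns="http://www.w3.org/2005/Atom"><entry>'
      . '<title>RiME Review</title>'
      . '<link href="https://example.com/rime"/>'
      . '<updated>2024-08-05T10:00:00Z</updated>'
      . '</entry></feed>';

$fromRss  = normaliseFeed($rss);
$fromAtom = normaliseFeed($atom);
print_r($fromRss);
print_r($fromAtom);
```

Each odd format then needs only one extra branch, and the rest of the import can work with the same three fields regardless of where they came from.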
I also had to find ways to handle sites that only had one feed for all of their content, so I needed to extract the Switch reviews from everything else. This meant using things like a URL prefix. One of the sites I did this for was also a Wix site that already had a weird feed that was difficult to parse. So, fun and games with that one…
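As an illustration of the URL prefix approach, here is a small sketch. The item list, the URLs and the `filterByUrlPrefix` function are hypothetical; the only idea taken from the post is that a site's Switch reviews can sometimes be recognised by how their URLs start.

```php
<?php
// Hypothetical feed items; the URLs are made up for illustration.
$items = [
    ['title' => 'Super Mario Odyssey Review', 'url' => 'https://example.com/switch-reviews/super-mario-odyssey'],
    ['title' => 'Best headphones of the year', 'url' => 'https://example.com/features/best-headphones'],
    ['title' => 'RiME Review', 'url' => 'https://example.com/switch-reviews/rime'],
];

/**
 * Keep only the items whose URL starts with the given prefix.
 */
function filterByUrlPrefix(array $items, string $prefix): array
{
    return array_values(array_filter(
        $items,
        fn (array $item): bool => strncmp($item['url'], $prefix, strlen($prefix)) === 0
    ));
}

$switchReviews = filterByUrlPrefix($items, 'https://example.com/switch-reviews/');
foreach ($switchReviews as $review) {
    echo $review['title'], PHP_EOL;
}
```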
Step 3: Parsing the titles
The feed import script is good for loading the data, but there are a couple more steps to get the data into a usable format.
A feed contains fields such as the post title, the post URL, the author, the post date and time, some information about categories and tags, and an excerpt or the full post content.
To create a review at Switch Scores, I can use the URL and the date. However, I need to link the review to a game. Every game in the database has a unique id, but reviewers will not be able to provide this. I need to look at the post title and determine which game to link it to.
The easiest way would be if every review title contained the name of the game and nothing else. So for Super Mario Odyssey, the post title would literally be “Super Mario Odyssey”. Easy.
However, different sites use different title formats. Here are some of the formats I have for reviews of Super Mario Odyssey:
- Super Mario Odyssey
- Super Mario Odyssey Review
- Super Mario Odyssey – Review
- Super Mario Odyssey Review (Switch)
- Super Mario Odyssey (Nintendo Switch) Review
- Super Mario Odyssey Review: Mario’s Most Madcap Adventure Yet
- Review: Mario Odyssey (Nintendo Switch) Spoiler Free Edition
- Review: Super Mario Odyssey (Nintendo Switch)
- [Review] Super Mario Odyssey – Nintendo Switch
My first attempt was to strip out words such as Review, Switch, and Nintendo Switch. But due to the use of symbols and the many different styles used, this didn’t work in every case.
In this example I could just search for the title of the game, as it’s pretty obvious. It would match all except the one without “Super” in the title. But this is error-prone, and would not work for all games. Searching for “RiME” would also match titles containing Grime, Crime, or (Metroid) Prime, as well as “Rimelands: Hammer of Thor”. Sequels run into issues too: a review titled “SteamWorld Dig” would partially match the game “SteamWorld Dig 2”.
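The problem with substring matching is easy to reproduce. In this sketch the game list and both function names are hypothetical; it simply shows that searching a review title for game names returns more than one candidate, while comparing an already-extracted title for equality does not.

```php
<?php
// Hypothetical slice of a games table.
$games = ['RiME', 'Grime', 'Rimelands: Hammer of Thor', 'SteamWorld Dig', 'SteamWorld Dig 2'];

/**
 * Naive approach: every game whose name appears anywhere in the review title.
 */
function gamesFoundIn(string $reviewTitle, array $games): array
{
    return array_values(array_filter(
        $games,
        fn (string $game): bool => stripos($reviewTitle, $game) !== false
    ));
}

/**
 * Stricter approach: the extracted title must equal the game name
 * (ignoring case and surrounding whitespace).
 */
function exactMatch(string $extractedTitle, array $games): ?string
{
    foreach ($games as $game) {
        if (strcasecmp(trim($extractedTitle), $game) === 0) {
            return $game;
        }
    }
    return null;
}

$grimeCandidates = gamesFoundIn('Grime Review (Switch)', $games);
$digCandidates   = gamesFoundIn('SteamWorld Dig 2 Review', $games);
$exact           = exactMatch('SteamWorld Dig 2', $games);

print_r($grimeCandidates); // both RiME and Grime are candidates
print_r($digCandidates);   // both SteamWorld Dig games are candidates
var_dump($exact);
```

The exact comparison only works once the game title has been isolated from the rest of the post title.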
The most reliable way I found to do this was to store an expected title format for each site, and be mindful that it may change over time. It may seem like a pointless exercise, as sites can easily change their title formats whenever they like. However, most sites do stick to a similar format for all of their reviews. If they change it, they tend to use a new format and then stick to that for a while.
Let’s look at Nintendo Life first. Here are some recent examples:
- Review: World Of Goo 2 (Switch) – A Superb Sequel With A Few Sticking Points
- Mini Review: MARS 2120 (Switch) – A Mediocre Ode To Metroid Dread
- Review: SNK Vs. Capcom: SVC Chaos (Switch) – A Great-Looking But Painfully Average Fighter
- Review: Thank Goodness You’re Here! (Switch) – A Face-Achingly Funny British Romp
The format is pretty much always:
- Review: or Mini Review:
- Title of game
- (Switch)
- then “ – ” and a short summary
We can parse this using the following regex:
(Mini )?Review: (.*) \(Switch\) - (.*)
and we grab the variable text before the (Switch) part, which is the second capture group. This means the game title is reliably obtained from one place, and we can use this to match against games in the database.
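Here is a sketch of how a pattern like this could be applied with `preg_match`. It is an illustration rather than the site's actual parser: `parseNintendoLifeTitle` is a hypothetical name, the optional “Mini ” prefix is made non-capturing so the game title becomes the first group, and the separator accepts either a hyphen or an en dash, on the assumption that feeds and rendered pages may differ on this.

```php
<?php
/**
 * Extract the game title from a post title in the
 * "Review: <game> (Switch) - <summary>" format.
 * Returns null when the title does not fit the expected format.
 */
function parseNintendoLifeTitle(string $postTitle): ?string
{
    // The /u flag treats the pattern and subject as UTF-8, so the en dash is one character.
    $pattern = '/^(?:Mini )?Review: (.+) \(Switch\) [-–] .+$/u';
    if (preg_match($pattern, $postTitle, $matches) === 1) {
        return trim($matches[1]);
    }
    return null;
}

$titles = [
    'Review: World Of Goo 2 (Switch) – A Superb Sequel With A Few Sticking Points',
    'Mini Review: MARS 2120 (Switch) - A Mediocre Ode To Metroid Dread',
    'Review: SNK Vs. Capcom: SVC Chaos (Switch) – A Great-Looking But Painfully Average Fighter',
    'Super Mario Odyssey Review',
];

$parsed = array_map('parseNintendoLifeTitle', $titles);
var_dump($parsed);
```

Returning null for titles that don't fit means unmatched reviews can be set aside for manual matching instead of being linked to the wrong game.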
While doing this, I found a few sites that were very inconsistent with their titles. I reached out to them to ask for some consistency to help with matching. This is also one of the criteria when new sites sign up.
To any reviewers reading this: a consistent title is essential to automating the importing of your reviews! If I can’t match a review to a game, I can do it manually – but this is a daily task, and it’s not realistic to do this for the majority of reviews that come into the site.
So, with the title matched to a game (most of the time), are we done? No – there’s one more thing, and that’s the score.
Step 4: Loading the score
If only more sites were like Nintendo Life, Cubed3, and Video Chums. These sites helpfully include the score of every review in their feed. I can use this to save the score against the review, and I don’t have to do anything manually. Awesome!
A couple of sites – Nintendo World Report, and Pocket Tactics – display recent reviews in a table, with the title, link, and score. This meant writing a scraper just for those tables, but it works fine.
Unfortunately, if the score isn’t in the feed or in a table, I have to resort to scraping the reviews one by one – and it only works if the score is shown somewhere in the page HTML that can be loaded reliably.
The best way is to use the itemprop="ratingValue" attribute, which is used by Pure Nintendo and also God is a Geek – albeit in slightly different ways between the two sites. Nevertheless, I can grab the score from those using this method.
For Pure Nintendo, I do it like this:
$this->domCrawler->filterXPath('//span[@itemprop="ratingValue"]')->innerText();
For God is a Geek, it’s like this:
$this->domCrawler->filterXPath('//div[@itemprop="ratingValue"]')->children()->first()->innerText();
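The two snippets above appear to rely on a crawler object such as Symfony's DomCrawler. For readers without that dependency, the same lookup can be sketched with PHP's built-in DOM extension. This is an assumption-laden illustration: `extractRatingValue` and the sample markup are hypothetical, and it reads all the text inside the matched element rather than just its first child.

```php
<?php
/**
 * Return the text of the first element carrying itemprop="ratingValue",
 * or null when the page has no such element.
 */
function extractRatingValue(string $html): ?string
{
    $dom = new DOMDocument();
    // Real-world HTML is rarely valid, so collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $nodes = $xpath->query('//*[@itemprop="ratingValue"]');
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    // textContent includes the text of nested elements as well.
    return trim($nodes->item(0)->textContent);
}

// Hypothetical markup in two slightly different shapes.
$spanStyle = '<html><body><span itemprop="ratingValue">8.5</span></body></html>';
$divStyle  = '<html><body><div itemprop="ratingValue"><span>7</span></div></body></html>';
$noRating  = '<html><body><p>No score here</p></body></html>';

var_dump(extractRatingValue($spanStyle));
var_dump(extractRatingValue($divStyle));
var_dump(extractRatingValue($noRating));
```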
The remaining sites I’ve checked so far include the rating at the end of the post body, but because of the post HTML it’s not always possible to load the score reliably.
Here are some of the other selectors I’ve used, from three different sites:
$this->domCrawler->filterXPath('//div[@class="entry-content"]')->children()->last()->children()->first()->innerText();
$this->domCrawler->filterXPath('//div[@class="post-entry"]/p/strong/span/span/span')->innerText();
$this->domCrawler->filterXPath('//section[@class="gh-content gh-canvas"]/h2')->innerText();
In each case, I’ve then needed to extract the rating by removing surrounding text, such as Rating, Final Verdict, or Final Score. Then I’ve split the text into an array using the / character and taken the first value. It’s messy and will break if a site changes its HTML, but it works for now.
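That clean-up step might look something like the sketch below. The `parseScore` function and the sample strings are hypothetical; the label list comes from the examples mentioned above, and a real site may need extra cases.

```php
<?php
/**
 * Turn scraped text such as "Final Score: 8.5/10" into a number.
 * Returns null when no numeric score can be found.
 */
function parseScore(string $raw): ?float
{
    // Remove the known labels and an optional colon after them.
    $text = preg_replace('/\b(Final Verdict|Final Score|Rating)\b\s*:?/i', '', $raw) ?? '';

    // Split "8.5/10" on the slash and keep the part before it.
    $parts = explode('/', $text);
    $first = trim($parts[0]);

    return is_numeric($first) ? (float) $first : null;
}

var_dump(parseScore('Final Score: 8.5/10'));
var_dump(parseScore('Rating 7 / 10'));
var_dump(parseScore('Final Verdict: Great'));
```

Returning null rather than guessing keeps a bad scrape from saving a wrong score.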
And that’s it!
I’ve enjoyed working on this script. It’s come together gradually, over a number of years, and with more automation introduced where possible. Loading the feeds has been possible for a long time, with different formats added over time. The scraper that loads from a table was added later. And the scraping of scores was only added recently (August 2024).
If everyone used the same format, it would be easier, but then this wouldn’t have been as interesting a problem to solve. In the post, I’ve called out sites as a point of reference, but not as criticism aimed at anyone. Many sites don’t get involved in the technical side of things and just focus on what they do best: writing great content. And that’s fine!
I do think all site owners should look at their feeds, and particularly the categorisation and tagging of their content, to make sure it’s as consistent as possible. You’re making it much easier for sites like mine to include your reviews, and help people find games they enjoy (and some to avoid).
Thanks for reading! Hit me up on social media if you like:
Twitter/X: @benbarden and @switchscores
Threads: @benbarden and @gfdmusic