By kingschiebi
As I'm still a new user on GiantBomb, this is my first attempt at a little blog, which I'll write just to keep a couple of notes and thoughts while messing around with the GB API. I don't think it will be terribly informative or entertaining, but feel free to add your thoughts or ask questions and I'll do my best to help out.
Watching the GiantBomb radio recently, I recall Jeff saying that he would like to see the API being used to help users find games that they like. I thought about this for some time, didn't see any other attempt at it yet, and first thought of the "Similar Games" pages. Unfortunately, these are user-based recommendations, and while they are valuable, they can't really be used in a programmatic approach to the problem.
This led me to the idea that one would have to use the concepts attached to a game in order to find other games with a similar set of concepts. So I started out writing a comparator that takes a look at two games and calculates some (semi-)meaningful values:
- Similarity - How many concepts are attached to both games
- Difference - How many concepts are present in A, but not in B and vice versa (these are two values)
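- Distance - The Jaccard index of both concept sets (simply said: the number of shared concepts divided by the number of concepts in the union of both sets). Strictly speaking this measures similarity rather than distance, so a higher value means the games are closer.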
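A minimal sketch of such a comparator, assuming each game's concepts have already been fetched from the API and reduced to a set of concept names. The function name, the dictionary keys and the sample concept names are made up for illustration, and the 0-100 scaling follows the table description further down.

```python
def compare(a: set[str], b: set[str]) -> dict:
    """Compare two games by their concept sets (hypothetical helper)."""
    shared = a & b   # concepts attached to both games
    union = a | b    # every concept attached to either game
    return {
        "similarity": len(shared),
        "only_in_a": len(a - b),   # present in A, but not in B
        "only_in_b": len(b - a),   # present in B, but not in A
        # Jaccard index scaled to 0-100. Assumption: two games without
        # any concepts score 0 instead of raising a division error.
        "distance": 100 * len(shared) / len(union) if union else 0.0,
    }


# Made-up concept names, not real GiantBomb data.
game_a = {"Vault", "Perks", "Mutants", "Turn-Based"}
game_b = {"Perks", "Mutants", "Car"}
result = compare(game_a, game_b)
print(result)
```

With the made-up sets above, two concepts are shared out of five in total, so the distance value is 40.0.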
Of course, this approach assumes that each game has all applying concepts attached to it, but I'll get into the flaws of that a bit later.
At this point I wondered how I could easily test some values and see how things turn out. So I went along and decided to give the Google Web Toolkit a shot (because I like making things complicated by programming in unfamiliar environments) to do some analysis. The idea is rather simple: get all games of a certain franchise, start with the first game in the franchise and compare each title to the one before it in chronological order to see how their similarity changes.
The first thing I did was to upgrade my comparator to be able to compare 3 games to each other and have enough values to create a nice Venn diagram that displays the overlap in concepts for each title. Just as an example, here is the diagram for the first three entries of the Fallout franchise.
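The numbers behind a three-set Venn diagram are the sizes of its seven regions. Here is a rough sketch of how they could be computed, again assuming plain sets of concept names; the function name and region labels are hypothetical and not taken from the actual GWT code.

```python
def venn3_regions(a: set[str], b: set[str], c: set[str]) -> dict:
    """Return the size of each of the seven regions of a 3-set Venn diagram."""
    return {
        "only_a": len(a - b - c),
        "only_b": len(b - a - c),
        "only_c": len(c - a - b),
        "a_and_b_only": len((a & b) - c),   # shared by A and B, missing in C
        "a_and_c_only": len((a & c) - b),
        "b_and_c_only": len((b & c) - a),
        "all_three": len(a & b & c),        # the common "core" concepts
    }


# Made-up concept names, not real GiantBomb data.
regions = venn3_regions({"p", "q", "r"}, {"q", "r", "s"}, {"r", "s", "t"})
print(regions)
```

The seven region sizes always add up to the number of distinct concepts across all three games, which is a handy sanity check before drawing anything.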
Just looking at this graph, one could interpret the following:
- Fallout 2 dropped a couple of concepts from Fallout 1, but added a lot of new concepts (one might say that it "refined" and "expanded upon" its predecessor)
- Fallout Tactics sidestepped the franchise by sharing some concepts with Fallout 2, but using none of the "core" concepts common to both famous RPG titles
If you are familiar with all three titles, the flaws of this kind of comparison become immediately apparent. One could certainly argue that while Tactics departed from the RPG tropes of the first two titles, it had at least "some" things in common with the first game, despite the disparity in concepts. The most likely reason is that the concept list for Tactics is simply missing a couple of entries. Then again, this is not an exact science, and as long as the source data is entirely based on user content, there will always be inaccuracies.
Of course, it would be much cooler to see how this data looks for each title in the whole franchise on a timeline:
(* sorry for the bad copy/paste formatting, but I think it'll do)
Let's take a look at what each value means:
- Date - The release date for each title
- Concepts - The total number of concepts for this title
- Name - The name of the title
- Platforms - All platforms that this title appeared on (not necessarily at the release date!)
- Added - Number of concepts added in comparison to the prior title
- Removed - Number of concepts removed in comparison to the prior title
- Distance - Jaccard index of the title's concepts compared to the prior title's (a value of 100 means that both titles have identical concept sets)
It seems as if there is no "best way" to utilize the distance value for two titles on its own, but it can prove useful in comparison with other distance values. One has to keep in mind that games are rarely identical. Gameplay and settings get changed (for better or worse) and this always leads to a lower index. However, looking at the values above, one could say that Fallout 2 is more similar to Fallout 1 than Tactics is to Fallout 2. It also appears that Fallout 3 is pretty different from Brotherhood of Steel, while New Vegas is "more" similar to Fallout 3.
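The per-title values described in this list could be produced roughly as follows. This is only a sketch under assumptions: each title is a hypothetical `(release_date, name, concepts)` tuple, platforms are left out, and the first title has no prior title, so its comparison columns stay empty.

```python
from datetime import date


def franchise_timeline(titles):
    """Compare each title of a franchise to the one released before it."""
    rows = []
    previous = None
    # Sort by release date so "prior title" means the chronological predecessor.
    for released, name, concepts in sorted(titles, key=lambda t: t[0]):
        row = {"date": released, "name": name, "concepts": len(concepts),
               "added": None, "removed": None, "distance": None}
        if previous is not None:
            union = concepts | previous
            row["added"] = len(concepts - previous)
            row["removed"] = len(previous - concepts)
            # Jaccard index as a rounded percentage; 100 means identical sets.
            row["distance"] = round(100 * len(concepts & previous) / len(union)) if union else 0
        rows.append(row)
        previous = concepts
    return rows


# Placeholder titles and concepts, not real GiantBomb data.
timeline = franchise_timeline([
    (date(1998, 1, 1), "Game B", {"y", "z", "w", "v"}),
    (date(1997, 1, 1), "Game A", {"x", "y", "z"}),
])
for row in timeline:
    print(row)
```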
As I said, I just got started and have a couple of thoughts on where to go next:
- Select a set of concepts and allow them to be stored as a "genre". Then search for other games that match this "genre" in order to find games with similar concepts and use the distance to sort the results to present the "best match" first
- Allow selecting a couple of games or a franchise to identify the "core concepts" (and store them as a "genre" for later use)
- Share "genres" between users to find games more easily
- Figure out an algorithm to find games that distort the comparison due to a considerably low number of concepts attached and possibly ignore them
- Use graphs as arbitrary "proof" on forums how a reboot killed a franchise (just joking, but you should see the XCOM graph ;-) )
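The "genre" search from the first idea, combined with ignoring sparsely tagged games, might look something like the sketch below. Everything here is an assumption for illustration: the function name, the in-memory dictionary of games, and especially the fixed `min_concepts` cutoff, which is only a crude stand-in for the algorithm that still has to be figured out.

```python
def best_matches(genre: set[str], games: dict[str, set[str]], min_concepts: int = 5):
    """Rank games by Jaccard index against a stored "genre" (set of concepts)."""
    scored = []
    for name, concepts in games.items():
        # Skip games with too few concepts attached, since they distort
        # the comparison. The threshold is arbitrary.
        if len(concepts) < min_concepts:
            continue
        union = genre | concepts
        score = 100 * len(genre & concepts) / len(union) if union else 0.0
        scored.append((name, score))
    # Best match first; ties are broken alphabetically by name.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored


# Placeholder games and concepts, not real GiantBomb data.
genre = {"a", "b", "c"}
games = {
    "G1": {"a", "b", "c", "d", "e"},
    "G2": {"a", "x", "y", "z", "w"},
    "G3": {"a", "b"},  # too few concepts, gets ignored
}
print(best_matches(genre, games))
```

A franchise's "core concepts" could be fed in as the `genre` argument by intersecting the concept sets of its titles first.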
I'd love to hear some thoughts on this and will add more later on.
Here are some more examples. My capture program cut off a bit at the bottom and didn't like some of the images, but it should be plenty to get a general idea.