Posted by kingschiebi (65 posts) -

Hello world,

Being still a new user to GiantBomb, this is my first attempt at a little blog that I'll write just to keep a couple of notes and thoughts while messing around with the GB API. I don't think that this will be terribly informative or entertaining, but feel free to add your thoughts as well or ask questions and I'll do my best to help out.

Watching the GiantBomb radio recently, I recall Jeff saying that he would like to see the API being used to help users find games that they like. I thought about this for some time, didn't see any other attempts at it yet, and first thought of the "Similar Games" pages. Unfortunately, these are user-based recommendations, and while they are valuable, they can't really be used in a programmatic approach to the problem.

This led me to the idea that one would have to use the concepts attached to a game in order to find other games with a similar set of concepts. So I started out writing a comparator that takes a look at two games and calculates some (semi-)meaningful values:

  • Similarity - How many concepts are attached to both games
  • Difference - How many concepts are present in A, but not in B and vice versa (these are two values)
  • Distance - The Jaccard index of both concept sets (simply put: the number of shared concepts divided by the number of concepts in the union of both sets). Despite the name, a higher value means the games are more similar

Of course, this approach assumes that each game has all applying concepts attached to it, but I'll get into the flaws of that a bit later.
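For illustration, here is a minimal Python sketch of such a comparator. It is not the actual GWT code; the names `Comparison` and `compare` and the sample concepts are made up, and it assumes each game's concepts are already available as a plain set of strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Comparison:
    shared: int      # concepts attached to both games ("Similarity")
    only_a: int      # concepts present in A but not in B
    only_b: int      # concepts present in B but not in A
    jaccard: float   # |A & B| / |A | B|, between 0.0 and 1.0


def compare(concepts_a: set[str], concepts_b: set[str]) -> Comparison:
    """Compare two concept sets (hypothetical helper, not part of any API)."""
    shared = concepts_a & concepts_b
    union = concepts_a | concepts_b
    # Two empty sets have no defined ratio; treat them as having nothing in common.
    jaccard = len(shared) / len(union) if union else 0.0
    return Comparison(
        shared=len(shared),
        only_a=len(concepts_a - concepts_b),
        only_b=len(concepts_b - concepts_a),
        jaccard=jaccard,
    )


# Made-up concept names, purely for illustration.
game_a = {"Post-Apocalypse", "Perks", "Turn-Based Combat", "Karma"}
game_b = {"Post-Apocalypse", "Perks", "Squad Tactics"}
result = compare(game_a, game_b)
print(result)
```

Here two of the five distinct concepts are shared, so the index comes out as 2 / 5 = 0.4.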

At this point I wondered how I could easily test some values and see how things turn out. So I went along and decided to give the Google Web Toolkit a shot (because I like making things complicated by programming in unfamiliar environments) to do some analysis stuff. The idea is rather simple - get all games of a certain franchise, start with the first game in the franchise and compare each title to its predecessor in chronological order to see how their similarity changes.
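The chronological walk could look roughly like this in Python. This is only a sketch under assumptions: the franchise data is a hand-written list of (name, release date, concept set) tuples with made-up titles and concepts, and `walk_franchise` is a hypothetical helper, not something the GB API provides.

```python
from datetime import date


def jaccard(a: set[str], b: set[str]) -> float:
    """Shared concepts divided by all distinct concepts of both games."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def walk_franchise(games):
    """Sort games by release date and compare each one to its predecessor."""
    ordered = sorted(games, key=lambda game: game[1])
    rows = []
    previous = set()
    for index, (name, released, concepts) in enumerate(ordered):
        if index == 0:
            # The first title has no predecessor to compare against.
            added, removed, score = 0, 0, 0.0
        else:
            added = len(concepts - previous)
            removed = len(previous - concepts)
            score = round(100 * jaccard(previous, concepts), 2)
        rows.append((released.isoformat(), name, len(concepts), added, removed, score))
        previous = concepts
    return rows


# Made-up titles and concepts; the input order is deliberately not chronological.
games = [
    ("Spin-off", date(2001, 3, 14), {"wasteland", "squads"}),
    ("First Game", date(1997, 9, 30), {"wasteland", "perks", "dialogue"}),
    ("Sequel", date(1998, 9, 30), {"wasteland", "perks", "dialogue", "cars"}),
]
rows = walk_franchise(games)
for row in rows:
    print(row)
```

Each row only looks at the direct predecessor, so a spin-off in the middle of a franchise also affects the values of the title that follows it.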

The first thing I did was to upgrade my comparator to be able to compare 3 games to each other and have enough values to create a nice Venn diagram that displays the overlap in concepts for each title. Just as an example, here is the diagram for the first three entries of the Fallout franchise.

Just looking at this graph, one could interpret the following:

  • Fallout 2 dropped a couple of concepts from Fallout 1, but added a lot of new concepts (one might say that it "refined" and "expanded upon" its predecessor)
  • Fallout Tactics sidestepped the franchise by sharing some concepts with Fallout 2, but using none of the "core" concepts common to both famous RPG titles

If you are familiar with all three titles, the flaws of this kind of comparison become immediately apparent. One could certainly argue that while Tactics departed from the RPG tropes of the first two titles, it had at least "some" things in common with the first game, despite the disparity in concepts. The most likely reason is that the concept list for Tactics is simply missing a couple of entries. Then again, this is not an exact science, and as long as the source data is entirely based on user content, there will always be inaccuracies.
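The numbers behind a three-game Venn diagram like the one described above are just the sizes of the seven regions. A minimal Python sketch, assuming three concept sets A, B and C (the function name and the sample values are made up):

```python
def venn_regions(a: set, b: set, c: set) -> dict[str, int]:
    """Count the seven disjoint regions of a three-set Venn diagram."""
    return {
        "A only": len(a - b - c),
        "B only": len(b - a - c),
        "C only": len(c - a - b),
        "A&B only": len((a & b) - c),
        "A&C only": len((a & c) - b),
        "B&C only": len((b & c) - a),
        "A&B&C": len(a & b & c),
    }


# Placeholder concept IDs instead of real concepts.
a = {1, 2, 3, 4}
b = {3, 4, 5, 6}
c = {4, 6, 7}
regions = venn_regions(a, b, c)
print(regions)
```

Since the regions are disjoint, their counts add up to the total number of distinct concepts across all three games. That makes a handy sanity check, and an empty "A&C only" region is exactly the kind of gap that hints at missing concept entries.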

Of course, it would be much cooler to see how this data looks for each title in the whole franchise on a timeline:

[Timeline graphic; entries as labeled in the image, newest first]
F. Fallout: New Vegas - Xbox 360/Xbox Live Marketplace/PlayStation 3/PC - 2010-10-19
E. Fallout 3 - PC/PlayStation 3/Xbox 360/Xbox Live Marketplace - 2008-10-28
D. Fallout: Brotherhood of Steel - PlayStation 2/Xbox - 2004-01-14
C. Fallout Tactics: Brotherhood of Steel - PC - 2001-03-14
B. Fallout 2 - Mac/PC - 1998-09-30
A. Fallout - PC/Mac - 1997-09-30

Date       | Concepts | Name                                  | Platforms                                       | Added | Removed | Distance
30.09.1997 | 134      | Fallout                               | PC/Mac                                          | 0     | 0       | 0
30.09.1998 | 248      | Fallout 2                             | Mac/PC                                          | 121   | 7       | 49.8
14.03.2001 | 80       | Fallout Tactics: Brotherhood of Steel | PC                                              | 12    | 180     | 26.15
14.01.2004 | 32       | Fallout: Brotherhood of Steel         | PlayStation 2/Xbox                              | 3     | 51      | 34.94
28.10.2008 | 436      | Fallout 3                             | PC/PlayStation 3/Xbox 360/Xbox Live Marketplace | 410   | 6       | 5.88
19.10.2010 | 254      | Fallout: New Vegas                    | Xbox 360/Xbox Live Marketplace/PlayStation 3/PC | 78    | 260     | 34.24

Let's take a look at what each value means:

  • Date - The release date for each title
  • Concepts - The total number of concepts for this title
  • Name - The name of the title
  • Platforms - All platforms that this title appeared on (not necessarily at the release date!)
  • Added - Number of concepts added in comparison to the prior title
  • Removed - Number of concepts removed in comparison to the prior title
  • Distance - Jaccard index of the title compared to the prior title, as a percentage (a value of 100 means that both titles have identical concept sets)
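As a quick cross-check, the Distance column can be recomputed from the other columns alone. The Python sketch below assumes that Added and Removed are plain set differences against the prior title and that Distance is the Jaccard index times 100; `distance_from_counts` is a made-up helper name, and the numbers are taken from the Fallout table above.

```python
def distance_from_counts(prior_total: int, added: int, removed: int) -> float:
    """Jaccard index in percent, derived from concept counts only."""
    shared = prior_total - removed   # concepts kept from the prior title
    union = prior_total + added      # concepts attached to either title
    return round(100 * shared / union, 2)


# (title, concepts of the prior title, added, removed), from the table above.
table = [
    ("Fallout 2", 134, 121, 7),
    ("Fallout Tactics: Brotherhood of Steel", 248, 12, 180),
    ("Fallout: Brotherhood of Steel", 80, 3, 51),
    ("Fallout 3", 32, 410, 6),
    ("Fallout: New Vegas", 436, 78, 260),
]
distances = [distance_from_counts(prior, added, removed)
             for _, prior, added, removed in table]
print(distances)  # [49.8, 26.15, 34.94, 5.88, 34.24]
```

For Fallout 2, for example, 134 - 7 = 127 concepts are shared and 134 + 121 = 255 concepts exist in total, which gives 127 / 255 = 49.8%.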

There seems to be no "best way" to use the distance value of two titles in isolation, but it becomes useful when compared with other distance values. Keep in mind that two games are never identical: gameplay and settings change (for better or worse), and every change lowers the index. Still, looking at the values above, one could say that Fallout 2 is more similar to Fallout 1 (49.8) than Tactics is to Fallout 2 (26.15). It also appears that Fallout 3 is very different from Brotherhood of Steel (5.88), while New Vegas is "more" similar to Fallout 3 (34.24).

As I said, I just got started and have a couple of thoughts on where to go next:

  • Select a set of concepts and store it as a "genre". Then search for other games that match this "genre" and sort the results by distance to present the "best match" first
  • Allow selecting a couple of games or a franchise to identify the "core concepts" (and store them as a "genre" for later use)
  • Share "genres" between users to find games more easily
  • Figure out an algorithm to find games that distort the comparison because they have very few concepts attached, and possibly ignore them
  • Use graphs as arbitrary "proof" on forums how a reboot killed a franchise (just joking, but you should see the XCOM graph ;-) )
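The "genre" search and the low-concept filter from the list above could be sketched like this in Python. Everything here is an assumption for illustration: the catalog is a hand-written dictionary, the game names and concepts are made up, and the cut-off of three concepts is an arbitrary guess rather than a tested value.

```python
def rank_by_genre(genre: set[str], catalog: dict[str, set[str]], min_concepts: int = 3):
    """Return (name, score) pairs, best match first.

    Games with fewer than min_concepts concepts are skipped because their
    sparse data would distort the comparison.
    """
    scored = []
    for name, concepts in catalog.items():
        if len(concepts) < min_concepts:
            continue
        union = genre | concepts
        score = len(genre & concepts) / len(union) if union else 0.0
        scored.append((name, score))
    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Made-up "genre" and catalog.
genre = {"first-person", "keycards", "demons", "shareware"}
catalog = {
    "Game A": {"first-person", "keycards", "demons"},
    "Game B": {"first-person", "keycards", "stealth", "lockpicking"},
    "Game C": {"first-person"},                      # too few concepts, skipped
    "Game D": {"racing", "drifting", "tuning"},
}
matches = rank_by_genre(genre, catalog)
for name, score in matches:
    print(f"{name}: {score:.2f}")
```

A fixed threshold is the crudest possible filter. Something relative to the average number of concepts per game might work better, but that would need testing against real data.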

I'd love to hear some thoughts on this and will add more later on.

Cheers

Edit:

Here are some more examples. My capture program cut off a bit at the bottom and didn't like some of the images, but it should be plenty to get a general idea.

#2 Posted by takua108 (1483 posts) -

I know this is way late, but I think this shit is awesome, and I just wanted to say that. I made a similar but far less awesome thing a while ago, kind of just as a proof-of-concept thing, and this is way better in every regard.

It's too bad that a relatively small percentage of Giant Bomb users care about the wiki at all, and an even smaller fraction care about doing cool stuff with the API :(

#3 Posted by kingschiebi (65 posts) -

@takua108: I don't think that it is late for stuff like this at all. Unfortunately, I am a bit short on free time right now due to a pretty high workload.

I'll keep on working on this though and might extend it even further.

I already got it to the point where you can group different concepts into a "genre" and search for potential matches, but the uneven distribution of concepts across games keeps this from working automagically. I tested this with Doom, Duke Nukem 3D and Quake in order to get a concept set that would suggest similar titles like Hexen, ROTT etc., but it doesn't work out as nicely - yet.

Dave already mentioned this on some occasions, but an advanced search option would help greatly, as we still have to rebuild some of the standard features ourselves. A good example would be auto-suggestions, which are doable, but you always have to fetch the full list (or most of it) in order to find exact matches that can be prioritized.
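To show what prioritizing exact matches means here, a small Python sketch of client-side ranking follows. It assumes the full list of names has already been fetched somehow; `suggest` is a made-up helper and the sample names are only test data.

```python
def suggest(query: str, names: list[str], limit: int = 5) -> list[str]:
    """Rank names: exact match first, then prefix matches, then substring matches."""
    needle = query.casefold().strip()
    if not needle:
        return []

    def rank(name: str):
        candidate = name.casefold()
        if candidate == needle:
            return 0
        if candidate.startswith(needle):
            return 1
        if needle in candidate:
            return 2
        return None  # no match at all

    hits = []
    for name in names:
        score = rank(name)
        if score is not None:
            # Sort by rank first, then alphabetically (case-insensitive).
            hits.append((score, name.casefold(), name))
    return [name for _, _, name in sorted(hits)[:limit]]


# Sample list standing in for the fetched names.
names = ["Quake II", "Earthquake Simulator", "Doom", "Quake", "Hexen"]
suggestions = suggest("quake", names)
print(suggestions)  # ['Quake', 'Quake II', 'Earthquake Simulator']
```

The ranking itself is cheap. The expensive part remains getting the list in the first place, which is why a server-side search option would help so much.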

Right now, I've just built a bunch of tools and helper classes that might come in handy later on.

The only "invisible" problem (to me at least) that I try to avoid is hitting any kind of request limit. Those queries can get quite huge, especially for a full franchise like Final Fantasy. While I suppose that it is not a really big deal in terms of server load and bandwidth, it might become one later on if I ever release something that other people want to use. I'm still restructuring a lot of stuff, too, but that is more related to my unfamiliarity with the GWT framework. It is awesome to use for several reasons, but as with everything new, I keep realizing how stupid some early implementation was before I knew how to do it properly.
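One way to stay clear of request limits is to cache responses and space out the real calls. The Python sketch below shows that idea only. `ThrottledCache` is a made-up class, the keys like "game/1" are placeholders rather than real API paths, the fetch function is a stand-in for whatever actually talks to the server, and the one-second default interval is a guess, not a documented limit.

```python
import time


class ThrottledCache:
    """Wrap a fetch function with a response cache and a minimum delay between real calls."""

    def __init__(self, fetch, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self._fetch = fetch
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._cache = {}
        self._last_call = None

    def get(self, key):
        # Cached answers cost nothing and never touch the server.
        if key in self._cache:
            return self._cache[key]
        # Wait until enough time has passed since the previous real request.
        if self._last_call is not None:
            wait = self._min_interval - (self._clock() - self._last_call)
            if wait > 0:
                self._sleep(wait)
        self._last_call = self._clock()
        self._cache[key] = self._fetch(key)
        return self._cache[key]


# Stand-in for a real request; it only records which keys were fetched.
calls = []

def fake_fetch(key):
    calls.append(key)
    return {"id": key}

client = ThrottledCache(fake_fetch, min_interval=0.0)
client.get("game/1")
client.get("game/1")   # served from the cache, no second request
client.get("game/2")
print(calls)  # ['game/1', 'game/2']
```

For a big franchise, the cache matters more than the delay: games that share concepts would otherwise trigger the same lookups over and over.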

#4 Posted by LordAndrew (14430 posts) -

@takua108 said:

I know this is way late, but I think this shit is awesome, and I just wanted to say that. I made a similar but far less awesome thing a while ago, kind of just as a proof-of-concept thing, and this is way better in every regard.

It's too bad that a relatively small percentage of Giant Bomb users care about the wiki at all, and an even smaller fraction care about doing cool stuff with the API :(

Maybe if we keep showing off cool stuff that can be done, people will start to care more. People like cool stuff, right? :)

#5 Posted by kingschiebi (65 posts) -

@LordAndrew said:

Maybe if we keep showing off cool stuff that can be done, people will start to care more. People like cool stuff, right? :)

I certainly hope so, and as for cool stuff, it does seem tempting to do a rogue-like using the API at some point. The hardest part about a good rogue-like would be getting enough content together to make each experience unique, and tapping into just the concept department would make a nice starting point for assigning some random attributes.

Maybe something similar to the system that Scribblenauts employs by having words relate to different concepts.

Ah, I'm rambling again ;-)