Testing the library

The fastest way to test the library on your own machine is to run the test.PrepareMatch class, providing as parameters the ids of the opponent teams in the database. For more details please consult the user documentation.

In order for the match to run, we need to get several details about the opponents from the database. A JDBC driver is used for connecting to the MySQL database. The database address is defined in the config.txt file, a sample of which is included in the project files. Two pairs of queries are executed, one pair per team: one query retrieves the team details, while the other retrieves the set of player details. The convention of naming the teams “home” and “away” is followed in the source code and in the documentation. This is not, however, to suggest that one team has an advantage over the other. In fact, no home ground advantage is implemented in these early stages of the project.

The only team detail required is the name of the team, which is given as a parameter to a core.Team constructor. Retrieving the (squad) player details and integrating them with the library is a little more complicated.

Firstly, there is a difference in how the players of a team are stored in the database for this “desktop” demonstration. Naturally, each team has a squad of players, and the starting lineup is selected from that squad by either the human manager or the CPU manager. This is how it is implemented in the online demo. To keep things simple (as the desktop demo does not have a GUI), the lineup is preselected and only the starting lineups are stored in the database. What’s more, the semantics of the player positions as stored in the database are different from, and less sophisticated than, what the library supports and what is demonstrated at www.pubsoccermanager.com.

Each player position corresponds to a number: 1 for goalkeeper, 2 for defender, 3 for midfielder and 4 for forward. The player positions are used as parameters to determine the team tactics.

There are 3 different classes used for mapping the real-world entity of a player. For this demonstration, we only need the gameplay.Player class, as this is the one holding data relevant to the match itself. The details to retrieve from the database for each player are the shirt number (which identifies a player during the match, instead of the database id that is used by default on other occasions), the full name (for displaying to the user) and the position (for determining tactics and positioning during the match). Lastly, we need the player attributes. Each player has a set of attributes, and a rating is associated with each attribute, determining how skilled the player is. These are also loaded from the database and attached to the gameplay player objects. Once their details are loaded, all the players are ready for the match.

As mentioned above, the team tactics are determined by the player positions. In the online demo you get to select the tactics separately from the players, which leaves you with more freedom. The Team.defineTactics method is used to create a Tactics object so that tactics are calculated once at the beginning of the match. This saves cycles, but it is also needed because the tactics must be defined before players can be placed on the horizontal axis (left, centre or right). Of course, it matters where a player is placed, because player skills differ depending on whether he plays on the wings, in the centre, etc. However, the pre-selected players are ordered in the database in such a way that their optimal skills match their horizontal position. A special method is used for this demo: Team.alignPlayersDesktop, which aligns the players to the horizontal axis based on database ordering. The general case is, of course, more sophisticated and a different method is used for aligning players to horizontal axis positions.
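To make the idea of tactics being implied by player positions a bit more concrete, here is a minimal sketch that counts the position codes of a lineup and turns them into a formation string. It is not the code of Team.defineTactics; the FormationSketch class and its formation method are hypothetical names used only for illustration, and the only thing taken from the text is the 1 to 4 position coding.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of OpenFootie: derives a formation string
// such as "4-4-2" from the position codes described in the text
// (1 goalkeeper, 2 defender, 3 midfielder, 4 forward).
class FormationSketch {

    static String formation(List<Integer> positions) {
        int[] counts = new int[5]; // index = position code, index 0 unused
        for (int code : positions) {
            if (code < 1 || code > 4) {
                throw new IllegalArgumentException("Unknown position code: " + code);
            }
            counts[code]++;
        }
        // Assumption: a valid starting lineup has exactly one goalkeeper.
        if (counts[1] != 1) {
            throw new IllegalArgumentException("Expected exactly one goalkeeper");
        }
        return counts[2] + "-" + counts[3] + "-" + counts[4];
    }

    public static void main(String[] args) {
        List<Integer> lineup = Arrays.asList(1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4);
        System.out.println("Implied formation: " + formation(lineup));
    }
}
```

The point of the sketch is only that no separate tactics input is needed in the desktop demo: the lineup itself carries enough information.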

A Team.displayLineup method is provided for displaying the team’s lineup. This is used for debugging and its output is not “beautified”. However, it is called by default.

For the Match object to be initialized, the two opposing Team objects must be passed as parameters to its constructor. Calling the method Match.play(int startTime) with startTime = 0 kicks off the match. So that interruptions can be handled, a match is played in chunks. This demonstration is a simple case in which the match is played in one go (no interruptions are required), so we only start the match and check after each chunk whether it has finished.

The play method returns a Signal object. This object’s subtype denotes the outcome of each match chunk, along with the relevant details. We are only interested in signals of EndOfMatch subtype, so we check whether the Signal object is of that subtype in each iteration. If the match has not reached the end then we call the play method with the next time step as parameter.

When the end of the match is reached, the after-match processing takes place (e.g. displaying results and statistics to the user). To trigger this on the Match object, we call the play method with the full match duration (2 * halfDuration), to indicate that we only want the match summary and not any other match processing.
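The control flow described above can be sketched roughly as follows. Everything in this block is a stand-in: StubMatch, Signal and EndOfMatch are simplified, hypothetical versions written for this example, and the assumptions that a signal carries the time step at which its chunk stopped and that a chunk has a fixed length are mine, not the library's.

```java
// Hypothetical stand-ins for the library's Match and Signal classes;
// the real OpenFootie signatures and fields may differ.
class ChunkedMatchSketch {

    static class Signal {
        final int time; // assumption: the time step at which the chunk stopped
        Signal(int time) { this.time = time; }
    }

    static class EndOfMatch extends Signal {
        EndOfMatch(int time) { super(time); }
    }

    static class StubMatch {
        final int halfDuration;
        final int chunkLength; // assumption: fixed-length chunks, for the demo only
        boolean summaryRequested = false;

        StubMatch(int halfDuration, int chunkLength) {
            this.halfDuration = halfDuration;
            this.chunkLength = chunkLength;
        }

        // Plays one chunk starting at startTime and reports where it stopped.
        Signal play(int startTime) {
            int fullTime = 2 * halfDuration;
            if (startTime >= fullTime) {
                summaryRequested = true; // only the after-match summary is wanted
                return new EndOfMatch(fullTime);
            }
            int stop = Math.min(startTime + chunkLength, fullTime);
            return stop == fullTime ? new EndOfMatch(stop) : new Signal(stop);
        }
    }

    // Kick off at 0, keep playing until an EndOfMatch signal arrives,
    // then call play once more with the full duration to get the summary.
    static int run(StubMatch match) {
        int chunks = 1;
        Signal signal = match.play(0);
        while (!(signal instanceof EndOfMatch)) {
            signal = match.play(signal.time);
            chunks++;
        }
        match.play(2 * match.halfDuration);
        return chunks;
    }

    public static void main(String[] args) {
        StubMatch match = new StubMatch(45, 10);
        System.out.println("Chunks played: " + run(match));
    }
}
```

In the real class other Signal subtypes would be handled inside the loop; the demo simply ignores them because it does not need interruptions.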

This is the first of a (hopefully) ongoing series of posts on the developer documentation of OpenFootie. It will, naturally, include source code discussion, but at this early stage it will also serve well as more user-oriented documentation. It is assumed that you have access to the source code. Excerpts will be included where required, but I will generally avoid them, unless the discussion is hard to follow through the source code files themselves.

The subject of this first post is running the OpenFootie project. The initial idea was to have the necessary input in a (MySQL) database and load it from there. The user interaction is restricted to selecting the opponents by providing the database team ids as command line arguments. This approach has the advantage of providing the user with a minimal choice as to which teams will play against each other without needing to tweak the code and recompile.

However, as I was working on some new features, I realized that a hard-coded input would actually add more flexibility in trying different input combinations (of course, this is how I worked myself, but the aspiration of providing a more user-friendly “interface”, made the interaction less flexible). Anyway, by the time you are reading this a sample class which starts the match with hard-coded input will have been added to the trunk. The latter approach will be described in the rest of this post.

Using hard-coded input

A more straightforward way of testing the match engine is the hard-coding of input. The test.HardcodedMatch is a sample class demonstrating how the ‘raw’ input is provided to the match engine.

The main difference is that all input that would be fetched from the database is now hard-coded. We start with the team names: the match is based on the International Philosophy Monty Python sketch, so the opponents are the Greek vs. the German philosophers.

Each team’s starting lineup is represented by an ArrayList of gameplay.Player objects. Each player is defined by the constructor parameters of the player object: the shirt number, first name, surname and position in tactics.

Each player has a set of ratings, each corresponding to a specific attribute or skill. Since these can’t be loaded from the database, they have to be hard-coded. To keep things simple in the sample file, we assume that all players have “excellent” ability in all attributes. The excellent rating corresponds to a “6” in this file, which is the minimum numerical value for a player with an excellent rating for an attribute.
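As a rough picture of what such hard-coded input looks like, the sketch below builds an eleven-player lineup in which every rating is 6. It does not reproduce test.HardcodedMatch: SketchPlayer is a hypothetical stand-in for gameplay.Player, the attribute names and player names are placeholders, and only the constructor parameter order (shirt number, first name, surname, position) and the value 6 come from the text.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class HardcodedLineupSketch {

    // Minimum rating regarded as "excellent", as stated in the text.
    static final int EXCELLENT = 6;

    // Placeholder attribute names; the library's real attribute list may differ.
    static final String[] ATTRIBUTES = {"passing", "tackling", "shooting"};

    // Hypothetical stand-in for gameplay.Player.
    static class SketchPlayer {
        final int shirtNumber;
        final String firstName;
        final String surname;
        final int position; // 1 goalkeeper, 2 defender, 3 midfielder, 4 forward
        final Map<String, Integer> ratings = new LinkedHashMap<>();

        SketchPlayer(int shirtNumber, String firstName, String surname, int position) {
            this.shirtNumber = shirtNumber;
            this.firstName = firstName;
            this.surname = surname;
            this.position = position;
        }
    }

    // Creates a player whose every attribute is rated "excellent".
    static SketchPlayer excellentPlayer(int shirt, String first, String last, int position) {
        SketchPlayer player = new SketchPlayer(shirt, first, last, position);
        for (String attribute : ATTRIBUTES) {
            player.ratings.put(attribute, EXCELLENT);
        }
        return player;
    }

    // The order of insertion matters: it decides the horizontal placement later on.
    static List<SketchPlayer> sampleLineup() {
        int[] positions = {1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4};
        List<SketchPlayer> lineup = new ArrayList<>();
        for (int i = 0; i < positions.length; i++) {
            lineup.add(excellentPlayer(i + 1, "Player", "No" + (i + 1), positions[i]));
        }
        return lineup;
    }

    public static void main(String[] args) {
        System.out.println("Players in lineup: " + sampleLineup().size());
    }
}
```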

After the players are defined and added to their team’s lineup, we need to put them in their respective positions in the lineup according to tactics. As in the previous section, this is done automatically by the alignPlayersDesktop method of the Team class. Previously we didn’t have to worry about the order of the players in the lineup, as this was already defined by the database queries (so it was 100% automatic in terms of user input). In the hard-coded version, however, the order in which the players are added affects their actual lineup position on the horizontal axis. This concludes the review of the two ways in which the match engine can be run on your PC and the differences between them.


The configuration file


The configuration of OpenFootie comprises the input/output data sources used by the application. The typical configuration is:

database = jdbc:mysql://localhost:3306/match_engine?user=root&password=123456
probmodel = data/match report.data
matchreport = data/match report.txt
playerstats = data/player stats.txt
statssummary = data/stats summary.txt

The configuration file is called config.txt and it is placed in the classpath. In this blog entry, we are going to take a look at the structure of each data source.
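Since the file uses a plain key = value layout, it can be read with the standard java.util.Properties class. The sketch below is only an assumption about how such a file could be parsed; I am not claiming this is how OpenFootie itself reads config.txt, and the ConfigSketch class is a hypothetical name.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical reader for the key = value layout shown above.
class ConfigSketch {

    // Parses configuration text; whitespace around '=' is ignored and the
    // value runs to the end of the line, so paths with spaces are preserved.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        String sample =
                "database = jdbc:mysql://localhost:3306/match_engine?user=root&password=123456\n"
              + "matchreport = data/match report.txt\n";
        Properties props = parse(sample);
        System.out.println("Match report goes to: " + props.getProperty("matchreport"));
    }
}
```

Note that Properties treats a backslash as an escape character, so Windows-style paths would need forward slashes or doubled backslashes with this approach.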

The database

The database supported is MySQL. You can avoid using the database for running the application on your desktop, if you just want to give it a quick shot or you don’t mind changing hardcoded input for some variety between match simulation runs.

It wouldn’t be too difficult to add support for other databases, as well, but I am more inclined towards providing a custom binary format for game input, as it would be more appropriate. However, as I mentioned elsewhere, the database is a quick and dirty solution plus it was needed anyway for the host application on the web.

So, what’s in the database of OpenFootie? The input is the minimal it can be, as OpenFootie is initially intended as a library and not a full application. The minimum entities required are teams and players. To add meaning to the input, the players must be associated with certain abilities or weaknesses and the team must have a formation on the pitch.

For the team, we only need to store a name as minimal input. The data provided cover most of the teams participating in the old 2010 world cup. (As I am writing this, I apologize for not updating to the newer world cup, which has already passed. Although no further updates are envisioned for the software itself, I could still update the data. The project is not as neglected as it seems, from a broader point of view.)

Next are the players. The player names are not real. Each team has 11 players, as no substitutions or tactics changes are supported by the command-line desktop application. Each player has a shirt number, unique within his team, which is also used by the application code for identification during the match. Finally, each player is associated with a number between 1 and 4 to denote his position.

The position field also implies the team’s tactics. First of all, intuitively, 1 means goalkeeper, 2 defender, 3 midfielder and 4 forward. This is a different semantic from the one used by the web host application, and both semantics are supported by the OpenFootie library. This is, of course, the simplest one, which accommodates the minimalistic nature of the command-line application. When the application reads the players of the selected team to form the input (lineup + formation), it infers the tactics from the positions of the team players and puts the players in corresponding positions according to the tactics and the order of the players in the database (by their id, that is). So the order of the midfielders (position = 3) of a selected team defines which players play at the left, right or centre. You don’t actually need to worry about that, as the players’ skills match their implied positions. It would only be a little tricky if you wanted to tweak the input from the database. In that case, you could alternatively try the test.HardcodedMatch class (currently available from the trunk) or the online demo application.
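To illustrate how ordering alone can decide horizontal placement, here is a small sketch. The placement rule it uses (first player of a line goes left, last goes right, everyone in between goes to the centre, and lines of fewer than three players stay central) is purely an assumption made for the example; the actual rule lives in the alignment method of the Team class and may well be different.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of order-based horizontal placement.
class HorizontalSlotsSketch {

    // ASSUMED rule, not taken from the library: for a line of three or more
    // players, the first in database order plays left and the last plays right.
    static List<String> slots(int playersInLine) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < playersInLine; i++) {
            if (playersInLine >= 3 && i == 0) {
                result.add("left");
            } else if (playersInLine >= 3 && i == playersInLine - 1) {
                result.add("right");
            } else {
                result.add("centre");
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println("Four midfielders: " + slots(4));
    }
}
```

Whatever the real rule is, the consequence for anyone editing the data is the same: reordering the rows of a line changes who plays on which side.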

The skills of each player are stored in a separate table. The table’s structure is self-explanatory, and the attribute ratings range from zero to 6+. Generally, a player with an attribute rating of 6 or more is considered to have an “excellent” value for that attribute; ratings above 6 exist because I wanted some extra variety among “excellent” players.

Probability model

This is the file that represents a real football match in a custom language. The file used is in binary format; however, it is converted from a human-readable XML file. The XML file is not included in the 0.6 distribution, but it can be downloaded from the repository, which is now on GitHub. What follows is a short specification of this XML file.

The concept

The concept OpenFootie uses for representing and eventually reproducing a football match is to see the match as a sequence of states with a cause-and-effect relationship. Not all events can be connected, and those that can are not connected with the same frequency. For instance, a team awarded a penalty kick cannot concede a corner kick in the next “moment”. Similarly, it is a rare occasion that a goalkeeper scores a goal by kicking the ball from his own area. On the other hand, a cross will probably be followed by an aerial challenge, or a shot may lead to a goal-scoring opportunity.

What is included in the file is a definition of events that may happen at a football match (this is not a comprehensive list but it is rather based on the first half of a CL game played some time ago). Next is the relationship with other events in a manner that reflects the probability of a particular event happening after another. Since it is a small sample, there is no special representation for the probability, but it is rather implied by the sequence itself. Although this would not be efficient (“search for all the results of a particular event and count them by type”) in case the sample was really big, it doesn’t matter for now.
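The “search and count” idea mentioned above can be sketched as follows. The event names in the sample data are made up for the example, and TransitionFrequencySketch is a hypothetical class rather than anything in the match engine; it only shows how a probability falls out of a plain list of observed transitions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: probabilities implied by how often one event
// is followed by another in a recorded sequence.
class TransitionFrequencySketch {

    // Each observation is a pair {event, eventThatFollowed}.
    static Map<String, Integer> followers(List<String[]> observations, String event) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String[] pair : observations) {
            if (pair[0].equals(event)) {
                counts.merge(pair[1], 1, Integer::sum);
            }
        }
        return counts;
    }

    // Relative frequency of 'next' among everything that followed 'event'.
    static double probability(List<String[]> observations, String event, String next) {
        Map<String, Integer> counts = followers(observations, event);
        int total = 0;
        for (int count : counts.values()) {
            total += count;
        }
        return total == 0 ? 0.0 : (double) counts.getOrDefault(next, 0) / total;
    }

    public static void main(String[] args) {
        List<String[]> sample = new ArrayList<>();
        sample.add(new String[] {"Cross", "Aerial Challenge"});
        sample.add(new String[] {"Cross", "Aerial Challenge"});
        sample.add(new String[] {"Cross", "Goalkeeper Interception"});
        System.out.println(probability(sample, "Cross", "Aerial Challenge"));
    }
}
```

Every lookup scans the whole sample, which is exactly the inefficiency the text accepts for a small data set.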

The structure of the XML file will be covered in the next sections. The representation is rather simplistic, aiming for a statistical reproduction of the match, instead of a more user-friendly approach.


The coordinates are measured in an ordinal scale, rather than a ratio scale. For the length coordinates, the various positions are characterized as being in “Defence”, “Centre” or “Attack”. This is in reference to the team (defined as) having possession of the ball. I would also like to apologize if the terminology seems a bit strange (e.g. “Midfield” could be used instead of “Centre”). This is partly due to the casual way of giving names to things I needed while I was developing the engine and partly due to influence from my mother tongue.

For the width coordinates, only two areas are defined: “Axis” and “Flank”. Whether a flank is the right or the left one does not matter for the representation of the match. It should be noted that, since the semantics of the match representation language could not be comprehensive enough, especially in its first edition, the match engine implementation partly plays that role. Therefore, while the match engine is really an “interpreter” or “processor” of the language defined in our XML file, it also adds semantics dynamically. This is only a side note, and we can see examples of it in subsequent documentation. So, the exact side of the flank is only defined at “run-time” of the match, according to the actual match engine implementation.


A key factor of a specific instant of a football match is the pressure on the team having possession of the ball. We specify pressure with three values: “Clear”, “Avoid” and “Under”. These are more or less self-explanatory. The “Avoid” value for pressure represents the case of the team having the ball making an effort to move forward or move quickly to avoid the pressure.


The action signifies the way different states are connected. It corresponds, of course, to the real-world meaning of the word, and specifically to the action of the player having the ball. The current state and the action “chosen” together define the transition to a new state. A number of different actions are supported, each with a self-explanatory name. The possible actions in each state correspond to the mapping of real-world actions to the state model.


Another attribute you may notice is that of “ModuloRowId”. Please ignore this one as it is used for debugging purposes.


Each state is named “condition” in the XML file, in the sense that each state is the “if” part of an “if…then…” statement. Each condition has a “result” (another state) or a “challenge” and a “result”. The challenge is an intermediary state, which does not involve an action. For instance, a player makes a long pass which results in an aerial challenge before another player has possession of the ball in a state where he can choose his next action. You may remark that the concept of a challenge is not strictly necessary and could be omitted to keep things simple. However, the sample and the corresponding language evolved naturally during the representation of a real match, and I was trying to depict abstractions of exactly what I was seeing. In general, some things might not have needed to be included in the first version of the probability model, but they do no harm in their own right either (and they could make for more precise statistics). We will see some examples of challenges in a following section.


Result tags are essentially the description of a state resulting from a “condition” state, along with some attributes which describe how the transition takes place. We saw in the previous sections that the only attributes needed to describe a state are the coordinates of the ball position and the degree of pressing by the opponents. The two additional attributes for result states are the “Team” attribute, which denotes whether the ball should go to an opponent or a teammate, and sometimes an attribute denoting the way the ball has changed possession, which is used for statistics. A side note here is that the player having possession of the ball does not “know” the possible outcomes of his actions, and the choice of action depends only on how frequently each action is encountered in the probability model for a specific state. Even if the outcome would be 100% negative, the player would still “choose” the action as directed by the probability model file, without employing any kind of artificial intelligence.
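Choosing an action purely by how often it appears for a state amounts to a weighted random draw. The sketch below shows one common way of doing such a draw; it is an illustration under my own assumptions, not the match engine's code, and the ActionChoiceSketch class, the action counts and the use of java.util.Random are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

// Hypothetical illustration of frequency-based action selection.
class ActionChoiceSketch {

    // Maps a ticket in [0, total) to the action whose cumulative count range contains it.
    static String pick(Map<String, Integer> counts, int ticket) {
        int cumulative = 0;
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            cumulative += entry.getValue();
            if (ticket < cumulative) {
                return entry.getKey();
            }
        }
        throw new IllegalArgumentException("Ticket out of range: " + ticket);
    }

    // Draws a random ticket, so actions seen more often are chosen more often.
    static String choose(Map<String, Integer> counts, Random rng) {
        int total = 0;
        for (int count : counts.values()) {
            total += count;
        }
        return pick(counts, rng.nextInt(total));
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("Forward Pass", 3); // seen three times for this state
        counts.put("Back Pass", 1);    // seen once
        System.out.println("Chosen action: " + choose(counts, new Random()));
    }
}
```

Nothing in the draw looks at what the action leads to, which matches the remark that the player does not “know” the outcome of his choice.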

Apart from the “canonical” results described above, there are also the “special” results. If, for instance, the result is a foul from a tackle, there is no need to duplicate the next state, which is implied; instead, a “special” attribute is included denoting that a foul took place.


The challenge tag may describe a variety of things in the real world. It could mean anything from a simple aerial challenge to a sequence of “states” where the ball hasn’t touched the ground for a minute. This implies that the time cost would vary if absolute measures of time were taken into account. Time itself is implied by allocating a specific number of states to each match. Based on the number of states used to describe that first half of the sample match, the magic number of “510” is allocated for the number of states of each match. This, of course, may change in the future as more samples are taken.
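To get a feel for what 510 states per match means in clock terms, here is a small arithmetic sketch. It assumes a 90-minute match with states spread evenly over it; both the even spread and the MatchClockSketch class are assumptions for the example, and the engine may map states to time differently.

```java
// Hypothetical illustration of how a fixed number of states implies match time.
class MatchClockSketch {

    static final int STATES_PER_MATCH = 510; // the "magic number" from the text
    static final int MATCH_MINUTES = 90;     // assumption: regulation time only

    // Match minute in which the given state falls, assuming an even spread.
    static int minuteOf(int stateIndex) {
        if (stateIndex < 0 || stateIndex >= STATES_PER_MATCH) {
            throw new IllegalArgumentException("State index out of range: " + stateIndex);
        }
        return stateIndex * MATCH_MINUTES / STATES_PER_MATCH;
    }

    // Average real-time length of one state, in seconds.
    static double secondsPerState() {
        return MATCH_MINUTES * 60.0 / STATES_PER_MATCH;
    }

    public static void main(String[] args) {
        System.out.println("State 255 falls in minute " + minuteOf(255));
        System.out.println("Seconds per state: " + secondsPerState());
    }
}
```

Under these assumptions a state lasts a little over ten seconds on average, which gives an idea of how coarse a single “moment” in the model is.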

The “Y” attribute indicates where on the pitch the challenge takes place. The “X” coordinate is omitted for challenge representations. The “Team” attribute denotes which team starts the challenge (it does not always describe the conclusion of the challenge, as there may be a result tag which denotes that). Finally, an “Ending” attribute may be included for describing how the challenge has concluded.


Analyzing each and every state transition rule in the data file is out of scope of this article. However, some examples would be really useful.

<condition Y='Attack' X='Flank' Pressure='Avoid' Action='Ball Control' ModuloRowId='87'>
  <result special='Foul'/>
</condition>

The initial state is that the team having possession of the ball is attacking and has the ball in the “attack” area (let’s say around the penalty box or in the box). Since the X coordinate has value ‘Flank’, the ball should be about either left or right of the penalty box. The player having the ball is under pressure and tries to avoid the defender(s), by holding the ball (action = ‘Ball Control’). The result in this case is winning a free kick (result special=’Foul’). Notice that there is no need to define the coordinates of the result state (they should be the same as the “current” state).

<condition Y='Defence' X='Axis' Pressure='Clear' Action='Forward Pass' ModuloRowId='48'>
  <result Y='Centre' X='Axis' Pressure='Avoid' Team='Own'/>
</condition>

The ball is in the half of the team which has possession, in the ‘axis’ or the center of the field, without any pressure. The player having the ball decides to make a ‘forward pass’ which results in a teammate (Team=’Own’ in the result) having the ball in the opponent’s half (Y = ‘Centre’), trying to avoid the pressure of the opponents.

<condition Y='Centre' X='Axis' Pressure='Avoid' Action='Back Pass' ModuloRowId='49'>
  <challenge Y='Centre' Team='Opp' Ending='Loose Ball'/>
  <result Y='Centre' X='Axis' Pressure='Avoid' Team='Opp'/>
</condition>

This transition has a challenge as part of the result, or rather reaches the result state through a challenge. The team having the ball has it in the opponents’ half, in the centre of the X axis, and is under pressure which it is trying to avoid. The action chosen is a ‘Back Pass’, which results in a challenge. This example demonstrates that a challenge is an intermediate state in which no team clearly has possession of the ball. You can imagine that the opponent team intercepted the pass (Team = ‘Opp’ in the ‘challenge’ element), with the result that the ball is ‘loose’ (see the ball possession change modes section). Finally, the opponent team has the ball in the half of the team which originally had possession, trying to avoid the defenders’ pressure.
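For readers who want to poke at the XML themselves, the rules above can be read with the standard Java DOM parser. The sketch below turns a single rule into a one-line description. It assumes that each condition element is closed with a matching end tag and is parsed on its own as a small document; the ConditionParseSketch class and its describe method are hypothetical and unrelated to how the match engine loads its binary file.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical reader for a single transition rule of the probability model.
class ConditionParseSketch {

    // Summarises one <condition> element: action, optional challenge, result.
    static String describe(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element condition = doc.getDocumentElement();
            StringBuilder out = new StringBuilder(condition.getAttribute("Action"));

            NodeList challenges = condition.getElementsByTagName("challenge");
            if (challenges.getLength() > 0) {
                Element challenge = (Element) challenges.item(0);
                out.append(" -> challenge ending in ").append(challenge.getAttribute("Ending"));
            }

            Element result = (Element) condition.getElementsByTagName("result").item(0);
            if (result.hasAttribute("special")) {
                out.append(" -> special result ").append(result.getAttribute("special"));
            } else {
                out.append(" -> ").append(result.getAttribute("Team")).append(" team at ")
                   .append(result.getAttribute("Y")).append('/').append(result.getAttribute("X"));
            }
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse condition", e);
        }
    }

    public static void main(String[] args) {
        String rule = "<condition Y='Attack' X='Flank' Pressure='Avoid' Action='Ball Control'>"
                    + "<result special='Foul'/></condition>";
        System.out.println(describe(rule));
    }
}
```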

Ball possession change

For statistics reasons, the information of how possession of the ball changed is also recorded in the data file. The following modes are taken into account:

  1. Normal: Nothing special about possession change
  2. Pass interception: A defending player is credited with an interception
  3. Goalkeeper interception: The goalkeeper wins the ball from e.g. a crossing
  4. Man challenge lost: The defending player wins the ball from someone trying to dribble him
  5. Loose ball: Normally a result of a challenge where no team had clear possession of the ball immediately after the challenge
  6. Lost ball control: The ball is won from someone attempting to hold or control the ball
  7. Bouncing off: Ball is won from bouncing off a shot

Being based on actual examples, this is quite a detailed list for this stage. In the data file, a ball possession change type is represented by the “PossessionChange” attribute of the “result” tag, or by the “Ending” attribute of the “challenge” tag mentioned above.
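The seven modes map naturally onto an enumeration. The sketch below is one possible way to model them; apart from ‘Loose Ball’, which appears in the example above, the label strings are guesses at how the values might be spelled in the data file, and the PossessionChange enum itself is a hypothetical name, not a type from the library.

```java
// Hypothetical enumeration of the possession change modes listed above.
// Only the "Loose Ball" label is confirmed by the examples; the rest are assumed spellings.
enum PossessionChange {
    NORMAL("Normal"),
    PASS_INTERCEPTION("Pass Interception"),
    GOALKEEPER_INTERCEPTION("Goalkeeper Interception"),
    MAN_CHALLENGE_LOST("Man Challenge Lost"),
    LOOSE_BALL("Loose Ball"),
    LOST_BALL_CONTROL("Lost Ball Control"),
    BOUNCING_OFF("Bouncing Off");

    final String label;

    PossessionChange(String label) {
        this.label = label;
    }

    // Looks up a mode by the text found in a PossessionChange or Ending attribute.
    static PossessionChange fromLabel(String label) {
        for (PossessionChange mode : values()) {
            if (mode.label.equalsIgnoreCase(label)) {
                return mode;
            }
        }
        throw new IllegalArgumentException("Unknown possession change: " + label);
    }
}
```

With something like this, statistics code can count possession changes per mode without comparing strings all over the place.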


I hope this section clarifies the structure of the data file and provides an overview of the workings of the match engine. You might have noticed some strange terminology regarding the values of some of the attributes, while some others might be self-explanatory. I won’t go into detail about their semantics, which right now can only be traced in the source code. This data file representation, after all, was meant to be used as a binary file and its xml version is only used as a convenient representation. However, besides semantics, the attribute values you see function as qualifiers in terms of determining the states chaining (i.e. which state should follow) or for producing stats (as mentioned above in the “ball possession change” attribute).


(The old) blog is back

It’s been a while since I posted anything about the project in blog form. The project used to be hosted on WordPress from Sourceforge, but when one day this stopped, I wasn’t very prompt in transferring it elsewhere.

Well, here we are now. Since I am doing my little research on CMSs, I thought I would give WordPress a try by re-posting my old blog entries. A lot has happened in the project since the last time I blogged: absolutely nothing. The project is now somewhat deprecated, but a new one is on the way. No more updates will be made, but a rewrite will be based on the original. As a result, there will be no new blog posts, at least none relating to new functionality.

In the meantime, after the original blog went down, the blog posts were sent as documentation via email to the (very) few who were interested. The original blog had almost no visitors (if Sourceforge showed stats correctly), but at least I know first-hand that some people actually read these posts/documentation. Now, I will be making these entries public again.