Simulation engine for a social teamwork game
About The Project
The purpose of the extended model is to build a simulation engine that serves as the basis for the "social teamwork game". The model simulates the state of the world and its development over time for workers within an organization who are assembled into teams to work on projects. The simulation engine will then serve as the underlying model for the world in a Python-Django prototype for a web application (e.g., hosted on Heroku). In that prototype a user (either a worker or an organization) goes through the steps of the user journey and takes part in the social teamwork game. The model will be used to test a number of hypotheses, to understand emergent properties and/or fine-tune the input parameters, to visualize those results, to generate training data for the Machine Learning prototype, and to gather model proof points for further discussion.
Built With
SuperScript is built with the following key frameworks/packages:
- Mesa: agent-based modeling framework.
- Scipy.optimize: optimisation package used for team allocation.
- Pathos: used for parallel optimizations on multicore architectures.
Getting Started
It should be straightforward to get SuperScript working on any system that has Python installed. To get a local copy up and running, follow these simple steps.
Prerequisites
The following are required in order to run SuperScript locally:
- Python3.6 or above (python_version>="3.6")
- either pip (recommended) or conda
- venv
Installation
The recommended installation method is to use the package manager pip, because this is the most lightweight solution on new machines (e.g. on AWS). But instructions are also provided for installation using conda.
Using pip:

- Clone the repo and change into the directory:

  ```shell
  git clone https://github.com/cm1788/SuperScript
  # Windows:
  chdir SuperScript
  # Linux:
  cd SuperScript
  ```

  Note: if you get an authentication error when using git clone (or a "repository not found" error), you can use the syntax: git clone https://username:password@github.com/cm1788/SuperScript

- Create a new virtual environment:

  ```shell
  python -m venv superscriptenv
  ```

  Note: the python command here may need to be replaced with python3 or python3.6 or py depending on your system, whichever points to the version of Python that you want to use.

- Activate your virtual environment:

  ```shell
  # Windows:
  superscriptenv\Scripts\activate
  # Linux:
  source superscriptenv/bin/activate
  ```

- Install requirements (ensuring that you use the correct requirements file for Windows/Linux):

  ```shell
  python -m pip install --upgrade pip
  # Windows:
  python -m pip install -r requirements.txt
  # Linux:
  python -m pip install -r requirements_linux.txt
  ```

  Note: just using python here should be fine provided that you have activated the superscriptenv environment.

  Note: the requirements.txt file was produced using pip list --format=freeze > requirements.txt on a Linux system with Python 3.6. Depending on your installation system, the following dependencies may have problematic version specifications: kiwisolver, numpy, scipy. If these or any other dependencies fail to install, try manually removing the version specification from requirements.txt (i.e. remove the ==X.X.X from that line in the file).
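If many pins need removing, the manual fix above can be automated. This small helper is only an illustration (it is not part of the repository) and simply strips any `==X.X.X` suffix from each requirement line:

```python
import re

def strip_pins(requirement_lines):
    """Remove exact version pins from requirements lines,
    e.g. 'numpy==1.19.5' -> 'numpy'.

    Illustrative helper only; mirrors the manual fix suggested
    in the note above."""
    return [re.sub(r"==.*$", "", line) for line in requirement_lines]
```

For example, `strip_pins(["numpy==1.19.5", "pathos"])` returns `["numpy", "pathos"]`; the result can be written back to requirements.txt before re-running pip install.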
Using conda:

- Clone the repo:

  ```shell
  git clone https://github.com/cm1788/SuperScript
  ```

- Create the virtual environment from the YAML file:

  ```shell
  conda env create -f superscriptenv.yml
  ```

- Activate the virtual environment:

  ```shell
  conda activate superscriptenv
  ```
Usage
To run SuperScript in the Mesa server, first ensure that the superscriptenv environment is activated. Most of the important model parameters can be selected in the GUI; any other parameters need to be adjusted in config.py before launching the server.
To activate the social network visualisation, you need to uncomment line 318 in
server.py (this feature is deactivated
by default because it is slow to recompute the network layout on each timestep). However, the social network can be
saved for later analysis by setting the SAVE_NETWORK flag to True.
Note: The parallel basinhopping optimisation (ORGANISATION_STRATEGY = Basin) can be very slow depending on the
size of simulation. For real-time visualisation it is best to use Random or Basic, although the teams
produced will not be as good.
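The two settings mentioned above live in config.py. The excerpt below is illustrative only (the values shown are examples, not the repository defaults; consult config.py for the full list of parameters):

```python
# Illustrative excerpt of settings in config.py.
# SAVE_NETWORK and ORGANISATION_STRATEGY are referenced in this README;
# the values below are examples, not necessarily the defaults.

# Team allocation strategy: 'Random' and 'Basic' are fast enough for
# real-time visualisation; 'Basin' (parallel basinhopping) produces
# better teams but can be very slow on large simulations.
ORGANISATION_STRATEGY = 'Basic'

# Save the social network of successful collaborations for later analysis.
SAVE_NETWORK = True
```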
Running simulations on AWS
Instructions for getting set up on AWS are provided in documentation/aws_instructions, and a Python script is provided for running these simulations: aws_run_simulations.py
Running batch simulations
A Python script for running batch simulations is provided: batch_run_simulation.py
Analysis
There are five analysis notebooks, and the data to run them are included in the repository (in analysis/):
- initial_aws_simulations.ipynb explores early simulated data (pre-release).
- testing_hypotheses.ipynb explores batch simulation data that were produced using release v0.1 of the model.
- testing_hypotheses_2.ipynb expands on the previous analysis to study the effects of the skill decay parameter.
- network_analysis.ipynb studies the social network of successful collaborations between workers.
- probability_components.ipynb explores the project success probabilities and their components and develops a probability normalisation method that is used in releases v1.2 and above to ensure that success probabilities lie in the closed interval [0, 1].
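The model's actual normalisation method is developed in probability_components.ipynb and is not reproduced here; the sketch below is only a minimal illustration of the requirement it enforces, namely that success probabilities lie in the closed interval [0, 1] (the function name is hypothetical):

```python
def normalise_probability(p):
    """Clamp a raw success 'probability' into the closed interval [0, 1].

    Minimal illustration only: the real normalisation method used in
    releases v1.2+ is derived in probability_components.ipynb and may
    differ from simple clipping.
    """
    return max(0.0, min(1.0, p))
```

For instance, a raw component sum of 1.3 is mapped to 1.0 and a value of -0.2 to 0.0, while values already inside [0, 1] are unchanged.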
The hypotheses are detailed in the model specification document. There are also new analysis scripts in retrospective_metrics. These scripts are for metrics or analyses that are currently (v1.2) computed retrospectively; in the next release they will be incorporated into the core simulation model and the scripts will be deprecated (see Roadmap below). The scripts are:
- network_reconstruction_ne.py which reconstructs the network of successful collaborations between workers on every timestep and stores this in a concise representation (an initial and diff file) to be imported to networkx.
- preset_E.py which creates a new pseudo-simulation by post-processing pre-simulated data to remove the inactive workers from the workforce on each timestep (down to a target slack of 10%). The resulting simulation is referred to as 'Preset E' in the Streamlit application.
- roi.py which computes the new 'Return on Investment' metric and averages it across the workforce on each timestep. See the model specification document for more details.
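The initial-plus-diff representation used by network_reconstruction_ne.py can be replayed into a sequence of per-timestep networks. The on-disk format is not specified here, so this sketch assumes a hypothetical representation in which each diff is a dict with 'add' and 'remove' edge lists; each snapshot is a plain set of edges that could be loaded into networkx via nx.Graph(edges):

```python
def replay_network(initial_edges, diffs):
    """Rebuild the collaboration network at each timestep from an
    initial edge list plus per-timestep diffs.

    Assumes a hypothetical diff format: each diff is a dict with
    optional 'add' and 'remove' lists of edges. Edges are stored as
    frozensets so that (a, b) and (b, a) are the same undirected edge.
    """
    edges = {frozenset(e) for e in initial_edges}
    snapshots = [set(edges)]           # snapshot at t=0
    for diff in diffs:
        edges |= {frozenset(e) for e in diff.get('add', [])}
        edges -= {frozenset(e) for e in diff.get('remove', [])}
        snapshots.append(set(edges))   # snapshot after this timestep
    return snapshots
```

Storing only an initial state plus diffs keeps the representation concise when the network changes by a few edges per timestep, at the cost of having to replay the diffs to recover any given snapshot.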
Roadmap
The current roadmap consists of the following milestones with expected shipping dates in square brackets:
- Add ROI calculation, 'Preset E' and the new network output format to the core simulation code (rather than using scripts to compute these retrospectively). [Q1 2022]
- Complete unit testing framework to reach 100% (currently at 97%). [Q1 2022]
- Add performance benchmarking (for optimisation routine). [Q3 2021]
- Implement Reinforcement Learning approach to team allocation and compare performance with basin-hopping. [Q2 2022]
- Alter training mechanism (and explore the 'rich get richer' effect). [tbc]
- Edit social network to track more information about historical collaborations. [tbc]
- Review model extensions and choose those to implement. [tbc]
See open issues for details of milestone 1.
Tests
The repository is currently at 97% code coverage. To run the unit tests use:
```shell
coverage run -m unittest discover && coverage report
```
For an interactive html coverage report run coverage html -i and then open index.html in your browser.
Model development
The documentation folder contains a Word document with the full model specification.
The directory model_development contains Jupyter Notebooks relating to various stages of development of the model, from initial experiments prior to coding the model, through to integration tests and performance benchmarking. These notebooks include:
- active_project_equilibrium.ipynb: short experiment to determine equilibrium behaviour of the model based on proposed dynamics (i.e. number of active projects, number of active workers, number of workers per project). Written prior to coding up the model.
- function_definitions.ipynb: contains definitions of the various functions used throughout the model, with the original piecewise functions that were proposed in the draft spec, and possible alternatives that were suggested. In the model code functions are supplied by the FunctionFactory.
- go_allocate_experiment.ipynb: determines the time taken to find the global optimum team allocation (which was originally called 'go_allocate'). The conclusion, as expected, is that the problem is not computationally tractable for large simulations, and so numerical optimisation will be required.
- manually_testing_model.ipynb: simple integration test to check that model is running correctly and inspect some of the variables.
- comparing_strategies.ipynb: comparing the 'Random' and 'Basic' team allocation strategies to confirm that 'Basic' does improve over the random method in terms of project success probability.
- optimization_experiments_gekko.ipynb: failed attempt to get Gekko mixed-integer optimisation working (included for completeness).
- optimization_experiments_scipy.ipynb: experiments to get SciPy optimisation up and running.
- Benchmarking notebook: to be added in future milestone.
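The intractability found in go_allocate_experiment.ipynb is easy to see from the size of the search space. The sketch below (function name hypothetical, not from the repository) counts the candidate teams for a single project, which is only a lower bound on the full allocation problem across many concurrent projects:

```python
from math import comb

def team_allocation_count(n_workers, team_size):
    """Number of distinct teams of a given size drawn from the
    workforce: a lower bound on the brute-force search space that a
    global optimum ('go_allocate') approach must enumerate for one
    project."""
    return comb(n_workers, team_size)

# Even one project's team choice grows combinatorially with workforce size:
team_allocation_count(100, 5)   # 75,287,520 candidate teams
```

This is why the notebook concludes that exhaustive search is infeasible for large simulations and numerical optimisation (e.g. the SciPy basinhopping strategy) is required instead.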
