
Meeting 8 November 2017

Decidetto edited this page Nov 8, 2017 · 8 revisions

Meeting Minutes - Project Group 8

Location:  Kurt's office (SSK)
Date:   08 November 2017
Time:   14:30 - 15:00

Attendance

  • Joshua Scheidt
  • Marciano Geijselaers
  • Timo Raff
  • Max Meijers
  • Simon Craenen
  • Kurt Driessens

Agenda Questions

  • Does the website have to be online, or is a local offline website acceptable?
  • Could we use the website generated by GitHub Pages?
  • Can we use restricted Boltzmann machines as a classifier? Once all games have been sorted into categories, we can train one machine per category. When we receive a new game, we feed its features (yet to be selected) into all of the machines and pick the category whose machine gives the lowest energy.
  • Classifying the games by hand might be hard. Could we use an auto-encoder or something similar to find the features for classification itself, or to classify the games directly?
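The restricted Boltzmann machine question could be sketched roughly as below. This is only an illustration of the "lowest energy wins" idea, using the free energy of a binary RBM. The weights, the category names ("puzzle", "shooter") and the three binary features are hypothetical stand-ins, not trained models or features chosen by the group. Note also that free energies of separately trained RBMs are only comparable up to each machine's partition function, so a real classifier would likely need some calibration.

```python
import math

def free_energy(v, W, b, c):
    """Free energy of a binary RBM for visible vector v.

    W[j][i] couples hidden unit j to visible unit i, b is the visible
    bias and c is the hidden bias. Lower free energy means the RBM
    considers v more probable (up to its partition function).
    """
    visible_term = sum(bi * vi for bi, vi in zip(b, v))
    hidden_term = 0.0
    for wj, cj in zip(W, c):
        x = cj + sum(wji * vi for wji, vi in zip(wj, v))
        # softplus log(1 + e^x), written to avoid overflow for large |x|
        hidden_term += max(x, 0.0) + math.log1p(math.exp(-abs(x)))
    return -visible_term - hidden_term

def classify(v, rbms):
    """Return the category whose RBM assigns v the lowest free energy."""
    return min(rbms, key=lambda name: free_energy(v, *rbms[name]))

# Hypothetical hand-set parameters (W, b, c) standing in for trained RBMs.
rbms = {
    "puzzle": ([[4.0, 4.0, -4.0]], [0.0, 0.0, 0.0], [-2.0]),
    "shooter": ([[-4.0, -4.0, 4.0]], [0.0, 0.0, 0.0], [-2.0]),
}
game_features = [1, 1, 0]  # hypothetical binary features of a new game
print(classify(game_features, rbms))
```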

Agenda notes:

Look for papers on:

  • Functional gradient policy descent
  • Non-parametric policy gradient (Kurt's name is on the paper)

Categorising games

  • A puzzle-game characteristic detectable at runtime might be whether the level contains movable objects
  • Look very specifically into game characteristics visible in code

Possible approaches

    • Discover type
    • Select strategy for type
      • Learn strategy for type
    • <s,a> state action pairs -> hidden layer auto-encoder -> <s,a>

    • Do Q-learning in latent space

    • Have to find the latent space <- real question

    • Otherwise use <s,a,s'> (state, action, next state) tuples, which essentially encode the dynamics of the game

      • Might want to include the reward in the tuple, though that might not be optimal.
    • Point locations of Q values of tuples occupy latent space

      • Groups might occupy a certain area of latent space
      • Sum over all games is assumed to be Gaussian
      • Axes of latent space not defined, have to find them based on relations between points
      • Might be able to classify based on location
      • When we find a Q-value, it will give a probability of
    • Could even add the category/type to the tuple
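The idea that groups of games occupy their own region of latent space, and that a point could be classified by its location, might look something like the sketch below. It assumes one axis-aligned Gaussian per game type, which is a simplification chosen for illustration. The 2-D latent codes and the type names are made up; in practice they would come from the encoder.

```python
import math

def fit_gaussian(points):
    """Fit an axis-aligned Gaussian (per-dimension mean and variance)."""
    n = len(points)
    dims = len(points[0])
    mean = [sum(p[d] for p in points) / n for d in range(dims)]
    # Small floor keeps the variance positive for degenerate groups.
    var = [max(sum((p[d] - mean[d]) ** 2 for p in points) / n, 1e-6)
           for d in range(dims)]
    return mean, var

def log_likelihood(z, mean, var):
    """Log density of z under the axis-aligned Gaussian."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
               for x, m, v in zip(z, mean, var))

def classify(z, models):
    """Pick the game type whose Gaussian gives z the highest log density."""
    return max(models, key=lambda name: log_likelihood(z, *models[name]))

# Hypothetical 2-D latent codes, as if produced by an encoder.
latent_codes = {
    "puzzle": [(0.1, 0.2), (-0.2, 0.0), (0.0, -0.1), (0.2, 0.1)],
    "platformer": [(2.9, 3.1), (3.2, 2.8), (3.0, 3.0), (2.8, 3.2)],
}
models = {name: fit_gaussian(pts) for name, pts in latent_codes.items()}
print(classify((0.3, 0.1), models))
print(classify((2.5, 2.9), models))
```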


Dialing back to test whether it works at all

  • Choose a number of games to collect <s,a,s',r> data on

  • Need to sample from games. Many many many many many many many MANY samples from games

  • Try building an autoencoder and learning a latent space for a single game

    • Keep it low-dimension
      • 2, maybe 3
    • Keras?
    • Variational autoencoders? Variational Autoencoders in Keras
  • How to represent the state

    • Deictic -- limit visible space to reduce state space and introduce stochasticity

  • Learn the rocket landing game as a trial
  • Taking a sample apparently takes zero time, so take advantage of that
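Collecting <s,a,s',r> samples could start from something as simple as the sketch below. The `ToyLander` class is a hypothetical 1-D stand-in for the rocket game, with invented dynamics and reward, and the random policy is only there to generate data. The real game's interface will differ.

```python
import random
from collections import namedtuple

Transition = namedtuple("Transition", ["s", "a", "s_next", "r"])

class ToyLander:
    """Hypothetical 1-D lander: state is (height, vertical velocity)."""
    ACTIONS = (0, 1)  # 0 = coast, 1 = fire thruster

    def reset(self):
        self.height, self.velocity = 10.0, 0.0
        return (self.height, self.velocity)

    def step(self, action):
        # Gravity pulls down by 1 per step; the thruster pushes up by 2.
        self.velocity += -1.0 + (2.0 if action == 1 else 0.0)
        self.height = max(self.height + self.velocity, 0.0)
        done = self.height == 0.0
        # Reward only on touchdown: softer landings score higher.
        reward = -abs(self.velocity) if done else 0.0
        return (self.height, self.velocity), reward, done

def collect(env, n_samples, rng, max_steps=50):
    """Gather n_samples transitions under a uniformly random policy."""
    data = []
    while len(data) < n_samples:
        s = env.reset()
        for _ in range(max_steps):
            a = rng.choice(env.ACTIONS)
            s_next, r, done = env.step(a)
            data.append(Transition(s, a, s_next, r))
            s = s_next
            if done or len(data) == n_samples:
                break
    return data

samples = collect(ToyLander(), n_samples=1000, rng=random.Random(0))
print(len(samples), samples[0])
```

Since sampling is cheap, `n_samples` can be raised freely. The resulting list of tuples is the kind of input an autoencoder over <s,a,s',r> would be trained on.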


TODO

  • Start collecting data
  • Try solving rocket game
    • Collect experience
    • Try solving things
  • Continue assigning categories to games