
Solving MDPs using Value iteration

To download the notebook and other relevant files, please visit the GitLab repository:

https://gitlab.tudelft.nl/pbhustali/mdp_tutorials

> Application examples

Paper Title    Graph-based reinforcement learning for discrete cross-section optimization of planar steel frames 

Year              2022

Author(s)      Kazuki Hayashi, Makoto Ohsaki   

Link              https://www.sciencedirect.com/science/article/pii/S1474034621002603 

ML Tags

Reinforcement Learning

Value iteration

Q-learning 

Topic Tags

Structural Optimization

Graph Embedding

Summary

This paper proposes a method that combines graph embedding and reinforcement learning to optimize the cross sections of planar (two-dimensional) steel frames. The cross sections are chosen from a given set of standard dimensions, and constraints on the structural performance of the frame are integrated into the problem. The cross-section size of each member changes monotonically along its length to keep the learning time realistic.

The graph embedding process is adapted to extract member features, which are then transformed into action values. Q-learning is used to formulate the loss function to be minimized. Despite the long computation time and the complexity of the training process, the agent performed strongly and produced reasonable solutions in the planar case. However, the method cannot be applied to 3D frames or to frames with irregular cross sections.
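As a rough illustration of how Q-learning can define a loss to be minimized, the sketch below computes the squared temporal-difference error for a single transition. It is a generic textbook form, not the loss used in the paper; the function name `q_learning_loss`, the discount factor `gamma`, and the list-based action values are assumptions made for this example.

```python
def q_learning_loss(q_values, action, reward, next_q_values, gamma=0.99, done=False):
    """Squared temporal-difference error for one transition (illustrative only).

    q_values and next_q_values are lists of action values for the current
    and the next state; all names here are hypothetical.
    """
    # Bootstrapped target: reward plus the discounted best next-state value.
    # At a terminal state there is nothing to bootstrap from.
    target = reward if done else reward + gamma * max(next_q_values)
    td_error = target - q_values[action]
    return 0.5 * td_error ** 2


# Example: taking action 1 with reward 1.0 and gamma 0.5
loss = q_learning_loss([1.0, 2.0], 1, 1.0, [0.0, 3.0], gamma=0.5)
```

In a deep RL setting, the action values would come from a network (here, one fed by the embedded member features) and this loss would be averaged over a batch and minimized by gradient descent.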

Paper Title    Energy optimization associated with thermal comfort and indoor air control via a deep reinforcement learning algorithm

Year              2019

Author(s)      William Valladares, Marco Galindo, Jorge Gutiérrez, Wu-Chieh Wu, Kuo-Kai Liao, Jen-Chung Liao, Kuang-Chin Lu, Chi-Chuan Wang

Link              https://www.sciencedirect.com/science/article/pii/S0360132319302008?casa_token=eULhiqiCwHIAAAAA:aXFZGFWg8QbQjEvYNWCVxS3E2sYVcVDTqM5ucuOYZUU3CIhiV6o_QFLj54jn8OrexlKzDt2Z 

ML Tags

Reinforcement Learning

Q-learning

Double Q-learning 

Topic Tags

Building performance optimization

Thermal comfort

Indoor air quality

HVAC control 

Software & Plug-ins Used

  • EnergyPlus for energy simulation (in combination with SketchUp Make 2017 Edition and OpenStudio v1.12.4)

  • BCVTB as the open-source co-simulation framework

  • Python as the programming interface for the DRL agent

Summary

This paper proposes an RL-based method that maintains adequate thermal comfort and air quality in indoor environments while minimizing the energy consumed by mechanical systems such as air-conditioning units and ventilation fans. The RL framework is based on Q-learning and, in particular, uses double Q-learning, which can handle the resulting complex interactions.
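To make the idea behind double Q-learning concrete, the sketch below shows one update step of the tabular variant: two value tables are kept, one selects the greedy next action and the other evaluates it, which reduces the overestimation that a single table tends to produce. This is a simplified illustration, not the paper's deep RL implementation; the function name, the dictionary-based tables, and the `alpha` and `gamma` defaults are assumptions.

```python
import random


def double_q_update(q_a, q_b, state, action, reward, next_state,
                    alpha=0.1, gamma=0.99, rng=random):
    """One tabular double Q-learning step (illustrative only).

    q_a and q_b map each state to a list of action values.
    """
    # Randomly choose which table to update; the other one evaluates.
    if rng.random() < 0.5:
        update, evaluate = q_a, q_b
    else:
        update, evaluate = q_b, q_a
    # Select the greedy next action with the table being updated...
    next_values = update[next_state]
    best_next = max(range(len(next_values)), key=lambda a: next_values[a])
    # ...but score it with the other table to reduce overestimation bias.
    target = reward + gamma * evaluate[next_state][best_next]
    update[state][action] += alpha * (target - update[state][action])
```

Deep variants follow the same select-with-one, evaluate-with-the-other pattern, with neural networks in place of the two tables.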


Three main stages are defined: pre-training, training, and control of actions. During the pre-training phase, initial data are generated and stored so that they can be used during training. Once the experience data have been aggregated, the learning process and control loop are initialized. The algorithm is applied to a classroom and a laboratory case study. In both environments, the agent controls the conditions adequately, maintaining satisfactory predicted mean vote (PMV) and air quality values.
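The three stages can be sketched with a simple experience buffer, as below. This is a minimal outline under stated assumptions, not the authors' code: `step_fn` stands in for a hypothetical environment call (in the paper, the building simulation), and collecting the initial data with random actions is an assumption, since the summary does not say how those data are generated.

```python
import random
from collections import deque


def collect_experience(step_fn, initial_state, n_actions, n_steps, buffer, rng=random):
    """Pre-training stage: fill `buffer` with transitions from random actions."""
    state = initial_state
    for _ in range(n_steps):
        action = rng.randrange(n_actions)
        next_state, reward = step_fn(state, action)  # hypothetical environment call
        buffer.append((state, action, reward, next_state))
        state = next_state
    return buffer


def sample_batch(buffer, batch_size, rng=random):
    """Training stage: draw a random mini-batch of stored transitions."""
    return rng.sample(list(buffer), min(batch_size, len(buffer)))


def greedy_action(q_values):
    """Control stage: pick the action with the highest estimated value."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Toy usage: a counter "environment" whose reward penalizes larger actions.
buffer = deque(maxlen=5)  # the oldest transitions are dropped once full
collect_experience(lambda s, a: (s + 1, -float(a)), 0, 3, 8, buffer)
batch = sample_batch(buffer, 3)
```

Bounding the buffer with `maxlen` is one common design choice; it keeps memory fixed and biases training toward recent experience.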
