By Derong Liu, Qinglai Wei, Ding Wang, Xiong Yang, Hongliang Li
This book covers the latest developments in adaptive dynamic programming (ADP). The text begins with a thorough background review of ADP so that readers are sufficiently familiar with the fundamentals. In the core of the book, the authors address first discrete- and then continuous-time systems. Coverage of discrete-time systems starts with a more general form of value iteration to demonstrate its convergence, optimality, and stability with complete and thorough theoretical analysis. A more realistic form of value iteration is then studied, in which the value function approximations are assumed to have finite errors. Adaptive Dynamic Programming also details another avenue of the ADP approach: policy iteration. Both basic and generalized forms of policy-iteration-based ADP are studied with complete and thorough theoretical analysis in terms of convergence, optimality, stability, and error bounds. Among continuous-time systems, the control of affine and nonaffine nonlinear systems is studied using the ADP approach, which is then extended to other branches of control theory including decentralized control, robust and guaranteed cost control, and game theory. In the last part of the book, the real-world significance of ADP theory is presented, focusing on three application examples developed from the authors' work:
• renewable energy scheduling for smart power grids;
• coal gasification processes; and
• water–gas shift reactions.
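As a minimal illustration of the value-iteration scheme the book analyzes, the sketch below runs value iteration on a toy finite-state system. The states, dynamics `f`, utilities `U`, and discount factor are invented for illustration; the book itself treats general nonlinear systems with neural-network function approximation.

```python
# Toy 3-state, 2-control system (illustrative data, not from the book).
# f[(x, u)] gives the successor state; U[(x, u)] is the utility (stage cost).
f = {(0, 0): 1, (0, 1): 2, (1, 0): 2, (1, 1): 0, (2, 0): 0, (2, 1): 1}
U = {(0, 0): 1.0, (0, 1): 4.0, (1, 0): 2.0, (1, 1): 0.5,
     (2, 0): 3.0, (2, 1): 1.0}
gamma = 0.9  # discount factor, 0 < gamma <= 1

# Value iteration: V_{i+1}(x) = min_u [ U(x, u) + gamma * V_i(f(x, u)) ]
V = {x: 0.0 for x in range(3)}
for _ in range(1000):
    V_new = {x: min(U[x, u] + gamma * V[f[x, u]] for u in range(2))
             for x in range(3)}
    if max(abs(V_new[x] - V[x]) for x in range(3)) < 1e-12:
        V = V_new
        break
    V = V_new

# Greedy control law extracted from the converged value function.
policy = {x: min(range(2), key=lambda u: U[x, u] + gamma * V[f[x, u]])
          for x in range(3)}
```

Because the discount factor makes the value-iteration operator a contraction, the loop converges geometrically; the book's analysis establishes the analogous convergence, optimality, and stability results in far more general settings.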
Researchers studying intelligent control methods, and practitioners looking to apply them in the chemical-process and power-supply industries, will find much to interest them in this thorough treatment of an advanced approach to control.
Similar robotics & automation books
The hard disk drive is among the finest examples of the precision control of mechatronics, with tolerances below one micrometer achieved while operating at high speed. Increasing demand for higher data density, as well as disturbance-prone operating environments, continues to test designers' mettle.
On a timely topic; it arrived about a week after I ordered it, and the book is in good condition.
LEGO Mindstorms NXT is the most popular robot on the market. James Kelly is the author of the most popular blog on NXT (http://thenxtstep.blogspot.com/), with over 30,000 hits a month. The NXT-G visual programming language for the NXT robot is completely new, and there are currently no books available on the subject.
This work gives a detailed introduction to the identification of linear and nonlinear single- and multi-variable systems. Numerous identification methods are presented, with which a mathematical model describing the system's behavior can be determined from measured input and output signals.
- Stellgeräte für die Prozeßautomatisierung: Berechnung — Spezifikation — Auswahl
- Robots, Androids, and Animatrons
- Robust Adaptive Control
- Moral Machines: Teaching Robots Right from Wrong
- Optimal control : an introduction to the theory and its applications
- Optimization and Optimal Control in Automotive Systems
Additional info for Adaptive Dynamic Programming with Applications in Optimal Control
IEEE Circuits Systems Mag 9(3):32–50
43. Lewis FL, Vrabie D, Vamvoudakis KG (2012) Reinforcement learning and feedback control: Using natural decision methods to design optimal adaptive controllers. IEEE Control Syst Mag 32(6):76–105
44. Li H, Liu D (2012) Optimal control for discrete-time affine non-linear systems using general value iteration. IET Control Theory Appl 6(18):2725–2736
45. Li H, Liu D, Wang D (2014) Integral reinforcement learning for linear continuous-time zero-sum games with completely unknown dynamics.
From Fig. 1.4, the critic network training is done by minimizing the error in (1.8), where Qk = Q(xk, uk, Wqc) and Wqc represents the parameters of the critic network, with the training target given by (1.7). From Fig. 1.4, we can see that Qk = Q(xk, uk, Wqc) is equivalentent to Ĵk+1 = Ĵ(x̂k+1, Wc) = Ĵ(F̂(xk, uk), Wc). The two outputs will be the same given the same inputs xk and uk. However, the two relationships are different. In model-free ADP, the output Qk is explicitly a function of xk and uk, while the model network F̂ becomes totally hidden and internal.
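The model-free point above can be sketched in code: a critic that takes (x, u) directly needs no model network F̂. The quadratic feature basis, learning rate, and gradient rule below are illustrative assumptions, not the book's actual critic-network architecture.

```python
# Illustrative critic sketch: Q(x, u) is linear in the parameters Wqc over a
# quadratic feature basis, trained toward the one-step Bellman target
# U(x, u) + gamma * Q(x', u'). Scalar x and u are assumed for simplicity.
def features(x, u):
    return [x * x, x * u, u * u, x, u, 1.0]

def Q(x, u, Wqc):
    return sum(w * p for w, p in zip(Wqc, features(x, u)))

def critic_update(Wqc, x, u, x_next, u_next, utility, gamma=0.95, lr=0.01):
    # Temporal-difference error between Q_k and its one-step target.
    target = utility + gamma * Q(x_next, u_next, Wqc)
    error = Q(x, u, Wqc) - target
    # Gradient step on the squared error with respect to the parameters.
    return [w - lr * error * p for w, p in zip(Wqc, features(x, u))]
```

Note that `critic_update` only ever consumes observed quantities (x, u, the next state, and the utility); the system dynamics never appear explicitly, which is exactly what makes the Q-function formulation model-free.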
denotes the control sequence starting at time k, U(·, ·) ≥ 0 is called the utility function, and γ is the discount factor with 0 < γ ≤ 1. Note that the function J depends on the initial time k and the initial state xk, and it is referred to as the cost-to-go of state xk. The cost in this case accumulates indefinitely; problems of this kind are referred to as infinite-horizon problems in dynamic programming. In finite-horizon problems, on the other hand, the cost accumulates over a finite number of steps.
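The discounted cost-to-go just described can be evaluated numerically from a trajectory of utilities. The helper below is a sketch under the assumption that a finite truncation of the infinite sum is acceptable, which holds whenever γ < 1 and the utilities are bounded.

```python
def cost_to_go(utilities, gamma=0.9):
    """Discounted cost J(x_k) = sum over i >= k of gamma^(i-k) * U(x_i, u_i),
    evaluated on a finite truncation of the trajectory."""
    J = 0.0
    for i, u in enumerate(utilities):
        J += (gamma ** i) * u
    return J

# Sanity check of the geometric-series limit: with constant utility U = 1
# and gamma = 0.5, the infinite-horizon cost is 1 / (1 - gamma) = 2, and a
# long truncation approaches that value.
approx = cost_to_go([1.0] * 100, gamma=0.5)
```

With γ = 1 (the undiscounted case the definition also permits), the truncated sum no longer converges for constant utilities, which is why infinite-horizon analyses typically require either γ < 1 or utilities that vanish along stabilizing trajectories.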