Ex Numerus: August 2011

Thursday, August 4, 2011

Random Notes: Validation of models vs physical systems when controls are developed independently

Problem Statement

How do you validate a model of a system against the physical system when a controller is necessary to make the system operate, and the operational policies of the two controllers were developed independently?

Discussion

Consider a development process with a well-defined operational cost model which drives both the model and the physical system to optimize that operational cost. If the model uses full state feedback for control, and the physical system either uses full state feedback or implements output feedback with state estimators and relies on certainty equivalence, there is no guarantee that under identical test conditions the trajectories of the cost and the trajectories of the states will be identical, even if both systems are optimal with respect to the operational cost. This is because the optimal operational cost is unique, but there is no guarantee that the optimal trajectories are unique. Furthermore, if the controls in both cases are not globally optimal, but only near optimal, then differing trajectories are even more likely. However, because the optimal operational cost is unique, the validation exercise can be decomposed into two validation steps.
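The non-uniqueness point can be illustrated with a toy example. This is only a sketch: the four-state system, its action names, and its stage costs are hypothetical numbers chosen so that two different routes to the goal have the same total cost.

```python
# Hypothetical deterministic system: TRANSITIONS[state][action] = (next_state, stage_cost).
# State 3 is the goal; it has no actions and no further cost.
TRANSITIONS = {
    0: {"up": (1, 1.0), "down": (2, 2.0)},
    1: {"go": (3, 2.0)},
    2: {"go": (3, 1.0)},
    3: {},
}


def rollout(policy, start=0):
    """Follow a policy from `start` to the goal; return (state trajectory, total cost)."""
    state, path, total = start, [start], 0.0
    while TRANSITIONS[state]:
        next_state, cost = TRANSITIONS[state][policy[state]]
        path.append(next_state)
        total += cost
        state = next_state
    return path, total


# Two independently "developed" policies that differ only in the first decision.
policy_a = {0: "up", 1: "go", 2: "go"}
policy_b = {0: "down", 1: "go", 2: "go"}

path_a, cost_a = rollout(policy_a)
path_b, cost_b = rollout(policy_b)
print(path_a, cost_a)
print(path_b, cost_b)
```

Both policies are optimal here because these are the only two routes and they cost the same, yet the state trajectories differ. Comparing trajectories alone would flag a mismatch that comparing costs would not.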

First, the equations which model the physics can be validated against test data from the physical system by measuring the states in the real system, then substituting the state measurements for the output of the integrator in the model. Ideally, the physical system could execute both policies. The errors in the cost and derivative calculations can then be compared to quantify the error between the model of the physics and the real physics.
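A minimal sketch of this first step, under stated assumptions: the model is a hypothetical first-order lag, and the "measurements" are synthetic data generated from a plant whose decay rate differs slightly from the model's. The function names and parameter values are illustrative and do not come from any particular tool.

```python
def model_derivative(x, u, a=0.5, b=1.0):
    """Hypothetical model of the physics: dx/dt = -a*x + b*u."""
    return -a * x + b * u


def derivative_residuals(times, states, controls):
    """Evaluate the model derivative at the *measured* states (no integration)
    and compare it with a finite-difference derivative of the measurements."""
    residuals = []
    for k in range(len(times) - 1):
        dt = times[k + 1] - times[k]
        measured_rate = (states[k + 1] - states[k]) / dt
        predicted_rate = model_derivative(states[k], controls[k])
        residuals.append(predicted_rate - measured_rate)
    return residuals


# Synthetic stand-in for test data: a "real" plant with a = 0.6 instead of 0.5,
# advanced with forward Euler so the finite difference recovers its rate.
dt = 0.1
times = [k * dt for k in range(6)]
controls = [1.0] * 6
states = [2.0]
for k in range(5):
    states.append(states[k] + dt * model_derivative(states[k], controls[k], a=0.6))

residuals = derivative_residuals(times, states, controls)
print(residuals)
```

Because the measured state is fed to the model at every step, the residual at each sample reflects only the error in the physics equations (here 0.1 times the state) and does not accumulate through the controller or the integrator.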

Second, once the errors in the model of the physics are quantified, the error in the costs under the different controllers can be quantified. Ideally, from this step, the optimality of the controls with respect to a globally optimal (or minimizing) controller can be established. One interesting possibility is to use policy improvement to see if the independently developed policies can be merged for better performance. Alternatively, if there are unexplained differences, then the constraints respected by the different policies need to be reconciled. Things like robustness may also contribute to differences. Robustness will in general be driven by different views of noise and risk sensitivity. Following on that, there is the possibility that the equation structures in the different policies lead to different performance limitations. Again, policy improvement may provide a way to identify these structure-imposed limitations.
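One way to read "merging policies by policy improvement" is a one-step lookahead that, at each state, picks between the two candidate actions using the better of the two evaluated cost-to-go functions. The sketch below assumes a hypothetical three-state shortest-path problem; the states, actions, and costs are made up for illustration, and this is one possible merging scheme, not the only one.

```python
ALPHA = 1.0  # shortest-path setting: no discounting, zero-cost goal
GOAL = 2

# Hypothetical model: MODEL[state][action] = (next_state, stage_cost).
MODEL = {
    0: {"fast": (1, 1.0), "slow": (1, 3.0)},
    1: {"fast": (2, 4.0), "slow": (2, 2.0)},
}


def evaluate(policy):
    """Policy evaluation: V(x) = c(x, pi(x)) + alpha * V(f(x, pi(x))), with V(goal) = 0."""
    values = {GOAL: 0.0}
    # States only move toward the goal, so one backward sweep is enough.
    for state in sorted(MODEL, reverse=True):
        next_state, cost = MODEL[state][policy[state]]
        values[state] = cost + ALPHA * values[next_state]
    return values


def merge(policy_1, policy_2):
    """One-step lookahead restricted to the two candidate actions at each state."""
    v1, v2 = evaluate(policy_1), evaluate(policy_2)
    v_best = {x: min(v1[x], v2[x]) for x in v1}
    merged = {}
    for state in MODEL:
        candidates = {policy_1[state], policy_2[state]}
        merged[state] = min(
            candidates,
            key=lambda u: MODEL[state][u][1] + ALPHA * v_best[MODEL[state][u][0]],
        )
    return merged


always_fast = {0: "fast", 1: "fast"}
always_slow = {0: "slow", 1: "slow"}
merged = merge(always_fast, always_slow)

print(evaluate(always_fast)[0], evaluate(always_slow)[0], evaluate(merged)[0])
```

In this example each original policy costs 5 from state 0, while the merged policy costs 3, because each policy was only good in part of the state space. If merging produces no improvement on a real problem, that may itself be informative about constraints or structural limits shared by both policies.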

This work is licensed under a Creative Commons Attribution By license.

Wednesday, August 3, 2011

Random Notes: Thoughts on metrics for engineering tools

If training is required for successful usage, what is the average half-life of the training? In other words, if a group of users is trained, how long until 50% of the users have forgotten some key aspect of tool usage, driving them to abandon the tool?
What is the average time before a user needs to go to the help files to complete a task if they do not use the tool constantly?
Can a user successfully use the tool without training?
Are the documentation and examples sufficient for self learning?
How many actions are required to complete a ‘quickstart’ example?
How many decisions are required to complete a ‘quickstart’ example?
How many choices are there in each decision in a typical workflow?
How difficult is it to integrate the tool into automated workflows?
How difficult is it to customize the tool?
- Can a power user customize the tool?
How long does it take to introduce a new feature in the tool?
How many sentences does it take to describe why a user should adopt the tool?
In the absence of process enforcement, would the users naturally adopt this solution?
What is the time saving for the individual, team, and organization from the adoption of the tool?
If the tool reduces error rates, is there feedback to the users to help them understand the improvement?
Can the input and output to the tool be reused so that the effort can be reapplied?
What is the ‘activation potential’ to get a new user to adopt the tool?
Do new users request access to the tool?
In a corporate setting, how difficult are the permissions to manage?
If a new user is not set up, will the team be able to duplicate the permissions without calling the developers?


Monday, August 1, 2011

Draft: Notes on dynamic programming equations which solve cost models for dynamic systems

Deterministic Cost Models

Description	Cost Model	Dynamic Programming Equations	Restrictions
Finite Horizon Total Cost	\(J^{\pi}\left(x_{0}\right)=\sum_{k=0}^{K}\alpha^{k}\cdot c_{k}\left(x_{k},\pi\left(x_{k}\right)\right)\)	\( V_{k}^{\pi}\left(x\right)=c_{k}\left(x,\pi\left(x\right)\right)+\alpha\cdot V_{k+1}^{\pi}\left(f\left(x,\pi\left(x\right)\right)\right)\),\(\forall k\in\left\{ 0,\cdots,K-1\right\} \) \(V_{K}^{\pi}\left(x\right)=c_{K}\left(x,\pi\left(x\right)\right)\)	\(0\leq\alpha<1\)
Infinite Horizon Total Cost	\(J^{\pi}\left(x_{0}\right)=\sum_{k=0}^{\infty}\alpha^{k}\cdot c\left(x_{k},\pi\left(x_{k}\right)\right)\)	\(V^{\pi}\left(x\right)=c\left(x,\pi\left(x\right)\right)+\alpha\cdot V^{\pi}\left(f\left(x,\pi\left(x\right)\right)\right)\)	\(0\leq\alpha<1\)
Finite Horizon Shortest Path	\(J^{\pi}\left(x_{0}\right)=\sum_{k=0}^{K}\alpha^{k}\cdot c_{k}\left(x_{k},\pi\left(x_{k}\right)\right)\)	\( V_{k}^{\pi}\left(x\right)=c_{k}\left(x,\pi\left(x\right)\right)+\alpha\cdot V_{k+1}^{\pi}\left(f\left(x,\pi\left(x\right)\right)\right)\),\(\forall k\in\left\{ 0,\cdots,K-1\right\} \) \(V_{K}^{\pi}\left(x\right)=c_{K}\left(x,\pi\left(x\right)\right)\)	\(0\leq\alpha\leq1\) \(\left\{ x\in\chi\mid c\left(x,\pi\left(x\right)\right)=0\right\} \neq\emptyset\)
Infinite Horizon Shortest Path	\(J^{\pi}\left(x_{0}\right)=\sum_{k=0}^{\infty}\alpha^{k}\cdot c\left(x_{k},\pi\left(x_{k}\right)\right)\)	\(V^{\pi}\left(x\right)=c\left(x,\pi\left(x\right)\right)+\alpha\cdot V^{\pi}\left(f\left(x,\pi\left(x\right)\right)\right)\)	\(0\leq\alpha\leq1\) \(\left\{ x\in\chi\mid c\left(x,\pi\left(x\right)\right)=0\right\} \neq\emptyset\)
Average Cost	\(J^{\pi}\left(x_{0}\right)=\underset{K\rightarrow\infty}{\lim}\frac{1}{K}\sum_{k=0}^{K-1}c\left(x_{k},\pi\left(x_{k}\right)\right)\)	\(V^{\pi}\left(x\right)+\lambda=c\left(x,\pi\left(x\right)\right)+V^{\pi}\left(f\left(x,\pi\left(x\right)\right)\right)\)	\(V^{\pi}\left(x_{ref}\right)=0\) for some \(x_{ref}\in\chi\)
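As a sanity check on the infinite horizon total cost row, the dynamic programming equation can be iterated to a fixed point and compared against a direct (truncated) evaluation of the cost model. The three-state cyclic system, the single-action policy, and the cost c(x, u) = x below are hypothetical choices made only to keep the sketch small.

```python
ALPHA = 0.5
STATES = [0, 1, 2]


def policy(x):
    """Hypothetical fixed policy pi(x); only one action exists in this toy system."""
    return "advance"


def f(x, u):
    """Deterministic dynamics: cycle 0 -> 1 -> 2 -> 0."""
    return (x + 1) % 3


def c(x, u):
    """Stage cost, chosen arbitrarily as the state index."""
    return float(x)


def evaluate_policy(sweeps=200):
    """Iterate V(x) <- c(x, pi(x)) + alpha * V(f(x, pi(x))) until it settles."""
    values = {x: 0.0 for x in STATES}
    for _ in range(sweeps):
        values = {x: c(x, policy(x)) + ALPHA * values[f(x, policy(x))] for x in STATES}
    return values


def simulate_cost(x0, steps=200):
    """Truncated evaluation of J(x0) = sum_k alpha^k * c(x_k, pi(x_k))."""
    total, x = 0.0, x0
    for k in range(steps):
        u = policy(x)
        total += ALPHA**k * c(x, u)
        x = f(x, u)
    return total


values = evaluate_policy()
print(values[0], simulate_cost(0))
```

For this system the fixed point can also be solved by hand: V(0) = 8/7, V(1) = 16/7, V(2) = 18/7. The iteration converges because the discount factor is strictly less than one, which is the restriction listed in the table.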

Stochastic Cost Models

Description	Cost Model	Dynamic Programming Equations	Restrictions
Finite Horizon Total Cost	\(J^{\pi}\left(x_{0}\right)=E^{W}\left[\sum_{k=0}^{K}\alpha^{k}\cdot c_{k}\left(x_{k},\pi\left(x_{k}\right),w\right)\right]\)	\(V_{k}^{\pi}\left(x\right)=E^{W}\left[c_{k}\left(x,\pi\left(x\right),w\right)+\alpha\cdot V_{k+1}^{\pi}\left(f\left(x,\pi\left(x\right),w\right)\right)\right]\) \(V_{K}^{\pi}\left(x\right)=E^{W}\left[c_{K}\left(x,\pi\left(x\right),w\right)\right]\)	\(0\leq\alpha<1\)
Infinite Horizon Total Cost	\(J^{\pi}\left(x_{0}\right)=E^{W}\left[\sum_{k=0}^{\infty}\alpha^{k}\cdot c\left(x_{k},\pi\left(x_{k}\right),w\right)\right]\)	\(V^{\pi}\left(x\right)=E^{W}\left[c\left(x,\pi\left(x\right),w\right)+\alpha\cdot V^{\pi}\left(f\left(x,\pi\left(x\right),w\right)\right)\right]\)	\(0\leq\alpha<1\)
Finite Horizon Shortest Path	\(J^{\pi}\left(x_{0}\right)=E^{W}\left[\sum_{k=0}^{K}\alpha^{k}\cdot c_{k}\left(x_{k},\pi\left(x_{k}\right),w\right)\right]\)	\(V_{k}^{\pi}\left(x\right)=E^{W}\left[c_{k}\left(x,\pi\left(x\right),w\right)+\alpha\cdot V_{k+1}^{\pi}\left(f\left(x,\pi\left(x\right),w\right)\right)\right]\) \(V_{K}^{\pi}\left(x\right)=E^{W}\left[c_{K}\left(x,\pi\left(x\right),w\right)\right]\)	\(0\leq\alpha\leq1\) \(\left\{ x\in\chi\mid c\left(x,\pi\left(x\right)\right)=0\right\} \neq\emptyset\)
Infinite Horizon Shortest Path	\(J^{\pi}\left(x_{0}\right)=E^{W}\left[\sum_{k=0}^{\infty}\alpha^{k}\cdot c\left(x_{k},\pi\left(x_{k}\right),w\right)\right]\)	\(V^{\pi}\left(x\right)=E^{W}\left[c\left(x,\pi\left(x\right),w\right)+\alpha\cdot V^{\pi}\left(f\left(x,\pi\left(x\right),w\right)\right)\right]\)	\(0\leq\alpha\leq1\) \(\left\{ x\in\chi\mid c\left(x,\pi\left(x\right)\right)=0\right\} \neq\emptyset\)
Average Cost	\(J^{\pi}\left(x_{0}\right)=E^{W}\left[\underset{K\rightarrow\infty}{\lim}\frac{1}{K}\sum_{k=0}^{K-1}c\left(x_{k},\pi\left(x_{k}\right),w\right)\right]\)	\(V^{\pi}\left(x\right)+\lambda=E^{W}\left[c\left(x,\pi\left(x\right),w\right)+V^{\pi}\left(f\left(x,\pi\left(x\right),w\right)\right)\right]\)	\(V^{\pi}\left(x_{ref}\right)=0\) for some \(x_{ref}\in\chi\)
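The stochastic finite horizon recursion can be checked the same way: run the backward recursion, then compare it with the expectation computed by brute force over every noise sequence. The sketch assumes a hypothetical three-state system with independent two-valued noise at each step and a time-invariant stage cost; all names and numbers are illustrative.

```python
from itertools import product

ALPHA = 0.9
K = 3
STATES = [0, 1, 2]
NOISE = {0: 0.5, 1: 0.5}  # noise value w -> probability, independent at each step


def policy(x):
    """Hypothetical policy: act only in state 0."""
    return 1 if x == 0 else 0


def f(x, u, w):
    """Hypothetical dynamics on three states."""
    return (x + u + w) % 3


def c(x, u, w):
    """Hypothetical stage cost (the same at every stage k)."""
    return float(x + u * w)


def backward_recursion():
    """Return V_0, starting from V_K(x) = E[c(x, pi(x), w)] and stepping back K times."""
    values = {x: sum(p * c(x, policy(x), w) for w, p in NOISE.items()) for x in STATES}
    for _ in range(K):
        values = {
            x: sum(
                p * (c(x, policy(x), w) + ALPHA * values[f(x, policy(x), w)])
                for w, p in NOISE.items()
            )
            for x in STATES
        }
    return values


def brute_force(x0):
    """E[sum_{k=0}^{K} alpha^k * c(x_k, pi(x_k), w_k)] by enumerating all noise paths."""
    expected = 0.0
    for noise_path in product(NOISE, repeat=K + 1):
        prob, x, total = 1.0, x0, 0.0
        for k, w in enumerate(noise_path):
            u = policy(x)
            total += ALPHA**k * c(x, u, w)
            prob *= NOISE[w]
            x = f(x, u, w)
        expected += prob * total
    return expected


v0 = backward_recursion()
print({x: (v0[x], brute_force(x)) for x in STATES})
```

The two computations agree because the noise is independent across stages, so the expectation of the sum can be taken one stage at a time. The enumeration grows exponentially with the horizon, while the recursion grows linearly, which is the practical reason for using the dynamic programming form.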

Risk Aware/Averse Stochastic Cost Models

Description	Cost Model	Dynamic Programming Equations	Restrictions
Certainty Equivalence with exponential utility	\(J^{\pi}\left(x_{0}\right)=\underset{K\rightarrow\infty}{\limsup}\frac{1}{K}\cdot\frac{1}{\gamma}\cdot\ln\left(E^{W}\left[\exp\left(\gamma\cdot\sum_{k=0}^{K-1}c\left(x_{k},\pi\left(x_{k}\right),w\right)\right)\right]\right)\)
Mean-Variance
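To see what the exponential-utility certainty equivalent does, it helps to evaluate (1/γ)·ln E[exp(γ·C)] for a single random total cost C. The sketch below compares a fixed cost with a gamble that has the same mean; the two distributions and the values of γ are hypothetical.

```python
import math


def certainty_equivalent(outcomes, gamma):
    """(1/gamma) * ln E[exp(gamma * C)] for a discrete distribution {cost: probability}."""
    expectation = sum(p * math.exp(gamma * cost) for cost, p in outcomes.items())
    return math.log(expectation) / gamma


steady = {10.0: 1.0}           # always costs 10
risky = {0.0: 0.5, 20.0: 0.5}  # same mean cost of 10, but uncertain

averse = certainty_equivalent(risky, gamma=0.1)
seeking = certainty_equivalent(risky, gamma=-0.1)
certain = certainty_equivalent(steady, gamma=0.1)
print(certain, averse, seeking)
```

With γ > 0 the uncertain cost is scored as worse than its mean (risk averse), with γ < 0 as better than its mean (risk seeking), and a deterministic cost is returned unchanged. As γ approaches zero the value approaches the ordinary expected cost.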

Cost Models That don’t work or have issues

Description	Cost Model	Issues
Expected exponential disutility	\(J^{\pi}\left(x_{0}\right)=\underset{K\rightarrow\infty}{\limsup}\frac{1}{K}\cdot E^{W}\left[\textrm{sgn}\left(\gamma\right)\cdot\exp\left(\gamma\cdot\sum_{k=0}^{K-1}c\left(x_{k},\pi\left(x_{k}\right),w\right)\right)\right]\)	Does not discriminate among policies
Different version of expected exponential disutility	\(J^{\pi}\left(x_{0}\right)=\underset{K\rightarrow\infty}{\limsup}\frac{1}{\gamma}\cdot\ln\left(E^{W}\left[\exp\left(\gamma\cdot\frac{1}{K}\sum_{k=0}^{K-1}c\left(x_{k},\pi\left(x_{k}\right),w\right)\right)\right]\right)\)	Generally reduces to cost average

References

B. Defourny, D. Ernst, and L. Wehenkel, Risk-Aware Decision Making and Dynamic Programming
J. Harney and P. Doshi, Risk-Sensitive Querying for Adaptive Service Compositions
G. Avila-Godoy, Modularity Results and Risk Sensitive Controller Markov Chains: A Case Study
R. Cavazos-Cadena and R. Montes-De-Oca, Optimal Stationary Policies in Risk Sensitive Dynamic Programs with Finite State Space and Non-negative Rewards, Applicationes Mathematicae, 27, 2 (2000), pp 167-185.
A. Brau and E. Fernandez-Gaucherand, Controlled Markov Chains with Risk-Sensitive Exponential Average Criteria, Proceedings of 36th Conference on Decision and Control, San Diego, Calif, Dec 1997.
M. Koenig and J. Meissner, Risk Minimizing Strategies for Revenue Management Problems with Target Values.
M. Koenig and J. Meissner, Value at Risk Optimal Policies for RM Problems

