Statistical Optimization
Uptimai Statistical Optimization gives the optimum zone of design for a given problem. To do that, it builds a set of models using the Uncertainty Quantification tool, to capture and understand the effects and the importance of all variables and interactions in the different scales of the problem and with that knowledge reduces the initial domain of study up to a desired zone. This approach can to find the optimum domain with a low number of samples, and more importantly, give a robust solution that performs well regardless of external uncertainties. The Statistical Optimization method will call for outputs corresponding to specific combinations of input parameters, thus, it is intended to work in connection with engineering computational codes etc.
How to use the interface
The general appearance of the program window, especially its left section, is described in detail in the Input preparation link. Here the main focus is on the other part which is to a certain point individual for each of the supported methods. The initial of the GUI window when preparing inputs for the Statistical Optimization is shown in Figure 1, as the user starts from scratch and needs to Define Input Variables.
Define Input Variables
In the beginning, only two controls are available for the user:
- # of Monte-Carlo samples : Entry field setting the number of samples for Monte Carlo simulation, size of input distributions. Must be an integer value between 1,000 and 1,000,000.
- Add input variable : Creating a new variable (parameter) of the input domain. Each added variable appears at the bottom of the list of already existing variables.
The Monte Carlo sampling is used for model propagation and visualizations. The default value of 100,000 is based on the best-practice trade-off between the speed of the solver and postprocessor, file sizes, and model precision.
Adding an input variable enhances the input domain space with one dimension. There has to be at least one for the model to be created. Figure 2 describes the situation with multiple input variables already created (see this link for more info about the Borehole problem used here as an example). One of the displayed variables is about to be edited. The input variable can be set using the following controls:
- Variable name : Label of the input parameter, which is being used throughout the whole process up to the postprocessing. The variable name cannot contain empty spaces, these are automatically replaced with underscores.
- Distribution : Selection box where the user sets the shape of the probabilistic function for the input variable. According to the distribution type selected, additional entries with shape parameters appear. A detailed description of featured probability distribution types can be found in the section Input distribution types.
- Confirm : Any changes need to be confirmed with this button to take effect.
- "X" : Each input variable can be deleted when clicking this icon.
- "=" : Allows input variable dragging to change the ordering of inputs in the projects.
- "+ Advanced Options:
- Activation Type" : Allows change between Active (by default) and Inactive. Active means that the intrinsic uncertainty of the variable will be propagated and Inactive means that only the nominal value will be used (variable won’t be studied).
- Variable Type : Allows change between Design (by default) and Uncertain. Design variables will be optimized while input distributions of Uncertain variables remain unchanged. These can be e.g. environmental characteristics, material strength, etc.
Be aware only uniform and discrete types of input distributions are allowed for the Statistical optimization method.
Adding one or more input variables activates the Prepare distributions button. This one invokes the preparation of randomly distributed samples according to the settings. In case there are invalid entries in the input variable definition, the user is informed and not allowed to continue to the next step until everything is by the book. Then, the button itself turns into Tweak distribution options, sending the user to this next step. Also, the Tweak Distribution Options item is activated in the fishbone navigation bar on the left.
Tweak distribution options
At this point, the user adjusts the so-called nominal point. The nominal point is a sample acting as a baseline for the created surrogate model and analysis. In the model, the results of all data samples are compared with the result value of the nominal sample. This process allows handling the effects of input parameters and their interactions separately as increments to the nominal value. It must be within the range of each input variable and not be equal to its boundaries. Although not strictly necessary, it is recommended to place the nominal sample into the statistical centre of the domain. Then, the process of the surrogate model creation is most efficient and precise. The nominal sample's default position is suggested as the mean of the probability distribution of each input variable. When changing its position, (shown in Figure 3) it is advised to not shift it by more than 10% of the range of each input. As in the case of input variable distribution definition, all changes must be saved using the Confirm button.
For the sampling of variables leading to periodic or symmetrical functions (typically, but not exclusively, angles of any kind), extra caution is required. It is highly recommended not to set their nominal value exactly to the centre of symmetry of the corresponding input distribution! A typical example can be the angular position of a crankshaft, wave phase, etc.
At the bottom right there is a Define Convergence button. If everything on the page is set correctly, it will send the user to this next step. Also, the Define Convergence item is activated in the fishbone navigation bar on the left.
Define Convergence
On this page, the user can define the Target percentile for every active variable. This value must be between 0 and 1 and defines the minimum percentage of the original domain of the variable that will be kept at the end of the optimization process. By default, this value is set to 0.2, which means that for a given variable its range won’t be reduced more than up to 20% of the original range (an 80% reduction). Note that this value is a limit but the optimization process can converge before achieving this reduction.
Clicking the Generate data button at the bottom right invokes the saving
of .txt
files with randomly distributed samples according to the settings and input domain info.
Then, the button label itself turns into a View Data Histogram,
sending the user to this next step. Also, the View Data Histogram item is activated
in the fishbone navigation bar on the left. For fundamental changes in any input distribution,
the user can return back to the previous step with the Return to Input Variables button.
View Data
In this section, all created input variables can be reviewed to check the generated distributions of Monte-Carlo samples. It is recommended to provide such type of check before an actual statistical optimization run to prevent solver crashes and eliminate misinterpretation of results.
At the top of the screen, users can switch between two Data view options:
Histograms and Scatter plot. Common for both modes is that shown figures
can be saved as *.png
or *.jpg
files with the 💾 icon placed on the upper left of
the plot.
Plus, there are two more buttons under the plot. Return to Distribution Options brings users back
to the previous step of Tweak Distribution Options where they can fix boundaries or
the nominal sample position. The other button will Close Preprocessor since all the input files
required are ready for the next step, which is Core Solver Setup.
Histograms
Histograms are the default view mode, showing the statistical distribution of randomized samples along the range of each input variable. Additional vertical lines that can be seen in the plot show the boundaries of the input variable distribution (input domain edges) and the position of the nominal sample. Also, clicking into the plot invokes the cross with a label showing the exact value of the selected point in the histogram.
To the left of the actual histogram plot, there are controls of the figure to be shown. The box labelled Variables contains the list of input parameters available in the domain. Each item can be selected by mouse clicking, showing the corresponding distribution shape. The appearance of the plot can be changed using the settings in the Plot options section:
-
Plot title : Displayed above the plot, input variable name by default.
-
X label : Label of the X axis, input variable name by default.
-
Y label : Label of the Y axis. The default text contains the number of samples used for the histogram and the number of bins these are split in.
-
Title size : Size of the title font.
-
Label size : Size of the font for both axis labels.
-
Show legend : Switch turning on/off the legend of the plot.
-
Legend font size : Size of the legend font.
-
Range : Double-sided slider allowing to show a slice of the input distribution in detail. Dragging one of the slider's points limits the depicted range one can move with the section along the X-axis by dragging the green bar of the slider (both edge points are highlighted).
All ranges in the plot can be also precisely using the ⚙ icon on the right of each slider. This opens a sub-dialogue with entry fields for writing exact values of range limits. These need to be confirmed with the Set button. Setting values outside the domain's boundaries will reset range limits to the default state.
-
Adjust axes : Toggle if the X-axis range of the plot should be only the range adjusted with the slider above (on) or the full range of the input distribution (off).
-
Normalize plot : Turn on/off normalization of the histogram. Y-axis values change accordingly, Y-axis' default label changes from N to Density.
-
Log. vertical axis : Turns on/off logarithmic scaling of the Y-axis.
-
Bin count : Set the number of bins for the histogram. The recommended value is below . Needs to be confirmed with the Apply button.
Scatter plot
The scatter plot shows the actual positions of randomized samples in the domain. For the setup of the plot, on the left side, there is a checkbox list of the input variables. Users can select one or two input Variables at a time. When only one input variable is selected, it is drawn on both horizontal and vertical axis. The appearance of the plot can be changed using the settings in the Plot options section.
-
Plot title : Displayed above the plot, names of selected input variables by default.
-
X label : Label of the X axis, first selected input variable name by default.
-
Y label : Label of the Y axis, second selected input variable name by default or the name of the single selected input variable.
-
Title size : Size of the title font.
-
Label size : Size of the font for both axis labels.
-
Show legend : Switch turning on/off the legend of the plot.
-
Legend font size : Size of the legend font.
-
Range X : Double-sided slider allowing to show a slice of the data in detail. Dragging one of the slider's points limits the depicted range of input variable values, one can move with the section along the X-axis by dragging the green bar of the slider (both edge points are highlighted).
-
Range Y : Double-sided slider allowing to show a slice of the data in detail. Dragging one of the slider's points limits the depicted range of input variable values, one can move with the section along the Y-axis by dragging the green bar of the slider (both edge points are highlighted).
All ranges in the plot can be also precisely using the ⚙ icon on the right of each slider. This opens a sub-dialogue with entry fields for writing exact values of range limits. These need to be confirmed with the Set button. Setting values outside the domain's boundaries will reset range limits to the default state.
-
Adjust axes : Toggle if the X-axis range of the plot should be only the range adjusted with the slider above (on) or the full range of the input distribution (off).
-
Reduction coefficient : Variable that reduces the total number of samples that are plotted for an easier interpretation of results. If set to , the whole set of samples will be depicted.