# Input distribution types

When defining the design space of the project, the user needs to decide not only the range of values for each input distribution but also its probability distribution. Especially for variables representing e.g. environmental variables or material characteristics, this allows to setup of the case to be much closer to real-world conditions. As a result, the obtained mathematical model represents the reality in a better way.

## Featured distribution types#

### Normal distribution#

Normal distribution, also known as Gaussian distribution, is a continuous probability distribution that is symmetric around its mean value. According to the empirical rule, 99.7% of all its values lie within three standard deviations of the mean value. It is expected for physical quantities that are a sum of many independent processes affected by random errors, e.g. material properties.

### Uniform distribution#

It is considered a continuous symmetric distribution where the probability of occurrence in the distribution is equal for all values from a given range. It is a type of distribution well-suited for all input parameters dependent on the designer's decision.

### Lognormal distribution#

It is a continuous probability distribution where the logarithm of its values is distributed normally. This distribution consists of real positive values only. It can be observed among various phenomena in the fields of e.g. biology, sociology, chemistry, technology, and finance.

### Gumbel distribution#

It is a continuous distribution describing the distribution of the maximum (or the minimum) values of a series of distributions, e.g. maximum rainfall rate in a series of distributions measured over time. Since it is related to the duration of time up to an event, its use can be in e.g. reliability analysis in engineering.

### Laplace distribution#

It is a symmetric continuous probability distribution also known as the double exponential distribution, since it can be seen as an exponential distribution mirrored around the location parameter. It is the distribution of differences between two independent varieties with identical exponential distributions. It can be used for describing the distribution of errors in observed values.

### Beta distribution#

It is a continuous probability distribution defined on the interval (0,1), thus, it can be used to model random variables of prescribed length. Settings of its two parameters can result in a wide variety of shapes of the distribution (e.g. it turns out to be the uniform distribution with default values of parameters, same as in Figure 2), therefore, there is a wide variety of its use.

### Logistic distribution#

It is a symmetric continuous probability distribution very similar to the normal distribution in its appearance but has heavier tails, which can increase the robustness of analyses based on it when compared to those using the normal distribution. It is used for logistic regression model predicting the likelihood of an event based on an input variable.

### Poisson distribution#

It is a discrete probability distribution used to represent integer-valued counts, such as the number of events observed within an interval of time or other specified dimensions such as distance, area or volume. Therefore, it is used for applications in many fields related to counting individual objects, e.g. astronomy, biology, and telecommunication.

### Logarithmic distribution#

Also known as the logarithmic series distribution, is a discrete probability distribution based on the standard power series expansion of the natural logarithm function. It can be used in cases involving several event occurrences within a time period or space.

### Power distribution#

It is a continuous probability distribution defined on the interval (0,1), also known as the power function distribution. It also may be seen as the inverse of the Pareto distribution, thus, its application can cover a similar range of problems.

### Discrete distribution#

As the name suggests, it is a discrete type of distribution, in this case completely arbitrary according to the user's needs. As inputs, two vectors of comma-separated values are required. One is the Vector of positions setting distribution values, the other is the Vector of weights setting ratios of values in the input distributions. At least three distinct position values must be given, weights must be greater than zero.

### Boolean distribution#

A special case of the discrete distribution. Although in its traditional meaning it is used as a $true/false$ or $0/1$ value switch, here it is not limited so strictly. In general, it can be used for any parameter splitting data into two categories. To define the distribution, the user only needs to enter two position values (i.e. categories, characteristics, limits, etc.) and their weights. As for the regular discrete distributions, weights have to be greater than zero.

### User defined distribution#

The principle is similar to the discrete distribution, but this custom set distribution is continuous. Its parameters are set through a standalone GUI whose features are described in a separate sheet available with the Distribution Creator tool described below.

## Distribution Creator#

With the help of the Distribution Creator, a distribution of an arbitrary shape according to the user's needs can be prepared. It is an add-on of the Input Preprocessor program popping up on + Define new item selection from the distribution selector. The distribution shape is designed directly in the graphic interface allowing one to see the resulting shape immediately. The distribution shape itself can be also loaded from a *.csv file when needed.

### How to use the interface#

The initial state of the application's window can be seen in Figure 11. On the left, there is the Existing custom distributions section, where the management of user defined distributions present in the project can be done. By clicking on items in the list below the user can either Edit the existing distribution or Make a copy of it, which can be renamed and adjusted separately. Below the list, there are two buttons. Create new starts a new distribution preparation from scratch. The button Load form data file allows users to upload coordinates of the distribution's control points directly from a *.csv file. It is a very useful option for situations where the probability distribution can be e.g. measured or found in the literature. The source file needs to have two comma-separated columns defining X and Y coordinates of control points. This formatting is very similar to what the user can see in the GUI when creating the distribution directly in the tool.

The central section of the GUI window is dedicated to the graphical definition of the input distribution. On the top of this section, there is an entry field where the input distribution name can be changed. This modification appears instantly in the list on the left. Be aware that the name of each input distribution has to be unique. Right under the Name entry, the Boundaries of the input distribution shape can be adjusted. It is the setting of limits applicable for all definition points of the distribution, it is not possible to create a definition point outside the boundaries.

##### Cropping boundaries

There are two possibilities for how to handle definition points in case of restricting/cropping boundaries of an existing distribution shape. First, points not fitting into cropped boundaries can be deleted completely. The second option is to scale the whole distribution based on the new range of boundaries.

The gridded canvas making the most of the area is an interactive space for the definition of the input distribution shape. Just by the left mouse button clicking, the user can create new definition points of the input distribution shapes - their position within the boundaries and the weight of each point ranging from 0 to 1. Unwanted definition points can be easily removed by the right mouse button click at these. Dragging a point with the mouse adjusts its position and weight but with the restriction given by distribution boundaries and/or positions of surrounding definition points.

Files of each user defined distribution shape are stored in a dedicated subfolder inside the custom_distributions folder, located inside the project. There are two files: info.json with general information about the distribution name and boundaries, the points.csv contains a table of the distribution points' coordinates.