Optimization, Design of Experiments & Deep learning training


I have been interested in AI applications in mechanical engineering for some time, especially those related to image processing, object classification, and identification, with applications in autonomous vehicles.

Courses such as the following one, offered by MathWorks:

https://matlabacademy.mathworks.com/details/deep-learning-with-matlab/mldl

helped me a lot to understand and apply concepts related to using, creating, and training various types of networks.

An interesting point is that, during the network training phase, tools such as design of experiments (DOE) and optimization are quite useful for maximizing, for example, the accuracy of the network, and they do so more reliably than trial and error alone.

Next, I will explain how some MATLAB functions can be applied to improve the learning accuracy of a neural network.

The first step was to identify the three factors that determine how high the network's learning accuracy is (0 to 100%).

The factors are A, B and C (named M, L and V in the model equations):

A = Max Epoch (M)

B = Learning rate (L)

C = Validation Frequency (V)


Then generate the experimental runs for a Box-Behnken design in coded (normalized) variables [-1, 0, +1]:

CodedValue = bbdesign(3)


The first column corresponds to factor A, the second to factor B, and the third to factor C. The ranges of the factors are:

Factor A: 15 to 40, no units.

Factor B: 15 to 50, no units.

Factor C: 10 to 50, no units.

The next step is to randomize the order of the runs, convert the coded design values to real-world units, and perform the experiment in the order specified.
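
A minimal sketch of this step, assuming the factor ranges from the `bounds` matrix used in the optimization step of this article. The fixed random seed and the variable names `runorder`, `lo`, `hi` and `RealValue` are illustrative assumptions, not taken from the original run.

```matlab
% Sketch: randomize the run order and convert coded levels to real units.
% Requires Statistics and Machine Learning Toolbox for bbdesign.
rng('default')                               % assumption: fixed seed, only for repeatability
CodedValue = bbdesign(3);                    % 15 runs x 3 factors, levels -1/0/+1
runorder   = randperm(size(CodedValue,1));   % random order in which to execute the runs

bounds = [15 40; 15 50; 10 50];              % [min max] for factors A, B, C
lo = bounds(:,1)';                           % 1x3 lower limits
hi = bounds(:,2)';                           % 1x3 upper limits

% Map -1 -> lower limit, 0 -> midpoint, +1 -> upper limit for every factor
RealValue = lo + (CodedValue + 1) .* (hi - lo) / 2;

% List the runs in execution order: run number, then A, B, C in real units
disp(sortrows([runorder' RealValue]))
```

The linear mapping is the same one applied later to convert the optimizer output back to real-world units, so both steps stay consistent.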


The experiments are executed in the random order generated, and the resulting accuracies (%) are then stored in the TestResult array:

TestResult = [98.21 97.02 93.45 93.45 97.62 96.43 97.62 93.45 97.02 96.43 97.62 93.45 97.62 97.62 97.62]';

Next, the design values and the response are stored together and displayed.
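
A sketch of how the design and the response might be stored together. It assumes that the rows of `TestResult` line up with the rows returned by `bbdesign(3)`; the third-degree function given later in this article reproduces the measured values under that pairing. The column names `M`, `L`, `V` and `AC` and the regenerated `runorder` are illustrative assumptions, and the response name must match the one used in the `fitlm` formula.

```matlab
% Sketch: store coded design values and measured accuracy in one table.
% Requires Statistics and Machine Learning Toolbox for bbdesign.
CodedValue = bbdesign(3);                    % coded design, standard row order
TestResult = [98.21 97.02 93.45 93.45 97.62 96.43 97.62 93.45 ...
              97.02 96.43 97.62 93.45 97.62 97.62 97.62]';   % accuracy (%)

rng('default')                               % assumption: the original seed is unknown
runorder = randperm(size(CodedValue,1))';    % label only; not used in the fit

Expmt = table(runorder, CodedValue(:,1), CodedValue(:,2), CodedValue(:,3), ...
    TestResult, 'VariableNames', {'RunNumber','M','L','V','AC'});

disp(sortrows(Expmt, 'RunNumber'))           % show the runs in execution order
```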

A quick look at the results table suggests a trend, but it is not conclusive, so I carried out an analysis using optimization tools to push the network's learning accuracy as close as possible to at least 99%.

The next step is to propose a function from which the optimal values of the parameters A, B and C can be found.

A second-degree function could be enough to model the phenomenon. The proposed function is:

AC = b0 + (b1*M) + (b2*L) + (b3*V) + (b4*M*L) + (b5*M*V) + (b6*L*V) + (b7*M^2) + (b8*L^2) + (b9*V^2)

Where AC is the accuracy and bi is the coefficient for term i. Estimate the coefficients of this model using the fitlm function from Statistics and Machine Learning Toolbox:

mdl = fitlm(Expmt,'AC ~ M*L*V - M:L:V + M^2 + L^2 + V^2'); % assumes Expmt has variables M, L, V and AC

Then display the magnitudes of the coefficients (for normalized values) in a bar chart.

figure()
h = bar(mdl.Coefficients.Estimate(2:10));
set(h,'facecolor',[0.8 0.8 0.9])
legend('Coefficient')
set(gcf,'units','normalized','position',[0.05 0.4 0.35 0.4])
set(gca,'xticklabel',mdl.CoefficientNames(2:10))
ylabel('Accuracy (%)')
xlabel('Normalized Coefficient')
title('Quadratic Model Coefficients')

Then type mdl.Coefficients to see the values of the coefficients of the proposed function. mdl.Rsquared.Ordinary helps to evaluate how well the proposed function fits the experimental results (% accuracy).

The closer the value of mdl.Rsquared.Ordinary is to 1, the better the proposed function fits the data.
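
A minimal sketch of these two checks. It assumes that the rows of `TestResult` line up with the rows of `bbdesign(3)` and that the table variables are named `M`, `L`, `V` and `AC`; both are assumptions made for illustration. The adjusted R-squared is printed as well because, with only 15 runs and 10 coefficients, it gives a more conservative view of the fit.

```matlab
% Sketch: fit the quadratic model and inspect coefficients and R-squared.
% Requires Statistics and Machine Learning Toolbox (bbdesign, fitlm).
CodedValue = bbdesign(3);
TestResult = [98.21 97.02 93.45 93.45 97.62 96.43 97.62 93.45 ...
              97.02 96.43 97.62 93.45 97.62 97.62 97.62]';   % accuracy (%)
Expmt = array2table([CodedValue TestResult], ...
    'VariableNames', {'M','L','V','AC'});

% Main effects, two-way interactions and squared terms (no three-way term)
mdl = fitlm(Expmt, 'AC ~ M*L*V - M:L:V + M^2 + L^2 + V^2');

disp(mdl.Coefficients)                       % estimates, standard errors, t and p values
fprintf('R-squared (ordinary): %.4f\n', mdl.Rsquared.Ordinary)
fprintf('R-squared (adjusted): %.4f\n', mdl.Rsquared.Adjusted)
```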


The previous results show that the proposed function fits the experimental data reasonably well, but ideally the value of Rsquared would be at least 0.98, so a deeper analysis is needed.

For example, response surface plots generated with the plotSlice function help to quantify the effects of the factors on AC and also serve as a graphical method to identify the optimal values for % learning accuracy.

The plots show the gap between the fitted model and the data as the space between the red curves and the green curve. They also show a relationship that is non-linear but relatively close to linear, so the next step is to check the fit of a linear model. If the value of Rsquared remains below 0.98, a cubic (third-degree) model needs to be tested.

None of the following linear models meets the criterion of Rsquared >= 0.98. Therefore, I tested a model of degree 3.

[Figure: Linear model proposal (1)]

[Figure: Linear model proposal (2)]

The proposed third-degree model fits the experimental data exactly, and its response surface plots overlap the data points, so it is possible to proceed to optimizing the values of the factors M, L and V.

The best option to optimize the third-degree function is to write a custom objective function, which is shown below.

function F = root2d(x)
% Negative of the third-degree accuracy model in coded units.
% The cubic terms not listed here have zero coefficients.
F = -(97.62 - 0.745*x(1) - 0.595*x(2) - 1.19*x(3) ...
    + 0.2975*x(1)*x(2) - 0.745*x(1)*x(3) - 0.895*x(2)*x(3) ...
    - 0.96875*x(1)^2 - 1.1187*x(2)^2 - 0.37125*x(3)^2 ...
    + 0.2975*x(1)^2*x(2) - 1.3375*x(1)*x(2)^2 - 0.15*x(1)^2*x(3));
end
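
As a quick sanity check, the polynomial inside `root2d` can be evaluated at the 15 design points and compared with the measured accuracies. The sketch below rewrites it as a vectorized anonymous function (`acc` is a hypothetical name) and assumes that the rows of `TestResult` follow the row order of `bbdesign(3)`.

```matlab
% Sketch: compare the third-degree model with the measured accuracies.
% Requires Statistics and Machine Learning Toolbox for bbdesign.
X = bbdesign(3);                             % coded design points (15x3)
TestResult = [98.21 97.02 93.45 93.45 97.62 96.43 97.62 93.45 ...
              97.02 96.43 97.62 93.45 97.62 97.62 97.62]';   % accuracy (%)

% Same non-zero coefficients as root2d, without the sign flip
acc = @(x) 97.62 - 0.745*x(:,1) - 0.595*x(:,2) - 1.19*x(:,3) ...
    + 0.2975*x(:,1).*x(:,2) - 0.745*x(:,1).*x(:,3) - 0.895*x(:,2).*x(:,3) ...
    - 0.96875*x(:,1).^2 - 1.1187*x(:,2).^2 - 0.37125*x(:,3).^2 ...
    + 0.2975*x(:,1).^2.*x(:,2) - 1.3375*x(:,1).*x(:,2).^2 ...
    - 0.15*x(:,1).^2.*x(:,3);

residual = TestResult - acc(X);              % measured minus predicted
fprintf('Largest absolute residual: %.4g\n', max(abs(residual)))
```

Under that pairing the residuals stay at rounding level, which is what an exact fit of the experimental data looks like numerically.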

Minimizing the negative accuracy using fmincon is the same as maximizing the original objective function. The constraints are the upper and lower limits tested (in coded values). Set the starting point to the center of the experimental design matrix.

lb = [-1 -1 -1];   % Lower bound
ub = [1 1 1];      % Upper bound
x0 = [0 0 0];      % Starting point
[optfactors,fval] = fmincon(@root2d,x0,[],[],[],[],lb,ub,[]); % Invoke the solver
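
Because a third-degree surface is not guaranteed to be concave, `fmincon` may stop at a local optimum. One inexpensive cross-check, sketched here in base MATLAB, is to evaluate the model on a coarse grid over the coded cube and compare the best grid point with the solver output. The helper name `acc`, the grid resolution and the other variable names are illustrative assumptions.

```matlab
% Sketch: brute-force cross-check of the optimizer on a coarse grid.
acc = @(x) 97.62 - 0.745*x(:,1) - 0.595*x(:,2) - 1.19*x(:,3) ...
    + 0.2975*x(:,1).*x(:,2) - 0.745*x(:,1).*x(:,3) - 0.895*x(:,2).*x(:,3) ...
    - 0.96875*x(:,1).^2 - 1.1187*x(:,2).^2 - 0.37125*x(:,3).^2 ...
    + 0.2975*x(:,1).^2.*x(:,2) - 1.3375*x(:,1).*x(:,2).^2 ...
    - 0.15*x(:,1).^2.*x(:,3);

g = linspace(-1, 1, 41);                     % 41 levels per factor (assumed resolution)
[G1, G2, G3] = ndgrid(g, g, g);
P = [G1(:) G2(:) G3(:)];                     % every grid point as one row

[bestAcc, idx] = max(acc(P));                % maximizing acc equals minimizing -acc
bestCoded = P(idx,:);

% Convert the best grid point from coded to real-world units
bounds = [15 40; 15 50; 10 50];
bestReal = bounds(:,1)' + (bestCoded + 1) .* (bounds(:,2) - bounds(:,1))' / 2;
fprintf('Grid maximum %.3f %% at Max Epoch %.2f, Learning Rate %.2f, Frequency %.2f\n', ...
    bestAcc, bestReal);
```

If the grid maximum is clearly higher than `-fval`, rerunning `fmincon` from the best grid point is a reasonable next step.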

Then convert the results to a maximization problem and real-world units:

maxval = -fval;                    % maximum accuracy = negative of the minimum found
maxloc = (optfactors + 1)';        % shift coded values from [-1, 1] to [0, 2]
bounds = [15 40;15 50;10 50];      % [min max] for Max Epoch, Learning Rate, Frequency
maxloc = bounds(:,1) + maxloc .* ((bounds(:,2) - bounds(:,1))/2);
disp('Optimal Values:')
disp({'Max Epoch','Learning Rate','Frequency','Accuracy'})
disp([maxloc' maxval])

Therefore, considering only the factors related to the training options, the values that maximize the predicted learning accuracy are:

Max Epoch = 27.6273

Learning Rate = 34.8419

Frequency = 10

[Figure: response surface with Max Epoch held constant at 27.6273, plotted against Frequency and Learning Rate]

[Figure: response surface with Learning Rate held constant at 34.84, plotted against Max Epoch and Frequency]

[Figure: response surface with Frequency held constant at 10, plotted against Max Epoch and Learning Rate]

All of the above is an example of how useful design of experiments and optimization can be in any field, in this case during the training phase of a network.

