In Progress

Monte Carlo Algorithm

Consider a 2x2 grid world (see attachment).

The cells S1, S2, S3, S4 are the states.

In each state the agent can choose one of the following actions: up, down, left, right.

S1 is the terminal state. In any other state, the agent moves to the adjacent cell in the direction of the chosen action.

For example, if we are in S3 and choose the action "Right", the agent moves to S4 with probability 1 and receives reward -1.

If the selected action would take the agent outside the grid, it hits a wall and instead moves in the opposite direction, with reward -2. For example, choosing "Right" in S4 moves the agent left to S3.
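The movement rules above can be sketched as a small transition function. This is only an illustration: the attachment is not available here, so the layout (S1 and S2 on the top row, S3 and S4 on the bottom row) is an assumption chosen to agree with the two examples in the text, and the names `POSITIONS`, `MOVES` and `step` are hypothetical.

```python
# Assumed layout (the attachment is not available):
#   S1 S2
#   S3 S4
POSITIONS = {"S1": (0, 0), "S2": (0, 1), "S3": (1, 0), "S4": (1, 1)}
STATES = {pos: name for name, pos in POSITIONS.items()}
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(state, action):
    """Return (next_state, reward) for one deterministic move."""
    row, col = POSITIONS[state]
    d_row, d_col = MOVES[action]
    target = (row + d_row, col + d_col)
    if target in STATES:
        # Normal move inside the grid: reward -1.
        return STATES[target], -1
    # Wall hit: the agent moves in the opposite direction, reward -2.
    bounced = (row - d_row, col - d_col)
    return STATES[bounced], -2


print(step("S3", "right"))  # the S3 example from the text
print(step("S4", "right"))  # the wall example from the text
```

In a 2x2 grid the cell in the opposite direction always exists, so the bounce never leaves the grid.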

Initially, Q(S,a) = 0 for every state S and action a.

Run the every-visit Monte Carlo algorithm with exploring starts for one episode of 3 steps.
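As a rough sketch of the every-visit update, the snippet below averages the returns observed after each occurrence of a state-action pair in one episode. The posting gives neither a discount factor nor the actual episode, so `gamma=1.0` and the sample episode (built only from the two transitions stated in the text) are assumptions, and `every_visit_mc_update` is a hypothetical name.

```python
from collections import defaultdict


def every_visit_mc_update(episode, gamma=1.0):
    """Average the return following every visit to each (state, action).

    episode: list of (state, action, reward) tuples in time order.
    gamma:   discount factor (assumed 1.0; the posting does not give one).
    """
    returns = defaultdict(list)
    g = 0.0
    # Walk backwards so g is the return from each step onwards.
    for state, action, reward in reversed(episode):
        g = reward + gamma * g
        returns[(state, action)].append(g)
    return {pair: sum(vals) / len(vals) for pair, vals in returns.items()}


# Hypothetical 3-step episode: S4 -right-> S3 -right-> S4 -right-> S3.
episode = [("S4", "right", -2), ("S3", "right", -1), ("S4", "right", -2)]
q = every_visit_mc_update(episode)
print(q[("S4", "right")])  # mean of the returns -5 and -2
print(q[("S3", "right")])  # single return -3
```

Pairs that never occur in the episode are not in the result and keep their initial value of 0.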

What will be the policy of the agent after the episode and why?
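One way to read off a policy after the episode is to act greedily with respect to Q. The sketch below returns all maximizing actions per state, since the posting does not say how ties are broken. The Q values used here are hypothetical (they correspond to an assumed episode, not to the one in the attachment), and `greedy_actions` is an illustrative name.

```python
ACTIONS = ["up", "down", "left", "right"]


def greedy_actions(q, state, actions=ACTIONS):
    """Return every action that maximizes Q(state, a).

    Pairs missing from q keep the initial value 0 stated in the problem.
    """
    values = {a: q.get((state, a), 0.0) for a in actions}
    best = max(values.values())
    return [a for a in actions if values[a] == best]


# Hypothetical Q values after one episode; all other entries are still 0.
q = {("S4", "right"): -3.5, ("S3", "right"): -3.0}
for state in ["S2", "S3", "S4"]:
    print(state, greedy_actions(q, state))
```

Because every reward is negative and unvisited pairs stay at 0, a greedy policy under these assumptions avoids exactly the actions that were tried in the episode and is indifferent among the rest.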

Skills: Algorithm


About the Employer:
(7 reviews) Athens, Greece

Project ID: #1085542

Awarded to:


Hi, please check your inbox. Thanks.

$35 USD in 0 days
(18 reviews)

3 freelancers are bidding an average of $35 for this job


Hello friend, please check PM.

$40 USD in 1 day
(6 reviews)

I can do it

$30 USD in 1 day
(0 reviews)