Reinforcement Learning - Aamir P

Reinforcement learning is a type of machine learning in which an agent learns about its environment by interacting with it and uses what it learns to make better decisions. The agent takes actions in the environment and, based on the results of those actions, receives feedback in the form of rewards or penalties.


Let us dive into a practical example to understand this better. Suppose I am training a robot to play a maze game.


Imagine that the robot needs to reach a goal location. It has no knowledge of the maze layout, but it knows how to move forward, backward, left and right.


So, what are the steps?

1. Initialisation

In the beginning, the robot moves randomly because it does not know which direction leads to the goal. The robot is our agent, and the maze it moves through is the environment.
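To make the agent and environment concrete, here is a minimal sketch of how the maze and the robot's four moves could be represented. The 4x4 layout, the cell symbols and the helper names (`find_cell`, `is_blocked`, `move`, `random_walk`) are all assumptions made for illustration; they are not part of any particular library.

```python
import random

# Hypothetical 4x4 maze: 'S' start, 'G' goal, '#' wall, '.' free cell.
MAZE = [
    "S..#",
    ".#..",
    "...#",
    "#..G",
]

# The four moves the robot knows, as (row change, column change).
# "forward" is assumed to mean "up" on the grid.
ACTIONS = {
    "forward": (-1, 0),
    "backward": (1, 0),
    "left": (0, -1),
    "right": (0, 1),
}


def find_cell(maze, symbol):
    """Return the (row, col) of the first cell containing `symbol`."""
    for r, row in enumerate(maze):
        c = row.find(symbol)
        if c != -1:
            return (r, c)
    raise ValueError(f"{symbol!r} not found in maze")


def is_blocked(maze, position):
    """True if the position is outside the grid or is a wall."""
    r, c = position
    outside = not (0 <= r < len(maze) and 0 <= c < len(maze[0]))
    return outside or maze[r][c] == "#"


def move(maze, position, action):
    """Apply one action; the robot stays in place if the target is blocked."""
    dr, dc = ACTIONS[action]
    target = (position[0] + dr, position[1] + dc)
    return position if is_blocked(maze, target) else target


def random_walk(maze, steps, seed=0):
    """The untrained robot: pick every move uniformly at random."""
    rng = random.Random(seed)
    position = find_cell(maze, "S")
    for _ in range(steps):
        position = move(maze, position, rng.choice(list(ACTIONS)))
    return position
```

Calling `random_walk(MAZE, 20)` shows where a purely random robot ends up after 20 moves, which is the behaviour the later steps improve on.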

2. Exploration and Exploitation

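The robot starts by taking random actions to explore the maze, and it may well get lost at first. As it gathers experience, it increasingly exploits what it has learned by choosing the moves that have paid off before, while still exploring occasionally to discover better routes.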
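One common way to balance exploration and exploitation is an epsilon-greedy rule: with a small probability the robot tries a random move, otherwise it picks the move it currently rates highest. The sketch below assumes the robot keeps a score per action in a dictionary; the function name `choose_action` and the parameter `epsilon` are illustrative choices, not something the example above prescribes.

```python
import random

ACTIONS = ["forward", "backward", "left", "right"]


def choose_action(action_values, epsilon, rng=random):
    """Epsilon-greedy choice over a dict mapping action name -> estimated value.

    With probability `epsilon` explore (random action); otherwise exploit
    (best-known action, ties broken at random).
    """
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(action_values[a] for a in ACTIONS)
    return rng.choice([a for a in ACTIONS if action_values[a] == best])
```

With `epsilon=1.0` the robot behaves completely randomly, as at the start of training; with `epsilon=0.0` it always exploits. A value such as 0.1 or 0.2 is a typical middle ground, and some setups lower it gradually over time.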

3. Reward Feedback

If the robot makes a move that brings it closer to the goal, it receives positive feedback (a reward). If it hits a wall or another obstacle, it receives a penalty. This feedback is what the robot learns from.
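The feedback can be written as a small reward function. The sketch below is one possible scheme, and the numbers are assumptions chosen for illustration: a large reward for reaching the goal, a penalty for bumping into a wall (detected here by the robot not having moved), and a small cost per ordinary step so that shorter routes score better.

```python
# Hypothetical reward values; only their signs and relative sizes matter.
GOAL_REWARD = 10.0
WALL_PENALTY = -1.0
STEP_COST = -0.1


def reward(old_position, new_position, goal):
    """Feedback for one move, given positions as (row, col) tuples."""
    if new_position == goal:
        return GOAL_REWARD      # reached the goal
    if new_position == old_position:
        return WALL_PENALTY     # blocked by a wall, so the robot did not move
    return STEP_COST            # ordinary move: small cost to favour short paths
```

The step cost is a design choice rather than a requirement. Without it the robot would still be rewarded at the goal, but it would rely only on discounting to prefer shorter routes.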

4. Learning

The robot uses a reinforcement learning algorithm, such as Q-learning or a Deep Q-Network (DQN), to update its strategy based on the rewards and penalties it has received. As a result, it comes to prefer actions that lead to higher rewards and to avoid those that lead to penalties.
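For tabular Q-learning, the update after each move nudges the stored value of the (state, action) pair towards the reward just received plus the discounted value of the best action available in the next state. The sketch below shows that single update step; the function name `q_update`, the learning rate `alpha=0.5` and the discount factor `gamma=0.9` are assumptions picked for illustration.

```python
from collections import defaultdict

ACTIONS = ["forward", "backward", "left", "right"]


def q_update(q_table, state, action, reward, next_state, done,
             alpha=0.5, gamma=0.9):
    """One Q-learning step:

    Q(s, a) <- Q(s, a) + alpha * (reward + gamma * max_a' Q(s', a') - Q(s, a))

    `q_table` maps (state, action) -> value; unseen pairs default to 0.0.
    """
    # No future value once the episode has ended (e.g. the goal was reached).
    best_next = 0.0 if done else max(q_table[(next_state, a)] for a in ACTIONS)
    target = reward + gamma * best_next
    q_table[(state, action)] += alpha * (target - q_table[(state, action)])
    return q_table[(state, action)]


# Usage: an empty table, then one update after reaching the goal.
q_table = defaultdict(float)
q_update(q_table, (3, 2), "right", 10.0, (3, 3), done=True)
```

Both penalties and rewards feed into this update, so bad moves lower a value just as good moves raise it. A Deep Q-Network follows the same idea but replaces the table with a neural network that estimates the values.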

5. Policy Improvement

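As the robot keeps exploring and collecting rewards, its policy, that is, its strategy for choosing an action in each situation, improves. Each update shifts the policy further towards actions that yield a higher cumulative reward.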

6. Optimal Policy

After enough iterations, the robot converges towards an optimal policy: the strategy that maximises its expected cumulative reward, which here means reaching the goal reliably and efficiently.
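Putting the six steps together, the following is a minimal, self-contained sketch of tabular Q-learning on a small maze. Everything specific in it is an assumption for illustration: the 4x4 layout, the reward values, the hyperparameters (`episodes`, `alpha`, `gamma`, `epsilon`) and the helper names. It is meant to show the shape of the training loop, not a tuned implementation.

```python
import random
from collections import defaultdict

MAZE = ["S..#", ".#..", "...#", "#..G"]  # hypothetical layout, '#' = wall
ACTIONS = {"forward": (-1, 0), "backward": (1, 0), "left": (0, -1), "right": (0, 1)}
START, GOAL = (0, 0), (3, 3)


def step(position, action):
    """Apply one move; return (new_position, reward, done)."""
    dr, dc = ACTIONS[action]
    r, c = position[0] + dr, position[1] + dc
    inside = 0 <= r < len(MAZE) and 0 <= c < len(MAZE[0])
    if not inside or MAZE[r][c] == "#":
        return position, -1.0, False   # hit a wall: penalty, stay in place
    if (r, c) == GOAL:
        return (r, c), 10.0, True      # reached the goal: reward, episode ends
    return (r, c), -0.1, False         # ordinary move: small step cost


def greedy_action(q, state):
    """Exploit: the action with the highest current Q-value in this state."""
    return max(ACTIONS, key=lambda a: q[(state, a)])


def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.2, max_steps=200, seed=0):
    """Run Q-learning with epsilon-greedy exploration; return the Q-table."""
    rng = random.Random(seed)
    q = defaultdict(float)             # 1. initialisation: all values start at 0
    for _ in range(episodes):
        state = START
        for _ in range(max_steps):
            # 2. exploration vs exploitation
            if rng.random() < epsilon:
                action = rng.choice(list(ACTIONS))
            else:
                action = greedy_action(q, state)
            # 3. reward feedback from the environment
            next_state, reward, done = step(state, action)
            # 4. learning: Q-learning update
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
            if done:
                break
    return q


def greedy_path(q, max_steps=50):
    """6. Follow the learned policy from the start without exploring."""
    path, state = [START], START
    for _ in range(max_steps):
        state, _, done = step(state, greedy_action(q, state))
        path.append(state)
        if done:
            break
    return path


if __name__ == "__main__":
    q_table = train()
    print(greedy_path(q_table))
```

Policy improvement (step 5) is implicit here: the policy is simply "take the action with the highest Q-value", so every update to the table can change which action the robot prefers. Printing the greedy path after training shows the route the learned policy takes from the start cell.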


To conclude, through reinforcement learning the robot learns to navigate intelligently and to find an efficient path to its goal. The process resembles trial and error: the robot learns from the outcomes of its actions and adjusts its behaviour to maximise the rewards it collects in the maze.


This example demonstrates how reinforcement learning enables an agent (the robot) to learn from experience and optimize its actions in a dynamic environment to achieve a specific goal.


So, that’s it for the day! Thanks for taking the time to read my article. Share your feedback or views in the comments section.

Check out this link to know more about me

