Data science with Python, John Jay: machine learning questions
Homework 6 - Extra Credit
Spec sheet

In this homework, we apply an RL framework to environments available at the OpenAI Gym.

Mission command approach: As per §4.5 of the Sittyba, we will tell you what to do, not how to do it. That is up to you. However, we want you to:
a) Do this homework yourself. Do not copy answers or code from someone else.
b) Restrict your methods (for now) to what was covered in the lecture/lab (in other words, basic reinforcement learning involving Q-learning, policy gradients, multi-armed bandits, etc.).

Here is what we would like you to do:
1) Go to https://gym.openai.com/
2) Pick one of the available environments. We recommend one of the classic Atari 2600 games: https://gym.openai.com/envs/#atari [Make sure to pick one we did not already cover in lecture or lab. You can also pick any environment that is not an Atari game.]
3) Train an agent to achieve a reasonable level of performance in this environment.
4) Write a brief statement describing how you trained the agent and how you managed the explore / exploit tradeoff, and explaining any other choices you made.
5) Comment on how the training went: what was challenging for the agent, and what made training feasible? Explanations of what you couldn’t do and why are encouraged, with emphasis on the “why”.
6) Document the performance of the agent by plotting total rewards as a function of training episodes.
7) Include your code as a separate file.

Suggestions and recommendations:
1. Picking a more complex environment will merit more grade points. To check complexity, go to https://github.com/openai/gym/wiki/Table-of-environments and look at Observation Space and Action Space. We recommend choosing an environment that has a Discrete action space. We want to keep grading criteria (in terms of points) flexible to see what students can actually do, but as a broad heuristic, something with the complexity of “LunarLander-v2” would be ok, something with the complexity of “BipedalWalker-v2” would be good, and something with the complexity of “AirRaid-ram-v0” would be excellent. But don’t necessarily pick those specific environments. Pick something that sparks joy, for you personally. It will shine through.
2. Try implementing an algorithm on your own instead of using Stable Baselines 3 (SB-3). If you use SB-3, explain what you did to optimize the model. Check how far your model can go by trying more complex environments, and find the breaking point.
3. You can also use the library NEAT-Python: https://neat-python.readthedocs.io/en/latest/ If you decide to use NEAT, experiment with how far NEAT can go and note your observations.
4. So either a) implement your own algorithm, or b) use SB-3 (and note what you did to optimize the model), or c) use NEAT-Python, find the most complex environment you are able to solve with NEAT, and note what leads to better NEAT implementations.
5. Whichever environment you pick, make sure your RL bot is learning the environment reasonably well (as evinced by the plot of total reward over episodes of training).
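As a rough illustration of the explore / exploit tradeoff and the per-episode reward log mentioned above, here is a minimal sketch of tabular Q-learning with a decaying epsilon-greedy policy. It is not a solution to the homework: `ChainEnv` is a hypothetical toy environment invented for this example (it is not a Gym environment and only loosely mimics the reset/step pattern), and every hyperparameter value is an assumption chosen for illustration.

```python
import random


class ChainEnv:
    """Hypothetical toy environment (not from Gym): walk right to reach the goal."""

    def __init__(self, n_states=6, max_steps=50):
        self.n_states = n_states
        self.max_steps = max_steps
        self.n_actions = 2  # 0 = move left, 1 = move right

    def reset(self):
        self.state = 0
        self.steps = 0
        return self.state

    def step(self, action):
        move = 1 if action == 1 else -1
        # Clamp the position so the agent cannot leave the chain.
        self.state = min(max(self.state + move, 0), self.n_states - 1)
        self.steps += 1
        reached_goal = self.state == self.n_states - 1
        reward = 1.0 if reached_goal else -0.01  # small cost per step
        done = reached_goal or self.steps >= self.max_steps
        return self.state, reward, done


def train_q_learning(env, episodes=300, alpha=0.1, gamma=0.99,
                     eps_start=1.0, eps_end=0.05, eps_decay=0.99, seed=0):
    """Tabular Q-learning; returns the Q-table and total reward per episode."""
    rng = random.Random(seed)
    q = [[0.0] * env.n_actions for _ in range(env.n_states)]
    epsilon = eps_start
    episode_rewards = []
    for _ in range(episodes):
        state = env.reset()
        total, done = 0.0, False
        while not done:
            if rng.random() < epsilon:
                # Explore: pick a uniformly random action.
                action = rng.randrange(env.n_actions)
            else:
                # Exploit: pick a best-known action, breaking ties at random.
                best = max(q[state])
                action = rng.choice(
                    [a for a, v in enumerate(q[state]) if v == best])
            next_state, reward, done = env.step(action)
            # Simplification: a timeout is treated like a terminal state,
            # so no future value is bootstrapped once done is True.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            total += reward
        episode_rewards.append(total)
        # Shift gradually from exploration toward exploitation.
        epsilon = max(eps_end, epsilon * eps_decay)
    return q, episode_rewards


def moving_average(values, window=20):
    """Trailing mean, useful for smoothing a noisy reward curve."""
    out, running = [], 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]
        out.append(running / min(i + 1, window))
    return out


if __name__ == "__main__":
    _, rewards = train_q_learning(ChainEnv())
    smoothed = moving_average(rewards)
    print("first smoothed value:", smoothed[0], "last:", smoothed[-1])
```

The `episode_rewards` list is the kind of data item 6 asks you to plot against the episode index, with any plotting library; the smoothed curve is often easier to read than the raw one. A table like this only works for small discrete state spaces. Image-based environments such as the Atari games generally need function approximation, which this sketch does not cover.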