EY Data Science Take Home Challenge


PROJECT OVERVIEW
In this project, you are tasked with solving a supervised classification problem. The client has 
provided 5,899 rows of labeled data, and your job will be to create a model that can predict the correct 
output label.
PROJECT STRUCTURE
You will be using Python and Jupyter notebooks to complete the data science process for this project. 
Feel free to include any third-party packages (through pip or easy_install). However, all work must be 
shown in the Jupyter notebook.
PROJECT TIMELINE
You will have three (3) days to complete the project. By the end of the third day, you must email your 
Jupyter notebook and any supporting files/documents to Troy.m.maikowski@ey.com and 
Yi.Liu@ey.com. All of your code, materials, and anything else you plan on using in your presentation 
must be emailed to the team by the end of the third day (e.g. PowerPoint presentation, visualizations, 
graphs, etc.). As an example, if you receive the challenge on a Friday, it will be due by end of day 
(11:59 PM) CST on Sunday. Upon receiving the documentation, the EY team will work with you to 
schedule a presentation meeting to discuss the results of the project.
PRESENTATION STRUCTURE AND DELIVERABLES
After scheduling the presentation meeting with the team, you will be asked to come into the office or 
given details for a conference bridge where you can present your findings. You will have one (1) hour to 
walk through your process and answer any questions that the team has for you. You will be able to plug 
your laptop into an external monitor through HDMI. If you need any special accommodations (physical 
or technical), please let us know ahead of time so we can make the proper arrangements. You can use 
any presentation medium that you prefer (Jupyter Notebook, PowerPoint, Power BI, etc.) as long as you 
are able to run it from your own laptop.
PROJECT EVALUATION
First and foremost, you will be evaluated on the overall model prediction accuracy on your test data. 
Secondly, you will be evaluated on your process. We would like to see a wide variety of techniques 
demonstrated so we can get a feel for the depth and breadth of your knowledge of Python and the data 
science process. Bonus points for clean, interesting visualizations. Lastly, you will be evaluated on your 
communication and presentation skills as you deliver your findings to the team. Remember, great data 
scientists can tell a compelling story.
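Since prediction accuracy on the test data is the first evaluation criterion, it may help to pin down how that number is computed. The following is a minimal sketch using only the Python standard library. The function names (accuracy, confusion_counts) and the example labels are hypothetical and are not part of the brief.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def confusion_counts(y_true, y_pred):
    """Count each (true label, predicted label) pair."""
    return Counter(zip(y_true, y_pred))


# Hypothetical labels, for illustration only (not taken from the client data).
y_true = ["a", "b", "a", "c", "b", "a"]
y_pred = ["a", "b", "c", "c", "a", "a"]

test_accuracy = accuracy(y_true, y_pred)
pair_counts = confusion_counts(y_true, y_pred)
print(f"accuracy = {test_accuracy:.3f}")
```

The pair counts show which classes are confused with each other, which a single accuracy figure hides. That is often useful material for the presentation.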
PROJECT DETAILS
• Columns A through G are available for use as input data
• Column H is the classification label to be used for prediction
• Set aside 10% of the data set to serve as test data
• You may use any learning algorithm(s) of your choice to complete the project, but be prepared 
to explain your choice
• You are free to use any data exploration and visualization techniques that you see fit
• This document is intentionally left vague; we want you to have as much freedom in solving the 
challenge as possible. Use this as an opportunity to showcase your talent, but be prepared to 
explain any and all decisions you make
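The 10% hold-out described above can be sketched as follows. This is one possible approach, not a required one. It assumes each row is a list with columns A through G in positions 0 to 6 and column H (the label) in position 7. The function name stratified_holdout, the seed, and the synthetic data are all hypothetical. Stratifying by label is a design choice the brief does not ask for.

```python
import random
from collections import defaultdict


def stratified_holdout(rows, label_index=7, test_fraction=0.10, seed=42):
    """Split rows into (train, test), keeping class proportions roughly equal.

    Assumes features sit in positions 0-6 (columns A-G) and the label in
    position 7 (column H), mirroring the layout described in the brief.
    """
    rng = random.Random(seed)

    # Group rows by their label so each class is sampled separately.
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_index]].append(row)

    train, test = [], []
    for group in by_label.values():
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])

    # Shuffle again so rows are not ordered by class.
    rng.shuffle(train)
    rng.shuffle(test)
    return train, test


# Synthetic stand-in for the client data: 5,899 rows, 7 features, 1 label.
# The label values "x", "y", "z" are made up for this illustration.
data_rng = random.Random(0)
rows = [
    [data_rng.random() for _ in range(7)] + [data_rng.choice(["x", "y", "z"])]
    for _ in range(5899)
]

train_rows, test_rows = stratified_holdout(rows)
print(len(train_rows), len(test_rows))
```

In a notebook that already uses scikit-learn, train_test_split with test_size=0.1 and the stratify argument gives an equivalent split in one call. Whichever route you take, fix the random seed so the test set stays the same between runs.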
