DATA1002 / 1902 - Informatics: Data and Computation

2024 Sem2

Group Project Stage 2

THE PROJECT WORK FOR STAGE 2:

Task    Description    Group/individual    Details

1    Identify topic and datasets    Group

All analysis in this Stage must be relevant to a single topic or question, which you are investigating because it matters to some stakeholders. You then need one or more datasets to analyse, producing results that are relevant to this topic/question. You may use the same topic as in Stage 1, but you are equally free to change topic. The members of the group may all work with the same dataset, or some (or all) may choose to work with different datasets. These datasets may be cleaned data from Stage 1, integrated data from Stage 1, or new/extra data that you obtain. There are no requirements for particular origin or volume in the datasets for this stage. We will make available a dataset (on a topic of our choice), and any group can use that data instead if they prefer. Note that all members of the group must work on the same topic/question, even if they use different datasets that deal with different facets of the issue.

We realize that the results you produce from analysis may not completely resolve the issue you are targeting, but each result should at least have the potential to provide some insight. For example, if your topic is “what influences the average level of wealth in a community?”, one analysis may calculate the average wealth in communities with different levels of housing density, and a chart may show how wealth relates to the percentage of people living alone in each suburb. Please make sure that your question or issue is not simply a factual matter, but instead looks at relationships where insights might be impactful for some stakeholder groups (for example, “which country has the highest level of wealth?” on its own is not a good choice of question).

2    Choose summaries and charts to produce    Group

Each member needs to calculate one or more grouped-aggregate summaries from the dataset they are using, and must also produce one or more charts from that dataset. The number of summaries and charts, and some constraints on what sort of attributes are used, depend on the level of score you are seeking. Details are in the marking scheme below. All the summaries must be distinct from one another, and similarly each chart must be distinct. So you need to coordinate among the members: if two members want to do the same calculation, at least one will need to change!
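As a rough illustration of what a grouped-aggregate summary looks like, the sketch below groups records by one attribute and averages another within each group. The records, the field names `density` and `wealth`, and the helper `grouped_mean` are all invented for illustration; they are not taken from any dataset supplied for the project, and a library such as pandas could be used instead.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records with invented values (not real data).
records = [
    {"suburb": "A", "density": "low", "wealth": 100.0},
    {"suburb": "B", "density": "low", "wealth": 140.0},
    {"suburb": "C", "density": "high", "wealth": 80.0},
    {"suburb": "D", "density": "high", "wealth": 90.0},
    {"suburb": "E", "density": "medium", "wealth": 110.0},
]


def grouped_mean(rows, group_key, value_key):
    """Return {group: mean of value_key over the rows in that group}."""
    groups = defaultdict(list)
    for row in rows:
        # Collect the values of value_key under each distinct group value.
        groups[row[group_key]].append(row[value_key])
    return {group: mean(values) for group, values in groups.items()}


# One summary: average wealth for each level of housing density.
summary = grouped_mean(records, "density", "wealth")
print(summary)
```

Changing either the grouping attribute or the aggregated attribute (or the aggregate itself, such as a count or maximum instead of a mean) yields a different summary.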

3a    Use Python to produce a few tables from parts of the data    Individual

Each member then needs to work with their chosen dataset to produce the material for their section in Part A of the report. This will involve writing Python code to calculate one or more summaries and running that code to get the output, which can then be formatted as a Table in the report.
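One possible way to turn code output into something that can be pasted into a report table is sketched below. It assumes the summary is already available as a dictionary from group to value; the function name `format_table`, the column headings, and the numbers are hypothetical and only illustrate the idea.

```python
def format_table(summary, group_header, value_header):
    """Render a {group: value} summary as aligned plain-text rows."""
    # Width of the first column: the longest group label or the header itself.
    width = max([len(group_header)] + [len(str(g)) for g in summary])
    lines = [f"{group_header:<{width}}  {value_header}"]
    for group in sorted(summary):
        # One row per group, with the value rounded to one decimal place.
        lines.append(f"{str(group):<{width}}  {summary[group]:.1f}")
    return "\n".join(lines)


# Hypothetical summary values, invented for illustration.
example_summary = {"low": 120.0, "high": 85.0, "medium": 110.0}
table = format_table(example_summary, "Density", "Mean wealth")
print(table)
```

Sorting the groups gives a stable row order, so the table looks the same each time the code is run.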

3b    Produce a few charts from parts of the data; evaluate the effectiveness of each chart    Individual

Charts are produ
