EMET8002 Analysis and Econometrics Semester 2 2024

Java Python EMET8002 Case Studies in Applied Economic

Analysis and Econometrics

Semester 2 2024

Computer Lab in Week 4

This week we use the 2015 Programme for International Student Assessment (PISA) dataset for our analysis. PISA assesses skills of 15-year-old students across many countries and involves data on student performance (reading, mathematics and science knowledge), as well as parent, school and teacher questionnaires. The data is publicly available and can be downloaded from the PISA website under the section “SPSS (TM) Data Files (compressed)” (https://www.oecd.org/pisa/data/2015database/).  PISA codebooks and questionnaires are available through the same link. However, these are large files and take sometime to download  and convert to Stata format. Therefore, we have prepared a smaller merged dataset of the  Student and School questionnaire data files which now only includes the relevant variables.  You can download this data (“Week4_PISA_data.dta”) from Wattle.

Question 1: Categorical Dependent Variables

We are interested in what determines whether a student has to repeat a grade at school at least once. We call our dependent variable “repeated” and this is a dummy variable that is equal to 1 if a student has repeated a grade at least once, and equal to 0 when the student has never repeated a grade.

(a) Drop missing observations for the “repeated” variable (hint: missing values are

sometimes coded as “.” or “.a”; you may consider “keepor “drop commands).

(b) There are several dummy variables that we will include as independent variables.

These are usually coded as “1” or “0”, but in the PISA dataset these are given as “2” or “1” . For example, “male” is equal to 2 if the student is male and equal to 1 if the student is female. Recode these dummy variables such that they are equal to 1 if the variable description applies to the student, and 0 otherwise (e.g., the “male” dummy is equal to 1 if the student is male, and equal to 0 if the student is female). Do this for the ”male”, “international_lan”, “quiet study” and “training” dummy variables. (hint: you can use the “replacecommand)

(c) Explain why we only need to include one dummy variable (e.g., “male”) in the regression to describe a binary variable with two possible values (e.g., male or  female).

(d) Run alogit model with “repeated” as the dependent variable and the following

independent variables: “international_lan”, “male”, “mother_edu”, “father_edu”, “quiet_study”, “school_size”, “class_size”, “training”, “computer” . How do you  interpret the estimated coefficients?

(e) Exclude missing values for all independent variables. Repeat the logit model from

part d but estimate it without missing values. Are there any differences? Which model do you prefer?

(f)  Repeat your preferred logit model from part e but estimate it using clustered standard errors (by school). EMET8002 Analysis and Econometrics Semester 2 2024 When do we use clustered standard errors and why?

(g) What are the effects of mother’s education (“mother_edu”) on whether a student has

to repeat a grade? Include joint tests in your answer. (hint: this variable is based on the survey question: "What is the <highest level of schooling> completed by your mother?" Possible answers: ISCED level 3A (coded as 1); ISCED level 3B, 3C (coded as 2); ISCED level 2 (coded as 3); ISCED level 1 (coded as 4); She did not complete    ISCED level 1 (coded as 5). For definitions of ISCED education levels see,for example, https://stats.oecd.org/glossary/detail.asp?ID=1436)

(h) Calculate and interpret odds ratios for the logit regression in part d.

(i)  Calculate and interpret marginal effects for father’s education (“father_edu”) based on your logit regression part d.

(j)  Run both logit and probit models using the same variables as in part d and export your results to an excel file. How do your results differ across the two models? (hint: commonly used commands to export data from Stata are the “outreg2” and “estout commands)

(k) Compare the predicted probability values for father’s education (“father_edu”) for

both logit and probit models from part j and export them to an excel file. How do your results differ across the two models?

Question 2: Preparation for Research Proposal [not required for problem set]

The Research Proposal is due on 1 September 2023. First of all, you are expected to provide a summary of the main findings, data and methods of the paper that you have chosen for your research project as well as a brief literature review. Next, you need to propose a minor extension to the original paper. Your idea for the extension needs to be backed up with prior literature and/or theory. You need to explain why your research question is interesting, what you expect to find, and how you are able to test it (with specific statistical hypotheses, data and methods). There is no need to do any analysis at this stage.

(a) Identify a couple of references from the academic literature that are relevant to your project. How are they similar or different from your paper? (methods, data, sample   period, country). Are their findings consistent or contradictory to your paper?

(b) Discuss what involves developing an effective research question. You may refer to section 3.1.2 of the following document:

https://www.economicsnetwork.ac.uk/handbook/ugresearch/31

(c) Come up with a few ideas for your extension. Why do you think that they are  interesting? Based on theory and prior literature, what do you expect to find?  Formulate your idea in the form. of a research question and a formal statistical hypothesis. Make sure that your extension is feasible given the data and time  constraints that you have         

【优化调度】基于改进遗传算法的公交车调度排班优化的研究与实现(Matlab代码实现)内容概要:本文围绕“基于改进遗传算法的公交车调度排班优化的研究与实现”展开,重点介绍了利用改进遗传算法解决公交车调度与排班这一复杂优化问题的方法。研究通过构建数学模型,综合考虑发车频率、线路负载、司机排班、运营成本等因素,采用Matlab进行仿真与代码实现,验证了改进遗传算法在提升调度效率、降低运营成本、优化资源配置方面的有效性。文中对比了多种遗传算法变体(如变异遗传算法、精英遗传算法等),并展示了其在实际公交系统优化中的应用潜力。; 适合人群:具备一定编程基础,熟悉Matlab工具,对智能优化算法(尤其是遗传算法)感兴趣,并从事交通调度、运筹优化、城市规划等相关领域的研究人员或工程技术人员。; 使用场景及目标:①解决城市公交系统中存在的发车不均、资源浪费、司机疲劳等问题;②为公共交通管理部门提供科学的调度决策支持;③研究和比较不同改进遗传算法在复杂调度问题上的性能差异,推动智能优化算法在实际工程中的应用。; 阅读建议:此资源以Matlab代码实现为核心,读者应重点关注算法的设计思路、约束条件的处理以及仿真结果的分析。建议结合文中提供的代码进行实践操作,尝试调整参数或引入新的约束条件,以加深对算法原理和应用场景的理解。
评论
成就一亿技术人!
拼手气红包6.0元
还能输入1000个字符
 
红包 添加红包
表情包 插入表情
 条评论被折叠 查看
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值