STATS 2DA3 R questions

STATS 2DA3 Fall 2024

ASSIGNMENT 1

1. (10 MARKSUsing the iris dataset which is available in R, answer the following questions:

(a) Use one or two lines of R code to display how many rows and columns are in the dataset. (i.e. do not just output all observations in the dataset. Write some code that will output the required information).

(b) Which variables are categorical and which are continuous?

(c)  Graph  1:  Using the ggplot function,  make a scatterplot of “Sepal.Length” against “Petal.Length” (putting “Sepal.Length” on the x-axis).

• Make the data points blue.

• Label the x-axis Sepal Length.

• Label the y-axis Petal Length.

• Label the graph Iris Data.

(d)  Graph 2: Use ggplot to make a bar chart (geom bar) displaying “Species” .  “fill” using “Species” (i.e. each species of ris should be a different colour on the graph).

(e)  Display graphs 1 and 2 in one image using R code (i.e.  do not just screen grab the 2 images and combine them).

2. (3 MARKS) Consider the plot below; it displays information on Vehicle Type and on theassociated drive train. There are 3 different types of drive train : 4 = four wheel drive, f = front wheel drive, r = rear wheel drive.

(a) Which Vehicle Type has the least observations associated with it in the dataset? (b) For “suv” vehicles, what is the majority drive train type?

(c) For “compact” vehicles, which of the 3 drive train types occurs least often?

3. (7 MARKS)

For the Arthritis dataset in the vcd package  [there  are 3 different levels of improvement (None, Some or Marked) that a patient can experience after receiving 1 of 2 medical treat- ments (Placebo or Treated)], perform the following tasks:

(a)  Create a Double Decker plot, displaying “Improved” as a function of “Treatment” and “Sex” . (“Treatment” should be on the lowest x-axis.)  Colour the “Improved” variable so that each level is a different colour.

(b) For female patients in the Treated group, what was the most reported level of improve- ment?

(c) For male patients in the Treated group, what was the least reported level of improve- ment?

(d) Using ggplot make a bar chart (geom bar) displaying “Treatment” .  Colour (“fill”) the “Treatment” variable with respect to the “Improved” variable.

Assignment Standards

• Answer each question.  Do not just provide code.  Any graphs must be rendered and reproduced in the report.

• LATEX is strongly recommended but not strictly required.  The use of Markdown in R studio is also recommended.

•  Submit your assignment as one .pdf documentAll R code should be included and  organized  either  at  the end  of the  assignment  or  inline  (if  using  R Markdown).

• Approximately eleven-point font (times or similar) must be used with around 1.5 line spacing and margins of at least 1 inch all around.

•  Do not include a title page.  The title and your name should be printed at the top of the first page.

• Various tools, including publicly available internet tools, maybe used by the instructor to check the originality of submitted work.

•  Students are not permitted to use generative AI in this course.   In alignment with McMaster academic integrity policy, it  “shall be an offence knowingly to . . .  submit academic work for assessment that was purchased or acquired from another source” . This includes work created by generative AI tools. Also stated in the policy is the fol- lowing, “Contract Cheating is the act of “outsourcing of student work to third parties” (Lancaster & Clarke, 2016, p.  639) with or without payment.” Using Generative AI tools is a form of contract cheating.  Charges of academic dishonesty will be brought forward to the Office of Academic Integrity.

<template> <div class="quiz-container"> <div class="quiz-info"> <h1>知识问答挑战</h1> <div class="question-number">题目 {{ currentQuestionIndex + 1 }}/{{ totalQuestions }}</div> </div> <div class="progress-bar"> <div class="progress" :style="{width: progressWidth}"></div> </div> <!-- 核心:只渲染3个主要DOM元素 --> <!-- 1. 题目容器 --> <div class="question-container" v-once> <h2 class="question-title">{{ currentQuestion.text }}</h2> </div> <!-- 2. 选项容器 --> <div class="options-container"> <div v-for="(option, index) in currentQuestion.options" :key="index" class="option" :class="{selected: selectedOption === index}" @click="selectOption(index)" > <input type="radio" :id="&#39;option&#39;+index" :value="index" :checked="selectedOption === index" > <label :for="&#39;option&#39;+index"> <strong>{{ String.fromCharCode(65 + index) }}.</strong> {{ option.text }} </label> </div> </div> <!-- 3. 导航按钮容器 --> <div class="nav-buttons"> <button class="nav-btn prev-btn" :disabled="currentQuestionIndex === 0" @click="prevQuestion" > ← 上一题 </button> <div class="counter">已答:{{ answeredCount }}/{{ totalQuestions }}</div> <button class="nav-btn next-btn" :disabled="selectedOption === null" @click="nextQuestion" > {{ isLastQuestion ? &#39;完成答题&#39; : &#39;下一题 →&#39; }} </button> </div> </div> </div> </template> <script> export default { data: { questions: this.generateQuestions(50), currentQuestionIndex: 0, selectedOption: null, userAnswers: [], totalQuestions: 50 }, computed: { // 当前问题数据 currentQuestion() { return this.questions[this.currentQuestionIndex]; }, // 进度条宽度 progressWidth() { return `${((this.currentQuestionIndex + 1) / this.totalQuestions) * 100}%`; }, // 是否为最后一题 isLastQuestion() { return this.currentQuestionIndex === this.totalQuestions - 1; }, // 已答题目数量 answeredCount() { return this.userAnswers.filter(answer => answer !== null).length; } }, methods: { // 生成模拟题目数据 generateQuestions(count) { const questions = []; const topics = [&#39;JavaScript&#39;, &#39;CSS&#39;, &#39;HTML&#39;, &#39;Vue.js&#39;, &#39;React&#39;, &#39;Node.js&#39;, &#39;HTTP协议&#39;, &#39;算法&#39;]; const difficulty = [&#39;简单&#39;, &#39;中等&#39;, &#39;困难&#39;]; for (let i = 0; i < count; i++) { const topic = topics[Math.floor(Math.random() * topics.length)]; const diff = difficulty[Math.floor(Math.random() * difficulty.length)]; questions.push({ id: i + 1, text: `以下关于${topic}的叙述中,哪项是${diff}级的知识点?`, options: [ { text: &#39;正确选项&#39;, value: true }, { text: &#39;干扰选项1&#39;, value: false }, { text: &#39;干扰选项2&#39;, value: false }, { text: &#39;干扰选项3&#39;, value: false } ] }); } return questions; }, // 选择选项 selectOption(index) { this.selectedOption = index; }, // 上一题 prevQuestion() { if (this.currentQuestionIndex > 0) { this.currentQuestionIndex--; // 恢复用户之前的选择(如果有) this.selectedOption = this.userAnswers[this.currentQuestionIndex]; } }, // 下一题 nextQuestion() { // 保存用户答案 this.userAnswers[this.currentQuestionIndex] = this.selectedOption; if (!this.isLastQuestion) { this.currentQuestionIndex++; // 重置选择状态,如果之前有答案则恢复 this.selectedOption = this.userAnswers[this.currentQuestionIndex] || null; } else { // 最后一题完成 this.showResults(); } }, // 显示结果 showResults() { const correctCount = this.userAnswers.reduce((count, selected, index) => { return count + (selected !== null && this.questions[index].options[selected].value ? 1 : 0); }, 0); alert(`答题完成!\n共答对 ${correctCount} 题\n正确率: ${(correctCount/this.totalQuestions*100).toFixed(1)}%`); // 重置测试 this.currentQuestionIndex = 0; this.userAnswers = []; this.selectedOption = null; } } } </script> <style> * { margin: 0; padding: 0; box-sizing: border-box; font-family: &#39;Microsoft YaHei&#39;, sans-serif; } body { background-color: #f5f7fa; padding: 20px; } .quiz-container { max-width: 800px; margin: 0 auto; background-color: #fff; border-radius: 10px; box-shadow: 0 5px 15px rgba(0, 0, 0, 0.08); overflow: hidden; padding: 25px; position: relative; } .progress-bar { height: 8px; background-color: #e9ecef; border-radius: 4px; margin-bottom: 25px; position: relative; overflow: hidden; } .progress { height: 100%; background: linear-gradient(90deg, #4c6ef5, #228be6); transition: width 0.4s ease; } .question-container { background-color: #f8f9fa; border-radius: 8px; padding: 20px; margin-bottom: 25px; min-height: 150px; transition: all 0.4s ease; } .question-title { font-size: 18px; color: #343a40; line-height: 1.6; margin-bottom: 20px; } .options-container { display: flex; flex-direction: column; gap: 15px; } .option { display: flex; align-items: center; background: white; border: 2px solid #e9ecef; border-radius: 8px; padding: 14px 20px; cursor: pointer; transition: all 0.2s; } .option:hover { background-color: #f1f3f5; border-color: #ced4da; } .option.selected { background-color: #edf2ff; border-color: #4c6ef5; } .option input { margin-right: 15px; cursor: pointer; } .option label { cursor: pointer; font-size: 16px; color: #495057; flex-grow: 1; } .nav-buttons { display: flex; justify-content: space-between; align-items: center; } .nav-btn { padding: 12px 25px; background-color: #4263eb; color: white; border: none; border-radius: 6px; font-size: 16px; font-weight: 500; cursor: pointer; transition: all 0.2s; display: flex; align-items: center; justify-content: center; } .nav-btn:hover { background-color: #3b5bdb; transform: translateY(-2px); } .nav-btn:disabled { background-color: #adb5bd; cursor: not-allowed; transform: none; } .prev-btn { background-color: #868e96; } .prev-btn:hover { background-color: #495057; } .counter { font-size: 15px; color: #868e96; } .quiz-info { margin-bottom: 15px; display: flex; justify-content: space-between; align-items: center; } .question-number { font-weight: bold; color: #4263eb; font-size: 16px; } </style> 修改一下
06-07
公示是写的这段代码:{ "cells": [ { "cell_type": "markdown", "id": "267c9a27-fcec-4e35-9920-b28d1b244879", "metadata": {}, "source": [ "# Problem Set 1\n", "\n", "This is the first homework assignment, which accounts for $30\\%$ of your final grade. There are two questions, and their weights are:\n", "* Q1: $75\\%$ ($25\\% \\times 3$),\n", "* Q2: $25\\%$.\n", "\n", "You may work with other students. The maximum number of students per group is two. However, you can work on your own. Be sure to indicate with whom you have worked in your submission.\n", "\n", "### Deadline: Nov 14, 2025 (5 PM HK Time). \n", "\n", "There is a penalty for late submissions: $5\\%$ will be subtracted from the total mark for every additional day after the deadline. " ] }, { "cell_type": "markdown", "id": "99230c5e-5ce4-47c4-8030-3da66f532fe9", "metadata": {}, "source": [ "---" ] }, { "cell_type": "code", "execution_count": null, "id": "cba9bc45-4c53-47aa-b714-a4d25fef1d7c", "metadata": {}, "outputs": [], "source": [ "import random\n", "import numpy as np\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "import sqlite3\n", "plt.rc(&#39;figure&#39;, figsize=(8, 4))\n", "PREVIOUS_MAX_ROWS = pd.options.display.max_rows\n", "pd.options.display.max_rows = 15\n", "np.set_printoptions(precision=4, suppress=True)" ] }, { "cell_type": "markdown", "id": "2a0f53f5-7626-46f3-86af-95d4804c4b7b", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "id": "a06dbcb0-18fe-4128-bf43-9c60e6824236", "metadata": {}, "source": [ "## Q1. Modelling Stock Prices via Generalized Random Walks" ] }, { "cell_type": "markdown", "id": "031c9fc5-22fd-43e7-8c91-8b6e975da963", "metadata": {}, "source": [ "In the in-class exercise of lecture 5, we twist the simple walk to model stock returns/prices. Details can be found in the lecture notes. The function ```stock_price_simulations``` and the related model parameters are given below:" ] }, { "cell_type": "code", "execution_count": null, "id": "1615be14-9440-40d6-9c18-0d905828d66e", "metadata": {}, "outputs": [], "source": [ "def stock_price_simulations(P0, T, delta_t, mu, sigma, nwalks):\n", " \"\"\"\n", " P0: the current stock price;\n", " T: time-to-maturity;\n", " delta_t: the time step, double numeric (we choose delta_t such that T/delta_t is an integer);\n", " mu, sigma: mean and standard deviation of the stock return, double numeric;\n", " nwalks: the number of stock return random walks, integer;\n", " Return: a two-dimensional np.array, with each row storing one draw of stock price\n", " random walk with nsteps. \n", " \"\"\"\n", " \n", " nsteps = int(T/delta_t) # the number of steps in each stock return random walk, integer;\n", " draws = np.random.randint(0, 2, size=(nwalks, nsteps))\n", " steps = np.where(draws > 0, 1, -1)\n", " returns = steps*sigma*np.sqrt(delta_t) + mu*delta_t\n", " P1 = P0 * (1+returns).cumprod(axis=1)\n", " return P1\n" ] }, { "cell_type": "code", "execution_count": null, "id": "34cb62f0-7182-4148-8450-db071a89ee1d", "metadata": {}, "outputs": [], "source": [ "### Please don&#39;t change the parameter values\n", "T = 1 # time to maturity\n", "mu = 0.2 # annualised average return\n", "sigma = 0.2 # annualised volatility\n", "r = 0.03 # annualised risk-free rate\n", "P0 = 50 # current stock price\n", "delta_t = 1 / 2520\n", "nwalks = 10000 # the number of random walks" ] }, { "cell_type": "code", "execution_count": null, "id": "173eb2cb-3cb5-4bca-9302-29531f246bf8", "metadata": {}, "outputs": [], "source": [ "np.random.seed(12345)\n", "P1_sim = stock_price_simulations(P0, T, delta_t, mu, sigma, nwalks)\n", "print(P1_sim.shape)" ] }, { "cell_type": "markdown", "id": "832247f3-2497-4c58-9b3c-8f7e855dd779", "metadata": {}, "source": [ "### Q1.1" ] }, { "cell_type": "markdown", "id": "069fec17-f217-4db5-8750-4ca2c117ba2f", "metadata": {}, "source": [ "There are 10,000 simulation paths in ```P1_sim```, each of which has 2520 steps. Based on the simulated stock prices in ```P1_sim```, what is the probability of the terminal stock price being less than 60 but larger than 40? (Hint: You need to count the percentages of stock prices in the final trading period being less than 60 but larger than 40).\n", "\n", "Second, how does the probability of the terminal stock price being less than 60 but larger than 40 depend on the values of ```mu``` and ```sigma```? To answer this question, you need to consider a variety of ```mu``` and ```sigma```, as follows:\n", "```python\n", "mu_seq = np.arange(-0.20, 0.22, 0.02)\n", "sigma_seq = np.arange(0.10, 0.32, 0.02)\n", "```\n", "\n", "For each pair of ```mu``` and ```sigma``` in ```mu_seq``` and ```sigma_seq```, you need to estimate the probability of the terminal stock prices being less than 60 but larger than 40. Based on your estimates, please briefly describe how the values of ```mu``` and ```sigma``` determine the probability that the stock price is less than 60 but larger than 40 in the final trading period. " ] }, { "cell_type": "code", "execution_count": null, "id": "6fb8c11d-22d6-40c3-9d64-4b7211c64769", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "998972d8-00e4-4a01-bf0f-b9ccb74be31c", "metadata": {}, "source": [ "### Q1.2 Newton-Raphson Algorithm to Find the implied volatility\n", "\n", "The Black-Scholes-Merton formula of the European call option is given as follows: <br>\n", "<br>\n", "\\begin{equation}\n", "c(P_t, K, \\sigma, r, T-t) = P_t \\cdot N(d_1) - e^{-r (T-t)} K N(d_2)\n", "\\label{eq:euro_call_BS} \\tag{1}\n", "\\end{equation} \n", "<br>\n", "where \n", "- $P_t$ is the stock price at time $t$,\n", "- $K$ is the strike price of the European call option,\n", "- $\\sigma$ is the volatility of the stock return,\n", "- $r$ is the risk-free rate,\n", "- $T-t$ is the time-to-maturity, and \n", "- $N(\\cdot)$ is the cumulative distribution function of the standard normal ditribution, \n", "\n", "\\begin{equation}\n", "d_1 = \\frac{\\ln(\\frac{P_t}{K}) + (r + \\frac{1}{2} \\sigma^2)(T-t)}{\\sigma \\sqrt{T-t}}, \\ \n", "d_2 = d_1 - \\sigma \\sqrt{T-t}.\n", "\\label{eq:euro_call_BS_d} \\tag{2}\n", "\\end{equation}" ] }, { "cell_type": "code", "execution_count": null, "id": "7055b1bc-995d-4cad-9cab-664402733f97", "metadata": {}, "outputs": [], "source": [ "from scipy.stats import norm\n", "\n", "def euro_call_BS(P0, r, T, sigma, K):\n", " \"\"\"\n", " P0: current asset price;\n", " r: (annualized) risk-free rate;\n", " T: time to maturity;\n", " sigma: volatility of asset returns;\n", " K: strike price;\n", " \n", " Return the price of European call option implied by Black-Scholes Formula.\n", " \"\"\"\n", " \n", " d1 = (np.log(P0/K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))\n", " d2 = d1 - sigma * np.sqrt(T)\n", " return(P0*norm.cdf(d1) - np.exp(-r*T)*K*norm.cdf(d2))\n", " " ] }, { "cell_type": "code", "execution_count": null, "id": "c4af8ffc-c2c7-4b3e-a1a4-1db570435d64", "metadata": {}, "outputs": [], "source": [ "K = 50\n", "price_bs = euro_call_BS(P0, r, T, sigma, K)\n", "print(\"Black-Scholes price of European call option is\", \"%.4f\" % price_bs)" ] }, { "cell_type": "markdown", "id": "7a3742bf-03e1-46d2-be65-8d658a77ce62", "metadata": {}, "source": [ "In lecture 5, we learned how to use the function ```euro_call_BS``` given above to compute the fair value of a European call option, given the parameter values of ```(P0, r, T, sigma, K)```. However, in financial markets, we often observe ```(P0, r, T, K)``` and the option price, and the stock volatility is reverse-engineered using the Black-Scholes-Merton formula. Note that the Black-Scholes-Merton formula is a nonlinear function of ```sigma```, and no close-form solution is provided. In this subquestion, we rely on the ***Newton-Raphson*** algorithm to back out ```sigma``` (Details about the Newton-Raphson algorithm can be found in ***Problem Set 0, question 4***).\n", "\n", "You may need to use the options greeks: https://en.wikipedia.org/wiki/Black%E2%80%93Scholes_model#The_Options_Greeks. " ] }, { "cell_type": "markdown", "id": "8b19647b-904e-44e1-8298-84ebec389fb4", "metadata": {}, "source": [ "In particular, you need to create the following function to estimate the volatility parameter ```sigma``` given the call option price and other model parameters:\n", "```python\n", "def implied_vol_newton(call_price, P0, r, T, K, epsilon, max_steps=1000):\n", " \"\"\"\n", " call_price: the market price of a European call option; \n", " P0: current asset price;\n", " r: (annualized) risk-free rate;\n", " T: time to maturity;\n", " K: strike price;\n", " max_steps int, epsilon > 0. \n", " Returns the implied volatility of a European call option based on the Newton-Raphson algorithm.\n", " If such a float does not exist (the number of loops is more than max_steps), it returns None. \n", " \"\"\"\n", " pass\n", "```\n", "\n", "After you create the above function, please execute the following codes to find out the implied volatility:\n", "```python\n", "call_price = 4.7067\n", "implied_vol_newton(call_price, P0, r, T, K, epsilon=0.001, max_steps=1000)\n", "```" ] }, { "cell_type": "code", "execution_count": null, "id": "48fa9994-56fc-4a06-be61-8d25383a63c0", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "ddf0c0d0-ba2a-4e27-a93e-b2c4a4daa5af", "metadata": {}, "source": [ "### Q1.3 Market price of an exotic option in the Black-Scholes-Merton economy\n", "\n", "In this question, you are asked to price an exotic option. The payoff structure of this option is as follows: When the market price at the maturity date (time $T$) is between 40 and 60, the payoff is equal to one. However, if the market price is higher than 60 or lower than 40 at the maturity date, the payoff is zero. What is the fair market value of this exotic option in the Black-Scholes-Merton economy? \n", "\n", "Remember that in the risk-neutral pricing, the fair price of this exotic option with strike price $K$ is equal to \n", "\n", "$$\n", "C(t, S_t) = \\exp \\{ - r (T-t) \\} \\cdot \\mathbb{E}^Q_t [ 1_{K_1 < P_T < K_2} ], \n", "$$ where $P_T$ is equal to $P_T = P_t \\cdot R_{t \\to T}$, and $1_{K_1 < P_T < K_2}$ is an indicator function that equals one if the terminal stock price $P_T$ is between $K_1$ and $K_2$ and zero otherwise. \n", "\n", "To approximate $\\mathbb{E}^Q_t [ 1_{K_1 < P_T < K_2} ]$, we need to draw a huge amount of simulation paths of terminal stock prices. We denote the simulated outcomes of stock price as $\\{P_T (\\omega_j) \\}_{j=1}^{M}$, then\n", "$$\n", "\\mathbb{E}^Q_t [ \\max\\{ P_T - K, 0 \\} ] \\approx \\frac{1}{M} \\sum_{j=1}^M 1_{K_1 < P_T (\\omega_j) < K_2}.\n", "$$\n" ] }, { "cell_type": "markdown", "id": "d404b9f6-5e93-4cc1-acfa-f7313a0b420a", "metadata": {}, "source": [ "You need to create the function ```exotic_call_MC``` to estimate the fair value of the exotic option based on Monte-Carlo simulations:\n", "```python\n", "def exotic_call_MC(P0, K1, K2, T, delta_t, mu, sigma, nwalks):\n", " \"\"\"\n", " P0: the current stock price;\n", " K1, K2: the strike prices;\n", " T: time-to-maturity;\n", " delta_t: the time step, double numeric;\n", " mu, sigma: mean and standard deviation of the stock return, double numeric;\n", " nwalks: the number of stock price random walks, integer. \n", " The payoff of this exotic option is 1 when terminal stock price is between K1 and K2 and zero otherwise.\n", " Return: the approximated price of the exotic option based on Monte-Carlo simulations.\n", " \"\"\"\n", " pass\n", "```\n", "\n", "After creating the function ```exotic_call_MC```, please execute the following codes:\n", "```python\n", "K1 = 40\n", "K2 = 60\n", "print(\"Option price based on Monte-Carlo simulations is\", \n", " \"%.4f\" % exotic_call_MC(P0, K1, K2, T, delta_t, mu, sigma, nwalks))\n", "```" ] }, { "cell_type": "code", "execution_count": null, "id": "464b006b-5d8e-4c76-8c12-1bcaef8ed79a", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "80348e9c-d6e7-460e-9e49-6a7fa2519489", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "id": "d25d61d2-3986-4662-a4c8-c194c13b6306", "metadata": {}, "source": [ "## Q2. Using SQL ```SELECT``` Statement\n", "\n", "In this question, you will use the **SQL** (more precisely, ```SQLite3```) statements to explore the database. " ] }, { "cell_type": "markdown", "id": "dedc98f7-cdc5-41ce-8ad0-759f5230bc83", "metadata": {}, "source": [ "### Q2.1\n", "\n", "First, you need to download ```\"ps1_stocks.db\"```, using the following Dropbox link: https://www.dropbox.com/scl/fi/ujplikyo63wddn717xkgr/ps1_stocks.db?rlkey=kfh251d49i5mbbfoxlp8brwnv&st=rgig6w8n&dl=0.\n", "\n", "There are three tables in this dataset. \n", "* Table &#39;stock_returns&#39;: contains four variables, ```[&#39;key&#39;, &#39;id&#39;, &#39;eom&#39;, &#39;ret_exc_lead1m&#39;]```;\n", "* Table &#39;st_reversal_signals&#39;: contains nine variables ```[&#39;key&#39;, &#39;id&#39;, &#39;eom&#39;, &#39;iskew_capm_21d&#39;, &#39;iskew_ff3_21d&#39;, &#39;iskew_hxz4_21d&#39;, &#39;ret_1_0&#39;, &#39;rmax5_rvol_21d&#39;, &#39;rskew_21d&#39;]```;\n", "* Table &#39;quality_signal&#39;: contains eight variables ```[&#39;key&#39;, &#39;id&#39;, &#39;eom&#39;, &#39;at_turnover&#39;, &#39;cop_at&#39;, &#39;cop_atl1&#39;, &#39;dgp_dsale&#39;, &#39;gp_at&#39;]```, \n", "\n", "where the variable ```&#39;key&#39;``` is the primary key of the dataset. " ] }, { "cell_type": "markdown", "id": "230e4d63-c596-40d0-a873-df0ebf32eeff", "metadata": {}, "source": [ "After you create the link to ```\"ps1_stocks.db\"``` database, you are asked to execute SQL statements to get the table information as follows:\n", "```python\n", "Information about table stock_returns:\n", "\n", "[(0, &#39;key&#39;, &#39;INTEGER&#39;, 0, None, 0),\n", " (1, &#39;id&#39;, &#39;REAL&#39;, 0, None, 0),\n", " (2, &#39;eom&#39;, &#39;TIMESTAMP&#39;, 0, None, 0),\n", " (3, &#39;ret_exc_lead1m&#39;, &#39;REAL&#39;, 0, None, 0)]\n", "\n", "Information about table st_reversal_signals:\n", "[(0, &#39;key&#39;, &#39;INTEGER&#39;, 0, None, 0),\n", " (1, &#39;id&#39;, &#39;REAL&#39;, 0, None, 0),\n", " (2, &#39;eom&#39;, &#39;TIMESTAMP&#39;, 0, None, 0),\n", " (3, &#39;iskew_capm_21d&#39;, &#39;REAL&#39;, 0, None, 0),\n", " (4, &#39;iskew_ff3_21d&#39;, &#39;REAL&#39;, 0, None, 0),\n", " (5, &#39;iskew_hxz4_21d&#39;, &#39;REAL&#39;, 0, None, 0),\n", " (6, &#39;ret_1_0&#39;, &#39;REAL&#39;, 0, None, 0),\n", " (7, &#39;rmax5_rvol_21d&#39;, &#39;REAL&#39;, 0, None, 0),\n", " (8, &#39;rskew_21d&#39;, &#39;REAL&#39;, 0, None, 0)]\n", "\n", "Information about table quality_signals:\n", "[(0, &#39;key&#39;, &#39;INTEGER&#39;, 0, None, 0),\n", " (1, &#39;id&#39;, &#39;REAL&#39;, 0, None, 0),\n", " (2, &#39;eom&#39;, &#39;TIMESTAMP&#39;, 0, None, 0),\n", " (3, &#39;at_turnover&#39;, &#39;REAL&#39;, 0, None, 0),\n", " (4, &#39;cop_at&#39;, &#39;REAL&#39;, 0, None, 0),\n", " (5, &#39;cop_atl1&#39;, &#39;REAL&#39;, 0, None, 0),\n", " (6, &#39;dgp_dsale&#39;, &#39;REAL&#39;, 0, None, 0),\n", " (7, &#39;gp_at&#39;, &#39;REAL&#39;, 0, None, 0)]\n", "```" ] }, { "cell_type": "code", "execution_count": null, "id": "72e8f643-1689-49de-903f-d7576b2439f2", "metadata": {}, "outputs": [], "source": [ "conn = sqlite3.connect(&#39;data/ps1_stocks.db&#39;)\n", "c = conn.cursor()" ] }, { "cell_type": "code", "execution_count": null, "id": "1a8ecd34-f591-417a-bec0-3196d16c183e", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "63a421e9-a530-4b1e-a9d2-bd31b4143e5d", "metadata": {}, "source": [ "### Q2.2\n", "\n", "Select the stock-month observations from the ```&#39;stock_returns&#39;``` table that satisfy the following requirements:\n", "* Select ```&#39;id&#39;```, ```&#39;eom&#39;```, and ```&#39;ret_exc_lead1m&#39;```\n", "* The stock return should range from $-0.03$ to $0.03$ and is NOT missing\n", "* The stock id contains ```222```\n", "* Sort the data by ```eom``` in the descending order and by ```id``` in the ascending order\n", "* Display the first 5 rows of the sorted data. " ] }, { "cell_type": "code", "execution_count": null, "id": "ade1739f-3f66-4c25-8bd8-eb787993b7fb", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "8d8180ae-7024-43f5-84a0-99dbfda56db2", "metadata": {}, "source": [ "### Q2.3\n", "\n", "Compute the average, maximal, and minimal asset turnovers (```&#39;at_turnover&#39;```) for the stocks that satisfy the following requirements: \n", "* Report ```id``` and create three new variables (```mean_at_turnover```, ```max_at_turnover```, and ```min_at_turnover```) to denote the average, maximal, and minimal returns for selected stocks\n", "* The asset turnover should NOT be missing\n", "* The number of time-series observations is greater than 360." ] }, { "cell_type": "code", "execution_count": null, "id": "76b8c0fd-eae7-4491-9ed0-5d816aa42c9f", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "db46d4d2-2dfb-460b-873c-ddef5f20f372", "metadata": {}, "source": [ "### Q2.4\n", "\n", "In this question, you are asked to extract \n", "* ```iskew_capm_21d``` and ```iskew_ff3_21d``` in table ```st_reversal_signals```,\n", "* ```at_turnover``` and ```gp_at``` in table ```quality_signals```,\n", "* ```ret_exc_lead1m``` in the ```stock_returns``` table,\n", "* and ```key```, ```id```, ```eom``` in all three tables. \n", "\n", "Moreover, we require that there is no missing value in ```ret_exc_lead1m```, ```at_turnover```, ```gp_at```, ```iskew_capm_21d```, and ```iskew_ff3_21d```." ] }, { "cell_type": "code", "execution_count": null, "id": "c7138ac4-dacf-4e64-9978-61fa55f5d192", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "c9049f98-ed59-49e6-bd2b-ffa1328ba6b7", "metadata": {}, "source": [ "### Q2.5\n", "\n", "Compute the average returns per period (grouped by ```eom```) that satisfy the following requirements: \n", "* Report ```eom``` and create a new variable ```mean_ret``` to denote the average returns for selected observations\n", "* The observations should satisfy ```iskew_capm_21d > 0.5``` and ```at_turnover > 0.1```\n", "* Sorted by ```eom``` in the descending order." ] }, { "cell_type": "code", "execution_count": null, "id": "73c7b813-298d-482d-b9f6-ddb2942884ac", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "id": "dd130036-a94b-4c64-8f8a-9a3ea4ba06e4", "metadata": {}, "outputs": [], "source": [ "conn.close()" ] }, { "cell_type": "markdown", "id": "916f359e-55e5-4166-b36f-410874a0a50b", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "id": "0fa62f21-71e5-461e-9378-a12ed14986f9", "metadata": {}, "source": [ "# END" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.2" } }, "nbformat": 4, "nbformat_minor": 5 }
11-06
评论
成就一亿技术人!
拼手气红包6.0元
还能输入1000个字符
 
红包 添加红包
表情包 插入表情
 条评论被折叠 查看
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值