8. Looking Too Hard for Patterns: a post about finding spurious patterns

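This post looks at how Max, the main character of the film Pi, uses mathematics and computer science to try to predict the stock market, and what that shows about applying the scientific method properly and the bias that comes from hunting for a result. Using the Google Correlate tool to analyse search trends, it shows how misleading associations can hide in data, and stresses that statistics must be applied correctly if we are to understand natural phenomena.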


Today, March 14th, is Pi Day. In celebration, this post is related to the film Pi.

Check out the retro style of his computer

Pi is the first film by Darren Aronofsky, who went on to make Requiem for a Dream and Black Swan. I’ll try not to spoil too much, but the starting premise is that the main character, Max, is a mathematician/computer-scientist, who believes he can model the stock market and predict future stock behaviour, if only he finds the right model. I was recently reminded of this central quote from Pi (via Tom Crick), which can be heard in the film’s trailer:

Restate my assumptions:

  1. Mathematics is the language of nature.
  2. Everything around us can be represented and understood through numbers.
  3. If you graph these numbers, patterns emerge. Therefore: there are patterns everywhere in nature.


By stating his assumptions, Max is following the scientific process (hurrah!). This allows us to analyse his assumptions and see if he has made a mistake. Indeed — the implication of his third assumption is flawed: if you graph things, patterns do emerge — but they might well be spurious.

Google Correlate

Google have released a tool that (inadvertently?) demonstrates this wonderfully: Google Correlate. The idea is that you can enter a term and see what other search terms produce a similar trend. That sounds somewhat useful. I decided to use the term “Greenfoot”. Here’s one of the top results I got at the time (Greenfoot is blue, the matching term is red):

That’s quite a decent match, and has a correlation coefficient of 0.9477. As Max suggested, we’ve graphed the numbers, and a pattern has emerged. This red term that matches so well with Greenfoot is… “Google Images”. Not very useful, and not much of a pattern: these two terms correlate well because they originated around the same time, and have grown in search-popularity with a similar pattern ever since. But really, this seems to me to be a spurious result (technically, a “type I” error): we’ve found an effect where really there is none.
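To see how easily a shared trend produces a high correlation coefficient, here is a minimal Python sketch. It is not how Google Correlate works, and the data is made up: two series are generated independently, each drifting upwards over time with its own slope and noise. The slopes, noise levels, seed and helper names (`pearson`, `trending_series`, `differences`) are all assumptions chosen for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def trending_series(rng, weeks, slope, noise):
    """A made-up weekly series: a straight upward trend plus Gaussian noise."""
    return [slope * t + rng.gauss(0, noise) for t in range(weeks)]


def differences(xs):
    """Week-to-week changes, which removes a linear trend."""
    return [later - earlier for earlier, later in zip(xs, xs[1:])]


rng = random.Random(314)  # fixed seed so the run is repeatable

# Two unrelated series that both happen to grow over the same period.
series_a = trending_series(rng, weeks=300, slope=1.0, noise=20.0)
series_b = trending_series(rng, weeks=300, slope=0.4, noise=8.0)

r_levels = pearson(series_a, series_b)
r_changes = pearson(differences(series_a), differences(series_b))

print(f"correlation of the raw series:        {r_levels:.3f}")
print(f"correlation of week-to-week changes:  {r_changes:.3f}")
```

The raw series should correlate strongly even though neither influences the other, because both are dominated by growth over time. Once the trend is removed by looking at week-to-week changes, the correlation should fall to around zero, which is a hint that the original match was mostly the shared trend.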

This is the problem with Max’s approach. There are patterns everywhere if you look hard enough, but that doesn’t mean that they’re useful. And this is a real problem in science, especially with measurement techniques that generate a large amount of data (on which you can then perform a large variety of analyses). One example of a troublesome area is the neuroscience technique fMRI, where too many comparisons can lead to a dead fish detecting human emotions. The quality of our understanding of the human brain is dependent on statistics being applied properly… by human brains. (Recursion!)
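The "too many comparisons" problem can be sketched with pure noise. The following Python example is a simulation under assumed settings, not a model of fMRI or of any real dataset: one random target series of 20 points is compared against 1000 random candidate series. The sizes, the seed and the names are illustrative. The value 2.101 is the standard table value for the two-sided 5% critical t with 18 degrees of freedom, converted here into a threshold on the correlation coefficient.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


POINTS = 20        # observations per series (assumed)
CANDIDATES = 1000  # number of unrelated series we compare against (assumed)

# Two-sided 5% critical t value for POINTS - 2 = 18 degrees of freedom,
# converted to the equivalent threshold on |r|.
T_CRIT = 2.101
R_CRIT = T_CRIT / math.sqrt(T_CRIT ** 2 + (POINTS - 2))

rng = random.Random(2718)  # fixed seed so the run is repeatable

# Everything below is independent noise: there is no real effect to find.
target = [rng.gauss(0, 1) for _ in range(POINTS)]
correlations = []
for _ in range(CANDIDATES):
    candidate = [rng.gauss(0, 1) for _ in range(POINTS)]
    correlations.append(pearson(target, candidate))

best = max(correlations, key=abs)
false_positives = sum(1 for r in correlations if abs(r) > R_CRIT)

print(f"threshold on |r| for p < 0.05: {R_CRIT:.3f}")
print(f"strongest correlation found:   {best:.3f}")
print(f"'significant' matches:         {false_positives} of {CANDIDATES}")
```

With a 5% threshold, roughly one candidate in twenty should pass by chance alone, so a search over 1000 candidates is expected to turn up dozens of "significant" matches, and the best of them can look quite convincing on a graph.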

And so in Pi, Max demonstrates the dark side of science: an obsession with finding a result that drives him so hard that he loses his impartiality and risks finding phantom results. There are techniques to mitigate this problem, called alpha-level correction, and I intend to cover some statistics in future blog posts which will explain these sorts of issues.
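As a rough illustration of alpha-level correction, here is a Python sketch of the Bonferroni correction, one common example of the idea (the post does not say which correction it has in mind). The setup is an assumption made for the demonstration: 1000 simulated tests in which there is no real effect, each a two-sided z-test on 30 draws from a standard normal distribution. The helper `null_p_value` and all the constants are illustrative.

```python
import math
import random
from statistics import NormalDist


def null_p_value(rng, n):
    """Two-sided z-test p-value for a sample whose true mean really is 0.

    The sample is drawn from a standard normal, so the known standard
    deviation is 1 and z = mean * sqrt(n).
    """
    sample = [rng.gauss(0, 1) for _ in range(n)]
    z = (sum(sample) / n) * math.sqrt(n)
    return 2 * (1 - NormalDist().cdf(abs(z)))


ALPHA = 0.05  # the usual per-test significance level
TESTS = 1000  # number of comparisons performed (assumed)

rng = random.Random(1618)  # fixed seed so the run is repeatable
p_values = [null_p_value(rng, 30) for _ in range(TESTS)]

# Uncorrected: each test is judged against ALPHA on its own.
uncorrected = sum(1 for p in p_values if p < ALPHA)

# Bonferroni: divide ALPHA by the number of tests, so the chance of
# even one false positive across all tests stays at or below ALPHA.
bonferroni = sum(1 for p in p_values if p < ALPHA / TESTS)

print(f"false positives at alpha = {ALPHA}:        {uncorrected}")
print(f"false positives at alpha = {ALPHA}/{TESTS}:  {bonferroni}")
```

Since every test here is run on noise, any result that passes is a false positive. The uncorrected threshold should let through about 5% of the tests, while the corrected threshold should let through almost none. The price is that a stricter threshold also makes real effects harder to detect, which is why less conservative corrections exist.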
