Predicting Stocks with Machine Learning: Does Fundamental Investing Work? Attempting to Predict Stock Success with Machine Learning...

Like most of you, I have a strong interest in making more money and growing my savings faster. And as I’ve started my career over the last 2 years, I’ve been seeking advice on how to best manage the money I’ve earned.

To date, my personal finance strategy has been relatively conservative: invest the bulk of my savings in an S&P 500 ETF and let them grow with the market. My friend Emmanuel, who now works in asset management, told me that this method (the Vanguard method) held the best blend of risk and reward, and he gave me a book called “The Bogleheads’ Guide to Investing”, which is built on the investing philosophy of Vanguard’s founder, John C. Bogle. This book explained that putting your money in a total stock market ETF was best because trying to beat the market is time-consuming and nearly impossible: even the best hedge funds could only beat the market for a year or two before ultimately providing similar, or worse, returns.

Still, there have been a few examples of people beating the market time and time again and building extraordinary wealth. The most famous example is Warren Buffett, who has evaluated companies based on their fundamentals (financial metrics like net income and ratios like earnings per share) to make investments that have helped him become one of the wealthiest people on the planet. In fact, a whole school of investing is built on picking stocks by their financial fundamentals, aptly named “fundamental investing.”

Fundamental investing is based on the idea that you can find stocks the market has undervalued relative to their fundamentals. You compare the price the market puts on a stock with the value that fundamental investors assign to it according to a specific valuation metric. The logic goes that if you can find and buy undervalued stocks, you can make money when those stocks reach their market potential.

To test out whether fundamental investing principles really worked and see which specific fundamental metrics would factor most in predicting stock prices, I decided to build a model to try and evaluate fundamental investing.

Note: The next few sections concern the data analysis and modeling process. Feel free to skip to the takeaways below.

Data

I scraped financial data and valuation metrics for every company in the Russell 3000, covering 2 points in time. For January 2018, I scraped the stock price, several financial metrics, and several valuation metrics. For January 2020, I scraped the stock price only, in order to calculate the lift over 2 years. Data was scraped from Morningstar and Yahoo Finance.

Thus, I had my independent variables (the financial and valuation metrics) and my dependent, or target, variable: the percent a stock increased between Jan 2018 and Jan 2020.
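As a minimal sketch of how the target variable could be computed, the snippet below derives the 2-year percent lift from a start and end price. The tickers and prices are hypothetical placeholders, not values from the scraped data, and `percent_lift` is an illustrative helper name rather than anything from the original analysis.

```python
def percent_lift(start_price: float, end_price: float) -> float:
    """Percent change from start_price to end_price."""
    if start_price <= 0:
        raise ValueError("start_price must be positive")
    return (end_price - start_price) / start_price * 100.0


# Hypothetical (Jan 2018 price, Jan 2020 price) pairs, for illustration only.
prices = {
    "AAA": (40.0, 50.0),
    "BBB": (80.0, 60.0),
}

# Target variable: percent lift per ticker over the 2-year window.
lifts = {ticker: percent_lift(p0, p1) for ticker, (p0, p1) in prices.items()}
print(lifts)
```

A stock that falls over the window simply gets a negative lift, so gains and losses share one target column.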

Feature Engineering

After I collected the data, I engineered several ratios based on this article: “15 Financial Ratios that Every Investor Should Use” by investinganswers.com. Some of these ratios were:

  1. Price to Earnings Ratio: stock price / earnings per share. Gives an idea of how the stock is valued compared to how much the company is actually earning.
  2. Price to Book Ratio: price per share / book value per share. Compares how the market values a stock with the accounting (book) value of the company's net assets.
  3. Return on Assets: net income / average total assets. Measures how well the company uses its assets to make money.
  4. Return on Equity: net income / average stockholders' equity. Measures how much money the company generates for each dollar of shareholders' equity. If a company generates more money for its shareholders relative to other companies, the stock will be more valuable to the market.
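The four ratios above can be sketched in a few lines of Python. The `Fundamentals` container, the `compute_ratios` helper, and the sample figures are all hypothetical and chosen only to make the arithmetic easy to follow. Returning `None` for the P/E of a company with non-positive earnings is an assumption of this sketch, not something the original analysis specifies.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Fundamentals:
    price: float                 # price per share
    eps: float                   # earnings per share
    book_value_per_share: float
    net_income: float
    avg_total_assets: float
    avg_equity: float            # average stockholders' equity


def compute_ratios(f: Fundamentals) -> Dict[str, Optional[float]]:
    """Compute the four ratios from the list above."""
    return {
        # Assumption: P/E is treated as undefined when earnings are <= 0.
        "price_to_earnings": f.price / f.eps if f.eps > 0 else None,
        "price_to_book": f.price / f.book_value_per_share,
        "return_on_assets": f.net_income / f.avg_total_assets,
        "return_on_equity": f.net_income / f.avg_equity,
    }


# Hypothetical company, for illustration only.
example = Fundamentals(
    price=50.0,
    eps=2.5,
    book_value_per_share=25.0,
    net_income=10.0,
    avg_total_assets=200.0,
    avg_equity=80.0,
)
ratios = compute_ratios(example)
print(ratios)
```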

Modeling

After removing highly collinear features, I tried several different regression models on my data. Among these, I used ordinary linear regression, ridge, and lasso, using GridSearchCV to tune the hyperparameters.
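The following is a rough sketch of that kind of comparison using scikit-learn, not the original code. It runs on synthetic data with a deliberately weak linear signal, and the alpha grids, the train/test split, and the scaling step are assumptions made for illustration, since the post does not give them. The collinearity-removal step is not shown.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the scraped data: 5 features, weak linear signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = 0.1 * X[:, 0] + rng.normal(size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Hypothetical alpha grids; the grids used in the post are not given.
candidates = {
    "ridge": (Ridge(), {"ridge__alpha": [0.1, 1.0, 10.0, 100.0]}),
    "lasso": (Lasso(max_iter=10_000), {"lasso__alpha": [0.001, 0.01, 0.1, 1.0]}),
}

results = {}

# Baseline: ordinary least squares with no regularization to tune.
baseline = LinearRegression().fit(X_train, y_train)
results["linear"] = {"alpha": None, "test_r2": baseline.score(X_test, y_test)}

for name, (model, grid) in candidates.items():
    # Scale features so the penalty treats them comparably.
    pipeline = make_pipeline(StandardScaler(), model)
    search = GridSearchCV(pipeline, grid, cv=5, scoring="r2")
    search.fit(X_train, y_train)
    results[name] = {
        "alpha": search.best_params_[f"{name}__alpha"],
        "test_r2": search.score(X_test, y_test),  # R² on held-out data
    }

for name, info in results.items():
    print(name, info)
```

Scoring on a held-out split keeps the reported R² from being inflated by the hyperparameter search itself.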

However, these regression models performed very poorly, producing an R² of .01 or less. (R² measures the share of the variance in your target that the model explains from your features.) This suggested that the pattern was non-linear, that the data simply had little relationship with the target variable, or both.
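To make the R² figure concrete, here is a small pure-Python sketch of the standard definition, one minus the ratio of the residual sum of squares to the total sum of squares. The sample returns are hypothetical. A model that always predicts the mean scores exactly 0, which is why an R² of .01 is barely better than guessing the average return.

```python
def r_squared(y_true, y_pred):
    """R² = 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical 2-year percent returns for four stocks.
actual = [10.0, -5.0, 30.0, 15.0]
mean_return = sum(actual) / len(actual)

# Always predicting the average return explains none of the variance.
r2_mean_model = r_squared(actual, [mean_return] * len(actual))

# Predicting every return exactly explains all of it.
r2_perfect_model = r_squared(actual, actual)

print(r2_mean_model, r2_perfect_model)
```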

As a last-ditch effort to get more out of my data, I created polynomial features from all my existing ones and applied a Random Forest Regression model, which in a simplified sense uses “if, else” patterns to make predictions on the data. With this model I was able to improve the R² to .04, four times the previous value (a 300% increase), although still quite poor.

Results

As stated above, the R² for these models was very poor, showing that the features explained very little of the variance no matter what type of model and pattern I tried to use. In addition, I was able to reduce another performance indicator, RMSE, from 42% to 39%. But even so, the results still meant that my predictions were typically about 39 percentage points off (RMSE weights large misses more heavily than a plain average would). If a stock price went up by 50%, my model could predict that it only went up 11%.
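As a quick illustration of what an RMSE of 39 means, the pure-Python sketch below computes RMSE alongside mean absolute error (MAE) on hypothetical returns. In the first case every prediction misses by 39 points and the two metrics agree. In the second case the same total error is concentrated in one prediction, and RMSE rises above MAE.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical percent returns, for illustration only.
actual = [50.0, -10.0]

# Every prediction is off by 39 points: RMSE and MAE are both 39.
even_misses = [11.0, 29.0]
rmse_even = rmse(actual, even_misses)
mae_even = mae(actual, even_misses)

# One exact prediction and one 78-point miss: same MAE, larger RMSE.
one_big_miss = [50.0, 68.0]
rmse_uneven = rmse(actual, one_big_miss)
mae_uneven = mae(actual, one_big_miss)

print(rmse_even, mae_even, rmse_uneven, mae_uneven)
```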

In terms of silver linings, analyzing the feature importances of the random forest model shows that each of the top 5 features contains 1 of 2 ratios: the Price to Book Ratio or the Return on Assets.

Takeaways

  1. It is very difficult to produce a machine learning model that can accurately and precisely predict stock performance from fundamental ratios alone. Many other factors likely contribute to stock price and are just as important, if not more so, which is why in my best model the fundamental ratios explained only 4% of the variance in the target. Such factors could include how well a stock has been performing lately (technical analysis) or how well a stock’s industry has been moving as a whole.
  2. This is not to say that fundamental analysis cannot lead to results. It just means that doing so might be enormously challenging. There’s a reason why most hedge funds don’t outperform the stock market each year.
  3. In cases where fundamental investing does work, perhaps we’re looking for cases where outliers exist in key variables such as the 2 identified above: Price to Book ratio and ROA. With regard to Price to Book, if a stock has an abnormally low ratio, then the market values the stock much lower than its book value. Here, it seems possible that the stock price will increase so that the market price aligns more reasonably with the book value. With regard to ROA, an abnormally high ratio may suggest that the company is doing something special that lets it get the most out of its resources, suggesting better future performance. However, this is just my guess, and I would need to do further analysis to see if there’s any relationship between outliers and stock performance.

Translated from: https://medium.com/@prestonlam/does-fundamental-investing-work-attempting-to-predict-stock-success-with-machine-learning-c3dcc139c6c
