How I’d Learn Data Science If I Could Start Over (2 Years In)

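This article offers a data science learning guide for learners from a non-technical background. It stresses the importance of mathematics and statistics, programming fundamentals (especially Python and SQL), and machine learning algorithms and concepts. The author suggests starting with calculus, statistics, and linear algebra, then learning Python and Pandas, then moving on to machine learning algorithms such as linear regression and logistic regression. Finally, hands-on projects consolidate what has been learned.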


Table of Contents

  1. Preface
  2. Introduction
  3. Mathematics and Statistics
  4. Programming Fundamentals
  5. Machine Learning Algorithms and Concepts
  6. Data Science Projects

Preface

Coming from a non-technical background, I was more or less on my own.

When I first started my data science journey, I spent a chunk of time figuring out where to even begin, what I should learn first, and what resources I should use.

Over the past two years, I’ve learned several things that I wish someone had told me, like whether to focus on programming or statistics first, what resources I should use to learn new skills, and how I should approach learning them.

Therefore, this article aims to provide some direction and insights for those who are learning data science.

Introduction

My assumption is that as an aspiring data scientist, you’ll want to fully understand the concepts and details of various machine learning algorithms, data science concepts, and so forth.

Therefore, I recommend that you start with the building blocks before you even look at machine learning algorithms or data science applications. If you don’t have a basic understanding of calculus (integrals in particular), linear algebra, and statistics, you’ll have a hard time understanding the mechanics behind various algorithms. Likewise, if you don’t have a basic understanding of Python, you’ll have a hard time applying your knowledge in real-life applications.

Below is the order of topics that I recommend you go through:

  1. Mathematics and Statistics
  2. Programming Fundamentals
  3. Machine Learning Algorithms and Concepts

1. Mathematics and Statistics

Like anything else, you have to learn the fundamentals before you get to the fun stuff. Trust me, I would’ve had a much easier time if I had started with mathematics and statistics before getting into any machine learning algorithms.

The three general topics that I recommend you review are calculus/integrals, statistics, and linear algebra (in no particular order).

a. Integrals

Integrals are essential when it comes to probability distributions and hypothesis testing. While you don’t need to be an expert, it’s in your best interest to learn the fundamentals of integrals.
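
To make the link between integrals and probability distributions concrete, here is a minimal sketch in Python. It assumes a standard normal distribution and approximates the area under its density curve with the trapezoidal rule. The function names `normal_pdf` and `integrate` are hypothetical helpers written for this illustration, not part of any library.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of the normal distribution at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def integrate(f, a, b, steps=10_000):
    """Approximate the integral of f over [a, b] with the trapezoidal rule."""
    width = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * width)
    return total * width


# Probability that a standard normal variable falls within 1.96 standard
# deviations of its mean: the area under the density curve on that interval.
prob = integrate(normal_pdf, -1.96, 1.96)
print(round(prob, 4))
```

The result is close to 0.95, which is where the familiar "95% within 1.96 standard deviations" rule used in hypothesis testing comes from.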

The first two articles are for those who want to get an idea of what integrals are all about or for those who simply need a refresher. If you know absolutely nothing about integrals, I recommend that you complete Khan Academy’s course. Lastly, I’ve provided a link to a number of practice problems to hone your skills.

b. Statistics

If there were one topic that you should focus the majority of your time on, it’s statistics. After all, a data scientist is really a modern statistician, and machine learning is a modern term for statistics.

If you have the time, I recommend that you go through Georgia Tech’s course called “Statistical Methods”, which covers probability fundamentals, random variables, probability distributions, hypothesis testing, and more.

If you don’t have the time to commit to the course above, I definitely recommend going through Khan Academy’s videos on statistics.
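
As a small taste of the hypothesis testing mentioned above, the sketch below runs a permutation test for a difference in means using only the Python standard library. The permutation test is chosen here for illustration because it needs no distribution tables. The data and the helper name `permutation_test` are made up for this example.

```python
import random
import statistics


def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Estimate a two-sided p-value for a difference in means by shuffling labels."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(group_a) - statistics.mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_permutations


# Made-up measurements for two groups (illustrative only).
control = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 11.7]
treated = [12.9, 13.1, 12.7, 13.4, 12.8, 13.0, 13.3, 12.6]

p_value = permutation_test(control, treated)
print(f"estimated p-value: {p_value:.4f}")
```

A small p-value says that a gap this large would rarely appear if the group labels were irrelevant, which is the core idea behind most hypothesis tests.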

c. Linear Algebra

Linear algebra is especially important if you want to get into deep learning, but it’s also good to know for other fundamental machine learning concepts, like principal component analysis and recommendation systems.

For linear algebra, I also recommend Khan Academy!
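
To show where linear algebra enters principal component analysis, here is a minimal sketch restricted to two features, so that the eigenvalues of the covariance matrix have a closed form and no external library is needed. The helper `first_principal_component` and the sample points are hypothetical, and the sketch assumes the points are not all identical.

```python
import math


def first_principal_component(points):
    """Return (direction, explained_ratio) for a list of 2-D points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n

    # Entries of the 2x2 sample covariance matrix [[a, b], [b, c]].
    a = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    c = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    b = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)

    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    half_trace = (a + c) / 2.0
    spread = math.sqrt(((a - c) / 2.0) ** 2 + b ** 2)
    largest, smallest = half_trace + spread, half_trace - spread

    # Eigenvector for the largest eigenvalue, normalized to unit length.
    if b != 0.0:
        vx, vy = b, largest - a
    else:
        vx, vy = (1.0, 0.0) if a >= c else (0.0, 1.0)
    norm = math.hypot(vx, vy)
    return (vx / norm, vy / norm), largest / (largest + smallest)


# Made-up points lying close to the line y = 2x (illustrative only).
points = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1)]
direction, explained = first_principal_component(points)
print(direction, explained)
```

The first component points along the direction in which the data varies most, and `explained` is the share of total variance captured by that single direction. In practice a library routine would handle more than two features.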

2. Programming Fundamentals

Just as having a fundamental understanding of math and stats is important, having a fundamental understanding of programming will make your life much easier, especially when it comes to implementation. Therefore, I recommend that you take the time to learn basic SQL and Python before diving into machine learning algorithms.

a. SQL

It’s entirely up to you whether you want to learn Python or SQL first, but if you were to ask me, I’d start with SQL. Why? It’s easier to learn, and it’s useful to know if you work for a company that works with data, even if you’re not a data scientist.

If you’re completely new to SQL, I recommend going through Mode’s SQL tutorials, as they’re very succinct and thorough. If you want to learn more advanced concepts, I would check out my list of resources where you can learn advanced SQL.

More importantly, below are a handful of resources that you can use to practice SQL.
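
One low-setup way to practice basic SQL is Python’s built-in `sqlite3` module, which runs queries against an in-memory database. The sketch below is only an illustration: the `orders` table and its rows are made up, and the query shows a typical GROUP BY aggregation.

```python
import sqlite3

# An in-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("alice", 30.0), ("bob", 12.5), ("alice", 20.0), ("carol", 7.5), ("bob", 2.5)],
)

# Total spend per customer, largest first: GROUP BY plus aggregates.
query = """
    SELECT customer, SUM(amount) AS total, COUNT(*) AS n_orders
    FROM orders
    GROUP BY customer
    ORDER BY total DESC
"""
rows = conn.execute(query).fetchall()
for customer, total, n_orders in rows:
    print(customer, total, n_orders)
conn.close()
```

Changing the query string and rerunning is a quick way to experiment with WHERE, JOIN, and window functions as you meet them.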

b. Python

I started with Python, and I’ll probably stick with Python for the rest of my life. It’s far ahead in terms of open source contributions, and it’s straightforward to learn. Feel free to go with R if you want, but I have no opinions or advice to provide regarding R.

Personally, I found that learning Python by ‘doing’ is much more helpful. That being said, after going through several Python crash courses, I found this one to be the most comprehensive (and it’s free!).

c. Pandas

Arguably the most important library to know in Python is Pandas, which is specifically meant for data manipulation and analysis.

Below are two resources that should ramp you up pretty quickly. The first link is a tutorial on how to use Pandas, and the second link provides dozens and dozens of practice problems that you can use to solidify what you’ve learned!
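
For a sense of what data manipulation with Pandas looks like, here is a minimal sketch covering three everyday operations: filtering rows, aggregating by group, and deriving a column. It assumes the `pandas` package is installed, and the small dataset is made up for illustration.

```python
import pandas as pd

# A tiny made-up dataset (illustrative only).
df = pd.DataFrame(
    {
        "city": ["Toronto", "Toronto", "Vancouver", "Vancouver", "Montreal"],
        "year": [2019, 2020, 2019, 2020, 2020],
        "sales": [100.0, 120.0, 80.0, 95.0, 60.0],
    }
)

# Filter rows with a boolean mask.
recent = df[df["year"] == 2020]

# Aggregate: mean sales per city, sorted from highest to lowest.
mean_sales = df.groupby("city")["sales"].mean().sort_values(ascending=False)

# Derive a new column from an existing one.
df["sales_share"] = df["sales"] / df["sales"].sum()

print(recent)
print(mean_sales)
```

The `groupby` call here plays the same role as GROUP BY in SQL, which is one reason the two skills reinforce each other.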

3. Machine Learning Algorithms and Concepts

If you’ve gotten to this point, that means that you’ve built your foundation and you’re ready to learn the fun stuff. This section is split into two parts: machine learning algorithms and machine learning concepts.

a. Machine Learning Algorithms

The next step is to learn about the various machine learning algorithms, how they work, and when to use them. Below is a non-exhaustive list of the various machine learning algorithms and resources that you can use to learn about each one.
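
As an example of how one of these algorithms works under the hood, the sketch below fits a simple linear regression with ordinary least squares in plain Python. The helper `fit_line` and the data are hypothetical. A library implementation would normally be used in practice, but writing the formula out once shows how the earlier math feeds into the algorithm.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up data that roughly follows y = 1 + 2x (illustrative only).
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [3.1, 4.9, 7.2, 8.8, 11.0]

intercept, slope = fit_line(hours, scores)
predicted = intercept + slope * 6.0
print(f"score = {intercept:.2f} + {slope:.2f} * hours; prediction at 6 hours: {predicted:.2f}")
```

The slope is the covariance of the two variables divided by the variance of the input, which ties the algorithm directly back to the statistics covered earlier.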

b. Machine Learning Concepts

Similarly, there are several fundamental machine learning concepts that you’ll want to go over as well. Below is a (non-exhaustive) list of concepts that I strongly recommend you go through. A lot of interview questions are based on these topics!

4. Data Science Projects

By this point, you’ll have not only a strong foundation but also a solid understanding of machine learning fundamentals. Now it’s time to work on some personal side projects, the same way coders have their own side projects.

If you want to look at some simple data science project examples, check out some of my projects below:

  • Predicting Wine Quality w/ Classification Techniques (Article, Github)
  • Coronavirus Data Visualizations using Plotly (Article, Github)
  • Collaborative Filtering Recommendation System for Movies (Github)

Here’s a list of data science projects that you can look at to generate ideas and come up with an interesting side project of your own.

Thanks for Reading!

I hope that this provides some direction and helps you in your data science career. There’s no cookie-cutter way of approaching this, so feel free to take this with a grain of salt, but I truly believe that learning the fundamentals will pay dividends in the future.

Terence Shin

Translated from: https://towardsdatascience.com/how-id-learn-data-science-if-i-could-start-over-2-years-in-b821d8a4876c
