IT Interview Questions Explained: SQL (with a SQL walkthrough handout)

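【Course Overview】

SQL is used constantly at work and also appears frequently in IT interviews, where it makes up a growing share of the questions. Candidates often lose an offer because they cannot solve the SQL problems. This course covers both practical usage and real interview questions. It shares efficient ways to write SQL, so that everyday requirements can be met without roundabout, long-winded queries, and it compiles real interview questions from more than 40 major companies into a SQL problem set, distilling the common topics into simple solution patterns that can be applied like formulas. The goal is to help you master SQL with ease.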

【Recommended Tutorial】IT Interview Questions Explained: SQL (Part 1)

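【Topics Covered】

1. How to solve problems by applying a formula

2. Solutions to 40+ real questions from major companies

3. Lightning-fast, efficient ways to write SQL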

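【Instructor】

Gold-medal instructor: Chen

Former senior technical expert at Ping An of China (中国平安), with 9 years of big data development experience; proficient in SQL, with a clear and accessible teaching style.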

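1. SQL Execution Order

The SQL statements below execute in this order: from (load the two tables table1 and table2) -> join -> on -> where -> group by -> the plain columns and the aggregate functions count and sum after select -> having -> distinct -> order by -> limit. Listing select before having matches the first example, where having refers to the alias cnt; Hive and MySQL accept this, whereas standard SQL logically evaluates having before select.

-- Example: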

select a.sex, b.city, count(1) as cnt, sum(salary) as sum1

from table1 a

join table2 b on a.id=b.id

where a.name=b.name

group by a.sex,b.city

having cnt>=2

order by a.sex,b.city

limit 10

-- Or:

select distinct

a.sex, b.city, a.age

from table1 a

join table2 b on a.id=b.id

where a.name=b.name

order by a.sex,b.city

limit 10
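One practical consequence of the order above is that on is applied while the join is being built, and where only filters the joined rows afterwards. The following is a minimal sketch of that effect, not part of the original course material: it uses Python's built-in sqlite3 module as a stand-in for Hive, and the three rows in table1 and two rows in table2 are made-up sample data. The table and column names follow the examples in this section.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Hive (an assumption
# made so the example runs anywhere; the join semantics shown are standard).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table table1 (id integer, name text);
    create table table2 (id integer, name text);
    -- Hypothetical sample rows: id 1 matches on name, id 2 does not,
    -- id 3 has no partner row in table2 at all.
    insert into table1 values (1, 'ann'), (2, 'bob'), (3, 'cat');
    insert into table2 values (1, 'ann'), (2, 'zed');
""")


def row_count(sql):
    """Run a query and return how many rows it produces."""
    return len(conn.execute(sql).fetchall())


# left join, name condition in where: unmatched rows are filtered out afterwards.
left_where = row_count(
    "select * from table1 a left join table2 b on a.id=b.id where a.name=b.name")

# left join, name condition in on: every table1 row survives, padded with nulls.
left_on = row_count(
    "select * from table1 a left join table2 b on a.id=b.id and a.name=b.name")

# inner join: both placements give the same rows.
inner_where = row_count(
    "select * from table1 a join table2 b on a.id=b.id where a.name=b.name")
inner_on = row_count(
    "select * from table1 a join table2 b on a.id=b.id and a.name=b.name")

print(left_where, left_on, inner_where, inner_on)
```

With this sample data the left join keeps all three table1 rows when the name condition sits in on, but only the single fully matching row when it sits in where; the two inner join variants agree.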

-- Discussion: the order of on and where

-- The two left join queries below return different results.

-- This shows that on is applied first, and where afterwards.

select *

from table1 a left join table2 b

on a.id=b.id

where a.name=b.name;

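-- The query below may return more rows than the one above: table1 rows that fail the name condition are kept, with nulls in the table2 columns, instead of being filtered out.

select *

from table1 a left join table2 b

on a.id=b.id

and a.name=b.name;

-- With inner join, the two queries below return the same result.

select *

from table1 a

join table2 b

on a.id=b.id

where a.name=b.name;

select *

from table1 a

join table2 b

on a.id=b.id

and a.name=b.name;

2. Ten Hive Problems

First set up the environment: configure a Hive data source in PyCharm, DataGrip, or IDEA. You can also configure a SparkSQL data source instead, which runs faster. For a Hive data source, start HDFS and YARN, the Hive metastore, and Hive's hiveserver2 beforehand: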

# Start HDFS and YARN

start-all.sh

# Hive metastore

nohup /export/server/hive/bin/hive --service metastore > /tmp/hivemetastore.log 2>&1 &

# Hive hiveserver2 (after it is started, it takes about 2 minutes before it is usable)

nohup /export/server/hive/bin/hive --service hiveserver2 > /tmp/hivehiveserver2.log 2>&1 &
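Because hiveserver2 needs a couple of minutes before it accepts connections, it can help to poll its port rather than guess when it is ready. The helper below is a minimal sketch under stated assumptions: wait_for_port is a hypothetical name introduced here, the host node1 is taken from the Spark command in this section, and port 10000 is HiveServer2's default Thrift port (adjust it if your hive-site.xml overrides it).

```python
import socket
import time


def wait_for_port(host, port, timeout=180.0, interval=5.0):
    """Return True once a TCP connection to host:port succeeds,
    or False if it still fails when `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return False
            # Sleep before retrying, but never past the deadline.
            time.sleep(min(interval, remaining))


# Example call (assumed host and default port, not run here):
# ready = wait_for_port("node1", 10000)
```

A True result only shows that the port is open; the data source connection in the IDE is still the real check that hiveserver2 is serving queries.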

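If you run into problems at this point, one fix is to add the following to hive/conf/hive-env.sh, which sets the Hadoop client heap options:

export HADOOP_CLIENT_OPTS=" -Xmx512m"

export HADOOP_HEAPSIZE=1024

Restart hiveserver2 after making the change.

To configure a SparkSQL data source instead, start HDFS, the Hive metastore, and Spark's Thriftserver service beforehand. Integrating Spark 3 with Hive 3 requires a specific set of jars; integrating Spark 2 with Hive 2 requires a different set.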

# Start HDFS and YARN

start-all.sh

# Hive metastore

nohup /export/server/hive/bin/hive --service metastore > /tmp/hivemetastore.log 2>&1 &

# Spark Thriftserver service

/export/server/spark/sbin/start-thriftserver.sh \
--hiveconf hive.server2.thrift.port=10001 \
--hiveconf hive.server2.thrift.bind.host=node1 \
--master local[*]

show databases ;

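create database if not exists test_sql;

use test_sql;

-- Some statements run as MapReduce jobs, which makes them slow. Enabling automatic local-mode execution can speed them up.

set hive.exec.mode.local.auto=true; -- (default: false)

-- Problem 1: visit count statistics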

CREATE TABLE test_sql.test1 ( userId string, visitDate string, visitCount INT ) ROW format delimited FIELDS TERMINATED BY "\t";

INSERT overwrite TABLE test_sql.test1

VALUES

( 'u01', '2017/1/21', 5 ), ( 'u02', '2017/1/23', 6 ), ( 'u03', '2017/1/22', 8 ), ( 'u04', '2017/1/20', 3 ), ( 'u01', '2017/1/23', 6 ), ( 'u01', '2017/2/21', 8 ), ( 'u02', '2017/1/23', 6 ), ( 'u01', '2017/2/22', 4 );
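The sample rows above can be used to try a per-user running total locally. The sketch below is an illustration under assumptions, not the course's official answer: it assumes the task is to compute each user's monthly visit subtotal and the cumulative total up to that month, it runs on Python's built-in sqlite3 (window functions need SQLite 3.25 or later) instead of Hive, and the month is derived with SQLite string functions where Hive would use its own date functions. The names month1 and sum1 follow this section's naming; running_total is a hypothetical alias.

```python
import sqlite3

# Same sample rows as the insert statement in this section.
rows = [
    ("u01", "2017/1/21", 5), ("u02", "2017/1/23", 6),
    ("u03", "2017/1/22", 8), ("u04", "2017/1/20", 3),
    ("u01", "2017/1/23", 6), ("u01", "2017/2/21", 8),
    ("u02", "2017/1/23", 6), ("u01", "2017/2/22", 4),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table test1 (userId text, visitDate text, visitCount integer)")
conn.executemany("insert into test1 values (?, ?, ?)", rows)

query = """
with tagged as (
    -- Turn '2017/1/21' into a sortable month key such as '2017-01'.
    select userId,
           printf('%s-%02d',
                  substr(visitDate, 1, 4),
                  cast(substr(visitDate, 6,
                              instr(substr(visitDate, 6), '/') - 1) as integer)
           ) as month1,
           visitCount
    from test1
),
monthly as (
    -- Step 1: subtotal per user and month.
    select userId, month1, sum(visitCount) as sum1
    from tagged
    group by userId, month1
)
-- Step 2: running total of the monthly subtotals within each user.
select userId, month1, sum1,
       sum(sum1) over (partition by userId order by month1) as running_total
from monthly
order by userId, month1
"""

result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The two-step shape (aggregate first, then apply a window sum over the aggregated rows) is the reusable pattern; for u01 the subtotals 11 and 12 accumulate to 11 and 23.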

select *, sum(sum1) over(partition by userid order by month1 /*rows between unbounded preceding and current row*/ ) as `累积` fro