Hive Lab 2

QuYu~

Article link: https://blog.youkuaiyun.com/qq_41989845/article/details/103132409

1. Create the tables correctly, load the data (three tables, three data files), and verify the result.
create table course(cid int, cname string) row format delimited fields terminated by ',';
load data local inpath '/home/hadoop/file/hive/course.csv' into table course;
select * from course;

create table student(sid int, sname string, grade int, class int) row format delimited fields terminated by ',';
load data local inpath '/home/hadoop/file/hive/student.csv' into table student;
select * from student;

create table score(sid int, cid int, score int) row format delimited fields terminated by ',';
load data local inpath '/home/hadoop/file/hive/score.csv' into table score;
select * from score;
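The verification step can go a little further than reading the output of select *. The sketch below assumes the three tables defined above; the specific checks (row counts, NULL fields, orphaned score rows) are illustrative suggestions, not part of the original lab.

```sql
-- Show the column names and types Hive recorded for each table
describe course;
describe student;
describe score;

-- Row counts; compare them by hand with the line counts of the CSV files
select count(*) from course;
select count(*) from student;
select count(*) from score;

-- In a delimited text table, a field that cannot be parsed as int is read as NULL,
-- so a non-zero count here hints at a delimiter or formatting problem
select count(*) from score
where score.sid is null or score.cid is null or score.score is null;

-- Score rows that reference a student missing from the student table
select s.sid, s.cid
from score s
left outer join student st on s.sid = st.sid
where st.sid is null;
```

If the last two queries return 0 and no rows respectively, the load most likely split the fields as intended.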

2. Query the score information for all students: student name, course name, and course score.

select student.sname, course.cname, score.score from student, course, score where student.sid = score.sid and score.cid = course.cid;

3. Query the student ID and course scores of students whose score in course 10 is higher than their score in course 20.

select distinct x.sid,x.score,y.score from score x ,score y where (x.cid=10 and y.cid=20) and (x.score>y.score) and x.sid=y.sid;

4. Query the student ID, student name, and average score of students whose average score is at least 60.
select x.sid,y.sname,avg(x.score) from score x,student y where x.sid=y.sid
group by x.sid,y.sname having avg (x.score)>=60 ;

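5. Query and create a table: retrieve each student's name and scores for all courses, and save the name together with the scores as an array in the temp table.
The result looks like: Eric [53,29,33,27,22,43,55]
insert overwrite local directory '/home/hadoop/temp'
select sname, collect_list(score) from student, score where student.sid = score.sid group by sname;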
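The statement above writes the result to a local directory. If a Hive table named temp is what the exercise asks for, one possible approach is a CREATE TABLE AS SELECT, sketched here. The column alias scores is an assumption introduced for illustration, and the drop statement assumes any existing temp table may be discarded.

```sql
-- Remove a previous copy so the statement can be re-run
drop table if exists temp;

-- Build temp directly from the query; scores becomes an array<int> column
create table temp as
select student.sname,
       collect_list(score.score) as scores
from student
join score on student.sid = score.sid
group by student.sname;

-- Check the stored rows
select * from temp;
```

Grouping by sname alone merges students who share a name, as in the original query; adding student.sid to the select list and the group by would keep them apart.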

6. Query the average score of course 10.
select avg(score) from score where cid=10 group by cid;

7. Query the average score of each course (course ID, course name, average score).
select score.cid,course.cname,avg(score.score) from score ,course where course.cid=score.cid group by score.cid,course.cname;

8. Sort and rank the students' scores within each course (course ID, student ID, score, rank).
select cid, sid, score, rank() over (partition by cid order by score desc) as ranking from score;
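rank() is one of three closely related window functions, and the choice matters when scores tie. The sketch below runs all three side by side on the same score table; the aliases rnk, dense_rnk, and row_num are names chosen for illustration.

```sql
select cid, sid, score,
       -- ties share a rank and the following rank is skipped: 1, 1, 3
       rank()       over (partition by cid order by score desc) as rnk,
       -- ties share a rank and no rank is skipped: 1, 1, 2
       dense_rank() over (partition by cid order by score desc) as dense_rnk,
       -- every row gets a distinct number; the order among ties is arbitrary
       row_number() over (partition by cid order by score desc) as row_num
from score;
```

Filtering on rnk = 1 or dense_rnk = 1 returns every tied top scorer, while row_num = 1 returns exactly one row per course.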

9. Query the top-ranked student(s) of each course, including all ties for first place (course name, student name, score).
select tab.cname, tab.sname, tab.score from (select course.cname, student.sname, score.score, rank() over (partition by score.cid order by score.score desc) seq from student, score, course where student.sid = score.sid and course.cid = score.cid) tab where tab.seq = 1;

10. Count the students who failed each course (course ID, number of failing students).
select cid,count(sid) from score where score.score<60 group by cid ;

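11. List the names of the students who failed each course (course ID, collection of failing students' names), for example: 20 ["Eric","Joy"]
select score.cid, collect_list(student.sname) from score, student where score.sid = student.sid and score.score < 60 group by score.cid;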
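Tasks 10 and 11 group the same failing rows by course, so the count and the name list can also be produced in a single pass. This is only a sketch of that combination; the aliases failed_count and failed_names are assumptions introduced here.

```sql
select score.cid,
       count(score.sid)            as failed_count,
       -- collect_list keeps duplicates; collect_set would drop repeated names
       collect_list(student.sname) as failed_names
from score
join student on score.sid = student.sid
where score.score < 60
group by score.cid;
```

Note that the inner join drops score rows with no matching student, so failed_count can be smaller than the count in task 10 if such rows exist.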

12. Query the student ID and name of students who failed two or more courses.
Select student.sid,student.sname from student , score where student.sid=score.sid and score.score<60 group by student.sid,student.sname having count(score.score)>=2;

13. Query each student's total score and rank the totals (name, total score, rank).
Select sname,sum(score),rank() over(order by sum(score) desc) from student,score where student.sid=score.sid group by student.sid,student.sname;

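14. Show every score of every student together with the student's average score, ordered by average score from high to low (sid, cid, score, average).
select sid, cid, score, avg(score) over (partition by sid) average from score order by average desc, sid;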

15. Query the highest, lowest, and average score of each course (course ID, highest score, lowest score, average score).
Select cid,max(score),min(score),avg(score) from score group by cid;