Row in Spark

Reposted under the CC 4.0 BY-SA license.

Original link: https://blog.youkuaiyun.com/weixin_36630761/article/details/77101357

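hkl says: I have copied the official API documentation over as-is. If the English is hard to follow, that is fine; just read the examples.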

http://spark.apache.org/docs/1.3.1/api/scala/index.html#org.apache.spark.sql.Row

Note: import this package first, with import org.apache.spark.sql._, before using Row.

Represents one row of output from a relational operator. Allows both generic access by ordinal, which will incur boxing overhead for primitives, as well as native primitive access.

It is invalid to use the native primitive interface to retrieve a value that is null, instead a user must check isNullAt before attempting to retrieve a value that might be null.

To create a new Row, use RowFactory.create() in Java or Row.apply() in Scala.

A Row object can be constructed by providing field values. Example:

import org.apache.spark.sql._

// Create a Row from values.
Row(value1, value2, value3, ...)
// Create a Row from a Seq of values.
Row.fromSeq(Seq(value1, value2, ...))
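The snippet above uses placeholder values. As a minimal sketch with concrete values (the object name CreateRows is only for illustration, and it assumes the Spark SQL library is on the classpath), the two constructors can be tried like this:

```scala
import org.apache.spark.sql.Row

// Illustrative wrapper object; the name is not part of the Spark API.
object CreateRows {
  // Build a Row directly from values.
  val fromValues: Row = Row(1, true, "a string", null)

  // Build an equivalent Row from a Seq of values.
  val fromSequence: Row = Row.fromSeq(Seq(1, true, "a string", null))

  def main(args: Array[String]): Unit = {
    println(fromValues)   // [1,true,a string,null]
    println(fromSequence) // [1,true,a string,null]
  }
}
```

Neither call needs a running SparkContext, so this is a convenient way to experiment with Row in the REPL.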

A value of a row can be accessed through both generic access by ordinal, which will incur boxing overhead for primitives, as well as native primitive access. An example of generic access by ordinal:

import org.apache.spark.sql._

val row = Row(1, true, "a string", null)
// row: Row = [1,true,a string,null]
val firstValue = row(0)
// firstValue: Any = 1
val fourthValue = row(3)
// fourthValue: Any = null

For native primitive access, it is invalid to use the native primitive interface to retrieve a value that is null, instead a user must check isNullAt before attempting to retrieve a value that might be null. An example of native primitive access:

// using the row from the previous example.
val firstValue = row.getInt(0)
// firstValue: Int = 1
val isNull = row.isNullAt(3)
// isNull: Boolean = true
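To make the "check isNullAt first" rule concrete, here is a small sketch. The helper intAt is hypothetical (it is not part of the Row API); it wraps the null check so that callers get an Option instead of calling getInt on a null field:

```scala
import org.apache.spark.sql.Row

object NullSafeAccess {
  // Hypothetical helper: None when the field is null, Some(value) otherwise.
  // It assumes the field at index i, when present, really holds an Int.
  def intAt(row: Row, i: Int): Option[Int] =
    if (row.isNullAt(i)) None else Some(row.getInt(i))

  def main(args: Array[String]): Unit = {
    val row = Row(1, true, "a string", null)
    println(intAt(row, 0)) // Some(1)
    println(intAt(row, 3)) // None
  }
}
```

Returning Option keeps the null handling in one place. Calling getInt directly on a null field is invalid, and what happens in that case should not be relied upon.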

In Scala, fields in a Row object can be extracted in a pattern match. Example:

import org.apache.spark.sql._

val pairs = sql("SELECT key, value FROM src").rdd.map {
  case Row(key: Int, value: String) =>
    key -> value
}
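The example above needs a SQL context and a table named src. As a sketch that runs on locally built rows instead (the function toPair and the sample rows are made up for illustration), the same pattern can be tried like this. Note that a type pattern such as value: String does not match null, so a catch-all case is added here to avoid a MatchError:

```scala
import org.apache.spark.sql.Row

object RowPatternMatch {
  // Illustrative function: extract (key, value) when the row has exactly
  // an Int followed by a non-null String, otherwise return None.
  def toPair(row: Row): Option[(Int, String)] = row match {
    case Row(key: Int, value: String) => Some(key -> value)
    case _                            => None
  }

  def main(args: Array[String]): Unit = {
    println(toPair(Row(1, "a")))   // Some((1,a))
    println(toPair(Row(2, null)))  // None, because null is not a String
    println(toPair(Row("x", "y"))) // None, because "x" is not an Int
  }
}
```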

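If you really want to understand the text itself, just use a translation plugin.

Everything above is taken from the API documentation for Row.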

One more thing to note: isNullAt() takes a zero-based index, so the first element is at index 0.

The size method returns the number of elements in the row.
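A short sketch combining the two points, zero-based indexing and size. The helper nullPositions is hypothetical and only shows how the two calls fit together:

```scala
import org.apache.spark.sql.Row

object RowSizeAndIndex {
  // Hypothetical helper: the zero-based indices of all null fields in a row.
  def nullPositions(row: Row): Seq[Int] =
    (0 until row.size).filter(i => row.isNullAt(i))

  def main(args: Array[String]): Unit = {
    val row = Row(1, true, "a string", null)
    println(row.size)                         // 4
    println(row.isNullAt(0))                  // false: index 0 is the first element
    println(nullPositions(row).mkString(",")) // 3
  }
}
```

Valid indices therefore run from 0 to size - 1.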

The Row API offers much more; if you need other methods, see the documentation link at the top of this article.