Scala Learning (Part 1)

```scala
import scala.collection.mutable
import scala.collection.immutable

object Test {
  def main(args: Array[String]): Unit = {
    
    // Create a string array of length 3 (unassigned slots default to null)
    var arr = new Array[String](3)
    arr(0) = "jack"
    arr(1) = "james"
    //var arr = Array("I","Love","you")  // alternative: initialize with values
    // Iterate over the elements
    for(str <- arr){
      println(str)  // arr(2) was never assigned, so this prints null for it
    }
    // Iterate by index
    for(i <- 0 to 2){
      println(i + ":" + arr(i))
    }
    
    // List usage -- see the docs for more methods
    var list1 = List(1,2)
    var list2 = List(3,4,5)
    println(list1 ::: list2)  // ::: concatenates two lists
    println(6 :: list2)       // :: prepends an element to a list
    println(7 :: Nil)         // prepending to Nil yields a one-element list
    
    // Tuples: fields are accessed as _1, _2, ... (1-based)
    var pair = (12, "hello", 1.3)
    println(pair._1)
    println(pair._2)
    println(pair._3)
  
    // Sets: the default Set is immutable; + returns a new set,
    // so the result must be kept (reassign the var) or it is lost
    var set1 = Set("jack","rose")
    set1 = set1 + "james"
    println(set1)

    var set2 = mutable.Set("jack","rose")
    set2 += "james"  // mutable set: += adds in place
    println(set2)

    var set3 = immutable.Set("jack","rose")
    set3 += "james"  // immutable set held in a var: += reassigns the var
    println(set3)

    // Mutable map: += inserts entries in place
    val map1 = mutable.Map[Int, String]()
    map1 += (1 -> "hello")
    map1 += (2 -> "world")
    map1.foreach(println)

    // Immutable map literal; iterate over its key/value pairs
    val map2 = Map(1 -> "one", 2 -> "two")
    for(m <- map2){
      println(m._2)  // print each value
    }
     
  }
}
```
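
The set examples above turn on a subtle point: `+` never modifies the set it is called on; it returns a new collection, whereas `+=` either mutates a mutable set in place or, for a `var` holding an immutable set, reassigns the variable. Here is a minimal, self-contained sketch of the difference (names are illustrative):

```scala
import scala.collection.mutable

object SetSemantics {
  def main(args: Array[String]): Unit = {
    val im = Set("jack", "rose")   // immutable set
    val im2 = im + "james"         // + builds and returns a new set
    println(im)                    // Set(jack, rose) -- the original is unchanged
    println(im2)                   // Set(jack, rose, james)

    val mu = mutable.Set("jack", "rose")
    mu += "james"                  // += adds to the same set in place
    println(mu)                    // Set(jack, james, rose) (element order may vary)
  }
}
```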

```
scala> // Import the required packages
scala> import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.SparkSession

scala> import org.elasticsearch.spark.sql._
import org.elasticsearch.spark.sql._

scala> // Create a SparkSession instance
scala> val spark = SparkSession.builder()
spark: org.apache.spark.sql.SparkSession.Builder = org.apache.spark.sql.SparkSession$Builder@71e5cd05

scala> .appName("ElasticsearchReadExample")
res0: org.apache.spark.sql.SparkSession.Builder = org.apache.spark.sql.SparkSession$Builder@71e5cd05

scala> .getOrCreate()
res1: org.apache.spark.sql.SparkSession = org.apache.spark.sql.SparkSession@61267fa2

scala> // Check the type of the spark variable -- it should be a SparkSession
scala> println(spark.getClass)
class org.apache.spark.sql.SparkSession$Builder

scala> val defaultQuery: String = "?q=phone_no:5143217"
defaultQuery: String = ?q=phone_no:5143217

scala> val esTable = "mediamatch_usermsg"
esTable: String = mediamatch_usermsg

scala> val options = Map(
     |   ("es.nodes", "master"),
     |   ("es.port", "9200"),
     |   ("es.read.metadata", "false"),
     |   ("es.mapping.date.rich", "false"),
     |   ("es.net.http.auth.user", "elastic"),
     |   ("es.net.http.auth.pass", "i55on9YR90t+r8z8-OSpi"),
     |   ("es.nodes.wan.only", "true")
     | )
options: scala.collection.immutable.Map[String,String] = Map(es.nodes.wan.only -> true, es.net.http.auth.user -> elastic, es.net.http.auth.pass -> i55on9YR90t+r8z8-OSpi, es.mapping.date.rich -> false, es.port -> 9200, es.read.metadata -> false, es.nodes -> master)

scala> val esDf = spark.esDF(esTable, defaultQuery, options)
<console>:30: error: value esDF is not a member of org.apache.spark.sql.SparkSession.Builder
       val esDf = spark.esDF(esTable, defaultQuery, options)
                        ^

scala> esDf.select("phone_no", "owner_name", "owner_code", "run_name", "run_time").show()
<console>:27: error: not found: value esDf
       esDf.select("phone_no", "owner_name", "owner_code", "run_name", "run_time").show()
       ^

scala> val df = spark.read.format("org.elasticsearch.spark.sql")
<console>:27: error: value read is not a member of org.apache.spark.sql.SparkSession.Builder
       val df = spark.read.format("org.elasticsearch.spark.sql")
                      ^

scala> .options(options)
<console>:28: error: value options is not a member of scala.collection.immutable.Map[String,String]
       options .options(options)
               ^

scala> .load(s"$esTable/$defaultQuery")
<console>:30: error: value load is not a member of scala.collection.immutable.Map[String,String]
       options .load(s"$esTable/$defaultQuery")
               ^

scala> df.select("phone_no", "owner_name", "owner_code", "run_name", "run_time").show()
<console>:27: error: not found: value df
       df.select("phone_no", "owner_name", "owner_code", "run_name", "run_time").show()
       ^

scala> import org.elasticsearch.spark.sql._ // make sure this line takes effect
import org.elasticsearch.spark.sql._

scala> implicit none => _root_.org.elasticsearch.spark.sql.EsSparkSQL.registerFunctions(spark.sqlContext)
<console>:1: error: expected start of definition
       implicit none => _root_.org.elasticsearch.spark.sql.EsSparkSQL.registerFunctions(spark.sqlContext)
       ^
```
From the transcript you posted, it looks like you ran into a few problems while integrating Apache Spark with Elasticsearch. Below is an analysis of your code and the likely causes.

---

### Problem analysis

#### **1. `value esDF is not a member of org.apache.spark.sql.SparkSession.Builder`**

The error message says that `esDF` is not a member of `SparkSession.Builder`. This usually happens because:

- You called methods such as `.esDF()` on a `SparkSession.Builder` that was never finished being built, rather than on the actual `SparkSession` produced by `.getOrCreate()`. In the REPL, each line is evaluated separately, so `val spark = SparkSession.builder()` bound `spark` to the Builder, while the follow-up `.appName(...)` and `.getOrCreate()` lines only produced the throwaway results `res0` and `res1`.

Fix: make sure you create the actual `SparkSession` first and operate on that. For example:

```scala
val spark = SparkSession.builder()
  .appName("ElasticsearchReadExample")
  .getOrCreate() // only here do we get a usable SparkSession instance

// Only this real SparkSession object can be used for the steps that follow...
```

#### **2. `not found: value esDf` and other similar "not found" errors**

These are a chain reaction caused by the earlier failure: `esDf` was never successfully declared or initialized. Once the problem in step 1 is fixed, these errors resolve themselves.

#### **3. The failed implicit conversion (`implicit none => ...`)**

The syntax here is muddled, and directly invoking the internal API `_root_.org.elasticsearch.spark.sql.EsSparkSQL.registerFunctions(...)` is not recommended. Normally, importing the right package is all that is needed to enable the required functionality:

```scala
import org.elasticsearch.spark.sql._
```

This brings all the necessary implicits into scope. You can register functions explicitly if you really need to, but there is no reason for such a convoluted expression.

---

### A corrected end-to-end example

Here is a cleaned-up, complete version for reference and comparison:

```scala
import org.apache.spark.sql.SparkSession
import org.elasticsearch.spark.sql._

object ElasticsearchIntegration {
  def main(args: Array[String]): Unit = {
    // Step 1 - Create a Spark session with the configuration needed for ES integration.
    val spark = SparkSession.builder()
      .appName("ES Integration Example")
      .config("spark.master", "local") // For testing purposes; change accordingly in production envs.
      .getOrCreate()

    import spark.implicits._ // Brings DataFrame DSL operations into scope.

    try {
      // Define the connection parameters as key-value pairs in an immutable map.
      val optionsMap = Map(
        "es.nodes" -> "localhost",
        "es.port" -> "9200",
        "es.index.auto.create" -> "true"
      )

      // Specify the target index and an optional query string.
      val esIndexName = "mediamatch_usermsg"
      val queryString = "?q=phone_no:5143217"

      // Load data from Elasticsearch using the settings above; the query is
      // passed via the es.query option rather than appended to the path.
      val esDataframe = spark.read.format("org.elasticsearch.spark.sql")
        .options(optionsMap)
        .option("es.query", queryString)
        .load(esIndexName)

      esDataframe.select($"phone_no", $"owner_name", $"owner_code", $"run_name", $"run_time").show(false)
    } finally {
      spark.stop() // Release resources once the job is done.
    }
  }
}
```

---

### Key takeaways

- Clearly distinguish between `SparkSession.Builder` and the concrete `SparkSession` it produces;
- Make sure all dependencies are correctly defined and configured before running the business logic;
- Follow the official documentation when setting connection options and related properties, and keep your library versions current and compatible to avoid unnecessary trouble.

---
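As a footnote to point 1: the transcript shows exactly why `spark` ended up as a Builder. In the Scala 2 REPL, a line that starts with `.` is applied to the previous *result* (`res0`, `res1`, ...) rather than extending the previous statement, so `.getOrCreate()` produced `res1` while `spark` kept the Builder. A sketch of how to avoid this in the REPL (assuming the elasticsearch-spark connector matching your Spark version is on the classpath; index and query are the ones from the question):

```scala
// Keep the whole builder chain in one expression; in the REPL, enter it via
// :paste so the multi-line chain compiles as a single block (Ctrl+D to finish).
val spark = org.apache.spark.sql.SparkSession.builder()
  .appName("ElasticsearchReadExample")
  .getOrCreate()

// With a real SparkSession, the implicit esDF syntax from the question works:
import org.elasticsearch.spark.sql._
val esDf = spark.esDF("mediamatch_usermsg", "?q=phone_no:5143217")
esDf.select("phone_no", "owner_name", "owner_code", "run_name", "run_time").show()
```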