Integrating Couchbase with Spark in Zeppelin
Integrating the Couchbase Spark Connector
Edit the zeppelin-env.sh file and add support for the Couchbase Spark Connector:
export SPARK_SUBMIT_OPTIONS="--packages com.databricks:spark-csv_2.10:1.2.0,com.couchbase.client:spark-connector_2.11:2.2.0"
By default the Spark interpreter starts with spark and sc already created, but the Couchbase connection settings must be supplied when the SparkSession is initialized. The session created by the system therefore has to be closed first and then re-created with the Couchbase configuration.
Restart the Spark application from the Zeppelin Interpreter page.
Create a new note and run spark.close() in its first paragraph.
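The first paragraph of the note would look something like this (a minimal sketch; %spark is Zeppelin's default Spark interpreter binding):

```scala
%spark
// Stop the SparkSession that Zeppelin created at startup, so the
// Couchbase configuration in the next paragraph is applied to a
// freshly built session instead of being ignored
spark.close()
```

After this paragraph runs, the next SparkSession.builder().getOrCreate() call constructs a new session rather than returning the old one.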
Create a Couchbase connection and test it.
The test uses Couchbase's bundled sample data (the travel-sample bucket).
import org.apache.spark.sql._
import org.apache.spark.sql.types._
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.sources._
import com.couchbase.spark._
import com.couchbase.spark.sql._
import com.couchbase.client.java.document.JsonDocument
import com.couchbase.client.java.document.json.{JsonArray, JsonObject}
import com.couchbase.client.java.query.N1qlQuery
import com.couchbase.client.java.view.ViewQuery
val spark = SparkSession.
  builder().
  appName("KeyValueExample").
  master("spark://10.110.125.168:7077").
  config("spark.couchbase.nodes", "10.110.124.28").
  config("spark.couchbase.username", "Administrator").
  config("spark.couchbase.password", "xxxxxx").
  config("spark.couchbase.bucket.travel-sample", "").
  getOrCreate()
/* Alternative access paths, left commented out:
val sc = spark.sparkContext
sc.couchbaseGet[JsonDocument](Seq("airline_10123", "airline_10748")).collect().foreach(println)

val schema = StructType(
  StructField("META_ID", StringType) ::
  StructField("type", StringType) ::
  StructField("name", StringType) :: Nil
)
val dataFrame = spark.read.couchbase() */
val airline = spark.read.couchbase(schemaFilter = EqualTo("type", "airline"))
//airline.printSchema()
airline.select("name", "callsign").sort(airline("callsign").desc).show(10)
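N1qlQuery is imported above but never exercised; the same data can also be fetched with a raw N1QL statement through the SparkContext. A sketch, assuming the travel-sample bucket is open as configured above and has an index that covers the query:

```scala
// Run a N1QL query through the connector; each result row is a
// CouchbaseQueryRow whose .value is the underlying JsonObject
val query = N1qlQuery.simple(
  "SELECT name, callsign FROM `travel-sample` WHERE type = 'airline' LIMIT 10")

spark.sparkContext
  .couchbaseQuery(query)   // returns an RDD of query rows
  .collect()
  .foreach(row => println(row.value))
```

This path bypasses the DataFrame schema inference and is useful when the query shape does not map cleanly onto a flat schema.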
The run result shows the top 10 rows of name and callsign, sorted by callsign in descending order.