hadoop-on-mongo demo, migrated to spark-on-hadoop-mongo and then to mongo-spark-connector
Uses Spark, via mongo-hadoop, to analyze MongoDB data. Written in Java.
Raw MongoDB data
{
    "_id" : ObjectId("54d83f3548c9bc218e056ce6"),
    "apMac" : "aa:bb:cc:dd:ee:ff",
    "proto" : "http",
    "url" : "extshort.weixin.qq.com",
    "clientMac" : "ff:ee:dd:cc:bb:aa"
}
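This README does not show which analysis the job runs over these records. Purely as an illustration, assuming the goal is to count records per `url`, the per-record logic can be sketched in plain Java with no Spark dependency. The class name `UrlHitCount`, the method `countByUrl`, and the second sample URL are hypothetical and not taken from the demo. In a Spark job the same grouping would typically be written with RDD operations such as `mapToPair` and `reduceByKey`.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the demo: counts records per "url" value.
final class UrlHitCount {

    // Each record is modelled as a flat field-name -> value map,
    // mirroring the shape of the sample document above.
    static Map<String, Long> countByUrl(List<Map<String, String>> records) {
        return records.stream()
                .filter(r -> r.get("url") != null)   // skip records without a url field
                .collect(Collectors.groupingBy(
                        r -> r.get("url"),           // group key: the url
                        TreeMap::new,                // sorted keys for stable iteration order
                        Collectors.counting()));     // value: number of records per url
    }

    public static void main(String[] args) {
        // Assumed sample input; only the first url appears in the README.
        List<Map<String, String>> records = List.of(
                Map.of("proto", "http", "url", "extshort.weixin.qq.com"),
                Map.of("proto", "http", "url", "extshort.weixin.qq.com"),
                Map.of("proto", "http", "url", "example.com"));

        countByUrl(records).forEach((url, n) -> System.out.println(url + "\t" + n));
    }
}
```

Keeping the counting logic in a small static method like this makes it easy to unit-test without a running cluster or MongoDB instance.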
Output
mvn clean scala:compile compile package
How to run
spark-submit --class "sparkfisrttest.cdpspark.App" --packages org.mongodb.mongo-hadoop:…:1.3.1,org.mongodb:…:3.0.1,org.mongodb:…:3.0.1 ~/hadoop-spark-mongo-examples.jar
Early versions of the demo were based on the Hadoop MongoDB driver, mongo-hadoop-core.
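I recently found that there is an official connector for Spark:
https://docs.mongodb.com/spark-connector/current/
so I added an RDD example based on mongo-spark-connector (I am not yet familiar with the Dataset and SQL APIs).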