Hi,
I am looking to query Couchbase from Zeppelin using Python. The Scala query below works, but I want to use Python libraries to analyse my data. Which Python Spark libraries should I import so that I can query my data using N1QL?
http://developer.couchbase.com/documentation/server/4.5/connectors/spark-1.2/working-with-rdds.html
import com.couchbase.spark._
val query = "SELECT name FROM `travel-sample` WHERE type = 'airline' ORDER BY name ASC LIMIT 10"
sc
  .couchbaseQuery(Query.simple(query))
  .collect()
  .foreach(println)
Thanks,
Mark
daschl
October 18, 2016, 10:57am
2
@mark.mikolajczak as far as I know, the easiest way is probably to use DataFrames / Datasets right away, since they do the abstraction for you. I think you can plug the couchbase-spark dependency into Spark when using it from Python, but I haven't really tried it yet.
Here is a gist I came up with some time ago; does that help? Couchbase Spark Samples · GitHub
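For reference, the DataFrame route from a Zeppelin %pyspark paragraph might look roughly like the sketch below. It is untested against a real cluster and rests on several assumptions: that the couchbase-spark connector jar is on the interpreter's classpath, that the bucket is configured in the Spark interpreter settings, that the connector's data source is registered as com.couchbase.spark.sql.DefaultSource, and that it accepts a schemaFilter option. The helper name read_airlines is made up for illustration.

```python
def read_airlines(sql_context):
    """Load airline documents as a DataFrame through the Couchbase connector.

    sql_context is the SQLContext that Zeppelin exposes as `sqlContext`.
    The format name and the schemaFilter option are assumptions about the
    connector, not something confirmed in this thread.
    """
    return (
        sql_context.read
        .format("com.couchbase.spark.sql.DefaultSource")
        # Predicate the connector is assumed to use for schema inference.
        .option("schemaFilter", 'type = "airline"')
        .load()
    )
```

In Zeppelin this could then be used as read_airlines(sqlContext).select("name").orderBy("name").limit(10).show(), which mirrors the Scala query from the first post.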
Thanks @daschl .
I had been using that method, but the issue I have is that the automatic schema inference does not work correctly on my data.
If I am to use this method, how can I manually map the schema? Say the fields are:
account[0].contactProfile.address.addressLine1 and account[0].contactProfile.address.townCity
schema
root
 |-- META_ID: string (nullable = true)
 |-- account: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- consent: array (nullable = true)
 |    |    |    |-- element: struct (containsNull = true)
 |    |    |    |    |-- isAllowed: boolean (nullable = true)
 |    |    |    |    |-- type: string (nullable = true)
 |    |    |-- contactProfile: struct (nullable = true)
 |    |    |    |-- address: array (nullable = true)
 |    |    |    |    |-- element: struct (containsNull = true)
 |    |    |    |    |    |-- addressLine1: string (nullable = true)
 |    |    |    |    |    |-- addressLine2: string (nullable = true)
 |    |    |    |    |    |-- area: string (nullable = true)
 |    |    |    |    |    |-- description: string (nullable = true)
 |    |    |    |    |    |-- houseNameNumber: string (nullable = true)
 |    |    |    |    |    |-- isPrimary: boolean (nullable = true)
 |    |    |    |    |    |-- lookupPostcode: string (nullable = true)
 |    |    |    |    |    |-- postalOutcode: string (nullable = true)
 |    |    |    |    |    |-- postcode: string (nullable = true)
 |    |    |    |    |    |-- townCity: string (nullable = true)