Couchbase Spark connector using the Java API: N1QL LIMIT option

Hello,

I used the Java API documentation to build my application, but when I run the query, the LIMIT is not taken into account. By default it is 1000, but I get the whole database. If I use the

.option(QueryOptions.InferLimit(), "1");

the result is the same: I still get the whole database.

When I run SELECT * FROM system:completed_requests in Couchbase, I can see my request, but without the LIMIT option…

Do you have an idea why the limit does not work?

Couchbase version 6.6.1
spark-core_2.12 version 3.3.0
spark-connector_2.12 version 3.3.1

I hope my question is understandable enough.

Hi @MisterQ . That setting is only used when inferring the schema. If you want to apply a LIMIT to a regular query statement, you just add it to the statement itself. E.g. “SELECT default.* FROM default WHERE <…> LIMIT 1000”
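Written out in Java, building that statement with the LIMIT inline could look like this (a minimal sketch; the bucket name and the `type = 'airline'` filter are just the examples used elsewhere in this thread, not required values):

```java
// Build the N1QL statement with the LIMIT inline, as suggested above.
// Bucket name and WHERE clause are placeholders taken from this thread's examples.
String bucket = "default";
int limit = 1000;
String statement = "SELECT `" + bucket + "`.* FROM `" + bucket + "`"
        + " WHERE type = 'airline' LIMIT " + limit;
System.out.println(statement);
```

The key point is that the LIMIT lives in the statement text itself, so the server enforces it, rather than in a connector option.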

Thank you for the answer.
How can I write a fully custom request?

I used the example from the Java API documentation:

       DataFrameReader sources = spark.read()
                .format("couchbase.query")
                .option(QueryOptions.Filter(), "type = 'airline'") // String.format() was a no-op here
                .option(QueryOptions.Bucket(), "mybucket")
                .option(QueryOptions.InferLimit(), "1");

Ah I see, you’re working with DataFrames. Can you try .limit(1000) on the end of the DataFrame chain?
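For reference, a sketch of what that chain could look like, assuming a configured SparkSession named `spark` pointing at the cluster and the same option names as the snippet above (the QueryOptions import path should be checked against your connector version):

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import com.couchbase.spark.query.QueryOptions; // assumed import path; verify for your version

// Sketch only: needs a live cluster and a configured SparkSession `spark`.
Dataset<Row> airlines = spark.read()
        .format("couchbase.query")
        .option(QueryOptions.Bucket(), "mybucket")
        .option(QueryOptions.Filter(), "type = 'airline'")
        .load()        // load() returns a Dataset<Row>...
        .limit(1000);  // ...so limit() becomes available here
```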

The .limit(1000) is only available after the .load() (not after the read()), but I don't want to fetch the whole database at once and only then limit it to 1000, because the read() returns too many results and takes too long.
I want the limit applied in the .read() itself, for performance.

But if you know how to write a fully custom request with the Java API, I am listening.

I see, so it’s not pushed down. In that case your other option would be to use RDDs directly:

spark
  .sparkContext
  .couchbaseQuery[JsonObject]("select `travel-sample`.* from `travel-sample` LIMIT 1000")
  .collect()
  .foreach(println)

In the Java API, there is no function “couchbaseQuery” or “query” after the sparkContext().

Do you know the name of the function I should use?

Ah - this is because in Scala it’s an implicit method, pulled in with import com.couchbase.spark._. Under the hood it should just be a static method, though, so you should be able to call it from Java if you explore that namespace.
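To illustrate the general pattern only (the class and method names below are assumptions, not verified against connector 3.3.1 — browse the com.couchbase.spark namespace in your IDE to find the real ones): a Scala implicit like this is an ordinary method on a wrapper class, so from Java you construct the wrapper explicitly instead of relying on the implicit conversion.

```java
// HYPOTHETICAL names, for illustration of the Scala/Java interop pattern only.
// In Scala, `sc.couchbaseQuery(...)` desugars to roughly:
//   new SparkContextFunctions(sc).couchbaseQuery(statement)
// so from Java the equivalent is to instantiate that wrapper class yourself
// and call the method on it directly.
```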

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.