From the config above, the default bucket is DataBucket-1, yet the JSON doc was created in DataBucket-3. Interestingly, this behavior matches the 2 endpoint calls below.
The root of all your questions is really: why does Sync Gateway think my-db is defined in DataBucket-3 instead of DataBucket-1?
I suspect what’s happened is you have somehow got into the situation where your Couchbase Server buckets have multiple Sync Gateway databases with the same name and Sync Gateway is choosing the first one it finds.
Can you clarify what buckets you have, and what Sync Gateway databases you expect to have created on each?
If you don’t want Sync Gateway looking in other buckets, you can restrict the RBAC permissions used by Sync Gateway’s bootstrap user, or specify a set of bucket_credentials.
Thanks @bbrks
Could you please clarify the concept of “Sync Gateway databases”? My understanding is that Sync Gateway is a gateway service and Couchbase Server is the persistence layer, so are the Couchbase servers that hold the buckets the Sync Gateway databases?
I am upgrading Sync Gateway to 3.1, so the minimal bootstrap config defines multiple CB servers for all Sync Gateway instances.
Your bootstrap config will not define a bucket, instead only a Couchbase Cluster. This is defined in your startup configuration file where you run Sync Gateway.
Sync Gateway will automatically discover all of the accessible buckets and load any database configurations which are found in each bucket.
These database configuration documents are keyed like you’ve already identified (_sync:dbconfig:my-db:default), and in normal circumstances only ever refer to the same bucket the config document is found in.
It is however possible, if you’ve moved documents between buckets (e.g. with backup/restore or XDCR) to have database configuration documents that refer to a different bucket than the one the database configuration is stored in.
In this case, it’s expected that Sync Gateway will not load the database, because it detects there’s a mismatch. We fixed this in version 3.1.2.
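To make the lookup described above concrete, here is a minimal Python sketch that models buckets as plain dictionaries and reports where a given config key lives. It is only an illustration, not Sync Gateway's actual implementation: the `bucket` field inside the config document, the helper names, and the sample data (including DataBucket-2) are assumptions made for the example.

```python
# Key quoted earlier in this thread.
CONFIG_KEY = "_sync:dbconfig:my-db:default"


def find_config_copies(buckets, config_key=CONFIG_KEY):
    """Return (bucket holding the doc, bucket named inside the doc) per copy."""
    copies = []
    for bucket_name, docs in buckets.items():
        doc = docs.get(config_key)
        if doc is not None:
            # "bucket" is a hypothetical field name used for illustration.
            copies.append((bucket_name, doc.get("bucket")))
    return copies


def mismatched(copies):
    """Copies whose config points at a different bucket than the one storing it."""
    return [(stored_in, refers_to) for stored_in, refers_to in copies
            if stored_in != refers_to]


# Hypothetical cluster state: the config doc was copied into DataBucket-3,
# for example by XDCR or backup/restore.
buckets = {
    "DataBucket-1": {CONFIG_KEY: {"bucket": "DataBucket-1"}},
    "DataBucket-2": {},
    "DataBucket-3": {CONFIG_KEY: {"bucket": "DataBucket-1"}},
}

copies = find_config_copies(buckets)
print(copies)              # every bucket that holds the config key
print(mismatched(copies))  # copies that refer to some other bucket
```

If the first list has more than one entry, the same database name is defined in several buckets, which is the situation suspected in this thread.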
Is there a way to designate a bucket for a Sync Gateway instance, not only a database?
I assume you already saw my note 2 posts ago about providing bucket_credentials to limit which buckets are being accessed?
If you’re upgrading from 2.8, all of these configuration changes were made in 3.0, so it might be helpful for you to review the 3.0 documentation.
You can disable persistent config and run in legacy config mode to retain your 2.8 behaviour, but there’s going to be limited support for new features when running in this mode. I wouldn’t recommend it for a long period of time, just maybe as a stepping stone to get you upgraded first, and then think about changing your configuration.
@bbrks
Could you please clarify Sync Gateway databases? In my example there are multiple Sync Gateway instances accessing my-db, which is the Couchbase database.
On the other hand, as described in my 2nd post, there are a couple of CB servers for my-db, and each CB server has 4 data buckets.
I am not sure where to find other config/info that associates a specific data bucket with a Sync Gateway “database”.
A Sync Gateway database is the term used for the application definition by Sync Gateway. A database defines a backing data store in Couchbase (a bucket, and optionally a set of collections within that bucket).
The same Sync Gateway database can be running on multiple Sync Gateway instances (i.e. you can scale the database horizontally). Separately, the backing data store in Couchbase can also be distributed across multiple Couchbase nodes. These two layers of scaling are independent - see the diagram here for an example:
In 3.0 and later Sync Gateway stores database configurations in the default collection of the associated bucket. Sync Gateway is started with a bootstrap configuration that connects to the Couchbase cluster, and then looks for any databases defined in each of the buckets that the bootstrap user has access to.
As Ben suggests, the behaviour you’re seeing could be explained if you’ve got a Sync Gateway database config defined with the same name in multiple buckets that the bootstrap user has access to. You can check for _sync:dbconfig:my-db:default in DataBucket-1 and DataBucket-3 and see if it exists in multiple places.
The set of buckets that Sync Gateway attempts to access is based on the access of the user defined in the bootstrap configuration. So as Ben mentions, if your intention is to have a specific Sync Gateway cluster only communicating with a single bucket, one approach is to change the permissions of the bootstrap user to only grant them the “Sync Gateway” security role for the desired bucket.
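As a rough illustration of why narrowing the bootstrap user's access helps, the Python sketch below models which buckets would be scanned for the config key under a broad and a narrow role grant. The function names and the bucket list (DataBucket-2 and DataBucket-4 in particular) are hypothetical, and the model deliberately ignores everything else Sync Gateway does at startup.

```python
# Key quoted earlier in this thread.
CONFIG_KEY = "_sync:dbconfig:my-db:default"


def buckets_scanned(cluster_buckets, role_buckets):
    """Buckets the bootstrap user can reach: those it holds the role for."""
    return [b for b in cluster_buckets if b in role_buckets]


def buckets_with_config(docs_by_bucket, scanned, config_key=CONFIG_KEY):
    """Scanned buckets that contain a copy of the database config key."""
    return [b for b in scanned if config_key in docs_by_bucket.get(b, {})]


# Hypothetical cluster with four buckets, two of which hold the config key.
cluster = ["DataBucket-1", "DataBucket-2", "DataBucket-3", "DataBucket-4"]
docs = {"DataBucket-1": {CONFIG_KEY: {}}, "DataBucket-3": {CONFIG_KEY: {}}}

# Bootstrap user with the role on every bucket: two candidate definitions.
broad = buckets_with_config(docs, buckets_scanned(cluster, set(cluster)))
# Bootstrap user with the role on DataBucket-1 only: one definition.
narrow = buckets_with_config(docs, buckets_scanned(cluster, {"DataBucket-1"}))

print(broad)
print(narrow)
```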
Thanks @bbrks & @adamf for the explanation. Sorry, I am still confused: is my-db in my case the Sync Gateway database, or is it something else?
The diagram from the link shows “travel-sample” as the database for the 2 Sync Gateway nodes, and “travel-sample” is also the bucket name in each Couchbase server.
Is it a correct understanding that:
1. The “travel-sample” database on each Sync Gateway node is a separate concept from the “travel-sample” bucket in each Couchbase server, and the names just happen to match?
2. The “travel-sample” database is the so-called Sync Gateway database, it has its own physical datastore separate from Couchbase Server, and in the diagram the 2 Sync Gateway nodes are sharing/pushing data to the same “travel-sample” database?
3. The “travel-sample” database in the diagram == “my-db” in my setup?
Regarding the bucket_credentials setting, I don’t recall creating a password for the bucket. Can I use the same username/password from database_credentials?
1. Correct - these are two separate things with matching names in the example.
2. No, the Sync Gateway database doesn’t have its own physical data store. The Couchbase bucket is the backing storage - the Sync Gateway database is the logical collection of data, configuration, users, roles and more for the application, but the data storage is the backing Couchbase Server bucket.
3. Correct.
If you’re using the Community Edition of Couchbase Server you may not have the ability to define fine-grained bucket security. If this is the case, you’ll want to remove the _sync:dbconfig:my-db:default documents manually from the unwanted buckets.
The username/password credentials defined in Sync Gateway’s bootstrap config (bucket_credentials) specify the Couchbase Server user that Sync Gateway uses to connect to Couchbase Server. This user is defined and managed on the Couchbase Server side, as described here:
Additional documentation on managing the user on Couchbase Server is available here:
Through that UI you can define the set of buckets that the user has the Sync Gateway role for.
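As a rough sketch of how this might look in the startup file, the Python snippet below builds and prints a config in which bucket_credentials sits next to the bootstrap attribute rather than inside it. The per-bucket username/password layout is an assumption based on the documented schema and should be checked against the Bootstrap configuration reference for your version. The server address, user names and passwords are placeholders.

```python
import json

# Hypothetical startup (bootstrap) configuration. All values are placeholders.
startup_config = {
    "bootstrap": {
        "server": "couchbases://cb.example.com",  # placeholder cluster address
        "username": "sgw_bootstrap_user",         # placeholder Couchbase Server user
        "password": "change-me",
        "use_tls_server": True,
    },
    # Assumed layout: a top-level map keyed by bucket name, so only the
    # buckets listed here are given credentials.
    "bucket_credentials": {
        "DataBucket-1": {
            "username": "sgw_bucket1_user",       # placeholder user limited to this bucket
            "password": "change-me",
        },
    },
}

# Emit the JSON that would go into the startup configuration file.
print(json.dumps(startup_config, indent=2))
```

Whichever user is referenced here still has to exist on the Couchbase Server side with the Sync Gateway role for that bucket.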
I am working on an upgrade to 3.1.10 for 2 Sync Gateway nodes that will run inter-Sync Gateway replication with each other. In that case, each node should have a separate backing CB data server. I assume that as long as I create a CB user in each CB server that limits access to DataBucket-1 (which also exists in each server), the same bootstrap config can be used for both Sync Gateway nodes (further specific config, i.e. push/pull etc., will rely on the REST API).
Yes, if you’ve defined the same user in both CB clusters then you can use the same bootstrap config for each. All the details around setting up and managing Inter-Sync Gateway replication are covered in the documentation.
Thanks @adamf & @bbrks
One last thing needs your kind clarification. In the Bootstrap configuration documentation, bucket_credentials is part of a long JSON document listing all the configurable properties, and within that JSON document there is an attribute specifically named “bootstrap”:
bootstrap: {
  // bucket_credentials is a separate attribute, not in the bootstrap attribute
  ca_cert_path: "string",
  config_update_frequency: "10s",
  group_id: "default",
  password: "string",
  server: "string",
  server_tls_skip_verify: false,
  use_tls_server: true,
  username: "string",
  x509_cert_path: "string",
  x509_key_path: "string"
}
Could you please confirm whether the bucket_credentials setting is part of the bootstrap setup, or whether we have to use the REST API to set it up?
The main concern is to have Sync Gateway write to the correct bucket when it starts up (bootstrap). If bucket_credentials cannot be configured at the bootstrap stage and has to be set by a later REST API call, that may cause data discrepancies.
@bbrks Thanks, yes, I also see that in the documentation. My question is whether the entire JSON document can be used for bootstrap configuration, or only the “bootstrap” attribute and its associated values. Sorry, I just need clarification on this.