How do I enforce a unique constraint in Couchbase where the combination of url and siteName must be unique, given that url and siteName together can be longer than Couchbase's key length limit? An example document:
{
    url: "http://google.com",
    siteName: "google.com",
    data:
    {
        //more properties
    }
}
I currently have three solutions in mind, but I don't think any of them is good enough.
Solution 1: The document key is the SHA-1 hash of url + siteName.
Advantages: easy to implement.
Disadvantages: hash collisions can occur, so two different url + siteName pairs could end up with the same key.
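A minimal sketch of how the key in Solution 1 could be derived, in Python. The function name `make_key` is hypothetical, and the length prefix is my own addition (not part of the idea above) so that plain concatenation cannot make pairs like ("ab", "c") and ("a", "bc") produce the same input to the hash.

```python
import hashlib

def make_key(url: str, site_name: str) -> str:
    # Assumption: prefix the url with its length so the boundary between
    # url and siteName is unambiguous before hashing.
    payload = f"{len(url)}:{url}{site_name}".encode("utf-8")
    # SHA-1 yields a fixed 40-character hex string, regardless of input length.
    return hashlib.sha1(payload).hexdigest()

key = make_key("http://google.com", "google.com")
print(len(key))  # 40
```

The key has a fixed length no matter how long url and siteName are, which is what gets around the key length limit.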
Solution 2: The document key is hash(url + siteName) + index.
This is the same as Solution 1, but the key includes an index in case a collision occurs.
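Retrieving a document by url + siteName takes the following steps:
1. Set index to 0.
2. Get the document with key hash(url + siteName) + index. If no document exists at that key, the pair is not stored.
3. Are the document's url and siteName the same as the ones being looked up?
   - If yes, return the document.
   - If no, increment index and go back to step 2.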
This is my favorite solution so far
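The lookup and insert for Solution 2 could be sketched roughly like this in Python. The `store` dict is only a stand-in for the bucket, and `base_key`, `find`, `add`, the `hash_fn` parameter, and the "hash:index" key layout are all hypothetical names chosen for illustration.

```python
import hashlib

def base_key(url, site_name):
    # Hypothetical key derivation: SHA-1 over a length-prefixed concatenation.
    payload = f"{len(url)}:{url}{site_name}".encode("utf-8")
    return hashlib.sha1(payload).hexdigest()

def find(store, url, site_name, hash_fn=base_key):
    """Probe hash:0, hash:1, ... until the pair matches or a slot is empty."""
    base = hash_fn(url, site_name)
    index = 0
    while True:
        doc = store.get(f"{base}:{index}")
        if doc is None:
            return None  # end of the chain: the pair is not stored
        if doc["url"] == url and doc["siteName"] == site_name:
            return doc
        index += 1  # a different pair owns this slot, try the next one

def add(store, url, site_name, data, hash_fn=base_key):
    """Insert at the first free slot; reject the pair if it already exists."""
    base = hash_fn(url, site_name)
    index = 0
    while True:
        key = f"{base}:{index}"
        existing = store.get(key)
        if existing is None:
            # Against a real database this write would have to be an atomic
            # insert-if-absent, otherwise two writers could race here.
            store[key] = {"url": url, "siteName": site_name, "data": data}
            return key
        if existing["url"] == url and existing["siteName"] == site_name:
            raise ValueError("url + siteName already exists")
        index += 1

store = {}  # stand-in for the bucket
add(store, "http://google.com", "google.com", {})
print(find(store, "http://google.com", "google.com")["siteName"])  # google.com
```

One thing this sketch does not handle is deletion: removing a document in the middle of a chain would make later entries unreachable unless the chain is compacted.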
Solution 3: Allow duplicates, then delete them at a later time.
In this solution, enforcing the unique constraint is moved to the application server. The key is just a GUID or timestamp and is NOT referenced by other documents.
- To add a document, the application server:
  - Searches for existing documents that have the same url and siteName. If a document is found, the operation fails.
  - Otherwise, inserts the document.
- To update a document, the application server:
  - Searches for existing documents that have the same url and siteName. If no document is found, the operation fails.
  - Otherwise, updates only the latest (last inserted) document.
- To search for a document by url and siteName, the application server:
  - Returns only the latest (last inserted) document that has the url and siteName.
- A background job regularly scans for documents added in the last X minutes and deletes the older duplicates.
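The cleanup step of the background job in Solution 3 might look something like the following Python sketch. It works on an in-memory dict rather than a real bucket, and the `insertedAt` field and the function name `find_stale_duplicates` are assumptions made for illustration.

```python
from collections import defaultdict

def find_stale_duplicates(docs):
    """Return the keys of all but the newest document per (url, siteName).

    docs maps document key -> document; each document is assumed to carry
    an insertedAt value that orders documents by insertion time.
    """
    groups = defaultdict(list)
    for key, doc in docs.items():
        groups[(doc["url"], doc["siteName"])].append((doc["insertedAt"], key))

    stale = []
    for entries in groups.values():
        entries.sort()  # oldest first
        stale.extend(key for _, key in entries[:-1])  # keep only the newest
    return stale

docs = {
    "guid-1": {"url": "http://google.com", "siteName": "google.com", "insertedAt": 1},
    "guid-2": {"url": "http://google.com", "siteName": "google.com", "insertedAt": 2},
    "guid-3": {"url": "http://example.com", "siteName": "example.com", "insertedAt": 1},
}
print(find_stale_duplicates(docs))  # ['guid-1']
```

The job would then delete the returned keys. Until it runs, readers still have to pick the latest document themselves, as described in the search step.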
I am a NoSQL n00b! How can I enforce unique constraints in Couchbase? Thanks