Simple Search Question

Hello

I am in the process of getting to grips with Couchbase and Python, but I have hit a question that I would like some help with:

I want to check for the existence of a document element/value in Couchbase before I decide to add a new document.

So, for example, in my case I am adding RSS feed items that I have parsed with Python's feedparser to Couchbase. However, the next time I parse the feed, I want to check whether I already have a given item in Couchbase.

I am using the news item's URL as an indicator of this.

However, I haven’t discovered a foolproof way of doing this.

This is what I have done so far:

Set up a view called check_if_new which returns a list of the URLs:

    function (doc, meta) {
      // index every feed item by its link; no value is needed for an existence check
      if (doc.type == "feed_item") {
        emit(doc.item_link, null);
      }
    }

I then run a test in Python which queries the view with the URL appended as a query:

    rows = self.cb.get_view("items", "check_if_new", q)  # search for the item by item_url
    is_match = 0
    for row in rows:  # examine the returned data
        if row.key == item_url:  # if we have a row back then the item exists
            is_match = 1  # so set match to 1
            print(item_url + " item exists")

However, this seems like a very clunky way of doing it, and I feel I am missing something.

Thank you for reading

Cheers

If you want to make sure you don’t overwrite an existing document, you may use the add() method, which will only succeed if the item does not yet exist. If you want to check against a specific field (i.e. only add depending on whether some existing document has a particular value in a field), then your best bet would be to use views.
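
To make the add-if-absent behaviour concrete, here is a minimal sketch. It uses a small in-memory stand-in rather than a real Couchbase connection, so InMemoryBucket, KeyExistsError and store_if_new are all hypothetical names for illustration; the real client reports an existing key through its own error type, which depends on the SDK version you are on.

```python
class KeyExistsError(Exception):
    """Hypothetical error raised when a key is already present."""


class InMemoryBucket(object):
    """Hypothetical stand-in for a key-value bucket; not the Couchbase client."""

    def __init__(self):
        self._docs = {}

    def add(self, key, value):
        # Succeeds only if the key does not exist yet.
        if key in self._docs:
            raise KeyExistsError(key)
        self._docs[key] = value

    def get(self, key):
        return self._docs[key]


def store_if_new(bucket, key, doc):
    """Return True if the document was stored, False if the key already existed."""
    try:
        bucket.add(key, doc)
        return True
    except KeyExistsError:
        return False
```

With something like this, the "does it exist?" check and the write are a single call, so there is no separate lookup to run before each insert.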

You can also get around this by using multiple lookups; for example, for each new item, you would:

(1) Create the actual item, e.g.

    cb.add(item.id, item)

(2) Create mappings for each URL to the relevant item; thus:

    for url in item.urls:
        cb.set(url, item.id)

You can get more creative if you think a URL might appear in more than a single feed.
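
The two steps above can be sketched end to end as follows. This is only an illustration under assumptions: a plain dict stands in for the bucket, and save_item and is_known_url are hypothetical helper names, not part of any Couchbase API. As in the steps above, item ids and URLs share one keyspace.

```python
def save_item(store, item_id, item):
    """Store the item once, then map each of its URLs to the item id.

    `store` is a plain dict standing in for the bucket (an assumption
    made for this sketch). Returns False if the item id already exists.
    """
    if item_id in store:
        # Mirrors add(): refuse to overwrite an existing item.
        return False
    store[item_id] = item
    for url in item["urls"]:
        # Mirrors set(url, item.id): the URL becomes a lookup key.
        store[url] = item_id
    return True


def is_known_url(store, url):
    """A single key lookup answers 'have I seen this URL before?'."""
    return url in store
```

When you next parse the feed, you would call is_known_url with each item's URL and only save the items for which it returns False.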

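Both the views and the multi-index approach have their benefits and drawbacks: views are indexed asynchronously, so a just-added item may not show up in a query straight away, while the lookup keys cost extra writes that you have to keep in sync yourself.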