python - Select random row from Cassandra -


I have the following table:

create table prosfiles (
  name_file text,
  beginpros timestamp,
  humandate timestamp,
  lastpros timestamp,
  originalname text,
  pros int,
  uploaded int,
  uploader text,
  primary key (name_file)
);
create index prosfiles_pros_idx on prosfiles (pros);

In this table I keep the location of several CSV files which are processed by a Python script. I have several scripts running at the same time processing the files, so I use this table to keep control and avoid two scripts starting to process the same file at the same time. (In the 'pros' column, 0 means the file has not been processed, 1 means it has been processed, and 1010 means it is being processed by a script.)

Each script runs the following query to pick a file to process:

"select name_file from prosfiles where pros = 0 limit 1"

But it always returns the first row of the files that match the condition.

I would like to run a query that returns a random row from the ones with pros = 0.

In MySQL I've used "order by rand()", but in Cassandra I don't know how to sort the results randomly.

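It looks like you're using Cassandra as a queue, and that's not the best usage pattern for it; use RabbitMQ/SQS/any other queue service instead. Cassandra does not support arbitrary sorting (ORDER BY only works on clustering columns within a partition), and this is by design, based on the idea that: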

  • Sorting would require a lot of computation inside the database if you are trying to sort 1B rows.
  • Sorting is not an easy task in a distributed environment: you have to ask all the nodes holding the data to perform it.

But if you know what you are doing, you can revisit your database schema to make it more suitable for this type of workload:

  • Split the source table into two different tables: the first one with the full file information, and the second one a queue containing the ids of the files to process.
  • Your worker process reads a random row from the queue table (see below for how to read a ~random row from Cassandra by primary key).
  • The worker deletes the target id from the queue and updates the targets table with the processing information.
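The steps above can be sketched roughly as follows. This is only an illustration under assumptions: the table names `files` and `files_queue` are made up for the example, `session` is expected to behave like a DataStax Python driver session (an object with `execute(cql, params)`), and `pick_id` is a hypothetical callable that returns an id from the queue or `None`.

```python
from datetime import datetime, timezone

# Hypothetical two-table layout (names are illustrative, not from the question).
CREATE_FILES = """
CREATE TABLE IF NOT EXISTS files (
    name_file text PRIMARY KEY,
    originalname text,
    uploader text,
    pros int,
    beginpros timestamp,
    lastpros timestamp
)"""
CREATE_QUEUE = """
CREATE TABLE IF NOT EXISTS files_queue (
    name_file text PRIMARY KEY
)"""


def take_from_queue(session, pick_id):
    """Run one worker step and return the id taken, or None if nothing was found.

    session: object with execute(cql, params), e.g. a cassandra-driver Session.
    pick_id: callable(session) -> id or None (assumed helper, supplied by caller).
    """
    name_file = pick_id(session)
    if name_file is None:
        return None  # queue looked empty

    # Remove the id from the queue so later reads no longer see it.
    session.execute(
        "DELETE FROM files_queue WHERE name_file = %s", (name_file,)
    )
    # Record in the main table that processing has started (1010 = in progress).
    session.execute(
        "UPDATE files SET pros = %s, beginpros = %s WHERE name_file = %s",
        (1010, datetime.now(timezone.utc), name_file),
    )
    return name_file
```

Note that the read and the delete here are two separate statements, so nothing in this sketch stops two workers from reading the same id before either deletes it.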

This way of doing things can lead to possible errors:

  • Multiple workers can get the same target at once.
  • If you have a lot of workers and targets, Cassandra's compaction process will kill the performance of this DIY queue.
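One possible way to reduce the first problem, which is not part of the approach described above, is to make the delete conditional with a lightweight transaction (`IF EXISTS`), so that only one worker sees the delete as applied. The sketch below assumes a hypothetical queue table `files_queue` keyed by `name_file` and a DataStax Python driver session, whose result object exposes `was_applied` for conditional statements. Lightweight transactions need extra round trips between replicas, so this costs noticeably more than a plain delete.

```python
def try_claim(session, name_file):
    """Return True only if this worker's conditional delete was applied.

    session: object with execute(cql, params), e.g. a cassandra-driver Session.
    files_queue is an assumed table name used for illustration.
    """
    result = session.execute(
        "DELETE FROM files_queue WHERE name_file = %s IF EXISTS",
        (name_file,),
    )
    # For conditional (IF ...) statements the driver reports whether the
    # write actually happened; a worker that lost the race gets False.
    return bool(result.was_applied)
```

A worker would call `try_claim` right after picking an id and simply pick another one when it returns `False`.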

To read a pseudo-random row from a table by its primary key you can use this query: select * from some_table where token(id_column) > some_random_long_value limit 1, but it has its cons:

  • If you have a small set of targets, it will sporadically return an empty result because some_random_long_value is higher than the token of any existing key.
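A minimal sketch of that token query in Python, with one way of handling the empty-result case: if nothing lies above the random token, fall back to the first row in token order (a wrap-around). The assumptions here are a hypothetical table `files_queue` keyed by `name_file`, the default Murmur3Partitioner (tokens are signed 64-bit integers), and a DataStax Python driver style session with `execute(cql, params)`.

```python
import random

# Token range of the Murmur3Partitioner (signed 64-bit integers).
MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1


def pick_random_id(session, rng=random):
    """Return a pseudo-random name_file from files_queue, or None if it is empty."""
    start = rng.randint(MIN_TOKEN, MAX_TOKEN)
    rows = session.execute(
        "SELECT name_file FROM files_queue "
        "WHERE token(name_file) > %s LIMIT 1",
        (start,),
    )
    row = next(iter(rows), None)
    if row is None:
        # Nothing above the random token: wrap around to the lowest token.
        rows = session.execute("SELECT name_file FROM files_queue LIMIT 1")
        row = next(iter(rows), None)
    return None if row is None else row[0]
```

The choice is not uniform: a row that follows a large gap in the token ring is picked more often than a row that sits right after another one, which matters most when the table holds only a few rows.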
