python - pandas - drop rows under Datetime criteria


I'm working on a dataframe df:

    datetime,user
    2013-12-04 08:00:01,111
    2013-12-04 09:00:02,111
    2013-12-04 10:00:03,111
    2013-12-04 09:00:04,112
    2013-12-04 10:00:05,112
    2013-12-04 11:00:06,112
    2013-12-04 11:00:07,113
    2013-12-04 11:00:08,113
    2013-12-04 11:00:09,113
    2013-12-04 13:00:10,114
    2013-12-04 13:00:11,113
    2013-12-04 12:01:11,115
    2013-12-04 12:01:11,115
    2013-12-04 12:01:11,115
    2013-12-04 12:01:11,115
    2013-12-04 12:01:11,115
    2013-12-04 12:01:11,115
    2013-12-04 12:01:11,115

with user and datetime information. I want to drop users under a datetime criterion, for instance when a user is present, let's say, 3 or more times in the same minute of the same hour of the same day. Under this condition, users 113 and 115 should be dropped from the dataframe. So far I have tried to group by the user column and use the information in the datetime object, but with no results.

There may be a nicer way to do this, but that's how I would do it:

    import pandas as pd

    # First set up the dataframe
    datetime = ['2013-12-04 08:00:01',
                '2013-12-04 09:00:02',
                '2013-12-04 10:00:03',
                '2013-12-04 09:00:04',
                '2013-12-04 10:00:05',
                '2013-12-04 11:00:06',
                '2013-12-04 11:00:07',
                '2013-12-04 11:00:08',
                '2013-12-04 11:00:09',
                '2013-12-04 13:00:10',
                '2013-12-04 13:00:11',
                '2013-12-04 12:01:11',
                '2013-12-04 12:01:11',
                '2013-12-04 12:01:11',
                '2013-12-04 12:01:11',
                '2013-12-04 12:01:11',
                '2013-12-04 12:01:11',
                '2013-12-04 12:01:11']

    user = [111, 111, 111, 112, 112, 112, 112, 113, 113, 113, 114, 113, 115, 115, 115,
            115, 115, 115]

    datetime = [pd.to_datetime(t) for t in datetime]

    df = pd.DataFrame(data={'user': user}, index=datetime)
    df['count_user'] = 1
    df['hour'] = df.index.hour
    df['min'] = df.index.minute
    df['time'] = df.index
    # Count rows per (hour, minute, user, timestamp) group; because 'time' is
    # part of the key, only rows with an identical timestamp share a group
    df = df.groupby(['hour', 'min', 'user', 'time']).sum()
    # Keep only the groups that contain fewer than 3 rows
    df = df[df.count_user < 3]
    df.reset_index(inplace=True)
    df = df.set_index('time')
    df.drop(['count_user', 'hour', 'min'], axis=1, inplace=True)
    print(df)

This prints:

                         user
    time
    2013-12-04 08:00:01   111
    2013-12-04 09:00:02   111
    2013-12-04 09:00:04   112
    2013-12-04 10:00:03   111
    2013-12-04 10:00:05   112
    2013-12-04 11:00:06   112
    2013-12-04 11:00:07   112
    2013-12-04 11:00:08   113
    2013-12-04 11:00:09   113
    2013-12-04 12:01:11   113
    2013-12-04 13:00:10   113
    2013-12-04 13:00:11   114
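The answer above includes the full timestamp in its group key and filters rows rather than users, so it may not match the question's criterion exactly. As a minimal sketch of one possible alternative, the snippet below counts rows per user per minute and then removes every row of any user who reaches the threshold. The function name `drop_frequent_users` and its `threshold` parameter are illustrative names, not part of the original post; the data is the sample from the question.

```python
from io import StringIO

import pandas as pd

# Sample data copied from the question.
raw = """datetime,user
2013-12-04 08:00:01,111
2013-12-04 09:00:02,111
2013-12-04 10:00:03,111
2013-12-04 09:00:04,112
2013-12-04 10:00:05,112
2013-12-04 11:00:06,112
2013-12-04 11:00:07,113
2013-12-04 11:00:08,113
2013-12-04 11:00:09,113
2013-12-04 13:00:10,114
2013-12-04 13:00:11,113
2013-12-04 12:01:11,115
2013-12-04 12:01:11,115
2013-12-04 12:01:11,115
2013-12-04 12:01:11,115
2013-12-04 12:01:11,115
2013-12-04 12:01:11,115
2013-12-04 12:01:11,115
"""

df = pd.read_csv(StringIO(raw), parse_dates=["datetime"])


def drop_frequent_users(df, threshold=3):
    """Hypothetical helper: drop all rows of any user seen `threshold`
    or more times within a single minute."""
    # Truncate each timestamp to its minute; the date and hour are kept,
    # so "same minute" also means same hour and same day.
    minute = df["datetime"].dt.floor("min")
    # For every row, count the rows that share its (user, minute) pair.
    per_minute = df.groupby(["user", minute])["datetime"].transform("size")
    # Users that reach the threshold in at least one minute.
    offenders = df.loc[per_minute >= threshold, "user"].unique()
    # Remove every row belonging to those users.
    return df[~df["user"].isin(offenders)]


result = drop_frequent_users(df)
print(result)
```

With the question's data, user 113 appears three times at 11:00 and user 115 seven times at 12:01, so both are removed entirely and only the rows for users 111, 112 and 114 remain.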
