python - `numpy.tile()` sorts automatically - is there an alternative? -

February 15, 2015

I'd like to initialize a pandas DataFrame so that I can populate it with multiple time series.

    import pandas as pd
    import numpy as np
    from string import ascii_uppercase

    # Daily dates from 2012-12-31 through 2014-12-28
    dt_rng = pd.date_range(start=pd.to_datetime('2012-12-31'),
                           end=pd.to_datetime('2014-12-28'),
                           freq='D')

    # One row per (product, date) pair: 10 products x len(dt_rng) dates
    df = pd.DataFrame(index=range(len(dt_rng) * 10),
                      columns=['product', 'dt', 'unit_sales'])
    # Bracket assignment, because `product` is also the name of a DataFrame method
    df['product'] = sorted(np.tile(list(ascii_uppercase[:10]), len(dt_rng)))
    df['dt'] = np.tile(dt_rng, 10)
    df['unit_sales'] = np.random.randint(0, 26, len(dt_rng) * 10)  # integers in 0..25

However, when I check the first few values of df.dt, I see that the values in the field have been sorted, e.g. df.dt[:10] yields 2012-12-31 ten times. I'd like to have this output 2012-12-31, 2013-01-01, ..., 2013-01-08, 2013-01-09 (the first ten values).

In general, I'm looking for behavior similar to R's "recycling".
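To make the two possible orderings concrete, here is a minimal sketch on a plain NumPy array (the array `values` is a made-up example, not part of the original question). Tiling lays copies of the whole array end to end, which is the cycling order that recycling produces, while `np.repeat` duplicates each element in place, which is the order that looks "sorted".

```python
import numpy as np

# Hypothetical toy input, used only to show the two orderings.
values = np.array([1, 2, 3])

# np.tile lays whole copies of the array end to end,
# comparable to R's rep(x, times = 2).
tiled = np.tile(values, 2)

# np.repeat duplicates each element in place,
# comparable to R's rep(x, each = 2).
repeated = np.repeat(values, 2)

print(tiled)     # [1 2 3 1 2 3]
print(repeated)  # [1 1 2 2 3 3]
```

If a column comes out in the second order when the first was expected, an element-wise repeat happened somewhere along the way.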

A combination of reduce() and the append() method of the pandas.tseries.index.DatetimeIndex object did the trick.

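    from functools import reduce  # a builtin in Python 2; imported explicitly for Python 3
    from string import ascii_uppercase

    import numpy as np
    import pandas as pd

    # Daily dates from 2012-12-31 through 2014-12-28
    dt_rng = pd.date_range(start=pd.to_datetime('2012-12-31'),
                           end=pd.to_datetime('2014-12-28'),
                           freq='D')

    df = pd.DataFrame(index=range(len(dt_rng) * 10),
                      columns=['product', 'dt', 'unit_sales'])
    # Bracket assignment, because `product` is also the name of a DataFrame method
    df['product'] = sorted(np.tile(list(ascii_uppercase[:10]), len(dt_rng)))
    # Append the date range to itself until there are 10 copies, end to end
    df['dt'] = reduce(lambda x, y: x.append(y), [dt_rng] * 10)
    df['unit_sales'] = np.random.randint(0, 26, len(dt_rng) * 10)  # integers in 0..25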
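As a possible alternative to chaining append() calls, the same product/date layout can be built from a Cartesian product. This is only a sketch, not the approach used above: it assumes a reasonably recent pandas (for `MultiIndex.to_frame`) and NumPy (for `default_rng`), and it uses a shortened, made-up date range and product list so the result is easy to inspect.

```python
from string import ascii_uppercase

import numpy as np
import pandas as pd

# Hypothetical small inputs: 3 days and 2 products instead of the full ranges.
dates = pd.date_range(start='2012-12-31', periods=3, freq='D')
products = list(ascii_uppercase[:2])  # ['A', 'B']

# Cartesian product of products and dates. The first level varies slowest,
# so the dates cycle through once per product.
idx = pd.MultiIndex.from_product([products, dates], names=['product', 'dt'])
df = idx.to_frame(index=False)

# Random integers in 0..25 (the upper bound of integers() is exclusive).
rng = np.random.default_rng(0)
df['unit_sales'] = rng.integers(0, 26, size=len(df))

print(df)
```

Here the pairing of each product with each date comes from one call, so there is no need to keep a separately sorted product column in step with a tiled date column.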
