python - `numpy.tile()` sorts automatically - is there an alternative?
I'd like to initialize a pandas DataFrame so that I can populate it with multiple time series.
    # Python 2 and the pandas API of the time (xrange, pd.tseries.tools)
    import pandas as pd
    import numpy as np
    from string import ascii_uppercase

    dt_rng = pd.date_range(start = pd.tseries.tools.to_datetime('2012-12-31'),
                           end = pd.tseries.tools.to_datetime('2014-12-28'),
                           freq = 'D')
    df = pd.DataFrame(index = xrange(len(dt_rng) * 10),
                      columns = ['product', 'dt', 'unit_sales'])
    # Bracket assignment avoids clashing with the DataFrame.product method
    df['product'] = sorted(np.tile([chr for chr in ascii_uppercase[:10]], len(dt_rng)))
    df['dt'] = np.tile(dt_rng, 10)
    df['unit_sales'] = np.random.random_integers(0, 25, len(dt_rng) * 10)
However, when I check the first few values of `df.dt`, I see that the values in the field have been sorted, e.g. `df.dt[:10]` yields `2012-12-31` ten times. I'd like to have the output `2012-12-31`, `2013-01-01`, ..., `2013-01-08`, `2013-01-09` (the first ten values).
In general, I'm looking for behavior similar to R's "recycling".
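To make the desired behavior concrete, here is a minimal sketch on a plain NumPy array. The `values` array is illustrative and not taken from the question. The ordering asked for above is the one `np.tile()` gives on an ordinary array, while a run of identical leading values is the pattern `np.repeat()` would give.

```python
import numpy as np

values = np.array([1, 2, 3])  # illustrative stand-in for the date range

# Whole sequence repeated end to end, similar to R's recycling
tiled = np.tile(values, 2)

# Each element repeated in place before moving to the next
repeated = np.repeat(values, 2)

print(tiled)     # [1 2 3 1 2 3]
print(repeated)  # [1 1 2 2 3 3]
```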
A combination of `reduce()` and the `append()` method of the `pandas.tseries.index.DatetimeIndex` object did the trick.
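    # Python 2 and the pandas API of the time (xrange, builtin reduce, pd.tseries.tools)
    import pandas as pd
    import numpy as np
    from string import ascii_uppercase

    dt_rng = pd.date_range(start = pd.tseries.tools.to_datetime('2012-12-31'),
                           end = pd.tseries.tools.to_datetime('2014-12-28'),
                           freq = 'D')
    df = pd.DataFrame(index = xrange(len(dt_rng) * 10),
                      columns = ['product', 'dt', 'unit_sales'])
    # Bracket assignment avoids clashing with the DataFrame.product method
    df['product'] = sorted(np.tile([chr for chr in ascii_uppercase[:10]], len(dt_rng)))
    # Append the full date range to itself ten times, keeping the original order
    df['dt'] = reduce(lambda x, y: x.append(y), [dt_rng] * 10)
    df['unit_sales'] = np.random.random_integers(0, 25, len(dt_rng) * 10)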
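On Python 3 with a recent pandas and NumPy, the same idea might look like the sketch below. It is a rewrite under stated assumptions, not the original poster's code: `n_products`, `dt_reduced`, `dt_tiled` and `rng` are names introduced here, `reduce` is imported from `functools`, and `np.random.default_rng().integers()` stands in for `random_integers()`. It also shows tiling the underlying `datetime64` array via `to_numpy()` as a possible alternative to `reduce()`.

```python
from functools import reduce
from string import ascii_uppercase

import numpy as np
import pandas as pd

n_products = 10  # hypothetical name for the literal 10 used in the post
dt_rng = pd.date_range(start="2012-12-31", end="2014-12-28", freq="D")

# Python 3 form of the reduce()/append() approach (reduce lives in functools)
dt_reduced = reduce(lambda x, y: x.append(y), [dt_rng] * n_products)

# Alternative: tile the plain datetime64 array rather than the index object
dt_tiled = np.tile(dt_rng.to_numpy(), n_products)

rng = np.random.default_rng(0)  # seeded only so the sketch is repeatable
df = pd.DataFrame({
    # Each product label repeated across the whole date range: A..A, B..B, ...
    "product": np.repeat(list(ascii_uppercase[:n_products]), len(dt_rng)),
    "dt": dt_tiled,
    # integers() excludes the upper bound, so 26 keeps values in 0..25
    "unit_sales": rng.integers(0, 26, size=len(dt_rng) * n_products),
})

print(df["dt"].head(10))
```

Building the frame from a dict of equal-length arrays avoids pre-allocating an empty frame and assigning columns afterwards; whether that suits a given workflow is a judgment call.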