python - `numpy.tile()` sorts automatically - is there an alternative?
I'd like to initialize a pandas DataFrame so that I can populate it with multiple time series.
    # Written for Python 2 and an older pandas release (xrange, pd.tseries.tools)
    import pandas as pd
    import numpy as np
    from string import ascii_uppercase

    # Daily dates from 2012-12-31 through 2014-12-28
    dt_rng = pd.date_range(start = pd.tseries.tools.to_datetime('2012-12-31'),
                           end = pd.tseries.tools.to_datetime('2014-12-28'),
                           freq = 'D')
    # One row per (product, date) pair, for 10 products
    df = pd.DataFrame(index = xrange(len(dt_rng) * 10),
                      columns = ['product', 'dt', 'unit_sales'])
    df.product = sorted(np.tile([chr for chr in ascii_uppercase[:10]], len(dt_rng)))
    df.dt = np.tile(dt_rng, 10)
    df.unit_sales = np.random.random_integers(0, 25, len(dt_rng) * 10)

However, when I check the first few values of `df.dt`, I see that the values in the field have been sorted, e.g. `df.dt[:10]` yields 2012-12-31 ten times. I'd like the output to be 2012-12-31, 2013-01-01, ..., 2013-01-08, 2013-01-09 (the first ten values).
In general, I'm looking for behavior similar to R's "recycling".
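For plain NumPy arrays, the difference between recycling and element-wise repetition can be sketched as follows. This is a minimal illustration and not part of the original question: `np.tile` repeats the whole sequence end to end (the closest analogue to R's recycling), `np.repeat` repeats each element in place (which is what a "sorted" result looks like), and `np.resize` recycles to an arbitrary target length. The three-element array `days` is a made-up stand-in for the date range.

```python
import numpy as np

# Illustrative stand-in for the date range (assumption, not from the question)
days = np.array(['2012-12-31', '2013-01-01', '2013-01-02'], dtype='datetime64[D]')

# Whole sequence repeated end to end: 12-31, 01-01, 01-02, 12-31, 01-01, 01-02
tiled = np.tile(days, 2)

# Each element repeated in place: 12-31, 12-31, 01-01, 01-01, 01-02, 01-02
repeated = np.repeat(days, 2)

# Recycled until the target length is reached: 12-31, 01-01, 01-02, 12-31, 01-01
recycled = np.resize(days, 5)

print(tiled)
print(repeated)
print(recycled)
```

If `df.dt[:10]` shows the same date ten times, the column was filled in the `np.repeat` pattern rather than the `np.tile` pattern.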
A combination of `reduce()` and the `append()` method of the `pandas.tseries.index.DatetimeIndex` object did the trick.
    # Written for Python 2 and an older pandas release (xrange, built-in reduce)
    import pandas as pd
    import numpy as np
    from string import ascii_uppercase

    dt_rng = pd.date_range(start = pd.tseries.tools.to_datetime('2012-12-31'),
                           end = pd.tseries.tools.to_datetime('2014-12-28'),
                           freq = 'D')
    df = pd.DataFrame(index = xrange(len(dt_rng) * 10),
                      columns = ['product', 'dt', 'unit_sales'])
    df.product = sorted(np.tile([chr for chr in ascii_uppercase[:10]], len(dt_rng)))
    # Append the full date range to itself until it appears 10 times in sequence
    df.dt = reduce(lambda x, y: x.append(y), [dt_rng] * 10)
    df.unit_sales = np.random.random_integers(0, 25, len(dt_rng) * 10)
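On Python 3 and a current pandas, `xrange`, the built-in `reduce`, and `pd.tseries.tools` are no longer available. The following is a rough sketch of the same frame under those assumptions, not the original poster's code: it tiles the underlying NumPy array from `dt_rng.to_numpy()` instead of the index object, and uses `np.repeat` for the product labels in place of `sorted(np.tile(...))`. The names `products`, `n_dates`, `n_products`, and `rng`, and the seed value, are introduced here for illustration only.

```python
import numpy as np
import pandas as pd
from string import ascii_uppercase

dt_rng = pd.date_range(start='2012-12-31', end='2014-12-28', freq='D')
products = list(ascii_uppercase[:10])  # 'A' through 'J'
n_dates, n_products = len(dt_rng), len(products)

rng = np.random.default_rng(0)  # seed chosen only to make the sketch reproducible

df = pd.DataFrame({
    # Each product label repeated once per date: A, A, ..., B, B, ...
    'product': np.repeat(products, n_dates),
    # The whole date range recycled once per product
    'dt': np.tile(dt_rng.to_numpy(), n_products),
    # Integers from 0 to 25 inclusive (upper bound of rng.integers is exclusive)
    'unit_sales': rng.integers(0, 26, size=n_dates * n_products),
})

print(df['dt'].head(10))
```

Building the columns in the constructor and using bracket access (`df['product']`) also avoids any clash between a column called `product` and the `DataFrame.product` method.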