Reading a Large CSV File in Python with pandas


I have a large data set, about 4 GB, in CSV format. I don't need the whole data set, only a few specific columns. Is it possible to read specific columns instead of the whole data set using Python pandas? Would that increase the speed of reading the file?

Thanks in advance for any suggestions.

If you have 4 GB of memory, don't worry about it (the time it would take to program a less memory-intensive solution isn't worth it). Read the entire dataset in using pd.read_csv, then subset the columns you need. If you don't have enough memory, you'll need to read the file in pieces (i.e. row by row or chunk by chunk), keeping only the columns of interest in memory; see the sketch below.
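A minimal sketch of both approaches follows. The file name data.csv and the column names col_a and col_b are placeholders; usecols and chunksize are standard pd.read_csv parameters for selecting columns at parse time and streaming the file in pieces.

import pandas as pd

# Plenty of memory: read everything, then keep only the columns you need.
df = pd.read_csv("data.csv")
subset = df[["col_a", "col_b"]]

# Faster and lighter: let the parser skip the unwanted columns entirely.
subset = pd.read_csv("data.csv", usecols=["col_a", "col_b"])

# Low memory: stream the file in chunks, keeping only the needed columns.
chunks = pd.read_csv("data.csv", usecols=["col_a", "col_b"], chunksize=100_000)
subset = pd.concat(chunks, ignore_index=True)

Passing usecols also answers the original question directly: the parser discards the other columns as it reads, which saves both memory and time.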

If you have plenty of memory and the problem is that you have multiple files in this format, I'd recommend using the multiprocessing package to parallelize the task:

from multiprocessing import Pool
pool = Pool(processes=your_processors_n)
dataframes_list = pool.map(your_regular_expression_readin_func, [file1, file2, ..., fileN])
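A self-contained version of that idea might look like the sketch below; the file pattern, worker count, and read_one function are illustrative assumptions, not part of the original answer.

import glob
from multiprocessing import Pool

import pandas as pd

def read_one(path):
    # Each worker parses one CSV, keeping only the columns of interest.
    return pd.read_csv(path, usecols=["col_a", "col_b"])

if __name__ == "__main__":
    paths = glob.glob("data_part_*.csv")  # hypothetical file naming scheme
    with Pool(processes=4) as pool:
        frames = pool.map(read_one, paths)
    combined = pd.concat(frames, ignore_index=True)

Note that the worker function is defined at module level so multiprocessing can pickle it when dispatching work to the pool.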
