java - How to save a sparse dataset to be used by scikit-learn?
I'm writing a text mining tool in Java and want to test my dataset with scikit-learn classifiers. I create the feature vectors on the fly in Java, and the vectors are sparse. I want to export these sparse vectors to a format that scikit-learn can use easily. I wrote an export function in Java that writes the dataset in ARFF format, but found there is no way to read it from scikit-learn. There are Python parsers for ARFF files, but they don't support sparse datasets.
So how should I export my dataset so that it is usable from scikit-learn? That is, in what format?
A simple, if sub-optimal, approach is to use the libsvm / svmlight format, a plain text format in which each line looks like this:
label feature_index:feature_value feature_index:feature_value
This can work fine if your data is not too large. You can read it with sklearn.datasets.load_svmlight_file.
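For the Java side, here is a minimal sketch of such an exporter. The class name `SvmLightExporter` and its methods are hypothetical (not from any library). It assumes integer class labels and features held in a `SortedMap` with natural ordering from feature index to value, because svmlight expects the indices on each line in ascending order.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical helper for illustration; not part of any library.
final class SvmLightExporter {

    // Formats one example as "label index:value index:value ...".
    // Assumes the map uses natural ordering, so indices come out ascending.
    static String formatRow(int label, SortedMap<Integer, Double> features) {
        StringBuilder sb = new StringBuilder();
        sb.append(label);
        for (Map.Entry<Integer, Double> e : features.entrySet()) {
            double value = e.getValue();
            if (value == 0.0) {
                continue; // zeros are implicit in a sparse format
            }
            sb.append(' ').append(e.getKey()).append(':').append(value);
        }
        return sb.toString();
    }

    // Writes one line per example; labels.get(i) belongs to rows.get(i).
    static void write(Path out, List<Integer> labels,
                      List<SortedMap<Integer, Double>> rows) throws IOException {
        if (labels.size() != rows.size()) {
            throw new IllegalArgumentException("labels and rows differ in length");
        }
        try (BufferedWriter w = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            for (int i = 0; i < rows.size(); i++) {
                w.write(formatRow(labels.get(i), rows.get(i)));
                w.write('\n');
            }
        }
    }

    public static void main(String[] args) {
        SortedMap<Integer, Double> doc = new TreeMap<>();
        doc.put(7, 0.5);
        doc.put(2, 3.0);
        System.out.println(formatRow(1, doc)); // prints "1 2:3.0 7:0.5"
    }
}
```

On the Python side, load_svmlight_file returns the features as a SciPy sparse matrix together with an array of labels. Feature indices are conventionally one-based in this format; the loader's zero_based parameter defaults to "auto", and can be set explicitly if your indices start at 0.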
I'm a bit surprised that the ARFF readers in Python don't support sparse data. Have you tried scipy.io.arff.loadarff?