hadoop - Thorn character delimiter is not recognized in Hive
As mentioned in the post about using the Icelandic thorn character as a delimiter in Hive, the thorn character delimiter is not recognized in Hive.
Sample table:
create external table if not exists zzzzz_raw (
  spot_id int,
  activity_type_id int,
  activity_type string,
  activity_id int,
  activity_sub_type string,
  report_name string,
  tag_method_id int
)
partitioned by (dt date)
row format delimited fields terminated by '\-2' lines terminated by '\n'
stored as textfile
location '/raw/data/networkmatchtablesactivity/activity_cat';
Output:
select * from activity_cat_raw limit 1;
4552126þ805759þeaasv101þ2275868þbfeaac01þbf_ea access_info pageþ2 null null null null null null 2015-03-24
Am I missing something?
I found the answer. Instead of '\-2' (the thorn delimiter), I used '\-61' as the delimiter, and then substr to remove the additional symbol that is left over, as below:
create external table if not exists ssssss (
  spot_id string,
  activity_type_id string,
  activity_type string,
  activity_id string,
  activity_sub_type string,
  report_name string,
  tag_method_id string
)
partitioned by (dt string)
row format delimited fields terminated by '\-61' lines terminated by '\n'
stored as textfile
location 'ssssss';
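A likely reason '\-61' matches where '\-2' does not: as far as I understand, Hive's delimited text format treats the field delimiter as a single byte, and the number is read as a signed byte value. The short Python sketch below prints the signed byte values of þ under two encodings. It assumes the data file is UTF-8 encoded, which the post does not state; the helper name `signed_bytes` is made up for illustration.

```python
# Signed-byte view of the thorn character under two encodings.
# Assumption: the data file is UTF-8; the original post does not say so.

def signed_bytes(text: str, encoding: str) -> list[int]:
    """Return each encoded byte as a signed 8-bit integer (Java-style)."""
    return [b - 256 if b > 127 else b for b in text.encode(encoding)]

THORN = "\u00fe"  # þ

latin1_bytes = signed_bytes(THORN, "latin-1")
utf8_bytes = signed_bytes(THORN, "utf-8")

print(latin1_bytes)  # [-2]        one byte, 0xFE
print(utf8_bytes)    # [-61, -66]  two bytes, 0xC3 0xBE
```

So '\-2' would only match a Latin-1 encoded thorn. In a UTF-8 file the first byte of þ is -61, and the second byte (-66) stays attached to the start of the next field.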
Then use substr to remove the other symbol:
insert overwrite table vvvvvv partition (dt) select spot_id, substr(activity_type_id, 2), dt from sssss;
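To illustrate why the substr step is needed, here is a rough Python simulation of splitting a row on the single byte 0xC3 (signed -61) and then dropping the stray leading byte from every field after the first. It assumes the row is UTF-8 encoded and uses only the first three fields of the sample output above. It works on raw bytes, so it only approximates what Hive's substr(col, 2) does on strings.

```python
# Simulate splitting a UTF-8 row on the byte 0xC3, the first byte of "þ".
# Assumption: UTF-8 input; the row is the start of the sample output in the post.
row = "4552126þ805759þeaasv101".encode("utf-8")

fields = row.split(b"\xc3")

# Every field after the first still starts with 0xBE, the second byte of "þ".
# Dropping that one byte mirrors substr(col, 2) in the insert statement.
cleaned = [fields[0]] + [f[1:] for f in fields[1:]]
decoded = [f.decode("utf-8") for f in cleaned]

print(fields)   # [b'4552126', b'\xbe805759', b'\xbeeaasv101']
print(decoded)  # ['4552126', '805759', 'eaasv101']
```

The first column has no leftover byte, which matches the insert statement selecting spot_id unchanged and applying substr from position 2 to the next column.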
Hope this helps.