Hadoop - MapReduce: read a file from the classpath in tasks
I have bundled a file, "xxx.txt.gz", into my fat jar.
I need to reference this file inside each YARN container, in each map task.
So if you look inside the jar, you see xxx.txt.gz.
I am trying to access the file via
File mappingFile = new File(getClass().getClassLoader().getResource("xxx.txt.gz").getFile());
However, at run time I see the following error in the logs of the task attempts:
java.io.FileNotFoundException: file:/local/hadoop/1/yarn/local/usercache/user/appcache/application_1431608807540_0071/filecache/10/job.jar/job.jar!/xxx.txt.gz (No such file or directory)
In other words, even though the fat jar had the file, the job.jar does not.
How can I remedy this?
Thanks a lot in advance.
There is a way of accessing the file from mappers/reducers. I hope this idea might be ideal in MapReduce.
You can use the distributed cache option available in MapReduce. This way you can make Hadoop distribute the file to the containers on which the job's mappers/reducers execute.
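As a rough sketch of the task side of this approach: once a file has been registered with the distributed cache, Hadoop localizes it into the task's working directory, so it can be opened as an ordinary local file. The class and method names below (MappingFileLoader, readGzipLines, loadFromWorkingDir) are hypothetical helpers, not part of Hadoop, and the HDFS path in the comment is a placeholder. The driver-side call needs the Hadoop libraries, so it appears only as a comment; the code assumes the file is gzip-compressed UTF-8 text.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;

// Hypothetical helper (not a Hadoop class) for reading a gzip-compressed
// mapping file inside a map or reduce task.
//
// Driver side, shown as a comment because it needs Hadoop on the classpath.
// The HDFS location is a placeholder; the "#xxx.txt.gz" fragment names the
// link created in each task's working directory:
//   job.addCacheFile(new URI("hdfs:///some/path/xxx.txt.gz#xxx.txt.gz"));
public class MappingFileLoader {

    // Reads every line of a gzip-compressed UTF-8 text stream and closes it.
    public static List<String> readGzipLines(InputStream raw) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new GZIPInputStream(raw), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    // Opens a file on the local file system and reads it as gzip text.
    public static List<String> loadLocalized(Path localFile) throws IOException {
        try (InputStream in = Files.newInputStream(localFile)) {
            return readGzipLines(in);
        }
    }

    // Task side: resolves the link name relative to the current working
    // directory, which is where the localized cache file is assumed to be.
    public static List<String> loadFromWorkingDir(String linkName) throws IOException {
        return loadLocalized(Paths.get(linkName));
    }
}
```

In a mapper, this would typically be called once from setup(), for example MappingFileLoader.loadFromWorkingDir("xxx.txt.gz"). The stream-based helper also relates to the error in the question: a resource packed inside a jar is not a file on disk, so java.io.File cannot open it, whereas passing the result of getClassLoader().getResourceAsStream("xxx.txt.gz") to readGzipLines reads the resource as a stream, provided the resource is actually on the task's classpath.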