java - Multiple files as input on Amazon Elastic MapReduce


I'm trying to run a job on Elastic MapReduce (EMR) with a custom JAR. I'm trying to process 1000 files in a single directory. When I submit the job with the parameter s3n://bucketname/compressed/*.xml.gz, I get a "matched 0 files" error. If I pass the absolute path to a single file (e.g. s3n://bucketname/compressed/00001.xml.gz), it runs fine, but only that one file gets processed. I also tried using the name of the directory (s3n://bucketname/compressed/), hoping that the files within it would be processed, but that just passes the directory itself to the job.

At the same time, I have a smaller local Hadoop installation. There, when I pass the job a wildcard path (/path/to/dir/on/hdfs/*.xml.gz), it works fine and all 1000 files are listed correctly.

How do I get EMR to list all the files?

I don't know how EMR lists files, but here's a piece of code that works for me:

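        // Resolve the file system for the input URI (args[0], e.g. an s3n:// directory).
        FileSystem fs = FileSystem.get(URI.create(args[0]), job.getConfiguration());
        // List the directory's entries and add each one as an input path.
        FileStatus[] files = fs.listStatus(new Path(args[0]));
        for (FileStatus sfs : files) {
            FileInputFormat.addInputPath(job, sfs.getPath());
        }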

It lists all the files in the input directory, and you can then do with them what you will.
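
Note that the loop above adds every entry in the directory, not only the *.xml.gz files. If the directory might hold anything else, one option is to check each file name against the pattern before adding it. Here is a minimal sketch of that check using only the standard library; the class name XmlGzFilter and its methods are made up for illustration and are not part of Hadoop or EMR:

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: decides whether a bare file name matches *.xml.gz.
public class XmlGzFilter {
    // Glob matcher from java.nio; "*" matches any characters within one name.
    private static final PathMatcher MATCHER =
            FileSystems.getDefault().getPathMatcher("glob:*.xml.gz");

    // True if the file name (no directory part) matches *.xml.gz.
    static boolean matches(String fileName) {
        return MATCHER.matches(Paths.get(fileName));
    }

    // Returns only the names that match the pattern, in their original order.
    static List<String> keepMatching(List<String> names) {
        List<String> kept = new ArrayList<>();
        for (String name : names) {
            if (matches(name)) {
                kept.add(name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
                "00001.xml.gz", "00002.xml.gz", "_SUCCESS", "notes.txt");
        System.out.println(keepMatching(names));
    }
}
```

In the job driver, the same test could be applied to sfs.getPath().getName() inside the for loop, calling FileInputFormat.addInputPath only when it returns true.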

