memory management - Reading large input files (10 GB) through a Java program
I am working with two large input files, on the order of 5 GB each, which are the output of a Hadoop map-reduce job. I was not able to express the dependency calculations in map-reduce, so I am switching to an optimized loop for the final calculations (see my previous question on map-reduce design: recursive calculations using mapreduce).
I would like suggestions on reading such huge files in Java, doing some basic operations on them, and writing out data on the order of around 5 GB.

I appreciate your help.
If the files have the properties you described, i.e. 100 integer values per key and 10 GB each, you are talking about a very large number of keys, far more than could feasibly fit in memory. If you can order the files before processing, for example using the OS sort utility or a MapReduce job with a single reducer, you can read the two files simultaneously, do your processing, and output the result without keeping any data in memory.
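A minimal sketch of that merge-style read, assuming each file is already sorted by key and holds one tab-delimited record per key (both assumptions; the question does not state the record format):

    import java.io.BufferedReader;
    import java.io.BufferedWriter;
    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;

    public class SortedMergeJoin {
        public static void main(String[] args) throws IOException {
            // Both inputs are assumed sorted by key, one record per line,
            // in the form "key\tvalue1,value2,..." (a hypothetical format).
            try (BufferedReader a = Files.newBufferedReader(Paths.get(args[0]));
                 BufferedReader b = Files.newBufferedReader(Paths.get(args[1]));
                 BufferedWriter out = Files.newBufferedWriter(Paths.get(args[2]))) {

                String lineA = a.readLine();
                String lineB = b.readLine();
                while (lineA != null && lineB != null) {
                    String keyA = lineA.substring(0, lineA.indexOf('\t'));
                    String keyB = lineB.substring(0, lineB.indexOf('\t'));
                    int cmp = keyA.compareTo(keyB);
                    if (cmp == 0) {
                        // Matching key: combine the two records and stream out
                        // the result; nothing accumulates in memory.
                        out.write(combine(lineA, lineB));
                        out.newLine();
                        lineA = a.readLine();
                        lineB = b.readLine();
                    } else if (cmp < 0) {
                        lineA = a.readLine(); // key only in file A; skip or handle
                    } else {
                        lineB = b.readLine(); // key only in file B; skip or handle
                    }
                }
            }
        }

        // Placeholder for whatever per-key calculation you actually need.
        private static String combine(String recA, String recB) {
            return recA + '\t' + recB.substring(recB.indexOf('\t') + 1);
        }
    }

If you sort with the OS utility, something like LC_ALL=C sort -t$'\t' -k1,1 input > input.sorted keeps the byte ordering consistent with Java's String.compareTo for ASCII keys, so the merge logic above sees both files in the same order.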