Tuesday, September 9, 2014

What's the difference between RDD's map,FlatMap and mapPartitions method?

What's the difference between RDD's map,FlatMap and mapPartitions method?

map converts each element of the source RDD into a single element of the result RDD by applying a function. mapPartitions converts each partition of the source RDD into into multiple elements of the result (possibly none). FlatMap works on a single element (as map) and produces multiple elements of the result (as mapPartitions).
RDD -> Map -> Single Element(one to one)
RDD -> MapPartition -> MultipleElements (single) for Each Partition
RDD -> FlatMap -> Multiple Elements(one to many)

These are the same in functioning
a.map(r => r * 2)
a.mapPartitions(iter => iter.map(r => r *2))

1 comment:

  1. Your advice is practical and actionable. I appreciate the positive vibes you had given. color blind test for kids guide them towards a colorful future. I hope you also visit my blog and give us a good opinion.

    ReplyDelete