How to flatten (or explode) the data along with row in a Dataframe, based on column data?

The DataFrame should be exploded based on the SPC column: one output row per character of SPC. Below is an example.
My input DataFrame:
ID  Name  Level  SPC   Rating  Salary
23  sam   3      HBS   3.5     4000
43  Nair  4      KSTk  4       5000
56  Rom   5      MNC   3       3000
My output should be:
ID  Name  Level  SPC  Rating  Salary
23  sam   3      H    3.5     4000
23  sam   3      B    3.5     4000
23  sam   3      S    3.5     4000
43  Nair  4      K    4       5000
43  Nair  4      S    4       5000
43  Nair  4      T    4       5000
43  Nair  4      k    4       5000
How can I resolve this problem in Scala or Java code?
2 Answers
If you have a dataframe/dataset as
+---+----+-----+----+------+------+
|ID |Name|Level|SPC |Rating|salary|
+---+----+-----+----+------+------+
|23 |sam |3 |HBS |3.5 |4000 |
|43 |Nair|4 |KSTk|4.0 |5000 |
|56 |Rom |5 |MNC |3.0 |3000 |
+---+----+-----+----+------+------+
then you can write a udf function that converts the SPC column's string value into an array of single-character strings, and then apply the explode function:
import org.apache.spark.sql.functions._
// Turn a string into one single-character string per element, e.g. "HBS" -> List("H", "B", "S")
def flattenStringUdf = udf((spc: String) => spc.toList.map(_.toString))
// explode produces one row per array element, replacing the SPC column
df.withColumn("SPC", explode(flattenStringUdf(col("SPC")))).show(false)
which should give you
+---+----+-----+---+------+------+
|ID |Name|Level|SPC|Rating|salary|
+---+----+-----+---+------+------+
|23 |sam |3 |H |3.5 |4000 |
|23 |sam |3 |B |3.5 |4000 |
|23 |sam |3 |S |3.5 |4000 |
|43 |Nair|4 |K |4.0 |5000 |
|43 |Nair|4 |S |4.0 |5000 |
|43 |Nair|4 |T |4.0 |5000 |
|43 |Nair|4 |k |4.0 |5000 |
|56 |Rom |5 |M |3.0 |3000 |
|56 |Rom |5 |N |3.0 |3000 |
|56 |Rom |5 |C |3.0 |3000 |
+---+----+-----+---+------+------+
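The character-splitting logic inside the udf can be checked in plain Scala, independent of Spark (the object name here is just for the demo):

```scala
object CharSplitDemo {
  // Mirrors the udf body above: split a string into single-character strings.
  def flattenString(spc: String): List[String] = spc.toList.map(_.toString)

  def main(args: Array[String]): Unit = {
    println(flattenString("HBS"))  // List(H, B, S)
    println(flattenString("KSTk")) // List(K, S, T, k) - case is preserved
  }
}
```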
I hope the answer is helpful.
Try the flatMap method.
Example (haven't checked if this compiles):
val output = input.flatMap(row =>
  row.SPC.toList.map(ch =>
    MyRow(row.ID, row.Name, row.Level, ch.toString, row.Rating, row.Salary)))
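The same flatMap idea can be sketched on plain Scala collections; the MyRow field names are assumed from the question, and on a typed Spark Dataset[MyRow] the same flatMap call would work unchanged:

```scala
// Hypothetical row type matching the question's columns.
case class MyRow(id: Int, name: String, level: Int,
                 spc: String, rating: Double, salary: Int)

object FlatMapDemo {
  val input = Seq(MyRow(23, "sam", 3, "HBS", 3.5, 4000))

  // One output row per character of spc; copy() keeps every other field.
  val output: Seq[MyRow] = input.flatMap(row =>
    row.spc.map(ch => row.copy(spc = ch.toString)))

  def main(args: Array[String]): Unit = output.foreach(println)
}
```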
It is working for me. Thank you very much, Ramesh Maharajan.
– Siddesh H K
Jul 1 at 7:12