Welcome to WuJiGu Developer Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Spark Scala: functional difference in notation using $?

Is there a functional difference between the following two expressions? The results look the same to me, but I'm curious whether I'm missing something. What does the $ symbol indicate, and how is it read?

df1.orderBy($"reasonCode".asc).show(10, false)
    
df1.orderBy(asc("reasonCode")).show(10, false)

1 Answer

Those two statements are equivalent and produce identical results.

The $ notation is specific to Spark's Scala API. It is a string interpolator provided by the implicit class StringToColumn (brought into scope by import spark.implicits._), which interprets the string that follows, "reasonCode", as a Column:

implicit class StringToColumn(val sc: StringContext) {
  def $(args: Any*): ColumnName = {
    new ColumnName(sc.s(args: _*))
  }
}
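
To see the mechanism in isolation, here is a minimal sketch in plain Scala that runs without Spark. MiniColumn, MiniImplicits and StringToMiniColumn are hypothetical stand-ins I made up for illustration; only the shape of the implicit class mirrors the Spark snippet above.

```scala
// Hypothetical stand-in for Spark's ColumnName, so this runs without Spark.
final case class MiniColumn(name: String)

object MiniImplicits {
  // Same shape as StringToColumn above: it wraps the StringContext that the
  // compiler builds for an interpolated string such as $"reasonCode".
  implicit class StringToMiniColumn(val sc: StringContext) {
    def $(args: Any*): MiniColumn = MiniColumn(sc.s(args: _*))
  }
}

object InterpolatorDemo {
  import MiniImplicits._

  def main(args: Array[String]): Unit = {
    // The interpolator form, as used in the question.
    val viaInterpolator = $"reasonCode"

    // Roughly what the compiler rewrites the line above into.
    val desugared = new StringToMiniColumn(StringContext("reasonCode")).$()

    // Like any interpolator, $ can also splice in values.
    val prefix = "reason"
    val withArg = $"${prefix}Code"

    println(viaInterpolator == desugared) // true
    println(withArg)                      // MiniColumn(reasonCode)
  }
}
```

So $"reasonCode" is nothing more than a method call named $ on a wrapped StringContext; in Spark that call returns a ColumnName, which is a Column.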

In Spark's Scala API there are many ways to refer to a column. I have written down a full list of the syntax varieties in another answer, on selecting specific columns from a Spark DataFrame.

Choosing one notation over another has no impact on performance, as they all get translated into the same set of RDD operations by Spark's Catalyst optimizer.
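
If you want to check this yourself, the following sketch sorts the same DataFrame using four notations and prints the physical plan for each, so you can compare them side by side. The sample data, the local master setting, and the object name OrderByNotations are assumptions for illustration; the column name reasonCode and the variable df1 come from the question. It needs the spark-sql dependency on the classpath.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{asc, col}

object OrderByNotations {
  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local session, just for this comparison.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("orderBy-notations")
      .getOrCreate()
    import spark.implicits._ // brings the $ interpolator and toDF into scope

    // Made-up sample data with the column name from the question.
    val df1 = Seq("B2", "A1", "C3").toDF("reasonCode")

    // Four ways to express the same ascending sort.
    val variants = Seq(
      "dollar interpolator" -> df1.orderBy($"reasonCode".asc),
      "functions.asc"       -> df1.orderBy(asc("reasonCode")),
      "functions.col"       -> df1.orderBy(col("reasonCode").asc),
      "DataFrame.apply"     -> df1.orderBy(df1("reasonCode").asc)
    )

    // Print each physical plan for a side-by-side comparison.
    variants.foreach { case (label, sorted) =>
      println(s"--- $label")
      sorted.explain()
    }

    spark.stop()
  }
}
```

The notation only affects how the Column object is constructed on the driver; by the time Spark plans the query, the sort expression it sees should be the same in each case.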

