Scala: How to Calculate a Running Total or Cumulative Sum in Databricks

Ryan Arjun
3 min read · Mar 3, 2024

In this tutorial, you will learn how to calculate a running total, or cumulative sum, using Scala in Databricks.

Scala is a programming language that combines the object-oriented and functional programming paradigms. Martin Odersky designed it, and its first version appeared in 2003. The name “Scala” is a blend of “scalable” and “language,” signifying the language’s capacity to grow from simple scripts to complex systems.

Scala is designed to be productive, expressive, and concise, and it can be used for a variety of tasks, from scripting to large-scale enterprise applications. It has gained popularity in sectors like banking, where its robust type system and expressive syntax are especially helpful.

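To compute a running total over a DataFrame in Apache Spark with Scala, you can define a window specification with the Window object and apply the sum aggregate over it. The window orders the rows, and the sum then accumulates from the first row up to the current one.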
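As a minimal sketch of this idea, the example below builds a small DataFrame and adds a running total column. The session setup, the sample data, and the names sales, sale_date, amount, and running_total are all assumptions made for illustration; they do not come from the original notebook. In a Databricks notebook the spark session already exists, so getOrCreate() simply returns it.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, sum}

// Assumption: a local session for illustration; Databricks provides `spark` already.
val spark = SparkSession.builder()
  .appName("RunningTotalSketch")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data: one row per day with a sales amount.
val sales = Seq(
  ("2024-01-01", 100),
  ("2024-01-02", 250),
  ("2024-01-03", 75)
).toDF("sale_date", "amount")

// Order all rows by date and sum from the first row up to the current row.
val runningWindow = Window
  .orderBy(col("sale_date"))
  .rowsBetween(Window.unboundedPreceding, Window.currentRow)

val withRunningTotal = sales
  .withColumn("running_total", sum(col("amount")).over(runningWindow))

// The running_total values are 100, 350, and 425 in date order.
withRunningTotal.orderBy("sale_date").show()
```

Because this window has no partition column, Spark moves all rows into a single partition to evaluate it. That is fine for small data, but it can become a bottleneck on large tables.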

To compute a running total within groups, you still use a window specification, but you partition the data by the group column so that the sum restarts for each group.
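The grouped variant can be sketched in the same way. Again, the sample data and the names regionalSales, region, sale_date, amount, and running_total are hypothetical and chosen only for illustration; the only change to the technique is the added partitionBy call.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, sum}

// Assumption: a local session for illustration; Databricks provides `spark` already.
val spark = SparkSession.builder()
  .appName("GroupedRunningTotalSketch")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data: sales per region and day.
val regionalSales = Seq(
  ("North", "2024-01-01", 100),
  ("North", "2024-01-02", 50),
  ("South", "2024-01-01", 200),
  ("South", "2024-01-02", 25),
  ("North", "2024-01-03", 30)
).toDF("region", "sale_date", "amount")

// Partition by region so the sum restarts per group, then order by date
// and accumulate from the first row of the partition to the current row.
val groupWindow = Window
  .partitionBy(col("region"))
  .orderBy(col("sale_date"))
  .rowsBetween(Window.unboundedPreceding, Window.currentRow)

val withGroupTotal = regionalSales
  .withColumn("running_total", sum(col("amount")).over(groupWindow))

// North accumulates to 100, 150, 180; South accumulates to 200, 225.
withGroupTotal.orderBy("region", "sale_date").show()
```

Stating the frame explicitly with rowsBetween is a design choice here: when an ordered window has no explicit frame, Spark uses a range-based frame, which gives rows that tie on the ordering column the same total.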

Steps to follow:

💎 Import necessary classes and functions from Apache Spark.

// import libraries 
import org.apache.spark.sql.{SparkSession, Row}…

