In Spark SQL, function FIRST_VALUE (FIRST) and LAST_VALUE (LAST) can be used to to find the first or the last value of given column or expression for a group of rows. If parameter isIgnoreNull
is specified as true, they return only non-null values (unless all values are null).
Function signature
first(expr[, isIgnoreNull])
first_value(expr[, isIgnoreNull])
last(expr[, isIgnoreNull])
last_value(expr[, isIgnoreNull])
Code snippets
first\_value (first)
SELECT ACCT, first_value(AMT,true) FROM VALUES
(101,10.01, DATE'2021-01-01'),
(101,102.01, DATE'2021-01-01'),
(102,93., DATE'2021-01-01'),
(103,913.1, DATE'2021-01-02'),
(102,913.1, DATE'2021-01-02'),
(101,900.56, DATE'2021-01-03')
AS TXN(ACCT,AMT, TXN_DT)
group by ACCT;
Output:
ACCT first_value(AMT)
101 10.01
103 913.10
102 93.00
last\_value (last)
SELECT ACCT, last_value(AMT,true), last(AMT,true) FROM VALUES
(101,10.01, DATE'2021-01-01'),
(101,102.01, DATE'2021-01-01'),
(102,93., DATE'2021-01-01'),
(103,913.1, DATE'2021-01-02'),
(102,913.1, DATE'2021-01-02'),
(101,900.56, DATE'2021-01-03')
AS TXN(ACCT,AMT, TXN_DT)
group by ACCT;
Ouput:
ACCT last_value(AMT) last(AMT)
101 900.56 900.56
103 913.10 913.10
102 913.10 913.10