J for Finance people - Running SQL query within J script
#J #jlang #J4finance
SQL is simple but at same time a very powerful programming language especially when you have to deal with lots of data. Data are stored in database and then retrieved using SQL command “Select?from <table name>”. My intent is not to cover SQL in depth but to share some of my related experiences and then explain a bit about - how to use SQL in J programming.?
In my early role as a risk manager, I have used SQL along with VBA quite lot of times for quick validation of risk numbers. Tasks like (1) quick computation of bucket-wise BPVs for the purpose of Desk PnL Attribution or VaR Backtesting (2) Stress Testing & Scenario analysis (3) VaR validation became easier for me when I learnt few advance features of SQL. Obviously, when I started to learn, that was completely unstructured and purely out of chance. Don’t remember - how many web pages I might have searched and read to learn about these advanced concepts. Much of these readings I did during my local train travelling from Home to Office in morning time and that way utilised my travelling time. I enjoyed all these in those days. This is my story. I am not sure how others manage to learn new things despite their busy schedule. If you also have similar interesting story to tell and like to share that - please do so as it may inspire others.
I am sharing below some of advance SQL concepts so you can pick them one by one and then apply in your area of interest.?
1) Common table expression or CTE acts like temporary views that exist only for the duration of a single SQL statement. It’s alternate of using sub-queries. Sub-queries also serve your job well but sometime lead to messy SQL statements. CTE comes as rescue to that problem. I had used it extensively in past and it was quite productive to develop SQL queries using CTE approach.?Few relevant sources you can go through:?https://modern-sql.com/feature/with?,?https://mjk.space/advanced-sql-cte/?
2) Window functions - It allows us to perform calculation over a set of rows (partitioned into various blocks based on values in other columns) instead normal calculation over entire data-table. Believe me these are very powerful functions as you can have column details along-with aggregated result/s.?To use it, you need to define the window using the OVER() clause - which is usually followed by PARTITION BY clause (Defines window partitions to form groups of rows) and ORDER BY clause (Orders rows within a partition). If you wanna see how exactly various SQL Window functions work then I will recommend you to see https://www.sqlshack.com/use-window-functions-sql-server/ .?Yes it’s possible to accomplish the same thing with other methods, but the performance in case of window functions is far better with readability as bonus for you. Two more useful links to help you https://mjk.space/advanced-sql-window-functions/ https://community.modeanalytics.com/sql/tutorial/sql-window-functions/?
3) Pivot function -?If you collect your data in an SQL database, the best way to analyse it is using SQL server pivot tables - which is like Excel Pivot in visualisation. Some useful source to learn how exactly it works:?
https://sqlhints.com/2014/03/10/pivot-and-unpivot-in-sql-server/?
https://jackworthen.com/2016/03/14/converting-rows-to-columns-using-pivot-in-sql-server/
https://www.techonthenet.com/sql_server/pivot.php
领英推荐
https://www.essentialsql.com/create-dynamic-pivot-table-sql-server/
After going through above, now you would be comfortable taking a real world example from Finance (Computing Correlation from daily historical volatility of a Risk factor say FX) which uses all above three advance concepts viz CTE, a Window Functions and Pivot Table in a single SQL query: -
use database_name
go
?
with ???-- CTE clause starts from here
?
-- selecting last 252 working days for retrieving time series of market data
ReportDate (date)?
as
(SELECT?? distinct? top (252)
Date
FROM???????? positionTable
order by Date)
?
,
?
--? Retrieving time series of market data ( i.e. FX volatility ) for selected dates
VolsHist ( Date, Currency,? Vols)
as
(SELECT???? Date, Currency, Vols
FROM???????? VolsTimeSeries
where Date in (select date from ReportDate)
)
?
,
?
--? Computing Correlation
step1 ( ID ,[Row],[Column],n,Sx,Sy,Sxx,Syy,Sxy,SxSx,SySy,SxSy)
as
(select ROW_NUMBER() OVER(ORDER BY tab1.Currency, tab2.Currency) ,?? -- ROW_NUMBER? - a window function
tab1.Currency,
tab2.Currency,
count(*),?????????????????????????-- total number of observations for a particular combination of x & y
sum(tab1.vols),?? ????????????????--Total of X? ??????????????????? ?????Sx
sum(tab2.Vols) , ?????????????????--Total of y?? ?????????????????? ?????Sy
sum(tab1.vols*tab1.Vols) , ???????-- Total of square of X ?? ????????????Sxx
sum(tab2.vols*tab2.Vols)? , ??????-- Total of square of Y ?? ????????????Syy
sum(tab1.vols*tab2.Vols), ????????-- Total of product of X & Y ??????????Sxy
square(sum(tab1.vols)), ??????????-- Square of Total of X??? ????????????SxSx
square(sum(tab2.Vols)), ??????????-- Square of Total of Y??? ????????????SySy
sum(tab1.vols)* sum(tab2.Vols) ???-- Product of Total of X & Total of Y? SxSy
?
from VolsHist tab1 join VolsHist tab2
??????????????????? on Tab1.Date=Tab2.Date and Tab1.Currency >= Tab2.Currency
?
group by tab1.Currency, tab2.Currency
)
,
Step2 as
(
select * , dbo.uFnMin ( round(((n · Sxy - SxSy) / sqrt( (n· ?Sxx ?- SxSx) * (n· Syy - SySy))),4),1.00) as Correlation
from Step1
)
?
--Creating dynamic pivot table step by step
-- 1. Create a Column List
DECLARE @Columns as VARCHAR(MAX)
SELECT @Columns =
COALESCE(@Columns + ', ','') + QUOTENAME(Column)
FROM
?? (SELECT DISTINCT Column
??? FROM?? Step2
?? ) AS B
ORDER BY B.Column
?
-- 2. Construct a pivot table as SQL Statement
DECLARE @SQL as VARCHAR(MAX)
SET @SQL =? ‘select Row, ‘ + @columns + ‘?
from (
select Row, Column, Correlation from Step2
)
PIVOT
(
avg (correlation)
FOR column IN
(' + @Columns + ')?
) AS pvt
order by? Row’
?
-- 3.?? Execute the Statement
EXEC(@SQL);
Note: Correlation for a sample is computed using formula and same is applied in Step2 table of above SQL
Hope, you would find above advanced concepts relevant and would like to apply in your daily crunching of numbers from database.?If you are new to SQL and willing to learn basic SQL then I am sure there are lots of good tutorials on web (like 1. www.tutorialspoint.com/sql/ ,?2. www.SQL-Tutorial.net?- even this has a good one page cheat sheet on SQL) for you to refer them. Your clear understanding on “Set theory” (which you might have studied in 6 or 7 grade during schooling) will accelerate your SQL learning. If you are wondering why I am saying this then please go thru’ this link https://www.sqlshack.com/mathematics-sql-server-fast-introduction-set-theory/ .?
Now coming to “Using SQL within J script”,???
J programming language, which is really strong for handling vectors, matrices and multi-dimensional arrays,?has few addons like data/ODBC, data/MySQL, data/sqlite for retrieving from and storing data on database. I wanted to try some codes and chose sqlite for this purpose.?
Thank you for reading.