rel:: [[Programming Languages]] rel:: [Against SQL](x-devonthink-item://C236F225-B503-41F6-AA74-CEB634A77D15) # PRQL > **P**ipelined **R**elational **Q**uery **L**anguage, pronounced "Prequel". > > [...] Unlike SQL, it forms a logical pipeline of transformations, and supports abstractions such as variables and functions. It can be used with any database that uses SQL, since it transpiles to SQL. ## Principles > PRQL is intended to be a modern, simple, declarative language for transforming data, with abstractions such as variables & functions. It's intended to replace SQL, but doesn't have ambitions as a general-purpose programming language. While it's at a pre-alpha stage, it has some immutable principles: > > - _Pipelined_ — PRQL is a linear pipeline of transformations — each line of the query is a transformation of the previous line's result. This makes it easy to read, and simple to write. This is also known as "[point-free style](https://en.wikipedia.org/w/index.php?title=Point-free_programming)". > - _Simple_ — PRQL serves both sophisticated engineers and analysts without coding experience. By providing simple, clean abstractions, the language can be both powerful and easy to use. > - _Compatible_ — PRQL transpiles to SQL, so it can be used with any database that uses SQL, and with any existing tools or programming languages that manage SQL. PRQL should allow for a gradual onramp — it should be practical to mix SQL into a PRQL query where PRQL doesn't yet have an implementation. Where possible PRQL can unify syntax across databases. > - _Analytical_ — PRQL's focus is analytical queries; we de-emphasize other SQL features such as inserting data or transactions. > - _Extensible_ — PRQL can be extended through its abstractions, and can evolve without breaking backward-compatibility, because its queries can specify their PRQL version. ## Reference - [source](https://github.com/max-sixty/prql) - [hacker news discussion](https://news.ycombinator.com/item?id=30060784#30062329) - [similar projects](https://github.com/max-sixty/prql#similar-projects) ### Syntax Example ```SQL SELECT TOP 20 title, country, AVG(salary) AS average_salary, SUM(salary) AS sum_salary, AVG(salary + payroll_tax) AS average_gross_salary, SUM(salary + payroll_tax) AS sum_gross_salary, AVG(salary + payroll_tax + benefits_cost) AS average_gross_cost, SUM(salary + payroll_tax + benefits_cost) AS sum_gross_cost, COUNT(*) as count FROM employees WHERE salary + payroll_tax + benefits_cost > 0 AND country = 'USA' GROUP BY title, country ORDER BY sum_gross_cost HAVING count > 200 ``` In PRQL this would be expressed: ```elm from employees filter country = "USA" # Each line transforms the previous result. derive [ # This adds columns / variables. gross_salary: salary + payroll_tax, gross_cost: gross_salary + benefits_cost # Variables can use other variables. ] filter gross_cost > 0 aggregate by:[title, country] [ # `by` are the columns to group by. average salary, # These are aggregation calcs run on each group. sum salary, average gross_salary, sum gross_salary, average gross_cost, sum_gross_cost: sum gross_cost, ct: count *, ] sort sum_gross_cost filter ct > 200 take 20 ``` Note `derive` variables. ### Syntax example with functions ```sql SELECT date, -- Can't use a `WHERE` clause, as it would affect the row that the `LAG` function referenced. IF(is_valid_price, price_adjusted / LAG(price_adjusted, 1) OVER (PARTITION BY sec_id ORDER BY date) - 1 + dividend_return, NULL) AS return_total, IF(is_valid_price, price_adjusted_usd / LAG(price_adjusted_usd, 1) OVER (PARTITION BY sec_id ORDER BY date) - 1 + dividend_return, NULL) AS return_usd, IF(is_valid_price, price_adjusted / LAG(price_adjusted, 1) OVER (PARTITION BY sec_id ORDER BY date) - 1 + dividend_return, NULL) - interest_rate / 252 AS return_excess, IF(is_valid_price, price_adjusted_usd / LAG(price_adjusted_usd, 1) OVER (PARTITION BY sec_id ORDER BY date) - 1 + dividend_return, NULL) - interest_rate / 252 AS return_usd_excess FROM prices ``` In PRQL this would be expressed ```elm prql version:0.1 db:snowflake # Version number & database name. func lag_day x = ( window x by sec_id sort date lag 1 ) func ret x = x / (x | lag_day) - 1 + dividend_return func excess x = (x - interest_rate) / 252 func if_valid x = is_valid_price ? x : null from prices derive [ return_total: prices_adj | ret | if_valid # `|` can be used rather than newlines. return_usd: prices_usd | ret | if_valid return_excess: return_total | excess return_usd_excess: return_usd | excess ] select [ date, sec_id, return_total, return_usd, return_excess, return_usd_excess, ] ```