Day 3 - PSQL Commands

In PostgreSQL (often abbreviated as psql), the GROUP BY, HAVING, subquery, and JOIN clauses are essential tools for manipulating and analyzing data. Below is an explanation of each concept along with examples to help you understand how they work. 1. GROUP BY Clause The GROUP BY clause is used to group rows that have the same values into summary rows, like "total count", "average", or "sum". It’s often used with aggregate functions such as COUNT(), SUM(), AVG(), MAX(), and MIN(). Example: Imagine you have a sales table: CREATE TABLE sales ( id SERIAL PRIMARY KEY, product_name VARCHAR(50), amount INT, sale_date DATE ); To find the total sales amount for each product, you can use GROUP BY. SELECT product_name, SUM(amount) AS total_sales FROM sales GROUP BY product_name; In this query: SUM(amount) is an aggregate function calculating the total sales per product. GROUP BY product_name groups the sales by the product. 2. HAVING Clause The HAVING clause is used to filter groups after the GROUP BY operation. It’s like a WHERE clause, but for groups of data. The WHERE clause filters rows before grouping, whereas HAVING filters the results after grouping. Example: To find products where the total sales exceed 500 units: SELECT product_name, SUM(amount) AS total_sales FROM sales GROUP BY product_name HAVING SUM(amount) > 500; In this query: HAVING SUM(amount) > 500 filters the results after the grouping to show only those products with total sales greater than 500. 3. Subquery (Nested Query) A subquery is a query within another query. Subqueries can be used in the SELECT, FROM, WHERE, and HAVING clauses. Example: Suppose you want to find the product with the highest total sales: SELECT product_name, SUM(amount) AS total_sales FROM sales GROUP BY product_name HAVING SUM(amount) = ( SELECT MAX(total_sales) FROM ( SELECT SUM(amount) AS total_sales FROM sales GROUP BY product_name ) AS product_sales ); Explanation: The subquery inside the HAVING clause calculates the maximum sales using a subquery. The outer query compares the total sales of each product against the maximum sales to find the product with the highest total sales. 4. JOIN Clause The JOIN clause combines rows from two or more tables based on a related column between them. Common types of joins are INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL OUTER JOIN. Example with INNER JOIN: Suppose you have two tables: orders and customers. You want to find the list of customers and their orders. CREATE TABLE customers ( customer_id SERIAL PRIMARY KEY, customer_name VARCHAR(50) ); CREATE TABLE orders ( order_id SERIAL PRIMARY KEY, customer_id INT, order_date DATE, amount INT, FOREIGN KEY (customer_id) REFERENCES customers(customer_id) ); To get the customer name and order details, you can use an INNER JOIN: SELECT c.customer_name, o.order_id, o.amount FROM customers c INNER JOIN orders o ON c.customer_id = o.customer_id; Explanation: INNER JOIN returns only the rows where there is a match in both tables (customers and orders). ON c.customer_id = o.customer_id specifies the condition for the join. Example with LEFT JOIN: If you want to get a list of all customers and their orders, including customers with no orders, you can use a LEFT JOIN: SELECT c.customer_name, o.order_id, o.amount FROM customers c LEFT JOIN orders o ON c.customer_id = o.customer_id; Explanation: LEFT JOIN returns all rows from the left table (customers) and the matching rows from the right table (orders). If there’s no match, the result will still show the customer, but with NULL for the order_id and amount. Summary of SQL Clauses: GROUP BY: Groups rows that have the same values into summary rows. HAVING: Filters groups after the GROUP BY operation. Subquery: A query within another query, often used for filtering or complex data extraction. JOIN: Combines rows from two or more tables based on a related column. Example Query Combining All Concepts: Let’s say you want to get the total sales per product and only show products with sales above 1000, ordered by the total sales, but you also want to include customer data by joining the sales and customers tables. SELECT c.customer_name, s.product_name, SUM(s.amount) AS total_sales FROM sales s INNER JOIN customers c ON s.customer_id = c.customer_id GROUP BY c.customer_name, s.product_name HAVING SUM(s.amount) > 1000 ORDER BY total_sales DESC; In this example: INNER JOIN combines the sales and customers tables based on customer_id. GROUP BY groups sales by product_name and customer_name. HAVING filters the result to only show products with sales greater than 1000. ORDER BY sorts the results by total_sales in descending order. This demonstrates

Mar 24, 2025 - 15:52
 0
Day 3 - PSQL Commands

In PostgreSQL (often abbreviated as psql), the GROUP BY, HAVING, subquery, and JOIN clauses are essential tools for manipulating and analyzing data. Below is an explanation of each concept along with examples to help you understand how they work.

1. GROUP BY Clause

The GROUP BY clause is used to group rows that have the same values into summary rows, like "total count", "average", or "sum". It’s often used with aggregate functions such as COUNT(), SUM(), AVG(), MAX(), and MIN().

Example:

Imagine you have a sales table:

CREATE TABLE sales (
    id SERIAL PRIMARY KEY,
    product_name VARCHAR(50),
    amount INT,
    sale_date DATE
);

To find the total sales amount for each product, you can use GROUP BY.

SELECT product_name, SUM(amount) AS total_sales
FROM sales
GROUP BY product_name;

In this query:

  • SUM(amount) is an aggregate function calculating the total sales per product.
  • GROUP BY product_name groups the sales by the product.

2. HAVING Clause

The HAVING clause is used to filter groups after the GROUP BY operation. It’s like a WHERE clause, but for groups of data. The WHERE clause filters rows before grouping, whereas HAVING filters the results after grouping.

Example:

To find products where the total sales exceed 500 units:

SELECT product_name, SUM(amount) AS total_sales
FROM sales
GROUP BY product_name
HAVING SUM(amount) > 500;

In this query:

  • HAVING SUM(amount) > 500 filters the results after the grouping to show only those products with total sales greater than 500.

3. Subquery (Nested Query)

A subquery is a query within another query. Subqueries can be used in the SELECT, FROM, WHERE, and HAVING clauses.

Example:

Suppose you want to find the product with the highest total sales:

SELECT product_name, SUM(amount) AS total_sales
FROM sales
GROUP BY product_name
HAVING SUM(amount) = (
    SELECT MAX(total_sales)
    FROM (
        SELECT SUM(amount) AS total_sales
        FROM sales
        GROUP BY product_name
    ) AS product_sales
);

Explanation:

  • The subquery inside the HAVING clause calculates the maximum sales using a subquery.
  • The outer query compares the total sales of each product against the maximum sales to find the product with the highest total sales.

4. JOIN Clause

The JOIN clause combines rows from two or more tables based on a related column between them. Common types of joins are INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL OUTER JOIN.

Example with INNER JOIN:

Suppose you have two tables: orders and customers. You want to find the list of customers and their orders.

CREATE TABLE customers (
    customer_id SERIAL PRIMARY KEY,
    customer_name VARCHAR(50)
);

CREATE TABLE orders (
    order_id SERIAL PRIMARY KEY,
    customer_id INT,
    order_date DATE,
    amount INT,
    FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);

To get the customer name and order details, you can use an INNER JOIN:

SELECT c.customer_name, o.order_id, o.amount
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id;

Explanation:

  • INNER JOIN returns only the rows where there is a match in both tables (customers and orders).
  • ON c.customer_id = o.customer_id specifies the condition for the join.

Example with LEFT JOIN:

If you want to get a list of all customers and their orders, including customers with no orders, you can use a LEFT JOIN:

SELECT c.customer_name, o.order_id, o.amount
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id;

Explanation:

  • LEFT JOIN returns all rows from the left table (customers) and the matching rows from the right table (orders). If there’s no match, the result will still show the customer, but with NULL for the order_id and amount.

Summary of SQL Clauses:

  • GROUP BY: Groups rows that have the same values into summary rows.
  • HAVING: Filters groups after the GROUP BY operation.
  • Subquery: A query within another query, often used for filtering or complex data extraction.
  • JOIN: Combines rows from two or more tables based on a related column.

Example Query Combining All Concepts:

Let’s say you want to get the total sales per product and only show products with sales above 1000, ordered by the total sales, but you also want to include customer data by joining the sales and customers tables.

SELECT c.customer_name, s.product_name, SUM(s.amount) AS total_sales
FROM sales s
INNER JOIN customers c ON s.customer_id = c.customer_id
GROUP BY c.customer_name, s.product_name
HAVING SUM(s.amount) > 1000
ORDER BY total_sales DESC;

In this example:

  • INNER JOIN combines the sales and customers tables based on customer_id.
  • GROUP BY groups sales by product_name and customer_name.
  • HAVING filters the result to only show products with sales greater than 1000.
  • ORDER BY sorts the results by total_sales in descending order.

This demonstrates the power of combining GROUP BY, HAVING, subqueries, and JOINs to analyze and manipulate data effectively in PostgreSQL!