SQL Cheat Sheet: Query By Example
Copyright 2005 Paragon Corporation   ( July 02, 2005)

Why Use SQL Over Procedural Code?

Structured Query Language (SQL) is a set-based language, as opposed to a procedural language. It is the de facto language of relational databases.

The difference between the two is that in a set-based language you define what set of data you want (or want to operate on) and the atomic operation to apply to each element of that set. You leave it to the database process to decide how best to collect the data and apply your operations. In a procedural language, you map out step by step, loop by loop, how to collect and update the data.

There are two main reasons why SQL is often better to use than procedural code.

  • It is often much shorter to write: you can do an update or summary procedure in one line of SQL that would take several lines of procedural code.
  • For set-based problems, SQL is usually much faster in both processor and I/O terms, because all the underlying looping is delegated to a database server process that does it at a very low level, uses I/O and processor more efficiently, and knows the current state of the data (for example, what other processes are asking for the data).

Example SQL vs. Procedural

If you were to update, say, the salesperson of all customers in a particular region, the procedural way would look something like this:

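' Assumes rs is an open, updatable ADO recordset over the customers table
Do Until rs.EOF
    If rs("state") = "NH" Then
        rs("salesperson") = "Mike"
        rs.Update   ' save the change to the current row
    End If
    rs.MoveNext     ' advance to the next row
Loop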


The SQL way would be:
UPDATE customers SET salesperson = 'Mike' WHERE state = 'NH'

If you had, say, 2 or 3 tables to check, your procedural code quickly becomes difficult to manage as you pile on nested loop after nested loop.
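To make the comparison concrete, here is a minimal sketch in Python using the standard sqlite3 module. It applies the same change twice, once row by row and once with a single UPDATE, and checks that both give the same result. The sample rows and the helper names (make_db, update_procedural, update_set_based, snapshot) are made up for illustration.

```python
import sqlite3

# Hypothetical sample data; column names follow the example above.
ROWS = [(1, "Acme", "NH", None), (2, "Bolt", "MA", None), (3, "Cog", "NH", None)]

def make_db():
    """Create an in-memory database holding a small customers table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers ("
        "customer_id INTEGER PRIMARY KEY, customer_name TEXT, "
        "state TEXT, salesperson TEXT)"
    )
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?, ?)", ROWS)
    return conn

def update_procedural(conn):
    # Fetch every row, test it in application code, update one row at a time.
    rows = conn.execute("SELECT customer_id, state FROM customers").fetchall()
    for customer_id, state in rows:
        if state == "NH":
            conn.execute(
                "UPDATE customers SET salesperson = ? WHERE customer_id = ?",
                ("Mike", customer_id),
            )

def update_set_based(conn):
    # One statement: the database decides how to find and update the rows.
    conn.execute("UPDATE customers SET salesperson = 'Mike' WHERE state = 'NH'")

def snapshot(conn):
    return conn.execute(
        "SELECT customer_id, salesperson FROM customers ORDER BY customer_id"
    ).fetchall()

a, b = make_db(), make_db()
update_procedural(a)
update_set_based(b)
print(snapshot(a) == snapshot(b))
```

Both connections end up with the same rows; the set-based version simply leaves the looping to the database.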

In this article we provide some common data questions and processes that SQL is well suited for, along with SQL solutions to these tasks. Most of these examples are fairly standard ANSI SQL, so they should work with little change on most relational databases, such as IBM DB2, PostgreSQL, MySQL, Microsoft SQL Server, Oracle, Microsoft Access and SQLite. Some examples involving subselects, complex joins, or the more complex updates involving 2 or more tables may not work in databases with less complete SQL support, such as MySQL, MS Access or SQLite. These examples are most useful for people already familiar with SQL. We will not go into any detail about how or why they work, but leave that to the reader as an intellectual exercise.

List all records from one table that are in another table

What customers have bought from us?

SELECT DISTINCT customers.customer_id, customers.customer_name
FROM customers INNER JOIN orders ON customers.customer_id = orders.customer_id


What items are in one table that are not in another table?

Example: What customers have never ordered anything from us?

SELECT customers.* FROM customers LEFT JOIN orders ON customers.customer_id = orders.customer_id WHERE orders.customer_id IS NULL

More advanced example using a complex join: Which customers have not ordered anything from us in the year 2004? This one may not work in some less capable relational databases (you may have to use an IN clause instead).
SELECT customers.* FROM customers LEFT JOIN orders ON (customers.customer_id = orders.customer_id AND year(orders.order_date) = 2004) WHERE orders.order_id IS NULL

Please note that year() is not an ANSI SQL function and that many databases do not support it, but have alternative ways of doing the same thing.
  • SQL Server, MS Access, MySQL: year(orders.order_date)
  • PostgreSQL: date_part('year', orders.order_date)
  • SQLite: substr(orders.order_date,1,4), if you store the date in the form YYYY-MM-DD (this returns text, so compare it with '2004')
  • Oracle: EXTRACT(YEAR FROM order_date) or to_char(order_date,'YYYY')

Note: You can also do the above with an IN clause, but an IN tends to be slower. Also be aware that NOT IN returns no rows at all if the subquery produces any NULL values.

Same question with an IN clause:
SELECT customers.* FROM customers WHERE customers.customer_id NOT IN(SELECT customer_id FROM orders WHERE year(orders.order_date) = 2004)
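The two forms of the "not in 2004" question can be compared side by side. The following is a minimal sketch in Python with the standard sqlite3 module, using the substr() form of the year test listed above for SQLite. The sample rows, the ids() helper and the variable names are assumptions made for illustration. It also shows how a single NULL customer_id in orders changes the NOT IN result while leaving the LEFT JOIN result alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, customer_name TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER,
                     order_date TEXT);  -- dates stored as YYYY-MM-DD text
INSERT INTO customers VALUES (1, 'Acme'), (2, 'Bolt'), (3, 'Cog');
INSERT INTO orders VALUES (10, 1, '2004-03-15'),
                          (11, 2, '2003-11-02'),
                          (12, 1, '2005-01-20');
""")

# Outer join form: keep customers that have no matching 2004 order.
LEFT_JOIN_SQL = """
SELECT customers.customer_id
FROM customers LEFT JOIN orders
  ON (customers.customer_id = orders.customer_id
      AND substr(orders.order_date, 1, 4) = '2004')
WHERE orders.order_id IS NULL
ORDER BY customers.customer_id
"""

# Subquery form of the same question.
NOT_IN_SQL = """
SELECT customers.customer_id
FROM customers
WHERE customers.customer_id NOT IN (
    SELECT customer_id FROM orders
    WHERE substr(orders.order_date, 1, 4) = '2004')
ORDER BY customers.customer_id
"""

def ids(sql):
    """Run a query and return the first column as a list."""
    return [row[0] for row in conn.execute(sql)]

left_join_ids = ids(LEFT_JOIN_SQL)
not_in_ids = ids(NOT_IN_SQL)

# Add a 2004 order whose customer_id is NULL and run both queries again.
conn.execute("INSERT INTO orders VALUES (13, NULL, '2004-07-01')")
left_join_after = ids(LEFT_JOIN_SQL)
not_in_after = ids(NOT_IN_SQL)

print(left_join_ids, not_in_ids, left_join_after, not_in_after)
```

With clean data both queries agree. Once the subquery can return a NULL, the NOT IN form returns nothing, which is one more reason to prefer the join when the column is nullable.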

Fun with Statistics - Aggregates

How many customers do we have in Massachusetts and California?
SELECT customer_state As state, COUNT(customer_id) As total FROM customers WHERE customer_state IN('MA', 'CA') GROUP BY customer_state

In which states do we have more than 5 customers?
SELECT customer_state, COUNT(customer_id) As total
FROM customers
GROUP BY customer_state
HAVING COUNT(customer_id) > 5


How many states do we have customers in?
SELECT COUNT(DISTINCT customer_state) AS total
FROM customers

Note: the above does not work in Microsoft Access or in older versions of SQLite, which do not support COUNT(DISTINCT ..)

Alternative but slower approach for the above, for databases that don't support COUNT(DISTINCT ..) but do support derived tables (many databases require an alias on the derived table):
SELECT COUNT(customer_state) AS total FROM (SELECT DISTINCT customer_state FROM customers) AS states

List customers that have placed more than 5 orders, in descending order of orders placed
SELECT customers.customer_id, customers.customer_name, COUNT(orders.order_id) AS total
FROM customers INNER JOIN orders ON customers.customer_id = orders.customer_id
GROUP BY customers.customer_id, customers.customer_name
HAVING COUNT(orders.order_id) > 5
ORDER BY COUNT(orders.order_id) DESC
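The aggregate queries in this section can be tried out on a tiny data set. This is a minimal sketch in Python with the standard sqlite3 module; the four sample customers are invented, and the HAVING threshold is lowered from 5 to 1 so that such a small table still returns a row. It also runs the two ways of counting distinct states so they can be compared.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY,
                        customer_name TEXT, customer_state TEXT);
INSERT INTO customers VALUES (1, 'Acme', 'MA'), (2, 'Bolt', 'MA'),
                             (3, 'Cog', 'CA'), (4, 'Dyn', 'NH');
""")

# Customers per state, restricted to Massachusetts and California.
per_state = conn.execute("""
    SELECT customer_state AS state, COUNT(customer_id) AS total
    FROM customers
    WHERE customer_state IN ('MA', 'CA')
    GROUP BY customer_state
    ORDER BY customer_state
""").fetchall()

# States with more than N customers; N is 1 here only because the table is tiny.
busy_states = conn.execute("""
    SELECT customer_state, COUNT(customer_id) AS total
    FROM customers
    GROUP BY customer_state
    HAVING COUNT(customer_id) > ?
""", (1,)).fetchall()

# Number of distinct states, counted directly...
distinct_direct = conn.execute(
    "SELECT COUNT(DISTINCT customer_state) AS total FROM customers"
).fetchone()[0]

# ...and through a derived table, for engines without COUNT(DISTINCT ..).
distinct_derived = conn.execute("""
    SELECT COUNT(customer_state) AS total
    FROM (SELECT DISTINCT customer_state FROM customers) AS states
""").fetchone()[0]

print(per_state, busy_states, distinct_direct, distinct_derived)
```

WHERE filters rows before they are grouped, while HAVING filters the groups afterwards, which is why the count condition has to go in HAVING.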

How do you insert records in a table?

Value Insert

INSERT INTO customers(customer_id, customer_name)
VALUES('12345', 'GIS Experts')


Copy data from one table to another table
INSERT INTO customers(customer_id, customer_name)
SELECT cus_key, cus_name
FROM jimmyscustomers WHERE cus_name LIKE 'B%'


Creating a new table with a bulk insert from another table

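SELECT * INTO archivecustomers
FROM jimmyscustomers WHERE active = 0

Please note that SELECT ... INTO is supported by SQL Server, MS Access and PostgreSQL. Oracle, MySQL and SQLite use CREATE TABLE archivecustomers AS SELECT ... instead.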
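The insert patterns can be exercised together. Below is a minimal sketch in Python with the standard sqlite3 module; the sample rows are invented, and the cus_key, cus_name and active columns of jimmyscustomers are taken from the queries above. SQLite does not accept SELECT ... INTO, so the sketch creates the archive table with CREATE TABLE ... AS SELECT, which serves the same purpose there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id TEXT PRIMARY KEY, customer_name TEXT);
CREATE TABLE jimmyscustomers (cus_key TEXT PRIMARY KEY, cus_name TEXT,
                              active INTEGER);

-- Value insert
INSERT INTO customers (customer_id, customer_name)
VALUES ('12345', 'GIS Experts');

-- Hypothetical source rows
INSERT INTO jimmyscustomers VALUES ('J1', 'Baker Co', 1),
                                   ('J2', 'Bright Ltd', 0),
                                   ('J3', 'Acorn Inc', 0);
""")

# Copy rows from one table to another; the column lists line up by position.
conn.execute("""
    INSERT INTO customers (customer_id, customer_name)
    SELECT cus_key, cus_name
    FROM jimmyscustomers WHERE cus_name LIKE 'B%'
""")

# Create a new table from a query (the SQLite spelling of SELECT ... INTO).
conn.execute("""
    CREATE TABLE archivecustomers AS
    SELECT * FROM jimmyscustomers WHERE active = 0
""")

copied = [r[0] for r in conn.execute(
    "SELECT customer_name FROM customers ORDER BY customer_name")]
archived = [r[0] for r in conn.execute(
    "SELECT cus_name FROM archivecustomers ORDER BY cus_name")]

print(copied, archived)
```

Note that a table created this way copies the columns and data but usually not the keys or indexes of the source table, so those have to be added separately if you need them.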

How do you update records in a table?

Update from values
UPDATE customers SET customer_salesperson = 'Billy' WHERE customer_state = 'TX'

Update based on information from another table
UPDATE customers SET rating = 'Good' FROM orders WHERE orders.order_date > '2005-01-01' AND orders.customer_id = customers.customer_id
Please note the date format varies depending on the database you are using and what date format you have it set to.

Update based on information from a derived table
UPDATE customers
SET totalorders = ordersummary.total
FROM (SELECT customer_id, COUNT(order_id) AS total
      FROM orders GROUP BY customer_id) AS ordersummary
WHERE customers.customer_id = ordersummary.customer_id
Please note that the update examples involving additional tables do not work in MySQL, MS Access or older versions of SQLite.

MS Access Specific syntax for doing multi-table UPDATE joins
UPDATE customers INNER JOIN orders ON customers.customer_id = orders.customer_id SET customers.rating = 'Good'

MySQL 5 Specific syntax for doing multi-table UPDATE joins
UPDATE customers, orders SET customers.rating = 'Good' WHERE orders.customer_id = customers.customer_id
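
Where none of the multi-table UPDATE syntaxes above is available, a correlated subquery is one alternative worth trying. The following is a minimal sketch in Python with the standard sqlite3 module, not a drop-in replacement for every engine. The sample rows are invented, and the dates are assumed to be stored as YYYY-MM-DD text so that they compare correctly as strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, customer_name TEXT,
                        rating TEXT, totalorders INTEGER);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER,
                     order_date TEXT);  -- assumed YYYY-MM-DD text
INSERT INTO customers (customer_id, customer_name)
VALUES (1, 'Acme'), (2, 'Bolt'), (3, 'Cog');
INSERT INTO orders VALUES (10, 1, '2005-03-01'),
                          (11, 1, '2004-06-01'),
                          (12, 2, '2004-12-31');
""")

# Update based on another table: EXISTS tests for a matching order per customer.
conn.execute("""
    UPDATE customers SET rating = 'Good'
    WHERE EXISTS (SELECT 1 FROM orders
                  WHERE orders.customer_id = customers.customer_id
                    AND orders.order_date > '2005-01-01')
""")

# Update from a summary: the subquery is evaluated once per customer row.
conn.execute("""
    UPDATE customers
    SET totalorders = (SELECT COUNT(order_id) FROM orders
                       WHERE orders.customer_id = customers.customer_id)
""")

results = conn.execute("""
    SELECT customer_id, rating, totalorders
    FROM customers ORDER BY customer_id
""").fetchall()

print(results)
```

One difference from the derived-table version: customers with no orders get a total of 0 here, whereas the UPDATE ... FROM form leaves their totalorders untouched.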

Articles of Interest

  • PostgreSQL 8.3 Cheat Sheet: Summary of new and old PostgreSQL functions and SQL constructs, complete XML query and export, and other new 8.3 features, with examples.
  • SQLite: If you are looking for a free and lite, fairly SQL-92 compliant relational database, look no further. SQLite has ODBC drivers, PHP 5 already comes with an embedded SQLite driver, there are .NET drivers and freely available GUIs, and it will run on most OSes. All the data is stored in a single .db file, so if you have a writeable folder and the drivers, that's all you need. So when you want something lite and don't want to go through a database server install as you would with MySQL, MSSQL, Oracle or PostgreSQL, or you don't have admin access to your webserver and don't need a database group/user permissions infrastructure, this is very useful. It also makes a nice transport mechanism for relational data, as the size of the db file is pretty much only limited to what your OS will allow for a file (or 2 terabytes, whichever is lower).
  • PostgreSQL Date Functions: Summary of PostgreSQL date functions in version 8.0.
  • The Future of SQL by Craig Mullins: Provides a very good definition of what set-based operations are and why SQL is superior to procedural code for these tasks, as well as a brief history of the language.
  • Summarizing data with SQL (Structured Query Language): Article that defines all the components of an SQL statement for grouping data. We wrote it a couple of years ago, but it is still very applicable today.
  • Procedural Versus Declarative Languages: Provides some useful analogies for thinking about the differences between procedural languages and a declarative language such as SQL.
  • PostgreSQL Cheat Sheet: Cheat sheet for common PostgreSQL tasks such as granting user rights, backing up databases, table maintenance, DDL commands (create, alter etc.) and limit queries.
  • MySQL Cheat Sheet: Covers MySQL datatypes, standard queries and functions.
  • Comparison of different SQL implementations: A great summary of the different offerings of Standard SQL, PostgreSQL, DB2, MSSQL, MySQL and Oracle. It demonstrates by clear example how each conforms to or deviates from the ANSI SQL standards in join syntax, limit syntax, views, inserts, booleans and handling of NULLs.


