The Excel Pages |
* Page # 3 - Excel as a Database -
(Adapted from http://www.excel-vba.com)
The tips offered below are:
Indispensable
Very useful
Interesting
Excel is not a database application.
Excel is the best data analysis tool on the market,
Excel is the best reporting tool on the market,
Excel is the best front end for any database application and for any ERP system.
Fundamental DATABASE Notions
A DATABASE is a set of data which reading is convivial for a
MACHINE. The data are stored in TABLES
(customers, products, employees, sales, etc...). Each table comprises FIELDS
(names, dates, amounts, quantities, customer numbers, product numbers (SKUs), etc..) and RECORDS.
Picture the fields as columns and the records as lines or rows.
A REPORT is a subset of data which reading is convivial for a
HUMAN (usually a Decision Maker). Reports usually consists in
TABLES and CHARTS.
Very rarely will you be able to organize data so that reading them is convivial to both machines and humans.
That is why database applications are usually very poor reporting applications and Excel is very poor as a database.
In between the report and the database is the ANALYSIS that is performed by an Analyst
and not by a machine.
Nobody analyses all the data. Analysts with the help of the database
administrator (DBA) select, extract and organize a subset of data to be analyzed. This operation is usually performed by
developing a QUERY (using the language SQL).
Once the query has been developed, it can be executed each time the analyst needs the subset of data generated by it.
For reasons of security, the DBA is the ultimate authority on managing the queries.
Because all these queries require a lot of resources from the database (computing time), they are often executed
overnight, the end product being a DATAMART.Datamarts can be huge tables stored in
another database (Oracle, Sybase, SQL Server, Access) or as data cubes (Essbase, PowerPlay, Business Objects, etc..) but
they can also take the form of text files (txt, csv, log) or of Excel files (xls).
(see the page on Excel as OLAP to learn about the difference between data
cubes and databases).
Datamarts may or may not be summaries (sub-totals by date, by costumer, by product, etc..). Datamarts may or may not
include calculated fields (variances, ratios, etc...). As a matter of fact, whenever possible, have the database perform
summaries and repetitive calculations before importing the data within your analysis tool (Excel). Your gain in performance
will be interesting.
At this point, the analyst or the reporter will bring the data within Excel
(see the page on external data).
Once the data is in Excel, you have to create a DATA UNIT to benefit from all
the analysis functionalities of Excel (sorting, filtering, subtotals, outlines, pivot tables, etc..).
The DATA UNIT
VERY IMPORTANT NOTE:
The DATA UNIT is a set of columns and rows that does not include an empty row or an empty column.
The DATA UNIT must be surrounded by empty
rows and columns (row "1" and column "A" of the worksheet are considered preceded by an empty row or
column).
If the DATA UNIT has column's titles it must
be a SINGLE row of title cells. Use "Text Wrap" in Format/Cells/Alignment to write more than line of text in one
cell. If you neer to have two or more title cells, insert an empty row before the last row of titles and hide it.
Select a different format for the title cells as oppose to the other cells of the table so that Excel understands that it
is working with a DATA UNIT. I use bold font in the title cells and I add a border at the bottom of the cells.
Filters
Discover the
AUTOFILTER: you place the cursor within the data unit and go to the menu
"Data/Filter/AutoFilter" and little arrows appear in the title cell of each field. You can select many different
views using the autofilter (on top of all the data and the ten highest values), you can create custom filters (up to two
criteria for each field).
If you want these dropdown menus to
appear only for certain fields, select the desired column or columns before you go to the menu bar to use
"Data/Filter/AutoFilter".
Advanced Filters
When you need to use filters on many fields, or more than two criteria on one or many fields or when you need to use
complex filters, leave the Autofilter aside and use the Advanced Filters.
Advanced filters are one of the most important tool to be used with databases. Here are a few tips on these filters.
Let's say we have a database with the following fields:
First Name | Name | Age | City | Sex | Weight |
The criteria range to filter members by the name of Thomas would be:
Name |
Thomas |
The criteria range to filter the women of Montreal would be:
Sex | City |
W | Montreal |
The criteria range to filter the women of Montreal and of Toronto would be:
Sex | City |
W | Montreal |
W | Toronto |
The criteria range to filter men between ages of 50 and 59 would be:
Sex | Age | Age | |
M | >=50 | <60 |
The criteria range to filter men between ages of 50 and 59 and all women would be:
Sex | Age | Age |
M | >=50 | <60 |
W |
Other Formulas and tips Useful with Excel as a Database
If you want to use
dates and numbers that are in text format for calculations, select any empty cell, and use
Copy and PasteSpecial/Add on the text cells. Your dates and numbers will become usable in calculations.
SUBTOTAL
The function SUBTOTAL allows (among other operations) to count, to sum or to calculate
the average of non-filtered elements of a database. The function requires two arguments, the first is a number between
"1" and "11" that specifies the operation to be performed (for ex. "1" is for average,
"2" is for count and "9" is for sum) and, the second is the range covered by the function.
=SUBTOTAL(9,B2:B45)
This formula will sum the values in range (B2:B45) for records that are not
filtered when a filter is applied on the database.