Sunday, February 16, 2014

EXCEL: Lookups, Charts, Statistics, and Pivot Tables

Geek School 5

Having reviewed basic functions, cell references, and date and time functions, we now dive into some of the more advanced features of Microsoft Excel. We present methods to solve classic problems in finance, sales reports, shipping costs, and statistics.
These functions are important to business, students, and those who are just want to learn more.


Here is an example to illustrate vertical lookup (VLOOKUP) and horizontal lookup (HLOOKUP) functions. These functions are used to translate a number or other value into something which is understandable. For example, you can use VLOOKUP to take a part number and return the item description.
To investigate this, let’s go back to our “Decision Maker” spreadsheet in Part 4, where Jane is trying to decide what to wear to school. She is no longer interested in what she wears, since she has landed a new boyfriend, so she will now wear random outfits and shoes.
In Jane’s spreadsheet, she lists outfits in vertical columns and shoes, horizontal columns.
She opens the spreadsheet and the function RANDBETWEEN(1,3) generates a number between or equal to 1 and 3 corresponding to the 3 types of outfits she can wear.
She uses the function RANDBETWEEN(1,5) to pick among five types of shoes.
Since Jane cannot wear a number we needs to convert this to a name, so we use lookup functions.
We use the VLOOKUP function to translate the outfit number to outfit name. HLOOKUP translates from shoe number to the various types of shoes in the row.
The spreadsheet works like this for outfits:
Excel picks a random number from 1 to 3, since she has three outfit options.
Next the formula translates the number to text using =VLOOKUP(B11,A2:B4,2) which uses random number the value from B11 to look in the range A2:B4. It then gives the result (C11) from the data listed in the second column.
We use the same technique to pick shoes, except this time we use VOOKUP instead of HLOOKUP.

Example: Basic Statistics

Almost everyone knows one formula from statistics – average – but there is another statistic that is important for business: standard deviation.
For example, many a person who has gone to college has agonized over their SAT score. They might want to know how they rank compared to other students. Universities want to know this too because many universities, especially prestigious ones, turn down students with low SAT scores.
So how would we, or a university measure and interpret SAT scores? Below are SAT scores for 5 students ranging from 1,870 to 2,230.
The important numbers to understand are:
Average – Average is also referred to as the “mean”.
Standard Deviation (STD or σ) – This number shows how widely dispersed a set of numbers are. If the standard deviation is large, then the numbers are far apart and if it is zero, all the numbers are the same. You could say that the standard deviation is the average difference between the average value and the observed value, i.e. 1,998 and each SAT score. Please note, it is common to abbreviate standard deviation using the Greek symbol sigma “σ”.
Percentile Rank – When a student receives a high score, they can brag that they are in the top 99% percentile or something like that. “Percentile rank” means the percentage of scores are lower than one particular score.
Standard deviation and probability are closely-linked. You can say that for each standard deviation, the probability or likelihood that that number is inside that number of standard deviations is:
STDPercentage of scoresRange of SAT scores
As you can see, the chance that any SAT score is outside 3 STDs is practically zero, because 99.73% of the scores are within 3 STDs.
Now let’s look at the spreadsheet again and explain how it works.
Now we explain the formulas:
The average of all the scores over the range B2:B6. Specifically, the sum of all the scores divided by the number of people who took the test.
The standard deviation over the range B2:B6. The “.P” means STDEV.P is used over all the scores, i.e., the entire population and not just a subset.
This calculates the cumulative percentage over the range B2:B6 based upon the SAT score, in this case B2. For example, 83% of the scores are below Walker’s score.

Graphing the Results

Putting the results in a graph makes it easier to understand the results, plus you can show it in a presentation to make your point more clearly.
Students are on the horizontal axis and their SAT scores are shown as a blue bar graph on a scale (vertical axis) from 1,600 to 2,300.
The percentile ranking is the right-hand vertical axis from 0% to 90%, and is represented by the grey line.

How to Create a Chart

Creating a chart is a topic unto itself, however we will explain briefly how the above chart was created.
First, select the range of cells to be in the chart. In this case A2 to C6 because we want the numbers as well as the student’s names.
From the “Insert” menu select “Charts” -> “Recommended Charts”:
The computer recommends a “Clustered-Column, Secondary Axis” chart. The “Secondary Axis” part means it draws two vertical axes. In this case, this chart is the one we want. We don’t have to do anything else.
You can use move the chart around and resize it until you have it as the size and in the position you want. Once you’re satisfied you can save the chart in the spreadsheet.
If you right-click the chart, then “Select Data”, it shows you what data is selected for the range.
The “Recommended Charts” feature usually bails you out from having to deal with such complicated details as determining what data to include, how to assign labels, and how to assign the left and right vertical axes.
In the “Select Data Source” dialog, click “score” under “Legend Entries (Series)” and press “Edit”, and change it to say “Score”.
Then change series 2 (“percentile”) to “Percentile”.
Return to your chart and click on the “Chart Title” and change it to “SAT Scores”. Now we have a complete chart. It has two horizontal axes: one for SAT score (blue) and one for cumulative percentage (orange).

Example: The Transportation Problem

The transportation problem is a classic example of a type of mathematics called “linear programming”. This lets you maximize or minimize a value subject to certain constraints. It has many applications to wide array of business problems, so it is useful to learn how it works.
Before we get started with this example we have to enable the “Excel Solver”.

Enable Solver Add-In

Select “File” -> “Options” -> “Add-ins”. At the bottom of the add-ins options, click the “Go” button next to “Manage: Excel Add-ins”.
On the resulting menu, click the check box to enable “Solver Add-in” and click “OK”.

Example: Calculate the Lowest iPad Shipping Costs

Suppose we are shipping iPads and we are trying to fill our distribution centers using the lowest transportation costs possible. We have an agreement with a trucking and airline company to ship iPads from Shanghai, Beijing, and Hong Kong to the distributions centers shown below.
The price to ship each iPad is the distance from the factory to the distribution center to the plant divided by 20,000 kilometers. For example, it is 8,024 km from Shanghai to Melbourne which is 8,024/20,000 or $.40 per iPad.
The question is how do we ship all these iPads from these three plants to these four destinations at the lowest possible cost?
As you can imagine, figuring out this could be very difficult without some formula and tool. In this case we have to ship 462,000 (F12) total iPads. The plants have a limited capacity of 500,250 (G12) units.
In the spreadsheet, so that you can see how it works, we have typed 1 into cell B10 meaning we want to ship 1 iPad from Shanghai to Melbourne. Since transportation costs along that route are $0.40 per iPad the total cost (B17) is $0.40.
The number was calculated using the function =SUMPRODUCT(costs,shipped) “costs” are the ranges B3:E5.
And “shipped” are the range B9:E11:
SUMPRODUCT multiplies “costs” times the range “shipped” (B14). That is called “matrix multiplication”.
In order for SUMPRODUCT to work properly, the two matrices – costs and shipped – have to be the same size. You can get around this limitation by making extra costs and shipping columns and rows with zero value so that the arrays are the same size and there is no impact on the total costs.

Using the Solver

If all we had to do was multiply the matrices “costs” times “shipped” that would not be too complicated, but we have to deal with constraints there as well.
We have to ship what each distribution center requires. We put that constant into the solver like this: $B$12:$E$12 >= $B$13:$E$13. This mean the sum of what is shipped, i.e., the totals in cells $B$12:$E$12, must be greater than or equal to what each distribution center requires ($B$13:$E$13).
We cannot ship more than we produce. We write that constraints like this: $F$9:$F$11 <= $G$9:$G$11. Put another way, what we ship from each plant $F$9:$F$11 cannot exceed (must be less than or equal to) the capacity of each plant: $G$9:$G$11.
Now go to the “Data” menu and press the “Solver” button. If the “Solver” button is not there, you need to enable the Solver add-in.
Type in the two constraints detailed earlier and select the “Shipments” range, which is the range of numbers that we want Excel to calculate. Also pick the default algorithm “Simplex LP” and indicate that we want to “minimize” the cell B15 (“total shipping costs”), where it says “Set Objective”.
Press “Solve” and Excel saves the results into the spreadsheet, which is what we want. You can also save this so you can play around with other scenarios.
If the computer says it cannot find a solution, then you have done something that is not logical, for example, you may have requested more iPads than the plants can produce.
Here Excel is saying that it found a solution. Press “OK” to keep the solution and return to the spreadsheet.

Example: Net Present Value

How does a company decide whether to invest in a new project? If the “net present value” (NPV) is positive, they will invest in it. This is a standard approach taken by most financial analysts.
For example, suppose the Codelco mining company wants to expand the Andinas copper mine. The standard approach to determine whether to move ahead with a project is to calculate the net present value. If the NPV is bigger than zero, then the project will be profitable given two inputs (1) time and (2) cost of capital.
In plain English, cost of capital means how much would that money would earn if they just left it in the bank. You use the cost of capital to discount cash values to present value, in other words $100 in 5 years might be $80 today.
In the first year, $45 million is set aside as capital to finance the project. The accountants have determine that their cost of capital is 6%.
As they start mining, the cash starts coming in as the company finds and sells the copper they produce. Obviously, the more they mine, the more money they make and their forecast shows their cash flow increasing until it reaches $9 million per year.
After 13 years, the NPV is $3,945,074 USD, so the project will be profitable. According to Financial analysts the “pay-back period” is 13 years.

Creating a Pivot Table

A “pivot table” is basically a report. We call them pivot tables because you can easily switch them one type of report to another without having to make an entire new report. So theypivot in place. Let’s show a basic example that teaches the basic concepts.

Example: Sales Reports

Sales people are very competitive (that’s part of being a salesman) so they naturally want to know how they fare against one another at the end of the quarter and end of the year, plus how much their commissions will be.
Suppose we have three sales people – Carlos, Fred, and Julie – all selling petroleum. Their sales in dollars per fiscal quarter for the year 2014 is shown in the spreadsheet below.
To generate these reports, we create a pivot table:
Select “Insert ->Pivot Table, it is on the left-hand side of the toolbar:
Select the all the rows and columns (including the salesman name) as shown below:
The pivot table dialog box appears to the right-hand side of the spreadsheet.
If we click all four fields in the pivot table dialog box (Quarter, Year, Sales, and Salesperson) Excel adds a report to the spreadsheet that makes no sense, but why?
As you can see, we have selected all four fields to add to the report. Excel’s default behavior is to group rows by text fields and then sum all the rest of the rows.
Here it gives us the sum of the year 2014 + 2014 + 2014 + 2014 = 24,168, which is nonsense. Also it gave is the sum of the quarters 1 + 2 + 3 + 4 = 10 * 3 =3 0. We don’t need this information so we deselect these fields to remove them from our pivot table.
“Sum of Sales” (total sales) is pertinent however, so we’ll fix that.

Example: Sales by Salesman

You can edit “Sum of Sales” that to say “Total Sales”, which is clearer. Also, you can format the cells as currency just like you would format any other cells. First click on “Sum of Sales” and select “Value Field Settings”.
On the resulting dialog, we change the name to “Total Sales” then click “Number Format” and change it to “Currency”.
You can then see your handiwork in the pivot table:

Example: Sales by Salesman and Quarter

Now let’s add subtotals for each quarter. To add subtotals just left-click on the “Quarter” field and hold and drag it to the “rows” section. You can see the result on the screenshot below:
While we’re at it, let’s remove the “Sum of Quarter” values. Simply click on the arrow and click “Remove Field”. In the screenshot, you can now see we’ve added the “Quarter” rows, which breaks down each salesperson’s sales by quarter.
With these skills fresh in mind, you can now create pivot tables from your own data!


Wrapping up, we have shown you some of the features of Microsoft Excel’s formulas and functions that you can apply Microsoft Excel to your business, academic, or other needs.
As you have seen, Microsoft Excel is an enormous product with so many features that most people, even advanced users, do not know all of them. Some people might say that makes it complicated; we feel it’s more comprehensive.
Hopefully, by presenting you lots of real-life examples, we have demonstrated not only the functions available in Microsoft Excel but have taught you something about statistics, linear programming, creating charts, using random numbers, and other ideas that you can now adopt and use in your school or where you work.
Remember, if you want to go back and take the class again, you can start fresh with Lesson 1!