Data Analysis Using SQL and Excel
Format: PDF / Kindle (mobi) / ePub
Useful business analysis requires you to effectively transform data into actionable information. This book helps you use SQL and Excel to extract business information from relational databases and use that data to define business dimensions, store transactions about customers, produce results, and more. Each chapter explains when and why to perform a particular type of business analysis in order to obtain useful results, how to design and perform the analysis using SQL and Excel, and what the results should look like.
than once in a dataflow, there is a separate READ for each occurrence. OUTPUT: Outputting a Table (or Chart) The OUTPUT operator creates desired output, such as a table in a row-column format or some sort of chart based on the data. The OUTPUT operator does not have any outputs, but accepts inputs. It also accepts parameters describing the type of output. SELECT: Selecting Various Columns in the Table The SELECT operator chooses one or more columns from the input and passes them to the output.
effects. 99513c02.qxd:WileyRed 8/27/07 Chapter 2 11:41 AM ■ Page 55 What’s In a Table? Getting Started with Data Exploration 80,000 $8,000,000 70,000 $7,000,000 60,000 $6,000,000 50,000 $5,000,000 40,000 $4,000,000 30,000 $3,000,000 20,000 $2,000,000 10,000 $1,000,000 0 Revenue Number of Orders Number of Orders and Revenue — Overlapping on Two Axes $0 ?? AE DB MC OC VI Figure 2-5: Showing the number of orders and revenue requires using two axes. The first step
is opened, though, there are only two doors remaining, so the probability of either door having the prize is 50%. Because the probabilities are the same, switching does not make a difference. It is equally likely that the prize is behind either door. Although an appealing and popular analysis, this is not correct for a subtle reason that involves a distinction similar to the distinction between confidence and probability: Just because there are two doors does not mean that the probabilities are
the z-score is ($84.48 – $73.98)/$1.48, which comes to 7.8 standard deviations away. That is a long way away, and it is very, very, very unlikely that the average order size for California is due to nothing more than chance. T I P The z-score measures how far away an observed value is from the mean, in units of standard deviations. It is the difference divided by the standard deviation. The z-score can be turned into a probability using the Excel formula 2*MIN(1-NORMSDIST(z-score),
Page 143 Where Is It All Happening? Location, Location, Location databases that use special-purpose data structures to store geographic information and make such queries much more feasible; however, these are not part of standard SQL. Finding Nearest Zip Code in Excel This section does a very similar calculation in Excel, finding the nearest zip code to a given zip code. The Excel spreadsheet consists of the following areas: ■■ The input area is for typing in a zip code. ■■ The output area