I'm trying to create a sub-table from another table of all the last name fields sorted A-Z which have a phone number field that isn't null. I could do this pretty easy with SQL, but I have no clue how to go about running a SQL query within Excel. I'm tempted to import the data into postgresql and just query it there, but that seems a little excessive.
For what I'm trying to do, the SQL query SELECT lastname, firstname, phonenumber WHERE phonenumber IS NOT NULL ORDER BY lastname
would do the trick. It seems too simple for it to be something that Excel can't do natively. How can I run a SQL query like this from within Excel?
I found great example where you can treat worksheets just like tables.
video link
I suggest you to have a look at the MySQL csv storage engine which essentially allows you to load any csv file (easily created from excel) into the database, once you have that, you can use any SQL command you want.
It's worth to have a look at it.
tl;dr; Excel does all of this natively - use filters and or tables
(http://office.microsoft.com/en-gb/excel-help/filter-data-in-an-excel-table-HA102840028.aspx)
You can open excel programatically through an oledb connection and execute SQL on the tables within the worksheet.
But you can do everything you are asking to do with no formulas just filters.
have a play around.. some things to note:
DO it with filters unless you are going to do it a lot or you want to automate importing data somewhere or something.. but for completeness:
A c# option:
a handy place to start is to take a look at the schema as there may be more there than you think:
then when you want to query a sheet:
NOTE - Use Tables in excel!:
Excel has "tables" functionality that make data behave more like a table.. this gives you some great benefits but is not going to let you do every type of query.
http://office.microsoft.com/en-gb/excel-help/overview-of-excel-tables-HA010048546.aspx
For tabular data in excel this is my default.. first thing i do is click into the data then select "format as table" from the home section on the ribbon. this gives you filtering, and sorting by default and allows you to access the table and fields by name (e.g. table[fieldname] ) this also allows aggregate functions on columns e.g. max and average
You can experiment with the native DB driver for Excel in language/platform of your choice. In Java world, you can try with http://code.google.com/p/sqlsheet/ which provides a JDBC driver for working with Excel sheets directly. Similarly, you can get drivers for the DB technology for other platforms.
However, I can guarantee that you will soon hit a wall with the number of features these wrapper libraries provide. Better way will be to use Apache HSSF/POI or similar level of library but it will need more coding effort.
If you need to do this once just follow Charles' descriptions, but it is also possible to do this with Excel formulas and helper columns in case you want to make the filter dynamic.
Lets assume you data is on the sheet DataSheet and starts in row 2 of the following columns:
You need two helper columns on this sheet.
=if(A2 = "", 1, 0)
, this is the filter column, corresponding to your where condition=if(D2 <> 1, "", sumifs(D$2:D$1048576, A$2:A$1048576, "<"&A2) + sumifs(D$2:D2, A$2:A2, A2))
, this corresponds to the order byCopy down these formulas as far as your data goes.
On the sheet which should display your result create the following columns.
=match(A2, DataSheet!$E$2:$E$1048576, 0)
, this is the row of the corresponding data=iferror(index(DataSheet!A$2:A$1048576, $B2), "")
, this is the actual data or empty if no data existsCopy down the formulas in B2 and C2 and copy-past column C to D and E.
Might I suggest giving QueryStorm a try - it's a plugin for Excel that makes it quite convenient to use SQL in Excel.
Also, it's freemium. If you don't care about autocomplete, error squigglies etc, you can use it for free. Just download and install, and you have SQL support in Excel.
Disclaimer: I'm the author.