MCS-274 Lab 1: Basic SQL (Spring 2013)

Due March 12, 2013

Background

In this lab, you will use the Oracle database to try out basic SQL queries on tables I have downloaded from the Minnesota Secretary of State's election results from the November 6, 2012, general election. (In case that web site changes, I've also got local copies of the files I downloaded. The data tables came from Amendment_Results.dsv, Candidates.dsv, Counties.dsv, Parties.dsv, Precincts.dsv, State_House_Results.dsv, State_Senate_Results.dsv, US_House_Results.dsv, US_President_Results.dsv, and US_Senate_Results.dsv. The format is described in file_format.txt.) I closely modeled my database tables on the files.

All the work for this lab can be most easily done on one of the MCS lab Macs running OS X, since we have the Oracle SQL Developer client software installed on those machines. However, you could also download and install this software on your own machine and still connect to our server, thebe.virtual.gac.edu. I don't promise to provide support for the technical difficulties you may encounter on your own machine. Also, there are some firewall hassles to overcome if you ever want to connect to our server from off campus.

You may do this lab work either with a partner or alone. Depending on how many people opt to use their own computers, we may need the majority of you to work in pairs.

I'll demonstrate running SQL Developer in class. In the following instructions, I'll pretend your username is jstudent; substitute your real one. You should create a connection called jstudent-thebe, specify the username jstudent and the password I give you, and set the hostname to thebe.virtual.gac.edu.

Because the password I issue you isn't very secure, you should probably start by changing your password, which you can do with an SQL command like

alter user jstudent identified by myNewSecretPassword;

To find the tables you will be using in SQL Developer, in the lefthand pane expand out Other Users, under it SOS, and under that Tables. The tables are named AMENDMENT_RESULTS, CANDIDATES, COUNTIES, PARTIES, PRECINCTS, STATE_HOUSE_RESULTS, STATE_SENATE_RESULTS, US_HOUSE_RESULTS, US_PRESIDENT_RESULTS, and US_SENATE_RESULTS. In your SQL queries, you can refer to these with names like SOS.CANDIDATES, which tell Oracle to look for the table in the SOS schema. (When we discuss granting of privileges, you will see how I granted you all permission to select data from these tables but not modify them.) You can also type the names in lower case. I am attaching the CREATE TABLE statements for your reference.

Expectations

For this lab, you need not write a true lab report. Instead, just turn in your SQL queries and the results they produced. For each one, give my English statement of the problem, your SQL query, and the resulting table, in that order. If you save out SQL Developer's SQL Worksheet and Script Output tabs, you'll need to interleave these two files in your editing process. Alternatively, you can put the command

set echo on

at the beginning of your worksheet and then your queries will be echoed into the Script Output. If in any case you have reason to doubt that the result is correct, but haven't been able to track down what is wrong with your SQL query or fix it, you should indicate what is wrong with the result.

Be sure to answer each question using a single SQL query that directly produces the desired result. It is not acceptable to issue one query, manually extract some information from the result, and use that to type in a second query, which produces the desired result.

The desired result should also be immediately apparent; what follows are two examples of how you could fail to meet this expectation. Example 1: if I ask you how many of something exists, your output should be a single row showing the number, not a bunch of rows, where the number of rows is the answer. Example 2: if I ask you what is the largest value of something, your output should just be that largest value, not an ordered list from which the largest value can be read off.

Your queries should not produce results with duplicate rows.

Try not to make your queries unnecessarily complex. In particular, don't mention a table if it isn't needed to find the result.

Specific problems

  1. Find the names of candidates who ran for an office ending in '19A'.

  2. Provide an alphabetical list of the parties' full names.

  3. Find the candidates who ran for the office of 'U.S. Senator'. For each, list the candidate's and party's full names. Hint: join tables using the party abbreviation, not the party ID. Later you will see why.

  4. Find a list of candidates' names for those candidates for US office who received a majority of the votes in at least one precinct in 'Nicollet' county.

  5. In which precinct (or precincts) did 'TERRY MORROW' receive his largest vote count? Give your answer as a precinct name (or names, if there is a tie).

  6. In which counties did 'TERRY MORROW' have election results? Give your result as an alphabetic list of county names.

  7. In how many precincts did 'TIM WALZ' have results?

  8. In how many precincts did 'TIM WALZ' receive a majority of the votes?

  9. Find any rows of the candidates table that contain a party ID number not present in the table of parties. For each, show the candidate's name and party ID number.

  10. Find any candidates with a party ID that does not match the one shown in the parties table for the candidate's party abbreviation. For each, show the candidate's name, candidate ID, party ID number, party abbreviation, and the party ID number that is listed in the parties table as corresponding to the abbreviation.

  11. Find any counties where the number of precincts shown in the counties table doesn't match how many are in the precincts table.

  12. Produce a table of state-wide votes for the US Senate race, showing in each row the candidate's name and the corresponding number of votes. The table should be in descending order of votes.

  13. What percentage of 'YES' votes did 'Constitutional Amendment 1' receive in each county? List the county names with their percentages (which should be labeled as such). Give the list in ascending order of percentage. Hint 1: The PERCENTAGE column in the AMENDMENT_RESULTS table is not helpful for this. Hint 2: The TOTAL_VOTES column in the AMENDMENT_RESULTS table is the total number of ballots cast in the same precinct (including those that contained neither a 'YES' nor a 'NO' vote on the amendment). The percentage of 'YES' votes should be calculated relative to the total ballots cast because a Minnesota amendment needs approval by a majority of all ballots cast, not just all votes on the amendment.

Added later: some of you are checking whether you get the same end results (output tables) as other students or student pairs. That strikes me as a smart move. Another alternative would be to see if you get the same results as I do. I'm not convinced that's any better, but for what it's worth, here is my output.