Getting started with SQL with R
In today’s blog post, I’m showing you how to work with databases from within R. As always, you can find the video version of this blog post on YouTube:
Establish a connection
Let us start by setting up a database connection.
library(odbc)
library(DBI)

con <- dbConnect(
  drv = odbc(),
  driver = "SQLite3 Driver",
  database = ':memory:',
  server = 'localhost',
  uid = 'user_name',
  pwd = 'my_password',
  port = 5432
)
## Error in `dbConnect()`:
## ! ODBC failed with error 00000 from [unixODBC][Driver Manager].
## ✖ Can't open lib 'SQLite3 Driver' : file not found
## ℹ From 'nanodbc/nanodbc.cpp:1150'.
If there is an error, find out if you have the ODBC driver installed on your computer. You can check that with
odbcListDrivers()
## name attribute value
## 1 SQLite Description SQLite ODBC Driver
## 2 SQLite Driver libsqliteodbc.so
## 3 SQLite Setup libsqliteodbc.so
## 4 SQLite UsageCount 1
## 5 SQLite3 Description SQLite3 ODBC Driver
## 6 SQLite3 Driver libsqlite3odbc.so
## 7 SQLite3 Setup libsqlite3odbc.so
## 8 SQLite3 UsageCount 1
## 9 PostgreSQL ANSI Description PostgreSQL ODBC driver (ANSI version)
## 10 PostgreSQL ANSI Driver psqlodbca.so
## 11 PostgreSQL ANSI Setup libodbcpsqlS.so
## 12 PostgreSQL ANSI Debug 0
## 13 PostgreSQL ANSI CommLog 1
## 14 PostgreSQL ANSI UsageCount 1
## 15 PostgreSQL Unicode Description PostgreSQL ODBC driver (Unicode version)
## 16 PostgreSQL Unicode Driver psqlodbcw.so
## 17 PostgreSQL Unicode Setup libodbcpsqlS.so
## 18 PostgreSQL Unicode Debug 0
## 19 PostgreSQL Unicode CommLog 1
## 20 PostgreSQL Unicode UsageCount 1
Aha! The driver name seems to be different. Using the correct driver name, the previous code works nicely.
con <- dbConnect(
  drv = odbc(),
  driver = "SQLite3",
  database = ':memory:',
  server = 'localhost',
  uid = 'user_name',
  pwd = 'my_password',
  port = 5432
)
con
## <OdbcConnection> :memory:
## Database: :memory:
## SQLite Version: 3.36.0
Cool, we have a connection. But what if there wasn’t a SQLite driver installed on my system? In that case, I could install the driver by first checking what the name of the package to install is. For Ubuntu users that would be
apt-cache search sqlite odbc
## fp-units-db - Free Pascal - database-library units dependency package
## fp-units-db-3.2.2 - Free Pascal - database-library units
## libcppdb-dev - SQL Connectivity Library (development files)
## libcppdb-mysql0 - SQL Connectivity Library (MySQL backend)
## libcppdb-odbc0 - SQL Connectivity Library (odbc backend)
## libcppdb-postgresql0 - SQL Connectivity Library (PostgreSQL backend)
## libcppdb-sqlite3-0 - SQL Connectivity Library (sqlite3 backend)
## libcppdb0 - SQL Connectivity Library (core library)
## libghc-hdbc-sqlite3-dev - Sqlite v3 HDBC (Haskell Database Connectivity) Driver for GHC
## libghc-hdbc-sqlite3-doc - Sqlite v3 HDBC (Haskell Database Connectivity) Documentation
## libghc-hdbc-sqlite3-prof - Sqlite v3 HDBC Driver for GHC; profiling libraries
## libpoco-doc - Documentation for POCO - The C++ Portable Components
## libreoffice-base-drivers - Database connectivity drivers for LibreOffice
## libsqliteodbc - ODBC driver for SQLite embedded database
## php-db - Database Abstraction Layer
## ruby-sequel - Simple, flexible, and powerful SQL database access toolkit for Ruby
And from there I could identify that libsqliteodbc is the thing I need to install. On Ubuntu, that can be done with
sudo apt-get install libsqliteodbc
Anyway, enough with the technical stuff.
Use database tables
Let’s check what tables we have in our database right now.
con |>
  dbListTables()
## character(0)
Well, that’s boring. No tables yet. Let’s create one. Let’s use the penguins from the {palmerpenguins} package and throw that into a table called palmerpenguins.
con |>
  dbCreateTable(
    name = 'palmerpenguins',
    fields = palmerpenguins::penguins
  )
Now, we can see that we have a table in our database.
con |>
  dbListTables()
## [1] "palmerpenguins"
And we could fetch the data as well.
con |>
  dbReadTable('palmerpenguins') |>
  tibble::as_tibble() # nicer output with tibbles
## # A tibble: 0 × 8
## # ℹ 8 variables: species <chr>, island <chr>, bill_length_mm <dbl>,
## # bill_depth_mm <dbl>, flipper_length_mm <int>, body_mass_g <int>, sex <chr>,
## # year <int>
Looks like the table is still empty and only the columns (and their type) were initialized. Time to append the actual data.
con |>
  dbAppendTable(
    name = 'palmerpenguins',
    value = palmerpenguins::penguins
  )
Once that is executed, we should now see data inside of our table.
penguins_db <- con |>
  dbReadTable('palmerpenguins') |>
  tibble::as_tibble()

penguins_db
## # A tibble: 344 × 8
## species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
## <chr> <chr> <dbl> <dbl> <int> <int>
## 1 Adelie Torgersen 39.1 18.7 181 3750
## 2 Adelie Torgersen 39.5 17.4 186 3800
## 3 Adelie Torgersen 40.3 18 195 3250
## 4 Adelie Torgersen NA NA NA NA
## 5 Adelie Torgersen 36.7 19.3 193 3450
## 6 Adelie Torgersen 39.3 20.6 190 3650
## 7 Adelie Torgersen 38.9 17.8 181 3625
## 8 Adelie Torgersen 39.2 19.6 195 4675
## 9 Adelie Torgersen 34.1 18.1 193 3475
## 10 Adelie Torgersen 42 20.2 190 4250
## # ℹ 334 more rows
## # ℹ 2 more variables: sex <chr>, year <int>
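As an aside, if you don’t need the separate create-then-append steps, DBI also offers dbWriteTable(), which creates the table and writes the data in one go. Here’s a minimal sketch (the table name palmerpenguins_onestep is just an illustrative name I made up):

```r
# Create a table and write the data in a single step.
# `overwrite = TRUE` replaces the table if it already exists.
con |>
  dbWriteTable(
    name = 'palmerpenguins_onestep',
    value = palmerpenguins::penguins,
    overwrite = TRUE
  )

con |> dbListTables()
```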
Excellent. We have our data right here. And once we have the full data locally, we can operate on it like we normally would, e.g.
penguins_db |>
  dplyr::filter(!is.na(sex), bill_length_mm > 50) |>
  dplyr::select(bill_length_mm, sex)
## # A tibble: 52 × 2
## bill_length_mm sex
## <dbl> <chr>
## 1 50.2 male
## 2 59.6 male
## 3 50.5 male
## 4 50.5 male
## 5 50.1 male
## 6 50.4 male
## 7 54.3 male
## 8 50.7 male
## 9 51.1 male
## 10 52.5 male
## # ℹ 42 more rows
Get data with SQL
But what if we wanted the database to return only the filtered data? That way, we don’t have to download loads and loads of data from the database only to filter for and select specific data locally.
Well, we could just use a simple SQL statement to get the data that we want. The key function to make that possible is the dbGetQuery() function.
con |>
  dbGetQuery(
    'SELECT bill_length_mm, sex
    FROM palmerpenguins
    WHERE bill_length_mm > 50 and sex IS NOT NULL;'
  ) |>
  tibble::as_tibble()
## # A tibble: 52 × 2
## bill_length_mm sex
## <dbl> <chr>
## 1 50.2 male
## 2 59.6 male
## 3 50.5 male
## 4 50.5 male
## 5 50.1 male
## 6 50.4 male
## 7 54.3 male
## 8 50.7 male
## 9 51.1 male
## 10 52.5 male
## # ℹ 42 more rows
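When parts of a query come from user input, it’s safer to pass them as parameters instead of pasting them into the SQL string. DBI supports this through the params argument of dbGetQuery(); a sketch of the same query with the threshold supplied as a parameter (the ? placeholder is what SQLite understands; other backends may use a different placeholder syntax):

```r
con |>
  dbGetQuery(
    'SELECT bill_length_mm, sex
    FROM palmerpenguins
    WHERE bill_length_mm > ? AND sex IS NOT NULL;',
    params = list(50)
  ) |>
  tibble::as_tibble()
```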
dbplyr
Another possibility is to use the {dbplyr} package. This package lets you use the syntax from the {dplyr} package and translates that automatically to SQL. You just have to know three particular things to use that:
1. You need to tell {dbplyr} which table you want to use on the database side with dplyr::tbl().
2. You can have a look at the SQL code with dplyr::show_query().
3. If you want to have the full query results locally (and not just a preview), then you need to collect the data using dplyr::collect().
tbl_palmerpenguins <- dplyr::tbl(con, 'palmerpenguins')
extraction <- tbl_palmerpenguins |>
  dplyr::filter(!is.na(sex), bill_length_mm > 50) |>
  dplyr::select(bill_length_mm, sex)
extraction |> dplyr::show_query()
## <SQL>
## SELECT "bill_length_mm", "sex"
## FROM "palmerpenguins"
## WHERE (NOT(("sex" IS NULL))) AND ("bill_length_mm" > 50.0)
extraction |> dplyr::collect()
extraction ## # A tibble: 52 × 2
## bill_length_mm sex
## <dbl> <chr>
## 1 50.2 male
## 2 59.6 male
## 3 50.5 male
## 4 50.5 male
## 5 50.1 male
## 6 50.4 male
## 7 54.3 male
## 8 50.7 male
## 9 51.1 male
## 10 52.5 male
## # ℹ 42 more rows
This also works with mutate() calls that change the returned data on the database side before downloading it.
tbl_palmerpenguins |>
  dplyr::filter(!is.na(sex), bill_length_mm > 50) |>
  dplyr::select(bill_length_mm, sex) |>
  dplyr::mutate(
    sex = if_else(sex == 'male', 'MALE', 'FEMALE')
  ) |>
  dplyr::show_query()
dplyr## <SQL>
## SELECT
## "bill_length_mm",
## CASE WHEN ("sex" = 'male') THEN 'MALE' WHEN NOT ("sex" = 'male') THEN 'FEMALE' END AS "sex"
## FROM "palmerpenguins"
## WHERE (NOT(("sex" IS NULL))) AND ("bill_length_mm" > 50.0)
So in this case you would get
tbl_palmerpenguins |>
  dplyr::filter(!is.na(sex), bill_length_mm > 50) |>
  dplyr::select(bill_length_mm, sex) |>
  dplyr::mutate(
    sex = if_else(sex == 'male', 'MALE', 'FEMALE')
  ) |>
  dplyr::collect()
dplyr## # A tibble: 52 × 2
## bill_length_mm sex
## <dbl> <chr>
## 1 50.2 MALE
## 2 59.6 MALE
## 3 50.5 MALE
## 4 50.5 MALE
## 5 50.1 MALE
## 6 50.4 MALE
## 7 54.3 MALE
## 8 50.7 MALE
## 9 51.1 MALE
## 10 52.5 MALE
## # ℹ 42 more rows
But it is important to note that this does not actually update the underlying data. You see, if you collect the data once more (without the mutate() call), then you get the original data:
tbl_palmerpenguins |>
  dplyr::filter(!is.na(sex), bill_length_mm > 50) |>
  dplyr::select(bill_length_mm, sex) |>
  dplyr::collect()
dplyr## # A tibble: 52 × 2
## bill_length_mm sex
## <dbl> <chr>
## 1 50.2 male
## 2 59.6 male
## 3 50.5 male
## 4 50.5 male
## 5 50.1 male
## 6 50.4 male
## 7 54.3 male
## 8 50.7 male
## 9 51.1 male
## 10 52.5 male
## # ℹ 42 more rows
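By the way, {dplyr} can also go the other direction: dplyr::copy_to() uploads a local data frame into the database and returns a lazy table reference, which is handy for quick experiments. A sketch, assuming the backend supports temporary tables (the name penguins_copy is just illustrative):

```r
penguins_copy <- dplyr::copy_to(
  con,
  palmerpenguins::penguins,
  name = 'penguins_copy',
  temporary = TRUE # dropped when the connection closes
)

# The result behaves like any other lazy {dbplyr} table.
penguins_copy |> dplyr::count(species)
```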
Programmatically assembling SQL queries with {glue}
Another way to assemble SQL queries is using the glue_sql() function from the {glue} package. This can be particularly useful when you are already familiar with SQL and don’t want to learn the {dbplyr} syntax.
For example, imagine you have vectors that describe the particular columns and values you’re interested in. For our penguins, this could be something like this:
selected_tbl <- 'palmerpenguins'
selected_cols <- c('bill_length_mm', 'sex')
selected_species <- c('Adelie', 'Chinstrap')
Then with the glue_sql() command you can easily throw these variables into a SQL query that you write yourself.
glue::glue_sql(
  'SELECT {selected_cols}
  FROM {selected_tbl}
  WHERE species IN ({selected_species})',
  .con = con
)
## <SQL> SELECT 'bill_length_mm'
## FROM 'palmerpenguins'
## WHERE species IN ('Adelie')
## <SQL> SELECT 'sex'
## FROM 'palmerpenguins'
## WHERE species IN ('Chinstrap')
Well, this doesn’t look right. It’s two SQL queries instead of one. But the nice thing about glue_sql() is that you can collapse vectors with the * character.
glue::glue_sql(
  'SELECT {selected_cols*}
  FROM {selected_tbl}
  WHERE species IN ({selected_species*})',
  .con = con
)
## <SQL> SELECT 'bill_length_mm', 'sex'
## FROM 'palmerpenguins'
## WHERE species IN ('Adelie', 'Chinstrap')
But if you know SQL, you know that this code is not quite right. Check out what happens if you run this query.
incorrect_query <- glue::glue_sql(
  'SELECT {selected_cols*}
  FROM {selected_tbl}
  WHERE species IN ({selected_species*})',
  .con = con
)
con |>
  dbGetQuery(incorrect_query) |>
  tibble::as_tibble()
## # A tibble: 220 × 2
## `'bill_length_mm'` `'sex'`
## <chr> <chr>
## 1 bill_length_mm sex
## 2 bill_length_mm sex
## 3 bill_length_mm sex
## 4 bill_length_mm sex
## 5 bill_length_mm sex
## 6 bill_length_mm sex
## 7 bill_length_mm sex
## 8 bill_length_mm sex
## 9 bill_length_mm sex
## 10 bill_length_mm sex
## # ℹ 210 more rows
That’s not quite right, is it? Well, the thing is: In our SQL-query the quotation marks are not correct. Have a look:
incorrect_query
## <SQL> SELECT 'bill_length_mm', 'sex'
## FROM 'palmerpenguins'
## WHERE species IN ('Adelie', 'Chinstrap')
For the things in the SELECT and FROM parts we shouldn’t use single quotes '. In SQLite, these are reserved for literal strings like 'Adelie' and 'Chinstrap'. If you want to use identifiers like column and table names, then you either don’t use any quotation marks or use double quotation marks ". You can tell glue_sql() about the difference by wrapping identifiers into backticks `.
correct_query <- glue::glue_sql(
  'SELECT {`selected_cols`*}
  FROM {`selected_tbl`}
  WHERE species IN ({selected_species*})',
  .con = con
)
correct_query
## <SQL> SELECT "bill_length_mm", "sex"
## FROM "palmerpenguins"
## WHERE species IN ('Adelie', 'Chinstrap')
So, with this query, you can now run the code from before and get the right results.
con |>
  dbGetQuery(correct_query) |>
  tibble::as_tibble()
## # A tibble: 220 × 2
## bill_length_mm sex
## <dbl> <chr>
## 1 39.1 male
## 2 39.5 female
## 3 40.3 female
## 4 NA <NA>
## 5 36.7 female
## 6 39.3 male
## 7 38.9 female
## 8 39.2 male
## 9 34.1 <NA>
## 10 42 <NA>
## # ℹ 210 more rows
Transactions
Now, let me teach you one more thing that’s essential to working with databases: transactions. You often want changes to your database to run transactionally.
This means that if you update multiple rows, you want to be sure that either all rows were updated successfully or no rows were changed at all. That’s what transactions do. Here’s how they work:
First, you begin a transaction:
con |> dbBegin()
Then, you make changes to the database like appending more rows to a table.
con |>
  dbAppendTable(
    'palmerpenguins',
    palmerpenguins::penguins
  )
And now you could see your changes:
con |>
  dbReadTable('palmerpenguins') |>
  tibble::as_tibble()
## # A tibble: 688 × 8
## species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
## <chr> <chr> <dbl> <dbl> <int> <int>
## 1 Adelie Torgersen 39.1 18.7 181 3750
## 2 Adelie Torgersen 39.5 17.4 186 3800
## 3 Adelie Torgersen 40.3 18 195 3250
## 4 Adelie Torgersen NA NA NA NA
## 5 Adelie Torgersen 36.7 19.3 193 3450
## 6 Adelie Torgersen 39.3 20.6 190 3650
## 7 Adelie Torgersen 38.9 17.8 181 3625
## 8 Adelie Torgersen 39.2 19.6 195 4675
## 9 Adelie Torgersen 34.1 18.1 193 3475
## 10 Adelie Torgersen 42 20.2 190 4250
## # ℹ 678 more rows
## # ℹ 2 more variables: sex <chr>, year <int>
See how you have double the amount of rows now? This means you have successfully appended more rows to the database table. But not so fast. You can revert all your changes since starting your transaction with dbBegin() by calling dbRollback().
con |> dbRollback()
And now if you grab the table again, you can see that you’re back where you started.
con |>
  dbReadTable('palmerpenguins') |>
  tibble::as_tibble()
## # A tibble: 344 × 8
## species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
## <chr> <chr> <dbl> <dbl> <int> <int>
## 1 Adelie Torgersen 39.1 18.7 181 3750
## 2 Adelie Torgersen 39.5 17.4 186 3800
## 3 Adelie Torgersen 40.3 18 195 3250
## 4 Adelie Torgersen NA NA NA NA
## 5 Adelie Torgersen 36.7 19.3 193 3450
## 6 Adelie Torgersen 39.3 20.6 190 3650
## 7 Adelie Torgersen 38.9 17.8 181 3625
## 8 Adelie Torgersen 39.2 19.6 195 4675
## 9 Adelie Torgersen 34.1 18.1 193 3475
## 10 Adelie Torgersen 42 20.2 190 4250
## # ℹ 334 more rows
## # ℹ 2 more variables: sex <chr>, year <int>
Of course, there’s also a way to make changes permanent. And that happens with dbCommit().
# Start transaction
con |> dbBegin()
# Make changes
con |>
  dbAppendTable(
    'palmerpenguins',
    palmerpenguins::penguins
  )
# Commit to changes
con |> dbCommit()
# Changes are permanent now.
con |>
  dbReadTable('palmerpenguins') |>
  tibble::as_tibble()
## # A tibble: 688 × 8
## species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
## <chr> <chr> <dbl> <dbl> <int> <int>
## 1 Adelie Torgersen 39.1 18.7 181 3750
## 2 Adelie Torgersen 39.5 17.4 186 3800
## 3 Adelie Torgersen 40.3 18 195 3250
## 4 Adelie Torgersen NA NA NA NA
## 5 Adelie Torgersen 36.7 19.3 193 3450
## 6 Adelie Torgersen 39.3 20.6 190 3650
## 7 Adelie Torgersen 38.9 17.8 181 3625
## 8 Adelie Torgersen 39.2 19.6 195 4675
## 9 Adelie Torgersen 34.1 18.1 193 3475
## 10 Adelie Torgersen 42 20.2 190 4250
## # ℹ 678 more rows
## # ℹ 2 more variables: sex <chr>, year <int>
You might be wondering how this could be useful. So let me show you how.
Imagine that we have a constraint on the species column inside of our data set. We could make that happen by creating a new table palmerpenguins_new with a bit of SQL.
allowed_species <- c('Adelie', 'Chinstrap', 'Gentoo')
query <- glue::glue_sql(
  "CREATE TABLE palmerpenguins_new (
    species TEXT NOT NULL CHECK(species IN ({allowed_species*})),
    island TEXT,
    bill_length_mm REAL,
    bill_depth_mm REAL,
    flipper_length_mm REAL,
    body_mass_g REAL,
    sex TEXT,
    year INTEGER
  );",
  .con = con
)
con |> dbExecute(query)
## [1] 0
Clearly, we could append the original palmerpenguins data set to this table (as it fulfills the constraint).
con |>
  dbAppendTable('palmerpenguins_new', value = palmerpenguins::penguins)
con |>
  dbReadTable('palmerpenguins_new') |>
  tibble::as_tibble()
## # A tibble: 344 × 8
## species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
## <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Adelie Torgersen 39.1 18.7 181 3750
## 2 Adelie Torgersen 39.5 17.4 186 3800
## 3 Adelie Torgersen 40.3 18 195 3250
## 4 Adelie Torgersen NA NA NA NA
## 5 Adelie Torgersen 36.7 19.3 193 3450
## 6 Adelie Torgersen 39.3 20.6 190 3650
## 7 Adelie Torgersen 38.9 17.8 181 3625
## 8 Adelie Torgersen 39.2 19.6 195 4675
## 9 Adelie Torgersen 34.1 18.1 193 3475
## 10 Adelie Torgersen 42 20.2 190 4250
## # ℹ 334 more rows
## # ℹ 2 more variables: sex <chr>, year <int>
But what about a data set that has a row with a wrong species in it? Like this one here:
fake_dat <- dplyr::bind_rows(
  palmerpenguins::penguins[1, ],
  tibble::tibble(species = 'NEW SPECIES')
)

fake_dat
## # A tibble: 2 × 8
## species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
## <chr> <fct> <dbl> <dbl> <int> <int>
## 1 Adelie Torger… 39.1 18.7 181 3750
## 2 NEW SPECIES <NA> NA NA NA NA
## # ℹ 2 more variables: sex <fct>, year <int>
Well, appending that should work for the first row but not for the second one. Let’s see what happens if we try anyway:
# Will work
con |>
  dbAppendTable(
    'palmerpenguins_new',
    value = fake_dat[1, ]
  )
# Will fail
con |>
  dbAppendTable(
    'palmerpenguins_new',
    value = fake_dat[2, ]
  )
## Error in eval(expr, envir, enclos): nanodbc/nanodbc.cpp:1783: 00000
## [SQLite]CHECK constraint failed: species IN ('Adelie', 'Chinstrap', 'Gentoo') (19)
Nice. We get an error because the constraint was not fulfilled for the second row. As we should. But because the first operation was successful, the first (valid) row was still appended. Notice how there are 345 rows in the database table now as opposed to 344 rows in the original data set.
con |>
  dbReadTable('palmerpenguins_new') |>
  tibble::as_tibble()
## # A tibble: 345 × 8
## species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
## <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Adelie Torgersen 39.1 18.7 181 3750
## 2 Adelie Torgersen 39.5 17.4 186 3800
## 3 Adelie Torgersen 40.3 18 195 3250
## 4 Adelie Torgersen NA NA NA NA
## 5 Adelie Torgersen 36.7 19.3 193 3450
## 6 Adelie Torgersen 39.3 20.6 190 3650
## 7 Adelie Torgersen 38.9 17.8 181 3625
## 8 Adelie Torgersen 39.2 19.6 195 4675
## 9 Adelie Torgersen 34.1 18.1 193 3475
## 10 Adelie Torgersen 42 20.2 190 4250
## # ℹ 335 more rows
## # ℹ 2 more variables: sex <chr>, year <int>
That’s unfortunate. We now have to figure out which operations worked and which didn’t. In real-world settings where we often string together multiple operations, this can become tedious quite fast. That’s why it’s common to enforce that either all operations work or everything is reverted to normal.
And that’s where transactions come in. Together with tryCatch(), they are an invaluable tool to ensure database consistency.
con |> dbBegin()
tryCatch({
  # Will work
  con |>
    dbAppendTable(
      'palmerpenguins_new',
      value = fake_dat[1, ]
    )
  # Will fail
  con |>
    dbAppendTable(
      'palmerpenguins_new',
      value = fake_dat[2, ]
    )
  con |> dbCommit() # Commit to changes if no error until here
  cli::cli_alert_success("Data uploaded successfully")
}, error = function(e) {
  con |> dbRollback() # Rollback if there was an error
  cli::cli_abort("Data couldn't be uploaded successfully. There was the following error: {e$message}")
})
## Error in `value[[3L]]()`:
## ! Data couldn't be uploaded successfully. There was the following error:
## nanodbc/nanodbc.cpp:1783: 00000 [SQLite]CHECK constraint failed: species IN
## ('Adelie', 'Chinstrap', 'Gentoo') (19)
Now you’ll see that we are still left with the 345 rows instead of adding yet another row (even though the first operation worked).
con |>
  dbReadTable('palmerpenguins_new') |>
  tibble::as_tibble()
## # A tibble: 345 × 8
## species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
## <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Adelie Torgersen 39.1 18.7 181 3750
## 2 Adelie Torgersen 39.5 17.4 186 3800
## 3 Adelie Torgersen 40.3 18 195 3250
## 4 Adelie Torgersen NA NA NA NA
## 5 Adelie Torgersen 36.7 19.3 193 3450
## 6 Adelie Torgersen 39.3 20.6 190 3650
## 7 Adelie Torgersen 38.9 17.8 181 3625
## 8 Adelie Torgersen 39.2 19.6 195 4675
## 9 Adelie Torgersen 34.1 18.1 193 3475
## 10 Adelie Torgersen 42 20.2 190 4250
## # ℹ 335 more rows
## # ℹ 2 more variables: sex <chr>, year <int>
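If the dbBegin()/dbCommit()/dbRollback() bookkeeping feels verbose, DBI also ships a convenience wrapper, dbWithTransaction(), which begins a transaction, runs your code, commits on success, and rolls back automatically when an error is thrown. A sketch of the same upload using it:

```r
# dbWithTransaction() rolls back automatically when the code errors,
# so the valid first row is not left behind in the table.
dbWithTransaction(con, {
  con |>
    dbAppendTable('palmerpenguins_new', value = fake_dat[1, ])
  con |>
    dbAppendTable('palmerpenguins_new', value = fake_dat[2, ]) # errors
})
```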
And with that we have concluded our little trip into databases with R. Of course there’s lots more to learn about SQL and database interactions from within R. But for now, I think this is a good place to stop.
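One last bit of housekeeping that’s easy to forget: when you’re done with a database, close the connection.

```r
# Release the database connection when you are finished with it.
con |> dbDisconnect()
```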