Sylvain Mareschal, Ph.D.
Bioinformatics engineer
June 17, 2011 at 18:42
ODB
The goal of this package is to access OpenOffice databases via R, to process data stored in it or to automatize their building from scratch or updating. It can be downloaded on the CRAN website, or installed directly in R with install.packages("ODB").



The package provides 5 main functions :
- odb.create, to create a new .odb file from a template.
- odb.open, to produce an "odb" connection to a temporary copy of the .odb file.
- odb.close, to close the connection and update the .odb file.
- odb.read, to import data from the database to R via "SELECT" SQL queries built by the useR.
- odb.write, to update the database via "INSERT" or "CREATE" SQL queries built by the useR.

A few other functions are also provided to manage .odb specifics such as comments on table fields and stored queries. Some wrappers are also provided to insert directly a data.frame in a database table without writing the SQL query, list the table names and fields or export the database in a .sql file.

Other wrappers may be added in future versions to help users not familiar with the SQL language.


Typical use

connection = odb.open("file.odb")
data = odb.read(connection, "SELECT * FROM table WHERE id < 15")
odb.write(connection, "UPDATE table SET field='peach' WHERE id = 5")
odb.close(connection)



Technical considerations

.odb files, as any other OpenDocument files, are ZIP archives containing the HSQL files. To establish the connection, the .odb file is unzipped via the "zip" bash command if available, and the connection is made via the RJDBC interface. The "odb" object produced inherits from the "DBIConnection" class, thus all functions provided in the DBI packages may be used directly on it to manage the database. The odb.read and odb.write functions are only wrappers to such DBI functions, handling frequent issues such as charset or factors considerations.

Notice the database files are copied in a temporary directory, thus any update made to the database is not written in the .odb file until the odb.close call, so simultaneous access to a same database (via R and OpenOffice) should not be considered.