SIARD Suite 2.2 - Invocation of Command-Line Tools

The download and upload functionality of SIARD Suite can be called from the command line:

Top of page

SiardFromDb

SiardFromDb is a command-line program which extracts a database to a SIARD archive. One can use SiardFromDb to create

  • a SIARD archive (meta data and primary data) based on the database (option -s), and/or
  • SIARD meta data XML, containing a definition of the database schema (option -e).

Top of page

Invocation

java -cp <siardpath>/lib/siardcmd.jar ch.admin.bar.siard2.cmd.SiardFromDb [-h] |
  [-o] [-v] [-l=<login timeout>] [-q=<query timeout>] [-i=<import meta data>] [-x=<external lob folder>] [-m=<mime type>]
  -j=<JDBC URL> -u=<database user> -p=<database password> -s=<siard file> -e=<export meta data>"
                

Specify <siardpath> as the folder where SIARD Suite is installed. The file siardcmd.jar is in the lib subfolder with its class ch.admin.bar.siard2.cmd.SiardFromDb, whose main() is invoked with java (it is better to use javaw under Windows).

The call syntax is displayed if the -h (help) option is entered on the command line.

Top of page

Arguments

ArgumentMeaning
-ooverwrite output file(s) if they exist
-varchive views as tables
<login timeout>timeout in seconds for login (0 for unlimited)
<query timeout>timeout in seconds for query (0 for unlimited)
<import meta data>name of meta data XML file to be used as a template
<external LOB folder>folder for storing the data of the largest LOB column of database externally (contents will be deleted if they exist!)
<mime type>MIME type of data in the largest LOB column of database (influences file extension of externally stored LOBs)
<JDBC URL>JDBC URL of database to be downloaded
e.g.
for MS Access
jdbc:access:D:\Projekte\SIARD2\JdbcAccess\testfiles\testdb.mdb
for DB/2
jdbc:dbserver.enterag.ch:50000/testdb
for MySQL
jdbc:mysql://dbserver.enterag.ch:3306/testdb
for Oracle
jdbc:oracle:thin:@dbserver.enterag.ch:1521:orcl
for Postgres
jdbc:postgresql://dbserver.enterag.ch:5432/testdb
for SQL Server
jdbc:sqlserver://dbserver.enterag.ch\testdb:1433
(in bash shell the latter must be quoted with duplicated backslash:
"jdbc:sqlserver://dbserver.enterag.ch\\testdb:1433")
<database user>database user
<database password>database password
<siard file>name of .siard file to be written
<export meta data>name of meta data .xml file to be exported

Top of page

Notes

Either the SIARD file or the export meta data file or both must be given.

The SiardFromDb program should be used against a database snapshot which doesn't change during the archiving process.

The option archive views as tables usually leads to data duplication and is therefore not recommended. However, it is useful when the archival user has read access to the views but not to the base tables.

The archiving process either completely succeeds or completely fails.

For large databases, it is recommended to download just the meta data XML beforehand. This gives insight into all the meta data and table sizes, which helps to estimate the download time needed. Furthermore, one should use the -q 0 option for large tables as it is not possible to estimate how many seconds a size query will take.

The conversion of TIMEs and TIMESTAMPs in the database depends on the local time zone. If the time 15:30 is stored in Zurich, the UTC time value 14:30 will be stored in the SIARD file – in winter. To suppress this conversion one must start SiardFromDb with the option

-Duser.timezone=GMT

which tells SIARD to interpret all database times as UTC times.

Top of page

Archiving Database User

It is generally inadvisable to use the database administrator user (DBA, root, dbo, ...) for downloading a SIARD archive. The extent of the SIARD archive is defined by the objects to which the archiving database user has read-access. The global DBA usually has read access to all databases on the system as well as numerous system tables that should not be archived. Therefore it is important that a suitable archiving user be created for the download if one does not exist.

Top of page

SiardToDb

SiardToDb is a command-line program which loads a SIARD archive into a database for research purposes.

Haut de page

Invocation

java -cp <siardpath>/lib/siardcmd.jar ch.admin.bar.siard2.cmd.SiardToDb [-h]
  [-o] [-l=<login timeout>] [-q=<query timeout>]
  -s=<siard file> -j=<JDBC URL> -u=<database user> -p=<database password>
  [<schema> <mappedschema>]*

Specify <siardpath> as the folder where SIARD Suite is installed. The file siardcmd.jar is in the lib subfolder with its class ch.admin.bar.siard2.cmd.SiardToDb, whose main() should be invoked with java (it is better to use javaw under Windows).

The call syntax is displayed if the -h (help) option is entered on the command line.

Top of page

Arguments

ArgumentMeaning
-ooverwrite types and/or tables in the database if they exist
<login timeout>timeout in seconds for login (0 for unlimited)
<query timeout>timeout in seconds for query (0 for unlimited)
<siard file>name of .siard file to be uploaded
<JDBC URL>JDBC URL of the target database
e.g.
for MS Access
jdbc:access:D:\Projekte\SIARD2\JdbcAccess\testfiles\testdb.mdb
for DB/2
jdbc:dbserver.enterag.ch:50000/testdb
for MySQL
jdbc:mysql://dbserver.enterag.ch:3306/testdb
for Oracle
jdbc:oracle:thin:@dbserver.enterag.ch:1521:orcl
for Postgres
jdbc:postgres://dbserver.enterag.ch:5432/testdb
for SQL Server
jdbc:sqlserver://dbserver.enterag.ch\testdb:1433
(in bash shell the latter must be quoted with duplicated backslash:
"jdbc:sqlserver://dbserver.enterag.ch\\testdb:1433")
<database user>database user
<database password>database password
<schema>schema name in SIARD file
<mappedschema>schema name to be used in database

Top of page

Notes

As older databases are not SQL:2008 compliant, a considerable amount of manual configuration effort is unavoidable in preparation for the upload. There are no Schema objects in MS Access. User and Schema objects are not separate in Oracle. Schemas and databases are not distinct in MySQL. Therefore target schemas must be created before upload. Also the database user must have the right to create tables and types in those schemas. Because this is not always easily possible, SIARD schemas are mapped to database schemas according to the list of schema mappings on the command line.

Uploading only creates tables and types and attempts to enable unique and foreign key constraints. No other database objects are created. If the constraints cannot be enabled a warning is issued but the upload is nevertheless considered to be successful. Even without constraints SQL SELECT queries can be issued against the database.

Furthermore, certain sacrifices are made. In MS Access, all tables end up in the same MDB/ACCDB. In Oracle, all names longer than 30 characters are abbreviated. To avoid collisions, table and column names are extended by a counter. (E.g. "A far too long a table name for Oracle" becomes "A far too long a table name01".)

Where the maximum precision and the maximum number of decimals (for instance MS Access) are smaller than required, the values are uploaded with less precision. SIARD helps as much as is possible in the target database system. Consulting the database meta data via SiardGui allows the correct assignment of designations and values.

The conversion of TIMEs and TIMESTAMPs in the database depends on the local time zone. The UTC time 14:30 in the SIARD file is uploaded in Zurich as the local time 15:30 to the database – in winter. To suppress this conversion one must start SiardToDb with the option

-Duser.timezone=GMT

which tells SIARD to interpret all database times as UTC times.