NAME

schemaball - visualize an SQL database schema in a circular composition


SYNOPSIS

  # all parameters read from default configuration file
  schemaball [-debug [-debug] ]

  # read parameters from specific configuration file
  schemaball -conf etc/schemaball.conf 

  # override database connection settings
  schemaball [-database DATABASE] [-host HOST] [-user USER] [-password PASSWORD]

  # dump configuration only
  schemaball -dumpconfig

  # read documentation
  schemaball -man


DESCRIPTION

Schemaball generates high-level database schema diagrams. The schema diagram is formatted along a circular track. Tables are represented by glyphs along the circumference and relationships between tables (constraints) are depicted by lines that join the table glyphs. Schemaball can follow these constraints from table to table and automatically annotate the image to help you identify tables participating in a constraint chain. Diagram elements can be flexibly hidden or highlighted to draw attention to specific parts of the schema.

Schemaball is primarily designed to help visualize the relationships between tables, rather than table structure. Currently, the schema diagrams do not indicate which field in a table participates in a constraint relationship and details of internal table structure are not shown.

The database structure can be obtained by connecting to a live server or by reading SQL schema dumps or flat files. The properties of the schema diagram are manipulated by setting parameter values stored in a configuration file.

Up-to-date documentation, code and FAQ can be viewed at http://mkweb.bcgsc.ca/schemaball.


Looking at Database Schemas

When database schemas are small, generating effective illustrations is very easy. Diagrams can be drawn manually without much effort. As the number of tables and constraints grows, it becomes more difficult and time consuming to depict the schema in a compact and informative manner. The schema diagram becomes very complex and automated drawing tools have difficulty in optimizing the placement and routing of the glyphs and lines in the schema diagram. In a highly connected schema, it may not be possible to draw the table glyphs in clusters to limit the total length of constraint lines. As a result, constraint paths are long and tracing them is awkward.

One way to limit the complexity inherent in the routing problem of the constraint lines is to use a pre-determined layout for the table glyphs. By placing these along a circle, constraint lines are easy to generate and follow. The circular schema diagram scales well with schema size - the diagram remains compact and light on the eyes.

Database schemas, especially for large databases, are usually examined in a piece-meal fashion. You don't look at the entire schema all at once - you examine groups of tables and constraints which form a functional group. By decomposing the schema in this manner, the data, objects and interactions modeled by the database can be more easily understood. Schemaball is ideally suited for such decomposition - the colour scheme of arbitrary elements (tables or constraints) can be adjusted. The colour scheme can be propagated along constraints, thereby generating highlighted paths through the schema.


Applications

Schemaball generates static images. Currently only bitmap images (PNG) can be generated. However, coordinates of each element can be obtained and may be used to generated vector images (e.g. SVG). The properties of the schema diagrams can be tuned flexibly making publication-quality output possible.


DATA ELEMENTS

To draw the schema diagram, a list of tables and constraints is required. In diagrams generated by Schemaball, internal table structure is not displayed. Tables and constraints may be loaded from (a) live database (b) SQL schema dump, or (c) list of table or constraint names.


Database Support

Schemaball needs to know how to communicate with your database to be able to

To connect to a live database, the <db> block in the configuration file stores the server name, database name and user authentication tokens. Interface to the database is provided by DBI and the DBI driver is specified in 'dbtype' in the <sqldb> block. This block additionally defines all the SQL commands which are used to poll database structure. The <sql> block is described further on.

In addition to polling a live database, Schemaball can collect schema information from an SQL schema dump or a flat file which lists the table names. The <db> block is used to specify the data source.


Defining SQL commands

Named <sql> blocks store SQL commands. You may need to adjust the value of the 'sql' variable in these commands to make them compatible with your server, if you are not using MySQL. The name of the block should not be changed. Each SQL command is defined in the following way

  <sql NAME>
    sql = SQLCOMMAND
    colnum = COLNUM
  </sql>

where NAME is a unique identifier, SQLCOMMAND is the SQL query and COLNUM is the column in the query result that contains the desired information. The value of COLNUM depends on what the query is designed to achieve.

The SQL command may be applied to specific databases and tables by using the strings _DB_ and _TABLE_ in the query string. These strings will be substituted with the current database and table name. The following <sql> blocks are defined by default and are required to extract the required schema information.


Creating a Table List

Schemaball generates its table list from the data source defined in the <db> block. When the <db> block defines a database, the SQL query defined in the <sql listtables> block is used to generate the table list. When the <db> block defines an SQL schema dump, the CREATE TABLE statements in the dump are parsed to extract table names.


Determining Foreign Keys

MySQL does not support foreign keys currently for MyISAM tables (as of v4.0, see http://www.mysql.com/doc/en/ANSI_diff_Foreign_Keys.html). Support for constraints may be implemented by using (a) InnoDB tables which support foreign keys, or (b) application logic. To help automate application logic, foreign key fields may be named using a specific convention, which can be used to work out the target table and field for a given foreign key name. One such convetion is

  FKtype_tablename__field

where type is some arbitrary string to differentiate of the foreign key. This field is specific to an application, which associates different TYPEs with specific business logic. The TYPE makes it possible to have multiple foreign keys pointing to the same field in the same table. The TABLE is the name of the table that the foreign key references and the FIELD is the name of the FIELD in that table. If the canonical name of the referenced field is in the format TABLE_FIELD, then the FIELD string in the FK name does not include the TABLE_ string. For example, if a Person table has a foreign key which references the field Personality_ID in the table Personality, then the name of the key would be FK_Personality__ID (not FK_Personality__Personality_ID).

The <linkrule> block in Schemaball's configuration file specifies how to detect constraint relationships. There are three ways in which constraints may be determined.


Making Table and Constraint Dumps

Once Schemaball has analyzed the data source (database, SQL dump, or table list) and compiled a list of the constraints (database, SQL dump, or constraint list), it can generate a dump of what it has parsed. If you toggle debugging (-debug), the dump is reported in lines starting with ``schemadump''. These data can be used to verify that Schemaball has correctly parsed your database structure and to create flat files of table and constraint names.


CONFIGURATION


Syntax

Schemaball gets most of its runtime parameters from an Apache-like configuration file. The Config::General module is required to parse the file. For the syntax of the configuration file, refer to the man page of Config::General.


Configuration File Location

If you do not specify the configuration file, Schemaball checks a few default locations before giving up. The default name of the configuration file is schemaball.conf, or SCRIPTNAME.conf where SCRIPTNAME is the name of the schemaball script. The following paths are checked for the SCRIPTNAME.conf file, in order. Relative paths are relative to the location of the script.


Configuration Blocks

Blocks in the configuration file group parameters that control disparate aspects of the schema diagram. These blocks are


Image

The image block stores the following entries

  <image>
    file
    image_element_dump
    size
    glyphradius
    annotation
    background
    foreground
    <elements> ... </elements>
  </image>


Image :: Elements :: Link

The link block controls how constraint lines should be drawn.

  <link>
   show_links
   color
   color_hi
   stroke
   stroke_hi
   colour_from
   stroke_from
   foot_fraction
   anchor
   type
   link_to_invisible
   hide_file
   <hide_rx> ... </hide_/rx>
   highlight_file
   <highlight_rx> ... </highlight_rx>
  </link>


Image :: Elements :: Bezier

  <bezier>
   resolution
   radius
  </bezier>

The <bezier> block controls properties of the bezier curves, if this constraint line type is chosen.


Image :: Elements :: Label

Properties of text and text labels.

  <label>
   font
   size
   margin
  </label>


Image :: Elements :: Table

Table glyph properties.

  <table>
   size
   fill
   fill_hi
   stroke
   outline
   outline_hi
   label
   anonymize
   hide_file
   <hide_rx> ... </hide_rx>
   invisible_hide
   highlight_file
   <highlight_rx> ... </highlight_rx>
  </table>


Image :: Elements :: Highlight

Controls how highlighting is inherited across constraint lines.

  <highlight>
   highlight_by_link
   highlight_by_table
   highlight_by_table_forward
   highlight_by_table_reverse
   highlight_by_iterations
   fade_factor_table
   fade_factor_link
  </highlight>

The inheritance of highlighting helps to follow contraint relationships between tables. For example, by first highlighting a specific table (see table :: highlight_rx)

  <table>
   <highlight_rx>
    mytable
   </highlight_rx>
  </table>

you can ask that all constraint lines originating from and ending at this table are highlighted too.

  highlight_by_table         = yes
  highlight_by_table_forward = yes
  highlight_by_table_reverse = yes
  highlight_by_iterations    = 1

Similarly, if you highlight a constraint line, you can highlight all the tables that participate in the lines

  <link>
   <highlight_rx>
   ^table1__
   <highlight_rx>
  </link>
 
  highlight_by_link = yes
  highlight_by_terations = 1

The effect will be to highlight all tables which participate in a constraint relationship with table1.

The parameter 'highlight_by_link' toggles whether tables should inherit link highlights. Similarly, links inherit table highlights if 'highlight_by_table' is set.


DEBUGGING

Schemaball has two debug levels, specified by the number of -debug command-line parameters. Level 1, set by -debug, will make printdebug() produce output. Level 2, set by -debug -debug, will in addition dump the structure of the $CONF variable.


HISTORY


BUGS

Please report bugs and suggestions to martink@bcgsc.ca.


AUTHOR

Martin Krzywinski, martink@bcgsc.ca

$Id: schemaball,v 1.5 2003/10/27 20:25:11 martink Exp $


CONTACT

  Martin Krzywinski
  Genome Sciences Centre
  Vancouver BC Canada
  www.bcgsc.ca
  martink@bcgsc.ca