Applibase DataCaster is an Enterprise Information Integration (EII) server, designed to bring the power of web services mashups and other new technologies to address data integration needs in the enterprise. Enterprises face a growing need to manage and exploit the volumes of data in the enterprise for business intelligence and other applications. Applibase DataCaster makes it easy to collect, manage and use real-time, integrated information from a diverse set of data sources. It is used to integrate data across application instances, for example sites running applications that can benefit from sharing data with applications on other sites. DataCaster includes a comprehensive standards-based Relational/SQL database engine and JDBC driver that provides a familiar interface for applications, but at the same time provides access to remote data on other sites and servers running DataCaster servers.
To begin, install and set up the DataCaster server by downloading the package and installing it as specified in the installation page. This will allow you to start the DataCaster server and get access to the WebAdmin client. The WebAdmin client provides a convenient interface to use and administer the DataCaster server: you can manage databases, users and remote access, run SQL statements and scripts, run Jython scripts and much more.
You can do all of the following with the Console and the command line tools as well, although it is easier to get started with WebAdmin as described below.
The admin user is the only user on a new server, and it is advisable to set up a separate user for each person using the server. This avoids confusion and minimizes conflicts between people sharing the server. In addition, you may not want to give admin access to all users for security reasons.
Use the WebAdmin Manage Users page to add new users. When adding users, you can specify if a user has administrative privileges by checking the Is SuperUser checkbox.
In the default configuration, databases are created automatically when first used, if they do not exist. Creating a new database takes a minute or two before you can use it. So in order to use a new database, simply invoke the statement, SQL script, Java program or Jython script for the desired database, and the database will be created for you.
If you decide to turn off the default database creation, you will need to use the command line console to create a new database.
Since DataCaster is primarily a distributed database, you will often be creating database tables, views, etc., to create and use application data. In a few cases you may be working purely with remote data, in which case you will not be creating your own data and will not need to create tables. However, in most cases you will want to create schemas, tables and other database objects as required.
DataCaster closely follows the SQL standard with regard to the
organization of database objects. Each database contains catalogs,
which contain schemas, which contain tables, views, and other database
objects. Every database has a home catalog, and each user
has a schema with the same name as their username.
Users can then create schemas, catalogs, and other ways of organizing database objects. Schema owners can grant permissions to other users to allow them to use, create, alter and drop schema objects. Please see the Database Objects section for more information.
Use SQL statements to create tables, views and other database objects within your own schema, or in another schema in which you are authorized to create database objects. Admins have complete access and can create, alter and drop any objects.
Using WebAdmin, you can run SQL statements and scripts from the web client. Go to the SQL Commands section and select the Run Query page. There you can execute SQL statements to create tables and other objects and insert data into the tables.
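As a minimal sketch of the kind of statements you might run from the Run Query page, the following creates a small table, inserts a row and reads it back. The table name products echoes the example used later in this guide; the column names, data types and values are hypothetical, and they assume DataCaster accepts these standard SQL forms.

```sql
-- Hypothetical table and columns, for illustration only.
CREATE TABLE products (
    product_id INTEGER NOT NULL PRIMARY KEY,
    name       VARCHAR(100) NOT NULL,
    price      DECIMAL(10, 2)
);

-- Add one sample row.
INSERT INTO products (product_id, name, price)
VALUES (1, 'Sample widget', 9.99);

-- Read the data back.
SELECT product_id, name, price FROM products;
```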
You have a number of tools to create, modify and query data in the database, including SQL, Java (JDBC or Direct User API) and Jython. WebAdmin provides a convenient way to interact with the server directly and run SQL statements, queries, etc. The RMI server, which is not started by default, can also be used with remote JDBC programs to interact with the server.
As a distributed database, DataCaster allows users on remote DataCaster servers to access tables and views on this server. For example, this allows users or applications on remote web sites to access data on your site, and vice versa.
As a first step, you need to set up permissions for remote users to access your server. Access can be set up for remote users at the user level or the server level. With user-level access, an individual user can access this server with the specified username and password. With server-level access, any authenticated user on the remote server can access this server.
To set up user-level access for remote users, go to the Remote Users page under the Remote Clients section of WebAdmin. Use this page to allow individual remote users to access this server.
Before remote users can access tables and views on this server, they will need to set up their clients to use this username and password for user-level access to this server. Access to specific tables and views on this server is subject to the necessary SQL authorization with GRANT statements. Remote users will have access to specific tables or views if access is granted to the specific user, or if public access is granted.
In addition, remote access can be set up at the server level, so that all users on a specified remote client system have access to this server. To do this, go to the Remote Clients page under the Remote Clients section of WebAdmin. Use this page to allow access to this server for all users on a remote server. Before remote users can access tables and views on this server, they will need to set up their clients to use the servername and password for server-level access to this server. Access to specific tables and views on this server is subject to the necessary SQL authorization with GRANT statements. Remote users will have access to specific tables or views if access is granted to the specific user, or if public access is granted.
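The GRANT authorization mentioned above might look like the following sketch, which uses standard SQL syntax. The table name products and the username remote_alice are hypothetical, and the exact set of privileges DataCaster supports is an assumption here.

```sql
-- Allow one specific user to read the table
-- (remote_alice is a hypothetical username).
GRANT SELECT ON products TO remote_alice;

-- Or grant public read access to the table.
GRANT SELECT ON products TO PUBLIC;
```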
With the DataCaster distributed database, you can access remote tables and views in SQL statements, scripts, applications or User API programs. Sometimes anonymous access will be permitted and used for this purpose. In other cases, you will set up authentication beforehand so that it is used when accessing data from remote servers.
DataCaster allows users on this server to access tables and views on remote DataCaster servers with either server-level or user-level access. With user-level access, the specified local user on this server can access tables and views on the remote server. With server-level access, any authenticated user on this server can access the specified remote server.
To set up user-level access, go to the User-level Access page under the Access Remote Servers section of WebAdmin. Use this page to set up access to a specified remote server for a given user on this server.
Before local users can access tables and views on the remote server, access will need to be set up on the remote server, with the same username and password authorized for user-level access. In addition, access to specific tables and views on the remote server is subject to the necessary SQL authorization with GRANT statements. Remote users will have access to specific tables or views if access is granted to the specific user, or if public access is granted.
In addition, remote access can be set up at the server level, so that all users on this server have access to a specified remote server. Use the server-level access page in WebAdmin to allow access to a specified remote server for all users on this server. Before local users can access tables and views on the remote server, access will need to be set up on the remote server, with the same servername and password authorized for server-level access.
Once remote access has been setup, both on your server and the remote
server, you will be able to execute distributed queries. Instead of
simple table names, you will use URLs to refer to remote tables and
views. For example, the following query uses a products
table in a database named storedb on a remote server
named acmeserver, where DataCaster is running on the
default Tomcat HTTP port 8080.
SELECT * FROM http://acmeserver:8080/Table/storedb/products
This query will work in much the same way as a query on a local table. URLs can be used in queries in this fashion for multiple tables on multiple remote servers.
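As a rough sketch of a query that spans more than one remote server, the following joins the products table above with a table on a second server. The second server (widgetserver), its database (salesdb), the orders table and all column names are hypothetical. Whether table aliases may follow a table URL in this way is also an assumption and should be checked against your DataCaster version.

```sql
-- Hypothetical second server, database, table and column names.
-- Alias syntax after a table URL is assumed, not confirmed.
SELECT p.name, o.quantity
FROM http://acmeserver:8080/Table/storedb/products AS p
JOIN http://widgetserver:8080/Table/salesdb/orders AS o
  ON o.product_id = p.product_id;
```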
Distributed queries are highly unoptimized in DataCaster, so your mileage will vary greatly, both in performance and in what works and what doesn't, due to timeouts and other issues. It is often advisable to break up queries to optimize the processing of results on individual servers, and to use views to control the query processing behavior.
For many applications on the web, it seems likely that materialized distributed views have greater value in building applications with distributed data. It is usually necessary to provide users good response times, even when using remote data in the application. Materialized views essentially cache the data from the remote server(s) in a view that can be used just like a local table by the application.
The goal with distributed views is to keep the required data local as far as possible, so that applications can use remote data without paying any serious performance penalty. Furthermore, the remote sites or servers may be temporarily unavailable. At such times, it is desirable to have the application continue working even while the remote sites and servers are unavailable.
To use distributed materialized views, SQL statements or the User
API can be used to create views. For example, the following SQL
statement creates a materialized SQL view. Like the earlier query
example, it uses a products table in a database named storedb
on a remote server named acmeserver, where DataCaster is
running on the default Tomcat HTTP port 8080.
CREATE MATERIALIZED PERIODIC REFRESH VIEW acme_products AS SELECT * FROM
http://acmeserver:8080/Table/storedb/products
Now you can use acme_products in queries and other
programs, and it will perform much like a local table. This CREATE
statement specifies that the view should be periodically refreshed. The
refresh is performed by polling the remote server at fixed intervals to
get all the changes from the remote server.
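For instance, once the view exists, a query against it might look like the following sketch. The column names name and price are hypothetical, since the columns of the remote products table are not specified here.

```sql
-- Query the materialized view as if it were a local table.
-- Column names are hypothetical.
SELECT name, price
FROM acme_products
WHERE price < 20.00
ORDER BY price;
```

Because the view holds a local copy of the remote data, a query like this reads that copy rather than the remote server. The results reflect the state of the remote table as of the most recent refresh.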
For anonymous or authenticated remote access, particularly across administrative domains, it is essential to control what a database user is allowed to do. Since SQL queries can easily be written to hog resources on the server among other things, there is a need to limit the resources used by any given remote user.
DataCaster supports multiple types of resource usage control for both local and remote users, including anonymous users. Four kinds of usage control are provided:
Using these attributes, usage can be controlled for any user as required, to minimize the damage from intentional or inadvertent excess resource consumption by local or remote users.
The primary purpose of DataCaster is to build new kinds of distributed applications that can bring users new benefits from easier sharing of data across web sites and applications on the web. While it can be used as a standalone database, that is not what DataCaster is intended for. Therefore, we need to look at what kinds of distributed applications can take advantage of this kind of tool.
Although we built the technology with some ideas in mind, we at Applibase have a limited perspective when it comes to distributed applications that can benefit from such a tool. Hence, we can only describe a small subset of potential applications for such technology. We expect users of such technology will find many more useful applications for this and similar technology, ones that we cannot even dream about.
Applibase has built an application to take advantage of this kind of distributed data. Our Xprss application is a feed reader tool that combines RSS aggregation with authenticated blog tools, to give users an easy way to share ideas and information as they read. One of the useful features for such a tool is to provide ranking of articles based on usage statistics as well as other personalization data. For small sites running the tool, the amount of usage data available is limited, which limits the value of the tool. Our goal is to use DataCaster to enable sharing of usage statistics across sites, so that all sites get access to the collective usage data. In this way, useful ranking of articles can be provided to all users for the content of interest to them. It remains to be seen how well this will work, but we do see a lot of potential for this sharing of usage data to be useful.
There are many other potential applications, which we are unable to pursue due to lack of time. One of them involves connecting auction sites so that they can share data on auctions at other sites. One of the primary challenges for any marketplace, including auction sites, is generating sufficient volume to make it attractive for buyers to visit the site and make it likely they will find what they are looking for. Creating a network of sites, where all the listings from all sites are instantly available for purchase or bid to a buyer at any site, clearly enhances the attractiveness of any of the sites. DataCaster is designed to make it easier to build such a network of sites, so that they can share data and allow any buyer to search and get to a product on any site.
As mentioned above, any of the examples we can envision will definitely be dwarfed by what users will conceive and create with such tools. We plan to continue to improve and enhance these tools to make it easier for these new applications to be realized and be successful.