Distributed Queries

DataCaster enables distributed SQL queries that incorporate data from multiple DataCaster databases, spread across multiple sites on the web, and across administrative boundaries. These distributed SQL queries require that the permissions for all the dtaa used by the queries is first setup, using the network authentication setup in DataCaster.

To begin use of distributed queries, first setup remote access to the required data as described in the sections below.

Data Publishing Server Setup

A distributed query setup has at least one DataCaster server publishing data to be accessed by other DataCaster server or client systems. As a distributed database, DataCaster allows users on remote DataCaster servers or access tables and views on a given server. As a first step, you need to setup permissions for remote users to access your server. Access can be setup for remote users at the user-level or server-level. With user-level access, an individual user can access this server with the specified username and password. With server-level access, any authenticated user on the remote server can access this server.

To setup access for remote users to access data on your server, you need to use the WebAdmin utility for this purpose. This is the Remote Users page under the Remote Clients section of WebAdmin. Use this page to allow access to this server for remote users. Access to specific tables and views on this server is subject to the necessary SQL authorization with GRANT statements. Remote users will have access to specific tables or views if granted for the specific remote user, or if public access has been granted.

In addition, remote access can be setup at a server-level, so that all users on a specified remote client system have access to this server. This is the Remote Clients page under the Remote Clients section of WebAdmin. Use this page to allow access to this server for all users on a remote server. Before remote users can access tables and views from this server, they will need to setup their clients to use the servername and password for server-level access to this server. Access to specific tables and views on this server is subject to the necessary SQL authorization with GRANT statements. Remote users will have access to specific tables or views if granted for the specific user, or when public access is granted.

Setup Client Where Query is Launched

A distributed SQL query is launched from a DataCaster server or client, which needs to be authorized before the query can work. With the DataCaster distributed database, you can access remote tables and views in SQL statements, scripts, applications or User API programs. Sometimes anonymous access will be permitted and used for this purpose. In other cases, you will need to setup authentication beforehand so it will be used when accessing data from remote servers.

To setup user-level access, go to the WebAdmin page User-level Access under the Access Remote Serves section. Use this page to setup access to a specified remote server for a given user on this server. Access to specific tables and views on the remote server is subject to the necessary SQL authorization with GRANT statements. Remote users will have access to specific tables or views if granted for the specific user, or when public access is granted.

In addition, remote access can be setup at a server-level, so that all users on this server have acceess to a specified remote server. Use this page to allow access to a specified remote server for all users on this server. Before local users can access tables and views from the remote server, access will need to setup on the remote server, and the same servername and password authorized for server-level access.

Using Distributed Queries

Once remote access has been setup, both on your server and the remote server, you will be able to execute distributed queries. The goal for distributed queries is to make all tables and views, and many other database objects, remotely accessible over the web. Each table and view therefore needs to be identifiable and we use URLs to identify tables in SQL queries when dealing with remote tables.

Instead of simple table names, you will use URLs to refer to remote tables and views. The URL to be used depends on the following:

  • the name of the database that contains the target table
  • the server on which the database to be used resides
  • the port number used to connect to the DataCaster server
  • the name of the target table

For example, the following query uses a products table in a database named storedb on a remote server named acmeserver, where DataCaster is running on the default Tomcat HTTP port 8080.

SELECT * FROM http://acmeserver:8080/Table/storedb/products

This query will work in a similar manner as a query on a local table. URLs can be used in queries in this fashion for multiple tables on multiple remote servers.

Distributed queries are highly unoptimized in DataCaster, so your mileage will vary greatly in terms of performance as well as what works and doesn't, due to timeouts and other issues, when it comes to distributed queries. It is often adviable to break up queries to optimize the processing of results on individual servers, as well as use views to control the query processing behavior.