Most Read This Week
Connecting the Java World to Grid-Enabled Databases
Consolidate IT resources and optimize usage
By: Kuassi Mensah
Sep. 7, 2004 12:00 AM
Grid computing is not necessarily a new concept; however, its adoption within the enterprise has given birth to a new concept called enterprise grid computing, which is being embraced by the entire IT industry. Enterprise grid computing aims to consolidate IT resources - including both infrastructure software and applications - and optimize their usage, cutting costs substantially along the way. Since Java and J2EE are widely used as enterprise software platforms, how do they align with this vision?
This article outlines a set of challenges that JDBC faces as the database connectivity layer within enterprise grid environments and illustrates how the Oracle Database 10g JDBC driver addresses these challenges. First, I'll introduce the concept of enterprise grid computing; then I'll examine how Java and J2EE operate in grid environments and identify their database connectivity requirements; and finally, I'll discuss the features of the Oracle Database 10g JDBC driver that address those requirements.
Enterprise Grid Computing
The convergence of recent hardware and software advances has made resource virtualization possible and allowed the enterprise grid to be constructed. On the hardware side, these advances include networked storage devices (like storage grids) and low-cost, modular hardware components (such as blades); on the software side, they include improvements in networking, Web services, databases, application servers, and management frameworks.
Although no specific enterprise grid computing standards have been established, there is a general move toward the concept of service-oriented architectures (SOA), which are based on and make extensive use of existing Web services standards and specifications. SOA makes it possible to construct architectures where client applications can simply register, discover, and use the services deployed over the grid. This move is spearheaded by academic and research proposals such as the Open Grid Service Architecture (Global Grid Forum) and middleware vendors through participation in Web services standards bodies. A new consortium, Enterprise Grid Alliance, has been formed with the goals of developing enterprise grid solutions and accelerating the deployment of grid computing in enterprises.
Figure 1 illustrates a typical enterprise grid platform. While a complete enterprise grid computing discussion is beyond the scope of this article, I hope I've given enough information to encourage you to learn more about it.
Java and J2EE Containers in the Enterprise Grid
In recognition of this, vendors, such as Oracle with its Oracle Application Server 10g, are actively supplementing their infrastructure with provisioning, monitoring, registering, discovering, and manageability mechanisms. This allows them to be used in the more complex system architectures required by modern SOA-based business applications.
Similarly, the J2EE world, realizing that it needs to broaden its support to interoperate with other applications and environments in a loosely coupled manner, is aligning with enterprise grid computing and SOA concepts. It's doing this by including core Web services standards (such as SOAP, WSDL, UDDI, and JAX-RPC) and numerous other Web services specifications including WS-I, WSIF, security, transaction, reliable messaging, events, peer-discovery, policy, orchestration, choreography, provisioning, and service-level agreement.
In the scientific and research worlds, the Globus Toolkit, from the Globus Alliance, is a reference implementation of the Open Grid Services Infrastructure (OGSI). OGSI lets Java components be exposed as OGSI services. OGSA-DAIS (Data Access and Integration) is another scientific and research proposal aimed at addressing data services requirements in grid environments; leading database vendors are participating in this initiative.
Enterprise grid computing creates an extremely dynamic environment where resources are automatically augmented or diminished based on business priorities captured in allocation policies. The current version of the JDBC specification (and implementations based on it) makes working in such environments impractical and reduces the ability of Java and J2EE applications to achieve all the benefits of enterprise grid computing. To be more explicit, let's look at some of the issues faced when working in an enterprise grid computing environment:
Virtualizing the Database as a Service
As shown in Figure 1, clusters are pervasive at every level within the enterprise grid: disk and network storage servers, database servers, and application servers. Most database vendors now allow a single database to be concurrently managed by multiple servers or instances.
Although database clusters are easily understood, there are fundamental differences between the architectures supported by the various current implementations; for example, a shared nothing approach and a shared disk approach provide different levels of capabilities. This article focuses on the Oracle Database Real Application Cluster (RAC) model in which any database server instance can address the entire database space, hosted on shared networked storage.
The typical JDBC approach to specifying how to connect to a database is to provide the details for the target database in the form of a tuple containing <host>:<port>:<sid>. This is set as the URL attribute of a data source definition or directly in the connect string when establishing the JDBC connection.
In the Oracle Database RAC model, the physical server hosts and database instances that make up the cluster are represented as a single logical entity known as a service. A client application connects to the logical database service without any real knowledge of the actual host or database server instance that's used in the cluster. By connecting to the service, as illustrated by the Enterprise Grid Model in Figure 2, and not to an actual physical server, clients can be insulated from changes made to the physical elements of the cluster as well as from node failures.
To use a logical database entity in Java applications, JDBC connection strings must be able to accept service-based connect strings instead of physical host details. In the Oracle Database RAC model, services are represented as "/<service-name>".
The Connection Cache Manager
The Connection Cache Manager plays two major roles: it manages cache and binds a connection to the data source.
Managing and Maintaining Cache
The Connection Cache Manager supports the coexistence of more than one cache. Each cache is identified by a unique name and is then tightly bound to a data source. Each cache is either created transparently when getConnection() requests are made on a cache-enabled data source, or is created explicitly in the middle tier via the Connection Cache Manager API. Once a cache is created, it may either be explicitly removed via the Connection Cache Manager, or when the data source is closed.
Caches can be monitored using the Cache Manager APIs, enabling access to information such as the number of connections checked out and the number of connections available in the cache.
Binding a Connection Cache to the Data Source
Fast Application Notification (FaN)
To allow application clients in an enterprise grid environment to function in the most efficient manner, a mechanism must be provided that performs the following tasks:
Let's look at how these pieces work together. When a new instance is added to a database cluster, or when an instance is retrenched from a cluster (instance stopped or node dies), Oracle Database RAC generates an event that indicates what happened. The Oracle Notification Service (ONS) detects and distributes this event to components that have registered as interested in such events.
In Oracle Database 10g JDBC, Fast Application Notification is important: the connection caches are alerted when database instances are affected and can take preventive courses of action.
The mechanism is enabled on a cache-enabled data source by setting the data source property FastConnectionFailoverEnabled to true (see Listing 1).
With Fast Application Notification, event detection and notification is nearly instantaneous, occurring in seconds rather than minutes with traditional timeout mechanisms.
Adding Database Instances: Load Balancing
Let's consider a basic example in which we have:
Retrenching Database Instances (or Node Failure): High-Availability
Continuing with the previous example:
The Fast Connection Fail-over mechanism transparently ensures reliable, highly available Java database connections in RAC and grid environments.
Simplifying JDBC Connection Caching
Transparent Access to the Cache
A JDBC 10g Connection cache can be created either implicitly by the first invocation of the getConnection() method or explicitly by using the Connection Cache Manager API. The Connection Cache Manager is especially useful for developers working with J2EE containers and ERP frameworks since it shields the infrastructure from the complexity of managing connection cache.
// This call would create a "MyCache" cache and a
Subsequent getConnection() invocations will either create a new connection (if the cache was not previously initialized) or retrieve an existing connection from the cache. Once the connection is retrieved, you can proceed with statement creation.
// Create a Statement Statement stmt = conn.createStatement (); ... // Close the Statement stmt.close(); stmt = null;
The cache can be populated in one of two ways: by preinitializing it using the Cache Manager APIs or, incrementally, upon the release of connection(s) back to the cache. When returning a connection to the cache, applications can save current connection attributes settings for future use (see attribute-based retrieval, below).
// return this connection to the cache
Caching Multiple Identities
The Implicit Connection Cache can handle any combination of user-authenticated connections. For example, a joe.johnson connection can coexist very well with a sue.miller connection in the same connection cache.
Connection Retrieval Based on User-Defined Attributes
Connection striping, or labeling, consists of applying user-defined attributes to a connection and making their state persist when the connection is returned to the cache. This speeds up future connection retrieval since cached connections don't have to reinitialize a block of state every time. These attributes can later be used to retrieve the same connection from the cache, as follows.
Example 1: Retrieving a connection based on the NLS_LANG attribute:
// get a connection from the cache with NLS_LANG attribute
Example 2: Retrieving a connection based on the isolation-level attribute:
java.util.Properties connAttr = null;
Example 3: Retrieving a connection based on the connection tag and connection attribute:
java.util.Properties connAttr = null;
Applying Connection Attributes to a Cached Connection
One approach is to call the applyCon-nectionAttributes(java.util.properties connAttr) API on the connection object. This simply sets the supplied attributes on the connection object. It's possible to apply attributes incrementally using this API, letting users apply connection attributes over multiple calls. For example, NLS_LANG may be applied by calling this API from module A. The next call from module B can then apply the TXN_ISOLATION attribute, and so on.
A second approach is to call the close(java.util.properties connAttr) API on the connection object. This API closes the logical connection and then applies the supplied connection attributes on the underlying PooledConnection (physical connection). The attributes set via this close() API override any attributes set using the applyConnectionAttri-butes() API.
The following example shows a call to the close(connectionAttributes) API on the connection object that lets the cache apply the matched connectionAttributes back on the pooled connection before returning it to the cache. This ensures that when a subsequent connection request with the same connection attributes is made, the cache will find a match.
// Sample connection request and close
Connection Retrieval Based on Attributes and Weights
Weights are assigned to each key in a ConnectionAttribute in a one-time operation that also changes cache properties. The cache property CacheAttributeWeights is one of the java.util.Properties that allows the setting of attribute weights. Each weight is an integer value that defines how expensive the key is in terms of resources.
Once the weights are specified on the cache, connection requests are made on the data source by calling getConnection(connectionAttributes). The connectionAttributes argument refers to keys and their associated values.
The connection retrieval from the cache involves searching for a connection that satisfies a combination of the following:
Once the weights are set, a connection request could be made as in Listing 3.
The getConnection() request tries to retrieve a connection from the MyCache cache. In connection matching and cache retrieval, one of two things can happen:
Whether you're using JDBC directly in applications via a run-time container such as an EJB container or through an O/R mapping framework such as Toplink, the latest Oracle JDBC drivers offer reliable and highly available data sources in RAC and grid environments and significantly simplify a range of data source connectivity issues for Java developers.
Oracle has proposed these new features to the JSR 221 (JDBC 4.0 specification) expert group. It hopes to see JDBC Connection Pool management and high-availability support for data sources within enterprise grid environments become part of the JDBC 4.0 standard.
Reader Feedback: Page 1 of 1
Subscribe to the World's Most Powerful Newsletters
Today's Top Reads