Macromedia's Data File Access API Architecture Unleashed
By: Simon Horwith
Jun. 16, 2003 12:00 AM
Like its predecessors, Macromedia's most recent installment of the Devnet Resource Kit (DRK 3) is stocked with many excellent utilities for Flash developers. Unlike previous releases, DRK 3 aims to make the lives of ColdFusion developers easier by including many applications and development tools for use in CFMX applications. One of these is an Application Programming Interface I developed called the Data File Access API (DFA API).
When I set out to design and develop the DFA API, the goal was to create an API that would allow developers to store data in text files as either XML or CSV text, and to access and manipulate that data as easily as if it were stored in a database. With that core functionality in mind, the primary objective was to design an architecture that would perform as well as possible.
In addition, two other primary objectives were to make the API very easy to use and to make it flexible enough for developers to extend or implement in any way they might need. Ideally, as developers become more familiar with the API, they will be inspired to use it as the backbone of more creative solutions to meet their applications' needs. Let's examine how the features of ColdFusion MX were used to meet these objectives.
The first thing I needed to consider was how to architect the API not only to define and store data, but also to make this data available for very fast filtering and retrieval. What I decided was to create a ColdFusion Component that houses all of the methods for working with the data and that stores all of the data in memory (as needed) in a proprietary XML DOM format, whether the data came from CSV or XML text. I refer to these as "data tables" and think of them as being analogous to database tables cached in memory.
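The exact in-memory format is proprietary, but conceptually a cached data table might resemble the following sketch (the element names here are purely illustrative, not the API's actual schema):

```xml
<datatable name="users">
    <row><id>1</id><username>shorwith</username><role>admin</role></row>
    <row><id>2</id><username>jdoe</username><role>editor</role></row>
</datatable>
```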
The API needs to be flexible enough to allow data to be retrieved and filtered using either XPath or SQL, so component methods exist to determine which of the two a passed query is. XPath is applied directly to the XML DOM in memory; to use SQL, the XML DOM is first converted to a ColdFusion query object and then queried using Query of Queries. Any call to the API to extract data can retrieve that data as XML or as a ColdFusion query.
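In practice, that dual interface might be exercised along these lines (the application-scope instance and the method name are hypothetical illustrations, not the API's documented signatures; consult the DRK documentation for the real ones):

```cfml
<cfset dfa = application.dfaAPI>

<!--- An XPath expression is applied directly to the in-memory XML DOM --->
<cfset adminsXML = dfa.getData("//row[role = 'admin']")>

<!--- SQL is detected, the data table is converted to a ColdFusion query
      object, and the statement is run as a Query of Queries --->
<cfset adminsQry = dfa.getData("SELECT * FROM users WHERE role = 'admin'")>
```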
Another challenge in developing the API was how to physically define and store the data used in applications. I split the data table definition task across three XML files: one that holds the table definitions (column names, data types, default values, etc.); one that maps those definitions to their actual storage locations (as a relative or absolute path, or a URL) so that the API knows where to find the data; and one that contains the data itself. I chose this architecture so that developers can easily write their own validation routines (the API itself doesn't actually enforce the data type and required properties of data table columns), share data table definitions between applications, etc.
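As a rough illustration of the first two of those files (the actual schemas ship with the DRK; these element and attribute names are hypothetical):

```xml
<!-- Table definition file: column names, data types, default values -->
<tables>
    <table name="users">
        <column name="id" type="numeric" required="true"/>
        <column name="username" type="string" default=""/>
        <column name="role" type="string" default="guest"/>
    </table>
</tables>

<!-- Mapping file: where each data table is physically stored -->
<mappings>
    <map table="users" location="data/users.xml"/>
</mappings>
```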
In addition to the data retrieval described above, methods were also written to parse SQL statements that perform INSERT, UPDATE, or DELETE operations on data tables. Unlike with SELECT statements, converting the XML data table to a ColdFusion query object does no good for INSERT, UPDATE, and DELETE commands, because ColdFusion does not support these SQL statements in a Query of Queries.
Instead, the SQL is broken into its various components and then executed against the appropriate XML nodes directly. In the case of DELETE commands, working with the data directly as XML proved more efficient than converting the XML to a query object, retrieving the data not being deleted, and converting the new query object back to XML.
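A simplified sketch of that direct-to-XML approach, assuming a hypothetical in-memory table whose row elements each contain a "role" child (this is not the API's actual code, just the general CFMX idiom for removing nodes):

```cfml
<!--- e.g. a parsed DELETE FROM users WHERE role = 'guest' --->
<cfset rows = dataTable.xmlRoot.xmlChildren>
<!--- Loop backwards so earlier indexes remain valid as nodes are removed --->
<cfloop index="i" from="#ArrayLen(rows)#" to="1" step="-1">
    <cfif rows[i].role.xmlText EQ "guest">
        <cfset ArrayDeleteAt(rows, i)>
    </cfif>
</cfloop>
```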
In addition to working with XML, the API needed to support CSV, so a method was added to parse CSV text and convert it to an XML table in memory. The first row of values in the CSV content is used to create a data table definition, and all other rows populate that definition. Other methods were also added to validate the various entities being used, to make debugging easier, etc. Two other major concerns while developing the API were how to make the API easy for all developers to use, and how to handle concurrency issues.
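For example, given CSV content like the following, the first row becomes the data table definition and the remaining rows populate it:

```
id,username,role
1,shorwith,admin
2,jdoe,editor
```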
In order to deal with data table memory and physical file concurrency issues, all data retrieval is performed within "read-only" named locks. When rows of data are inserted, updated, or deleted from a data table, the data table in memory is first manipulated within an "exclusive" named lock; afterwards, the entire data table is written to file as an XML string, also within an "exclusive" named lock. This approach minimizes locking on the server and frees developers from having to lock API access in their own applications, because the API handles all of the locking itself, localized to the code blocks that require it.
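The pattern described above maps onto standard <cflock> usage; a simplified sketch (the lock names and application-scope variables here are illustrative):

```cfml
<!--- Reads share a named read-only lock --->
<cflock name="dfa_users" type="readonly" timeout="10">
    <cfset admins = XmlSearch(application.dataTables.users,
                              "//row[role = 'admin']")>
</cflock>

<!--- Writes first mutate the in-memory table inside an exclusive named lock --->
<cflock name="dfa_users" type="exclusive" timeout="10">
    <!--- ...insert/update/delete the XML nodes here... --->
</cflock>

<!--- ...then the entire table is persisted to file, also exclusively locked --->
<cflock name="dfa_users" type="exclusive" timeout="10">
    <cffile action="write" file="#application.tableFiles.users#"
            output="#ToString(application.dataTables.users)#">
</cflock>
```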
To make the API easier for developers to use, I wrote a custom tag "wrapper" for the API CFC. The idea behind the custom tag was to give users syntax similar to what they already know from the <cfquery> tag when querying DFA API data in their applications. Like <cfquery>, when retrieving data a "name" attribute is passed to assign a name to the returned result set (which may be in query or XML format). A "returntype" attribute specifies whether to return query results as XML or as a ColdFusion query object.
Also similar to <cfquery>, the SQL to SELECT data is passed as the contents of the tag. An XPath attribute is used to select data using an XPath query. Rather than passing a "datasource" name, the tag accepts a "datatable" attribute in order to determine what data table to apply XPath queries to (SQL queries simply name the data table in the SQL). There is also an "XSLT" attribute for performing XSL Transformations on the data (the attribute value is either the location or contents of an XSL stylesheet) and a "CSV" attribute for passing the location of a CSV file (or CSV content) to be parsed into a data table.
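Putting those attributes together, a call to the custom tag might look like this (the tag name is hypothetical; the attributes are the ones described above):

```cfml
<!--- SQL in the tag body, just like <cfquery>; the SQL names the table --->
<cf_dfaquery name="admins" returntype="query">
    SELECT * FROM users WHERE role = 'admin'
</cf_dfaquery>

<!--- An XPath query instead names its table via the "datatable" attribute --->
<cf_dfaquery name="adminsXML" returntype="xml" datatable="users"
             xpath="//row[role = 'admin']"></cf_dfaquery>
```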
Within the tag body a SQL INSERT, UPDATE, SELECT, or DELETE command may be passed, as well as a DROP command to remove a data table from memory and a SAVE command to commit a data table already in memory to file. The tag itself creates a DFA API instance in the application scope if one doesn't already exist (in start mode), and performs all of its "work" in end mode. The only thing required to use the tag is the existence of three request-scope variables that store the location of the DFA API component, the location of the XML file that defines data table "columns," and the location of the XML file that "maps" these definitions to physical files.
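Those three request-scope variables might be set in Application.cfm along these lines (the variable names and paths are illustrative, not the API's actual requirements):

```cfml
<!--- Location of the DFA API component --->
<cfset request.dfaComponent = "com.macromedia.drk.DataFileAccess">
<!--- XML file defining data table "columns" --->
<cfset request.dfaTableDefinitions = expandPath("config/tabledefs.xml")>
<!--- XML file mapping those definitions to physical files --->
<cfset request.dfaTableMappings = expandPath("config/tablemaps.xml")>
```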
Though the API was never intended for use in large enterprise-level applications, early tests have yielded surprising performance results. The API performs very well; exactly how much data or how many concurrent users is too many is something you'll have to test for yourself. I wouldn't be surprised to find that even large-scale solutions can be delivered, driven by the API rather than by a traditional database. Even if you decide to stick with more traditional methods of data storage, I highly recommend looking to the DFA API as an example of how to architect an API, and of how ColdFusion Components, Custom Tags, Query of Queries, and the XML parsing functionality in ColdFusion MX can be combined to achieve amazing results in your applications.