A Unified Database Interface for S
David A. James (Bell Labs, Lucent Technologies), firstname.lastname@example.org
Distributed statistical computing with S (both S-Plus and R) is in its early stages and can therefore benefit greatly from advances in general distributed computing. One such area is the integration of S with database systems. In this paper we describe a unified S API to relational databases in the context of distributed computing, along with some of its driver implementations. This S interface is similar to existing database interfaces in languages such as Python, Perl, and Java. In addition to the convenience of a common API, the S database interface serves as a platform for "attaching" database objects to the S namespace, for handling data sets much larger than can be held in memory, and for building distributed applications. We illustrate some of its uses through a case study in network monitoring that requires incremental quantile estimation for a large number of groups in near real time. We close with some remarks about the hurdles we need to overcome in order to migrate from a client-server architecture to a more general n-tier distributed architecture, such as those based on CORBA and Microsoft's ActiveX Data Objects.
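As a rough illustration of the style of interface the abstract refers to, the following sketch uses Python's standard DB-API (through the built-in `sqlite3` driver) to stream a query result in fixed-size chunks, so that the full result never has to be held in memory. It is not the S interface described in this paper. The table `measurements`, its columns, and the helper names `chunked_group_means` and `demo` are hypothetical, and running per-group means stand in for the incremental quantile estimation of the case study.

```python
import sqlite3


def chunked_group_means(conn, query, chunk_size=1000):
    """Stream (group, value) rows in chunks and keep running per-group means.

    Only one chunk of rows is held in memory at a time; the per-group
    state is a (count, sum) pair updated incrementally.
    """
    totals = {}
    cur = conn.cursor()
    cur.execute(query)
    while True:
        rows = cur.fetchmany(chunk_size)  # standard DB-API chunked fetch
        if not rows:
            break
        for group, value in rows:
            n, s = totals.get(group, (0, 0.0))
            totals[group] = (n + 1, s + value)
    cur.close()
    return {g: s / n for g, (n, s) in totals.items()}


def demo():
    # Hypothetical in-memory table used only for illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE measurements (grp TEXT, value REAL)")
    conn.executemany(
        "INSERT INTO measurements VALUES (?, ?)",
        [("a", 1.0), ("a", 3.0), ("b", 10.0)],
    )
    conn.commit()
    result = chunked_group_means(
        conn, "SELECT grp, value FROM measurements", chunk_size=2
    )
    conn.close()
    return result
```

Because the function depends only on the generic connection and cursor calls, the same code should in principle work with another DB-API driver by changing how the connection is created, which is the kind of convenience a common API is meant to provide.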