TeraFlow Testbed TeraFlow Middleware

UDT

UDT, the UDP-based Data Transport protocol, is an application-level transport protocol designed for distributed, data-intensive applications. It is motivated by the growing importance of wide-area, high-speed optical networks, on which applications employing TCP generally fail to utilize the available bandwidth.

UDT demonstrates: 1) good efficiency (it utilizes available bandwidth quickly); 2) good friendliness (UDT is friendly to flows regardless of their RTT and to TCP flows sharing the same bandwidth); and 3) good fairness (UDT is fair to other UDT-based teraflows). UDT is designed to be deployed in high-performance computing environments in which a small number of teraflows share bandwidth with each other and with the accompanying TCP control flows. It combines rate-based and window-based control and uses bandwidth estimation to determine the control parameters automatically.
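To make the idea of combining rate-based and window-based control concrete, here is a minimal sketch in Python. It is not UDT's actual algorithm: the function names, the fixed RTT, and the rule that derives the parameters from a bandwidth estimate (one bandwidth-delay product of packets in flight) are all assumptions chosen for illustration.

```python
def params_from_estimate(bandwidth_bps, rtt, packet_bytes):
    """Derive illustrative control parameters from a bandwidth estimate.

    Assumption: the inter-packet interval matches the estimated bandwidth,
    and the window holds one bandwidth-delay product of packets.
    """
    packet_bits = packet_bytes * 8
    send_interval = packet_bits / bandwidth_bps           # seconds between packets
    window = max(1, int(bandwidth_bps * rtt / packet_bits))  # packets in flight
    return send_interval, window


def schedule_sends(num_packets, send_interval, window, rtt):
    """Return the send time of each packet when both limits apply."""
    send_times = []
    for i in range(num_packets):
        # Rate-based limit: wait at least send_interval after the previous packet.
        earliest = send_times[-1] + send_interval if send_times else 0.0
        # Window-based limit: at most `window` packets may be unacknowledged,
        # so packet i waits for the ACK of packet i - window (one RTT after it was sent).
        if i >= window:
            earliest = max(earliest, send_times[i - window] + rtt)
        send_times.append(earliest)
    return send_times
```

With a small window the window limit dominates and the sender stalls until ACKs return; with a large window the sender is paced purely by the interval. For example, `schedule_sends(4, 1.0, 2, 5.0)` sends two packets one second apart and then waits for the first ACK before sending the third.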

UDT is open source and available from SourceForge. The current release is Version 2. Detailed information on UDT can be found at udt.sf.net.

Version 3 will be developed using the Composable Protocol Development Framework (CPDF), a framework for high-performance protocol development. CPDF makes it easy to develop protocols with varied and specialized congestion control mechanisms.

SOAP*

SOAP* is an open source library for high-performance web services. SOAP* combines a TCP/XML-based control channel with a separate data channel that can employ: 1) specialized protocols such as UDT; and 2) alternatives to XML that provide greater efficiency for large data sets.
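The split between an XML control channel and a more compact data channel can be illustrated with a short Python sketch. This is not the SOAP* API: the element and attribute names (`transfer`, `dataChannel`, `payload`) and the packed little-endian double format are hypothetical, chosen only to show why a non-XML encoding pays off for large data sets.

```python
import struct
import xml.etree.ElementTree as ET


def build_control_message(host, port, transport, encoding, count):
    """Build a small XML control message describing a separate data channel.

    All element and attribute names here are hypothetical.
    """
    msg = ET.Element("transfer")
    ET.SubElement(msg, "dataChannel", host=host, port=str(port), transport=transport)
    ET.SubElement(msg, "payload", encoding=encoding, count=str(count))
    return ET.tostring(msg, encoding="unicode")


def encode_xml(values):
    """Encode numbers as XML text, one element per value."""
    root = ET.Element("values")
    for v in values:
        ET.SubElement(root, "v").text = repr(v)
    return ET.tostring(root)  # bytes


def encode_binary(values):
    """Encode numbers as packed little-endian 8-byte doubles."""
    return struct.pack(f"<{len(values)}d", *values)


def decode_binary(data):
    """Inverse of encode_binary."""
    return list(struct.unpack(f"<{len(data) // 8}d", data))
```

The control message stays small and human-readable, while the bulk data takes a fixed 8 bytes per value in the binary form; the XML form of the same values is larger because every value carries its own tags.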

An open source implementation of SOAP* is available from SourceForge as part of the DataSpace Transfer Protocol (DSTP) framework. DSTP is a framework designed for exploring and analyzing remote and distributed data. SOAP* web services built on the current version of DSTP have been used successfully in applications employing 1 Gbps data flows.

The next release of SOAP* will be independent of DSTP and will be designed to scale to 10 Gbps data flows.

Teraflow services for processing, exploring, and analyzing data are generally built over SOAP*.

High Performance Scoring Engines

The Predictive Model Markup Language, or PMML, is an open standard for statistical and data mining models that is supported by over two dozen vendors. Traditionally, deploying data mining models into operational systems has been very labor-intensive. This is especially true of high-performance or distributed applications.

Over the past few years, lightweight, high-performance PMML-based scoring engines have been developed. Once such an engine is integrated into an operational system, a new statistical model can be deployed simply by updating the PMML file. This is beginning to change dramatically how statistical models are deployed.
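The deployment model described above, where the scoring code stays fixed and only the PMML file changes, can be sketched in a few lines of Python. This is an illustrative toy, not the teraflow scoring engine: it handles only a linear regression table, and the PMML fragment is simplified (a real PMML file also carries an XML namespace, a Header, a DataDictionary, and a MiningSchema). The `ScoringEngine` class and its methods are hypothetical names.

```python
import xml.etree.ElementTree as ET

# Simplified PMML fragment for a linear model: 1.0 + 2.0*x - 0.5*y
PMML_V1 = """<PMML version="3.0">
  <RegressionModel functionName="regression">
    <RegressionTable intercept="1.0">
      <NumericPredictor name="x" coefficient="2.0"/>
      <NumericPredictor name="y" coefficient="-0.5"/>
    </RegressionTable>
  </RegressionModel>
</PMML>"""


class ScoringEngine:
    """Toy scoring engine: the code is fixed, the model comes from PMML text."""

    def __init__(self, pmml_text):
        self.load(pmml_text)

    def load(self, pmml_text):
        """(Re)load the model; deploying a new model means calling this again."""
        table = ET.fromstring(pmml_text).find(".//RegressionTable")
        self.intercept = float(table.get("intercept"))
        self.coefficients = {
            p.get("name"): float(p.get("coefficient"))
            for p in table.findall("NumericPredictor")
        }

    def score(self, record):
        """Score one record given as a dict of field name to numeric value."""
        return self.intercept + sum(
            coef * record[name] for name, coef in self.coefficients.items()
        )
```

Swapping in a revised model requires no code change: calling `engine.load(new_pmml_text)` with an updated file is enough, which is the property that makes PMML-based deployment less labor-intensive.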

High-performance open source scoring engines are an important part of the teraflow services middleware. An initial version of a scoring engine is available now, and a full release is expected in 3Q 2005.
