Uses of InfoSleuth Agents
Overview
Heterogeneous Information Fusion
Heterogeneous information fusion is the ability to combine information
from current or legacy information sources using database query processing
techniques. This type of application is similar to ones addressed by federated
and multidatabase technology, but the agent-based approach is more flexible
and extensible, and adapts well to situations where the underlying set
of information sources being accessed is likely to change over time. An
example of this type of application is EDEN.
Key technical features of InfoSleuth that support these applications are
described below.
Information Subscription, Classification and Analysis
Information subscription allows the user to gather both documents and data
from the web as well as from known data sources, and to further process
those documents and data to narrow them down to what is relevant to the real
needs of the user. This "narrowing down" may be done, for example, by classifying
retrieved documents using a classification hierarchy, and then enabling
the users to retrieve documents by classified concept. Another type of
"narrowing down" is to analyze the stream of documents for specific trends,
and report when significant changes occur in those trends (along with the
supporting evidence); for example, whether or not a particular topic is
acquiring more or fewer documents than usual over a given period of time.
Alternatively, the agents could be looking for some specific type of event,
and only report when that event has occurred. An example of this type of
application is Technology
Tracking. Key technical features of InfoSleuth that support these types
of applications are described below.
Composition of Legacy Information and Applications
InfoSleuth can be used to control simple pipelined processes that incorporate
external applications such as legacy and COTS applications along with legacy
and current information resources. Both intermediate and final results
may be stored in an intermediary database, and made available to users
and to later processes in the pipeline using either a one-time or a perpetual
query paradigm. These applications may be structured as pipelined workflows
that are controlled by an external agent or by internal data-flow mechanisms. Very
long-duration workflows (such as would be used in 7x24 operation) can be
implemented even in a situation where individual agents are unstable. An
example of this type of application is Genome
Mapping. Key technical features of InfoSleuth that support these types
of applications are described below.
Application-Supporting Features of InfoSleuth
Heterogeneous Information Fusion Features
Ontology-based information representation: InfoSleuth agents integrate
legacy and heterogeneous information sources using a common ontology. The
ontology is the domain- or user-centric view of the information, shared
among the different applications in the domain and the agents that support
them. In InfoSleuth, ontologies tend to be narrowly focused on the domain,
and many different ontologies are supported within the system. Ontology-based
integration requires no unified or conceptual schema over all the data
in the underlying data resources, which allows for less constrained information
fusion.
Common view of information: Specialized resource agents provide
seamless, ontology-based access to diverse information sources, which
may be distributed geographically or across enterprises. A resource agent
maps the information it has access to into the terms of the ontology. This
may include mapping among different representations ("value mapping") so
that related information can be put together, such as mapping company names
to a common representation. Sometimes, these complex value mapping procedures
may be required for multiple resources, in which case they are encapsulated
into separate value mapping agents.
Global relational query processing: Other agents, called multiresource
query agents, can gather information from multiple resource agents and
run complex global queries over the gathered data. InfoSleuth supports
all relational query operations over all accessible information. Portal
agents, which provide an interface to user application GUIs, pass queries
specified by the user to the InfoSleuth agents, and forward the responses
back as HTML pages, either directly or via email.
Dynamic set of available resources: New information sources
may be added to the system simply by encapsulating them into resource agents
and starting them. The resource agents advertise the availability of the
new data, and henceforward any queries that require that data will access
the new information source. Similarly, when an information source becomes
unavailable, the resource agent advertises its unavailability and goes
offline.
Perpetual queries: In addition to submitting one-time queries,
the InfoSleuth agents support a mode where the user can specify a query
to monitor over time. When a perpetual query is set up, the user can choose
either to have the result of the query returned periodically, or to receive
an initial result followed by any changes that occur.
User interface: The user may ask one-time or perpetual queries
either through a customized interface or through the TQML browser provided
with InfoSleuth. Either type of interface allows for the annotation of
information with its source, as well as the ability to drill down into
specific results.
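The second perpetual-query mode described above (an initial result followed by any changes) can be illustrated with a minimal sketch. This is not InfoSleuth code: the function name `perpetual_query`, the polling loop, and the representation of results as sets of tuples are all assumptions made for illustration.

```python
from typing import Callable, Iterable, Iterator, Set, Tuple

Row = tuple  # hypothetical: one result tuple of the monitored query


def perpetual_query(
    run_query: Callable[[], Iterable[Row]], polls: int
) -> Iterator[Tuple[str, Set[Row]]]:
    """Re-run a query `polls` times; report the first result, then only changes."""
    previous = None
    for _ in range(polls):
        current = set(run_query())
        if previous is None:
            # First evaluation: deliver the complete initial result.
            yield ("initial", current)
        else:
            added = current - previous
            removed = previous - current
            if added:
                yield ("added", added)
            if removed:
                yield ("removed", removed)
        previous = current


# Usage: simulate a data source whose contents change between polls.
snapshots = iter([
    {("Acme", 10)},
    {("Acme", 10)},               # unchanged, so nothing is reported
    {("Acme", 10), ("Bolt", 5)},  # one row appears
    {("Bolt", 5)},                # one row disappears
])
events = list(perpetual_query(lambda: next(snapshots), polls=4))
```

The periodic mode would simply yield `current` on every poll; the change mode shown here keeps the previous result so that only differences are forwarded to the user.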
Information Subscription, Classification and Analysis Features
Sweeping the World Wide Web: Two approaches have been used to retrieve
documents relevant to a topic from the web. First, web sweeper agents are
configured with starting URLs and associated regions that are known
to have relevant information. All documents in the specified region are
returned. Web sweepers also periodically monitor those areas to pick up
any new or updated documents that may have become available, and to note
any obsolete documents. Second, some text agents have used a web search
engine to do keyword-based retrievals, then further processed the retrieved
documents to determine their relevance to specific ontological concepts.
Classification: Several types of classifiers have been developed
that classify semi-structured and unstructured documents according to a concept
hierarchy. This classification includes measuring the applicability of
the document to the topic it has been classified under, to characterize
the level of uncertainty in the classification.
Fact extraction: In certain cases, especially when a document
has some known structure, text agents can extract facts from documents
and store this structured information in a separate data table for later
querying or monitoring by users.
Data analysis: InfoSleuth has encapsulated some data analysis
tools into agents. A data analysis agent monitors and analyzes information
provided by the other agents, and stores its results in a new database,
so that the results are accessible to users and to other agents. For example,
the deviation detection agent can analyze a stream of documents for trends
and determine how the stream is changing in character (for instance, is a
particular topic becoming more or less popular?).
User interface: Users can query the classification hierarchy,
extracted facts, or data analysis results. The results may be presented
in customized presentations. Through these customized displays, users can
drill down for more detail on specific documents.
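As an illustration of the deviation detection idea mentioned under data analysis (is a topic acquiring more or fewer documents than usual?), here is a minimal sketch. It is not the InfoSleuth agent's actual method: the z-score test, the threshold of 2.0, and the function name `detect_deviation` are assumptions chosen for simplicity.

```python
from statistics import mean, stdev
from typing import Sequence


def detect_deviation(
    history: Sequence[int], current: int, threshold: float = 2.0
) -> str:
    """Compare this period's document count for a topic with past periods.

    Returns "more", "fewer", or "usual". `history` needs at least two counts.
    """
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation of past counts
    if sigma == 0:
        # All past periods were identical, so any difference is a deviation.
        if current > mu:
            return "more"
        if current < mu:
            return "fewer"
        return "usual"
    z = (current - mu) / sigma
    if z > threshold:
        return "more"
    if z < -threshold:
        return "fewer"
    return "usual"


# Usage: weekly document counts classified under one concept (made-up numbers).
weekly_counts = [10, 12, 11, 9, 13]
status = detect_deviation(weekly_counts, current=20)
```

A real agent would also keep the documents behind each count so that a reported change can be accompanied by its supporting evidence.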
Composition of Legacy Information and Applications Features
Incorporation of legacy and COTS applications: Existing applications
are wrapped as InfoSleuth agents using the analysis agent shell. This shell
takes directives from another agent or from a user concerning the activities
it should be doing on an ongoing basis. These activities read information
from one database, feed the data and the associated commands to the external
application, process the result data, and store it in a second database.
The new results are then available for users or for further processing.
Incorporation of legacy data sources: Resource agents map data
from existing data sources onto the common ontology. Existing data sources
may also include information-generating sources such as sensors and medical
monitors, wrapped with a resource agent shell that enables them to be accessed
using a queryable interface for one-time or perpetual queries.
Pipelined mini-workflows: Mini-workflows are captured as plans
that can be enacted by control agents. Mini-workflows can integrate people,
information resources, information-generating machines and analytic applications.
Depending on the types of processes integrated into the mini-workflow,
steps in the workflow may be executed by single, long-running processes
or by a series of short invocations of the processes as needed or when
input is available.
User interface: Users can define simple processes using the
prototype process definition tool. Results may be accessible via customized
presentations or via a generic table representation.
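The pipelined, data-flow style of mini-workflow described above, where a step is invoked whenever input is available and its results are stored for the next step, can be sketched as follows. This is only a rough illustration: the function `run_pipeline` is hypothetical, in-memory queues stand in for the intermediary databases, and plain Python functions stand in for wrapped external applications.

```python
from collections import deque
from typing import Any, Callable, Iterable, List


def run_pipeline(steps: List[Callable[[Any], Any]], inputs: Iterable[Any]) -> list:
    """Run items through `steps` in order, one short invocation at a time."""
    # queues[i] holds the pending input for step i; queues[-1] holds final results.
    queues = [deque(inputs)] + [deque() for _ in steps]
    progressed = True
    while progressed:
        progressed = False
        for i, step in enumerate(steps):
            if queues[i]:
                # Input is available: invoke the step once and store its result
                # where the next step (or the user) can pick it up.
                item = queues[i].popleft()
                queues[i + 1].append(step(item))
                progressed = True
    return list(queues[-1])


# Usage: three stand-in "applications" applied to two input records.
results = run_pipeline([str.strip, str.upper, len], [" acgt ", "tt"])
```

Because each step only reads from its input queue and writes to its output queue, a step could be replaced or restarted without the others needing to know, which is the property the intermediary database provides in the description above.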