Operations used for this purpose include conforming operations, which change the form of a schema. In this paper, a set of primitive conforming operations for object-oriented schemas is presented. The work described in this article arises from two needs. First, there is still a need for database systems more sophisticated than relational ones. Secondly, there is a growing need for distributed databases. These needs are addressed by fragmenting schemata of a generic object data model and providing an architecture for its implementation.
Key features of the architecture are the use of abstract communicating agents to realize database transactions and queries, the use of an extended remote procedure call to enable remote agents to communicate with one another, and the use of multi-level transactions. Linguistic reflection is used to map database schemata to the level of the agents. Transparency for users is achieved through dialogue objects, which are extended views on the database. Over the past decades many papers have been published about the effects of Information Technology (IT) on organisations.
However, despite the fact that IT has become a fundamental variable for organisational design, very few studies have explored this vital issue in a systematic and convincing fashion. The small amount of information and the few theories available on the effects of IT on organisational design are surprising. One major deficiency of previous studies is the lack of empirical evidence. This has led researchers to describe IT in general terms and has resulted in different and often contradictory findings.
Many researchers have become concerned about the shortfall of comprehensive study on organisational design and IT, which has been apparent for decades. One objective of this research is to fill this gap. This study investigates three questions, aiming to develop a theoretical framework for evaluating the effects of IT on organisational design: What are the effects of IT on organisational design variables? How does IT influence organisational design variables? Which effects result from which IT technologies?
These could be considered the most important features of this study, distinguishing it from the previous literature. In this paper, we define two patterns that fall under the category of the architectural patterns described by Shaw, to provide solutions for client-server applications. The first pattern defines the structure of a client-server application by defining the server's functionality in the form of standardized services, and the second defines the structure of a service in this type of application.
This is a project for developing an asynchronous approach to the distributed execution of legacy code. A job execution environment is a set of tools used to run jobs generated to execute a legacy code, handling different input and output values for each run. Current job execution and problem-solving environments are mostly based on synchronous messaging and customized APIs that need to be ported to different platforms. The environment allows the execution of computational algorithms utilizing standard Internet technologies such as Java, XML, and asynchronous communication protocols.
It has been tested successfully using several legacy simulation codes on pools of Windows and Solaris systems. Most database applications capture their data using graphical forms. Text fields have limited size and predefined types. Although data in fields are associated with constraints, they must be modeled in a suitable way to conform to a rigid schema. Unfortunately, too many constraints on data are inconvenient for human activities, most of which are document-centric.
In fact, documents have become a natural medium for human production and consumption. Nowadays, increasing interest is placed on managing data with irregular structures, exchanging documents over the net, and manipulating their contents as efficiently as structured data. Our system ensures flexible and well-adapted information capture based on a Document User Interface and, at the same time, information retrieval based on databases. DRUID relies on a wrapper that transforms document contents into relevant data. It also provides an expressive specification language for end-users to write domain-related extraction patterns.
We validate our information system with a prototype of its different modules; this first realization is promising for a wide range of applications that use documents as a means to store, exchange, and query information. This paper introduces the Database Grid, an Internet-oriented resource management architecture for database resources.
We identify the basic requirements on databases in two major application domains. Next, we illustrate how a layered service architecture can fulfil these emerging data sharing and data management requirements of Grid computing applications. We introduce a series of protocols to define the proposed services. The problem of creating a global schema over a set of heterogeneous databases is becoming more and more important due to the availability of multiple databases within organizations.
The global schema should provide a unified representation of the possibly heterogeneous local schemas by analyzing them to exploit their semantic contents, resolving semantic and schematic discrepancies among them, and producing a set of mapping functions that translate queries posed on the global schema into queries posed on the local schemas.
In this paper, we provide a general framework that supports the integration of local schemas into a global one. The framework takes into consideration the fact that local schemas are autonomous and may evolve over time, which can make the definition of the global schema obsolete. We define a set of integration operators that integrate local schemas, based on the semantic relevance of their classes, into a set of virtual classes that constitute the global schema. We also define a set of modifications that can be applied to local schemas as a consequence of their local autonomy. For every local modification, we define a propagation rule that automatically disseminates the effects of that modification to the global schema without having to regenerate it from scratch via integration.
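The propagation idea can be sketched as follows. This is a hedged illustration, not the paper's actual operators: the `VirtualClass` structure, the class names, and the `propagate_add_attribute` rule are all invented for the example. It shows only how a single local modification can be folded into the global schema without re-running the whole integration.

```python
# Hypothetical sketch: a global virtual class records which (source, local
# class) pairs it integrates; a propagation rule for the "attribute added
# locally" modification updates the virtual class in place.

class VirtualClass:
    def __init__(self, name, attributes, members):
        self.name = name                   # name of the global virtual class
        self.attributes = set(attributes)  # integrated attribute set
        self.members = members             # list of (source, local_class) pairs

def propagate_add_attribute(virtual_class, source, local_class, attribute):
    """Propagate a local 'add attribute' modification to the global schema."""
    if (source, local_class) in virtual_class.members:
        virtual_class.attributes.add(attribute)
        return True       # change absorbed without re-integration
    return False          # the local class is not part of this virtual class

vc = VirtualClass("Customer", {"id", "name"},
                  [("db1", "Client"), ("db2", "Kunde")])
propagate_add_attribute(vc, "db1", "Client", "email")
```

In a full framework there would be one such rule per modification type (attribute removal, class split, and so on), each incrementally adjusting the affected virtual classes.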
Intranet information retrieval is very important for corporations. They try to discover useful knowledge hidden in web pages by using data mining, knowledge discovery, and similar techniques. In this process, a search engine is useful. However, conventional search engines, which are based on a centralized architecture, are not suited for intranet information retrieval because intranet information is frequently updated.
Centralized search engines take a long time to collect web pages with crawlers, robots, and so on. We have therefore developed a distributed search engine, called Cooperative Search Engine (CSE), to retrieve fresh information. In CSE, a local search engine located in each Web server makes an index of local pages, and a meta search server integrates these local search engines to realize a global search engine. This design introduces communication delay at retrieval time, so we have developed several speedup techniques to realize fast retrieval.
As a result, we have succeeded in increasing the scalability of CSE. In this paper, we describe these speedup techniques and evaluate them. This paper describes an English-Chinese cross-language patent retrieval system built on commercial database management software. The system makes use of various software products and lexical resources to help English native speakers search for Chinese patent information. This paper reports the overall system design and the cross-language information retrieval (CLIR) experiments conducted for performance evaluation.
The experimental results and the follow-up analysis demonstrate that commercial database systems can be used as an IR system with reasonable performance. Better performance could be achieved if the translation resources were customized to the document collection of the system, or if more sophisticated translation disambiguation strategies were applied. In this paper we describe a technique for implementing compensating transactions based on the active database concept of triggers. This technique enables the specification and enforcement of compensation logic in a manner that facilitates consistent and semi-automatic compensation.
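A minimal sketch of the trigger-style compensation idea (the log structure and all names below are assumptions for illustration, not the paper's actual mechanism): each completed step registers a compensating action, and on abort the compensations fire in reverse order, much like ECA triggers reacting to an abort event.

```python
# Illustrative compensation log: completed steps register their undo actions;
# aborting replays them last-in-first-out, restoring the original state.

class CompensatingLog:
    def __init__(self):
        self._log = []

    def did(self, description, compensate):
        """Record a completed step together with its compensating action."""
        self._log.append((description, compensate))

    def abort(self):
        """Fire compensations in reverse order, as triggers on 'abort'."""
        undone = []
        for description, compensate in reversed(self._log):
            compensate()
            undone.append(description)
        self._log.clear()
        return undone

balance = {"a": 100, "b": 0}
log = CompensatingLog()
balance["a"] -= 30
log.did("debit a", lambda: balance.update(a=balance["a"] + 30))
balance["b"] += 30
log.did("credit b", lambda: balance.update(b=balance["b"] - 30))
undone = log.abort()   # compensations restore both accounts
```

The appeal of phrasing this as triggers is that the compensation logic lives declaratively next to the forward operation, rather than being scattered through application code.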
A web service, with its loosely coupled nature and autonomy requirements, represents an environment well suited to this compensation mechanism. Open networks can be used for many purposes, such as e-commerce or e-government. In contrast to those conventional applications, we consider networked collaborative activities, for example networked research activities.
Such applications could be very useful and could significantly promote research activities. However, many security problems must be addressed.
Among those problems, we focus in this paper on the architecture of a secure database. The design of such an architecture is not a trivial task, since the data sets in the database may be composed of a wide range of data types, and each data type needs to satisfy its own security properties, including not only security itself but also the appropriate management of intellectual-property rights, and so on. We therefore design an architecture for a secure database that takes data types and various security operations into account.
One of the most important problems encountered in the cooperation among distributed information systems is that of heterogeneity, which is often not easy to deal with. This problem requires the use of the best combination of software and hardware components for each organization. However, the few approaches suggested for managing virtual factories have not been satisfactory. Along with motivating the importance of such systems, this paper describes the major design goals of an agent-based architecture for supporting the cooperation of heterogeneous information systems.
This combination guarantees the interoperability of legacy systems regardless of the heterogeneity of their data models and platforms and, therefore, improves the cooperation process. Examples are given from the supply chains of manufacturing enterprises. A data warehouse is a large centralized repository that stores a collection of data integrated from external data sources (EDSs). EDSs are autonomous in most cases; as a consequence, their content and structure change over time.
To keep the content of a data warehouse up to date after source data change, various warehouse refreshing techniques have been developed, mainly based on incremental view maintenance. A data warehouse also needs refreshing after the schema of an EDS changes. This problem has, however, received little attention so far.
Few approaches have been proposed, and those that exist tackle the problem mainly through temporal extensions to a data warehouse. Such techniques show their limitations in multi-period querying. Moreover, to support decision makers' predictions of trends, what-if analysis is often required. For these purposes, multiversion data warehouses seem very promising. In this paper we propose a model of a multiversion data warehouse and present our prototype implementation.
Indeed, in certain real-time applications, incomplete results obtained in time are more interesting than complete results obtained late. When such applications are distributed, the main problem for the DBMSs on which they are based is managing transaction concurrency control and commit processing. Since these processes must be done in time, such that each transaction meets its deadline, committing transactions on time is the main issue. In this paper, we deal with the global distributed transaction commit and local concurrency control problems in applications where transactions may be decomposed into a mandatory part and an optional part.
In our model, these parts are determined by a weight parameter assigned to each subtransaction. It is used to help the coordinator process execute the commit phase when a transaction is close to its deadline. Another parameter, the estimated execution time, is used by each participant site in combination with the weight to resolve conflicts that may occur between local subtransactions.
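A rough sketch of how a coordinator might combine these two parameters near the deadline (all parameter names, thresholds, and figures below are invented; the actual RT-WEP decision logic may differ): only subtransactions whose weight marks them as mandatory are committed, and only if their estimated execution time still fits in the time remaining.

```python
# Illustrative selection of subtransactions to commit when a deadline is
# near: favour high-weight (mandatory) parts whose estimated execution
# time still fits the remaining time budget.

def select_for_commit(subtransactions, time_left, weight_threshold):
    """subtransactions: list of (name, weight, estimated_exec_time)."""
    chosen = []
    # consider the heaviest-weight subtransactions first
    for name, weight, est in sorted(subtransactions, key=lambda s: -s[1]):
        if weight >= weight_threshold and est <= time_left:
            chosen.append(name)
            time_left -= est        # budget consumed by this subtransaction
    return chosen

subs = [("t1", 0.9, 2.0), ("t2", 0.3, 1.0), ("t3", 0.8, 3.0)]
select_for_commit(subs, time_left=4.0, weight_threshold=0.5)
```

Here t2 is dropped as optional (low weight) and t3, although mandatory, no longer fits after t1 is scheduled.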
Simulations have been carried out to compare the RT-WEP protocol with two other protocols designed for the same purpose. The results show that the RT-WEP protocol can be applied efficiently in a distributed real-time context, allowing more transactions to meet their deadlines. The pervasive connectivity of the Internet and the powerful architecture of the WWW are changing many market conventions and creating a tremendous opportunity for conducting business on the Internet.
Digital marketplace business models and the advancement of Web-related standards are tearing down walls within and between different business artifacts and entities at all granularities and at all levels, from devices, operating systems, and middleware to directories, data, information, applications, and finally business processes. As a matter of fact, business process integration (BPI), which entails the integration of all the facets of business artifacts and entities, is emerging as a key IT challenge.
In this paper, we describe our effort to explore a new approach to addressing the complexities of BPI. More specifically, we study how to use a solution-template-based approach for BPI and explore the validity of this approach on a frequently encountered integration problem, the item synchronization problem for large enterprises. The proposed approach can greatly reduce the complexity of the business integration task and reduce the time and effort required of system integrators.
Different customers are deploying the described item synchronization system. A major problem that arises from integrating different databases is the existence of duplicates. Data cleaning is the process of identifying two or more records within the database that represent the same real-world object (duplicates), so that a unique representation for each object is adopted. Existing data cleaning techniques rely heavily on full or partial domain knowledge.
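As a point of reference for attribute-level, domain-independent matching, the sketch below shows a generic position-aware string comparison. It is an illustration in the same spirit, not the paper's positional algorithm: characters count as matching when they occur at, or adjacent to, the same position in both values, so no domain knowledge about the field is needed.

```python
# Generic position-aware attribute comparison (assumed illustration):
# a character of one value matches if it appears within a small window
# around the same position in the other value.

def positional_similarity(a, b, window=1):
    a, b = a.lower(), b.lower()
    if not a or not b:
        return 0.0
    hits = 0
    for i, ch in enumerate(a):
        lo, hi = max(0, i - window), min(len(b), i + window + 1)
        if ch in b[lo:hi]:          # match at or near the same position
            hits += 1
    return hits / max(len(a), len(b))

positional_similarity("Jonathan", "Jonathon")   # high: near-duplicate values
```

A record-level score would then combine such attribute scores, weighted per field, which is where the field-weighting-by-profiling technique comes in.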
This paper proposes a positional algorithm that achieves domain-independent de-duplication at the attribute level. The paper also proposes a technique for field weighting through data profiling, which, when used with the positional algorithm, achieves domain-independent cleaning at the record level. Experiments show that the positional algorithm achieves more accurate de-duplication than existing algorithms. Data integration systems are designed to offer uniform access to data from heterogeneous and distributed sources.
Two basic approaches have been proposed in the literature to provide integrated access to multiple data sources. In the materialized approach, data are previously accessed, cleaned, integrated and stored in the data warehouse and the queries submitted to the integration system are evaluated in this repository without direct access to the data sources. In the virtual approach, the queries posed to the integration system are decomposed into queries addressed directly to the sources. The data obtained from the sources are integrated and returned to the user.
In this work we present a data integration environment that integrates data distributed across multiple web data sources and combines features of both approaches, supporting the execution of virtual and materialized queries. Another distinguishing feature of our environment is a cache system for answering the most frequently asked queries. All these resources are put together with the goal of optimizing overall query response time. Global query optimization in a multidatabase system (MDBS) is a challenging issue, since some local optimization information, such as local cost models, may not be available at the global level due to local autonomy.
It becomes even more difficult when dynamic environmental factors are taken into consideration. In our previous work, a qualitative approach was suggested to build so-called multistate cost models to capture the performance behavior of a dynamic multidatabase environment. It has been shown that a multistate cost model can give a good cost estimate for a query run in any contention state in the dynamic environment.
In this paper, we present a technique to perform query optimization based on multistate cost models for a dynamic MDBS. Two relevant algorithms are proposed. The first one selects a set of representative system environmental states for generating an execution plan with multiple versions for a given query at compile time, while the second one efficiently determines the best version to invoke for the query at run time. Experiments demonstrate that the proposed technique is quite promising for performing global query optimization in a dynamic MDBS.
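The two-phase idea can be illustrated with a deliberately simplified sketch (the contention values, plan names, and distance measure are all invented): one plan version is kept per representative contention state chosen at compile time, and at run time the version for the closest observed state is invoked.

```python
# Simplified run-time version selection: each representative contention
# state has its own compiled plan version; the observed state picks the
# nearest one.

REPRESENTATIVE_STATES = {          # contention level -> plan version
    0.1: "plan_low_contention",
    0.5: "plan_medium_contention",
    0.9: "plan_high_contention",
}

def choose_version(observed_contention):
    """Pick the plan version whose representative state is closest."""
    best_state = min(REPRESENTATIVE_STATES,
                     key=lambda s: abs(s - observed_contention))
    return REPRESENTATIVE_STATES[best_state]

choose_version(0.75)
```

The point of the design is visible even in this toy form: the run-time step is a cheap lookup, so no plan has to be modified or re-generated when the environment changes.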
Compared with related work on dynamic query optimization, our approach has the advantage of avoiding the high overhead of modifying or re-generating an execution plan for a query based on dynamic run-time information. These collections may represent application software of scientific areas; they reside in geographically dispersed organizations and constitute the system content. The user may invoke on-line computations of scientific datasets when the latter are not found in the system.
Thus, ARION provides the basic infrastructure for accessing and deriving scientific information in an open, distributed, and federated system. In this work, we first introduce a list representation in main memory for storing and computing datasets. The sparse transaction dataset is compressed as the empty cells are removed. Accordingly, we propose a ScanOnce algorithm for association rule mining on the platform of list representation, which needs to scan the transaction database only once to generate all the possible rules. Owing to the regularity of its data structure, the complete itemset counter tree can be stored in a one-dimensional vector without any gaps, whose direct-addressing capability ensures fast access to any counter.
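One concrete way to realize such a direct-addressed counter vector is sketched below (this layout is an assumption for illustration, not necessarily the paper's exact structure): with k items, each non-empty itemset maps to a bitmask, and its counter lives at index bitmask − 1 in a flat vector with no gaps, so any counter is reached in one array access.

```python
# Direct-addressed itemset counters in a flat vector: transactions are
# bitmasks over k items; every non-empty sub-itemset of a transaction
# bumps the counter at index (bitmask - 1). One database scan suffices.

def count_itemsets(transactions, k):
    counters = [0] * ((1 << k) - 1)      # one slot per non-empty itemset
    for t in transactions:               # t is a bitmask over k items
        s = t
        while s:                         # enumerate all subsets of t
            counters[s - 1] += 1
            s = (s - 1) & t
    return counters

# items: bit0 = A, bit1 = B; transactions {A, B} and {A}
counters = count_itemsets([0b11, 0b01], k=2)
# counters[0] -> support of {A}, counters[1] -> {B}, counters[2] -> {A, B}
```

The subset enumeration is exponential in transaction width, so this layout pays off only for the short, sparse transactions the list representation targets.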
In our opinion, this new algorithm using list representation economizes on storage space and accesses. The experiments show that the ScanOnce algorithm beats the classic Apriori algorithm for large problem sizes, by factors ranging from 2 to more than 6. Developing software-in-the-large involves many developers, with experts in various aspects of software development and of the application area. This paper presents an approach to integrating software process models in a distributed context. The integration methodology presented unifies the various fragments at the static level as well as at the dynamic (behavioural) level.
We consider various possible semantic conflicts; formal definitions of the inter-fragment properties are formulated and solutions for these conflicts are proposed. This integration approach provides multiple solutions for the integration conflicts and makes it possible to improve and design new software process models by merging reusable process fragments.
This paper brings together two research areas involving the representation of time: Data Warehouses and Temporal Databases. Looking at temporal aspects within a data warehouse, more similarities than differences between temporal databases and data warehouses have been found. The first point of closeness between these areas is the possibility of redefining a data warehouse in terms of a bitemporal database. A bitemporal storage mechanism is proposed in this paper. To meet this goal, a temporal study of data sources is developed. Moreover, we show how object-oriented temporal data models contribute the integration and subject-orientation required by a data warehouse.
An organisation must enable its employees to share knowledge and information in order to optimise their tasks. The volume of information contained in documents is of major importance for these companies: they must be fully reactive to any new information and must follow the fast evolution of disseminated information. A documentary memory, which stores this information and allows end-users to access and analyse it, is therefore a necessity for every enterprise. We propose, in this paper, the architecture of such a system, based on a document warehouse, allowing the storage of relevant documents and their exploitation via information retrieval, factual data interrogation, and multidimensional information analysis.
To meet their temporal constraints, current applications such as Web-based services and electronic commerce use data replication. To benefit from replication, we need concurrency control mechanisms that perform well even when the distributed system is overloaded.
In this paper, we present a protocol that uses a new notion, the importance value, associated with each real-time transaction. Under overload conditions, this value is used to select the most important transactions with respect to the application so that they can pursue their execution; the other transactions are aborted. Our protocol RCCOS (Replica Concurrency Control for Overloaded Systems) augments MIRROR, a concurrency control protocol designed for firm-deadline applications operating on replicated real-time databases, in order to manage transactions efficiently when the distributed system is overloaded.
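An illustrative sketch of importance-based load shedding (not RCCOS itself; the names, importance values, and load figures are invented): transactions with the lowest importance value are aborted until the remaining load fits the site's capacity.

```python
# Toy overload manager: keep transactions in decreasing order of importance
# as long as the accumulated load fits the capacity; abort the rest.

def shed_load(transactions, capacity):
    """transactions: list of (name, importance, load). Returns (kept, aborted)."""
    kept, aborted = [], []
    total = 0.0
    for name, importance, load in sorted(transactions, key=lambda t: -t[1]):
        if total + load <= capacity:
            kept.append(name)
            total += load
        else:
            aborted.append(name)       # sacrificed under overload
    return kept, aborted

kept, aborted = shed_load([("t1", 5, 0.4), ("t2", 9, 0.5), ("t3", 2, 0.3)],
                          capacity=1.0)
```
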
A platform has been developed to measure the number of transactions that meet their deadlines when the processor load of each site is controlled. Christie Ezeife, Pinakpani Dey. Abstract: Horizontal fragments of a class in an object-oriented database system contain subsets of the class extent, or instance objects.
These fragments are created from a set of system input data consisting of the application queries, their access frequencies, and the object database schema with its components - the class inheritance and class composition hierarchies as well as the instance objects of classes. When these system inputs to the fragmentation process change enough to affect system performance, a re-fragmentation is usually done from scratch.
This paper proposes an incremental re-fragmentation method that uses mostly the updated part of input data and previous fragments to define new fragments more quickly, saving system resources and making the data at distributed sites more available for network and web access.
This paper presents research in Geographic Information Systems interoperability. It also describes our development work and introduces an interoperability framework called GeoNis, which uses the proposed technologies to perform integration tasks between GIS applications and legacy data sources over the Internet.
Our approach provides integration of distributed GIS data sources and legacy information systems in local community environment. Large organizations have disparate legacy systems, applications, processes, and data sources, which interact by means of various kinds of interconnections. Merging of companies can increase the complexity of system integration, with the need to integrate applications like Enterprise Resource Planning and Customer Relationship Management.
Even if these applications sometimes provide a kind of access to their underlying data and business logic, Enterprise Application Integration (EAI) is still a challenge. In this paper we analyse the needs that drive EAI with the aim of identifying the features that EAI platforms must exhibit to enable companies to compete in the new business scenarios. We discuss the limitations of current EAI platforms and of their evaluation methods, mainly economies of scale and economies of scope, and argue that a shift is needed towards an economies-of-learning model. Finally, we outline an EAI architecture that addresses current limitations by enabling economies of learning.
Enterprise Resource Planning systems (ERPs) are large, complex enterprise-wide information systems that offer the benefits of integration and data richness to organisations. This paper explores the quality issue of response times and the impact of poor response times on the ability of the organisation studied to achieve its strategy. The PeopleSoft ERP was implemented within the International Centre for international student recruitment and support at an Australian university, as part of a university-wide implementation.
To achieve the goal of increased international student enrolments, fast turnaround times on student applications are critical. Due to the number of parties participating in the design phase of an automation project, various design, engineering and operational systems are needed. At the moment, the means to transfer information from one system to another system, so that it can be further processed or reused, are not efficient.
An integration approach in which XML technologies are utilized for implementing systems integration is introduced. The data content of systems is defined by XML Schema instances. XML messages containing automation design information are transformed using transformation stylesheets that employ a generic standard vocabulary.
Loosely coupled, platform independent, data content-oriented integration is enabled by XML technologies. A case study that proceeds according to the approach is also described. It consists of both a software prototype responsible for communication and data content including XML Schema instances and transformation stylesheets for the systems covered in the study. It is found that XML technologies seem to be a part of the right solution. However, some issues related to schema design and transformations are problematic.
If complex systems are integrated, XML technologies alone are not sufficient. Future developments include a general purpose web-service solution that is to answer questions that were not dealt with by this case study. The Global-As-View approach to data integration has focused on the semi-automatic definition of a global schema starting from a given set of known information sources. In this paper, we investigate how to employ concepts and techniques to model imprecision in defining mappings between the global schema and the source schemas and to answer queries posed over the global schema.
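The kind of similarity-aware mapping investigated here can be sketched roughly as follows (the schema, similarity values, threshold, and the min t-norm combination are all assumptions made for the example): source rows enter the global relation only when the mapping similarity exceeds a threshold, and each row carries a membership degree combining mapping similarity with source relevance.

```python
# Toy fuzzy population of a global relation: each source contributes its
# rows only if its schema-mapping similarity passes a threshold; the
# resulting rows carry a membership degree (fuzzy AND of similarity and
# source relevance, using the min t-norm).

def populate_global(sources, threshold=0.5):
    """sources: list of (rows, mapping_similarity, source_relevance)."""
    result = []
    for rows, similarity, relevance in sources:
        if similarity < threshold:
            continue                          # discard low-similarity source
        degree = min(similarity, relevance)   # fuzzy AND (min t-norm)
        result += [(row, degree) for row in rows]
    return result

rows = populate_global([
    (["r1", "r2"], 0.9, 0.8),   # good mapping, relevant source
    (["r3"], 0.3, 1.0),         # mapping too dissimilar: discarded
])
```
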
We propose an extended relational algebra using fuzzy sets for defining SQL-like query mappings. Such mappings explicitly take into account the similarities between global and source schemas to discard source data items with low similarity and to express the relevance of different sources in populating the global schema. In the case where the global schema is not materialized, we propose a query rewriting technique for expressing over the sources the queries posed over the global schema. Data warehouses are currently given a lot of attention by both academics and practitioners, and the amount of literature describing different aspects of data warehousing is ever-increasing.
Much of this literature covers the characteristics and the origin of the data in the data warehouse, and the importance of external data is often pinpointed. Still, the descriptions of external data are on a general level, and the extent of external data usage is not given much attention. Therefore, in this paper, we describe the results of an interview study, partly aimed at outlining the current usage of external data in data warehouses. The study was directed towards Swedish data warehouse developers, and the results show that the usage of external data in data warehouses is not as frequent as expected.
Reasons given for the rather low usage were problems in assuring the quality of the external data and a lack of data warehouse maturity among the user organizations. Distributed databases offer a complete range of desirable features. However, all of these benefits come at the expense of some extra management; the main issues considered in the literature as the basis of a tuned distributed database system are data replication and synchronization, concurrency access, distributed query optimization, and performance improvement.
The work presented here tries to provide some clues to the last point, considering an issue which, in our humble opinion, has not been taken sufficiently into account: we try to show how the right load-balancing policy influences the performance of a distributed database management system, and more concretely of a shared-nothing one. OO conceptual models are key artifacts produced at these early phases, which cover not only static aspects but also dynamic aspects.
Therefore, focusing on quality aspects of conceptual models could contribute to producing better-quality OOSS. While quality aspects of structural diagrams, such as class diagrams, have been widely researched, the quality of behavioural diagrams, such as statechart diagrams, has been neglected. This fact led us to define a set of metrics for measuring their structural complexity. To gather empirical evidence that the structural complexity of statechart diagrams is closely related to their understandability, we carried out a controlled experiment in previous work.
The aim of this paper is to present a replication of that experiment. The findings obtained in the replication corroborate the results of the first experiment, in the sense that, to some extent, the number of transitions, the number of states, and the number of activities influence the understandability of statechart diagrams. Achieving the interoperation of heterogeneous data sources with respect to their context and rich semantics remains a real challenge.
Users need to integrate useful information and query coupled data sources in a transparent way. We propose a solution to help integrate heterogeneous sources according to their context. We present a model to define contextual information associated with local data, and a mechanism which uses these semantics to compare local contexts and integrate relevant data.
Our contextual integration approach, using a rule-based language, allows us to build virtual objects in a semi-automatic way. They play the role of transparent interfaces for end-users. Building a data warehouse involves complex details of the analysis and design of an enterprise-wide decision support system. Dimensional modeling can be used to design effective and usable data warehouses. The paper highlights the steps in the implementation of a data warehouse in a client project.
All the observations and phases mentioned in this document are with reference to the project carried out for medium-to-large multi-dimensional databases for a client in a controlled test environment. The recommendations, conclusions and observations made in this document may not be generalized for all cases unless verified and tested.
As the need to store large quantities of increasingly complex XML documents grows, the requirements for database products that claim to support XML also increase. For example, it is no longer acceptable to store XML documents without using indices for efficient retrieval from large collections. In this paper we analyse the current versions of products representing the three main approaches to XML storage. Several products are analysed and compared, including performance tests.
Our main conclusion is that the market urgently needs a standard query language and API, analogous to SQL and ODBC, which were probably the main drivers for the success of relational databases.

Currently, there are multiple different classifications for the product descriptions used in enterprise-internal and cross-enterprise applications. A key problem is to run applications developed for one catalogue on product descriptions that are stored in a different classification.
A common solution is that a catalogue specialist manually maps the different classifications onto each other. Our approach avoids unnecessary manual mapping work and automatically generates mappings between different classifications wherever possible. This allows us to run e-procurement applications on different catalogues with substantially reduced manual mapping effort, which we consider an important step towards enterprise application integration.
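A minimal sketch of the automatic-mapping idea, under the simplifying assumption that categories whose normalized names match can be mapped automatically, while the rest are left to the catalogue specialist. The category names below are invented examples, not taken from any real classification:

```python
# Hedged sketch: auto-map categories between two classifications by
# normalized-name matching; unmatched categories need manual mapping.

def normalize(name):
    return "".join(ch for ch in name.lower() if ch.isalnum())

def auto_map(source_cats, target_cats):
    index = {normalize(t): t for t in target_cats}
    mapping, unmapped = {}, []
    for s in source_cats:
        t = index.get(normalize(s))
        if t is not None:
            mapping[s] = t          # generated automatically
        else:
            unmapped.append(s)      # left for the catalogue specialist
    return mapping, unmapped

source = ["Ball Bearings", "Screws", "Hex Nuts"]
target = ["ball-bearings", "screws", "washers"]
mapping, unmapped = auto_map(source, target)
print(mapping)   # {'Ball Bearings': 'ball-bearings', 'Screws': 'screws'}
print(unmapped)  # ['Hex Nuts']
```

Real classification mapping would of course need richer matching (synonyms, hierarchy structure), but the division into automatic mappings plus a manual residue is the essence of the approach described above.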
Today, XML is the format of choice for implementing interoperability between systems.

A data warehouse is a central repository of integrated information available for the purpose of efficient decision support or OLAP queries. One of the important decisions when designing a data warehouse is the selection of views to materialize and maintain in it.
The goal is to select an appropriate set of materialized views so as to minimize the total query response time and the cost of maintaining the selected views, under the constraint of a given total view maintenance time. The performance and behavior of the Greedy Algorithm considering maintenance costs (GAm) and the proposed Greedy Interchange Algorithm considering maintenance costs (GIAm) are examined through experimentation. An enhancement to the GIAm is proposed: it selects a subset of views to which the GIA is applied, rather than all the views of a view graph.
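The core greedy loop of such view selection can be sketched as follows; the savings and maintenance costs below are invented for illustration, and the ratio-based heuristic is a generic greedy variant, not the exact GAm/GIAm formulation:

```python
# Hedged sketch of greedy view selection: repeatedly pick the view with
# the best ratio of query-cost saving to maintenance cost, while the
# total maintenance time stays within the given constraint.

def greedy_select(views, maintenance_limit):
    """views: dict name -> (query_saving, maintenance_cost)."""
    selected, used = [], 0.0
    candidates = dict(views)
    while candidates:
        # Best benefit per unit of maintenance cost.
        name = max(candidates,
                   key=lambda v: candidates[v][0] / candidates[v][1])
        saving, cost = candidates.pop(name)
        if used + cost <= maintenance_limit:
            selected.append(name)
            used += cost
    return selected, used

views = {"v1": (100.0, 10.0), "v2": (80.0, 40.0), "v3": (30.0, 5.0)}
selected, used = greedy_select(views, maintenance_limit=20.0)
print(selected, used)  # ['v1', 'v3'] 15.0
```

An interchange algorithm such as GIAm would additionally try swapping selected views against rejected ones to escape poor greedy choices.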
This selection is based upon view dependencies and results in a substantial run-time reduction.

Developers need to find the best set of components that implements most of the required features. Retrieving components manually can be very complicated and time-consuming. Tools that partially automate this task help developers to build better systems with less effort. This paper proposes a methodology for ranking and selecting components to build an entire system, instead of retrieving just a single component.

Data warehousing is an essential element of decision support. In order to supply a decisional database, meta-data is needed to enable communication between the various functional areas of the warehouse, and an ETL (Extraction, Transformation, and Load) tool is needed to define the warehousing process.
Developers use a mapping guideline to configure the ETL tool with the mapping expression for each attribute. In this paper, we define a model covering different types of mapping expressions. We use this model to create an active ETL tool. In our approach, we use queries to achieve the warehousing process: SQL queries are used to represent the mapping between the source and the target data.
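A minimal sketch of this query-based mapping idea: each target attribute's mapping expression is plain SQL, so the whole transformation runs inside the DBMS as a single INSERT ... SELECT. The schema, expressions, and data are invented for illustration:

```python
import sqlite3

# Source and target schemas are hypothetical stand-ins.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE src_customer (first TEXT, last TEXT, amount_cents INTEGER);
CREATE TABLE dw_customer  (full_name TEXT, amount REAL);
""")
cur.executemany("INSERT INTO src_customer VALUES (?, ?, ?)",
                [("Ada", "Lovelace", 1250), ("Alan", "Turing", 300)])

# One SQL mapping expression per target attribute (concatenation and a
# unit conversion here), assembled into a single transformation query.
mappings = {"full_name": "first || ' ' || last",
            "amount": "amount_cents / 100.0"}
select_list = ", ".join(f"{expr} AS {col}" for col, expr in mappings.items())
cur.execute(f"INSERT INTO dw_customer SELECT {select_list} FROM src_customer")

rows = cur.execute("SELECT * FROM dw_customer ORDER BY full_name").fetchall()
print(rows)  # [('Ada Lovelace', 12.5), ('Alan Turing', 3.0)]
```

Because the mapping meta-data is just SQL text, it can be stored, inspected, and regenerated by a query generator, which is the basis of the tool described here.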
Thus, we allow the DBMS to play an expanded role as a data transformation engine as well as a data store. This approach enables a complete interaction between mapping meta-data and the warehousing tool. In addition, this paper investigates the efficiency of a query-based data warehousing tool. It describes a query generator for reusable and more efficient data warehouse (DW) processing. Besides exposing the advantages of this approach, this paper presents a case study based on real-scale commercial data to verify our tool's features.

Algorithms for validation play a crucial role in the use of XML as the standard for interchanging data among heterogeneous databases on the Web.
Although much effort has been devoted to formalizing the treatment of elements, attributes have been neglected. This paper presents a validation model for XML documents that takes into account the element and attribute constraints imposed by a given DTD. Our main contribution is the introduction of a new formalism to deal with both kinds of constraints. This formalism has interesting characteristics; moreover, it can be implemented easily, giving rise to an efficient validation method.

The Web has an ever-changing technological landscape.
The standards and techniques used to implement Web applications, as well as the platforms on which they are deployed, are subject to constant change. In order to develop Web applications in a structured and systematic manner despite these dynamics, a clear development methodology that treats flexibility and extensibility as central goals is needed. This paper proposes a definition of the term Web application and a conceptual architectural framework for Web applications. In addition, some important characteristics of such a framework are investigated and a construction methodology is presented.
Descriptive knowledge about a multivalued data table, or Information System, can be expressed in declarative form by means of a binary Boolean-based language. This paper presents a contribution to the study of an arbitrary multivalued Information System by introducing a non-binary array algebra that allows the treatment of multiple-valued data tables with systematic algebraic techniques. An Information System can be described by several distinct, but equivalent, array expressions. Among these, the all-prime-ar expression is singled out. The all-prime-ar expression is unique, although not necessarily minimum in the number of prime-ars.
Finally, a completely intensional technique that determines a cover (a minimal prime-ar expression) is presented.

The aim of this paper is to present a middleware that combines the flexibility of distributed heterogeneous databases with the performance of local data access. The middleware supports both the XML and relational database paradigms and applies Grid security techniques. The computing and database access facilities are implemented using Grid and Java technologies. In our system, data can be accessed in the same way independently of its location, storage system, and even storage format.
The system also supports distributed queries and transaction management over heterogeneous databases. Our system can be utilised in many applications related to the storage, retrieval, and analysis of information. Because of its advanced security components, e-commerce is a potential application area, too. The implementation is based on the principle that each node on the computing grid containing a database also contains a Java agent. Database requests are first sent to the agent, which takes care of security tasks, possibly does some preprocessing or translation of the query, and finally transmits the request to the database system.
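The request path through such an agent (security check, preprocessing, forwarding) can be sketched as follows; the class, its methods, and the dict-backed stub database are all hypothetical stand-ins for the Java agents and Grid services described above:

```python
# Hedged sketch of the agent layer: each agent guards one database,
# performing an authorization check and a trivial preprocessing step
# before forwarding the request to its backend (a stub dict here).

class DatabaseAgent:
    def __init__(self, backend, authorized_users):
        self.backend = backend              # stands in for a real DBMS
        self.authorized = authorized_users

    def handle(self, user, query):
        if user not in self.authorized:     # security task
            raise PermissionError(f"{user} may not query this database")
        query = query.strip().rstrip(";")   # trivial "translation" step
        return self.backend.get(query, [])  # forward to the database

backend = {"SELECT name FROM users": ["ada", "alan"]}
agent = DatabaseAgent(backend, authorized_users={"ada"})
print(agent.handle("ada", " SELECT name FROM users; "))  # ['ada', 'alan']
```

In the real system the forwarding step would be a remote call to the database node, and agents would also coordinate with one another for distributed transactions.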
The agents also take care of distributed transaction management. The system does not have a dedicated master; instead, each agent is capable of handling distributed transactions by sending requests to other agents.

A definition of types in an information system is given, from real-world abstractions, through the constructs employed for data and function descriptions, through data schemas and definitions, to the physical data values held on disk.
This four-level architecture of types is considered in terms of the real-world interpretation of the types and the level-pairs between them. The theory suggests that four levels are sufficient to provide ultimate closure for computational types to construct information systems. The Godement calculus can be used to compose mappings at different levels.
IRDS appears to be more open at the top level, but does not support two-way mappings.
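Loosely illustrating the composition of mappings between adjacent levels mentioned above: each level-pair mapping is a function, and crossing several levels is just their composition. The four levels and the example mappings are invented stand-ins, not the paper's actual formalism:

```python
# Hedged sketch: compose level-to-level mappings, in the spirit of the
# Godement-style composition of mappings between type levels.

def compose(*mappings):
    """Compose level-to-level mappings left to right."""
    def composed(value):
        for m in mappings:
            value = m(value)
        return value
    return composed

# Hypothetical mappings: real-world concept -> construct -> schema item.
concept_to_construct = {"customer": "entity Customer"}.get
construct_to_schema = {"entity Customer": "TABLE customer"}.get

to_schema = compose(concept_to_construct, construct_to_schema)
print(to_schema("customer"))  # TABLE customer
```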
Geo-referenced data is acquired and then integrated into an existing GIS. With the advent of mobile computing devices, namely Personal Digital Assistants (PDAs), this manual integration task can be avoided. This way, the task of updating geo-referenced data can be done on-site, in the place where the data is acquired, and the integration into the GIS can be done automatically. In order to have the system cope with many different applications, we decided to provide a transformer from and to GML, the OGC-proposed standard.

The model we define organises data in a constellation of facts and dimensions with multiple hierarchies.
In order to ensure data consistency and reliable data manipulation, we extend this constellation model with intra- and inter-dimension constraints. The intra-dimension constraints allow the definition of exclusions and inclusions between hierarchies of the same dimension. The inter-dimension constraints relate hierarchies of different dimensions.
We also study the effects of these constraints on multidimensional operations.

The revolution in computing brought about by the Internet is changing the nature of computing from a personalized computing environment to a ubiquitous one in which both data and computational resources are network-distributed. Client-server communications protocols permit parallel ad hoc queries of frequently updated databases, but they do not provide the functionality to automatically perform continual queries that track changes in those data sources through time. The lack of persistence of the state of data resources requires users to repeatedly query databases and manually compare the results of searches through time.
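The repeated-query-and-compare pattern that a continual query automates can be sketched as follows; the snapshot data and the set-difference change detection are illustrative only, not CQServer's actual mechanism:

```python
# Hedged sketch of a continual query: the system keeps the previous
# result of a query and reports only the changes between snapshots,
# instead of making the user re-run the query and compare by hand.

def continual_query(snapshots):
    """Yield (added, removed) for each changed snapshot of a result."""
    previous = set()
    for snapshot in snapshots:
        current = set(snapshot)
        added, removed = current - previous, previous - current
        if added or removed:
            yield sorted(added), sorted(removed)
        previous = current

# Three successive query results over time (invented data).
results = [["a", "b"], ["a", "b"], ["a", "c"]]
changes = list(continual_query(results))
print(changes)  # [(['a', 'b'], []), (['c'], ['b'])]
```

A production system additionally needs time-based triggering, persistence of the saved state, and scalable distribution of the comparisons, which is what CQServer addresses.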
To date, continual query systems have lacked both external and internal scalability. Herein we describe CQServer, a scalable, platform- and implementation-independent system that uses a distributed object infrastructure for heterogeneous enterprise computation of both content- and time-based continual queries.

XML is widely used by database management systems for the representation and transport of data.
In this paper we focus on integrating the latest W3C XML Schema specifications with hash maps for the efficient retrieval of objects from XML documents and their transformation into heterogeneous object-oriented databases. Besides XML Schema incorporation, this research also provides new options for handling large XML-ized database documents.

The ubiquity of the Internet gives organizations the possibility of forming virtual alliances. This not only implies that business transactions must be linked, but also requires that business applications be integrated to support them.
In this paper, we present an integral approach to blending modern business data requirements with existing legacy data resources that offers techniques at both the conceptual and the implementation level. The reverse engineering strategy allows modernized business data systems to co-exist with legacy repository systems.
In particular, the methodology aims at constructing a conceptual Federation Enterprise Schema (FES) to support the complicated task of data wrapper integration throughout the development cycle. The FES model plays a pivotal role in the creation of virtual alliances by representing a unified data view for all participants. This unified data model serves as the foundation for the actual integration of the wrapped legacy data systems, possibly together with modernized data systems.
Thus, in contrast to other available approaches, the FES is not developed from scratch but is instead composed out of pre-existing legacy wrappers.

Within the framework of an international sporting event which involved 23 countries in 23 disciplines and gathered a large number of participants (VIPs, officials, athletes, judges, referees, doctors, journalists, technicians, and volunteers), the central organizing committee was obliged to automate its activities and to distribute them among 16 committees, in particular to guarantee the best conditions of organization and safety.
Thus, we were called upon to develop a prototype dealing with the transport activity.

Due to their analytically oriented and cleansed integration of data from several operational and external data sources, data warehouse systems serve as a substantial technical foundation for decision support. Within the scope of our research we are seeking novel solutions for handling data acquisition within such environments. In this paper we present some aspects of our approach to data acquisition. We briefly sketch our framework and outline the underlying process model.
We show how logic programs may be used to protect secure databases that are accessed via a web interface from the unauthorized retrieval of positive and negative information, and from unauthorized insert and delete requests. To achieve this protection, we use a deductive database expressed in a form that is guaranteed to permit only authorized access requests to be performed. The protection of the positive information that may be retrieved from a database and of the information that may be inserted is treated in a uniform way, as is the protection of the negative information in the database and of the information that may be deleted.
The approach we describe has a number of attractive technical properties: it enables access control information to be seamlessly incorporated into a deductive database, and it enables security information to be used to help optimize the evaluation of access requests. These properties are particularly useful in the context of a database accessed via the Internet, since this form of access requires a practical access control method which is both powerful and flexible. We describe our implementation of a web-server front-end to a deductive database which incorporates our access authorization proposals.
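The flavour of rule-based protection can be sketched as follows; this is a drastically simplified stand-in, not the authors' deductive formalism: authorization "rules" are plain facts, and a request is evaluated only if a matching fact covers it. All users, relations, and data are invented:

```python
# Hedged sketch: access rules as (user, action, relation) facts; a
# request is performed only when an authorization fact covers it.

RULES = {("ada", "retrieve", "salaries"),
         ("ada", "insert", "salaries"),
         ("bob", "retrieve", "phonebook")}

DATABASE = {"salaries": [("ada", 100)], "phonebook": [("bob", "555-0100")]}

def request(user, action, relation):
    if (user, action, relation) not in RULES:
        raise PermissionError(f"{user} may not {action} {relation}")
    if action == "retrieve":
        return DATABASE[relation]
    raise NotImplementedError(action)  # inserts/deletes omitted here

print(request("ada", "retrieve", "salaries"))  # [('ada', 100)]
```

In the deductive setting, rules are logic clauses rather than flat facts, which is what lets the same machinery both restrict access and help optimize query evaluation.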
As the Internet expands, and the amount of information that we can find on the Web grows along with it, the usability of pages becomes more important. Many sites still receive quite low evaluations from participants when it comes to certain aspects of usability. This paper proposes a set of quantitative and qualitative metrics under a usability-centred quality model, and a usability-testing experiment in which this model can be validated. Finally, usability tests may do a great job of showing what is not working in a design, but one should not get caught in the trap of asking testers to suggest design improvements: creating Web sites is easy, but creating sites that truly meet the needs and expectations of the wide range of online users is quite another story.
COPLA is a software tool that provides an object-oriented view of a network of replicated relational databases. It supports a range of consistency protocols, each of which supports different consistency modes. The resulting scenario is a distributed environment where applications may start multiple database sessions, which may use different consistency modes, according to their needs. This paper describes the COPLA platform, its architecture, its support for database replication and one of the consistency algorithms that have been implemented on it.
A system of this kind may be used in the development of applications for companies that have several branch offices, such as banks, hypermarkets, etc. In such settings, several applications typically use data generated on-site in local branches, while other applications also use information generated in other branches and offices. The services provided by COPLA efficiently cater for both local and non-local data querying and processing.
Information integration has been an important area of research for many years, and the problem of integrating geographic data has recently emerged. OLAPWare aims to overcome this limitation by allowing its Java clients to query the objects of a dimensional data cube without depending on the chosen implementation platform.

The globalisation phenomenon has created a very competitive environment for modern business organisations. In order to survive and remain competitive in that environment, an organisation has to adapt to it quickly, with minimal negative impact on its current ways of working and organising.
The purpose of this paper is to present a methodological framework for business modelling. A business model is presented as a set of three interrelated models: the Business Goals model, the Business Processes model, and the Information Systems model. The main contribution of our paper is to make visible and explicit the relationships among the three levels, relationships which are commonly hidden or implicit in most business modelling methods. Our proposition has proven its usefulness as a strategic management tool in two case studies.
The classic problem of information integration has been addressed for a long time. The Semantic Web project aims to define an infrastructure that enables machine understanding. This vision tackles the problem of semantic heterogeneity by using ontologies for information sharing. Agents have an important role in this infrastructure. In this paper we present a new solution, known as DIA (Data Integration using Agents), for semantic integration using mobile agents and ontologies.
The refreshment of a data warehouse is an important process which determines the effective usability of the data collected and aggregated from the sources. Indeed, the quality of the data provided to the decision makers depends on the capability of the data warehouse system to convey, in a reasonable time, the changes made at the data sources from the sources to the data marts. We present our current work, in which we use different approaches to maintain the temporal coherency of data gathered from web sources, and wrappers extended with temporal characteristics to keep temporal consistency.
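The idea of a wrapper extended with temporal characteristics can be sketched as a freshness-bounded cache: a value fetched from a web source is reused while it is younger than a freshness bound, otherwise the source is queried again. The class and the stub source below are hypothetical, not the authors' implementation:

```python
import time

# Hedged sketch of a temporally-extended wrapper: reuse the cached
# value while it is fresh, otherwise re-fetch from the source.

class TemporalWrapper:
    def __init__(self, fetch, max_age_seconds, clock=time.monotonic):
        self.fetch, self.max_age, self.clock = fetch, max_age_seconds, clock
        self.value, self.fetched_at = None, None

    def get(self):
        now = self.clock()
        if self.fetched_at is None or now - self.fetched_at > self.max_age:
            self.value, self.fetched_at = self.fetch(), now  # re-fetch
        return self.value

calls = []
wrapper = TemporalWrapper(lambda: calls.append(1) or len(calls),
                          max_age_seconds=3600)
print(wrapper.get(), wrapper.get())  # fetched once, then served from cache
```

Choosing `max_age_seconds` per source is exactly where temporal coherency requirements enter: fast-changing sources get a tight bound, slow ones a loose bound.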