The number of documents generated in a construction project and stored in inter-organizational information systems is significant, and a large percentage of these project documents are generated in text format. These systems provide a common framework for document organization and information exchange among project members. Current systems for document management rely on manual classification methods controlled by human experts. Given the widespread use of information technologies in construction, the increasing availability of electronic documents, and the development of systems based on project object models, manual classification becomes unfeasible. This paper presents an approach to improving information organization and access in inter-organizational systems based on automated classification of construction project documents according to their related project components. Machine learning methods were used for this purpose. A prototype of a document classification system was developed to provide easy deployment and scalability to the classification process.
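To make the idea of classifying documents by project component concrete, here is a minimal sketch in Python. The abstract above does not say which machine learning method was used, so multinomial Naive Bayes is an assumption chosen for brevity. The component labels (`foundation`, `roof`, `electrical`) and the training texts are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.doc_counts = Counter()              # label -> number of documents
        self.vocabulary = set()

    def fit(self, documents, labels):
        for text, label in zip(documents, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.doc_counts[label] += 1
            self.vocabulary.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            for token in tokens:
                score += math.log((counts[token] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training data: document text paired with a project component.
training_docs = [
    "concrete pour schedule for foundation footings and rebar inspection",
    "foundation excavation report and soil bearing test results",
    "roof truss delivery and roofing membrane installation details",
    "roof drainage specification and flashing submittal",
    "electrical panel wiring diagram and conduit routing",
    "lighting fixture submittal and electrical load calculation",
]
training_labels = ["foundation", "foundation", "roof", "roof", "electrical", "electrical"]

classifier = NaiveBayesClassifier().fit(training_docs, training_labels)
predicted = classifier.predict("request for information about rebar spacing in the footings")
print(predicted)
```

A real system would train on far more documents per component and would likely use a richer text representation; the sketch only shows the shape of the task, where each incoming document is assigned to the component whose vocabulary it most resembles.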
EWAS is an early warning prototype that collects and analyses news items from the European Media Monitor. Although it currently processes news articles, it can easily be adapted to any other form of text. The data mining functions performed by the system are categorization, clustering, and named entity extraction. The main design concern of the system is scalability, which is achieved by a modular architecture that allows multiple instances of the same component to run in parallel. The main objective behind this research is to develop a system that identifies “who is doing what” to estimate “who can do what” in order to predict “what can really happen tomorrow”.
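The scalability idea described above, several instances of the same component working in parallel, can be sketched as follows. This is not the EWAS implementation, which the abstract does not describe. The `categorize` component, the keyword lists, and the `run_component` helper are all hypothetical, and a thread pool stands in for what could equally be separate processes or machines.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical keyword lists; a real categorizer would be trained or curated.
CATEGORY_KEYWORDS = {
    "health": {"outbreak", "virus", "vaccine"},
    "security": {"attack", "border", "troops"},
}


def categorize(article):
    """One pipeline component: assign every category whose keywords appear."""
    words = set(article.lower().split())
    return sorted(name for name, keys in CATEGORY_KEYWORDS.items() if words & keys)


def run_component(component, items, instances=4):
    """Run up to `instances` parallel workers of one component, preserving input order."""
    with ThreadPoolExecutor(max_workers=instances) as pool:
        return list(pool.map(component, items))


articles = [
    "Officials report a new virus outbreak in the region",
    "Troops moved toward the border overnight",
    "Local team wins the championship",
]
results = run_component(categorize, articles)
print(results)
```

Because each component is a plain function that takes one item and returns one result, the same `run_component` helper could wrap a clustering or named entity extraction step, and the number of instances can be raised without changing the component itself.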
Textual databases are useful sources of information and knowledge, and if they are well utilised, issues related to future project management and product or service quality improvement may be resolved. A large part of corporate information, approximately 80%, is available in textual data formats.
Text classification techniques are well known for managing online sources of digital documents. Identifying the key issues discussed within textual data and classifying them into two different classes could help decision makers or knowledge workers to manage their future activities better.
Compared with structured data sources that are usually stored and analysed in spreadsheets, relational databases, and single data tables, unstructured construction data sources such as text documents, site images, web pages, and project schedules have been less intensively studied because of additional challenges in data preparation, representation, and analysis. In this paper, our vision for data management and mining that addresses these challenges is presented, together with related research results from previous work, as well as our recent developments in data mining on text-based, web-based, image-based, and network-based construction databases.
Source by jacksonbird