
    An efficient stream-based join to process end user transactions in real-time data warehousing

    Jamil, Noreen

    5.pdf (408.9Kb)
    Date
    2014-06
    Citation:
    Jamil, N. (2014). An Efficient Stream-based Join to Process End User Transactions in Real-Time Data Warehousing. Journal of Digital Information Management, 12, pp. 201-215.
    Permanent link to Research Bank record:
    https://hdl.handle.net/10652/4069
    Abstract
    In the field of real-time data warehousing, semi-stream processing has become an active area of research over the last decade. One important operation in semi-stream processing is joining stream data with slowly changing disk-based master data. A join operator is usually required to implement this operation. This join operator typically works under limited main memory, which is generally not large enough to hold the whole disk-based master data. Recently, a seminal join algorithm called MESHJOIN (Mesh Join) has been proposed in the literature to process semi-stream data. MESHJOIN is a candidate for a resource-aware system setup. However, MESHJOIN is not very selective. In particular, MESHJOIN does not consider the characteristics of stream data, and its performance is suboptimal for skewed stream data. In this paper we propose a novel Semi-Stream Join (SSJ) using a new cache module. The algorithm is more appropriate for skewed distributions, and we present results for Zipfian distributions of the type that appears in many applications. We present the cost model for our SSJ and validate it with experiments. Based on the cost model we also tune the algorithm for maximum performance. We conduct a rigorous experimental study to test our algorithm. Our experiments show that SSJ outperforms MESHJOIN significantly.
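The general idea in the abstract, keeping frequently requested master tuples in a small main-memory cache so that a skewed stream rarely has to touch the disk, can be illustrated with a minimal sketch. This is not the SSJ or MESHJOIN algorithm from the paper: the class name `CachedSemiStreamJoin`, the use of a plain dictionary to stand in for the disk-based master data, and the least-recently-used eviction policy are all assumptions made for illustration only.

```python
from collections import OrderedDict


class CachedSemiStreamJoin:
    """Illustrative only: joins stream tuples with master data via a small LRU cache."""

    def __init__(self, master, cache_size):
        self.master = master          # assumption: a dict standing in for disk-based master data
        self.cache = OrderedDict()    # limited main-memory cache of master tuples
        self.cache_size = cache_size
        self.disk_reads = 0           # counts simulated disk accesses

    def _lookup(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)         # mark as most recently used
            return self.cache[key]
        self.disk_reads += 1                    # cache miss: simulated disk access
        row = self.master.get(key)
        if row is not None:
            self.cache[key] = row
            if len(self.cache) > self.cache_size:
                self.cache.popitem(last=False)  # evict the least recently used entry
        return row

    def join(self, stream):
        # Emit one joined tuple per stream tuple that has a matching master tuple.
        for key, payload in stream:
            row = self._lookup(key)
            if row is not None:
                yield key, payload, row


# Hypothetical data: key 1 dominates the stream, key 500 has no master tuple.
master = {k: f"product-{k}" for k in range(100)}
stream = [(1, "a"), (1, "b"), (2, "c"), (1, "d"), (500, "e")]
joiner = CachedSemiStreamJoin(master, cache_size=2)
results = list(joiner.join(stream))
```

In this toy run the repeated key is served from the cache after its first lookup, so the number of simulated disk reads stays below the number of stream tuples. A real semi-stream join would also have to handle master data updates and batch disk access, which this sketch ignores.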
    Keywords:
    real-time data warehousing, semi-stream processing, join operator, performance measurement, data processing, information resources management
    ANZSRC Field of Research:
    080109 Pattern Recognition and Data Mining
    Available Online at:
    http://dline.info/fpaper/jdim/v12i3/5.pdf
    Rights:
    This digital work is protected by copyright. It may be consulted by you, provided you comply with the provisions of the Act and the following conditions of use: Any use you make of these documents or images must be for research or private study purposes only, and you may not make them available to any other person. You will recognise the author's and publisher's rights and give due acknowledgement where appropriate.
    This item appears in
    • Computing Journal Articles [50]

    © Unitec Institute of Technology, Private Bag 92025, Victoria Street West, Auckland 1142
     

     
