Traditionally, data warehouses take too long to build and are too hard to change. WhereScape RED is an Integrated Development Environment, designed to support the quick and simple building and managing of data warehouses.
It has the flexibility to enable you to build a variety of architectures including:
- Enterprise data warehouses
- Dimensional data warehouses
- Data marts
- User-facing views, aggregates, and summaries
In all cases, the core values of WhereScape RED are twofold: it's rapid building capabilities that enable better data warehouses to be built, faster, and its integrated environment that simplifies management.
As a data warehouse-specific tool, WhereScape RED embodies a simple, pragmatic approach to building data warehouses. With WhereScape RED you specify what you want to achieve by dragging and dropping objects to create a meta view, and then let WhereScape RED do the heavy lifting of creating the necessary tables, procedures, documentation, etc. Data warehouse wizards prompt for additional information at critical points to provide the maximum value from the generated objects.
Different data warehousing approaches including agile, prototyping, and waterfall are supported by WhereScape RED. Agile developers will find specific functionality that has been included to support common agile practices. For developers who are new to data warehousing, or are looking for a fast, pragmatic approach, WhereScape's Pragmatic Data Warehousing Methodology can be used.
The basic concepts behind WhereScape 's Pragmatic Data Warehousing Methodology are the following:
- Minimize the impact on the source systems
- Centralize management within the data warehouse
- Store transactional data at the lowest practical grain within the data warehouse
- Snapshot, combine, and rollup transactional tables to provide additional value
- Utilize star schemas, views, or cubes for end-user access
- Allow for incremental loads from day one
- Use of an iterative approach
- Simplifying the reconciliation of Load tables to source system tables - data is cleaned and transformed in subsequent Stage tables instead
- Design the data warehouse independently from the end-user tool layer
WhereScape RED supports these concepts to facilitate very rapid delivery of data warehouses. WhereScape RED controls the flow of data from the source systems through transforming and modeling layers to analysis areas.
Different styles of data warehousing (e.g. EDW 3NF, dimensional, etc.) are supported and utilize different objects, but all follow the same basic flow.
Data Flow - Enterprise Models
- Source (OLTP) System
- Load tables
- Stage tables
- Data store tables
- Model tables, dimension tables, or detailed (transactional) fact tables
- Roll up fact table(s)
- Aggregate and/or KPI fact table(s)
- Views
- Export objects
- Microsoft Analysis Services cubes
The diagram below shows the objects and the information flow:
Data is moved from source tables to Load tables via scripts, database links, and ODBC links. These Load tables are created by dragging and dropping from a connection object. Load tables are generally based on source system tables. Their main purpose is to be a destination for moving data as simply and quickly as possible from the source system. Load tables generally hold a single unit of data (e.g. last night or last month), and are truncated at the start of each extract. Transformations can be performed on the columns during the load process if required.
Load tables feed Stage tables, which in turn feed Data Store, Model, or Dimension tables. Data from multiple Load tables can be combined at this level.
First-tier transactional tables (Fact or Model) are created and updated from Stage tables. Second-tier tables (Model, Summary Rollup, Aggregate, KPI, etc.) are created and updated from lower-level tables.
Cubes can be created from transactional tables or views.
WhereScape RED generates procedural code in the target database's native language at each stage in the data warehouse build process. The generated code is, in nearly all cases, sufficient to create a rapid prototype of the data warehouse.
While the generation of code is often seen as a key benefit of WhereScape RED, the ability to control and manage custom code is also critical to the long-term management of the data warehouse environment.
In most cases, 85-100% of the generated code is taken through to production with no customization required.
The flow of data from the source systems to data warehouse tables is controlled and managed by the WhereScape RED Scheduler. All generated code includes audit and error-logging logic that is used by the Scheduler. The Scheduler provides a single point of control for the data warehouse. From the Scheduler, the state of all jobs can be ascertained. Any warning or error messages can be investigated and if a problem occurs, the Scheduler controls the restart of the job from the point of failure.
Documenting the data warehouse is often a task left until last, and in many cases done once (if done at all!) and not kept up to date. WhereScape RED generates user and technical documentation, including diagrams in HTML format. Technical documentation includes copies of all current procedures. User documentation includes a glossary of business terms, available independently of any end-user tool. Where additional specific information needs to be included in the documentation, WhereScape RED supports the inclusion of custom HTML pages in the generated output. This means in many cases the entire documentation requirements can be managed from one location, and regenerated as changes occur.
WhereScape RED's core strength is in the rapid building of data warehouse structures. Organizations that have already purchased traditional ETL tools can use WhereScape RED as a pureplay data warehouse toolset. WhereScape RED can be used to iteratively build data marts or presentation layer objects that need to be constantly updated to keep relevant for end users. In most cases, customers will find that WhereScape RED has enough ETL capabilities to build the entire data warehouse, using the database rather than a proprietary engine to perform ETL processing.
The crossover in functionality between ETL tools and WhereScape RED is not large. WhereScape RED is tightly integrated into the data warehouse database and has an embedded data warehouse building approach. For WhereScape, data movement is the start of the process—from the source system to Load tables. The key benefits of the product: development productivity and an integrated environment to manage and maintain your data warehouse, come after the data movement stage.
Where a traditional ETL tool is already in use, the output of the ETL process is a WhereScape RED Load, Stage, Dimension, Fact, or Model table from which WhereScape RED builds more advanced data warehouse structures.