Pending Projections: Future Claims About the Future The Pending Projections dataset consists of all those rows in an asserted version table which lie in both the assertion time future an
Trang 1Pending Updates exist in what we called, in the previous chapter, either the time near future or the assertion-time far future Those in the near future have an assertion begin date close enough to Now() that the business is willing
to let the passage of time make them current Near future deferred assertions would typically have a begin date that will become current in the next few seconds, hours, days or weeks
In a conventional database, pending updates are transactions accumulated in an external batch transaction file, or perhaps
in a batch transaction table within the database
Far future deferred assertions are the internalization of data located in what are often called staging areas They are collections
of data that are usually more complicated than usual to update By placing them in far future assertion time, we guarantee that they will not inadvertently become current assertions simply because
of the passage of time They can become current assertions only when, presumably after a review-and-approve process, the busi-ness releases them into near-future assertion time
Pending Projections: Future Claims About the Future
The Pending Projections dataset consists of all those rows in
an asserted version table which lie in both the assertion time future and in the effective time future Its subject matter is things as they may turn out to be Its rows are claims about what currently lies in the future, but claims which we are not yet willing to make Pending Projections are a record of what we may eventually be willing to say things are going to be like Here is the view which re-presents Pending Projections With the suffix “Pend_Proj” standing for “pending projections”, it looks like this:
CREATE VIEW Policy_Pend_Proj
AS SELECT oid, asr_beg_dt, asr_end_dt, eff_beg_dt, eff_end_dt, client, type, copay
what things used to be like
what we used to claim what we currently claim
what we will claim things will be like
what we will claim
what things are like what things will be like
Figure 13.12 Pending Projections
306 Chapter 13 RE-PRESENTING INTERNALIZED PIPELINE DATASETS
Trang 2FROM Policy_AV
WHERE asr_beg_dt > Now()
AND eff_beg_dt > Now()
As we have seen with our other re-presented pipeline
datasets, Pending Projections include both the assertion and
effective time period as part of the unique identifier because
both temporal dimensions are specified as ranges, and neither
as points in time
Mirror Images of the Nine-Fold Way
As we said in Chapter 9, effective time exists within assertion
time First, logically speaking, we make a statement about how
things are Next, logically speaking, we make a truth claim about
that statement
Most of our queries against bi-temporal tables will specify a
point in assertion time—most commonly Now()—and then ask
for rows asserted at that point in time that were in effect at
some point or period of effective time For example, we might
ask for all policies that were in effect on August 23, 2008, as we
currently believe them to have been Or we might ask for all
policies which we currently claim were in effect any time in
the first half of 2008
Pinning down a point in assertion time, and then asking for
versions of objects claimed at that point in time to be correct,
is the general form that queries will take when posed by business
users But we can look at bi-temporal data from the opposite
point of view as well We can pin down a point or period in
effec-tive time, and ask for everything we ever asserted about things at
that point in time
It would not be too misleading to call this the auditor’s point
of view From this point of view, we are interested in the history
of our claims about what is true, not in the history of what
actually happened out there in the world Of course, we could
also ask for all future assertions about a given point in effective
time But auditors, by the nature of their work, have little interest
in future assertions By the same token, they are very interested
in past assertions, along with current ones So an auditor’s
mirror-image of the nine categories reduces to a set of six categories,
those shown inFigure 13.13
These views that auditors are interested in are physically the
same ones we have already described The “mirror-image” is in
perspective, not in content
Chapter 13 RE-PRESENTING INTERNALIZED PIPELINE DATASETS 307
Trang 3The Value of Internalizing Pipeline Datasets
The cost of managing physical pipeline datasets is high This cost is seldom discussed because it is universally thought to be just an inevitable cost of doing business Bringing down this cost
is a matter of doing all those various things that IT management has done for decades, and continues to do Quality control pro-cedures are put in place so errors don’t creep into our databases and later have to be backed out The platform costs of storing, transforming, and moving data into and out of pipeline data-sets are controlled by minimizing redundancy, and by moving datasets up and down the storage hierarchy Software that sets
up and runs production schedules minimizes the human costs
of scheduling work involving these pipeline datasets
But the work of managing pipeline datasets is tedious And whenever the management of these datasets is a one-off kind
of thing, i.e whenever the development group has to manage these datasets rather than the IT Operations group that handles scheduled maintenance, errors in managing them are not uncommon
Asserted Versioning does not offer a way to more efficiently manage pipeline datasets It offers a way to eliminate them and, consequently, eliminate the totality of their management costs! There will always be some circumstances in which data must be manipulated in external pipeline datasets But these can become the exception rather than the rule
In place of these pipeline datasets, Asserted Versioning stores the information contained in those pipeline datasets internally, within the production tables that are their sources and destinations Pending transactions can be stored within the production tables themselves Posted transactions can be, too Data staging areas can also exist as semantically distinct sets
of rows, physically contained within production tables Pipeline datasets, then, cease to exist as distinct physical objects They become virtualized, as semantically distinct collections of rows
what things used to be like
what we used to claim what we currently claim
what we used to claim things used to be like
what we currently claim things used to be like what we currently claim things are like now what we currently claim things will be like
what we used to claim things are like now what we used to claim things will be like
what things are like what things will be like
Figure 13.13 The Auditor’s Mirror Image of the Nine-Fold Way
308 Chapter 13 RE-PRESENTING INTERNALIZED PIPELINE DATASETS
Trang 4all physically existing within the same tables, re-presented in
different views
We may think that the principal cost elimination benefit of
internalizing pipeline datasets is that it reduces the number of
distinct datasets that programs, SQL and production scheduling
software have to identify and manage This is a reduction in the
cost of the mechanics of pipeline datasets Instead of assembling
data from multiple tables, it already exists all in one place
But the more significant cost reduction has to do with the
semantics of pipeline datasets With all data about the same
things in the same place, we will, all of us, find all of it when
we go looking for it The most junior member of the business
community will find the same set of data for his queries that
the most senior member does There won’t be differences in
completeness of the source data, or quality of that data, as there
so often are in today’s business world and today’s collections of
business data
When we need any of this data, we won’t have to go looking for
it All of the data about what we once thought was true, or what
we currently think is true, or what we are not yet willing to assert
is true, will be available by simply changing the assertion
point-in-time selection criterion on views and queries By changing that
predicate in a WHERE clause to a past point in assertion time, we
will be able to access the internalized re-presentation of posted
transactions By changing the predicate to a future point in time,
we will be able to access the internalized re-presentation of
pend-ing transactions
By the same token, we will be able to access historical data
about what things used to be like from the same table that
contains data about what they are like right now, and that may
also contain data about what those things are going to be like
sometime in the future Again, it will be as easy as changing a
predicate in a WHERE clause
Glossary References
Glossary entries whose definitions form strong
inter-dependencies are grouped together in the following list The
same glossary entries may be grouped together in different ways
at the end of different chapters, each grouping reflecting the
semantic perspective of each chapter There will usually be
sev-eral other, and often many other, glossary entries that are not
included in the list, and we recommend that the Glossary be
consulted whenever an unfamiliar term is encountered
Chapter 13 RE-PRESENTING INTERNALIZED PIPELINE DATASETS 309
Trang 5We note, in particular, that none of the nine types of pipeline dataset are included in this list In general, we leave category sets out of these lists, but recommend that the reader look them up
in the Glossary
as-is as-was Asserted Versioning Framework (AVF) assertion time
statement conventional table non-temporal table deferred assertion far future assertion time near future assertion time instance
type managed object object
oid queryable object pipeline dataset inflow pipeline outflow pipeline internalization of pipeline datasets re-presentation of pipeline datasets production database
production table temporal dimension temporal transaction version
310 Chapter 13 RE-PRESENTING INTERNALIZED PIPELINE DATASETS
Trang 6ALLEN RELATIONSHIP AND
OTHER QUERIES
Allen Relationship Queries 313
Time Period to Time Period Queries 316
Point in Time to Period of Time Queries 333
Point in Time to Point in Time Queries 341
A Claims Processing Example 343
In Other Words 346
Glossary References 347
In this chapter, we examine each of the thirteen Allen
relation-ships, as well as each non-leaf node in the taxonomy of Allen
relationships which we introduced in Chapter 3 We describe
the Allen relationships as they hold between two time periods,
between a time period and a point in time, and also between
two points in time We show how these relationships are
expressed in terms of time periods represented with the
closed-open convention, and we provide a sample query for each one
After a section in which we illustrate how much simpler these
queries would be to express if we had a PERIOD datatype, we
conclude this chapter by discussing queries which involve
tem-poral joins
Figure 14.1 shows our taxonomy of the Allen relationships
Those relationships are the leaf nodes in this taxonomy Every
leaf node has an inverse relationship, except the [equals]
rela-tionship We italicize that relationship name to emphasize that
it has no inverse So counting the [equals] relationship, and
the six leaf nodes and their inverses, we have the full set
of thirteen Allen relationships We also underline the non-leaf
node relationships in the taxonomy, to emphasize that they
are relationships we have defined, and are not one of the Allen
relationships
Managing Time in Relational Databases Doi: 10.1016/B978-0-12-375041-9.00014-5
Copyright # 2010 Elsevier Inc All rights of reproduction in any form reserved. 311
Trang 7Many of the Allen relationships are used by the AVF to enforce TEI and TRI For example, as we pointed out in Chapter 3, the [intersects] relationship is important because it defines TEI If two asserted versions of the same object share even a single effective time clock tick, within shared assertion time, then they [intersect], and violate TEI Otherwise, they don’t The [fills] rela-tionship is important because it defines TRI If a TRI relarela-tionship fails, it is because there is no episode of the referenced parent object which temporally includes, i.e [fills1], that of the child version The [before] relationship is important because it distinguishes episodes from one another Every episode of an object is non-contiguous with every other episode of the same object, and so one of them must be [before] the other
As for queries issued by business users, we have found that many ad hoc queries, and perhaps the majority of them, are queries about episodes, not about versions That is, they are queries that want (i) the begin and end date of the episode and, for business data, (ii) the last version of past episodes, the current version of current episodes, or the latest version of future episodes Because of the importance of episodes to queries, the SQL examples in this chapter will select episodes The last, cur-rent or latest version contains the business data The episode
| -|
| -|
Before | -| | -|
Meets | -| -|
Equals | -|
| -|
Occupies Starts Finishes During | -|
| -|
Aligns | -|
| -|
| -|
| -|
Time Period Relationships Along a Common Timeline
Figure 14.1 The Asserted Versioning Allen Relationship Taxonomy
312 Chapter 14 ALLEN RELATIONSHIP AND OTHER QUERIES
Trang 8begin date that is on every version, and the version’s own effective
end date, provide the effective time period of the episode itself
We also note that the SQL in many of the following examples
does not represent typical queries that a business would write
Each of these queries focuses on one specific Allen relationship,
and show how to express it in SQL In particular, these sample
queries do not include typical join criteria Instead, the only join
criteria used in these examples are two time periods and the
Allen relationship between them
Another reason these sample queries don’t look very real
world is that they select from two of the tables in our sample
database that don’t have much to do with one another In
partic-ular, there is no TRI relationship between them They are the
Policy and Wellness Program tables If we had used, for example,
the Client and Policy tables instead, many of the queries would
have been more realistic
But TRI-related tables cannot illustrate all of the Allen
relationships In fact, every instance of a TRI relationship
involves a parent and a child time period that is an instance of
one of seven of the Allen relationships This leaves six other Allen
relationships that TRI-related tables cannot illustrate
Nevertheless, as overly simple and unrealistic as most of
these sample queries may be, they are the foundation for all
queries that express temporal relationships No query will ever
need to express a temporal relationship that is not one of these
relationships So if we know how to write the temporal
pre-dicates in these queries, we will know how to write any temporal
predicate for any query
Allen Relationship Queries
The value of reviewing all the Allen relationships in terms of
queries against asserted version tables is that, as we already
know, the Allen relationships are exhaustive There are no
posi-tional relationships along a common timeline, among time
periods and/or points in time, other than those ones Thus,
by showing how to write a query for each one of them, as well
as for the groups of them identified in our taxonomy, we will
have provided the basic material out of which any query against
any assertion version table may be expressed
In addition to the thirteen Allen relationships themselves,
our taxonomy provides five additional relationships, each of which
is a logical combination of two or more Allen relationships
And these combinations are not formed simply by stringing
together Allen relationships with OR predicates Although they
Chapter 14 ALLEN RELATIONSHIP AND OTHER QUERIES 313
Trang 9are, necessarily, logically equivalent to the OR’d set of those relationships, they are often much simpler expressions, easier to understand and faster when executed
In these sample queries, we will not include predicates for asser-tion time, and will pretend that our sample tables are uni-temporal versioned tables This eliminates unnecessary detail from these examples We will do this by using two version table views, shown below: V_Wellness_Program_Curr_Asr and V_Policy_Curr_Asr The former is a view of all currently asserted Wellness Program versions The latter is a view of all currently asserted Policy versions CREATE VIEW V_Wellness_Program_Curr_Asr AS
SELECT * FROM Wellness_Program_AV WHERE asr_beg_dt <¼ Now() AND asr_end_dt > Now() CREATE VIEW V_ Policy _Curr_Asr AS SELECT * FROM Policy_AV
WHERE asr_beg_dt <¼ Now() AND asr_end_dt > Now()
In these example queries, as we said before, we will be selecting episodes, not versions For the two tables used in this chapter, these are the views which provide episodes as queryable managed objects:
CREATE VIEW V_Wellness_Program_Epis AS SELECT wp.wellpgm_oid, wp.epis_beg_dt, wp.eff_end_dt
AS epis_end_dt, wp.welllpgm_nm, wp.wellpgm_nbr, wp.wellpgm_cat_cd
FROM V_Wellness_Program_Curr_Asr AS wp WHERE wp.eff_end_dt ¼
(SELECT MAX(wpx.eff_end_dt) FROM V_Wellness_Program_Curr_Asr AS wpx WHERE wpx.wellpgm_oid ¼ wp.wellpgm_oid AND wpx.epis_beg_dt ¼ wp.epis_beg_dt) CREATE VIEW V_Policy_Epis AS
SELECT pol.policy_oid, pol.epis_beg_dt, pol.eff_end_dt AS epis_end_dt, pol.policy_type, pol.copay_amt,
pol.client_oid, pol.policy_nbr FROM V_Policy_Curr_Asr AS pol WHERE pol.eff_end_dt ¼ (SELECT MAX(px.eff_end_dt) FROM V_Policy_Curr_Asr px WHERE px.policy_oid ¼ pol.policy_oid AND px.epis_beg_dt ¼ pol.epis_beg_dt)
314 Chapter 14 ALLEN RELATIONSHIP AND OTHER QUERIES
Trang 10These episode views are the query-side work of defining an
episode datatype The AVF presents episodes as maintainable
managed objects These views present episodes as queryable
managed objects
This is a very important point Both computer science
research and IT practice have shown the importance of the
con-cept of a string of one or more contiguous clock ticks with a
known location in time SQL does not directly support this
con-cept; and so instead, we, and others, have to write code to
exclude gaps and overlaps occurring in the timespan between a
pair of dates or timestamps A PERIOD datatype is the direct
support needed for this concept This datatype implements this
concept at the correct level of abstraction
By the same token, our own research and practice has shown
the importance of the concept of an episode, a string of one or
more contiguous and non-overlapping versions of the same
object Without that concept, and the concepts of objects and
versions on which it depends, there is also no concept of
tempo-ral entity integrity and tempotempo-ral referential integrity Without
that concept, collections of rows are defined, as needed, within
each SQL statement As we can see with both the standard and
alternative temporal models, their SQL insert, update and delete
statements do result in bi-temporal data that satisfies what we
call TEI and TRI Their SQL queries do find episodes, when they
need them, past assertions when they need them, and so on But
the level of abstraction is wrong, for the same reason that getting
the same results with a pair of dates that one would get with a
PERIOD datatype is wrong
So we now have two views which externalize, as queryable
managed objects, the best data we currently have (i.e our
currently asserted data) about policy episodes and wellness
program episodes Now, using these two views, we will define
another view that we will use to illustrate each of the Allen
relationships This is the view V_Allen_Example This view
will keep the examples as small and easy to understand as
possi-ble, eliminating all extraneous and repetitive detail while
focus-ing on the Allen relationships themselves Here is the
V_Allen_Example view:
CREATE VIEW V_Allen_Example AS
SELECT wp.wellpgm_oid, pol.policy_oid,
wp.epis_beg_dt AS wp_epis_beg_dt,
wp.epis_end_dt AS wp_epis_end_dt,
pol.epis_beg_dt AS pol_epis_beg_dt,
pol.epis_end_dt AS pol_epis_end_dt
Chapter 14 ALLEN RELATIONSHIP AND OTHER QUERIES 315