data warehouse with both historical and real-time
data, and with batch and low-latency processes coexisting in a common computing and
storage fabric called the data warehouse grid,
cluster or cloud.
We’ve tried to move away from static,
monolithic data sets to in-memory or
on-the-fly processing, so where do the two
meet?
Well, this is true. Because in-memory is much
faster than disk, if you want true real-time apps
end-to-end, you need a completely distributed
in-memory caching environment and a heck of
a lot of I/O bandwidth. The real-time fabric of
modern life will require a push to all in-memory
data architectures in data warehouses everywhere, but also into clients. As memory gets
ever cheaper and we’re able to persist more info
in solid state disk and flash on devices, users are
going to want many things. One of those is the
ability to bring a lot of information into their
own device for exploration and visualizations
and what-if analyses on the fly. They’ll eventually
be able to work with multiple gigabytes and
potentially terabytes of information right in
their hands. At some point you’re going to want
a complete personal data mart in your iPhone so
you don’t need to do the round trip back to the
data warehouse or server-side data mart to grab
all this information. We’ll see more in-memory
BI architectures that are very mobile-oriented.
But will the monolithic or virtual data
warehouse continue to grow?
The majority of data warehouses in the world
now are between one and 10 terabytes total. As
memory gets cheaper and you can begin to have
terabyte scale memory on a handheld client, all
that traditional information will fit in anyone’s
pocket. And the data warehouse on the server
side where the master records are kept will grow
ever larger, into petabytes. The caches that you
hold locally will be synchronized for your use,
and your user device will personalize the displays, the calculations and the functions on the
cache to meet your specific needs. And they’ll
auto-synchronize with a server site according
to [user access] policy. It will be real-time decisioning in the field without that round trip.
There will be bandwidth constraints with
wireless, so you won’t sync all the data sets in a split
second. But it will be more and more critical to
have a deep and fairly current cache of the latest
information in your hands.
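
As a rough illustration of the policy-driven cache synchronization described above, here is a minimal Python sketch. Everything in it is hypothetical: the `Record` fields, the role labels standing in for a user access policy, and the byte budget that models a constrained wireless link are assumptions made for illustration, not a description of any particular product.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """One unit of server-side data (hypothetical shape)."""
    key: str
    version: int       # assumed to increase each time the server updates the record
    size_bytes: int    # transfer cost over the wireless link
    role: str          # access label checked against the user's policy
    payload: dict = field(default_factory=dict)


def sync_cache(cache, server_records, allowed_roles, budget_bytes):
    """Copy newer, permitted records into the local cache within a byte budget.

    Returns the keys that were permitted and stale but did not fit,
    so a later sync pass can retry them.
    """
    deferred = []
    # Visit the most recently updated records first to keep the cache current.
    for rec in sorted(server_records, key=lambda r: r.version, reverse=True):
        if rec.role not in allowed_roles:
            continue  # the access policy forbids caching this record
        cached = cache.get(rec.key)
        if cached is not None and cached.version >= rec.version:
            continue  # local copy is already up to date
        if rec.size_bytes > budget_bytes:
            deferred.append(rec.key)  # not enough bandwidth left this pass
            continue
        cache[rec.key] = rec
        budget_bytes -= rec.size_bytes
    return deferred
```

Records that do not fit the current budget come back as deferred rather than being dropped, which reflects the idea of a cache that is deep and fairly current rather than perfectly current.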
So a lot of data will persist on the server
side and users will access decision-making
information like a sponge?
Exactly. It will be important to manage the
transmission and the storage requirements on
the client side to control the replication and
synchronization. But the bottom line is that
the user doesn’t have to incur that latency on
the run. They can query massive marts and put
them in their pocket at memory speed to support the transactions and decisions they are
making now. Real time will enable ever more
agile and data-rich exploration and calculation
in the field for everybody.