I've heard of data mining tools for over a decade. I'm not sure what the true definition is, but in my experience it's used as a "filter". A lot of the work I've done on what's been called "data mining" projects in my field has involved tons and tons of SQL queries, some simple and some extremely complex.
The basics are that it's a GUI: nothing but button presses and pull-down menus. More complicated tools like a query builder are OK but sort of defeat the purpose. So with some pull-down menus and maybe a few logic parameters (greater than, equal to, etc.) you have a basic data mining front end. The queries behind the button presses are where the bulk of the heavy lifting is done.
As an example from my field, say I have a plant being monitored and controlled by a SCADA (Supervisory Control And Data Acquisition) system. This system collects dozens of samples per second on hundreds, or thousands, of I/O points (temps, on/off status, flow rates, levels, etc.). So it's possible that you have tens or hundreds of thousands, or even millions, of data points in the system, 99% of which are useless to 99% of the users.
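To make the "menus in front, queries behind" idea concrete, here is a minimal sketch of how a pull-down selection might be turned into a SQL query. Everything in it is made up for illustration: the `samples` table, its columns, the `TANK_TEMP` point name, and the `run_filter` helper are assumptions, not any particular product's schema or API.

```python
import sqlite3

# The only comparisons the (hypothetical) pull-down menu offers.
OPERATORS = {"greater than": ">", "equal to": "=", "less than": "<"}

def run_filter(conn, point, op_label, threshold):
    """Run the query behind one 'button press' and return (ts, value) rows."""
    sql_op = OPERATORS[op_label]  # KeyError for anything that is not a menu entry
    sql = (
        "SELECT ts, value FROM samples "
        f"WHERE point = ? AND value {sql_op} ? ORDER BY ts"
    )
    # The user-supplied point name and threshold are bound as parameters.
    return conn.execute(sql, (point, threshold)).fetchall()

# Tiny in-memory stand-in for a historian table (assumed layout).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (point TEXT, ts INTEGER, value REAL)")
conn.executemany(
    "INSERT INTO samples VALUES (?, ?, ?)",
    [("TANK_TEMP", 0, 68.0), ("TANK_TEMP", 1, 75.5), ("TANK_TEMP", 2, 81.2)],
)

hot = run_filter(conn, "TANK_TEMP", "greater than", 70.0)
print(hot)
```

The operator comes from a fixed lookup table rather than from user text, so the front end can only ever produce queries you have already thought about.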
Simple stuff: Pump starts. Just run a simple query of Off to On transitions of a discrete (on/off) input point within a given time frame.
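A rough sketch of that pump-start query, assuming a hypothetical `samples` table with `point`, `ts` and `value` columns where a discrete point is stored as 0 (off) or 1 (on). The table layout, the `PUMP_1` name, and the `pump_starts` helper are all invented for the example.

```python
import sqlite3

def pump_starts(conn, point, t_start, t_end):
    """Return timestamps where `point` goes from 0 (off) to 1 (on)."""
    rows = conn.execute(
        "SELECT ts, value FROM samples "
        "WHERE point = ? AND ts BETWEEN ? AND ? ORDER BY ts",
        (point, t_start, t_end),
    ).fetchall()
    starts = []
    # Compare each sample with the one before it in time order.
    for (_, prev), (ts, cur) in zip(rows, rows[1:]):
        if prev == 0 and cur == 1:
            starts.append(ts)
    return starts

# In-memory stand-in for the historian (assumed layout, fake data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (point TEXT, ts INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO samples VALUES (?, ?, ?)",
    [("PUMP_1", t, v) for t, v in enumerate([0, 0, 1, 1, 0, 1, 1, 0])],
)

starts = pump_starts(conn, "PUMP_1", 0, 7)
print(starts)
```

Note that this only sees transitions between samples inside the time frame; a pump that was already running at the start of the window is not counted as a start.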
A bit more complex would be to determine the volumetric flow rate of the gasoline the pump is moving. You have to query various I/O points, a mix of analog and digital, and use the results to calculate a volumetric flow rate. Then you probably trend it over a given time frame. It would be easy if you had a volumetric flow meter, but typically you only have a flow rate. You then have to take into account things like density, specific gravity, temperature, etc.
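Here is one way that calculation could look, as a sketch under some loud assumptions: the meter reports a mass flow in kg/s, density is taken as specific gravity times roughly 1000 kg/m³ for water, and temperature correction is left out entirely. The specific gravity of 0.74 and the sample values are placeholders, not real plant data.

```python
WATER_DENSITY_KG_M3 = 1000.0  # approximate reference density of water

def volumetric_flow_m3_per_s(mass_flow_kg_s, specific_gravity):
    """Convert mass flow to volumetric flow: Q = m_dot / (SG * rho_water)."""
    density = specific_gravity * WATER_DENSITY_KG_M3
    return mass_flow_kg_s / density

def flow_trend(samples, specific_gravity):
    """Build (ts, volumetric flow) pairs, reporting zero while the pump is off.

    `samples` holds (ts, pump_running, mass_flow_kg_s) tuples, i.e. one
    discrete point and one analog point already joined on timestamp.
    """
    trend = []
    for ts, running, mass_flow in samples:
        q = volumetric_flow_m3_per_s(mass_flow, specific_gravity) if running else 0.0
        trend.append((ts, q))
    return trend

# Placeholder data: pump off at ts=0, then running.
samples = [(0, 0, 0.3), (1, 1, 7.4), (2, 1, 14.8)]
trend = flow_trend(samples, specific_gravity=0.74)
print(trend)
```

In a real system the density itself would usually come from another query (temperature, product type), which is where the query count starts to climb.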
I've done some even more complex queries when I was looking at a data structure that was 3-dimensional. It was used for scheduling production runs of certain types of machines for a sprinkler factory.
The "automation industry" type of data mining methods I've used can be applied to all sorts of applications using all kinds of data. You are basically taking in massive amounts of data and filtering out what you want. It sounds like your data will be retrieved from several different DB sources. That will add complexity to your queries.
The easiest way to attack a problem like this is to ask "what is it you want to see/know?". You need very specific criteria and data points to identify. Then you just isolate the points you need and query away. Now if you're talking about things like "trends" of somewhat "vague" variables, then you need to work on trimming it down to specifics, otherwise it will become "overwhelming" as you stated. I've heard "I want to know what the plant is doing". OK, WTF does that mean? Are you looking for production rates? Resource consumption? Idle time? WHAT?!?! Break it down into quantifiable and manageable tasks/goals and just start knocking them down one at a time.