where you say
store the incoming data in a fully normalised DB
My project (so far) does not really deal with any user interactions such as users updating records. My database has records on the viewer(viewer_id, dob, sex, occcupation, location) the viewer table as attributes, and the viewer hobbies in a separate table (with viewer_id, hobby_id) as these can be multivalues attributes.
The information that would 'come in' periodically would be in the observations table and would consist of
obs_no(PK), viewer_id (FK to the viewer table), viewer_start_time, viewer_stop_time, viewer_channel_id.
As the first task is to find the hourly share for each station, I had to group all the observations by hours, and then separate the observations for each hour into stations and then into viewer_ids.
I (think ? ) what I am doing is the extract and transform phases, where I am grouping the data into what I would need, and then transforming them using the needed (repetivtive) calculations on the small invidiuals groups.
I have not decided on a proper star schema, as I am not sure how the actual info I calculated would store physically into the fact table, I know there would be a dimension for each thing I am calculating for.
For eg, two of the facts I know I would have:
Shares. Ratings.
Dimensions
Station
ViewerSex (m,f)
Time (hourly, daily)
Household
Occupation
I am currenty calculating the share for each of the dimensions, but haven't actually stored it anywhere cause I don't understand where it should go.
I know their would be another DB which would really be the DW with two fact tables(but what would be their attributes?), would there be five dimension tables, one for each dimension? I haven't seen any indept examples on the star schema, only introductory ones.
Since I have calculated so far 1) the hourly for each station, and 2) the hourly share in terms of males and 3) hourly share females for each of these stations, where exactly in the star schema would these calculated values go?