|
I'm trying to monitor electric meter readings for several substations, and provide a database that can be used to analyze usage patterns. The meter readings are provided in csv format with the structure:
Date, MeterID, HE1, HE2, HE3,..., HE24, OffPeak, OnPeak, Total
where HEn is the hour-ending MWh usage; I don't much care about the On/Off Peak values or the totals.
I want to store the data in the form:
Date MeterID1 MeterID2 MeterID3 MeterID4
date HE1 HE1 HE1 HE1
date HE2 HE2 HE2 HE2
date ... ... ... ...
date HE24 HE24 HE24 HE24
That's one approach, at least. But it creates a problem if we add another meter to the mix later. I'd have to recreate the table with a new schema, then repopulate it from the previous table.
Another way would be to create a separate table for each meter, using the same schema; that would make it easier to add new meters later with no manual operations, but it seems inefficient to me.
Worse still would be to use a single table date-meterID-HEn value. That would entail a huge number of records with little content, hardly a useful structure.
I'm sure this sort of problem happens all the time in other applications, and there must be some recommended solution, but I don't know of it. Can someone with more experience suggest a solution?
Will Rogers never met me.
|
|
|
|
|
Roger Wright wrote: use a single table date-meterID-HEn value
That's actually the way to go. The number of rows won't bother the database much if you index the table properly.
It makes it very easy to add a new meter, or get the values for a single meter for a certain datespan.
If you want to show values for several meters at the same time you can PIVOT the data.
You should probably also add a separate table with info on the meters.
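For illustration, a minimal sketch of that two-table layout (all table and column names here are my own invention, not from the original post):

```sql
-- Meter info table: one row per meter, so adding a meter is just an INSERT.
CREATE TABLE Meter (
    MeterID     INT          NOT NULL PRIMARY KEY,
    MeterName   VARCHAR(50)  NOT NULL,
    Substation  VARCHAR(50)  NULL
);

-- Long-format readings: one row per meter per hour.
CREATE TABLE MeterReading (
    ReadingDate DATE          NOT NULL,
    MeterID     INT           NOT NULL REFERENCES Meter (MeterID),
    HourEnding  TINYINT       NOT NULL CHECK (HourEnding BETWEEN 1 AND 24),
    UsageMWh    DECIMAL(10,3) NOT NULL,
    CONSTRAINT PK_MeterReading PRIMARY KEY (MeterID, ReadingDate, HourEnding)
);
```

The composite primary key doubles as the index that makes per-meter, per-datespan queries cheap.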
|
Yup, what Jorgen said: definitely go for a two-table design.
MeterTable - with any details about the meter if there is more than just a name
TranTable - 1 record = a meter/hour read.
And yes, a pivot view (even if it is hard coded to the number of meters) is an excellent option.
If you use the first two options you are committing database abuse. Use option 3 (your perceived worst option); it is by far the best design.
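A pivot over the long-format table could look roughly like this, assuming a reading table MeterReading(ReadingDate, MeterID, HourEnding, UsageMWh); the meter IDs in the IN list are hard coded and would have to be extended (or the SQL generated dynamically) when meters are added:

```sql
-- Turn one-row-per-meter-per-hour into one column per meter.
SELECT ReadingDate, HourEnding,
       [1] AS Meter1, [2] AS Meter2, [3] AS Meter3, [4] AS Meter4
FROM (SELECT ReadingDate, HourEnding, MeterID, UsageMWh
      FROM MeterReading) AS src
PIVOT (SUM(UsageMWh) FOR MeterID IN ([1], [2], [3], [4])) AS p
ORDER BY ReadingDate, HourEnding;
```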
Never underestimate the power of human stupidity
RAH
|
Roger Wright wrote: Worse still would be to use a single table date-meterID-HEn value. That would entail a huge number of records with little content, hardly a useful structure.
Actually it's quite a common design. Modern databases won't have much of a problem with this, particularly since it seems you will be doing mostly reads, with inserts occurring in batches relatively infrequently and few or possibly no updates to existing rows. If you find the table grows too big over time, you can look to archive data off (e.g. data over a year old could be moved out to another table). There is another thread here about archiving.
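As a rough sketch of that kind of archiving (table and column names are placeholders for illustration):

```sql
BEGIN TRANSACTION;

-- Copy year-old rows into an archive table with the same structure...
INSERT INTO MeterReadingArchive (ReadingDate, MeterID, HourEnding, UsageMWh)
SELECT ReadingDate, MeterID, HourEnding, UsageMWh
FROM MeterReading
WHERE ReadingDate < DATEADD(YEAR, -1, CAST(GETDATE() AS DATE));

-- ...then remove them from the live table.
DELETE FROM MeterReading
WHERE ReadingDate < DATEADD(YEAR, -1, CAST(GETDATE() AS DATE));

COMMIT TRANSACTION;
```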
|
I agree with what the others have said; use a pivot.
Where I work, we make a number of products on any given day and we need to report metrics on them on a daily basis.
The corporate solution was a spreadsheet limited to 20 products; new products are simply added to a column and grouped as a 'family' of products even though the specs aren't the same... just a 'close enough' fit.
I developed a solution to pull the data into a single table, as others have suggested, and used a pivot to show the products by day with metrics. It was some work to get it going, since it was my first stab at using the pivot, but it has been wonderful for reporting purposes.
Tim
|
Basically nothing new to add but since you've asked for opinions...
Having two tables is absolutely, definitely and undeniably the correct way because of:
- maintainability
- (quite) easy querying regardless of the reporting needs
- flexibility when querying
- efficiency
Depending on the needs, I would even consider having three separate (typed) rows: one for the on-peak value, a second for the off-peak, and a third for the total (if it cannot be calculated from the data). If the total can be calculated from the data, I wouldn't store it, at least not in these tables.
Best regards,
mika
|
Hello,
Is there some tool in SQL Server that would help me compare two tables and tell me what is different between them?
Thanks!
|
Do you want to compare the contents of the tables? If that's the case, I think the quickest way would be to use EXCEPT.
If the table structure is the same, you could do comparison like:
SELECT * FROM FirstTable
EXCEPT
SELECT * FROM SecondTable
and then other way round:
SELECT * FROM SecondTable
EXCEPT
SELECT * FROM FirstTable
If there are structural differences, specify the columns you want to compare.
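If you want both directions in one result set, one way is to label each EXCEPT and combine them with UNION ALL (FirstTable/SecondTable as above):

```sql
-- Rows in FirstTable that are missing from SecondTable, and vice versa,
-- each tagged with where it was found.
SELECT 'Only in FirstTable' AS Source, a.*
FROM (SELECT * FROM FirstTable
      EXCEPT
      SELECT * FROM SecondTable) AS a
UNION ALL
SELECT 'Only in SecondTable' AS Source, b.*
FROM (SELECT * FROM SecondTable
      EXCEPT
      SELECT * FROM FirstTable) AS b;
```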
|
Are you talking about comparing the structure or the data? They require 2 different strategies.
Do you want to do this for one pair of tables, or are you looking to compare an entire database? They require 2 different strategies.
If you need to compare databases then look at Red-Gate sql and data compare tools.
Never underestimate the power of human stupidity
RAH
|
To be more clear, I would like to compare the data in two tables of the same design, and report all rows that are different.
|
Thanks so much, EXCEPT really works, and it is really fast as well.
|
I have a folder where files arrive every two hours.
So I created a job that reads the files from the folder every two hours and moves them to a separate folder.
Reading the files inserts approximately 11k - 12k rows into a table.
This has to run every day, every two hours.
So, what measures should I take to keep queries from slowing down as days, months and years pass by?
One idea I have is to run an SP at a 10-15 day interval which copies this table's data into another table and removes the rows.
Your ideas?
|
If I understood correctly, you're concerned about performance as the amount of data grows. Based on the rough information you provided, a few opinions:
- correct indexing strategy must be implemented (depending on the needs)
- I wouldn't use separate tables since it would affect program logic
- think carefully about the table structure, optimize it for typical use-cases
- consider using partitioning.
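For a rough idea of what partitioning looks like in SQL Server (all names and boundary dates here are invented for the example):

```sql
-- Monthly range partitions; RANGE RIGHT puts each boundary date into the
-- partition on its right.
CREATE PARTITION FUNCTION pfByMonth (DATE)
AS RANGE RIGHT FOR VALUES ('2011-01-01', '2011-02-01', '2011-03-01');

-- Map all partitions to the PRIMARY filegroup for simplicity.
CREATE PARTITION SCHEME psByMonth
AS PARTITION pfByMonth ALL TO ([PRIMARY]);

-- The table is created on the scheme, partitioned by the date column.
CREATE TABLE FileImport (
    ImportDate DATE         NOT NULL,
    LineData   VARCHAR(500) NOT NULL
) ON psByMonth (ImportDate);
```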
|
You got it right.
I'm thinking about moving data to a separate table because the query that has to run on the table will only need information from the previous day, the current day, and the following day.
For example, if I run my queries today, i.e. 09 Mar 11 17:04, the query will look for data from 8, 9 and 10 March.
That's why I'm thinking about moving data older than 10 days to a separate table.
Mika Wendelius wrote: - correct indexing strategy must be implemented (depending on the needs) - I wouldn't use separate tables since it would affect program logic - think carefully about the table structure, optimize it for typical use-cases
If I apply all these, will it still work after 5 years (assuming 12000 * 12 * 365 * 5 rows)?
Mika Wendelius wrote: consider using partitioning.
I don't have any clue about this.
It would be helpful if you could point me to an article (targeting someone with less than DBA-level experience).
|
Hum Dum wrote: I'm thinking about moving data to a separate table because the query that has to run on the table will only need information from the previous day, the current day, and the following day.
This makes things even easier and goes back to good indexing. So if the new rows are always added to the end of your timeline, you could consider using a clustered index. This would make queries selecting the near past very efficient, see: http://msdn.microsoft.com/en-us/library/ms190639.aspx
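A sketch of that idea (table and column names are placeholders; if the table already has a clustered primary key, the date column would need to lead that key instead):

```sql
-- Store rows physically in date order.
CREATE CLUSTERED INDEX IX_FileImport_ImportDate
ON FileImport (ImportDate);

-- A "near past" query then becomes a narrow range scan:
SELECT ImportDate, LineData
FROM FileImport
WHERE ImportDate BETWEEN DATEADD(DAY, -1, CAST(GETDATE() AS DATE))
                     AND DATEADD(DAY,  1, CAST(GETDATE() AS DATE));
```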
Hum Dum wrote: If I apply all these, will it still work after 5 years (assuming 12000 * 12 * 365 * 5 rows)?
That's roughly 260 million rows. As long as the structure and indexing are well done you should be safe, depending of course on the requirements.
Basically the earlier rows are a kind of dead mass if you don't query them, as long as you have a good access path to the few rows you need.
Hum Dum wrote: I don't have any clue about this. It would be helpful if you could point me to an article (targeting someone with less than DBA-level experience).
Here's one starting point: http://msdn.microsoft.com/en-us/library/ms178148.aspx
Hopefully these help,
mika
|
What I have done in the past is to periodically delete rows older than some threshold. If you like, you can use a trigger to also write them to some archive table.
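In SQL Server the delete and the archive write can even be combined into one statement with an OUTPUT clause, instead of a trigger (the names here are placeholders for illustration):

```sql
-- Move rows older than 10 days into an archive table in a single statement.
DELETE FROM FileImport
OUTPUT deleted.ImportDate, deleted.LineData
INTO FileImportArchive (ImportDate, LineData)
WHERE ImportDate < DATEADD(DAY, -10, CAST(GETDATE() AS DATE));
```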
|
Hello Everybody,
I have a problem while aliasing in a query.
The SQL command is: Select S_no as S. No. From Emp;
If you can think then I Can.
|
the alias name should be a valid identifier, no punctuation, no spaces,...
Luc Pattyn [Forum Guidelines] [My Articles] Nil Volentibus Arduum
Please use <PRE> tags for code snippets, they preserve indentation, improve readability, and make me actually look at the code.
|
Select S_no as [S. No.] From Emp;
would work, but as Luc said, it's not best practice.
As barmey as a sack of badgers
Dude, if I knew what I was doing in life, I'd be rich, retired, dating a supermodel and laughing at the rest of you from the sidelines.
|
It's not working.
If you can think then I Can.
|
what sort of error message did you get?
As barmey as a sack of badgers
Dude, if I knew what I was doing in life, I'd be rich, retired, dating a supermodel and laughing at the rest of you from the sidelines.
|
The error is:
Invalid bracketing on name 'S. No.'.
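Hard to say without seeing the statement exactly as typed, but if the brackets really were entered as [S. No.] and still fail, the ANSI double-quoted form is worth a try (it works when QUOTED_IDENTIFIER is ON, which is the SQL Server default):

```sql
SELECT S_no AS "S. No." FROM Emp;
```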
If you can think then I Can.
|