Peter Beedell wrote:
The bottom line is that an INSERT in ADO.NET is SLOW - which is a shame really
Hmm... interesting. Are you using SqlCommands for the insert operations (with stored procedures or manually built commands)? Are you doing the insert operations inside a transaction?
Mauricio Ritter - Brazil
Sonorking now: 100.13560 MRitter
English is not my native language so, if you find any spelling errors in my posts, please let me know.
The following is a pretty complete code snippet - the code for reading and parsing the source files (6.77Gb of data!) has been omitted, but you don't need much imagination here. The two 'Console.WriteLine' commands provide debug info about how long each phase is taking and were how I measured the ~1/2 second per insert result.
SqlConnection sqlConnection1 = new SqlConnection();
sqlConnection1.ConnectionString = "....";
sqlConnection1.Open();
SqlCommand sqlInsertCommand1 = new SqlCommand();
sqlInsertCommand1.Connection = sqlConnection1;
sqlInsertCommand1.CommandText =
    "INSERT INTO tmp_mueller_test(fk_transaction_typ, fk_card_typ, " +
    "amount, ec_cash_id, pk_terminal_id, pk_transaction_date, transaction_time, fk_answercode_number, " +
    "card_number, expiry_date, pk_trace_number, auth_code, extended_datas) VALUES (@fk_transaction_typ, " +
    "@fk_card_typ, @amount, @ec_cash_id, @pk_terminal_id, @pk_transaction_date, @transaction_time, " +
    "@fk_answercode_number, @card_number, @expiry_date, @pk_trace_number, @auth_code, @extended_datas)";
// Simple parameters: name, type, size, source column.
sqlInsertCommand1.Parameters.Add(new SqlParameter("@fk_transaction_typ", SqlDbType.SmallInt, 2, "fk_transaction_typ"));
sqlInsertCommand1.Parameters.Add(new SqlParameter("@fk_card_typ", SqlDbType.SmallInt, 2, "fk_card_typ"));
// Decimal parameters use the long constructor so precision and scale can be specified.
sqlInsertCommand1.Parameters.Add(new SqlParameter("@amount", SqlDbType.Decimal, 5, ParameterDirection.Input, false,
    ((System.Byte)(8)), ((System.Byte)(2)), "amount", DataRowVersion.Current, null));
sqlInsertCommand1.Parameters.Add(new SqlParameter("@ec_cash_id", SqlDbType.Decimal, 9, ParameterDirection.Input, false,
    ((System.Byte)(18)), ((System.Byte)(0)), "ec_cash_id", DataRowVersion.Current, null));
sqlInsertCommand1.Parameters.Add(new SqlParameter("@pk_terminal_id", SqlDbType.VarChar, 8, "pk_terminal_id"));
sqlInsertCommand1.Parameters.Add(new SqlParameter("@pk_transaction_date", SqlDbType.DateTime, 8, "pk_transaction_date"));
sqlInsertCommand1.Parameters.Add(new SqlParameter("@transaction_time", SqlDbType.DateTime, 8, "transaction_time"));
sqlInsertCommand1.Parameters.Add(new SqlParameter("@fk_answercode_number", SqlDbType.SmallInt, 2, "fk_answercode_number"));
sqlInsertCommand1.Parameters.Add(new SqlParameter("@card_number", SqlDbType.VarChar, 20, "card_number"));
sqlInsertCommand1.Parameters.Add(new SqlParameter("@expiry_date", SqlDbType.VarChar, 4, "expiry_date"));
sqlInsertCommand1.Parameters.Add(new SqlParameter("@pk_trace_number", SqlDbType.Decimal, 9, ParameterDirection.Input, false,
    ((System.Byte)(18)), ((System.Byte)(0)), "pk_trace_number", DataRowVersion.Current, null));
sqlInsertCommand1.Parameters.Add(new SqlParameter("@auth_code", SqlDbType.VarChar, 10, "auth_code"));
sqlInsertCommand1.Parameters.Add(new SqlParameter("@extended_datas", SqlDbType.VarChar, 99, "extended_datas"));
while(true) {  // one iteration per parsed record - the file reading/parsing (and the loop's exit) is omitted
sqlInsertCommand1.Parameters["@fk_transaction_typ"].Value = int.Parse(strMsgType);
sqlInsertCommand1.Parameters["@fk_card_typ"].Value = strMsgType.StartsWith("4") ? 4 : 2;
sqlInsertCommand1.Parameters["@amount"].Value = Decimal.Parse(strValue);
sqlInsertCommand1.Parameters["@ec_cash_id"].Value = Decimal.Parse(strTermID);
sqlInsertCommand1.Parameters["@pk_terminal_id"].Value = strMakatelID;
sqlInsertCommand1.Parameters["@pk_transaction_date"].Value = DateTime.Parse(strDate + " " + strTime);
sqlInsertCommand1.Parameters["@transaction_time"].Value = DateTime.Parse(strDate + " " + strTime);
sqlInsertCommand1.Parameters["@fk_answercode_number"].Value = int.Parse(strRC);
sqlInsertCommand1.Parameters["@card_number"].Value = strPAN;
sqlInsertCommand1.Parameters["@expiry_date"].Value = strExp;
sqlInsertCommand1.Parameters["@pk_trace_number"].Value = Decimal.Parse(strTrace);
sqlInsertCommand1.Parameters["@auth_code"].Value = strAuth;
sqlInsertCommand1.Parameters["@extended_datas"].Value = strBMP60;
Console.WriteLine(DateTime.Now.Ticks + " - Ready to insert this record");
decInsertCount += sqlInsertCommand1.ExecuteNonQuery();
Console.WriteLine(DateTime.Now.Ticks + " - Insert complete, read next record");
}
Peter Beedell
"Remember that age and treachery will always triumph over youth and ability." Quoted from the BBC Comedy 'The Office'.
Peter,
Maybe you should insert this statement before the while(true) loop:
sqlInsertCommand1.Prepare();
This should improve your performance.
Also, you could try to put your mass insert operation inside a transaction... I'm not sure, but I think it will improve the command performance too.
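Something like this, building on your snippet (just a sketch - error handling omitted, and the loop is the one you already have):
SqlTransaction trans = sqlConnection1.BeginTransaction();
sqlInsertCommand1.Transaction = trans;   // the command must be enlisted in the pending transaction
sqlInsertCommand1.Prepare();             // the server compiles the parameterised INSERT once

// ... your existing while loop: set the parameter values and call ExecuteNonQuery() ...

trans.Commit();                          // one commit for the whole batch instead of one per INSERT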
Let us know if this stuff works ! I'm curious !
Mauricio Ritter - Brazil
Sonorking now: 100.13560 MRitter
English is not my native language so, if you find any spelling erros in my posts, please let me know.
Hi Mauricio,
Thanks for taking an interest
Here are the results for comparison - note, all my file reading and parsing is happening within one tick! My otherwise perfectly capable PC is registering 100% system load during this process, resulting in failing screen refreshes etc., and we are seeing > 5 seconds per insert here!
With 'Prepare()'
06.08.2003 16:23:52 - Processing: 03072801.pos
631957838325046322 - Ready to insert this record
631957838479761077 - Insert complete, read next record
631957838479761077 - Ready to insert this record
631957838633374303 - Insert complete, read next record
631957838633374303 - Ready to insert this record
Additionally with 'Transaction()' but no 'Commit()'
06.08.2003 16:35:27 - Processing: 03072801.pos
631957845277396675 - Ready to insert this record
631957845485785934 - Insert complete, read next record
631957845485785934 - Ready to insert this record
631957845640400550 - Insert complete, read next record
631957845640400550 - Ready to insert this record
631957845794614610 - Insert complete, read next record
631957845794614610 - Ready to insert this record
Just for comparison, without either of the above!
06.08.2003 16:38:57 - Processing: 03072801.pos
631957847378012478 - Ready to insert this record
631957847531425426 - Insert complete, read next record
631957847531425426 - Ready to insert this record
631957847683336289 - Insert complete, read next record
631957847683336289 - Ready to insert this record
631957847835247152 - Insert complete, read next record
631957847835247152 - Ready to insert this record
Peter Beedell
Whoops - please ignore the last set of results - here are the results after the glaring bugs have been removed!
Basically, your suggestions have had a positive result - thank you
Just for comparison, without either of the improvements
06.08.2003 16:52:05 - Processing: 03072801.pos
631957855251883659 - Ready to insert this record
631957855255488375 - Insert complete, read next record
631957855255488375 - Ready to insert this record
631957855255488375 - Insert complete, read next record
631957855255488375 - Ready to insert this record
631957855255788768 - Insert complete, read next record
631957855255788768 - Ready to insert this record
631957855255888899 - Insert complete, read next record
631957855255888899 - Ready to insert this record
631957855255888899 - Insert complete, read next record
631957855255888899 - Ready to insert this record
With 'Prepare()'
06.08.2003 16:53:12 - Processing: 03072801.pos
631957855928869350 - Ready to insert this record
631957855929770529 - Insert complete, read next record
631957855929770529 - Ready to insert this record
631957855929770529 - Insert complete, read next record
631957855929770529 - Ready to insert this record
631957855929770529 - Insert complete, read next record
631957855929770529 - Ready to insert this record
631957855929770529 - Insert complete, read next record
631957855929770529 - Ready to insert this record
631957855929870660 - Insert complete, read next record
631957855929870660 - Ready to insert this record
631957855929870660 - Insert complete, read next record
631957855929870660 - Ready to insert this record
631957855929870660 - Insert complete, read next record
631957855929870660 - Ready to insert this record
631957855929870660 - Insert complete, read next record
With 'Transaction()' but no 'Commit()'
06.08.2003 16:55:30 - Processing: 03072801.pos
631957857304631307 - Ready to insert this record
631957857305332266 - Insert complete, read next record
631957857305332266 - Ready to insert this record
631957857305332266 - Insert complete, read next record
631957857305332266 - Ready to insert this record
631957857305332266 - Insert complete, read next record
631957857305332266 - Ready to insert this record
631957857305432403 - Insert complete, read next record
631957857305432403 - Ready to insert this record
631957857305432403 - Insert complete, read next record
631957857305432403 - Ready to insert this record
631957857305432403 - Insert complete, read next record
631957857305432403 - Ready to insert this record
631957857305432403 - Insert complete, read next record
631957857305432403 - Ready to insert this record
631957857305432403 - Insert complete, read next record
631957857305432403 - Ready to insert this record
631957857305432403 - Insert complete, read next record
631957857305432403 - Ready to insert this record
631957857305432403 - Insert complete, read next record
Peter Beedell
Peter Beedell wrote:
Basically, your suggestions have had a positive result
Good to know that !
I wonder what the total time will be to insert the 2.5 million records with these techniques. Maybe it will not work so well with the transaction one (the SQL transaction log will probably blow up), but the other one should improve the performance. I'll try to do some tests here.
Thanks for replying!
Mauricio Ritter - Brazil
Sonorking now: 100.13560 MRitter
English is not my native language so, if you find any spelling errors in my posts, please let me know.
I am making a COMMIT after every file. The data (6.8Gb) is in 401 files and generates 2.2 million records -> 5,500 records per file. It looks like the process will take around one hour to complete, so we are getting over 600 records added per second - and that is with transaction control!
OK - so I am impressed (thinks: just how much time I could have saved if I had only found that stupid bug sooner) - but your suggestions are a definite help.
Again, a big thank you.
Peter Beedell
Hi,
How can I manage it with a Transaction? Do I have to open it and handle everything by myself? sqlAdapter.Update() doesn't let me use a Transaction!?
Can you give me a sample? I'd be thankful.
Always,
Hovik Melkomian.
I would consider using BCP to get the data straight into the DB. Don't worry about putting it directly into the correct tables - store it in one flat temporary table, then use SQL to "massage" it into the final tables. Oh yes, consider dropping all indexes (except constraints) on the final tables before inserting the data, then recreate them after the process is complete.
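For instance, something along these lines (a rough sketch only - I've used BULK INSERT, bcp's T-SQL cousin, and the staging table, columns, file path and index name are all invented):
SqlCommand cmd = sqlConnection1.CreateCommand();
cmd.CommandTimeout = 0;   // bulk loads can run longer than the default 30 seconds

// drop the non-constraint indexes on the final table first
cmd.CommandText = "DROP INDEX tmp_mueller_test.ix_terminal";
cmd.ExecuteNonQuery();

// pull the whole file into a flat staging table in one shot
cmd.CommandText = @"BULK INSERT staging_raw FROM 'c:\data\03072801.pos'
                    WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n')";
cmd.ExecuteNonQuery();

// massage the raw rows into the final table with one set-based statement
cmd.CommandText = "INSERT INTO tmp_mueller_test (pk_terminal_id, amount) " +
                  "SELECT terminal_id, CAST(amount_text AS decimal(8,2)) FROM staging_raw";
cmd.ExecuteNonQuery();

// recreate the indexes once all the data is in
cmd.CommandText = "CREATE INDEX ix_terminal ON tmp_mueller_test (pk_terminal_id)";
cmd.ExecuteNonQuery();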
Cheers
Mark
I had a problem like yours some time ago.
But my problem was more complex than the one you describe.
I had a Books table, an Authors table (with an n-n relation) and a Category table (n-n again).
What I did was IMPORT everything into SQL unnormalized, of course - there were something like 8,000 records (one for every book) (not as many as yours, but a more complex problem).
Then I wrote a SQL "program" to solve the problem.
It takes some minutes to fill everything up.
Hmm, one other thing... the authors and the categories were in comma-separated lists; that's why I had the same number of rows in Books and in the unnormalized table.
Reply if you're interested in the script.
I did it with SQL cursors.
One thing to note is that we make multiple calls to the same stored procedure. As a result we keep parsing and casting objects into primitives (stored procedure parameters). Possibly another approach could help here. This is discussed in the well-known Henderson book, but there is also a nice chapter in the SQL Server "Books Online" - just search for "openxml", "sp_xml_preparedocument" and "sp_xml_removedocument".
The point is that one can pass ALL data to the server in XML format. Let me repeat this: ALL data is passed in one trip.
Once this is done, the XML document is placed in a table.
From here on, it is much faster to move/cast the data into the right tables.
Here is the code outline:
CREATE PROCEDURE InsertThisIntoMyTable(@xml NTEXT)
AS
DECLARE @hDoc INT
EXECUTE sp_xml_preparedocument @hDoc OUTPUT,@xml
INSERT INTO MyTable(Field1,...)
SELECT Field1,...
FROM OPENXML(@hDoc,'TheXPathToData',1)
WITH(Field1 DataType 'NodeName',
...)
EXECUTE sp_xml_removedocument @hDoc
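On the ADO.NET side the call could look something like this (a sketch - the element/attribute layout and table are invented to match the outline above, and real code should XML-escape the values):
// build ONE document for the whole batch...
StringBuilder xml = new StringBuilder("<records>");
foreach (DataRow row in myTable.Rows)
    xml.AppendFormat("<r Field1=\"{0}\" Field2=\"{1}\"/>", row["Field1"], row["Field2"]);
xml.Append("</records>");

// ...and ship it to the server in a single round trip
SqlCommand cmd = new SqlCommand("InsertThisIntoMyTable", connection);
cmd.CommandType = CommandType.StoredProcedure;
cmd.Parameters.Add("@xml", SqlDbType.NText).Value = xml.ToString();
cmd.ExecuteNonQuery();

// the matching OPENXML path in the procedure would be '/records/r',
// with flag 1 meaning the fields are read from attributes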
Give it a try!
I was wondering a bit why you used the RowUpdated event. When you look at the auto-generated SQL statements that the Microsoft IDE generates for a data adapter, you see that they solve the same problem with an insert command followed by a select command. That may look like this:
"INSERT INTO Customers(CompanyName) VALUES (@CompanyName); SELECT * FROM Customers WHERE CustomerID = @@IDENTITY"
The IDbCommand interface has a property called UpdatedRowSource which says how output parameters and returned rows are handled when used with the Update method of the DataAdapter. By default it is set to "Both", which says that both the output parameters and the first returned row are mapped back onto the updated row in the DataSet. Which gives us exactly the effect we wanted, at least I think so.
This, I do agree, has the disadvantage that you can't use the CommandBuilder and have to write the insert, update and delete statements manually. But I think in performance terms it is the better solution, because just one command is actually executed. Besides, in my experience, working with transactions and the CommandBuilder is quite uncomfortable.
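In code, the approach might look like this (a sketch - the Customers table, connection and adapter are assumed):
SqlCommand insert = new SqlCommand(
    "INSERT INTO Customers (CompanyName) VALUES (@CompanyName); " +
    "SELECT * FROM Customers WHERE CustomerID = @@IDENTITY", connection);
insert.Parameters.Add("@CompanyName", SqlDbType.NVarChar, 40, "CompanyName");

// map the returned row (and any output parameters) back onto the inserted DataRow
insert.UpdatedRowSource = UpdateRowSource.Both;

adapter.InsertCommand = insert;
adapter.Update(dataSet, "Customers");   // inserted rows come back with their identity filled in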
Well,
This option is a possibility too (similar to the one with the stored procedure, but with the statements created inline). The problem is that with this approach you need to reselect the entire row again after you update it, when actually there is no need to get the entire row (since the only value that changed is the identity column). If you have a big row and your database is located on a remote machine, you can have some overhead in terms of performance (we're working with Access here, and probably with a small database, but this method can be used with other databases).
Mauricio Ritter - Brazil
Sonorking now: 100.13560 MRitter
Life is a mixture of painful separations from your loved ones and joyful reunions, without those two we'd just be animals I guess. The more painful the separation, that much more wonderful will be the reunion - Nish
"Th@ langwagje is screwed! It has if's but no end if's!! Stupid php cant even do butuns on forms! VISHAUL BASICS ARE THE FUTSHURE!" - Simon Walton
1. It is not necessary to reselect the whole row with the approach I described. I think it also works if you just reselect the identity column.
2. The biggest performance loss is the separation of the commands, which (in a distributed scenario) ends in two separate network calls to the DB. I guess you'd need quite a big row before this performs worse. But of course you are right that it is better just to reselect the ID column, as is possible with both approaches.
3. Of course you are right about the stored procedure.
Besides... what do you say about the following scenario: when you are working with transactions you often have to get the changes of a dataset and run the update on them. If the transaction is successful, you have to merge the dataset with the changes back into the original one. Here the problem occurs that the dataset is not capable of merging the inserted rows back, because of the IDs that have changed. One solution is to reject the changes on the original dataset, but this works only if we are just inserting data. My current solution is to select all the deleted rows in the original dataset and reject the changes on them before merging back the dataset with the changes. But I am wondering whether there is a nicer solution. Can you help me with that?
mkonrad wrote:
One solution is to reject the changes on the original dataset
Actually, that's what I thought... and in my opinion it works.
mkonrad wrote:
but this works only if we are just inserting data.
Why!? In my opinion it should work with deletions and updates too (as long as you use the AcceptChangesDuringFill option)... here are the steps I would take:
1) Call GetChanges() on the dataset with the changes (let's call it the "original" dataset), and create a "changes submit" dataset;
2) Change the "AcceptChangesDuringFill" option of all the data adapters involved to FALSE (in order to maintain the state of each row);
3) Create the transaction and associate it with each data adapter command;
4) Call the Update method of each data adapter, passing the "changes submit" dataset as parameter;
5) If everything goes fine (and we don't need to call rollback), we reject all the changes in the "original" dataset in order to remove the inserted rows. Then we Merge the "changes submit" dataset into the "original" dataset. Since we've set "AcceptChangesDuringFill" to false, the row states of the "changes submit" dataset will be maintained, and it will insert, update and delete the rows in the "original" dataset (see the sketch after these steps);
6) If we need to rollback, we'll just roll back the transaction and our "original" dataset will be preserved (without the identity key fields).
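A sketch of those steps, assuming a single table and a single adapter (names like "original", "adapter" and "MyTable" are placeholders):
// 1) split off the pending changes
DataSet submit = original.GetChanges();

// 2) keep the row states when the adapter refreshes rows
adapter.AcceptChangesDuringFill = false;

// 3) one transaction for all three commands
SqlTransaction trans = connection.BeginTransaction();
adapter.InsertCommand.Transaction = trans;
adapter.UpdateCommand.Transaction = trans;
adapter.DeleteCommand.Transaction = trans;

try
{
    adapter.Update(submit, "MyTable");   // 4) submit the changes
    trans.Commit();

    original.RejectChanges();            // 5) drop the stale inserted rows...
    original.Merge(submit);              //    ...and merge the refreshed ones back
}
catch
{
    trans.Rollback();                    // 6) the original dataset is preserved
    throw;
}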
Mauricio Ritter - Brazil
Sonorking now: 100.13560 MRitter
Life is a mixture of painful separations from your loved ones and joyful reunions, without those two we'd just be animals I guess. The more painful the separation, that much more wonderful will be the reunion - Nish
"Th@ langwagje is screwed! It has if's but no end if's!! Stupid php cant even do butuns on forms! VISHAUL BASICS ARE THE FUTSHURE!" - Simon Walton
Thank you for the tip! I somehow missed the AcceptChangesDuringFill flag
This is not a reflection on your article at all, which I think is excellent.
The question is more oriented to the use of datasets, adapters, etc. Do people really code with all these hard-coded SQL statements, table names, column names, etc., in their applications? I see this quite often in CP (and other) articles. This seems awful, because what happens when you change the schema? All this code needs to be changed and recompiled. Working in my own little one-man consulting universe, I really have no idea what people really do. The only thing I experienced as a team leader was a disaster between the DB group and the project group, because of exactly this style of DB interfacing.
I guess the reason I've seen a move toward stored procedures is that it takes all this stuff and offloads it to the DB, and you can just pass parameters to the procedure.
But what I've never seen done (except by myself) is abstracting the interface (well, now it's called "business logic", isn't it?) so that the data structures the application uses are completely separate from the database. Anyways, I'm getting around to that in my AAL articles, but I was just wondering, in general, how most people code their database interface.
Great article, BTW!
Marc
Help! I'm an AI running around in someone's f*cked up universe simulator. Sensitivity and ethnic diversity means celebrating difference, not hiding from it. - Christian Graus Every line of code is a liability - Taka Muraoka Microsoft deliberately adds arbitrary layers of complexity to make it difficult to deliver Windows features on non-Windows platforms--Microsoft's "Halloween files"
Marc Clifton wrote:
Do people really code with all these hard coded SQL statements, table names, column names, etc, in their applications?
Actually... I don't (I abstract all the logic in classes), but my point in the article was to solve the problem with the updatable datasets (with identity columns), so I tried to keep the code as simple as possible. I created this little app as an example for my classes at the CTEC, since most of the students ask questions about this kinda stuff.
Anyway... if you work in a disconnected environment (where you don't have the SQL Server connected all the time, for example) you could "hard-code" the inserts and updates in the DataSet and persist it to an XML file (using the XML persistence features of the DataSet class), so that later on you submit all the changes to the DB.
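In code that is just a couple of lines (file and table names invented):
// offline: park the DataSet, keeping the row states, as a DiffGram
dataSet.WriteXml("pending-changes.xml", XmlWriteMode.DiffGram);

// back online: reload the changes and push them to the server in one go
DataSet pending = new DataSet();
pending.ReadXml("pending-changes.xml", XmlReadMode.DiffGram);
adapter.Update(pending, "MyTable");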
Marc Clifton wrote:
I'm getting around to that in my AAL articles
BTW... nice articles Marc... I read all of them !
Mauricio Ritter - Brazil
Sonorking now: 100.13560 MRitter
Life is a mixture of painful separations from your loved ones and joyful reunions, without those two we'd just be animals I guess. The more painful the separation, that much more wonderful will be the reunion - Nish
"Th@ langwagje is screwed! It has if's but no end if's!! Stupid php cant even do butuns on forms! VISHAUL BASICS ARE THE FUTSHURE!" - Simon Walton
Mauricio Ritter wrote:
so that later on you submit all the changes to the DB.
I recently installed a custom inventory and workorder management system for a new boatyard client, and one of the questions that the two owners asked was "can we each take some of the data entry work home with us and work on it over the weekend, and then merge all our changes back into the master database?" to which I said "no, definitely not!".
The primary issue being, as you described in your article, the use of a primary key (basically an autoincrement ID) which would have caused lots of conflict between the two "sets" of data and of course their children.
This is not the first time I wish Access had something like Oracle's "sequence" (I haven't checked if SQL Server does, but at one point it did not), because then I could have started them each with two different sequences.
I suppose I could actually have done the same thing because I use a table to simulate a sequence, but I hadn't thought of that until now.
Anyways, the point being, it's not easy merging changes that have been done offline, especially when there's more than one person making changes!
Marc
Help! I'm an AI running around in someone's f*cked up universe simulator. Sensitivity and ethnic diversity means celebrating difference, not hiding from it. - Christian Graus Every line of code is a liability - Taka Muraoka Microsoft deliberately adds arbitrary layers of complexity to make it difficult to deliver Windows features on non-Windows platforms--Microsoft's "Halloween files"
Marc Clifton wrote:
"can we each take some of the data entry work home with us and work on it over the weekend, and then merge all our changes back into the master database?" to which I said "no, definitely not!".
The answer to that question could be YES if instead of IDENTITY you use a GUID (uniqueidentifier) data type for the key. I believe that since the GUID data type was implemented in SQL Server, IDENTITY is a lot less useful; with a GUID you don't have to wait for the insertion to happen - you can know the value before it actually does!
I am aware of the size increase in the data type, but even that is becoming less of a problem nowadays! I was against the use of it for a while, but a lot has changed since I started to realize the benefits that come with this type.
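For example (a sketch; the commands and column names are made up):
// the key is generated on the client, so parent and child rows can be
// wired together before anything is sent to the database
Guid orderId = Guid.NewGuid();
orderCmd.Parameters["@order_id"].Value = orderId;       // uniqueidentifier column
detailCmd.Parameters["@fk_order_id"].Value = orderId;   // no @@IDENTITY round trip needed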
Good article by the way ...
Noeld
I would be very hesitant to use a GUID as a PK for lots of reasons other than size. For one, GUIDs are not sequential, which means no clustered indexing. Also, not all DBMSs support GUIDs, though most do now. Portability between databases can be important, especially for some types of projects. To use a GUID in these DBMSs, you will either have to use a BLOB or a 36-byte char field. That's a big PK. Another issue with GUIDs is that it is not possible to request a range of GUIDs in SQL, which is often quite useful for limiting searches, etc.
However, one method which I use is to have a PK which is a 4-byte integer and may or may not be auto-incrementing (int IDENTITY in SQL Server), but have an additional 36-byte char column for the GUID which is not part of the PK. I can then use this value to allow for merging while keeping my nice and simple INT for the PK. This is basically what SQL Server merge replication uses, though obviously there is more to it than just that.
"For one, GUIDs are not sequential which means no clustered indexing."
I believe you are not correct on that one; a Clustered index Does NOT has to be sequential The Data is ORDERED sequentially but can have gaps.
Example :
<br />
CREATE TABLE <br />
TEST1<br />
<br />
(TestID uniqueidentifier NOT NULL,<br />
TestData int NULL,<br />
Constraint PK_Test1 PRIMARY KEY(TestID)<br />
)
That script will create a CLUSTERED INDEX on The TestID Column and to verify it
you can try:
CREATE CLUSTERED INDEX I_PK ON Test1(TestID)
and it will return:
Server: Msg 1902, Level 16, State 3, Line 10<br />
Cannot create more than one clustered index on table 'TEST1'. Drop the existing clustered index 'PK_Test1' before creating another.
"However, one method which I use is to have a PK which is a 4 byte integer and may or may not be auto-incrementing (int IDENTITY in SQL Server), but have an additional 36 byte char column for GUID which is not part of the PK"
Let me see if I can understand your point. You are VERY hesitant to use it but you included as an extra column
"...while keeping my nice and simple INT for PK."
How can you guarantee the uniqueness in a disconnected environment for your INT primary key?
May be doing one of the following:
- You will have to know in advance how many users will use your system an divided the value space (NOT a very scalable choice in my opinion)
- You will use an Identifier location value within that integer value but the value domain will be punished again
- You will regenerate the "primary key" in the synchronization process but if you use the GUID for that, your ACTUAL primary key is the GUID because is the only one that didn't change
"Also, not all DBMSs support GUIDs, though most do now."
What do you want to say, yes or no?
"...Another issue with GUID's is that it is not possible to request a range of GUIDsin SQL which is often quite usefull for limiting searches"
It is true that searches by GUID Ranges are meaningless but it was not designed for that. The main purpose is uniqueness, remember. That may be considered a limitation for applications where the primary key has some meaning. Be careful though with a primary key that carries some meaning because the stability over time can be changed with the business requirements and it could break it.
Marc Clifton wrote:
I recently installed a custom inventory and workorder management system for a new boatyard client, and one of the questions that the two owners asked was "can we each take some of the data entry work home with us and work on it over the weekend, and then merge all our changes back into the master database?" to which I said "no, definitely not!".
The primary issue being, as you described in your article, the use of a primary key (basically an autoincrement ID) which would have caused lots of conflict between the two "sets" of data and of course their children.
Hey Marc. One method I would suggest for implementing this feature is to not think of the process as merging changes, but rather as pure data entry followed by transactional generation. What I mean by this is that you can treat the data entry and the database updates as separate processes entirely. I have done exactly this with one of my clients and it works quite nicely and has some great side-benefits as well.
The way it works is this. First, the user enters their data into forms. Often I create a form that simulates the look of the paper form they are entering (work order, PO, survey, etc.). They can enter as many records into this form as they want to. The form records are stored in the DB as their own table/tables. When they are finished entering the raw data, they then "Process" the data, which generates the actual database transactions - modifying customer records, creating invoice records, etc.
This method not only allows for offline data entry and online merging, but has the added benefit of maintaining the integrity of the original data entry process. Once processed, the original form records are marked as READ ONLY and exist for reference purposes.
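As a rough illustration (table and column names invented), the "Process" step can then be a single set-based transaction:
// everything entered offline sits in its own staging table (workorder_entry);
// processing turns it into real records and freezes the source rows
SqlTransaction trans = connection.BeginTransaction();
SqlCommand process = new SqlCommand(
    "INSERT INTO workorders (customer_id, description) " +
    "SELECT c.customer_id, e.description " +
    "FROM workorder_entry e JOIN customers c ON c.name = e.customer_name " +
    "WHERE e.processed = 0; " +
    "UPDATE workorder_entry SET processed = 1 WHERE processed = 0",
    connection, trans);
process.ExecuteNonQuery();
trans.Commit();   // the entry rows now serve as read-only reference data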
Just my 2 cents.
PS. I suspect that you have already thought of this method, since it is not particularly original, but I thought I would bring it up anyway.
Matt Gullett wrote:
as pure data entry followed by transactional generation.
What an excellent suggestion! Thanks Matt. Talk about tunnel vision - I should have figured that out! And the beauty of it is, since everything is already going through a single method for all transactions, which is already instrumented, I just have to read the log file back in with consideration to the PK IDs! Beautiful!
Marc
Help! I'm an AI running around in someone's f*cked up universe simulator. Sensitivity and ethnic diversity means celebrating difference, not hiding from it. - Christian Graus Every line of code is a liability - Taka Muraoka Microsoft deliberately adds arbitrary layers of complexity to make it difficult to deliver Windows features on non-Windows platforms--Microsoft's "Halloween files"