Dumping might be the easiest way to go here, although it can get data and time consuming quite quickly. So I'd suggest to just update and insert the changes even if it's a bit more complicated - especially if you're going to work on big tables.
For a start: Are the two databases on the same server? If that's the case, you might be able to handle this with SQL using something like this:
UPDATE table1, table2
SET table1.col1 = table2.col1,
table1.col2 = table2.col2,
table1.col3 = table2.col3,
...
table1.col_last = table2.col_last
WHERE table1.id = table2.id AND table1.time_stamp <> table2.time_stamp
INSERT INTO table1 (col1, col2, col3, ... col_last)
SELECT col1, col2, col3, ... col_last FROM table2
WHERE NOT EXISTS (SELECT * FROM table2 WHERE table2.id = table1.id)
I'm not familiar with MySQL and I cannot test it. So please do some testing before using this!
If you need two different connections, it get's ugly. I'd try to do something similar as the SQL statement implies to do:
- Get lists of all ids and their timestamps:
- List A from table1 (target)
- List B from table2 (source)
- Filter the lists to find:
- List1: ids that exist in A and B but differ in timestamp
- List2: ids in B that do not exist in A
- Load all entries from table2 where id is in List1 and update those in table1
- Load all entries from table1 where id is in List2 and insert those in table1
The reason behind loading only id and timestamp is to safe memory and query / transfer time so the code doesn't have to handle all the data just to determine where the changes are. In C# that could look something like this:
class IdTsEntry
{
public int Id { get; set; }
public DateTime TimeStamp { get; set; }
}
using (DbCommand cmd1 = conn1.CreateCommand())
using (DbCommand cmd2 = conn2.CreateCommand())
{
cmd1.CommandText = "SELECT id, time_stamp FROM table1";
List<idtsentry> A = new List<idtsentry>();
using (DbDataReader reader = cmd1.ExecuteReader())
{
while (reader.Read())
{
A.Add(new IdTsEntry() {
Id = reader.GetInt32(0),
TimeStamp = reader.GetTimeStamp(1)
});
}
}
cmd2.CommandText = "SELECT id, time_stamp FROM table2";
List<idtsentry> B = new List<idtsentry>();
using (DbDataReader reader = cmd2.ExecuteReader())
{
while (reader.Read())
{
A.Add(new IdTsEntry() {
Id = reader.GetInt32(0),
TimeStamp = reader.GetTimeStamp(1)
});
}
}
List<int> List1 = new List<int>();
List<int> List2 = new List<int>();
foreach (IdTsEntry b in B)
{
var a = A.FirstOrDefault(e => e.Id.Equals(b.Id));
if (a == null)
List2.Add(b.Id);
else if (!a.TimeStamp.Equals(b.TimeStamp))
List1.Add(b.Id);
}
string ct1 = "SELECT col1, col2, ... col_last FROM table2 WHERE id = {0}";
string ct2 = "UPDATE table1 SET col1 = @val1, col2 = @val2, ... " +
"col_last = @val_last WHERE id = {0}";
foreach (int id in List1)
{
cmd1.CommandText = String.Format(ct1, id);
cmd2.Parameters.Clear();
int i = 0;
using (DbDataReader reader = cmd1.ExecuteReader())
{
if (!reader.Read()) continue;
for (int n = 0; n < reader.FieldCount; n++)
{
cmd2.Parameters.Add(String.Format("val{0}", ++i),
reader.IsDBNull(n) ? DBNull.Value : reader.GetValue(n));
}
}
cmd2.CommandText = String.Format(ct2, id);
cmd2.ExecuteNonQuery();
}
ct1 = "SELECT id, col1, col2, ... col_last FROM table2 WHERE id = {0}";
ct2 = "INSERT INTO table1 (id, col1, col2, ... col_last) " +
"VALUES (@val1, @val2, @val3, ... @val_last)";
cmd2.CommandText = ct2;
foreach (int id in List2)
{
cmd1.CommandText = String.Format(ct1, id);
cmd2.Parameters.Clear();
int i = 0;
using (DbDataReader reader = cmd1.ExecuteReader())
{
if (!reader.Read()) continue;
for (int n = 0; n < reader.FieldCount; n++)
{
cmd2.Parameters.Add(String.Format("val{0}", ++i),
reader.IsDBNull(n) ? DBNull.Value : reader.GetValue(n));
}
}
cmd2.ExecuteNonQuery();
}
}
</int></int></int></int></idtsentry></idtsentry></idtsentry></idtsentry>
Like the SQL statement: I haven't tested this and I wrote that quickly. Hopefully it'll help you though. You can also alter the WHERE statements to cover only the date range after the last sync - that'll speed things up.