Click here to Skip to main content
15,746,302 members
Please Sign up or sign in to vote.
1.00/5 (1 vote)
See more:
How to process multiple excel file from a folder and load into database using python.

What I have tried:

  all_data = pd.DataFrame()

  for f in glob.glob("*.xlsx"):
     df = pd.read_excel(f)
     all_data = all_data.append(df,ignore_index=True)

# now save the data frame
  writer = pd.ExcelWriter('data.xlsx')
  all_data.to_excel(writer,'Sheet1',index = False)

  data = pd.read_excel('c:\\users\sam\desktop\data.xlsx')

 # rename columns
 data = data.rename(columns={'posteddate': 'posteddate',
                          'totalamt': 'totalamt',
                          'qty': 'qty',
                          'itemstatus': 'itemstatus',
                          'groupitem': 'groupitem',
                          'supdlno': 'supdlno',
                          'barcode': 'barcode',
                          'unitprice': 'unitprice',
                          'subtotal': 'subtotal',
                          'itemnumber': 'itemnumber'})

 # open the workbook and define the worksheet
 book = xlrd.open_workbook("c:\\users\sam\desktop\data.xlsx")
 sheet = book.sheet_by_name("sheet1")

 query1 = """
 create table [leaf] (

 posteddate varchar(255),
 totalamt varchar(255),
 qty varchar(255),
 itemstatus varchar(255),
 groupitem varchar(255),
 supdlno varchar(255),
 barcode varchar(255),
 unitprice varchar(255),
 subtotal varchar(255),
 itemnumber varchar(255)

 query = """
 insert into [leaf] (
  ) values (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)"""

 # execute create table
   except pypyodbc.programmingerror:

 # grab existing row count in the database for validation later
 cursor.execute("select count(*) from leaf")
 before_import = cursor.fetchone()

 for r in range(1, sheet.nrows):
  posteddate = sheet.cell(r,1).value
  totalamt= sheet.cell(r,4).value
  qty = sheet.cell(r,8).value
  itemstatus= sheet.cell(r,12).value
  groupitem= sheet.cell(r,14).value
  supdlno= sheet.cell(r,16).value
  barcode= sheet.cell(r,17).value
  unitprice= sheet.cell(r,18).value
  subtotal= sheet.cell(r,19).value
  itemnumber= sheet.cell(r,20).value
  # assign values from each row

  values = (posteddate, totalamt, qty, itemstatus,
 groupitem, supdlno, barcode, unitprice, subtotal, itemnumber)

     # execute sql query

  cursor.execute(query, values)

     # commit the transaction

 # if you want to check if all rows are imported
 cursor.execute("select count(*) from leaf")
 result = cursor.fetchone()

 print((result[0] - before_import[0]) == len(data.index))  # should be true

 # close the database connection
Updated 20-Sep-18 23:59pm

1 solution

If it's a sql server database, create a SSIS package to to it. That's the absolute easiest way to do it.
Share this answer
Member 13264296 16-Sep-18 20:23pm    
Thank you for your response sir. I will search about it. Can you gave some references?
#realJSOP 17-Sep-18 5:27am    
The tools you use depend on what version of sql server you're using. When you google "ssis", include the version of ql, such as "sql server 2016 ssis".
Member 13264296 17-Sep-18 22:38pm    
SSIS does not need a plenty of code. right?. I may try using DataFrame and update my post after I solved it.
#realJSOP 18-Sep-18 5:21am    
I've been doing SSIS for two years and have only had to write two script tasks. I have no idea what dataframe is.
Member 13264296 18-Sep-18 5:30am    
I got it sir John Simmons.I used dataframe to combine all excel file from directory into one excel file and I used my code above to load into sql database. My problem is now, is how can I locate different directory that contains xlsx file.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900