How to read all the sheets from an Excel file and write them to an output file?

I have two Excel files, each containing multiple sheets. In my existing code, which uses pandas DataFrames, I've implemented logic to compare a specific sheet from both files, and I'm getting the desired results.

However, my code currently writes only that one compared sheet to the output file. I would like to modify it so that the remaining sheets from the input files are also included in the output file. How can I achieve this?

What I have tried:

The code I tried:

Python

import pandas as pd
import numpy as np
import openpyxl
from openpyxl import load_workbook
from openpyxl.styles import PatternFill, Border, Side

df1 = pd.read_excel(r'D:\excel1.xlsx', sheet_name='Sheet1', na_values=['NA']).fillna('')
df2 = pd.read_excel(r'D:\excel2.xlsx', sheet_name='Sheet2', na_values=['NA']).fillna('')

df1 = df1.set_index('ID')
df2 = df2.set_index('ID')

df3 = pd.concat([df1,df2], sort=True, copy=True)
df3a = df3.stack().explode().groupby(level=[0,1]).apply(lambda x: ', '.join(map(str, x.unique()))).unstack(1).copy()

df3a['status'] = ""
       
df3a.loc[~df3a.index.isin(df2.index),'status'] = 'old' # if not in df2 index 
df3a.loc[~df3a.index.isin(df1.index),'status'] = 'new'     # if not in df1 index

idx = df3.stack().groupby(level=[0,1]).nunique() # get modified cells. 
df3a.loc[idx.mask(idx <= 1).dropna().index.get_level_values(0),'status'] = 'modified'
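# 'status' was initialised to '' (not NaN), so fillna() would leave it untouched; replace the empty strings instead.
df3a['status'] = df3a['status'].replace('', 'same') # assume that anything not matched by the rules above is the same.
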
reorder_columns = df1.columns.tolist() + ['status']
df3a = df3a[reorder_columns]

#-------------------------------------Highlight rows with different colors---------------------------------------------------------------------------#
with pd.ExcelWriter(r'D:\excel_output.xlsx') as writer:
    df3a.to_excel(writer, sheet_name='Sheet1', index=True)

workbook = load_workbook(r'D:\excel_output.xlsx')

sheet1 = workbook['Sheet1']

# The rest of the code applies formatting to that particular sheet ..........

sheet1.delete_cols(13)

workbook.save(r'D:\excel_output.xlsx')

Posted 16-Oct-23 20:34pm

User-16078880

Updated 25-Oct-23 4:09am

Deeksha Shenoy

v2


Comments

Richard MacCutchan 17-Oct-23 4:14am

Check the documentation to see if there is a function to list all worksheets in a workbook.

2 solutions

Solution 2

You can obtain the sheet names through pandas: read the whole workbook and look at the keys of the result. See:

Python

import pandas
import openpyxl

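# read every sheet of the workbook (sheet_name=None means "all sheets"):
df = pandas.read_excel(r"D:\ConflictData.xlsx", sheet_name=None)
# then - method 1: the keys of the returned dictionary are the sheet names
print(df.keys())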

As you can see, when the second argument (sheet_name) is None, read_excel returns a dictionary of DataFrames whose keys are the names of the sheets :)
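
Building on that dictionary, here is a minimal sketch of how every sheet could be carried over to the output file. The function name copy_all_sheets and its replacements parameter are hypothetical (they are not part of pandas), and the sketch assumes openpyxl is installed as the Excel engine. A sheet listed in replacements is written from the supplied DataFrame (for example the comparison result), while every other sheet is copied as read. Note that this copies cell values only, not formatting.

```python
import pandas as pd


def copy_all_sheets(input_path, output_path, replacements=None):
    """Write every sheet of input_path to output_path.

    replacements is a hypothetical dict {sheet name: DataFrame}; a sheet
    named there is written from that DataFrame instead of the input data.
    Returns the sheet names in workbook order.
    """
    replacements = replacements or {}
    # sheet_name=None returns {sheet name: DataFrame} for the whole workbook
    sheets = pd.read_excel(input_path, sheet_name=None)
    with pd.ExcelWriter(output_path, engine="openpyxl") as writer:
        for name, frame in sheets.items():
            if name in replacements:
                # keep the index: the comparison result is indexed by ID
                replacements[name].to_excel(writer, sheet_name=name, index=True)
            else:
                # untouched sheets are written back without the default index
                frame.to_excel(writer, sheet_name=name, index=False)
    return list(sheets)
```

With the question's variables, a call might look like copy_all_sheets(r'D:\excel1.xlsx', r'D:\excel_output.xlsx', {'Sheet1': df3a}), after which the existing openpyxl formatting step can load the output file as before.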

Posted 27-Oct-23 6:11am

Maciej Los


This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Richard MacCutchan · Accepted Answer · 2023-10-16T22:27:00

You just need to iterate through the worksheets in the workbooks. Something like the following will list the sheet names in a workbook.

Python

from openpyxl import load_workbook

workbook = load_workbook(r'D:\excel1.xlsx')
# each worksheet object exposes its name as .title
for sheet in workbook.worksheets:
    print(sheet.title)

So you need something similar for the other workbook. You can then select each pair of worksheets that you wish to compare, and pass the details to the function that processes the content.
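
As a rough sketch of that selection step, the helper below splits two lists of sheet names into the names present in both workbooks and the names found in only one of them. The function name pair_sheets is hypothetical, and pairing sheets by identical name is an assumption; the question compares 'Sheet1' with 'Sheet2', so an explicit mapping may be needed instead.

```python
def pair_sheets(names_a, names_b):
    """Split two lists of sheet names into (common, only_a, only_b).

    Order follows the input lists, so the output keeps workbook order.
    """
    set_a = set(names_a)
    set_b = set(names_b)
    common = [name for name in names_a if name in set_b]
    only_a = [name for name in names_a if name not in set_b]
    only_b = [name for name in names_b if name not in set_a]
    return common, only_a, only_b


# Example with literal names; in practice the lists could come from
# the sheetnames attribute of each openpyxl workbook.
common, only_a, only_b = pair_sheets(["Sheet1", "Summary"], ["Sheet1", "Notes"])
print(common, only_a, only_b)  # ['Sheet1'] ['Summary'] ['Notes']
```

The sheets in common can then be passed to the comparison logic, while the sheets in only_a and only_b can be copied to the output file unchanged.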
