Automate Excel with Python with openpyxl
By now you know how to work with Text, CSV and JSON.
Now, let's learn how to Automate Excel with Python.
There are many packages for working with Excel in Python, but we'll focus on OpenPyXL.
OpenPyXl is a very simple and straight-forward package that allows you to do anything you need to automate your work in Excel. So let's dive in.
📃Step 1 - Documentation
⬇️ Install OpenPyXL
First of all, we need to import the package:
However, this is not a built-in package so we need to install it.
If you use pyCharm:
Settings → Project → Python Interpreter → + (Add Package) → search openpyxl
For other IDE:
You can use the terminal and install package with pip .
⚠️ If you get any errors, double-check your virtual environment or Python version.
And if everything works, then let's start with the basics of openpyxl.
Sign-Up For Future Updates✨
Be among the first people to hear about
New Python Courses or Useful Resources!
Once there's enough demand I might start a Python Newsletter
with even more Tips and Tricks to help you learn it better!
Sign-Up For Future Updates✨
Be among the first people to hear about
New Python Courses or Useful Resources!
Once there's enough demand I might start a Python Newsletter
with even more Tips and Tricks to help you learn it better!
Sign-Up For Future Updates✨
Be among the first people to hear about
New Python Courses or Useful Resources!
Once there's enough demand I might start a Python Newsletter
with even more Tips and Tricks to help you learn it better!
1️⃣ Open/Create Excel Workbook
To begin we need to get the Workbook, which refers to the Excel file itself.
We can either open existing file or create a new one. Just remember that nothing will happen on disk until you use .save() on your Workbook.
from openpyxl import *
filename = 'Example.xlsx'
wb = load_workbook(filename)
wb = Workbook()
wb.save('text.xlsx')
from openpyxl import *
filename = 'Example.xlsx'
wb = load_workbook(filename)
wb = Workbook()
wb.save('text.xlsx')
from openpyxl import *
filename = 'Example.xlsx'
wb = load_workbook(filename)
wb = Workbook()
wb.save('text.xlsx')
💡 Tip: If you see "Permission denied" when saving, it usually means the Excel file is currently open. Close it and try again.
2️⃣ Excel Worksheets: Get/Create/Copy/Remove
Next, we need to learn about Worksheets, which are tabs inside your Excel file.
It's very common to use wb.active to get the last used worksheet, but sometimes you want to be more specific. So here are different ways to get existing Worksheet.
ws = wb.active
ws = wb['Sheet2']
all_ws = wb.worksheets
all_ws_names = wb.sheetnames
ws = wb.active
ws = wb['Sheet2']
all_ws = wb.worksheets
all_ws_names = wb.sheetnames
ws = wb.active
ws = wb['Sheet2']
all_ws = wb.worksheets
all_ws_names = wb.sheetnames
We can also Create/Copy/Remove Worksheets. Just make sure you don't delete important data when you use Remove.
new_ws = wb.create_sheet('NewSheet4')
new_ws = wb.copy_worksheet(ws)
new_ws.title = 'Copy!'
ws_to_remove = wb['NewSheet4']
wb.remove(ws_to_remove)
wb.save(filename)
new_ws = wb.create_sheet('NewSheet4')
new_ws = wb.copy_worksheet(ws)
new_ws.title = 'Copy!'
ws_to_remove = wb['NewSheet4']
wb.remove(ws_to_remove)
wb.save(filename)
new_ws = wb.create_sheet('NewSheet4')
new_ws = wb.copy_worksheet(ws)
new_ws.title = 'Copy!'
ws_to_remove = wb['NewSheet4']
wb.remove(ws_to_remove)
wb.save(filename)
💡Don't forget to include the code where we created/opened a Workbook.
🔎 Reading data
Alright we have a Workbook and a Worksheet, now we can look at the data.
filename = 'Example.xlsx'
wb = load_workbook(filename)
ws = wb.active
print(f'Worksheet: {ws.title}')
print('-'*50)
filename = 'Example.xlsx'
wb = load_workbook(filename)
ws = wb.active
print(f'Worksheet: {ws.title}')
print('-'*50)
filename = 'Example.xlsx'
wb = load_workbook(filename)
ws = wb.active
print(f'Worksheet: {ws.title}')
print('-'*50)
There are multiple ways to read information so let's cover the basics.
Starting with Reading Specific Cells (very straight-forward):
cell_a1 = ws['A1']
cell_a2 = ws.cell(row=2, column=1)
print(cell_a1.value)
print(cell_a2.value)
cell_a1 = ws['A1']
cell_a2 = ws.cell(row=2, column=1)
print(cell_a1.value)
print(cell_a2.value)
cell_a1 = ws['A1']
cell_a2 = ws.cell(row=2, column=1)
print(cell_a1.value)
print(cell_a2.value)
💡Notice that Rows/Columns count starts with 1 and not 0 like in programming.
Next, let's create a loop to read lots of values at once. At first we'll look at creating ranges to read over certain array of numbers.
Read with Range Loop :
for row in range(10):
row_data = []
for col in range(10):
cell = ws.cell(row=row+1, column=col+1)
if cell.value:
row_data.append(cell.value)
print(row_data)
for row in range(10):
row_data = []
for col in range(10):
cell = ws.cell(row=row+1, column=col+1)
if cell.value:
row_data.append(cell.value)
print(row_data)
for row in range(10):
row_data = []
for col in range(10):
cell = ws.cell(row=row+1, column=col+1)
if cell.value:
row_data.append(cell.value)
print(row_data)
This works well for writing data, but not so great for reading because we'll get a bunch of empty cells...
So let's look at better way to read.
Loop over Rows/Columns with Data:
So, instead of using random numbers to iterate over, we can actually get the number of rows/columns already used by Worksheet. This way you'll iterate over actual data.
Worksheet.iter_rows() - Iterate over list of rows containing data
Worksheet.iter_columns() - Iterate over list of columns containing data
So let's iterate over rows and read their data:
for row in ws.iter_rows():
row_data = [r.value for r in row]
print(row_data)
for item in row_data:
print(item.value)
for row in ws.iter_rows():
row_data = [r.value for r in row]
print(row_data)
for item in row_data:
print(item.value)
for row in ws.iter_rows():
row_data = [r.value for r in row]
print(row_data)
for item in row_data:
print(item.value)
Formulas vs Raw-Data
If you pay attention, you might notice that when we get values from cells we might also read actual formula instead of the data. Sometimes it might be useful, but in Python you'll often want to work with actual data instead.
So there is a trick to convert your workbook cells into raw-data with load_workbook(filename, data_only=True).
💡IMPORTANT: Be careful using data_only=True, because if you save your Excel, you'll remove all your formulas from all sheets! Might be a catastrophie
💡Pro Tip: Always create copies of your Excel files before you start automations. You don't want to destroy important data!
So, let me show you a workaround so we can read the same Excel file as Raw-Data and Values with Formulas at the same time:
wb = load_workbook(filename)
ws = wb.active
wb_data = load_workbook(filename, data_only=True)
ws_data = wb_data.active
for row, row_data in zip(ws.iter_rows(), ws_data.iter_rows()):
row_values = [cell.value for cell in row]
row_results = [cell.value for cell in row_data]
print('Row Values: ' , row_values)
print('Row Results: ' , row_results)
print('---')
wb = load_workbook(filename)
ws = wb.active
wb_data = load_workbook(filename, data_only=True)
ws_data = wb_data.active
for row, row_data in zip(ws.iter_rows(), ws_data.iter_rows()):
row_values = [cell.value for cell in row]
row_results = [cell.value for cell in row_data]
print('Row Values: ' , row_values)
print('Row Results: ' , row_results)
print('---')
wb = load_workbook(filename)
ws = wb.active
wb_data = load_workbook(filename, data_only=True)
ws_data = wb_data.active
for row, row_data in zip(ws.iter_rows(), ws_data.iter_rows()):
row_values = [cell.value for cell in row]
row_results = [cell.value for cell in row_data]
print('Row Values: ' , row_values)
print('Row Results: ' , row_results)
print('---')
💡IMPORTANT: Again! Be VERY careful using data_only=True, because if you save your Excel, you'll remove all your formulas from all sheets! Might be a catastrophie
✍️ Writing data (cells, appending rows, inserting, moving)
So we can read data, how about writing data to Excel?
Let's start with writing data to specific cells, and there are many options to choose from:
ws['B2'] = 'New B2 Value!'
ws['B3'].value = 'New B3 Value!'
ws.cell(row=4, column=2, value='New B4 Value!')
ws['B2'] = 'New B2 Value!'
ws['B3'].value = 'New B3 Value!'
ws.cell(row=4, column=2, value='New B4 Value!')
ws['B2'] = 'New B2 Value!'
ws['B3'].value = 'New B3 Value!'
ws.cell(row=4, column=2, value='New B4 Value!')
This works fine for single cells, but you'll more often want to add rows of data at the same time. Therefore it's best to use append like this:
n = ws.max_row+1
ws.append(['C007', 'Glass Window', 'Window', 5, 250, f'=D{n}*E{n}'])
n = ws.max_row+1
ws.append(['C007', 'Glass Window', 'Window', 5, 250, f'=D{n}*E{n}'])
n = ws.max_row+1
ws.append(['C007', 'Glass Window', 'Window', 5, 250, f'=D{n}*E{n}'])
💡Notice that we can use ws.max_row or ws.max_column to get the number of the last Column/Row Used. That can be really useful for formulas or getting the right cells!
Now, imagine that you have a simple table with data and the last row or column are sums. Then you might want to insert a row/column to write more data. For that you can use:
ws.insert_row()
ws.insert_cols()
ws.insert_rows(3)
ws.insert_cols(3)
ws.insert_rows(3)
ws.insert_cols(3)
ws.insert_rows(3)
ws.insert_cols(3)
Or similarly, you can also delete rows/columns
ws.delete_rows(1)
ws.delete_cols(1)
ws.delete_rows(1)
ws.delete_cols(1)
ws.delete_rows(1)
ws.delete_cols(1)
Lastly, it's less common to use but you can move your data around by defining a range of data and then specifying how many rows/columns to move it. Be careful overriding existing data!
ws.move_range("K11:L17", rows = -10, cols = -10)
ws.move_range("K11:L17", rows = -10, cols = -10)
ws.move_range("K11:L17", rows = -10, cols = -10)
🔠 Ranges
Now, let's look how we can work with ranges of data.
We can get all data in a Column, Row or a 2D-Matrix of data.
Here's an example:
cell_range = ws['A1':'C2']
col_C = ws ['C']
col_range = ws['C:D']
row10 = ws[10]
row_range = ws[5:10]
print(cell_range )
print(col_C )
print(col_range )
print(row10 )
print(row_range )
print('-'*50)
cell_range = ws['A1':'C2']
col_C = ws ['C']
col_range = ws['C:D']
row10 = ws[10]
row_range = ws[5:10]
print(cell_range )
print(col_C )
print(col_range )
print(row10 )
print(row_range )
print('-'*50)
cell_range = ws['A1':'C2']
col_C = ws ['C']
col_range = ws['C:D']
row10 = ws[10]
row_range = ws[5:10]
print(cell_range )
print(col_C )
print(col_range )
print(row10 )
print(row_range )
print('-'*50)
Now if you'd want to iterate over the data in a range, keep in mind of its structure.
If it's just a Column or a Row it will be treated as a list, so you'd use only one loop. But if it's a range of multiple rows and columns, then you'd want to iterate over rows and then items in each raw.
Like this:
for item in col_C:
print(item)
for row in cell_range:
print(row)
for item in row:
print(item)
for item in col_C:
print(item)
for row in cell_range:
print(row)
for item in row:
print(item)
for item in col_C:
print(item)
for row in cell_range:
print(row)
for item in row:
print(item)
🎨 Formatting
We've covered the main basics, now let's look at formatting.
For that we'll need to import styles from openpyxl.styles import * and then we'd want to configure various attributes of cells:
from openpyxl.styles import *
ws['A5'].font = Font(bold=True, color='FFFF9A2E', size=24)
ws['A5'].fill = PatternFill(start_color='1C1C1C', end_color='C9C9C9', fill_type='solid')
ws['A5'].alignment = Alignment(horizontal='center')
ws.merge_cells("B1:D5")
ws.row_dimensions[5].height = 50
ws.column_dimensions['A'].width = 50
from openpyxl.comments import Comment
ws['A5'].comment = Comment('Auto Comment', 'EF')
ws['A5'].hyperlink = 'https://www.python.org'
ws['A5'].style = 'Hyperlink'
wb.save(filename)
from openpyxl.styles import *
ws['A5'].font = Font(bold=True, color='FFFF9A2E', size=24)
ws['A5'].fill = PatternFill(start_color='1C1C1C', end_color='C9C9C9', fill_type='solid')
ws['A5'].alignment = Alignment(horizontal='center')
ws.merge_cells("B1:D5")
ws.row_dimensions[5].height = 50
ws.column_dimensions['A'].width = 50
from openpyxl.comments import Comment
ws['A5'].comment = Comment('Auto Comment', 'EF')
ws['A5'].hyperlink = 'https://www.python.org'
ws['A5'].style = 'Hyperlink'
wb.save(filename)
from openpyxl.styles import *
ws['A5'].font = Font(bold=True, color='FFFF9A2E', size=24)
ws['A5'].fill = PatternFill(start_color='1C1C1C', end_color='C9C9C9', fill_type='solid')
ws['A5'].alignment = Alignment(horizontal='center')
ws.merge_cells("B1:D5")
ws.row_dimensions[5].height = 50
ws.column_dimensions['A'].width = 50
from openpyxl.comments import Comment
ws['A5'].comment = Comment('Auto Comment', 'EF')
ws['A5'].hyperlink = 'https://www.python.org'
ws['A5'].style = 'Hyperlink'
wb.save(filename)
This is a big example that includes a lot of examples, but you can use only the parts that are relevant to you. Maybe you only want to add some styling or hyperlink then only use those sections.
💡If you use stlyes like Hyperlink it may override other attributes.
🧪 Final Example: EF-Rooms (mini “report”)
Alright, we've covered a lot of bits and pieces now let's put it all together into an actual example you might want to use in your Python scripts.
Let's say we calculated some data where we made a list of nested lists to represent multiple rows:
data = [
['Room Number', 'Room Name' , 'Area (sqm)' , 'Occupancy' , 'Finishes' ],
['101' , 'Lobby' , 35 , 50 , 'Marble Floor' ],
['102' , 'Conference Room' , 25 , 20 , 'Carpet' ],
['103' , 'Office' , 15 , 5 , 'Vinyl Flooring'],
]
data = [
['Room Number', 'Room Name' , 'Area (sqm)' , 'Occupancy' , 'Finishes' ],
['101' , 'Lobby' , 35 , 50 , 'Marble Floor' ],
['102' , 'Conference Room' , 25 , 20 , 'Carpet' ],
['103' , 'Office' , 15 , 5 , 'Vinyl Flooring'],
]
data = [
['Room Number', 'Room Name' , 'Area (sqm)' , 'Occupancy' , 'Finishes' ],
['101' , 'Lobby' , 35 , 50 , 'Marble Floor' ],
['102' , 'Conference Room' , 25 , 20 , 'Carpet' ],
['103' , 'Office' , 15 , 5 , 'Vinyl Flooring'],
]
Now here's a code snippet on how we could create a new Workbook and write this data with some simple formatting for headers:
data = [
['Room Number', 'Room Name' , 'Area (sqm)' , 'Occupancy' , 'Finishes' ],
['101' , 'Lobby' , 35 , 50 , 'Marble Floor' ],
['102' , 'Conference Room' , 25 , 20 , 'Carpet' ],
['103' , 'Office' , 15 , 5 , 'Vinyl Flooring'],
]
wb = Workbook()
ws = wb.active
ws.title = 'EF - Rooms'
headings = data[0]
data = data[1:]
from openpyxl.styles import Font
for col_n, heading in enumerate(headings, start=1):
cell = ws.cell(row=1, column=col_n, value=heading)
cell.font = Font(bold=True, size=16)
for row in data:
ws.append(row)
filename = 'EF-Rooms.xlsx'
wb.save(filename)
data = [
['Room Number', 'Room Name' , 'Area (sqm)' , 'Occupancy' , 'Finishes' ],
['101' , 'Lobby' , 35 , 50 , 'Marble Floor' ],
['102' , 'Conference Room' , 25 , 20 , 'Carpet' ],
['103' , 'Office' , 15 , 5 , 'Vinyl Flooring'],
]
wb = Workbook()
ws = wb.active
ws.title = 'EF - Rooms'
headings = data[0]
data = data[1:]
from openpyxl.styles import Font
for col_n, heading in enumerate(headings, start=1):
cell = ws.cell(row=1, column=col_n, value=heading)
cell.font = Font(bold=True, size=16)
for row in data:
ws.append(row)
filename = 'EF-Rooms.xlsx'
wb.save(filename)
data = [
['Room Number', 'Room Name' , 'Area (sqm)' , 'Occupancy' , 'Finishes' ],
['101' , 'Lobby' , 35 , 50 , 'Marble Floor' ],
['102' , 'Conference Room' , 25 , 20 , 'Carpet' ],
['103' , 'Office' , 15 , 5 , 'Vinyl Flooring'],
]
wb = Workbook()
ws = wb.active
ws.title = 'EF - Rooms'
headings = data[0]
data = data[1:]
from openpyxl.styles import Font
for col_n, heading in enumerate(headings, start=1):
cell = ws.cell(row=1, column=col_n, value=heading)
cell.font = Font(bold=True, size=16)
for row in data:
ws.append(row)
filename = 'EF-Rooms.xlsx'
wb.save(filename)
Overall, it's not complicated and you can expand it to your own needs. Just take it one step at the time.
🐼 Bonus: pandas
Remember I've mentioned that there are plenty of packages to work with Excel.
Openpyxl is a great all-in-one package for working with Excel using Python. However, if you're going to go deep into data science and often work with analyzing data, you might also want to learn a more scientific package called - Pandas.
It's also a custom pacakge so don't forget to install it - pip install pandas
It's a more powerful package which is often used in AI, machine learning and other data related disciplines. And it also has many option export your data in various formats including Excel. For that we'd need to use DataFrame data structure in pandas.
Here's an example on how we could convert a JSON data into an excel in just a few lines:
import pandas as pd
data = {
"Project Name": [
"Burj Khalifa",
"Shanghai Tower",
"Abraj Al-Bait Clock Tower",
"Ping An Finance Center",
"Lotte World Tower"
],
"Height (m)": [
828,
632,
601,
599,
555
],
"Country": [
"United Arab Emirates",
"China",
"Saudi Arabia",
"China",
"South Korea"
],
"City": [
"Dubai",
"Shanghai",
"Mecca",
"Shenzhen",
"Seoul"
]
}
df = pd.DataFrame(data)
print(df)
df.to_excel('big buildings.xlsx', index=False)
import pandas as pd
data = {
"Project Name": [
"Burj Khalifa",
"Shanghai Tower",
"Abraj Al-Bait Clock Tower",
"Ping An Finance Center",
"Lotte World Tower"
],
"Height (m)": [
828,
632,
601,
599,
555
],
"Country": [
"United Arab Emirates",
"China",
"Saudi Arabia",
"China",
"South Korea"
],
"City": [
"Dubai",
"Shanghai",
"Mecca",
"Shenzhen",
"Seoul"
]
}
df = pd.DataFrame(data)
print(df)
df.to_excel('big buildings.xlsx', index=False)
import pandas as pd
data = {
"Project Name": [
"Burj Khalifa",
"Shanghai Tower",
"Abraj Al-Bait Clock Tower",
"Ping An Finance Center",
"Lotte World Tower"
],
"Height (m)": [
828,
632,
601,
599,
555
],
"Country": [
"United Arab Emirates",
"China",
"Saudi Arabia",
"China",
"South Korea"
],
"City": [
"Dubai",
"Shanghai",
"Mecca",
"Shenzhen",
"Seoul"
]
}
df = pd.DataFrame(data)
print(df)
df.to_excel('big buildings.xlsx', index=False)
What I like a lot about pandas is that if you print your DataFrame it has a really nice styling in the console like this:
It also has advanced features for working with data, but it can be a lesson on its own...
Happy Coding!
🙋♂️ See you in the next lesson.
- EF
Happy Coding!
🙋♂️ See you in the next lesson.
- EF
Happy Coding!
🙋♂️ See you in the next lesson.
- EF