Excel & Python

Preface

Fibonacci

Simulation

Database

Web

Plotting results

UDF

This website's continuity

Python & Excel

Excel is part of MS Office.
VBA is the programming language in MS Office.
Python is a programming language that has been developed since the early 90's.
Unlike VBA Python is an Open Source programming language.
The number of Python developers exceeds those of VBA by far.
Python has many sub-libraries that can be complementary to the options in MS-Office.
Python can calculate with indefinite large numbers unlike Excel for instance.

Many Python programmers would like a better integration of Python and Excel.
In the Python platform several attempts have been made to link Python to Excel.
An important one being XLwings.
XLwings is part of the Anaconda distribution of Python: https://www.anaconda.com/download/

I researched XLwings to assess its value in linking Python and Excel.
XLwings creates an own Python sub library (xlwings.py) in which it rewrites all Excel options in Python.
I don't consider that a fruitful path to follow.

I tried to develop an alternative in which:
- Python can be started from/by Excel
- arguments from Excel can be passed to the Python code
- Python's result will be stored in a form that is manageable for Excel: a csv-file or graphic
- the Excel file is linked to the Python output by queries
- the user can view/adapt the Python code in Excel.

In order to compare the XLwings approach to mine I rewrote XLwings' examples.
You can find them here: https://www.xlwings.org/examples

Advantages in my approach:
- no need for a separate Python library
- the way the Python output will be used takes place in Excel, not 'hardcoded' in Python code
- Python will only be used for what Excel can't do: Python is supplementary to Excel
- the use of Python output in Excel is flexible
- the use of Excel-events eliminates buttons in a worksheet and makes changes instantaneous
- the Python tab in the Ribbon contains all (i.e. very few) buttons for recalculation/updating/refreshing
- no special Add-In required
- the Python-code can be viewed/adapted in a userform in Excel: no special Python editor is required
- all options in Excel (actual and in the future) are available; no need to wait for implementation of new options in a Python library

Excel as Python GUI

The VBA options in Excel provide you the means to create a VBA Userform as Grafical User Interface(GUI) for Python.
Python has no builtin GUI.
The Python-library with which you can create a GUI isn't very userfriendly to put it mildly.
So why not using the builtin graphical VBA Userform options ?
I integrated the examples 'Fibonacci', 'simulation', 'plot' and 'database' into the workbook 'Python GUI.xlsb'.
The web example uses the pivottable and slicers as GUI; a userform isn't applicable in that case.
The UDF example is meant for worksheets only.

As soon as the workbook opens the userform will be shown.
The Userform contains a Multipage with 5 tabs.
All Pythoncode has been integrated in 1 Python-file: 'Excel.py'.
The functions that will be called from Excel have separate names in the Python file.
If Python requires extra libraries to be loaded it will do so as soon as the Function has been called.
So the memory won't be burdened by libraries the function doesn't use.

The Python code writes its results to a 'csv'-file or a graphical file.
Sometimes the VBA code reads the csv-file directly.
In some cases the csv-file will be read by a querytable.
Querytables have an 'own' very hidden sheet ('database' and 'simulation').

The examples:

- a Fibonacci array
- a MonteCarlo-simulation
- the use of a music database
- web scraping and analysis of those data in a pivottable
- 'plotting': graphical representation of calculations that Excel doesn't contain
- the use of Python in UDFs

Start

Install Python in order to run the examples.
Python has momentarily 2 'flavours': Python 2.x and 3.x.
The x is a version number.
The examples on this page run in both versions, the 'database' excepted.
The database needs Python 3.x.
If Python has been installed correctly you can use "py -2" to start Python 2.x and "py -3" to launch Python 3.x.
The examples use this way to start Python.

Examples

A ZIP-file contains an example.
Each Zip-file consists of
- an Excel *.xlsb file
- a Python *.py file
- a *.csv file

Every Excel-file contains
'Python' tab in the Ribbon
Py edit button to show/amend the Python code
button(s) to run macros

NB. Keep in mind: a Userform only shows the Python code as plain text
The Userform is not a Python-editor with accessories.

Steps

Unzip the zip-file in a separate, own directory.
Open the Excel file.
The button(s) in the Python-tab start the macros that trigger the Python-code.
The result will be written into the worksheet.
The first time the code runs it might be somewhat slower since libraries have to be loaded into memory.

Excel as Python GUI Python GUI



Fibonacci Fibonacci

Enter in cell A2 how many Fibonacci numbers you want to generate.
The VBA code in the worksheet_change event triggers the Python code.
The Fibonacci numbers pop up in column B.

Enter in cells C2 and D2 which numbers will constitute the double-sum: 2*(C2+D2)
The VBA code in the worksheet_change event triggers the Python code.
The result will be shown in cell B1.

You can switch between calculations using the buttons 'Double' and 'Fibonacci' in the Python Tab in the Ribbon.

The Excel file contains 1 querytable (in cell B1) for several Python calculations.
The use of Python to calculate the double sum is only better than Excel for large numbers Excel can't handle.

Have for example a look at the result for 600 Fibonacci-numbers.

Python-file
def main(n,n_csv):
c_00 = ''
a, b = 0, 1
for it in range(n):
a, b = b, a + b
c_00 += str(a) +'\n'
open(n_csv,'w').write(c_00)

def d_sum(x, y, n_csv):
open(n_csv,'w').write(str(2 * (x + y))+'\n \n ')
Simulation Simulation

Enter the parameters for the simulation in range B1:B7.
Click button 'Refresh' in the Python tab in the Ribbon.
The query in the hidden worksheet 'data' reads the results of the calculation in Python.
The graph in sheet 'simulation' uses the query as DataSource.
Refreshing the querytable means updating the graph.

At the opening of the workbook the connection of the Querytable will be adapted to the csv-file's current location.

The VBA code to trigger Python is in the codemodule of Sheet1.
You can move the table in sheet1 to every location in the worksheet.

The button 'Py edit' in the Python tab shows the Python code.
You can inspect, amend and save the code.

Python file
import numpy as np


def main(n_0,n_1,n_2,n_3,n_4,n_5,n_csv):
n_3=np.log(1 + float(n_3))
n_4=float(n_4)
n_6=float(float(n_1)/n_2)

perc_selection = [5, 50, 95] # hardcoded

price = np.zeros((n_2 + 1, n_0))
percentiles = np.zeros((n_2 + 1, 3))

price[0,:] = n_5
percentiles[0,:] = n_5

for t in range(1, n_2 + 1):
rand_nums = np.random.randn(n_0)
price[t,:] = price[t-1,:] * np.exp((n_3 - 0.5 * n_4**2) * n_6 + n_4 * rand_nums * np.sqrt(n_6))
percentiles[t, :] = np.percentile(price[t, :], perc_selection)

sy=[t* n_6 for t in range(0,n_2+1)]

open(n_csv,'w').write('')
for (it,it1,it2) in zip(percentiles,price,sy):
open(n_csv,'a').write(str(it[0])+"," +str(it[1])+"," + str(it[2]) +','+str(it1[0]) +','+str(it2) +'\n')


Database Database

The Zip file contains:
- Excel file database.xlsb
- Python file database.py
- csv files combo.csv and list.csv
- Python database chinook.sqlite
The Python code uses the Python library 'sqlite'.

Excel file

The file contains 2 querytables

Querytable 'combo' in the hidden worksheet 'source'.
It contains the inventory of all musical genres in the database.
The Querytable is connected to the csv file 'combo.csv'.
File combo.csv contains the results of the function 'combo' in the Python file 'database.py'
This function filters the musical genres in database 'chinook.sqlite'.
The Combobox 'C_00' in sheet 'playlist' is filled by querytable 'combo' in sheet 'source'.

Querytable 'list' in sheet 'playlist'.
It contains all the tracks in the database that match the selected musical genre in the combobox in sheet 'playlist'.
The querytable is connected to the csv file 'list.csv'
File list.csv contains the results of the function 'list' in the Python file 'database.py'
This function filters the tracks in database 'chinook.sqlite' that match the selected musical genre in Combobox 'C_00' in sheet 'playlist'.

Opening the workbook will adapt the connection of the Querytables to the csv-file's current location.

Use button 'combo' in the Python tab in the Ribbon to update the musical genres in Combobox 'C_00'.

The button 'Playlist' in the Python tab selects the matching tracks.
It is more sensible to trigger the Python code after the selection in Combobox 'C_00'.
You will find that code in the 'C_00_change' event.

The button 'Py edit' in the Python tab shows the Python code.
You can inspect, amend and save the code.

The Python file

The VBA code to trigger Python is in the codemodule of Sheet1.
The Python code illustrates 2 methods to pass a SQL-string in Python:
1. the SQL-string is 'hard coded' in the Python code: the function 'list'
2. the SQL-string is passed as argument from Excel to the Python code: the function 'combo'

import sqlite3


def list(n_db,n_sql,n_csv):
cursor = sqlite3.connect(n_db, detect_types=sqlite3.PARSE_DECLTYPES | sqlite3.PARSE_COLNAMES).cursor()

cursor.execute(
"""
SELECT
t.Name AS Track, alb.Title AS Album, art.Name AS Artist, t.Composer
FROM PlaylistTrack pt
INNER JOIN Track t ON pt.TrackId = t.TrackId
INNER JOIN Album alb ON t.AlbumId = alb.AlbumId
INNER JOIN Artist art ON alb.ArtistId = art.ArtistId
WHERE PlaylistId = ?
""", (n_sql,))

open(n_csv,'w').write(','.join(it[0] for it in cursor.description)+'\n')
open(n_csv,'a').write('\n'.join(','.join(str(it1).replace(',',' ') for it1 in it) for it in cursor))

def combo(n_db,n_sql,n_csv):
cursor = sqlite3.connect(n_db, detect_types=sqlite3.PARSE_DECLTYPES | sqlite3.PARSE_COLNAMES).cursor()
open(n_csv,'w').write('\n'.join(it[1] for it in cursor.execute(n_sql)))

Web data Web

This file isn't a XLwings example.
The URL contains approx. 417000 records at the time this code was written.
If the URL doesn't exist anymore you can replace it in the VBA macro 'M_refresh'.

The Python code uses the Python library 'requests'.

The sheet 'data' contains a querytable 'data'.
The query is connected to the file 'data.csv'.
The Python code writes the results of the 'webscraping' into 'data.csv'.

The sheet 'data' can be made invisible.
The event code in ThisWorkbook_Open adapts the Querytable's connection to the location of the csv file 'data.csv'.

The sheet 'pivottable' contains a pivottable.
The 'Data Source' of this pivottable is the querytable in sheet 'data'.
Every time the webdata are being updated the pivottable will be refreshed too.

The button 'Refresh' in de Python tab refreshes the webdata.
The VBA macro 'M_refresh' to triggger Python is in the codemodule of Sheet1.

The button 'Py edit' in the Python tab shows the Python code.
You can inspect, amend and save the code.

Python-bestand
import requests


def load_data(n_url,n_csv):
open(n_csv, 'w', newline="").write(requests.get(n_url).text)

Plotting results Plot

The Python code uses 2 Python libraries:
- 'Numpy' for mathematical calculation
- 'Matplotlib' for graphical representation of mathematical results.

Enter in sheet 'streamplot' in cell B1 the parameter value (integer) for calculation in Numpy.
Start the Python code clicking the button 'Draw' in the Python tab in the Ribbon.

The Python code creates the graphical file 'graph.png' as result.
The VBA macro 'M_plot' adds the graphical file into the worksheet.
The VBA macro 'M_plot' is in the macromodule of sheet 'streamplot'.

The button 'Py edit' in the Python tab shows the Python code.
You can inspect, amend and save the code.

Python file
import numpy as np
import matplotlib.pyplot as plt


def main(n,n_png):
# Based on: http://matplotlib.org/users/screenshots.html#streamplot
Y, X = np.mgrid[-3:3:100j, -3:3:100j]
U = -1 + n * X**2 + Y
V = 1 - n * X - Y**2

fig, ax = plt.subplots(figsize=(6, 4))
strm = ax.streamplot(X, Y, U, V, color=U, linewidth=2, cmap=plt.cm.autumn)
fig.colorbar(strm.lines)

fig.savefig(n_png)

UDF UDF

One of XLwings' examples is a worksheet with UDFs.
A combination of UDF and Python is not very fortunate.
UDFs will be recalculated more often then strictly necessary.
Even if a change takes place in another open workbook.
I think it would be better to use the method implemented in the Fibonacci example.
In that case recalculation only takes place if one of the constituents of the calculation has been altered.
All other calculations will be left alone.
That saves a lot of calculation time and makes the code faster and more robust.
I strongly advise not to combine UDFs and Python.

Nevertheless for illustration purposes I rewrote the XLwings example to show how the same results can be accomplished without XLwings.

The ZIP file contains:
- Excel file UDF.xlsb
- 4 csv files
- Python file udf.py

The file 'UDF.xlsb' contains 4 querytables: 'data_001 ...data_004'.
These queries are connected to 4 CSV files 'data_001.csv ....data_004.csv'
Three querytables are part of sheet 'UDF'; querytable 'data_001' is in the hidden sheet 'data'.
The 'workbook_Open' event code adapts the connections between the csv files and the querytables.

UDFs can only be part of a macromodule; in this case module 'python_udf'.

UDFs will be triggered by every change in the worksheet. That why the Python tab in the Ribbon doesn't contain any button to start macro's.

The button 'Py edit' in the Python tab shows the Python code.
You can inspect, amend and save the code.

The XLwings example
The first UDF in the XLwings example retrieves the workbook.name.
That seems rather farfetched: the Excel formula =cell("filename") get it as well.
I would rather restrict the use of Python to cases Excel doesn't supply.
If you prefer a UDF I'd suggest this (much faster) one:
Function F_snb()
F_snb=thisworkbook.fullname
End Function
The Python file udf.py contains 4 functions: 'double_sum, add_one, matrix_mult and correl'.
The function 'correl' uses the Python library Pandas (part of the Anaconda distribution).
I wasn't able to get 'correl' working.

The other functions only use standard Python code.
Please notice that the result will be written alongside the UDF-formula in the worksheet.

Python file
import pandas as pd

def double_sum(x, y, n_csv):
open(n_csv,'w').write(str(2 * (x + y)) + '\n \n')

def add_one(sn, n_csv):
open(n_csv,'w').write('\n'.join(','.join(str(it1+1) for it1 in it) for it in sn))

def matrix_mult(sn, sp,n_csv):
open(n_csv,'w').write('\n'.join(','.join(str(sn[x][y]*sp[x][y]) for y in range(len(sn))) for x in range(len(sn))))

def correl(sn,n_csv):
st=pd.dataframe(sn).corr()
open(n_csv,'w').write('\n'.join(','.join(str(it1+1) for it1 in it) for it in st))