Uploading files to SharePoint using Python – Part 3

In this multi-part article, we are going to set up a Python application that uploads documents to a SharePoint Document Library. We will be using an open source Python module named SharePlum to help simplify some parts of this.

Introduction

This tutorial is broken into 3 parts. In part 1 we go through the steps to create a new virtual environment, install the dependencies and authenticate with SharePoint. In part 2 we work on uploading documents and adding metadata to an uploaded document. Then in part 3 we put all this together, iterate over a folder containing multiple files and add metadata to each file based on an index file or the name of the file.

I will be working with Visual Studio Code in a Windows environment for this tutorial. All the steps included here will work with any IDE; however, some may need to be completed in a different manner.

Interacting with SharePoint using Python – Part 1

Interacting with SharePoint using python – Part 2 

Building the main.py file

We need a way to use all of the classes we made in the last two parts. To do this we are going to make a main.py file. In this file we will use “argparse”, “configparser” and “pathlib” to accomplish a few tasks. Argparse lets us run the file with an argument that specifies which config file to use. Configparser parses this .ini config file so that we can set our variables. Pathlib is what we use to iterate over the files in the folder. In this file we will also import all the classes we have created.

from argparse import ArgumentParser
from configparser import ConfigParser
from pathlib import Path

from authentication import Auth
from uploadfile import Upload
from addmeta import AddMeta
from indexing import Indexing
from util import Util

Now we are going to parse the argument. The code below expects the “-ConfigFile” argument, passed as a string, which names the config file to use when the file is executed.

parser = ArgumentParser(description='Process the arguments')
# required=True makes argparse exit with a usage message if -ConfigFile is missing
parser.add_argument('-ConfigFile', help='Enter the config file name', type=str, required=True)
args = parser.parse_args()
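
To try the parser without going through the command line, a minimal sketch can hand the argument list to parse_args directly. The file name settings.ini is a placeholder, the build_parser helper is only for illustration, and required=True is an assumption added so that argparse itself rejects a missing argument.

```python
from argparse import ArgumentParser


def build_parser():
    # Illustrative helper, not part of the project's code.
    parser = ArgumentParser(description='Process the arguments')
    parser.add_argument('-ConfigFile', required=True, type=str,
                        help='Enter the config file name')
    return parser


# Equivalent to running: python main.py -ConfigFile settings.ini
# ('settings.ini' is a placeholder file name)
args = build_parser().parse_args(['-ConfigFile', 'settings.ini'])
print(args.ConfigFile)
```

The value is available as args.ConfigFile because argparse derives the attribute name from the option string with its leading dash removed.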

Then we are going to parse the config file. The code comes first, followed by an example of the config file.

config = ConfigParser()
config.read(args.ConfigFile)

[DEFAULT]
;user must be an owner of the SharePoint Document Library
username = sharepointuser@domain.onmicrosoft.com
password = RubberDuck!eUrThe1

[fileinfo]
folderpath = P:\path\to\folder
fileextention = pdf
indexExtention = 
;This will accept Yes or No. If Yes is set, the program will read an index file with the above extension
hasindex = No

[sharepoint]
sitename = SharePoint site
baseurl = https://domain.sharepoint.com
siteurl = https://domain.sharepoint.com/sites/SharePointsite
sharepointfolder = Document Library Name
sharepointcustomecolumnlist = ['CustomColumn1, CustomColumn1']

Now that we have parsed the config file we are going to set the variables to some values from the config.

userName = config['DEFAULT']['username']
password = config['DEFAULT']['password']
siteName = config['sharepoint']['sitename']
baseURL = config['sharepoint']['baseurl']
siteURL = config['sharepoint']['siteurl']

folderPath = config['fileinfo']['folderpath']
fileExtention = config['fileinfo']['fileextention']
indexExtention = config['fileinfo']['indexextention']
sharePointFolder = config['sharepoint']['sharepointfolder']
sharePointCustomeColumnList = config['sharepoint']['sharepointcustomecolumnlist']
hasIndex = config['fileinfo'].getboolean('hasindex')
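
A few configparser behaviours matter for these lookups, and a small self-contained sketch may make them easier to see. The values below are placeholders, and the comma-separated column list is an assumption made for the example; how the project's Util class interprets that value is not shown in this part.

```python
from configparser import ConfigParser

# Placeholder values standing in for a real .ini file.
SAMPLE = """
[DEFAULT]
username = user@example.com

[fileinfo]
fileextention = pdf
indexExtention =
hasindex = No

[sharepoint]
sharepointcustomecolumnlist = CustomColumn1, CustomColumn2
"""

config = ConfigParser()
config.read_string(SAMPLE)

# Option names are lowercased on read, so 'indexextention' finds 'indexExtention'.
index_extension = config['fileinfo']['indexextention']

# getboolean accepts yes/no, true/false, on/off and 1/0, in any letter case.
has_index = config['fileinfo'].getboolean('hasindex')

# Every value comes back as a plain string, so a list has to be split by hand.
# (Assumed format: comma-separated column names.)
raw_columns = config['sharepoint']['sharepointcustomecolumnlist']
columns = [name.strip() for name in raw_columns.split(',') if name.strip()]

# Keys in [DEFAULT] are visible from every other section.
fallback_user = config['sharepoint']['username']
```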

Then we need to instantiate a few classes so we can call them later in the code.

sourceFolder = Path(folderPath)
util = Util()
indexing = Indexing()
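
The next step depends on a non_empty_dirs value that is not defined in the snippets shown in this part. One way such a check could be written with pathlib is sketched below; the folder_has_files helper is hypothetical and is not taken from the project's Util class.

```python
from pathlib import Path
from tempfile import TemporaryDirectory


def folder_has_files(folder):
    """Return True if the folder exists and holds at least one regular file."""
    folder = Path(folder)
    return folder.is_dir() and any(p.is_file() for p in folder.iterdir())


# Demonstrate on a throwaway directory: empty first, then with one file.
with TemporaryDirectory() as tmp:
    empty_result = folder_has_files(tmp)
    (Path(tmp) / 'example.pdf').write_text('placeholder')
    filled_result = folder_has_files(tmp)

print(empty_result, filled_result)
```

In main.py the result of a check like folder_has_files(sourceFolder) could then be stored in non_empty_dirs.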

Now we check whether non_empty_dirs is “True”, meaning the source folder is not empty. If it is, we step into the code below. If it is “False”, we print “Looks like the folder is equal to your heart”.

if non_empty_dirs:
    # log in to the SharePoint site
    login = Auth(userName, password, siteURL, baseURL)
    # iterate over files in the source folder
    files = [p for p in sourceFolder.iterdir() if p.is_file()]

The code snippet above calls our ‘Auth’ class, passes it the username, password, site URL and base URL, and stores the result in the ‘login’ variable. Then we collect the files in the source folder and save them in the ‘files’ variable. Now we will check whether ‘hasIndex’ is true or false.

Adding metadata with an index file:

Below you can see the code we will execute if hasIndex is true.

    if hasIndex:
        for file in files:
            # Only process the index files in the folder
            match_file_name = file.match(f'*.{indexExtention}')
            if match_file_name:
                # Last element is the file name, the rest are metadata values
                indexRead = indexing.readIndex(file.name, folderPath)

                Upload(login.site, folderPath, indexRead[-1], sharePointFolder)

                customeMetaDict = util.createDict(sharePointCustomeColumnList, indexRead[0:-1])

                AddMeta(login.site, indexRead[-1], folderPath, sharePointFolder, customeMetaDict)

We first loop through the files. Then for each file that matches the indexExtention we call the readIndex function from the Indexing class and pass it the file name and folder path. This returns a list read from the CSV index file.
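
The extension test relies on Path.match, which compares a glob pattern against the end of the path. A short sketch with made-up file names shows which ones would pass; the csv extension is a placeholder for the indexExtention value from the config.

```python
from pathlib import Path

index_extension = 'csv'  # placeholder for the indexExtention config value
names = ['batch1.csv', 'scan001.pdf', 'notes.csv.bak']

# Keep only the names whose final component ends in .csv
index_files = [n for n in names if Path(n).match(f'*.{index_extension}')]
print(index_files)
```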

Next we call the Upload class and pass it login.site, the folder path, the file name from the indexRead list and the SharePoint folder.

Then we call the createDict function from the Util class and pass it the sharePointCustomeColumnList and the indexRead list sliced with [0:-1], which is every element except the last one (the file name).

Finally we call the AddMeta class and pass it login.site, indexRead[-1] (the file name), the folderPath, the sharePointFolder and the customeMetaDict.
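
The slicing is easier to follow with concrete values. The sketch below uses a hypothetical index row and a stand-in build_metadata function; it is not the project's Util.createDict, only an illustration of pairing column names with values by position.

```python
def build_metadata(columns, values):
    """Pair each column name with the value in the same position."""
    if len(columns) != len(values):
        raise ValueError('column and value counts differ')
    return dict(zip(columns, values))


# Hypothetical index row: metadata values first, file name last.
index_row = ['Contoso', '2021-03-15', 'invoice_001.pdf']
columns = ['CustomColumn1', 'CustomColumn2']

file_name = index_row[-1]                             # the last element
metadata = build_metadata(columns, index_row[0:-1])   # everything before it
print(file_name, metadata)
```

Checking that the two lengths agree is a design choice for the sketch: zip would otherwise silently drop the extra items.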

Adding metadata without an index file:

The process for adding metadata when we do not have an index file is basically the same, as you can see below. If hasIndex is set to No, we step into the else branch and execute this code.

    else:
        for file in files:
            # Only process files with the configured file extension
            match_file_name = file.match(f'*.{fileExtention}')
            if match_file_name:
                upload = Upload(login.site, folderPath, file.name, sharePointFolder)

                keywords = indexing.createKeywords(folderPath, file.name)

                customeMetaDict = util.createDict(sharePointCustomeColumnList, keywords)

                AddMeta(login.site, file.name, folderPath, sharePointFolder, customeMetaDict)

We first loop through the files. Then for each file that matches the fileExtention we call the Upload class, passing it login.site, folderPath, file.name and sharePointFolder.

Then we call the createKeywords function from the Indexing class and pass it the folderPath and the file.name.

Next, we call the createDict function from the Util class and pass it the sharePointCustomeColumnList and the keywords.

Finally we call the AddMeta class and pass it login.site, file.name, folderPath, sharePointFolder and the customeMetaDict.
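
The createKeywords function itself is not shown in this part, so the following is only a sketch of how metadata values might be derived from a file name. It assumes names made of underscore-separated keywords, such as Contoso_Invoice.pdf; both the naming convention and the keywords_from_name helper are hypothetical.

```python
from pathlib import Path


def keywords_from_name(file_name, separator='_'):
    """Split the file name, minus its extension, into keyword strings."""
    return Path(file_name).stem.split(separator)


keywords = keywords_from_name('Contoso_Invoice.pdf')
print(keywords)
```

The resulting list would need one entry per custom column for the pairing with the column list to line up.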

If the folder is empty, we step into the else statement below.

else:
    print('Looks like the folder is equal to your heart')

In summary, we have created 5 custom classes that we then use in our main file. We pass an argument when calling the main file to indicate which config to use. This allows us to use scheduling software to upload all kinds of files into our SharePoint system.

If you have any questions or suggestions for improving this application, please leave a comment below.

The full project can be found on my GitHub: https://github.com/zitheran/SharePyUploader

8 thoughts on “Uploading files to SharePoint using Python – Part 3”

  1. Great post, thank you very much for sharing. I have tried very hard to get this to work, but after finally getting the code to run (I had to change a few things as I’m running on Linux, so the backslashes needed to be forward slashes etc.), I am still not quite there.

    The file uploads to SharePoint quite happily, but the metadata doesn’t get updated. I have 3 custom metadata columns in SharePoint and they remain empty. I don’t get any errors when running the code either, which is odd.

    Can I confirm where I need to specify the custom metadata columns, what format they need to be in, and the format of the value, and where they need to be stored too?

  2. Boom. Got it. The date needs to be in yyyy-dd-mm format. I was trying dd-mm-yyyy. All working now!
    Great utility and much appreciated.

  3. Is it possible to update the metadata in a column, where that column is a Lookup column? We have a lookup column which allows us to make a selection from a linked List but we have several thousand documents we want to pre-load into the document library. At the moment it seems we can upload the files without a problem but we cannot set the metadata in the Lookup column. I’ve a horrible feeling it can’t be done?

    1. I’m also hitting a bug where, once I get 5,000 documents in my document library, I can still upload new documents but the script bombs when I try to update the metadata. It gives me an HTTP 500 with an Internal Server Error from url: …./_vti_bin/lists.asmx. It’s throwing the error in the module addmeta.py when it runs the line GetData = instanceOfDocumentLibrary.GetListItems(fields=fields, query=query)
