Testing a SharePoint Library with a Million Documents
Back in 2010, Microsoft introduced the famous 5,000-item query threshold to protect site users from performance issues. Ten years later, it is still one of the most misunderstood SharePoint limitations.
Today, we are going to see what happens to a library that contains a million documents. Which features are going to work and which are not? Are we still going to be affected by the 5000 items query threshold?
SharePoint Online library will contain a million documents
Our experiment involves a SharePoint Online Team site containing a million Office documents. These are Word, Excel and PowerPoint files, each 10-50 KB in size. All of them were uploaded to the root folder of the document library, so no folders or document sets are involved and the library is completely flat.
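For a rough sense of scale, the figures above can be turned into a back-of-the-envelope storage estimate. The sketch below uses only the numbers stated in this article (one million files, 10-50 KB each); the constant and function names are illustrative, and decimal units (1 GB = 1,000,000 KB) are an assumption.

```python
DOCUMENT_COUNT = 1_000_000          # files in the test library
MIN_SIZE_KB, MAX_SIZE_KB = 10, 50   # per-file size range from the article
KB_PER_GB = 1_000_000               # assumption: decimal units


def total_size_gb(count: int, size_kb: float) -> float:
    """Total size in GB if every file had the given size in KB."""
    return count * size_kb / KB_PER_GB


if __name__ == "__main__":
    low = total_size_gb(DOCUMENT_COUNT, MIN_SIZE_KB)
    high = total_size_gb(DOCUMENT_COUNT, MAX_SIZE_KB)
    print(f"Estimated library size: {low:.0f}-{high:.0f} GB")
```

In other words, the library as a whole is somewhere in the range of 10 to 50 GB, even though each individual file is tiny.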
No indexed columns at the beginning of the experiment
Indexed columns help us work around the 5,000-item query threshold to some degree. To keep the experiment clean, no indexed columns were created in advance.
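To make the role of an index more concrete, here is a deliberately simplified toy model. It is not how SharePoint is actually implemented; the function `query_allowed`, its parameters and the constant name are illustrative assumptions. It captures only the rule of thumb that a query is rejected when it would have to scan more than 5,000 items, and that filtering on an indexed column lets the scan shrink to the matching items.

```python
LIST_VIEW_THRESHOLD = 5000  # the query threshold discussed in this article


def query_allowed(total_items: int, matching_items: int,
                  filter_column_indexed: bool) -> bool:
    """Toy model: would a filtered query stay within the threshold?

    Simplifying assumption: without an index every item in the library
    has to be scanned; with an index only the matching items are touched.
    """
    items_scanned = matching_items if filter_column_indexed else total_items
    return items_scanned <= LIST_VIEW_THRESHOLD


if __name__ == "__main__":
    # One million documents, with a filter that matches 1,200 of them.
    print(query_allowed(1_000_000, 1_200, filter_column_indexed=False))
    print(query_allowed(1_000_000, 1_200, filter_column_indexed=True))
```

Under this model, the same filter fails on an unindexed column and succeeds on an indexed one, and even an indexed filter fails when it matches more than 5,000 items.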
User account and permissions
None of the tests were conducted with a site collection administrator account, because such accounts have certain unfair advantages when working with large document libraries that have unique permissions. Instead, we used a user account with only the Contribute permission level.
Let's get down to testing
1. Open a document library. Modern view
2. Open a document library. Classic view
3. Sort by Name. Modern view
4. Sort by Name. Classic view
5. Sort by Modified Date. Modern view
Sorting by Modified Date produced an odd result at first, but it did not throw an error. This is because, behind the scenes, SharePoint had automatically created an index on the column for us. Within a minute, all one million documents were indexed. After that, we could sort by Modified Date all day long and it worked perfectly fine in the Modern view. We could also do the same with other columns that support sorting, with the same results.
6. Sort by Modified Date. Classic view
Sorting by an unindexed column produced the following error.
7. Group by "Modified By" column. Modern view
8. Group by "Modified By" column. Classic view
9. Show column Totals. Modern view
The column Totals incorrectly displayed 5000, whereas it should have displayed a million.
Classic views broke completely. The rest of the experiment was done using the Modern views only.
At this point, the Classic view stopped displaying any data whatsoever. Creating fresh views didn't make any difference either. Thus, the testing had to be continued with the Modern view only.
10. Filter by a column
Filtering by various columns didn't work very well. For example, when filtering by Document Type, only a single option was ever displayed.
When filtering by other columns, the filter options were limited to the items displayed on the screen at that moment. Creating column indexes didn't resolve this issue. As a result, dynamic filtering turned out to be of little use in the Modern views.
11. Break library permission inheritance
12. Break document permission inheritance
Success: we could break permission inheritance and share individual documents. Of course, we do not recommend breaking inheritance for more than 2,000-4,000 documents because it can cause issues that are not covered in this article.
13. Move files to a subfolder
It was possible to move 10 files to a new subfolder. Overall, moving a small number of documents worked really well.
14. Sorting, filtering and grouping inside a folder
Inside the folder, with only 10 documents in it, we were able to perform filtering, sorting, and grouping. All of these features worked perfectly fine. This is why the best practice is to always spread your documents among multiple folders or even libraries.
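As an illustration of that best practice, the sketch below plans how a flat list of files could be spread across folders so that no folder holds more than 5,000 items. It is a planning helper only and does not call SharePoint; the `Batch-0001` naming scheme and the function `plan_folders` are hypothetical, and a real information architecture would more likely group documents by something meaningful such as year, project or department.

```python
from typing import Dict, Iterable, List

MAX_ITEMS_PER_FOLDER = 5000  # keep each folder at or below the query threshold


def plan_folders(file_names: Iterable[str],
                 max_per_folder: int = MAX_ITEMS_PER_FOLDER) -> Dict[str, List[str]]:
    """Assign files to sequentially numbered folders, filling each up to the limit."""
    if max_per_folder < 1:
        raise ValueError("max_per_folder must be at least 1")
    plan: Dict[str, List[str]] = {}
    for position, name in enumerate(file_names):
        # Files 0..4999 go to Batch-0001, 5000..9999 to Batch-0002, and so on.
        folder = f"Batch-{position // max_per_folder + 1:04d}"
        plan.setdefault(folder, []).append(name)
    return plan


if __name__ == "__main__":
    names = [f"doc{i}.docx" for i in range(12_001)]
    for folder, files in plan_folders(names).items():
        print(folder, len(files))
```

With this scheme, a million files would need 200 folders of 5,000 items each.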
15. Delete 10 documents. How long does it take?
10 documents were deleted within a couple of seconds. Deletion works quite fast.
16. Restoring 10 documents. How long does it take?
10 documents were restored from the recycle bin within a couple of seconds. Same as above, the restore function worked well. Of course, if you are restoring a large number of documents, it might be a different story.
17. Sync document library
It took 45 minutes to sync a million documents to my local PC, which works out to roughly 370 files per second. This is really fast, but only because none of the documents are physically downloaded by default, thanks to the Files On-Demand feature.
18. Search by document Name on user's local PC
19. Create a calculated column
20. Apply column formatting
Column formatting works once it is applied. However, the column formatting wizard was not able to "see" all possible column values. This is not a big issue, in my opinion.
21. Lookup column that points to a document library with over a million documents
When we created a lookup column and pointed it to the large document library with a million documents, this is what we got:
So, unfortunately, even the Modern views cannot handle lookups with a large data source.
22. Customize the form with Power Apps
23. Export to Excel
Exporting to Excel took a while, and then all the attempts ended with the "out of memory" error.
Having 64 GB of RAM did not help. We gave up after four attempts.
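A common general technique for avoiding this kind of memory exhaustion is to read the data in pages and write each page to disk before fetching the next one. The sketch below shows that idea in Python; it was not part of our tests and it says nothing about how the built-in Export to Excel feature works. The `fetch_page` callable is hypothetical and stands in for whatever paged API is available: it takes a continuation token (`None` for the first page) and returns the rows of one page plus the token for the next page.

```python
import csv
from typing import Callable, Dict, List, Optional, TextIO, Tuple

Row = Dict[str, str]
# Hypothetical page fetcher: (continuation token) -> (rows, next token or None).
FetchPage = Callable[[Optional[str]], Tuple[List[Row], Optional[str]]]


def export_to_csv(fetch_page: FetchPage, columns: List[str], out: TextIO) -> int:
    """Write all rows to `out` one page at a time; return the number of rows written."""
    writer = csv.DictWriter(out, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    written = 0
    token: Optional[str] = None
    while True:
        rows, token = fetch_page(token)
        writer.writerows(rows)      # only the current page is held in memory
        written += len(rows)
        if token is None:           # no further pages
            break
    return written
```

Because only one page of rows is in memory at any time, memory use depends on the page size rather than on the total number of documents.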
24. Bulk edit document metadata for 10 documents
Bulk edit for just 10 documents worked with no issues. Doing the same for a million documents would not work.
Summary
The tests were performed for scientific reasons, out of curiosity. Remember that it is not advisable to upload a million documents to a single library in real-world scenarios. Instead, documents should be stored across multiple sites, document libraries and folders. When planning your information architecture, avoid placing a large number of documents on the same level.
The test results showed that the Modern views work much better than the Classic ones when it comes to a large number of documents. However, a SharePoint library is not a database and should not be treated as such.