Importing MS Excel files with 500.000 rows using serverless Excel imports on Mendix with AWS
Solving the biggest issue when importing large excel files in Mendix applications with the help of Amazon Web Services (AWS)
After trying to import a Microsoft Excel file with 500K rows in my Mendix application, I ran into issues with my application’s memory. I didn’t want to upgrade the application memory just for my Excel import. Instead, I created a serverless pay-per-use solution where using Amazon S3, AWS Lambda and Amazon SQS, I transformed the Excel sheet into JSON files for Mendix to import and only pay 1.20 USD per month.

Low-code Serverless Excel Imports using AWS with Mendix
In the following article, you’ll learn more on how I tackled a problem you most likely ran into as well. Importing an Excel file in Mendix using the Excel Importer module and reaching the limits of your Mendix Cloud node or local runtime. With some knowledge of Amazon Web Services (AWS) and serverless architecture, I tried to solve this problem by creating a serverless application model on AWS using the SAM toolkit and created an AWS Lambda function to cut up my 500K rows Excel file in 10K rows JSON documents that were much easier for my Mendix app to process. This article will tell you what this serverless application model looks like, what the Lambda function does and how you can connect this all to your Mendix app using the AWS connectors from the Mendix Marketplace.
To me this is what the future should look like. All the heavy lifting that you want to scale, as fast as possible, hosted in a serverless cloud environment. This way you can keep your application performing and save unnecessary costs on hosting at the same time. Who knows a fully serverless Mendix can become the future?
TL;DR
After trying to import an Excel file with 500K rows I ran into issues with my application’s memory. Scaling this is a costly action, and I don’t want to get more memory just for my Excel import. Instead, I created a serverless pay-per-use solution where using Amazon S3, AWS Lambda and Amazon SQS, I transformed the Excel sheet into JSON files for Mendix to import and only pay 1.20 Dollar per month.
Want to play around with this yourself? Find all the resources in the following github page: https://github.com/brfreek/excel-to-json
Table of contents
- Excelling at Excel
- Some background
- Setting up the serverless infrastructure
- Writing the Lambda function
- Connecting the dots
- Finally doing some work in Mendix
- The results
Excelling at Excel
Did you ever receive an Excel file you needed to import in your Mendix app and at the start already thought: well, that’s not going to work? Hundreds of thousands of rows that you know will result in either waiting on that import for hours or making your Mendix app run out of memory? I had that same moment a couple of times and thought, this can be done in a more efficient way.
To put this to the test I downloaded an open source excel file with 500.000 fictive sales from a product company. The file in question is 50 MB (download it here) and loading this all into my Mendix app requires quite some memory. I tried to import it using the default Mendix Excel Importer module, but after just 5 minutes it ran out of memory.

The easy fix for this is to upgrade the RAM, which in Mendix Cloud terms means, buy more memory. However, I only want to import such a sheet once every while and don’t need the upgraded memory for the rest of my application, which would be a waste of money. What other way can we think of fixing this problem?
Some background
My previous job at Mendix involved a lot of working with AWS. As a matter of fact, I helped shape the roadmap of most AWS connectors you see in the Mendix Marketplace today. What I like about AWS is that they are leaders in serverless computing.
Serverless computing, I hear you think? Doesn’t AWS build their entire platform on physical servers? Well, yes that’s true. The concept of Serverless in this case is all about you running software in the cloud without the need of running this permanently on an ‘always on’ machine. Instead, you only pay for what you use, and it usually scales infinitely.
Drafting the idea
This made me think about how I could leverage that power of drastically reducing the resources I need to import this Excel file with 500k rows. I straight away thought about creating a Lambda function. AWS Lambda is a service that allows you to run a piece of code (in your preferred programming language) in the AWS Cloud and only pay for it when you use it. When you don’t use it, it is ‘turned off’ and it is not using any resources in the AWS Cloud. This means that you only pay for when you use such a function.
This is ideal for situations where you want to execute code that needs to scale, and you don’t want to bother your own Mendix application with. The best part is, AWS provides a Lambda free tier that provides 1 million requests for free and 3.2 million seconds of compute time per month which is more than plenty for this use case.
But what should it do? As this is a PoC I didn’t want to spend days learning different frameworks to manipulate Excel files. Instead, I wanted to opt for something simpler and more ‘up my alley’. I have experience with NodeJS and found the XLSX node module from SheetJS (https://sheetjs.com/). With just a few lines of code it transforms an Excel file to a JavaScript object. So, what if I just get that Excel file into that Lambda function, transform it into one big JavaScript object and return it as JSON somehow to Mendix so I just need to use an import mapping to store it in my applications database?

Setting up the serverless infrastructure
I started off with creating all the resources. The ideal starting point for me was to use the AWS SAM (Serverless Application Model) command line tools. A tool from AWS that provides powerful features to create your infrastructure with code and at the same time deploy it directly from your machine. I only had to define my Lambda function and its triggers and off we went. It automatically generated the boilerplate Javascript code for me and with the ‘sam build’ and ‘sam deploy’ command my Lambda function was pushed to AWS.
To get the Excel file from Mendix into my, to be created, Lambda function, I need to upload it to probably the most known AWS Service: S3 (Simple Storage Service). The document storage system of choice for a large part of the internet. All documents are stored in so called “buckets”. Even your Mendix application is storing all its System.FileDocuments in S3, but this is on the Mendix Cloud and closed off for everyone else.
To get my Lambda function working I’ll have to create my own bucket in my own AWS account where my Lambda function is running. The bucket created for this case was called the ‘Excel Input Bucket’. My Mendix app could use this to store all incoming Excel files. Next up, as AWS strongly advises against using the same bucket for output, I had to create a second S3 bucket called ‘JSON Output Bucket’ so all my transformed JSON files are stored as well.
Let’s say the average Excel file is 50 MB and the resulting JSON file 150 MB, and I upload 50 Excel files a month and choosing not to delete them, the total cost for running this whole operation on S3 will be: 1.20 USD per month!

Writing the Lambda function
The next step is to write the Lambda function itself. The Lambda function will be triggered by Amazon S3. These configurable event notifications that can trigger different AWS services including AWS Lambda will save a lot of time creating my own event mechanism. In this case the Lambda function is triggered when a document is created in the bucket called ‘Excel Input Bucket’. The identifier of the document is passed in this ‘event’ to the Lambda function and with the AWS JavaScript SDK you can now fetch the document using the identifier (key).
const stream = response.Body;
const buffer = await stream2buffer(stream);
const workbook = XLSX.read(buffer);
const worksheet = workbook.Sheets[workbook.SheetNames[0]];
const raw_data = XLSX.utils.sheet_to_json(worksheet);
const totalIterations = raw_data.length / ROWS_PER_ITERATION;
for (let i = 0; i < totalIterations; i++) {
const startSlice = i === 0 ? 0 : i * ROWS_PER_ITERATION;
const tmpRows = raw_data.slice(startSlice, startSlice+ROWS_PER_ITERATION);
let data = JSON.stringify({'Sheet': tmpRows});
const fileName = `/${key}/${i}.json`;
const command = new PutObjectCommand({
Bucket: OUTPUT_BUCKET,
Key: fileName,
Body: data,
});
try {
const response = await s3.send(command);
sendSQSMessage(fileName)
} catch (err) {
console.error(err);
}
}
With just a couple of lines of code, the XLSX file is transformed into a JSON object. But since JSON requires much more storage compared to an XLSX file, it will probably require quite some effort to get it back to Mendix and load it all into memory (the problem I’m trying to tackle in the first place). Instead, the Lambda function will create a JSON file for every 10K rows and write this as an individual JSON file to the Excel Output Bucket. This way the Mendix app only needs to import and map 10K rows and grab the next 10K when it’s finished and freed up the memory.
Connecting the dots
Finally, I want to let my Mendix app know when a JSON file is created in the Excel Output Bucket, so I don’t have to poll the S3 bucket all the time. Meet Amazon SQS (Simple Queueing Service) where the Lambda function can leave a message with the identifier of the JSON file in its body and wait for another system to pick it up. The free tier provides up to one million requests per month so that’s not going to cost me anything either! Mendix will poll the SQS queue every minute to see if there are messages. When there’s a message it will get the S3 object key out of the message and fetch the JSON file from the S3 bucket.

Finally doing some work in Mendix
With all the resources provisioned on the AWS side and the Lambda function created, the last thing to do is to extend the Mendix application to upload the Excel file to S3, read an SQS queue for Messages and get the JSON file from S3 and map it to the Mendix database.
The AWS connectors in the Mendix Marketplace provide a real low-code way of integrating multiple AWS Services in your Mendix app without having to manually integrate the REST API’s or Java SDK. This saves a ton of time (I know I’m biased, practice what you preach right?). Using the Amazon S3 Connector the Mendix app uploads the Excel file to the Excel Input Bucket that automatically triggers the Lambda function. This will now start the process of generating the JSON files. As mentioned, for each JSON file a message in SQS is generated. So, using the ReceiveMessages action from the Amazon SQS Connector in a scheduled event, the Mendix app checks for new messages posted by the Lambda function.
Once it receives a message, I can then use the Amazon S3 Connector again to fetch the JSON file using the GetObject action in the same microflow to get the new JSON file and use the default import mapping functionality from Mendix to store it in my Mendix Database. Once processed I make sure to delete the File from the bucket using the DeleteObject action (So I can shave some cents off that 1.20 USD) and delete the message from the Amazon SQS Queue to prevent it from being picked up again.

The result
After deploying both the AWS Serverless Application Model to my AWS Account and the Mendix app to a sandbox, the total time to process the 500K rows from the Excel file all the way to a persistent model in my Mendix application took less than 10 minutes without the need of calling support to increase the memory on my Mendix Cloud nodes!
From a technical stand point this is a great way to experiment with serverless architecture and Mendix. But if we take a more serious look at this, it can be a solution to many problems if executed correctly. Why pay for a huge running server if you only need the power of that machine for just a fraction of it’s lifecycle, or bother your end users with lagging applications because someone is running that humongous Microflow again that is eating away all the applications memory.
Combining the power of low code with serverless computing provided by platforms like AWS will help grow the capabilities of what we can deliver to customers and our end users.
All fun and stuff, but I could use some help with this…
There’s only that many words one can use to write out this whole example, so I can understand that this might feel like a bit to much. But don’t worry, both the Mendix application and the AWS Serverless Application Model are publicly available through the Mendix Marketplace (link to MP) and Github (https://github.com/brfreek/excel-to-json).
If you want to learn more about how you can extend your Mendix application with the AWS connectors or want to learn more about how AWS can improve your application landscape feel free to reach out to Freek Brinkhuis at freek.brinkhuis@theorangeforce.com or +31 6 40711523.
If you are interested in a strategic or technical deep dive into these or any of the other topics discussed, feel free to reach out to our team at The Orange Force! With a global presence we employ many experienced industry veterans who specialize in the Mendix platform and its combined use with the AWS Cloud platform. They have led development and digitalization teams and projects, developed countless innovative applications and are committed to your success.
You can visit us at our Utrecht or Denver office, or reach out to Jeffrey Kwee at Jeffrey.Kwee@TheOrangeForce.com or +31 6 11973008.
About the Author

Freek Brinkhuis is a Principal Consultant and Architect at The Orange Force. He previously worked in different roles ranging from native iOS developer to Product Manager AWS at Mendix. He’s a technical all-rounder who loves to learn about new technologies and how he can implement these at his customers. In his free time he loves to play around with both software and hardware. His growing collection of vintage Apple desktops can be seen in his online meeting background when he’s working from home!