I was working with someone from my local user group recently. He was dealing with a slow data import process. The Use Case called for any user to be able to import data from an excel file into the custom app and then work with that data. And it was slow. We took a look at it and talked through it together. A few hours later, the import was smooth and quick, and the data was available to all users shortly after the import finished. Let’s take a look at best practices for a FileMaker data import.
In this post, we’ll talk about the target table: the table that will receive the data from the FileMaker Data Import. In a future post, we’ll talk about the best practices of the actual import process: how to script it so any user can perform an import.
The Target Table
The first question we should ask: into which table will the data be imported? It seems obvious: the table which holds the data. If I’m adding new students to a student information system, it seems I should directly import them into the Students table.
But that’s not always the case. There are a few reasons why we want to, at first, steer clear of the table that holds the data:
- A table with many calculated fields will slow the import to a crawl. Even though the calc fields are not shown, the import forces the calcs to run.
- The data might be wrong. It is possible to forget some columns or rearrange the columns during an import.
- The data might need additional processing. Calculated fields shouldn’t process the data, but sometimes processing needs to happen.
So instead of the actual data table (importing students into the student table), FileMaker (and other system) developers will direct the FileMaker data import to happen in a temporary table.
A Temporary Table
The temporary table idea is worth exploring, and we’ll do a deep dive on it here.
A temp table has the following characteristics:
- It is a table in your custom app or in another file.
- The fields in this table are static fields: number, date, text fields. There are no calculated fields.
- The fields in this table match the columns in the import source.
- The scripting directs the import source to import the data here.
Since tables and fields are cheap, you can have one table per use case. If you have to import students and test scores, without issue, you can have one temp table for each.
Temporary tables simply accept the data and thus have great advantages.
Importing into a table with no calculated fields is quick. The data simply gets added to the table.
Once the data is imported into a temporary table, it can be validated before it gets placed into the real table. And this is a good reason to use a temp table. We don’t want data to be imported that is incorrect.
A school administrator needs to import test scores into his student information system. These scores, of course, are scores of his current students using the students’ district IDs. Somehow the import gets messed up. The administrator accidentally disturbs the order of the import: instead of the “StudentID” column being imported into the matching field, this column gets imported into the scores field. A validation process could go through the temp table’s data to ensure that all studentIDs in that field match an existing student. If not, the record shouldn’t be placed into the “Scores” table.
The data might be wrong. There might be missing fields or missing data, there might be incorrect data in the columns. A whole host of things could go wrong. Since we, as the developers of the system, may not control the source of the data, we have to make sure the data is error-free
In either case, we want to script the process to handle the validation of each column and each row. Our scripted process could then generate a report to show the importer which records were rejected and why.
Validated data could further undergo some processing as it is being moved to the correct tables. Here are some possible processing:
- Duplicates are removed.
- Phone numbers, email addresses could be added to a Contacts table.
- Scripted calculations could be made and the result placed into a static field.
- Add new records
The amount of processing that can happen is unlimited.
It is possible for two users to want to import data into the same table at the same time. Maybe they’re importing the same data, maybe one is importing Test 1 scores, and the other is importing Test 2 scores.
In either case, there’s a strong case to be made for using a temporary table. Both users can import the data; the data doesn’t interfere with each other and can be processed in turn.
Often, users are importing their data in a session. As part of the process, a session record is created, and the user’s import is tied to that session. Sessions are good for reporting out validation errors. My script can record that Session 32 had six rejected records.
FileMaker Data Import: But I want it now!
From a user’s perspective, a FileMaker data import, in the manner described, seems like a lot. First the data is imported into a temp table. Then it is validated to some degree. Finally, it is processed into the correct tables.
Yep. The above descriptions do take some time. Users may balk at that. But let’s consider the alternatives:
- The data is imported into the actual tables where resides many calculated fields. The user sees the “Importing” dialog for many minutes. She can do no other work in FileMaker.
- The data is imported incorrectly into the actual tables. The test score column ends up in the studentID column.
- The data is imported into the actual tables, but the data source contains extra columns that need to be placed into related tables.
Patience is a (common) virtue
When I was a teacher I also worked as the database administrator (two full-time jobs). Every year I’d have to set up our external services with student data. Those online services, from big testing and student-organization companies, would ask me to upload the file to a site and then WAIT for up to 24 hours until the data was processed. Inevitably I’d get a report back saying some records (or an entire import session) was rejected with clear reasons why it was rejected.
I think it is reasonable to ask users to wait a few minutes or hours for their imported data to show up in the custom app. I think they’d rather have the data validated and in the correct tables and fields than see wrong data instantly.
FileMaker to the rescue
So we’re not out of luck. Users get to see their correct data sometime after their import.
The target table is an important consideration when designing an import process. It seems to me a temporary table is the right choice for all user-driven imports.
In the next post, we’ll take a look at other user-specific needs for an import process. Stay tuned.