Lean Construction Ireland Annual Book of Cases 2021

number of relevant data points each document has, the time taken to process a single document can range from 20 to 80 seconds, with longer durations if the form is a hard copy that needs to be scanned and processed. The inputting process required a large amount of time from skilled individuals who could have been assigned to more productive work, with approximately 800 documents per month being processed on a single sample site. In addition, this administration-heavy task would often result in a backlog, meaning that the information was not as timely as it should have been. Such processes are undertaken across the majority of our projects, and it was therefore apparent that providing a leaner, automated process would remove a significant amount of tedious, unrewarding work and free up thousands of hours of staff time across the business.

Improved Methodology – Automated Processing

An improved methodology was developed using a suite of data processing tools available through our Microsoft licence. The core of this was the Microsoft AI Builder tool, in which a machine learning model is trained to understand where on a form the information is, and an in-built “Optical Character Recognition” (OCR) tool then detects the content of the fields and extracts the data.

Figure 1: A screenshot from the AI Builder tool, showing a sample form on the left-hand side and the designated fields on the right-hand side

While the theory sounds complicated, the process of using the tool is extremely straightforward. The first step is to identify and list the required fields to be extracted from the form.
In the case of a concrete cube certificate, this would be items such as “date”, “strength”, “location”, etc. Once this list has been produced, a sample set of at least 5 PDF forms is uploaded into the AI Builder, and the user manually identifies where on the form the various fields can be found. The principle of machine learning is that a model is trained to identify the information, so the more variability there is in the form design, the more forms are required to train the model. In the case of concrete cube tests on a standard template, the model can achieve a high level of accuracy with relatively few forms.

Once the model has been built, a Microsoft Flow automation is developed, which facilitates an end-to-end process connecting the various tools of the Microsoft suite as follows:

1. Microsoft Outlook: An email arrives in a designated inbox. The tool recognises the email address as being from the testing company and recognises the attached cube certificate.
2. Microsoft OneDrive: The attached PDF is automatically transferred to a designated folder on OneDrive, where the system recognises that a new file has been uploaded.
3. Microsoft AI Builder: The trained machine learning tool ‘reads’ the content of the file and automatically extracts the data fields.
4. Microsoft Excel: The extracted data is stored in an Excel tracker as a new line item, with a column for each of the identified fields.
5. Microsoft Outlook (future functionality): Once the data is in the tracker, it should be possible to set up an automated email alert to the project engineers if a cube test is identified as having failed. This is currently being explored by the Sisk team.

Figure 2: Overview of the data extract process

While the initial setup for this process may take up to an hour, once the tool is set up it is fully automated and requires no user input, except to monitor the results and action any nonconformances associated with failed tests.
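Steps 4 and 5 of the flow above can be illustrated with a short Python sketch. This is not the Microsoft Flow automation itself, only a minimal model of its logic under stated assumptions: the field names follow the “date”, “strength” and “location” examples given earlier, while `required_strength` is a hypothetical field, since the text does not say how a failed test is identified. All function names and sample values are likewise illustrative.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date


@dataclass
class CubeResult:
    """One extracted certificate, mirroring one row of the tracker."""
    test_date: date
    location: str
    strength: float           # measured strength read from the certificate
    required_strength: float  # hypothetical field, not named in the text


def append_to_tracker(tracker: list[CubeResult], result: CubeResult) -> None:
    """Step 4: store the extracted data as a new line item."""
    tracker.append(result)


def failed_tests(tracker: list[CubeResult]) -> list[CubeResult]:
    """Step 5: pick out rows whose measured strength is below the requirement."""
    return [r for r in tracker if r.strength < r.required_strength]


def alert_text(result: CubeResult) -> str:
    """Build the body of an alert email for the project engineers."""
    return (
        f"Cube test failed at {result.location} "
        f"on {result.test_date.isoformat()}: "
        f"{result.strength:g} measured against "
        f"{result.required_strength:g} required."
    )


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the pilot project.
    tracker: list[CubeResult] = []
    append_to_tracker(tracker, CubeResult(date(2021, 3, 1), "Level 2 slab", 42.5, 40.0))
    append_to_tracker(tracker, CubeResult(date(2021, 3, 2), "Core wall", 36.0, 40.0))
    for failed in failed_tests(tracker):
        print(alert_text(failed))
```

Keeping the failure check separate from the alert wording means the pass criterion can be changed without touching the message that reaches the engineers.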
A further strength of combining machine learning with optical character recognition is that, once the model has been trained on a sufficient variety of certificate formats, it should be able to recognise and extract data from any certificate, as long as similar data fields are present. Sisk have not fully explored this, having only targeted forms on a single sample project. However, this is an area of proposed future functionality.

Lean Initiative Improvements & Impact

The tool developed at the pilot project has been a great success and has proven a concept that we will be looking to roll out further. The tool frees up time for junior engineers to be away from their computers and out on site, where they are best positioned to learn, develop and grow. The Quality Department have also been keen to see this pilot expanded to more sites, as in the past the data in this area has often been out of date and at times unreliable; the tool alleviates both issues.

There are a few potential risks that should be identified and managed. The most important of these is that the automation and lack of human input could lead to the data being ignored, whereas the manual approach at least sees an engineer interacting directly with the data. In addition, if the automation process were to fail (for example, if there were a significant change in format that the machine learning model could not work with), this could result in incorrect data being extracted. As a result, it is important to embed an auditing process in which the results are reviewed and analysed at regular intervals.

Case 11