Deep Learning Turns Professor’s Whiteboard Sketches into Working Code
AI Helps Students Visualize Control Systems in Time and Frequency Domains
Some research projects emerge to answer a theoretical question. Some arise to solve a researcher’s own problems. The latter was the case for Tufan Kumbasar, the director of the Artificial Intelligence and Intelligent Systems Laboratory and a professor of control and automation engineering at Istanbul Technical University.
Control and automation engineers often use feedback control architectures (FCAs) to design closed-loop controls with the aim of managing or regulating the behavior of a system to achieve a desired result. The current state is fed back and compared with the desired state, and the system adjusts to maintain it. This type of control system is used to set cruise control in automobiles, regulate heating systems, and govern factories.
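As a rough illustration of that feedback idea, here is a minimal Python sketch. It assumes a hypothetical first-order plant (a crude stand-in for a car's speed dynamics) and a proportional-integral controller. The gains, time constant, and function name are illustrative assumptions and do not come from Kumbasar's course or paper.

```python
def simulate_closed_loop(setpoint=1.0, kp=2.0, ki=1.0, tau=1.0, dt=0.01, steps=2000):
    """Simulate a PI controller driving the plant tau * dy/dt = -y + u (assumed model)."""
    y = 0.0          # current state, e.g. vehicle speed
    integral = 0.0   # accumulated error for the integral term
    history = []
    for _ in range(steps):
        error = setpoint - y             # compare desired state with current state
        integral += error * dt
        u = kp * error + ki * integral   # controller output
        y += dt * (-y + u) / tau         # forward-Euler update of the plant
        history.append(y)
    return history

response = simulate_closed_loop()
print(f"output after {len(response)} steps: {response[-1]:.4f}")
```

Plotting `response` against time gives the kind of curve a lecturer would otherwise sketch by hand next to the block diagram.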
Typically, a professor teaching control theory draws an FCA on a whiteboard. To show and analyze the dynamic behavior of the system—how it behaves with various inputs—the professor draws a plot next to the block diagram. But that takes time, and the plot is redrawn for each change to the FCA. It also might be hard to interpret the drawings.
“The main challenge for the students was recognizing my handwriting and my awful drawings, which are not scaled at all,” Kumbasar says. “It was difficult for them to follow this kind of visualization.”
One alternative is to recreate the FCA in technical computing software such as MATLAB® and Simulink® and project a computed plot onto a screen. “MATLAB has good visualization capabilities,” Kumbasar says. “But coding during class requires extra time and effort.”
Kumbasar then had an idea to apply artificial intelligence (AI) to help the control systems class.
“The origins are a bit selfish,” he says. “At that time, I was the vice dean of faculty and was busy with lots of administrative tasks. Then I also had to teach. One day, I was totally exhausted and told my research students, ‘I’m really tired. Either I do it all on the computer or all on the whiteboard.’”
Afterward, he approached graduate students Dorukhan Erdem and Aykut Beke, who were actively working together on deep learning approaches. “I said, ‘Okay, Dorukhan, you’re working on novel structures together. So why don’t we make an application just for me in the beginning? I will be your guinea pig.’”
The application Kumbasar is referring to is a deep learning system that takes a picture of the whiteboard and automatically recreates the FCA in MATLAB. This idea was possible within the MathWorks framework, where control tools like Simulink live in the same environment as discipline-based toolboxes like Deep Learning Toolbox™ and Computer Vision Toolbox™.
From Whiteboard to Code with Deep Learning
Being the guinea pig meant drawing lots of FCAs. Erdem photographed and labeled the drawings so a computer could learn from them. Five department lecturers drew at least 10 block diagrams for each of the six types of FCAs, for a total of 306 images. The lecturers drew the sketches in different lighting conditions, making the challenge more realistic. Erdem and Beke then manually labeled all the blocks and symbols inside them. During training, the deep learning model would guess those labels and adjust its internal parameters when incorrect.
The resulting software pipeline, published in the journal IEEE Access, included five steps.
The first was to recognize which of the six architecture types the sketch best matched. Kumbasar’s team used a type of pretrained deep neural network called a ResNet-50, which uses 50 network layers and some long-distance connections that skip over intervening layers, improving performance.
A set of 306 images is not a lot to train a neural network, especially one this large, so the team employed two tricks. The first is a practice called transfer learning. They used the pretrained ResNet-50 network, which was trained on more than a million photographs of everyday objects, and replaced the final layer of the network, training only that layer on their own photos. The second trick is data augmentation, where the team multiplied their number of images by creating versions that were slightly rotated or scaled. After training, the first step of their pipeline was 89% accurate on classifying the handwritten feedback control architecture.
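The augmentation trick can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: images are small nested lists, sampling is nearest-neighbor, and the rotation and scale ranges (plus or minus 10 degrees, 0.9 to 1.1) are invented for the example. The team's actual settings are not given in the article, and the transfer learning step is not shown.

```python
import math
import random

def augment(image, angle_deg, scale):
    """Rotate and scale a 2D list image about its center (nearest-neighbor sampling)."""
    h, w = len(image), len(image[0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            # Map each output pixel back to the source pixel it came from.
            dy, dx = (r - cy) / scale, (c - cx) / scale
            src_r = round(cy + dy * cos_t - dx * sin_t)
            src_c = round(cx + dy * sin_t + dx * cos_t)
            if 0 <= src_r < h and 0 <= src_c < w:
                out[r][c] = image[src_r][src_c]
    return out

def make_variants(image, count, seed=0):
    """Create `count` slightly rotated and scaled copies (ranges are assumptions)."""
    rng = random.Random(seed)
    return [augment(image, rng.uniform(-10, 10), rng.uniform(0.9, 1.1))
            for _ in range(count)]

sketch = [[1 if r == c else 0 for c in range(9)] for r in range(9)]
variants = make_variants(sketch, count=5)
print(len(variants), "augmented copies")
```

Each variant keeps the label of the original sketch, which is how a few hundred photos can stand in for a much larger training set.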
The second step in the pipeline was to detect the blocks. The software converted the image to binary—white lines on a black background, no colors or greys. Then it removed written characters and noise. Next, it filled in the closed shapes and separated blocks from closed shapes created by feedback loops between blocks.
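One common way to fill closed shapes is to flood-fill the background from the image border and treat everything the fill cannot reach as ink or enclosed interior. The Python sketch below illustrates that general technique on a toy grayscale grid. It is an assumption about how such a step could work, not the team's MATLAB implementation, and the threshold value and function names are hypothetical.

```python
from collections import deque

def binarize(gray, threshold=128):
    """Dark marker on a light board becomes 1 (white line); the rest becomes 0."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]

def fill_closed_shapes(binary):
    """Return a copy in which the interiors of closed outlines are set to 1."""
    h, w = len(binary), len(binary[0])
    outside = [[False] * w for _ in range(h)]
    queue = deque()
    # Seed the flood fill with every background pixel on the image border.
    for r in range(h):
        for c in range(w):
            if (r in (0, h - 1) or c in (0, w - 1)) and binary[r][c] == 0:
                outside[r][c] = True
                queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < h and 0 <= nc < w and binary[nr][nc] == 0 and not outside[nr][nc]:
                outside[nr][nc] = True
                queue.append((nr, nc))
    # Pixels the fill never reached are either ink or enclosed by ink.
    return [[0 if outside[r][c] else 1 for c in range(w)] for r in range(h)]

# Toy 5x5 "photo": a hand-drawn square outline (dark = 0) on a light board (255).
gray = [[255] * 5 for _ in range(5)]
for i in range(1, 4):
    gray[1][i] = gray[3][i] = gray[i][1] = gray[i][3] = 0
binary = binarize(gray)
filled = fill_closed_shapes(binary)
print(filled[2][2])
```

Separating true blocks from the larger closed regions formed by feedback loops would need a further test, for example on the size or shape of each filled region, which is not shown here.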
Third, the pipeline recognized characters written inside the blocks, including numbers and arithmetic operators. It made the image binary, cropped around each character, and applied another neural network trained on 3920 images.
One difficulty was differentiating “5” from “s.” “My five and my s are pretty close,” Kumbasar says. “Dorukhan always complained about my handwriting. Control engineers, by using intuition, will say, ‘He wouldn’t write “55 + 1,” he would write “5s + 1.”’ But providing this intuition to the AI system is difficult.” Nevertheless, this stage was 96% accurate.
In the fourth step, the software combined the characters into MATLAB functions. If one character was above and to the right of another, the system called it an exponent and inserted a “^.” If a horizontal line had characters above and below it, the system saw it as division rather than subtraction.
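The two layout rules described above can be sketched in Python as follows. The input format (a symbol with the x and y coordinates of its center, with y growing downward as in image coordinates), the `raise_threshold` value, and the function names are assumptions made for illustration. The implied-multiplication rule is also an addition for the example, and multi-character exponents are not handled.

```python
def assemble_line(chars, raise_threshold=10):
    """Turn one row of recognized characters into a MATLAB-style expression.

    chars: list of (symbol, x_center, y_center); y grows downward.
    """
    expr = ""
    prev = None  # last character written on the baseline
    for sym, x, y in sorted(chars, key=lambda ch: ch[1]):  # read left to right
        raised = prev is not None and prev[2] - y > raise_threshold
        if raised and prev[0].isalnum() and sym.isalnum():
            expr += "^" + sym          # above and to the right: exponent
            continue
        if prev is not None and prev[0].isdigit() and sym.isalpha():
            expr += "*"                # implied product, e.g. 5s becomes 5*s
        expr += sym
        prev = (sym, x, y)
    return expr

def assemble_fraction(chars, bar_y, raise_threshold=10):
    """Treat a horizontal bar as division if characters sit both above and below it.

    Returns None otherwise, signaling that the bar is better read as a minus sign.
    """
    above = [ch for ch in chars if ch[2] < bar_y]
    below = [ch for ch in chars if ch[2] > bar_y]
    if not (above and below):
        return None
    return f"({assemble_line(above, raise_threshold)})/({assemble_line(below, raise_threshold)})"

chars = [("1", 60, 20),
         ("s", 10, 80), ("2", 22, 65), ("+", 40, 80), ("5", 60, 80),
         ("s", 72, 80), ("+", 90, 80), ("1", 110, 80)]
expression = assemble_fraction(chars, bar_y=50)
print(expression)
```

A rule like this has no way to know that "55 + 1" is unlikely in a transfer function, which is why a misread character upstream passes straight through to the generated expression.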
In the fifth step, the pipeline placed these functions into the right blocks and connected them into a complete FCA in MATLAB. It also created a Simulink diagram based on templates the team had created.
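Connecting blocks into a loop comes down to polynomial arithmetic on their transfer functions. As a hedged illustration, the Python sketch below closes a unity negative-feedback loop around a controller and a plant, using the standard relation T = CG / (1 + CG). The block values and function names are made up for the example and are not one of the team's six architectures or templates.

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (descending powers of s)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poly_add(a, b):
    """Add two polynomials, padding the shorter one with leading zeros."""
    n = max(len(a), len(b))
    a = [0.0] * (n - len(a)) + list(a)
    b = [0.0] * (n - len(b)) + list(b)
    return [x + y for x, y in zip(a, b)]

def unity_feedback(controller, plant):
    """Closed loop of controller and plant in series with unity negative feedback.

    Each block is a (numerator, denominator) pair of coefficient lists.
    """
    (cn, cd), (pn, pd) = controller, plant
    open_num = poly_mul(cn, pn)
    open_den = poly_mul(cd, pd)
    return open_num, poly_add(open_den, open_num)

controller = ([2, 1], [1, 0])   # (2s + 1) / s, an assumed PI controller
plant = ([1], [1, 1])           # 1 / (s + 1), an assumed first-order plant
num, den = unity_feedback(controller, plant)
print(num, den)
```

In MATLAB, the Control System Toolbox functions `tf` and `feedback` perform the same composition directly on transfer function objects.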
Erdem says MATLAB was a helpful solution for deep learning using neural networks like ResNet-50. Kumbasar adds that MATLAB makes integration between pipeline steps convenient. “Sending data arrays and images from one toolbox to another toolbox or reading it from a common workspace is easy in MATLAB,” he says.
Starting with Sketches
“I was amazed when the FCA software pipeline was working in real time,” Kumbasar says. “We tested the whole pipeline in different lighting conditions, with mixed handwriting, so we kind of forced everything to its limits.”
They posted a video on YouTube in which instructors draw FCAs on a whiteboard. Moments later, the same FCA appears in Simulink projected above the whiteboard, followed by a plot graphing its behavior.
The parts worked well on their own, Kumbasar says. “But once you connect these types of pipelines, if one part makes an error, another one is also affected. So, the error is always amplified from the input to the output.” Here, everything stayed on the rails. “And Dorukhan said it was a fun project that pushed him. He found all these kinds of crazy test images.”
Overall success is difficult to measure, and the team could not do a study of student experiences in time before the pandemic hit.
İlker Üstoğlu, one of the professors who provided whiteboard drawings, says he was impressed. If the software existed as a product, he would use it. “It accelerates teaching.”
Into the Theoretical Realm
No matter how good the software is, some classrooms are not prepared to use MATLAB to convert whiteboard FCAs. Erdem optimized the software to run in real time on a single graphics processing unit (GPU) by, for instance, limiting the ResNet to 50 layers. Even so, it requires a high-resolution camera and a laptop with a fast GPU, which many classrooms do not have.
In the meantime, the researchers are thinking of several improvements to make. Erdem would like the software to handle more than six architectures. The system also struggled when characters touched the surrounding box, so Kumbasar is adding confidence levels, enabling the system to indicate the certainty of its labels.
Integrating fuzzy logic is also a priority. “In fuzzy logic, you wouldn’t say everyone over a certain height is tall and everyone else is not,” Kumbasar explains. “People might be called somewhat tall, leading to more nuanced decisions down the line. It’s easy to build fuzzy layers in MATLAB that can be combined with neural nets or composed into complete fuzzy systems.”
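Kumbasar's height example maps onto a membership function, which returns a degree between 0 and 1 instead of a yes-or-no answer. Here is a minimal Python sketch, assuming a simple linear ramp. The 160 cm and 190 cm breakpoints and the function name are arbitrary choices for illustration.

```python
def tall_membership(height_cm, low=160.0, high=190.0):
    """Degree (0 to 1) to which a height counts as 'tall', using a linear ramp."""
    if height_cm <= low:
        return 0.0
    if height_cm >= high:
        return 1.0
    return (height_cm - low) / (high - low)

for height in (150, 175, 200):
    print(height, tall_membership(height))
```

A crisp rule would jump from 0 to 1 at a single cutoff. The ramp lets a 175 cm person count as "somewhat tall," and later decisions can weigh that degree.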
If the pipeline does become a product, it could have applications beyond the classroom. Professors might use it to translate students’ handwritten tests or homework into MATLAB code for grading. Students might use it to digitize their own lecture notes and play around with the FCAs in MATLAB. Kumbasar says he would like to do the same for his own research. Marco Rossi, an engineer at MathWorks, notes that many students and researchers prefer to analyze problems with a pen and paper and then put them into Simulink afterward to generate C code. This habit may result from how control theory is taught. But a tool like Kumbasar’s might encourage people to move from paper to screen earlier in the design process, so they can more easily play around with blocks.
Kumbasar mentions one more benefit to the project, besides solving his own problem and advancing theory. “It is also a nice motivation for the undergrad students,” he says. They see that engineering inspiration can come from the puzzles they face daily. “Either you are curious and doing science, or there is a need and you are doing science,” Kumbasar says. “In this situation, I had a need.”