Video length is 27:35

Automation of an Internal Model Based on MATLAB Core Calculations

Christian Dzierzon, Hannover Re

The internal model of an insurance company models stochastic cash flows to predict figures such as required solvency capital for a one-year time horizon. Having grown over the past twenty years from rather simple to more complex and sophisticated, Hannover Re’s internal model was based on numerous manual processes until 2018, when the company started an automation project. As MATLAB® usage had steadily increased over the years, it was only natural to build an automated version with MATLAB as the programming language. In this presentation, learn how Hannover Re managed to maintain flexibility, reduce run time and manual steps, avoid redundancies, and meet regulatory and compliance requirements.

Published: 22 Oct 2024

My name is Christian. I'm working at Hannover Re, in the Group Risk Management Department. And yeah, we are working on the internal model of Hannover Re. We will see in a minute what that means. And we ran through an automation process using MATLAB. Yeah, thank you for having me and for this opportunity to talk at the Finance Conference. Really a pleasure for me.

So let's see what's on the agenda for the next-- sorry, it doesn't work-- so what's on the agenda for the next 25 minutes. The first section will be just a brief introduction, "What is an Internal Model?", for those who might not be familiar with that. Then the second part, interesting for us, I think: why do we use MATLAB, and how do we use it in our particular case?

The third section will be about the architecture of our automation, and the fourth section about an example workflow in the internal model in the automated world, so to speak. The second-to-last part will be, briefly, about some security and audit aspects you have to consider. And finally, we will have some time for questions and, hopefully, answers from my side to your questions.

OK, so Section 1, what is an internal model? Just a very brief introduction. In insurance and reinsurance companies-- as you might know, Hannover Re is a reinsurance company, insuring other insurance companies-- internal models project cash flows, or, in general, the financial situation of the company, over a one-year time horizon.

And in doing so, we need just some basic ingredients. First, we need to define the risk landscape. So we ask ourselves, what risks do we want to cover in our internal model? And risk, in this context, can be something like mortality risk of people in an insured portfolio, natural catastrophe risk factors, economic risk factors as well, and, of course, counterparty defaults.

The second ingredient would be the calibration of those risks-- so all parameters we need to take into account to model those risks. And this also defines, sometimes, the granularity level for modeling the risks.

And then, yeah, the basic modeling of risk factors is done with Monte Carlo simulation. I guess you heard about that in the previous talk, so you might have gotten an impression of what Monte Carlo simulation means. Usually, we simulate 100,000 scenarios at Hannover Re, but other companies might use different scenario numbers-- 10K, or 1 million. So that depends.

Then the next step would be the aggregation of projected cash flows to present values at T equals 1. We call this T equals 1 because we model a one-year time horizon. Of course, today would then be T equals 0, and the end of the model projection would be T equals 1. So just to give it a standard name-- for example, Q2 2024 to Q2 2025 or something like that.

And finally, then, valuation. So we need to calculate value at risk as a risk measure, tail value at risk, solvency capital requirements, and so on. And for example, here in this picture-- can you see my mouse arrow, by the way, I hope?

Yes.

This is well shared? OK, that's great. So I have a pointer.

And so, for example, the solvency capital requirement in our case is the deviation of the value at risk from the mean value of some distribution-- say, for the present value of some cash flows. So this is the basic input or, yeah, modeling of the internal model. And why do we do that with internal models?
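To make that relation concrete, here is a minimal MATLAB sketch, assuming a simulated vector of present values and the quantile function from the Statistics and Machine Learning Toolbox; the variable names and the placeholder distribution are illustrative, not Hannover Re's actual code.

```matlab
% Minimal sketch: SCR as the deviation of the 99.5% value at risk
% from the mean of simulated present values (illustrative only).
nScen = 100000;                        % 100K scenarios, as mentioned in the talk
pv    = 1000 + 50 * randn(nScen, 1);   % placeholder present values at T = 1

var995 = quantile(pv, 0.005);          % worst 0.5% outcome, i.e. the 99.5% VaR point
scr    = mean(pv) - var995;            % SCR as deviation of VaR from the mean
```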

So one purpose is reporting in general, particularly Solvency II reporting for insurance and reinsurance companies. And also business case analysis, capital impact assessments-- everything the executive board is interested in, or other departments at Hannover Re. Then, of course, estimation of capital requirements. And yeah, that's basically why we do that.

So the workflow of an internal model might be depicted as follows, in just a few basic steps. We have heterogeneous data, yeah, from around the company, like calibration parameters and projected cash flows. We have volume measures like reserves, stuff like that. And we have values from the balance sheet. These are, then, values for the T equals 0 state of the model.

Then, in the next step, we would do the stochastic modeling based on that. So we model several hundred risk factors, numerous dependencies between those risk factors, and Monte Carlo simulation, as already said, with 100K scenarios to predict the T equals 1 point in time. And finally, we do a lot of aggregation and statistics. This also includes discounting of stochastic cash flows, then aggregation to different levels of interest-- say, per currency, per entity, per model, et cetera-- and several statistics.
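A hedged sketch of this discounting-and-aggregation step in MATLAB; all names, sizes, and the flat 2% curve are assumptions for illustration only.

```matlab
% Discount stochastic cash flows per module, aggregate to group level,
% then compute tail statistics per scenario (illustrative sketch).
nScen = 100000;  nYears = 100;  nMod = 5;
cf   = rand(nScen, nYears, nMod);           % stochastic cash flows per module
disc = (1 + 0.02) .^ -(1:nYears);           % discount factors per cash flow year
pvMod = squeeze(sum(cf .* disc, 2));        % nScen-by-nMod present values
pvGrp = sum(pvMod, 2);                      % aggregate to group level per scenario
tailStats = [mean(pvGrp), quantile(pvGrp, 0.005)];  % mean and 99.5% VaR point
```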

So now, the question: why do we use MATLAB for that? I think it is not that usual for an insurance company, as I've heard so many insurance companies use other tools like Python, R, stuff like that. So why do we use MATLAB?

We already started in 2008, and when we started using MATLAB, it was as a support. A support means in addition to Excel VBA, to ReMetrica, which is an insurance broker's tool we used at that time, and also to the Economic Scenario Generator from Moody's.

And the arguments from that time are still valid. Yeah, it's renowned software with worldwide customers in various industries-- the car industry, the plane industry. Everyone uses MATLAB, so it's very common.

And what is important, it has an easy-to-learn programming language, so we can introduce it to new employees, and it has an excellent editor for programming. And all this is also supported by seminars and webinars, the MATLAB help, and the documentation. And it is fast and stable.

I mean, I guess you know all of that, but these were some main arguments at the time. And also, there are several built-in features, so you don't need to program everything on your own. You can rely on those built-in features and on various toolboxes that extend those basic features.

And also-- I just noticed that it's not on the list-- at that time, we applied for certification of our internal model by the regulator, the German regulator BaFin. And we thought it would be easier to get the certificate with a tool like MATLAB than with some open-source language.

And, yeah, the years passed, and all those arguments grew more and more important. And we noticed that some features we wanted to implement in our internal model, we couldn't implement with ReMetrica, Excel VBA, or the like. And, of course, every time someone built a new part of the internal model, he or she was using MATLAB and not the other tools. And it was generally accepted and really appreciated to work with it.

So then, in 2018, our executive board said, OK, here's some money. You can do some automation in your internal model. Think about what you would like to do.

And so we started a five-year project with several goals. One of the goals was the complete migration of all our core calculations to MATLAB, and also the automation of all relevant processes. And this is quite important because, I mean, we had been using MATLAB for 10 years or more, and we had created several small tools and functions. And these had to be executed manually, in different orders.

And this meant we had to copy data from A to B and copy code from A to B, run the tool, and copy the output to another tool. So it was really a bit difficult to understand for new employees. And so we really wanted to have an automation built around that.

OK, so some more specific examples of why we use MATLAB. General data processing was really comfortable with MATLAB. So every time we do scenario aggregation, like adding up huge matrices, or cash flow discounting, reading and writing data, generating random numbers, drawing from distributions for risk factors, and handling large matrices, we did that with MATLAB. So this was really important for us.
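A sketch of two of the building blocks just mentioned, with assumed names: reproducible random number generation and drawing from a calibrated risk-factor distribution.

```matlab
% Fix the seed so simulation runs are reproducible, then draw one
% shock per scenario from an illustrative lognormal marginal.
rng(1, 'twister');                                       % reproducible runs
pd     = makedist('Lognormal', 'mu', 0, 'sigma', 0.25);  % placeholder calibration
shocks = random(pd, 100000, 1);                          % one draw per scenario
```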

That's still a bit unspecific. More specifically, we had to implement a dependency tree. And the dependency tree, as shown here in this illustrative picture-- you can think of a tree where the leaves are the risk factors for property and casualty business. And these are more than 700 risk factors. And between those risk factors, we want to establish dependencies via copula simulation.

That's why it is arranged in a tree, so you don't need to define 700 times 700 dependencies or something like that. Just at the nodes which are not leaves, you need to define dependencies. And this could be done with a self-developed MATLAB tool. And this was really, yeah, comfortable for us.
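A hedged sketch of a single non-leaf node of such a tree: a Gaussian copula couples three child risk factors. The correlation matrix, the marginals, and the tree wiring are illustrative assumptions, not the actual model.

```matlab
% Dependencies defined at one node of the tree: correlated uniforms
% from a Gaussian copula, mapped back through each child's marginal.
rho = [1.0 0.3 0.2;
       0.3 1.0 0.4;
       0.2 0.4 1.0];                      % pairwise dependencies at this node
u  = copularnd('Gaussian', rho, 100000);  % one column of uniforms per child factor
x1 = icdf('Lognormal', u(:,1), 0, 0.3);   % child 1: lognormal marginal
x2 = icdf('Gamma',     u(:,2), 2, 1);     % child 2: gamma marginal
```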

And also, in the past years, we switched to Gaussian process regression for life and health cash flows-- so this is GPR. And we do that because it is very, very time-consuming to generate life and health cash flows, because those cash flows have a 100-year projection-- so 100 cash flow years. And we wanted to have 100,000 of them.

And yeah, what we are actually doing is we just ask the departments to generate 300 supporting points, and then we do the GPR and predict the remaining 100K cash flows from that. So this is quite helpful, and it's also done with the built-in GPR function. I think it's in the Statistics Toolbox.
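A minimal sketch of this GPR step, assuming fitrgp from the Statistics and Machine Learning Toolbox; the inputs, sizes, and the scalar response (for example a cash flow present value) are illustrative.

```matlab
% Fit a Gaussian process model on 300 supporting points, then predict
% for all 100K scenarios (placeholder data, illustrative only).
Xtrain = rand(300, 5);                           % 300 supporting points from the departments
ytrain = sum(Xtrain, 2) + 0.01 * randn(300, 1);  % placeholder responses
gprMdl = fitrgp(Xtrain, ytrain);                 % fit the GPR model
Xall   = rand(100000, 5);                        % risk-factor scenarios for all 100K paths
yhat   = predict(gprMdl, Xall);                  % predicted values for the remaining scenarios
```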

And also, I think I already mentioned this several times: statistics and valuation-- so generating outputs and analysis for solvency reporting-- is really comfortable with MATLAB. So those are some precise reasons.

Just for your information about versions and licenses: currently, we're using R2023b, and we plan to update soon to R2024b. We have about 40 network named-user licenses, and also some concurrent licenses. And in terms of toolboxes, yeah, this varies between 1 and 40 licenses, depending on how important the toolbox is for us or how many users would like to use it. And this is, I think, the complete list of toolboxes we're using.

So, in alphabetical order. Important for us are the Compiler, the Statistics and Machine Learning, the Financial, and the Optimization Toolboxes, and also Parallel Computing and Database. And the remaining ones, I think, are very special cases, where one or two users wanted to have a toolbox for special applications.

Yeah, OK, so now let's have a look at the architecture of our automation. First, maybe, looking at the targets we wanted to achieve.

So please have a look at the puzzle pieces on the right-hand side. There are some main targets we collected. One target is, yeah, we wanted to have a single source of truth. That means, as I said in the beginning, we had several manual processes, copying things from A to B. And, of course, if the original version of a copy changes, it's impossible to adjust all the copies of it. So we had different versions of the same truth, and we wanted to get rid of that.

Also-- I'm going along the top from left to right-- we wanted to have automated processes and a simple user interface. We wanted to use MATLAB for core calculations, and also cloud computing, rather than running all the code files on our notebooks or somewhere on premise. And we wanted to revise processes and avoid redundancies.

And what was really important is the highlighted puzzle piece: we wanted to be flexible and independent of our IT department. That means if we wanted to change something in the internal model, this should be done fast and independently of, say, IT release processes and things like that. And that's why we wanted to rely on MATLAB for the core calculations, and on MATLAB code we could maintain in our non-IT department.

So, in addition to the main targets on the right-hand side, we also had to take into account the project scope and time constraints-- 4 to 5 years-- the cost, of course, and regulatory requirements. And here's a simple picture of how the architecture of this automation looks today. Maybe we begin at the top left with the box which is the graphical user interface.

And I will show you some screenshots in a minute. But for the moment, let's just think of a graphical user interface. The language behind it is Angular, and it is maintained by our IT department. So we have one or two colleagues there who write the code for the graphical user interface and do bug fixes, new implementations, changes, et cetera.

And then there is a process control software, which is called Camunda. Maybe you've heard of it. It's a business process control software, which you can think of as a robot that starts processes, copies files, creates folders, runs executable files, informs users, writes emails once a process has finished, and things like that. So this is, then, also maintained by our IT department.

Then, yeah, maybe this box here: we have the data part, and also the calculation part. I call this the server.

On the left-hand side, this is Amazon Web Services. So this is the cloud computing-- the place where the calculations are done and the data is stored, uploaded, and downloaded. And GDW stands for Global Data Warehouse, which is a placeholder for all databases that are on premise at Hannover Re. So this is where the data lives and the calculation happens.

And then, here on the right-hand side, we have the core calculations, or what we call core calculations. These are basically executable files created from MATLAB code. The process control software then just starts a process, calls the function, and executes it. And this function then reads and writes data, starts parallel computing in the cloud, et cetera.

Once this is finished, the MATLAB code then generates the reporting. This is still in Excel because-- sorry, I just need a bit of water-- this is still in Excel because this was quite important for many of our colleagues. In the end, you can download the Excel reports from the server, and you can edit them, add notes and comments, do some additional calculations, and copy data from them easily.
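A sketch of what such an Excel reporting step can look like in MATLAB, using writetable; the report contents and file names are purely illustrative.

```matlab
% Write a small summary table to an Excel report that colleagues can
% download, edit, and annotate (illustrative contents only).
T = table(["Group"; "P&C"; "L&H"], [1.23e9; 0.80e9; 0.60e9], ...
          'VariableNames', {'Level', 'SCR'});
writetable(T, 'IM_report.xlsx', 'Sheet', 'Summary');
```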

So at the moment, this is, yeah, the reporting tool, but we also have some reporting in MicroStrategy, which is a BI tool. But this is just a minor part.

And in addition to that, we have, of course, the MATLAB code that is run manually, so to say, on our notebooks. So we can just run MATLAB code directly, and also refer to data and reporting from the server. We still have this in place for additional analysis and some special applications.

OK, so this is just a rough overview to give you an impression of how many processes we have created since 2018. The dotted boxes are what we call modules. So the internal model is divided into several modules, like MKT, the market risk; counterparty default; life and health reinsurance; property and casualty; a general module, which covers FX rates and some basic interest rates; and some special modules. And, of course, this one we just call Group, because it's the Hannover Re group. This is the aggregation module collecting the output from all other modules, with some operational risk also in place.

And yeah, I just added some data flow arrows. We won't get into detail here because it's just to give you an impression. And that means, over the years, the complexity grew. So we nowadays have more than 30 automated processes, but at the same time, we could decrease our runtime-- partly by a factor of more than 120, or 1 over 50 in other parts.

And this is, of course, mainly due to parallel computing. But also, I mean, running in the cloud is faster than running on your notebook, of course. When we had manual processes, we ran them on our notebooks. Now we have more memory available in the cloud, et cetera. So this is really convenient.
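A sketch of the parallelization pattern behind such speedups: independent scenario blocks in a parfor loop on a larger pool. The pool size, block count, and heavyBlockCalc are hypothetical, not Hannover Re's actual setup.

```matlab
% Run independent scenario blocks in parallel on a cloud pool
% (hypothetical pool size and per-block function).
pool = parpool('Processes', 48);      % the cloud offers more workers than a notebook
results = zeros(100, 1);
parfor b = 1:100                      % blocks are independent, so they parallelize
    results(b) = heavyBlockCalc(b);   % hypothetical per-block core calculation
end
delete(pool);
```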

OK, and looking at the time, I still have some minutes left. Then let's have a quick look at the example workflow-- some basics about our IM application.

So this is just a screenshot of how the Internal Model application user interface looks. You can open it in a browser; this is a screenshot from the Chrome browser.

And here, I picked out the counterparty default module because it's quite simple. You see here the dependency structure: counterparty aggregation is the final process of the module, and all those other processes are predecessors that feed into this final process. And you also see successor processes from other modules.

Below, you have the part where the calculation runs are listed. You can start a run, upload files, and stuff like that. I won't get into detail because I think it would take 1 or 1.5 hours to give an introduction to all the features of this application; this is just to give you an impression of it.

And I think I have another one from the aggregation process-- so, just collecting all the data. And why am I showing you that? Because you might ask, where does MATLAB come into play here?

So if you want to start a new calculation run, you will get a Create Task in this user interface. That means, first, you need to give it a name-- I'll just call it Demo here for the screenshot. And you need to refer to other process runs that have already been created, because they feed data into your group process run.

And a calculation will be created automatically. So it's all, say, numbered by calculation IDs that are unique, so that you can refer uniquely to process outputs. And once you've selected your general setup-- in this case, for this group aggregation run-- you will get an active calculation run. It shows up here, where the completed calculation runs are listed; they're cut out now.

And this is in waiting status because you need to fulfill a setup task. And now, MATLAB comes into play. When opening the setup task, you need to select the core calculation script for this process.

This process has two sub-processes, P1 and P2, and I've just shown P1 here. For P1, you need to select a MATLAB script, which is actually the executable file that we've created. You need to select the parameter file, which is an Excel spreadsheet containing parameters that do not change too often. And you also need to select a JSON file, which will then create this interface. This is for parameters that change frequently, so you can change them on the fly when creating a run.
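A hedged sketch of how a core calculation might read those two parameter sources at startup; the file names and the JSON field are assumptions for illustration.

```matlab
% Read the frequently changing run parameters from JSON and the rarely
% changing parameters from Excel (hypothetical file and field names).
runParams  = jsondecode(fileread('run_params.json'));  % set on the fly per run
staticTbl  = readtable('static_params.xlsx');          % changes only occasionally
nScenarios = runParams.numScenarios;                   % assumed JSON field name
```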

But here, this is the part where the MATLAB script can be selected. And it can be selected according to its Git commit message. So we use Git source control not only for the code, but also for the executable files, to maintain them.

Here are a few more screenshots-- input parameters, Excel, and output. It's still important for us. But maybe let's skip this and get to the last section before the questions and answers, which is, then, a brief recommendation about security and audit aspects.

So what do I want to say? We have several, let's say, best practices in place when it comes to writing code. We try to follow the MATLAB style guide, and we do code reviews. We use Git source control. We apply pull requests when changing code. We apply unit testing, and so on.
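A sketch of the unit-testing practice mentioned here, using MATLAB's matlab.unittest framework; discountCashFlows is a hypothetical function, not one named in the talk.

```matlab
% A class-based test: with zero rates, discounting must return the
% plain sum of the cash flows (hypothetical function under test).
classdef DiscountingTest < matlab.unittest.TestCase
    methods (Test)
        function zeroRateLeavesSumUnchanged(testCase)
            cf = [100 100 100];
            pv = discountCashFlows(cf, zeros(1, 3));
            testCase.verifyEqual(pv, sum(cf), 'AbsTol', 1e-10);
        end
    end
end
```

Such a test would run with runtests('DiscountingTest'), for example as part of a pull request check.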

However, I mean, in the automated application, we are running executable files. And those are binary files, so you can't really see differences between them in Git source control. So how can you be sure that what's running there is actually what you see in the code?

And that's why we developed a compiler tool. This is a small tool, built with the MATLAB App Designer, I think. And here, you can just give some folders for the search path.

So we store common functions in a separate place, such that several processes can refer to those common functions and you don't need to copy code to different parts of the processes. You can, for example, add files to your compilation. You can include the parallel profile to use more than 24 workers, I guess-- for that, you need it-- and so on.

And now, why are we doing this? Because it creates an output. On the one hand, this is the executable file and some technical stuff that's not very interesting. It also creates log messages from the compilation, to see what happened there and which user did the compiling. And, what is really important, and the reason for creating it: it creates a snapshot of the code that has been used during compilation.

That means every time we create an executable file, we get a snapshot of the code that is the basis for it. And when we need to, yeah, say, debug or look into errors when running the code, we can refer to this snapshot. Because in the meantime, the code in source control might have changed. So if your application says there's an error in line 25, line 25 might be different because the code has changed. And it's then far easier to go directly to the snapshot and see what has been used for creating the executable file.
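A hedged sketch of what such a compiler tool might do under the hood: compile with mcc, archive the exact source as a snapshot, and log who compiled what. The paths, names, and log format are assumptions, not Hannover Re's actual tool.

```matlab
% Build the standalone executable, freeze the source behind it,
% and append a compile log entry (all names hypothetical).
src = 'core_calc.m';
mcc('-m', src, '-a', 'common/');                  % compile to a standalone executable
zip('core_calc_snapshot.zip', {src, 'common/'});  % snapshot of the code that was compiled
fid = fopen('compile_log.txt', 'a');
fprintf(fid, '%s | %s | compiled %s\n', datestr(now), getenv('USERNAME'), src);
fclose(fid);
```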

OK, so this was a bit quick, I'm afraid, but time is running out now. So thank you for your attention, and I'm happy to answer your questions.
