Industrial Problems

Since late 2021, I have been working full-time in Quantitative Finance at a major bank in the United States. Because I work mostly with confidential information, my industrial research is unlikely to be made public. Here is a list of the most recent industrial projects that I have worked on:


A problem by McKinsey: Predictive maintenance of naval heavy cutting machinery

7th Workshop on Mathematical Solutions for Industrial Problems - IMPA

McKinsey and Shape proposed a Predictive Maintenance challenge: modeling failure prediction in a factory of cutting equipment for the production of ship parts. The industrial blades wear out over time, at a rate that depends on the intensity of the cuts applied. Using the machine's operating data, collected by sensors, the goal is to predict when the next failure will occur. There are different types of failure, which are not explicitly categorized, and part of the challenge is to identify when each type occurs. Predicting failures allows maintenance to be scheduled in advance and keeps machine idle time to a minimum.
* I put together a couple of widgets for data exploration. Taking advantage of the ternary classification of cuts, I explored the lifespan of the blades per "cut composition": can we find a correlation between the quality of the cut product and the frequency of failure or the blade lifespan? You can find them in my GitHub repository: A-problem-by-McKinsey-Predictive-maintenance-of-naval-heavy-cutting-machinery.

Report: LINK

A problem by Rumo: Exports predictive models with Agribusiness and weather variables

7th Workshop on Mathematical Solutions for Industrial Problems - IMPA

The goal of the proposed project is to understand the producer's decision making and predict the pace of grain exports over the subsequent months. Key questions to determine the monthly export curve: 1) Knowing the export history of the regions of interest, can one predict the amount of grain that will leave the region within a number of months following the date of forecasting? For how many months does the prediction remain valid? 2) Do historical monthly export curves differ from year to year? Do corn and soy present the same behavior? 3) Does the behavior of the price curve of grain (soybean or corn) in the international market affect the export curve? 4) In addition to the international price, does the exchange rate also affect the behavior of the curve? 5) How does the weather affect how the producer chooses to export or store grain from month to month?

Report: LINK

A problem by Scotiabank: Monotonic Constrained Machine Learning Regression Methods in Credit Risk

Fields-CFI-CQAM Industrial Problem-Solving Workshop 2021

The objective of this study is to assess monotonic regression methods in credit risk modeling. The motivation is to enhance the internal credit risk rating system for small and medium-sized businesses. Qualitative constraints are a common requirement, and the hypothesis that predictions are monotonic in certain input features is one such strong belief adopted in the industry. Different classes of models are explored, following Scotiabank's guidelines for the project. Decision tree methods, regression-based models and neural network-based models are tested and compared. Keywords: Credit Risk, Decision Trees, Regression, Neural Nets, Python, R, TensorFlow Lattice
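To make the monotonicity constraint concrete, here is a minimal sketch of the simplest monotonic regression I know of: a least-squares non-decreasing fit computed with the pool-adjacent-violators algorithm. It is only an illustration and not one of the model classes compared in the study; the function name `isotonic_fit` and the bucketed default rates are made up for the example, and the inputs are assumed to be already sorted by increasing value of the risk feature.

```python
def isotonic_fit(y):
    """Least-squares non-decreasing fit to y (pool adjacent violators).

    Assumes y is ordered by increasing value of the input feature.
    """
    blocks = []  # each block stores [sum of values, number of values]
    for value in y:
        blocks.append([float(value), 1])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)  # every point gets its block mean
    return fitted


# Hypothetical observed default rates per bucket of a risk feature.
observed = [0.02, 0.05, 0.04, 0.09, 0.08, 0.15]
smoothed = isotonic_fit(observed)
print(smoothed)
```

The fitted values never decrease from one bucket to the next, which is the kind of qualitative behavior a rating system is expected to respect; the models in the study impose the same requirement on far more flexible function classes.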

A problem by StepWise: ZIP Code Versus Georeference in predictive modeling of credit-granting

VI Brazilian Study Group with Industry

When dealing with predictive modeling of credit-granting, different types of attributes are used: Cadastral, Behavioral, Business / Proposal, Credit Bureaux, in addition to Public, Private or Subsidiaries Sources. The Postal Address Code in Brazil (Código de Endereçamento Postal, CEP, in Portuguese), in particular, makes a unique contribution (it is generally uncorrelated with most other attributes) and has reasonably good predictive power. CEP is frequently used by truncating its numeric representation, for example by considering only its first d digits. The report prepared (see Work) proposes a preliminary methodology that builds clusters of CEPs from the information on clients' defaults over a period of time. Additionally, we tested the number of clusters obtained using the Information Value criterion. Promising solutions are obtained using statistical and optimization approaches. Other methodologies are suggested and could complement the principal methodology proposed.
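As a rough illustration of the truncation baseline and of the Information Value criterion, the sketch below groups clients by the first d digits of their CEP and computes the standard IV of that grouping, that is, the sum over groups of (share of non-defaulters minus share of defaulters) times the logarithm of their ratio. This is not the clustering methodology of the report: the function name, the additive smoothing used to avoid empty cells, and the sample CEPs and default flags are all assumptions made for the example.

```python
import math
from collections import defaultdict


def information_value(records, d, smoothing=0.5):
    """Information Value of the CEP truncated to its first d digits.

    records: iterable of (cep, defaulted) pairs, where cep is a digit string
    and defaulted is True when the client defaulted.
    smoothing: additive constant (an assumption of this sketch) that keeps
    the logarithm finite when a group has no defaulters or no non-defaulters.
    """
    counts = defaultdict(lambda: [0, 0])  # prefix -> [non-defaults, defaults]
    for cep, defaulted in records:
        counts[cep[:d]][1 if defaulted else 0] += 1

    total_good = sum(good for good, _ in counts.values())
    total_bad = sum(bad for _, bad in counts.values())
    n_groups = len(counts)

    iv = 0.0
    for good, bad in counts.values():
        p_good = (good + smoothing) / (total_good + smoothing * n_groups)
        p_bad = (bad + smoothing) / (total_bad + smoothing * n_groups)
        iv += (p_good - p_bad) * math.log(p_good / p_bad)
    return iv


# Made-up sample: (CEP, defaulted?) pairs, for illustration only.
SAMPLE = [
    ("01310100", False), ("01310200", False), ("04538133", False),
    ("01001000", True), ("80010000", True), ("80020000", True),
    ("88010000", True), ("80250000", False),
]

for d in (1, 2):
    print(d, round(information_value(SAMPLE, d), 4))
```

Comparing the IV across values of d, or across alternative groupings of CEPs, gives a simple way to judge whether a finer or a data-driven partition carries more information about default than plain truncation.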

Report: Mathematics in Industry Reports (2021), Cambridge Open Engage, doi:10.33774/miir-2021-4lgsp-v2