Current methods for end-to-end constructive neural combinatorial optimization usually train a policy using behavior cloning from expert solutions or policy gradient methods from reinforcement learning. While behavior cloning is straightforward, it requires expensive expert solutions, and policy gradient methods are often computationally demanding and complex to fine-tune. In this work, we bridge the two and simplify the training process by sampling multiple solutions for random instances using the current model in each epoch and then selecting the best solution as an expert trajectory for supervised imitation learning. To achieve progressively improving solutions with minimal sampling, we introduce a method that combines round-wise Stochastic Beam Search with an update strategy derived from a provable policy improvement. This strategy refines the policy between rounds by utilizing the advantage of the sampled sequences with almost no computational overhead. We evaluate our approach on the Traveling Salesman Problem and the Capacitated Vehicle Routing Problem. The models trained with our method achieve comparable performance and generalization to those trained with expert data. Additionally, we apply our method to the Job Shop Scheduling Problem using a transformer-based architecture and outperform existing state-of-the-art methods by a wide margin.
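The core loop described above (sample several solutions with the current model, keep the best one as the imitation target) can be sketched roughly as follows for a toy Traveling Salesman instance. This is only an illustration under stated assumptions: `sample_tour` is a hypothetical stand-in for the learned policy, plain independent sampling replaces the round-wise Stochastic Beam Search and policy update of the paper, and all names and parameters are invented for the example.

```python
import math
import random


def tour_length(points, tour):
    """Total length of the closed tour visiting `points` in the given order."""
    return sum(
        math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def sample_tour(points, rng, temperature=0.1):
    """Hypothetical stand-in for the learned policy: choose the next city with
    probability proportional to exp(-distance / temperature)."""
    tour = [0]
    unvisited = set(range(1, len(points)))
    while unvisited:
        current = points[tour[-1]]
        candidates = sorted(unvisited)
        weights = [
            math.exp(-math.dist(current, points[c]) / temperature)
            for c in candidates
        ]
        nxt = rng.choices(candidates, weights=weights, k=1)[0]
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour


def best_of_k(points, rng, k=16):
    """Sample k tours and return the shortest one plus all samples.
    The shortest tour would serve as the 'expert' trajectory for imitation."""
    samples = [sample_tour(points, rng) for _ in range(k)]
    best = min(samples, key=lambda t: tour_length(points, t))
    return best, samples


rng = random.Random(0)
points = [(rng.random(), rng.random()) for _ in range(8)]
best, samples = best_of_k(points, rng, k=16)
```

In a full training loop, the selected tours for many random instances would form the supervised dataset for the next epoch.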
Prof. Dr. Florian Haselbeck,
Maura John,
Yuqi Zhang,
Jonathan Pirnay,
Juan Pablo Fuenzalida-Werner,
Ruben Costa,
Prof. Dr. Dominik Grimm
Protein thermostability is important in many areas of biotechnology, including enzyme engineering and protein-hybrid optoelectronics. Ever-growing protein databases and information on stability at different temperatures allow the training of machine learning models to predict whether proteins are thermophilic. In silico predictions could reduce costs and accelerate the development process by guiding researchers to more promising candidates. Existing models for predicting protein thermophilicity rely mainly on features derived from physicochemical properties. Recently, modern protein language models that directly use sequence information have demonstrated superior performance in several tasks. In this study, we evaluate the usefulness of protein language model embeddings for thermophilicity prediction with ProLaTherm, a Protein Language model-based Thermophilicity predictor. ProLaTherm significantly outperforms all feature-, sequence- and literature-based comparison partners on multiple evaluation metrics. In terms of the Matthews correlation coefficient, ProLaTherm outperforms the second-best competitor by 18.1% in a nested cross-validation setup. Using proteins from species not overlapping with species from the training data, ProLaTherm outperforms all competitors by at least 9.7%. On these data, it misclassified only one nonthermophilic protein as thermophilic. Furthermore, it correctly identified 97.4% of all thermophilic proteins in our test set with an optimal growth temperature above 70°C.
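For readers unfamiliar with the metrics mentioned above, the following minimal sketch computes the Matthews correlation coefficient and the recall (share of correctly identified positives) from the four confusion-matrix counts of a binary classifier. The function names are chosen for this example, and any counts passed in are hypothetical and not results from the study.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.
    Returns 0.0 when the denominator vanishes (a common convention)."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


def recall(tp, fn):
    """Fraction of actual positives (e.g. thermophilic proteins) found."""
    return tp / (tp + fn) if (tp + fn) else 0.0
```

The coefficient ranges from -1 (every prediction wrong) through 0 (no better than chance) to 1 (every prediction right), which makes it more informative than accuracy on imbalanced classes.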
Quirin Göttl,
Jonathan Pirnay,
Prof. Dr. Dominik Grimm,
Prof. Dr.-Ing. Jakob Burger
The determination of liquid phase equilibria plays an important role in chemical process simulation. This work presents a generalization of an approach called the convex envelope method (CEM), which constructs all liquid phase equilibria over the whole composition space for a given system with an arbitrary number of components. For this purpose, the composition space is discretized and the convex envelope of the Gibbs energy graph is computed. Employing the tangent plane criterion, all liquid phase equilibria can be determined in a robust way. The generalized CEM is described within a mathematical framework and it is shown to work numerically with various examples of up to six components from the literature.
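The idea of discretizing the composition space and taking the convex envelope of the Gibbs energy graph can be illustrated for the simplest case of a binary mixture. The sketch below is not the generalized method of the paper: it assumes a one-parameter Margules model with a hypothetical interaction parameter `a`, works in dimensionless Gibbs energy of mixing, and uses a plain lower convex hull in two dimensions. Hull edges that skip grid points are read as tie lines between two coexisting liquid phases.

```python
import math


def gibbs_mixing(x, a):
    """Dimensionless Gibbs energy of mixing for a binary mixture:
    ideal part x ln x + (1-x) ln(1-x) plus the excess term a*x*(1-x)."""
    def xlogx(v):
        return v * math.log(v) if v > 0.0 else 0.0
    return xlogx(x) + xlogx(1.0 - x) + a * x * (1.0 - x)


def lower_convex_hull(points):
    """Lower convex hull of points sorted by x (monotone chain)."""
    hull = []
    for p in points:
        while len(hull) >= 2:
            (ax, ay), (bx, by) = hull[-2], hull[-1]
            cross = (bx - ax) * (p[1] - ay) - (by - ay) * (p[0] - ax)
            if cross > 0:  # strict left turn: hull[-1] stays on the hull
                break
            hull.pop()
        hull.append(p)
    return hull


def phase_splits(a, n=200):
    """Return (x_left, x_right) pairs for hull edges spanning more than one
    grid step, i.e. composition ranges where the mixture splits in two."""
    xs = [i / n for i in range(n + 1)]
    hull = lower_convex_hull([(x, gibbs_mixing(x, a)) for x in xs])
    step = 1.0 / n
    return [(p[0], q[0]) for p, q in zip(hull, hull[1:]) if q[0] - p[0] > 1.5 * step]


miscible = phase_splits(1.0)   # energy curve is convex for this parameter
gap = phase_splits(3.0)        # non-convex curve, a miscibility gap is expected
```

The accuracy of the detected phase compositions is limited by the grid spacing, which is why finer discretizations matter in practice.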
Scientific posters
Prof. Dr. Florian Haselbeck,
Maura John,
Yuqi Zhang,
Jonathan Pirnay,
Juan Pablo Fuenzalida-Werner,
Ruben Costa,
Prof. Dr. Dominik Grimm
Superior Protein Thermophilicity Prediction With Protein Language Model Embeddings (2024) Biological Materials Science - A workshop on biogenic, bioinspired, biomimetic and biohybrid materials for innovative optical, photonics and optoelectronics applications 2024.
Protein thermostability is an essential property for many biotechnological fields, such as enzyme engineering and protein-hybrid optoelectronics. In this context, machine learning-based in silico predictions have the potential to reduce costs and development time by identifying the most promising candidates for subsequent experiments. The development of such prediction models is enabled by ever-growing protein databases and information on protein stability at different temperatures. In this study, we leverage protein language model embeddings for thermophilicity prediction with ProLaTherm, a Protein Language model-based Thermophilicity predictor. We assess ProLaTherm against several feature-, sequence-, and literature-based comparison partners on a new benchmark dataset derived from a significant update of published data. ProLaTherm outperforms all comparison partners both in a nested cross-validation setup and on protein sequences from species not seen during training with respect to multiple evaluation metrics. In terms of the Matthews correlation coefficient, ProLaTherm surpasses the second-best competitor by 18.1% in the nested cross-validation setup. Using proteins from species that do not overlap with species from the training data, ProLaTherm outperforms all competitors by at least 9.7%. On this data, it misclassified only one non-thermophilic protein as thermophilic. Furthermore, it correctly identified 97.4% of all thermophilic proteins in our test set with an optimal growth temperature above 70°C.
Contributions to scientific conferences/meetings
Jonathan Pirnay,
Quirin Göttl,
Jakob Burger,
Prof. Dr. Dominik Grimm
AlphaZero-type algorithms may stop improving on single-player tasks if the value network guiding the tree search is unable to approximate the outcome of an episode sufficiently well. One technique to address this problem is transforming the single-player task through self-competition. The main idea is to compute a scalar baseline from the agent’s historical performances and to reshape an episode’s reward into a binary output, indicating whether the baseline has been exceeded or not. However, this baseline carries only limited information for the agent about how to improve its strategy. We leverage the idea of self-competition and directly incorporate a historical policy into the planning process instead of its scalar performance. Based on the recently introduced Gumbel AlphaZero (GAZ), we propose our algorithm GAZ ‘Play-to-Plan’ (GAZ PTP), in which the agent learns to find strong trajectories by planning against possible strategies of its past self. We show the effectiveness of our approach in two well-known combinatorial optimization problems, the Traveling Salesman Problem and the Job-Shop Scheduling Problem. With only half of the simulation budget for search, GAZ PTP consistently outperforms all selected single-player variants of GAZ.
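The scalar-baseline form of self-competition that the abstract starts from can be sketched as below. This illustrates only that earlier technique, not GAZ PTP itself, which plans against a historical policy instead. The class name, the exponential moving average, the `decay` parameter and the +1/-1 encoding of the binary outcome are all assumptions made for this example.

```python
class SelfCompetitionBaseline:
    """Reshapes episode returns into a binary win/loss signal against a
    running baseline of the agent's own past performance (assumed here to be
    an exponential moving average)."""

    def __init__(self, initial_baseline, decay=0.9):
        self.baseline = initial_baseline
        self.decay = decay

    def reshape(self, episode_return):
        """Return +1 if the episode beat the baseline, else -1, then update
        the baseline with the new episode return."""
        reward = 1 if episode_return > self.baseline else -1
        self.baseline = (
            self.decay * self.baseline + (1.0 - self.decay) * episode_return
        )
        return reward


tracker = SelfCompetitionBaseline(initial_baseline=10.0, decay=0.5)
first = tracker.reshape(12.0)    # beats the baseline of 10.0
second = tracker.reshape(10.5)   # compared against the updated baseline
```

For minimization problems such as tour length, the episode return would be the negated objective so that larger values remain better.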
The aim of this research project is to use modern reinforcement learning (RL) methods for the automated yet creative flowsheet synthesis of steady-state chemical processes.
…