Abdelrhman Eldallal

NLP & ML Engineer

My career objective is to fully utilize my technical, analytical, and problem-solving abilities and knowledge coupled with my communication and group work skills to empower people, grow personally, and build my professional career.

Location: Berlin, Germany
Website: https://abdelrhmaneldallal.com

LinkedIn: Abdelrhman Eldallal

Education

Sep 2019 – Jul 2021

Master of Science in Computer Science in Data Science from University of Tartu with a GPA of 4.43/5.0. Thesis title: BibRank: Automatic Keyphrase Extraction Platform Using Metadata

Aug 2011 – Jun 2016

Bachelor of Science (BS) in Computer Engineering from The American University in Cairo with a GPA of 3.54/4.0, graduated with honors. Thesis title: Sentameter: Arabic Sentiment Analysis

Feb 2014 – Jun 2014

Study Abroad in Computer Science from University of Rochester

Experience

May 2024 – Apr 2026

Machine Learning Engineer at Aleph Alpha

Established company-wide evaluation datasets (English and German) and benchmarks for LLMs and agent systems, standardizing quality assessment and enabling adoption across teams.
Implemented key features and improvements for MCP-based agent services and protocols, enhancing data accessibility, output quality, and system stability across multiple services, resulting in over 30% improvement in core end-to-end quality metrics.
Built and delivered a production-grade finetuning platform (PyTorch, Hugging Face, DeepSpeed) with automated evaluation and traceability, cutting manual engineering effort by 90%+ and speeding up LLM iteration.

Mar 2023 – Apr 2024

Machine Learning Engineer at Lengoo

Enhanced automated workflows for Machine Translation (MT) models, streamlining data preparation, training, evaluation, deployment, and monitoring, resulting in over 50% reduction in costs and deployment times.
Led the design and implementation of a scalable workflow for fine-tuning Large Language Models (LLMs), including data processing and advanced model evaluation techniques.
Accelerated fine-tuning initiatives using Causal Language Modeling (CLM) and Supervised Fine-Tuning methods, significantly improving the performance and accuracy of models on specific tasks. Achieved notable efficiency in fine-tuning efforts for large-scale models like Llama-2 70B through multiple optimizations.

Mar 2022 – Dec 2022

Machine Learning Engineer NLP at Oxolo

Developed and executed information extraction modules for processing unstructured data from thousands of web pages utilizing Large Language Models (LLMs).
Trained a BERT model to predict pauses in an automatically generated text as part of a text-to-speech pipeline and improved prediction performance by 15%.
Incorporated a CI/CD pipeline to deploy and test updates on the staging and production servers.

Mar 2021 – Jan 2022

Machine Learning Engineer at Brainbase

Designed, developed, and tested an information extraction model to process contracts (legal documents).
Utilized an ML pipeline to process more than 23,000 contract documents and extracted more than 1 million fields accurately, which reduced processing time from a few weeks to one day.
Fine-tuned a BERT model on a question answering task for legal documents using Hugging Face Transformers.
Parsed different types of documents using Amazon Textract and processed the structured output.
Performed data cleaning and normalization using a customized Python pipeline.
Led a data annotation task and coordinated with the implementation team to use a third-party annotation tool, resulting in a dataset of 350 documents with more than 2,500 fields.

Sep 2019 – Mar 2021

Research Assistant at Data Systems Group - University of Tartu

Conducted research related to combining semantic web and machine learning using Internet Memes.
Created a dataset for 20 Internet Memes with RDF representation to summarize the metadata and connected resources to the Memes.
Designed a model to convert Internet Memes into embeddings using NLP and Image processing techniques.
Created several experiments to collect and analyze data.
Contributed to multiple research projects in the field of machine learning and machine learning automation.

Jan 2019 – Aug 2019

Machine Learning, AI, and NLP Engineer at Self Employed

Developed AWS-based, semi-supervised, data-driven, high-confidence systems to support content creation on the Gabor Melli Research Knowledge Base (GM-RKB, a wiki-based online encyclopedia).
Outperformed the state-of-the-art scores for WikiText error correction and published the results at the LREC 2020 conference.
Designed a WikiText error correction system, a WikiText generation (autocomplete) system, and other NLP models.
Researched and solved NLP problems and tasks, such as sentiment analysis and co-reference resolution.

Dec 2017 – Jan 2019

Machine Learning Engineer at Smartly.AI

Improved a Question Answering model response time by 3x using a clustering-based solution.
Implemented a context-based spelling correction model in Python for English, French, and German.
Built different NLP modules such as sentiment analysis, entity extraction, and text cleaning using various tools and libraries such as spaCy, Rasa NLU, Duckling, and more.
Provided language support for multiple languages such as Japanese and different Arabic dialects.
Developed speech recognition models for English and French using recurrent neural network (LSTM) models.

Jul 2016 – Dec 2017

Machine Learning, AI, and NLP Engineer at Hyphen (Skylar Labs)

Designed, implemented, and tested Python libraries for creating and training chatbots.
Built a full system for question similarity evaluation using deep learning and tested it against the Quora dataset.
Implemented sentiment analysis, spell checking with auto-correction, keyword and named entity extraction, co-reference resolution, and question classification algorithms using different neural networks.

Jul 2016 – Aug 2017

Software Engineer Intern at Zlabs

Designed, developed, debugged, and maintained web applications.

Feb 2013 – Dec 2014

Teaching Assistant at American University in Cairo

Undergraduate Teaching Assistant in the Computer Science and Engineering Department.
Courses: Programming Fundamentals with C++, Data Structures and Algorithms, Concepts of Programming Languages.

Publications

Oct 2023

BibRank: Automatic Keyphrase Extraction Platform Using Metadata by MDPI: Information

This paper introduces a platform that integrates keyphrase datasets and facilitates the evaluation of keyphrase extraction algorithms. The platform includes BibRank, an automatic keyphrase extraction algorithm that leverages a rich dataset obtained by parsing bibliographic data in BibTeX format. BibRank combines innovative weighting techniques with positional, statistical, and word co-occurrence information to extract keyphrases from documents. The platform proves valuable for researchers and developers seeking to enhance their keyphrase extraction algorithms and advance the field of natural language processing.

May 2020

GM-RKB WikiText Error Correction Task and Baselines by European Language Resources Association

We introduce the GM-RKB WikiText Error Correction Task for the automatic detection and correction of typographical errors in WikiText annotated pages.

Jan 2021

The impact of Auto-Sklearn’s Learning Settings: Meta-learning, Ensembling, Time Budget, and Search Space Size by EDBT/ICDT Workshops 2021

We present a micro-level analysis of the AutoML process by empirically evaluating and analyzing the impact of several learning settings and parameters.

Certifications

Machine Learning Modeling Pipelines in Production from Coursera, 2021.11

Machine Learning Data Lifecycle in Production from Coursera, 2021.10

Introduction to Machine Learning in Production from Coursera, 2021.9

Languages

Arabic: Fluency: Native speaker
English: Fluency: Native speaker
German: Fluency: Elementary proficiency

Skills

Machine Learning

Expert

Deep Learning, Python, Clustering & Classification, Statistical Modeling

Natural Language Processing

Expert

Text and speech processing, Language Modeling, Deep Learning, Rule-Based NLP

Data Engineering

Advanced

SQL, Apache Airflow, Apache Spark

Infrastructure

Advanced

Container-based environments, Cloud service providers including AWS and GCP

Interests

Books

Literature
Technical
