Abdelrhman Eldallal

NLP & ML Engineer

My career objective is to apply my technical, analytical, and problem-solving skills, together with my communication and teamwork skills, to empower people, grow personally, and build my professional career.

Location
Berlin, Germany
Website
https://abdelrhmaneldallal.com
LinkedIn
Abdelrhman Eldallal

Education

Master of Science in Computer Science in Data Science from University of Tartu with GPA of 4.43/5.0. Thesis title: BibRank: Automatic Keyphrase Extraction Platform Using Metadata

Bachelor of Science (BS) in Computer Engineering from The American University in Cairo with GPA of 3.54/4.0, graduated with Honors. Thesis title: Sentameter: Arabic Sentiment Analysis

Study Abroad in Computer Science from University of Rochester

Experience

present

Machine Learning Engineer at Aleph Alpha

Machine Learning Engineer at Lengoo

Highlights

  • Enhanced automated workflows for Machine Translation (MT) models, streamlining data preparation, training, evaluation, deployment, and monitoring, resulting in a reduction of over 50% in costs and deployment times.
  • Led the design and implementation of a scalable workflow for fine-tuning Large Language Models (LLMs), including data processing and advanced model evaluation techniques.
  • Accelerated fine-tuning initiatives using Causal Language Modeling (CLM) and Supervised Fine-Tuning methods, significantly improving the performance and accuracy of models on specific tasks. Achieved notable efficiency in fine-tuning efforts for large-scale models like Llama-2 70B through multiple optimizations.

Machine Learning Engineer NLP at Oxolo

Highlights

  • Developed and executed information extraction modules for processing unstructured data from thousands of web pages utilizing Large Language Models (LLMs).
  • Trained a BERT model to predict pauses in an automatically generated text as part of a text-to-speech pipeline and improved prediction performance by 15%.
  • Incorporated a CI/CD pipeline to deploy and test updates on the staging and production servers.

Machine Learning Engineer at Brainbase

Highlights

  • Designed, developed, and tested an information extraction model to process contracts (legal documents).
  • Utilized an ML pipeline to process more than 23,000 contract documents and extracted more than 1 million fields accurately, which reduced processing time from a few weeks to one day.
  • Fine-tuned a BERT model on a question answering task for legal documents using Hugging Face Transformers.
  • Parsed different types of documents using Amazon Textract and processed the structured output.
  • Performed data cleaning and normalization using a customized Python pipeline.
  • Led a data annotation task and coordinated with the implementation team on the use of a third-party annotation tool, resulting in a dataset of 350 documents with more than 2,500 fields.

Research Assistant at Data Systems Group - University of Tartu

Highlights

  • Conducted research related to combining semantic web and machine learning using Internet Memes.
  • Created a dataset of 20 Internet Memes with an RDF representation summarizing the metadata and the resources connected to the Memes.
  • Designed a model to convert Internet Memes into embeddings using NLP and Image processing techniques.
  • Created several experiments to collect and analyze data.
  • Contributed to multiple research projects in the field of machine learning and machine learning automation.

Machine Learning, AI, and NLP Engineer at Self Employed

Highlights

  • Developed AWS-based, semi-supervised, data-driven, high-confidence systems to support content creation on Gabor Melli's Research Knowledge Base (GM-RKB, a wiki-based online encyclopedia).
  • Outperformed the state-of-the-art scores for WikiText error correction and published the results at the LREC 2020 conference.
  • Designed a WikiText error correction system, a WikiText generation (autocomplete) system, and other NLP models.
  • Researched and solved NLP problems and tasks, such as sentiment analysis and co-reference resolution.

Machine Learning Engineer at Smartly.AI

Highlights

  • Improved a question answering model's response time by 3x using a clustering-based solution.
  • Implemented a context-based spelling correction model in Python for English, French, and German.
  • Built different NLP modules such as sentiment analysis, entity extraction, and text cleaning using various tools and libraries such as spaCy, Rasa NLU, Duckling, and more.
  • Provided language support for multiple languages such as Japanese and different Arabic dialects.
  • Developed speech recognition models for English and French using recurrent neural network (LSTM) models.

Machine Learning, AI, and NLP Engineer at Hyphen (Skylar Labs)

Highlights

  • Designed, implemented, and tested Python libraries for creating and training chatbots.
  • Built a full system for question similarity evaluation using deep learning and tested it against the Quora dataset.
  • Implemented sentiment analysis, spell checking with autocorrection, keyword and named entity extraction, co-reference resolution, and question classification algorithms using different neural networks.

Software Engineer Intern at Zlabs

Highlights

  • Designed, developed, debugged, and maintained web applications.

Teaching Assistant at American University in Cairo

Highlights

  • Undergraduate Teaching Assistant in the Computer Science and Engineering Department.
  • Courses: Programming Fundamentals with C++, Data Structures and Algorithms, Concepts of Programming Languages.

Publications

BibRank: Automatic Keyphrase Extraction Platform Using Metadata by MDPI: Information

This paper introduces a platform that integrates keyphrase datasets and facilitates the evaluation of keyphrase extraction algorithms. The platform includes BibRank, an automatic keyphrase extraction algorithm that leverages a rich dataset obtained by parsing bibliographic data in BibTeX format. BibRank combines innovative weighting techniques with positional, statistical, and word co-occurrence information to extract keyphrases from documents. The platform proves valuable for researchers and developers seeking to enhance their keyphrase extraction algorithms and advance the field of natural language processing.

GM-RKB WikiText Error Correction Task and Baselines by European Language Resources Association

We introduce the GM-RKB WikiText Error Correction Task for the automatic detection and correction of typographical errors in WikiText annotated pages.

The impact of Auto-Sklearn’s Learning Settings: Meta-learning, Ensembling, Time Budget, and Search Space Size by EDBT/ICDT Workshops 2021

We present a micro-level analysis of the AutoML process by empirically evaluating and analyzing the impact of several learning settings and parameters.

Certifications

Machine Learning Modeling Pipelines in Production from Coursera, 2021.11

Machine Learning Data Lifecycle in Production from Coursera, 2021.10

Introduction to Machine Learning in Production from Coursera, 2021.09

Languages

Arabic
Fluency: Native speaker
English
Fluency: Native speaker
German
Fluency: Elementary proficiency

Skills

Machine Learning
Level: Expert
Keywords:
  • Deep Learning, Python, Clustering & Classification, Statistical Modeling
Natural Language Processing
Level: Expert
Keywords:
  • Text and speech processing, Language Modeling, Deep Learning, Rule-Based NLP
Data Engineering
Level: Advanced
Keywords:
  • SQL, Apache Airflow, Apache Spark
Infrastructure
Level: Advanced
Keywords:
  • Container-based environments, Cloud service providers including AWS and GCP

Interests

Books
Keywords:
  • Literature
  • Technical