PORTFOLIO 2026

MOHAMMAD
SALEM

AI + DEVELOPMENT

DATA
SCIENCE

01 — TECHNICAL ARSENAL

PYTHON · TENSORFLOW · XGBOOST · SKLEARN · PANDAS · STREAMLIT · TABLEAU · POSTGRESQL · SUPABASE · OPENAI · LLAMAINDEX · WHISPER · REACT NATIVE
MACHINE LEARNING · DEEP LEARNING · NLP · COMPUTER VISION · RAG · OCR · PREDICTIVE MODELING · TIME SERIES · STATISTICAL MODELING · GEOSPATIAL ANALYSIS · GENERATIVE AI · AI-ASSISTED DEVELOPMENT · PRODUCT DEVELOPMENT · DATA VISUALIZATION

03 — PROFESSIONAL EXPERIENCE

The Journey

Building AI pipelines to process pharmaceutical vendor files using OCR (PyMuPDF, Tesseract, PaddleOCR). Currently implementing text extraction via bounding boxes and date normalization logic.

Developing a RAG-based document retrieval system with LlamaIndex, optimized with chunk tuning and open-source LLMs (Mistral, Phi-2).

Conducting end-to-end evaluation of the document intelligence system—benchmarking OCR accuracy, RAG retrieval quality, and routing performance.

Remote · Externship | Generative AI, LLM, OCR, RAG, LlamaIndex, PaddleOCR
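The date normalization step mentioned above could look roughly like the sketch below. It is a minimal illustration using only the standard library, not the pipeline's actual logic: the `CANDIDATE_FORMATS` list and the `normalize_date` name are assumptions, and slash-separated dates are assumed to be month-first.

```python
from datetime import datetime
from typing import Optional

# Hypothetical formats a vendor file might use; the real set is not given in the source.
# Slash dates are assumed to be month/day/year.
CANDIDATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d %b %Y", "%B %d, %Y")


def normalize_date(raw: str) -> Optional[str]:
    """Return the date as ISO 8601 (YYYY-MM-DD), or None if no format matches."""
    text = " ".join(raw.split())  # collapse stray whitespace left by OCR
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue  # try the next candidate format
    return None
```

For example, `normalize_date("03/15/2025")` returns `"2025-03-15"`, while text that matches none of the formats returns `None` so it can be flagged for review rather than guessed.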

Developing a machine learning classifier to identify pre-seizure physiological patterns by analyzing large-scale EKG/ECG datasets from wearable monitors.

Engineering and training neural networks on multi-dimensional time-series data to detect subtle autonomic shifts that precede clinical seizure onset.

Optimizing model performance for real-time inference on smartwatch hardware and edge devices.

Montclair, NJ · Hybrid · Internship | Healthcare AI, Time Series Analysis, Deep Learning, Edge Computing
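A common first step when training on time-series data such as ECG traces is cutting the signal into fixed-length, overlapping windows. The sketch below shows that step in plain Python; the `sliding_windows` name and the `size` and `step` parameters are illustrative assumptions, not details of the project's preprocessing.

```python
from typing import List, Sequence


def sliding_windows(signal: Sequence[float], size: int, step: int) -> List[List[float]]:
    """Split a 1-D signal into fixed-length windows taken every `step` samples.

    Windows overlap when step < size. A trailing partial window is dropped.
    """
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    return [
        list(signal[start:start + size])
        for start in range(0, len(signal) - size + 1, step)
    ]
```

With `size=3` and `step=1`, the signal `[1, 2, 3, 4, 5]` yields `[[1, 2, 3], [2, 3, 4], [3, 4, 5]]`. A smaller `step` gives a model more training examples at the cost of more redundancy between them.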

Collaborated with the Chief of Obstetrics to translate clinical protocols into app features like BP triggers, observation timers, and treatment windows.

Co-developed core app logic for role-based notifications, enabling real-time coordination between nurses and residents.

Delivered a winning pitch and live demo to hospital stakeholders, focusing on reducing 'door-to-needle' time.

New Brunswick, NJ · Hackathon Win | Healthcare Innovation, Mobile Dev, Product Design, Clinical Workflow
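The treatment-window idea can be pictured as a deadline computed from the moment a trigger fires. The sketch below is only an illustration: the function names are hypothetical, and the window length is left as a parameter because the source does not state the clinical values the app used.

```python
from datetime import datetime, timedelta


def treatment_deadline(trigger_time: datetime, window_minutes: int) -> datetime:
    """Time by which treatment should start, given when the trigger fired."""
    return trigger_time + timedelta(minutes=window_minutes)


def is_overdue(trigger_time: datetime, window_minutes: int, now: datetime) -> bool:
    """True once `now` is past the end of the treatment window."""
    return now > treatment_deadline(trigger_time, window_minutes)
```

A notification layer could poll `is_overdue` for each open trigger and escalate to the next role when it returns `True`.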

Launched on the iOS App Store, handling all privacy compliance, monetization, and analytics infrastructure.

Architected a local-first content pipeline using Whisper for transcription and RAG for Q&A, allowing users to 'chat' with their audio.

Built a full-stack solution with React Native, TypeScript, and Supabase, implementing complex features like vector search and edge functions.

Directed the entire product lifecycle from user research to UI/UX design and technical specification.

Remote · Self-Start | React Native, TypeScript, Supabase, OpenAI, App Store Deployment
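The retrieval half of a RAG Q&A feature comes down to ranking stored text chunks by how similar their embeddings are to the question's embedding. The sketch below shows that ranking in plain Python as a stand-in for a database-backed vector search; the `cosine` and `top_k` names and the toy two-dimensional vectors are assumptions for illustration, not the app's implementation.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: Sequence[float],
          chunks: List[Tuple[str, Sequence[float]]],
          k: int = 3) -> List[str]:
    """Return the text of the k chunks whose embeddings are closest to the query."""
    ranked = sorted(chunks, key=lambda chunk: cosine(query, chunk[1]), reverse=True)
    return [text for text, _ in ranked[:k]]
```

The returned chunks would then be placed in the prompt sent to the language model. A real deployment would use an indexed vector store instead of sorting every chunk on each query.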

Designed a multi-provider waterfall enrichment process that achieved a verified-email rate above 96%.

Reduced outreach and list-building costs by >45% by replacing manual sourcing with automated data-driven systems.

Standardized targeting processes by creating reusable filtering templates and documentation for future campaigns.

Wayne, NJ · Hybrid · Internship | Data Engineering, Process Automation, Workflow Optimization, ROI Analysis
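A waterfall enrichment process tries data providers in a fixed order and stops at the first result that passes verification. The sketch below captures that control flow; the `Provider` type, the `waterfall_enrich` name, and the `verify` callback are hypothetical stand-ins, since the source does not name the providers or the verification method.

```python
from typing import Callable, Iterable, Optional

# Hypothetical provider interface: takes a contact key, returns an email or None.
Provider = Callable[[str], Optional[str]]


def waterfall_enrich(contact: str,
                     providers: Iterable[Provider],
                     verify: Callable[[str], bool]) -> Optional[str]:
    """Query providers in order and return the first email that verifies."""
    for provider in providers:
        email = provider(contact)
        if email and verify(email):
            return email  # stop early: later providers are never called
    return None  # every provider missed or failed verification
```

Ordering the providers from cheapest to most expensive keeps cost down, because the paid lookups only run for contacts the earlier providers could not resolve.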

Cleaned and standardized client databases to unlock new marketing channels and improve customer segmentation.

Generated geographical heat maps of sales performance, identifying 7+ high-value market expansion opportunities.

Developed an XGBoost machine learning model to predict future purchase categories, deploying it via a Streamlit dashboard.

Managed end-to-end sign production projects while simultaneously leading data visualization efforts.

Clifton, NJ · Contract / Seasonal | Python, XGBoost, Streamlit, Tableau, Operations

05 — SKILLS & TECHNOLOGIES

Technical Expertise

A comprehensive toolkit for solving complex problems across the data science and AI landscape

AI/ML Engineering

Building intelligent systems that learn and adapt

RAG Systems: OpenAI, LlamaIndex, LangChain
Computer Vision: Tesseract, PaddleOCR, PyMuPDF
Deep Learning: TensorFlow, Neural Networks
Classical ML: XGBoost, Random Forest, Scikit-learn
LLMs: Mistral, Phi-2, GPT-4

Data Analytics & Insights

Transforming data into actionable intelligence

Analysis: Python (Pandas, NumPy)
Visualization: Tableau, Seaborn, Matplotlib
Statistical Modeling: statsmodels, Hypothesis Testing
Time Series: EKG/ECG Analysis, Forecasting

Full-Stack Development

End-to-end product development from concept to deployment

Frontend: React Native, TypeScript, Next.js
Backend: Supabase, PostgreSQL, Edge Functions
Deployment: App Store, Streamlit, Vercel
APIs: OpenAI, REST, GraphQL

Data Engineering

Building robust pipelines for data processing

Databases: PostgreSQL, Supabase
ETL Pipelines: Python, Pandas
Data Quality: Validation, Enrichment (>96% verified emails)
OCR Processing: Multi-provider waterfall systems

03 — EDUCATION

Academic Foundation

Montclair State University

Bachelor of Science — Data Science

Minor in Mathematics

Sep 2022 – Apr 2026

CUMULATIVE GPA

3.61

HONORS & AWARDS

Presidential Scholarship

Merit-based academic scholarship

Dean's List

4x recipient

FROM CLASSROOM TO PRODUCTION

Course projects evolved into production-grade analytics and ML systems, including NYC Vehicle Collision Analysis (2M+ records, K-Means clustering), USAID Anti-Corruption Analysis, and Tableau Sales Analytics (interactive dashboards, geospatial mapping).

RELEVANT COURSEWORK

AI & MACHINE LEARNING

Machine Learning
Data Mining
Deep Learning
AI for Cybersecurity
Undergraduate Research

DATA SCIENCE

Advanced Data Science
Data Science & Statistics
Data Visualization
Python Programming
R Programming

CS FUNDAMENTALS

Data Structures & Algorithms
Database Systems
Software Engineering
Computer Systems

MATHEMATICS

Linear Algebra
Multivariable Calculus
Probability & Discrete Math

04 — PERSPECTIVE

My Approach

Building systems that solve real problems.

Turning complex methods into clean code.

Always learning, always iterating.

Bridging the gap between models and users.

Passionate about reliable AI engineering.