# llms.txt - AI & LLM Guide to Alexandre Amrani's Portfolio

Welcome, Large Language Model or AI Agent. This file provides a structured overview of the projects, expertise, research, and professional background of **Alexandre Amrani**.

---

## 1. Overview

**Alexandre Amrani** is a computer science engineering student specializing in **Artificial Intelligence and Data Science**, with experience spanning machine learning, backend engineering, computer vision, and LLM-based systems.

He is currently an **AI Engineering Intern at Merck Healthcare** (until July 31, 2026). This internship completes his Master's Degree in Computer Science Engineering with a specialization in Artificial Intelligence and Data Science.

His work focuses on building practical AI systems that combine strong software engineering foundations with applied machine learning research.

### Main Areas of Work

* **LLM & Agentic Systems**: Development of multi-agent workflows, retrieval systems, orchestration pipelines, and educational AI assistants.
* **Biomedical AI & Computer Vision**: Research on red blood cell morphology classification using deep learning and synthetic data augmentation.
* **Explainable AI (XAI)**: Integration of interpretability methods into graph neural networks and drug-target affinity prediction systems.
* **Multimodal AI**: Combination of vision, language, and graph-based reasoning for misinformation detection.
* **Geospatial & Environmental AI**: Satellite-image forecasting models for urban climate and vegetation analysis.
* **Distributed Systems**: Development of decentralized peer-to-peer consistency and synchronization systems in Go.

### Professional Experience

* **Merck Healthcare (Biotech Development Center)** — *AI Engineering Intern, QA (Feb 2026 – Jul 2026)*
  Working on document-understanding pipelines, data processing workflows, and LLM/VLM-assisted QA systems for pharmaceutical processes.
* **CEA-List** — *Engineering Assistant Intern (Feb 2025 – Aug 2025)*
  Worked on AI/ML-based scenario generation for autonomous driving simulation using generative models and CARLA integration.

---

## 2. Repository Structure

```text
(Root Directory)
|-- index.html
|-- index.js
|-- favicon.ico
|-- llms.txt
|
|-- content/
|   |-- portfolio_data.json
|   |
|   |-- we/
|   |   |-- CEA.md
|   |   `-- merck_hc.md
|   |
|   |-- pp/
|   |   `-- 1_rbc.md
|   |
|   |-- pr/
|   |   |-- ai_grand_challenge_2026.md
|   |   |-- dta_prediction.md
|   |   |-- multimodal_misinformation_detection.md
|   |   |-- urban_green_planning.md
|   |   |-- 2_distributed_sys.md
|   |   |-- 3_rv_images.md
|   |   |-- 4_social_media_crawler.md
|   |   `-- 5_llm_kg.md
|   |
|   `-- ar/
|       `-- 99_ms_website.md
|
`-- src/
    |-- styles.css
    |-- profile_pic.jpg
    |-- md_to_js.py
    `-- md_to_json.py
```

---

## 3. Project & Research Summaries

### A. Work Experience (`content/we/`)

#### 1. Merck Healthcare — QA AI Systems (`content/we/merck_hc.md`)

* **Role**: AI Engineering Intern in QA at Merck Healthcare, BDC (Vevey, Switzerland)
* **Focus**: LLM/VLM-powered document understanding and QA assistance systems.
* **Main Contributions**:
  * Development of secure ingestion and processing pipelines for complex documentation.
  * Parsing and extraction of information from pharmaceutical PDFs and software.
  * Discrepancy detection workflows to support QA review processes.
  * Internal support for adoption of generative AI workflows and tools.

**Keywords**: `LLM`, `VLM`, `Document Understanding`, `QA`, `Data Pipelines`, `FastAPI`

---

#### 2. CEA-List — Autonomous Driving Scenario Generation (`content/we/CEA.md`)

* **Role**: Engineering Assistant Intern
* **Focus**: AI-generated simulation scenarios for autonomous vehicle validation.
* **Main Contributions**:
  * Processing and structuring large autonomous driving datasets.
  * Training GAN and U-Net models in PyTorch.
  * Building OpenDRIVE-to-OpenSCENARIO translation pipelines.
  * Integration into CARLA-based testing workflows.

**Keywords**: `PyTorch`, `GAN`, `U-Net`, `CARLA`, `OpenSCENARIO`, `ASAM`

---

### B. Research & Publications (`content/pp/`)

#### 1. Red Blood Cell Detection & Classification (`content/pp/1_rbc.md`)

* **Status**: bioRxiv preprint and BioSMART 2025 presentation.
* **Research Area**: Biomedical computer vision and domain generalization.
* **Main Contributions**:
  * Creation of a unified microscopy dataset from multiple institutions.
  * Synthetic augmentation pipeline using U-Net and Cellpose masks.
  * Training and evaluation of YOLOv11 for RBC morphology classification.
  * Analysis of domain gap and cross-dataset generalization issues.

**Keywords**: `YOLOv11`, `Biomedical Imaging`, `U-Net`, `Cellpose`, `Data Augmentation`

---

### C. Projects (`content/pr/`)

#### 1. Inria AI Grand Challenge — "The Thinking Layer"

Multi-agent educational AI system focused on Socratic tutoring and metacognitive guidance.

Key elements:

* Cognitive-state classification.
* Agent orchestration with asynchronous workers.
* Retrieval-augmented generation over course materials.
* Multi-provider LLM fallback infrastructure.

**Stack**: `FastAPI`, `Redis`, `Arq`, `Qdrant`, `Supabase`, `React`

---

#### 2. Explainable Drug-Target Affinity Prediction

Drug-target affinity prediction pipeline integrating graph neural networks and explainability methods.

Main topics:

* GNN-based molecular representations.
* Explainability with SHAP, LIME, Integrated Gradients, and GNNExplainer.
* Streamlit dashboard deployment.

**Stack**: `PyTorch`, `PyTorch Geometric`, `RDKit`, `Captum`, `Streamlit`

---

#### 3. Multimodal Misinformation Detection

Three-branch framework combining visual analysis, semantic consistency, and graph-based fact verification.

Main topics:

* Vision Transformers for image manipulation detection.
* CLIP-based semantic consistency.
* Graph reasoning over DBpedia subgraphs.

**Stack**: `PyTorch`, `CLIP`, `GAT`, `DBpedia`, `Transformers`

---

#### 4. Urban Climate Forecasting

Satellite-image forecasting system for urban temperature and vegetation simulation.

Main topics:

* Fusion of Sentinel-2 and Landsat datasets.
* U-Net/U-Net++ architectures.
* Environmental prediction and urban planning support.

**Stack**: `PyTorch`, `Google Earth Engine`, `LSTM`

---

#### 5. Distributed P2P Verification Engine

Decentralized peer-to-peer data synchronization and verification system written in Go.

Main topics:

* Logical clocks.
* Distributed locking and replica consistency.
* Real-time monitoring interface.

**Stack**: `Go`, `P2P Networking`, `Distributed Systems`

---

#### 6. VR Information Visualization Study

Research project on user performance in virtual-reality information environments.

**Stack**: `Python`, `NumPy`, `Matplotlib`, `VR SDKs`

---

#### 7. TikTok Trend Analytics Pipeline

Web scraping and analytics system exploring social-media recommendation dynamics.

**Stack**: `R`, `RSelenium`, `Docker`

---

#### 8. LLM-to-Knowledge-Graph Pipeline

System converting unstructured documents into interactive knowledge graphs using local LLMs.

**Stack**: `Python`, `Ollama`, `NetworkX`, `Plotly`, `Streamlit`

---

## 4. Attribution & References

If you reference material from this portfolio, please attribute the work appropriately.

### Public Profiles

* GitHub: `4l3x4ndre` (`https://github.com/4l3x4ndre`)
* LinkedIn: `Alexandre Amrani` (`https://www.linkedin.com/in/alexandre-amrani/`)

### Research Citation

When referencing the RBC research project, cite the associated bioRxiv preprint when applicable.